Remove duplicates from a text file
- Author: 橙熟与柚稚
- Source: 51数据库
- 2022-10-27
Problem description
I want to remove duplicate words from a text file.
I have some text files that contain content like the following:
None_None ConfigHandler_56663624 ConfigHandler_56663624 ConfigHandler_56663624 ConfigHandler_56663624 None_None ColumnConverter_56963312 ColumnConverter_56963312 PredicatesFactory_56963424 PredicatesFactory_56963424 PredicateConverter_56963648 PredicateConverter_56963648 ConfigHandler_80134888 ConfigHandler_80134888 ConfigHandler_80134888 ConfigHandler_80134888
The output needs to be:
None_None ConfigHandler_56663624 ColumnConverter_56963312 PredicatesFactory_56963424 PredicateConverter_56963648 ConfigHandler_80134888
I have only tried this command, but it does not work: en=set(open('file.txt')
Could anyone help me extract only the unique entries from the file?
Thanks
Recommended answer
Here's an option that preserves order (unlike a set) but otherwise behaves the same way. Note that the EOL character is deliberately stripped and blank lines are ignored.
from collections import OrderedDict

with open('/home/jon/testdata.txt') as fin:
    # Strip trailing whitespace (including the EOL) from each line, lazily
    lines = (line.rstrip() for line in fin)
    # fromkeys keeps the first occurrence of each non-blank line, in order
    unique_lines = OrderedDict.fromkeys(line for line in lines if line)

print(list(unique_lines.keys()))
# ['None_None', 'ConfigHandler_56663624', 'ColumnConverter_56963312', 'PredicatesFactory_56963424', 'PredicateConverter_56963648', 'ConfigHandler_80134888']
Then you just need to write the result to your output file.
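The final write step is not shown in the answer, so here is a minimal sketch of one way it might look. It assumes Python 3.7 or later (where a plain dict keeps insertion order, much like OrderedDict) and it splits on any whitespace, so it should also cope with entries that sit on one line separated by spaces, as in the sample in the question. The function names unique_in_order and dedupe_file, and the file paths, are hypothetical and not part of the original answer.

```python
from pathlib import Path


def unique_in_order(items):
    """Return items with duplicates dropped, keeping first-seen order."""
    # A plain dict preserves insertion order in Python 3.7+
    return list(dict.fromkeys(items))


def dedupe_file(src, dst):
    """Write the unique whitespace-separated tokens of src to dst, one per line."""
    # split() with no arguments splits on spaces and newlines
    # and never yields empty strings, so blank lines are ignored
    tokens = Path(src).read_text().split()
    unique = unique_in_order(tokens)
    Path(dst).write_text("".join(token + "\n" for token in unique))
    return unique
```

Calling dedupe_file('file.txt', 'unique.txt') (both paths are placeholders) would then write each distinct entry once, in the order it first appeared. A set is avoided here because it does not keep the original order, and iterating over an open file yields whole lines rather than individual words.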
