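I want to read some text files and find out how many times each word is repeated on each line. This is my text file,
and I want to produce output like this: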
line# word#1 word#2 word#3 ......
1 2 0 1
2 0 0 2
.
.
.
I want to write a function that does this. I cannot use the CountVectorizer function for Persian.
Example:
line_counter = 1
# UTF-8 is specified so that non-ASCII text such as Persian is read correctly.
with open("text.txt", "r", encoding="utf-8") as opened_file:
    lines = opened_file.readlines()
    for line in lines:
        # Count how often each word occurs on this line.
        repeated_elem = {}
        words = line.split()  # split on whitespace
        for word in words:
            if word in repeated_elem:
                repeated_elem[word] += 1
                continue
            repeated_elem[word] = 1
        print("{line}. line. Words: {words}".format(line=line_counter, words=repeated_elem))
        line_counter += 1
Contents of my text file:
hello hi aloha hello bye
one two three four five two
yes no yes no yes no yes
Output:
$ python3 test.py
1. line. Words: {'hello': 2, 'hi': 1, 'aloha': 1, 'bye': 1}
2. line. Words: {'one': 1, 'two': 2, 'three': 1, 'four': 1, 'five': 1}
3. line. Words: {'yes': 4, 'no': 3}
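
The example above prints one dictionary per line, while the question asks for a table with one column per word. A minimal sketch of that layout follows, assuming words are separated by whitespace and that columns should appear in order of first occurrence. The function name count_words_per_line and the sample_lines list are hypothetical and used only for illustration; the sample lines are the file contents shown above.

```python
from collections import Counter


def count_words_per_line(lines):
    """Return (vocabulary, rows); rows[i][j] is how often vocabulary[j] occurs on line i."""
    # One Counter per line: word -> number of occurrences on that line.
    counts = [Counter(line.split()) for line in lines]

    # Build the column order: each distinct word, in order of first appearance.
    vocabulary = []
    seen = set()
    for line_counts in counts:
        for word in line_counts:
            if word not in seen:
                seen.add(word)
                vocabulary.append(word)

    # A Counter returns 0 for words that do not occur on a line.
    rows = [[line_counts[word] for word in vocabulary] for line_counts in counts]
    return vocabulary, rows


# Hypothetical sample input: the three lines of the text file shown above.
sample_lines = [
    "hello hi aloha hello bye",
    "one two three four five two",
    "yes no yes no yes no yes",
]

vocabulary, rows = count_words_per_line(sample_lines)
print("line#", *vocabulary)
for line_number, row in enumerate(rows, start=1):
    print(line_number, *row)
```

To run this on a file instead, the opened file object can be passed in place of sample_lines, for example count_words_per_line(open("text.txt", encoding="utf-8")). Because the splitting relies only on whitespace, punctuation attached to a word is counted as part of that word.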