In file:
aaa 012 345
abc deg hij
hij aaa 075
aaa 345 658
I tried:
filer = file.read().split('\n')
count = 0
for line in filer:
    lines = line.split(' ')
    for words in lines:
        #print words, lines.count(words)
        if words in set(lines):
            count = count + 1
        print words, ', count line: ', count
The results showed:
aaa , count line: 1
012 , count line: 2
345 , count line: 3
abc , count line: 4
deg , count line: 5
hij , count line: 6
hij , count line: 7
aaa , count line: 8
075 , count line: 9
aaa , count line: 10
345 , count line: 11
658 , count line: 12
I want to print, for every word in every line, the total number of lines that contain that word. (Sorry about my explanation.)
Expected results:
aaa , count line: 3
012 , count line: 1
345 , count line: 2
abc , count line: 1
deg , count line: 1
hij , count line: 2
hij , count line: 2
aaa , count line: 3
075 , count line: 1
aaa , count line: 3
345 , count line: 2
658 , count line: 1
Any suggestions for printing the expected result in the same order as the original lines?
I need them in order so that I can calculate a term frequency based on the line frequency.
For example: the frequency of 'aaa' will be calculated as the total number of lines divided by the number of lines that contain the word 'aaa'.
collections.Counter is made for exactly this purpose:
from collections import Counter

counter = Counter()
with open('data.txt') as data:
    for line in data:
        # Count every whitespace-separated word on the line
        counter.update(line.split())

for item, count in counter.items():
    print("%s, count: %s" % (item, count))
Output:
abc, count: 1
aaa, count: 3
345, count: 2
012, count: 1
075, count: 1
hij, count: 2
658, count: 1
deg, count: 1
Edit: I'm still a bit unclear about what end result you're looking for, but this produces the exact output you asked for:
from collections import Counter

line_frequencies = Counter()
with open('data.txt') as data:
    lines = [line.split() for line in data]

for line in lines:
    # set() makes a word count at most once per line
    unique_line = set(line)
    line_frequencies.update(unique_line)

for line in lines:
    for term in line:
        print("%s , count line: %s" % (term, line_frequencies[term]))
    print("\n")
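The ratio described in the question (total number of lines divided by the number of lines containing the word) could be computed along the same lines. This is only a sketch: the function name `line_frequency_ratios` and the in-memory `sample` list are made up for illustration, and skipping blank lines is an assumption the question does not spell out.

```python
from collections import Counter


def line_frequency_ratios(lines):
    """Return (term, ratio) pairs in original order, where
    ratio = total lines / number of lines containing the term."""
    # Assumption: blank lines are ignored and do not count toward the total.
    tokenized = [line.split() for line in lines if line.strip()]

    line_frequencies = Counter()
    for tokens in tokenized:
        # set() makes a term count at most once per line
        line_frequencies.update(set(tokens))

    total = len(tokenized)
    # float() keeps the division exact on Python 2 as well as Python 3
    return [(term, total / float(line_frequencies[term]))
            for tokens in tokenized
            for term in tokens]


# Hypothetical stand-in for the contents of the data file
sample = ["aaa 012 345", "abc deg hij", "hij aaa 075", "aaa 345 658"]
for term, ratio in line_frequency_ratios(sample):
    print("%s , ratio: %.2f" % (term, ratio))
```

To read from a file instead, the open file object can be passed in directly, since iterating over it yields lines.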