Find each word in a line and count the number of lines that contain it

ThanaDaray

In file:

aaa 012 345
abc deg hij
hij aaa 075
aaa 345 658

I tried:

filer = file.read().split('\n')
count = 0
for line in filer:
    lines = line.split(' ')
    for words in lines:
        #print words, lines.count(words)
        if words in set(lines):
            count = count + 1
            print words, ', count line: ', count

The results showed:

aaa , count line:  1
012 , count line:  2
345 , count line:  3
abc , count line:  4
deg , count line:  5
hij , count line:  6
hij , count line:  7
aaa , count line:  8
075 , count line:  9
aaa , count line:  10
345 , count line:  11
658 , count line:  12

For each word in a line, I want to count and print the total number of lines that contain that word. (Sorry about my explanation.)

Expected results:

aaa , count line: 3
012 , count line: 1
345 , count line: 2

abc , count line: 1
deg , count line: 1
hij , count line: 2

hij , count line: 2
aaa , count line: 3
075 , count line: 1

aaa , count line: 3
345 , count line: 2
658 , count line: 1

Any suggestions for printing the expected results in the same order as the original lines?

I need them in order because I will use them to calculate the term frequency of each word from its line frequency.

For example, the frequency of 'aaa' will be calculated by dividing the total number of lines by the number of lines that contain the word 'aaa'.

Lukas Graf

collections.Counter is made for exactly this purpose:

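from collections import Counter

counter = Counter()

with open('data.txt') as data:
    for line in data:
        # split() with no argument splits on any whitespace and drops the newline
        counter.update(line.split())

# The parenthesized print works in both Python 2 and Python 3
for item, count in counter.items():
    print("%s, count: %s" % (item, count))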

Output (item order may vary):

abc, count: 1
aaa, count: 3
345, count: 2
012, count: 1
075, count: 1
hij, count: 2
658, count: 1
deg, count: 1
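
Note that the loop above counts every occurrence of a word, not the number of lines that contain it. In the sample file no word repeats within a line, so the two counts coincide. A minimal sketch of the difference, using made-up lines in which 'aaa' repeats (the names occurrences and line_counts are only for illustration):

```python
from collections import Counter

# Hypothetical input: "aaa" appears twice on the first line.
lines = ["aaa aaa 012", "aaa 345"]

occurrences = Counter()  # counts every occurrence of a word
line_counts = Counter()  # counts each word at most once per line

for line in lines:
    words = line.split()
    occurrences.update(words)
    # set() removes duplicates within the line before counting
    line_counts.update(set(words))

print(occurrences["aaa"])  # 3
print(line_counts["aaa"])  # 2
```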

Edit: I'm still a bit unclear about what end result you're looking for, but this produces the exact output you asked for:

from collections import Counter

line_frequencies = Counter()

with open('data.txt') as data:
    lines = [line.split() for line in data]

# Count each term once per line, so the result is the number of lines containing it
for line in lines:
    line_frequencies.update(set(line))

# Print the terms in their original order, one blank line between input lines
for line in lines:
    for term in line:
        print("%s , count line: %s" % (term, line_frequencies[term]))
    print("")
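
The ratio described in the question (total number of lines divided by the number of lines that contain the word) could then be computed from the same line counts. The following is only a sketch under some assumptions: the sample data is inlined instead of being read from data.txt, the code targets Python 3, and the names text, line_ratio and ratios are hypothetical.

```python
from collections import Counter

# Sample data from the question, inlined so the sketch runs without data.txt.
text = """aaa 012 345
abc deg hij
hij aaa 075
aaa 345 658"""

lines = [line.split() for line in text.splitlines()]

# Number of lines containing each term (set() counts a term once per line).
line_frequencies = Counter()
for line in lines:
    line_frequencies.update(set(line))

total_lines = len(lines)


def line_ratio(term):
    """Total number of lines divided by the number of lines containing term."""
    return total_lines / float(line_frequencies[term])


# One (term, ratio) pair per word, kept in the original line order.
ratios = [[(term, line_ratio(term)) for term in line] for line in lines]

for line in ratios:
    for term, ratio in line:
        print("%s , ratio: %.2f" % (term, ratio))
    print("")
```

With the sample file, 'aaa' occurs in 3 of the 4 lines, so its ratio is 4 / 3, while '012' occurs in a single line and gets 4.0.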
