Python - Finding the top 5 rows containing a word in a dataframe

harry04

I'm trying to write a function that prints the top 5 and bottom 5 products, with their prices, among the product listings that contain any word from a wordlist. Here is my attempt:

def wordlist_top_costs(filename, wordlist):
    xlsfile = pd.ExcelFile(filename)
    dframe = xlsfile.parse('Sheet1')    
    dframe['Product'].fillna('', inplace=True)
    dframe['Price'].fillna(0, inplace=True)
    price = {}
    for word in wordlist:
        mask = dframe.Product.str.contains(word, case=False, na=False)
        price[mask] = dframe.loc[mask, 'Price']

    top = sorted(Score.items(), key=operator.itemgetter(1), reverse=True)
    print("Top 10 product prices for: ", wordlist.name)
    for i in range(0, 5):
        print(top[i][0], "  |  ", t[i][1])  

    bottom = sorted(Score.items(), key=operator.itemgetter(1), reverse=False)
    print("Bottom 10 product prices for: ", wordlist.name)
    for i in range(0, 5):
        print(top[i][0], "  |  ", t[i][1])

However, the above function throws an error at the line price[mask] = dframe.loc[mask, 'Price']:

TypeError: 'Series' objects are mutable, thus they cannot be hashed

Any help correcting or modifying this would be appreciated. Thanks!

Edit: for example, with the wordlist alu, co, vin and this data:

Product | Price

  • Aluminium Crown - 22.20

  • Coca Cola - 1.0

  • Brass Box - 28.75

  • Vincent Kettle - 12.00

  • Vinyl Stickers - 0.50

  • Doritos - 2.0

  • Colin's Hair Oil - 5.0

  • Vincent Chase Sunglasses - 75.40

  • American Tourister - $120.90

Expected output:

Top 3 Product Prices:

Vincent Chase Sunglasses - 75.40

Aluminium Crown - 22.20

Vincent Kettle - 12.0

Bottom 3 Product Prices:

Vinyl Stickers - 0.50

Coca Cola - 1.0

Colin's Hair Oil - 5.0

jezrael

You can use nlargest and nsmallest:

#remove $ and convert column Price to floats
#regex=False makes '$' a literal character, not the end-of-string anchor
dframe['Price'] = dframe['Price'].str.replace('$', '', regex=False).astype(float)

#filter by regex - join all values of the list with | so any word can match
wordlist = ['alu', 'co', 'vin'] 
pat = '|'.join(wordlist)
mask = dframe.Product.str.contains(pat, case=False, na=False)
dframe = dframe.loc[mask, ['Product','Price']]

top = dframe.nlargest(3, 'Price')
#top = dframe.sort_values('Price', ascending=False).head(3)
print (top)
                    Product  Price
7  Vincent Chase Sunglasses   75.4
0           Aluminium Crown   22.2
3            Vincent Kettle   12.0

bottom = dframe.nsmallest(3, 'Price')
#bottom = dframe.sort_values('Price').head(3)
print (bottom)
            Product  Price
4    Vinyl Stickers    0.5
1         Coca Cola    1.0
6  Colin's Hair Oil    5.0
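
As for the TypeError in the question: mask is a boolean Series, and a Series cannot be used as a dictionary key because it is not hashable. If you do want one entry per word, a minimal sketch (assuming already-numeric prices and a small made-up frame) is to key the dictionary by the word itself:

```python
import pandas as pd

# Small illustrative frame; prices are already numeric here (an assumption).
dframe = pd.DataFrame({
    'Product': ['Aluminium Crown', 'Coca Cola', 'Brass Box'],
    'Price': [22.20, 1.0, 28.75],
})

mask = dframe['Product'].str.contains('alu', case=False, na=False)

# Using the boolean Series as a dict key fails: dict keys must be hashable.
bad = {}
try:
    bad[mask] = dframe.loc[mask, 'Price']
    raised = False
except TypeError:
    raised = True

# Keying by the word (a plain string) works and keeps one Series per word.
price = {}
for word in ['alu', 'co']:
    word_mask = dframe['Product'].str.contains(word, case=False, na=False)
    price[word] = dframe.loc[word_mask, 'Price']
```

The exact wording of the TypeError message may differ between pandas versions.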

Setup:

import pandas as pd

dframe = pd.DataFrame({'Price': ['22.20', '1.0', '28.75', '12.00', '0.50', '2.0', '5.0', '75.40', '$120.90'], 'Product': ['Aluminium Crown', 'Coca Cola', 'Brass Box', 'Vincent Kettle', 'Vinyl Stickers', 'Doritos', "Colin's Hair Oil", 'Vincent Chase Sunglasses', 'American Tourister']}, columns=['Product','Price'])
print (dframe)
                    Product    Price
0           Aluminium Crown    22.20
1                 Coca Cola      1.0
2                 Brass Box    28.75
3            Vincent Kettle    12.00
4            Vinyl Stickers     0.50
5                   Doritos      2.0
6          Colin's Hair Oil      5.0
7  Vincent Chase Sunglasses    75.40
8        American Tourister  $120.90
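
If you want this wrapped in a function like the one in the question, here is one possible sketch. It takes an already-loaded DataFrame instead of a filename so it runs on its own, and the n parameter, the use of re.escape and the pd.to_numeric cleanup are my additions rather than part of the original code:

```python
import re
import pandas as pd


def wordlist_top_costs(dframe, wordlist, n=5):
    """Return (top, bottom) n rows by Price among products matching any word."""
    df = dframe.copy()
    # Remove '$' as a literal character, then coerce to numbers (bad values -> NaN).
    df['Price'] = pd.to_numeric(
        df['Price'].astype(str).str.replace('$', '', regex=False),
        errors='coerce',
    )
    # Escape each word so characters such as '.' or '+' are matched literally.
    pat = '|'.join(re.escape(word) for word in wordlist)
    mask = df['Product'].str.contains(pat, case=False, na=False)
    matched = df.loc[mask, ['Product', 'Price']]
    return matched.nlargest(n, 'Price'), matched.nsmallest(n, 'Price')


dframe = pd.DataFrame({
    'Product': ['Aluminium Crown', 'Coca Cola', 'Brass Box', 'Vincent Kettle',
                'Vinyl Stickers', 'Doritos', "Colin's Hair Oil",
                'Vincent Chase Sunglasses', 'American Tourister'],
    'Price': ['22.20', '1.0', '28.75', '12.00', '0.50', '2.0', '5.0',
              '75.40', '$120.90'],
})

top, bottom = wordlist_top_costs(dframe, ['alu', 'co', 'vin'], n=3)
print(top)
print(bottom)
```

Returning the two frames instead of printing inside the function leaves the formatting up to the caller. To read from Excel as in the question, you could pass in the result of pd.read_excel(filename).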
