Pythonic way to regroup a pandas dataframe using max of a column

Cybernetician

I have the following dataframe, obtained by applying df.groupby(['category', 'unit_quantity']).count():

category         unit_quantity  Count
banana           1EA                5
eggs             100G              22
                 100ML              1
full cream milk  100G               5
                 100ML              1
                 1L                38

Let's call this dataframe grouped. I want to regroup it using the columns unit_quantity and Count, so that each row also shows the unit_quantity with the highest Count in its category:

category         unit_quantity  Count  Most Frequent unit_quantity
banana           1EA                5  1EA
eggs             100G              22  100G
                 100ML              1  100G
full cream milk  100G               5  1L
                 100ML              1  1L
                 1L                38  1L

I tried grouped.groupby(level=1).max(), which gives me

unit_quantity  Count
100G              22
100ML              1
1EA                5
1L                38

Because the index of this result does not coincide with the index of grouped, I cannot join the two using .merge. Does anyone know how to solve this?

Thanks in advance

tlentali

Starting from your DataFrame:

>>> import pandas as pd

>>> df = pd.DataFrame({'category': ['banana', 'eggs', 'eggs', 'full cream milk', 'full cream milk', 'full cream milk'],
...                    'unit_quantity': ['1EA', '100G', '100ML', '100G', '100ML', '1L'],
...                    'Count': [5, 22, 1, 5, 1, 38]},
...                   index=[0, 1, 2, 3, 4, 5])
>>> df
    category    unit_quantity   Count
0   banana                1EA       5
1   eggs                 100G      22
2   eggs                100ML       1
3   full cream milk      100G       5
4   full cream milk     100ML       1
5   full cream milk        1L      38

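You can use groupby with transform('max') on the Count column to build a boolean mask. Because transform returns a result aligned with the original rows, the category and unit_quantity values are kept. Note that this keeps the rows holding the largest Count for each unit_quantity:

>>> idx = df.groupby('unit_quantity')['Count'].transform('max') == df['Count']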
>>> df[idx]
    category    unit_quantity   Count
0   banana                1EA       5
1   eggs                 100G      22
2   eggs                100ML       1
4   full cream milk     100ML       1
5   full cream milk        1L      38
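
If what you want is the extra column from the question (the unit_quantity with the highest Count within each category), a minimal sketch could look like the following. It assumes the flat df built above; if you start from the MultiIndexed grouped frame, calling grouped.reset_index() first should give you the same layout, provided the count column is named Count. The name best_unit is just an illustrative helper, and ties are resolved by idxmax, which returns the first occurrence.

```python
import pandas as pd

df = pd.DataFrame({
    'category': ['banana', 'eggs', 'eggs',
                 'full cream milk', 'full cream milk', 'full cream milk'],
    'unit_quantity': ['1EA', '100G', '100ML', '100G', '100ML', '1L'],
    'Count': [5, 22, 1, 5, 1, 38],
})

# Index label of the row with the largest Count in each category.
top_rows = df.groupby('category')['Count'].idxmax()

# Series mapping each category to its most frequent unit_quantity
# (best_unit is a hypothetical helper name, not part of the original answer).
best_unit = df.loc[top_rows].set_index('category')['unit_quantity']

# Broadcast that value back onto every row of the same category.
df['Most Frequent unit_quantity'] = df['category'].map(best_unit)

print(df)
```

This avoids .merge altogether: the lookup goes through the category values rather than through the index, so it does not matter that the two indices do not coincide.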
