I have the following data frame that has been obtained by applying df.groupby(['category', 'unit_quantity']).count()
category | unit_quantity | Count |
---|---|---|
banana | 1EA | 5 |
eggs | 100G | 22 |
100ML | 1 | |
full cream milk | 100G | 5 |
100ML | 1 | |
1L | 38 |
Let's call this latter dataframe as grouped
. I want to find a way to regroup using columns unit_quantity
and Count
it and get
category | unit_quantity | Count | Most Frequent unit_quantity |
---|---|---|---|
banana | 1EA | 5 | 1EA |
eggs | 100G | 22 | 100G |
100ML | 1 | 100G | |
full cream milk | 100G | 5 | 1L |
100ML | 1 | 1L | |
1L | 38 | 1L |
Now, I tried to apply grouped.groupby(level=1).max()
which gives me
unit_quantity | |
---|---|
100G | 22 |
100ML | 1 |
1EA | 5 |
1L | 38 |
Now, because the indices of the latter and grouped
do not coincide, I cannot join it using .merge
. Does someone know how to solve this issue?
Thanks in advance
Starting from your DataFrame :
>>> import pandas as pd
>>> df = pd.DataFrame({'category': ['banana', 'eggs', 'eggs', 'full cream milk', 'full cream milk', 'full cream milk'],
... 'unit_quantity': ['1EA', '100G', '100ML', '100G', '100ML', '1L'],
... 'Count': [5, 22, 1, 5, 1, 38],},
... index = [0, 1, 2, 3, 4, 5])
>>> df
category unit_quantity Count
0 banana 1EA 5
1 eggs 100G 22
2 eggs 100ML 1
3 full cream milk 100G 5
4 full cream milk 100ML 1
5 full cream milk 1L 38
You can use the transform
method applied on max
of the column Count
in order to keep your category
and unit_quantity
values :
>>> idx = df.groupby(['unit_quantity'])['Count'].transform(max) == df['Count']
>>> df[idx]
category unit_quantity Count
0 banana 1EA 5
1 eggs 100G 22
2 eggs 100ML 1
4 full cream milk 100ML 1
5 full cream milk 1L 38
Collected from the Internet
Please contact [email protected] to delete if infringement.
Comments