Is there a way in pandas to groupby and then count unique where another column has a specified value?

greenstamp :

I have a pandas dataframe with numerous columns. For simplicity, let's say the columns are 'country', 'time_bucket', 'category' and 'id'. The 'category' can be either 'staff' or 'student'.

import pandas as pd

data = {'country': ['A', 'A', 'A', 'B', 'B'],
        'time_bucket': ['8', '8', '8', '8', '9'],
        'category': ['staff', 'staff', 'student', 'student', 'staff'],
        'id': ['101', '172', '122', '142', '132'],
        }

df = pd.DataFrame(data, columns=['country', 'time_bucket', 'category', 'id'])
df


  country time_bucket category   id
0       A           8    staff  101
1       A           8    staff  172
2       A           8  student  122
3       B           8  student  142
4       B           9    staff  132

I want to find out the total number of staff and the total number of students in a country at a particular time interval and add these as new columns.

I can get the total number of people in a country at a particular time interval:

df['persons_count'] = df.groupby(['time_bucket','country'])['id'].transform('nunique')

  country time_bucket category   id  persons_count
0       A           8    staff  101              3
1       A           8    staff  172              3
2       A           8  student  122              3
3       B           8  student  142              1
4       B           9    staff  132              1

However, I cannot work out how to take 'category' into account and add this to my code.

I want something like this:

  country time_bucket category   id  staff_count  student_count
0       A           8    staff  101            2              1
1       A           8    staff  172            2              1
2       A           8  student  122            2              1
3       B           8  student  142            0              1
4       B           9    staff  132            1              0

Any advice would be much appreciated!


Adding a new example which shows the need for unique 'id' count

import pandas as pd

data = {'country': ['A', 'A', 'A', 'A', 'B', 'B'],
        'time_bucket': ['8', '8', '8', '8', '8', '9'],
        'category': ['staff', 'staff', 'student', 'student', 'student', 'staff'],
        'id': ['101', '172', '122', '122', '142', '132'],
        }

df = pd.DataFrame(data, columns=['country', 'time_bucket', 'category', 'id'])
df

  country time_bucket category   id
0       A           8    staff  101
1       A           8    staff  172
2       A           8  student  122
3       A           8  student  122
4       B           8  student  142
5       B           9    staff  132

I want something like this:

  country time_bucket category   id  staff_count  student_count
0       A           8    staff  101            2              1
1       A           8    staff  172            2              1
2       A           8  student  122            2              1
3       A           8  student  122            2              1
4       B           8  student  142            0              1
5       B           9    staff  132            1              0

Chris :

import pandas as pd

data = {'country': ['A', 'A', 'A', 'B', 'B'],
        'time_bucket': ['8', '8', '8', '8', '9'],
        'category': ['staff', 'staff', 'student', 'student', 'staff'],
        'id': ['101', '172', '122', '142', '132'],
        }

df = pd.DataFrame(data, columns=['country', 'time_bucket', 'category', 'id'])

# Count unique ids within each (time_bucket, country, category) group
df['persons_count'] = df.groupby(['time_bucket', 'country', 'category'])['id'].transform('nunique')

# Spread the categories into columns; combinations with no rows become 0
df = df.pivot_table(index=['country', 'time_bucket', 'id'], columns='category', values='persons_count').fillna(0)

Output

category                staff  student
country time_bucket id
A       8           101   2.0      0.0
                    122   0.0      1.0
                    172   2.0      0.0
B       8           142   0.0      1.0
        9           132   1.0      0.0
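
Note that the pivot reshapes the frame and reports each count only under the row's own category, so it does not quite produce the per-row `staff_count` and `student_count` columns the question asks for. A minimal sketch of one possible way to get them, assuming the updated six-row example from the question: count unique ids per (country, time_bucket, category), spread the categories into columns, then merge the result back onto the original rows. The names `keys`, `counts` and `result` are illustrative and not part of the original post.

```python
import pandas as pd

# Updated example from the question (id '122' appears twice)
data = {
    'country': ['A', 'A', 'A', 'A', 'B', 'B'],
    'time_bucket': ['8', '8', '8', '8', '8', '9'],
    'category': ['staff', 'staff', 'student', 'student', 'student', 'staff'],
    'id': ['101', '172', '122', '122', '142', '132'],
}
df = pd.DataFrame(data)

# Grouping keys shared by the counts and the original rows (illustrative name)
keys = ['country', 'time_bucket']

counts = (
    df.groupby(keys + ['category'])['id']
      .nunique()                            # unique ids per group and category
      .unstack('category', fill_value=0)    # one column per category, 0 if absent
      .add_suffix('_count')                 # staff -> staff_count, etc.
      .rename_axis(columns=None)            # drop the leftover 'category' axis label
      .reset_index()
)

# Attach the per-group counts to every original row
result = df.merge(counts, on=keys, how='left')
print(result)
```

On the sample data this should give `staff_count` of 2 and `student_count` of 1 for every row in country A at time bucket 8, with the duplicate id '122' counted once. Any category that occurs in the data gets its own `<name>_count` column, so nothing is hard-coded to 'staff' or 'student'.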

  
