How to filter a pandas dataframe to just the rows that show change within a group?

mdrishan

I have a dataframe that I would like to filter down to only the rows where a certain column first changes within a group.

For example, my dataframe looks like this:

GROUP DATE QUANTITY
A 2020-01-01 2
A 2020-01-02 2
A 2020-01-03 3
A 2020-01-04 2
B 2020-01-01 1
B 2020-01-04 2
C 2020-01-01 3
C 2020-01-06 2
C 2020-01-07 2

I would like to be able to produce the table below:

GROUP DATE QUANTITY
A 2020-01-01 2
A 2020-01-03 3
A 2020-01-04 2
B 2020-01-01 1
B 2020-01-04 2
C 2020-01-01 3
C 2020-01-06 2

That is, with each group sorted by date, I want to keep the first row of the group plus every row where QUANTITY differs from the previous row.

How can I achieve this without resorting to an inefficient for loop?

ALollz

Convert DATE to datetime and sort by GROUP and DATE. Then use shift to build a mask that keeps a row when either the group changes (i.e. it is the first row of its group) or the quantity differs from the previous row. This is logically equivalent to keeping the rows where the quantity changes within each group.

import pandas as pd

# Parse the dates and sort so rows are in date order within each group
df['DATE'] = pd.to_datetime(df['DATE'])
df = df.sort_values(['GROUP', 'DATE'])

m = (df['QUANTITY'].ne(df['QUANTITY'].shift())   # Quantity changes
    | df['GROUP'].ne(df['GROUP'].shift()))       # Group changes

df[m]

  GROUP       DATE  QUANTITY
0     A 2020-01-01         2
2     A 2020-01-03         3
3     A 2020-01-04         2
4     B 2020-01-01         1
5     B 2020-01-04         2
6     C 2020-01-01         3
7     C 2020-01-06         2
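
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample data from the question and applies the mask described above. The second mask, `m_grouped`, is not part of the original answer: it is a variant that shifts within each group via `groupby`, and it should give the same rows as long as QUANTITY has no missing values (an assumption, since NaN never compares equal to NaN).

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "GROUP": ["A", "A", "A", "A", "B", "B", "C", "C", "C"],
    "DATE": ["2020-01-01", "2020-01-02", "2020-01-03", "2020-01-04",
             "2020-01-01", "2020-01-04",
             "2020-01-01", "2020-01-06", "2020-01-07"],
    "QUANTITY": [2, 2, 3, 2, 1, 2, 3, 2, 2],
})

# Parse the dates and sort so rows are in date order within each group.
df["DATE"] = pd.to_datetime(df["DATE"])
df = df.sort_values(["GROUP", "DATE"])

# Mask from the answer: keep a row if the quantity or the group
# differs from the row directly above it.
m_global = (df["QUANTITY"].ne(df["QUANTITY"].shift())
            | df["GROUP"].ne(df["GROUP"].shift()))

# Variant (assumption, not from the answer): shift within each group.
# The first row of every group is compared with NaN, so it is always kept.
prev_qty = df.groupby("GROUP")["QUANTITY"].shift()
m_grouped = df["QUANTITY"].ne(prev_qty)

result = df[m_grouped]
print(result)
```

The single-column `groupby` version reads a little closer to the wording of the question, while the two-column comparison avoids the groupby call. Either way there is no Python-level loop.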
