How to filter a pandas dataframe to just the rows that show change within a group?

mdrishan

I have a dataframe that I would like to filter down to only the rows that first show change in a certain column within a group.

For example, my dataframe looks like this:

GROUP	DATE	QUANTITY
A	2020-01-01	2
A	2020-01-02	2
A	2020-01-03	3
A	2020-01-04	2
B	2020-01-01	1
B	2020-01-04	2
C	2020-01-01	3
C	2020-01-06	2
C	2020-01-07	2

I would like to be able to produce the table below:

GROUP	DATE	QUANTITY
A	2020-01-01	2
A	2020-01-03	3
A	2020-01-04	2
B	2020-01-01	1
B	2020-01-04	2
C	2020-01-01	3
C	2020-01-06	2

That is, with each group sorted by date, I only want to keep the first row each time QUANTITY changes within the group.

How can I achieve this without resorting to an inefficient for loop?

ALollz

Convert DATE to a datetime and sort the values by GROUP and DATE. Then use shift to build a mask that keeps a row when the group differs from the previous row (i.e. the first row within each group) or when QUANTITY differs from the previous row. This is logically equivalent to keeping, within each group, the rows where the quantity changes.

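import pandas as pd

# Parse dates so the sort is chronological, then order rows within each group
df['DATE'] = pd.to_datetime(df['DATE'])
df = df.sort_values(['GROUP', 'DATE'])

# Keep a row if QUANTITY or GROUP differs from the previous row
m = (df['QUANTITY'].ne(df['QUANTITY'].shift())   # Quantity changes
     | df['GROUP'].ne(df['GROUP'].shift()))      # Group changes

df[m]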

  GROUP       DATE  QUANTITY
0     A 2020-01-01         2
2     A 2020-01-03         3
3     A 2020-01-04         2
4     B 2020-01-01         1
5     B 2020-01-04         2
6     C 2020-01-01         3
7     C 2020-01-06         2
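
For anyone who wants to run this end to end, here is a minimal self-contained sketch built on the sample data from the question. It is a variant rather than the exact code above: it shifts QUANTITY inside each group with groupby, so the first row of every group is compared against NaN and always kept. The helper name keep_first_changes and its parameters are made up for illustration.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "GROUP": ["A", "A", "A", "A", "B", "B", "C", "C", "C"],
    "DATE": ["2020-01-01", "2020-01-02", "2020-01-03", "2020-01-04",
             "2020-01-01", "2020-01-04",
             "2020-01-01", "2020-01-06", "2020-01-07"],
    "QUANTITY": [2, 2, 3, 2, 1, 2, 3, 2, 2],
})


def keep_first_changes(frame, group_col="GROUP", date_col="DATE",
                       value_col="QUANTITY"):
    """Hypothetical helper: keep the first row of each run of equal values per group."""
    out = frame.copy()
    out[date_col] = pd.to_datetime(out[date_col])
    out = out.sort_values([group_col, date_col])
    # Shifting inside each group leaves NaN on the group's first row,
    # so that row always compares as "changed" and is kept.
    previous = out.groupby(group_col)[value_col].shift()
    return out[out[value_col].ne(previous)]


result = keep_first_changes(df)
print(result)
```

On this sample it should keep the same rows as the mask above. One caveat: if QUANTITY itself contains missing values, NaN never compares equal to NaN, so consecutive missing rows would each be kept with either approach.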

Edited at 2021-06-19
