Group rows in dataframe by assigning values as a column in pandas dataframe

Shubham R

Consider this dataframe:

import pandas as pd

df = pd.DataFrame({
    'id': [458, 459, 464, 469, 507, 512, 516, 519, 519, 615]
})

I want to find the difference between each row and the next one, so I implemented:

df['diff'] = df['id'] - df['id'].shift(-1)
df = df.fillna(1)  # the last row has no next row, so its NaN is filled with 1

    id    diff
0   458   -1.0
1   459   -5.0
2   464   -5.0
3   469   -38.0
4   507   -5.0
5   512   -4.0
6   516   -3.0
7   519    0.0
8   519   -96.0
9   615    1.0

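Now I want to group the rows based on the diff column: whenever the difference between two consecutive rows is greater than 10, a new group should start. The group number should go in a new column, group, with all rows before the first such jump set to 1, the next run of rows set to 2, and so on.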

As you can see from the diff column, the difference between row 4 and row 3 (507 - 469 = 38) is greater than 10, so a new group starts at row 4:

    id    diff    group
0   458   -1.0     1
1   459   -5.0     1
2   464   -5.0     1
3   469   -38.0    1
4   507   -5.0     2
5   512   -4.0     2
6   516   -3.0     2
7   519    0.0     2
8   519   -96.0    2
9   615    1.0     3

Any ideas how to achieve this?

jezrael

You can use diff, compare the result with 10 to get a boolean mask, take the cumsum of that mask, and finally add 1:

print(df['diff'].diff())
0     NaN
1    -4.0
2     0.0
3   -33.0
4    33.0
5     1.0
6     1.0
7     3.0
8   -96.0
9    97.0
Name: diff, dtype: float64

df['group'] = (df['diff'].diff() > 10).cumsum() + 1
print(df)
    id  diff  group
0  458  -1.0      1
1  459  -5.0      1
2  464  -5.0      1
3  469 -38.0      1
4  507  -5.0      2
5  512  -4.0      2
6  516  -3.0      2
7  519   0.0      2
8  519 -96.0      2
9  615   1.0      3
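
To make the one-liner above easier to follow, here is a minimal sketch that splits it into its three steps. The intermediate names step and starts_new_group are only illustrative and are not part of the original answer; the diff column is rebuilt as in the question so that the snippet runs on its own.

```python
import pandas as pd

df = pd.DataFrame({'id': [458, 459, 464, 469, 507, 512, 516, 519, 519, 615]})
# Same 'diff' column as in the question: current id minus next id,
# with the NaN in the last row filled with 1.
df['diff'] = (df['id'] - df['id'].shift(-1)).fillna(1)

# Step 1: change of 'diff' from one row to the next (NaN for the first row).
step = df['diff'].diff()

# Step 2: boolean mask, True where that change exceeds 10.
# Comparing NaN with 10 gives False, so the first row never starts a new group.
starts_new_group = step > 10

# Step 3: cumsum counts the True values seen so far (0-based group number),
# and adding 1 makes the numbering start at 1.
df['group'] = starts_new_group.cumsum() + 1
print(df)
```

Each True in the mask marks the first row of a new group, which is why the running sum works as a group counter.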

The same logic can also be written with assign and method chaining:

df = df.assign(group=df['diff'].diff().gt(10).cumsum().add(1))
print(df)
    id  diff  group
0  458  -1.0      1
1  459  -5.0      1
2  464  -5.0      1
3  469 -38.0      1
4  507  -5.0      2
5  512  -4.0      2
6  516  -3.0      2
7  519   0.0      2
8  519 -96.0      2
9  615   1.0      3
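
As a possible variation, and assuming the intent is simply to start a new group wherever id jumps by more than 10 from the previous row, the mask can be built from id directly without the helper diff column. This is a sketch under that assumption, not the method from the answer above; the name threshold is hypothetical. On the sample data it yields the same groups, but the two approaches are not guaranteed to agree on other data.

```python
import pandas as pd

df = pd.DataFrame({'id': [458, 459, 464, 469, 507, 512, 516, 519, 519, 615]})
threshold = 10  # hypothetical name for the jump size that starts a new group

# Gap between each id and the previous one (NaN for the first row).
gap = df['id'].diff()

# True where the gap exceeds the threshold; running count + 1 is the group number.
df['group'] = gap.gt(threshold).cumsum().add(1)

# The new column can then be used as an ordinary groupby key.
summary = df.groupby('group')['id'].agg(['min', 'max', 'count'])
print(df)
print(summary)
```

Here summary holds one row per group with the smallest id, the largest id, and the number of rows in that group.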
