Calculation within Pandas dataframe group

renjl0810

I have a Pandas DataFrame as shown below. What I'm trying to do is partition (group by) BlockID and LineID, with the words ordered by WordID, and then within each group compute current WordStartX - previous (WordStartX + WordWidth) to derive another column, e.g. WordDistance, which indicates the distance between this word and the previous word.

The post "Row operations within a group of a pandas dataframe" is very helpful, but in my case multiple columns are involved (WordStartX and WordWidth).

   BlockID  LineID  WordID  WordStartX  WordWidth     WordDistance
0        0       0       0         275        150                0
1        0       0       1         431         96   431-(275+150)=6
2        0       0       2         642         90   642-(431+96)=115
3        0       0       3         746        104   746-(642+90)=14
4        1       0       0         273         69         ...
5        1       0       1         352        151         ...
6        1       0       2         510         92
7        1       0       3         647         90
8        1       0       4         752        105

Psidom

The diff() and shift() functions are usually helpful for calculations that refer to previous or next rows:

df['WordDistance'] = (df.groupby(['BlockID', 'LineID'])
        .apply(lambda g: g['WordStartX'].diff() - g['WordWidth'].shift()).fillna(0).values)
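
For anyone who wants to try this end to end, here is a minimal self-contained sketch built from the sample rows in the question. It is a variant of the idea above, not the exact code from the answer: it calls shift() on the grouped columns directly instead of going through apply(), so the result stays aligned with the original index. The names grouped and prev_end are only illustrative, and the sort step assumes WordID gives the reading order within a line.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "BlockID":    [0, 0, 0, 0, 1, 1, 1, 1, 1],
    "LineID":     [0, 0, 0, 0, 0, 0, 0, 0, 0],
    "WordID":     [0, 1, 2, 3, 0, 1, 2, 3, 4],
    "WordStartX": [275, 431, 642, 746, 273, 352, 510, 647, 752],
    "WordWidth":  [150, 96, 90, 104, 69, 151, 92, 90, 105],
})

# Assumption: WordID is the reading order, so sort before shifting.
df = df.sort_values(["BlockID", "LineID", "WordID"])

grouped = df.groupby(["BlockID", "LineID"])

# Right edge of the previous word in the same (BlockID, LineID) group.
# shift() yields NaN for the first word of each group.
prev_end = grouped["WordStartX"].shift() + grouped["WordWidth"].shift()

# Gap between this word and the previous one; first word of a line gets 0.
df["WordDistance"] = (df["WordStartX"] - prev_end).fillna(0).astype(int)

print(df)
```

For the first line (BlockID 0) this gives 0, 6, 115, 14, which matches the hand-computed values in the question.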
