How can I calculate number of consecutive values in a column within a group in a pandas dataframe?

Joe

I have a dataframe containing every fight for each fighter, the fight number (i.e. whether it is their first, second, etc.), and whether or not they won the fight. I would like to calculate the number of consecutive wins a fighter had before their current fight (i.e. not counting the outcome of the current fight). I am working with Python 3.7 in Spyder.

Suppose we have the following dataframe, where win = 1 if the fighter won the fight:

import pandas as pd

df = pd.DataFrame({'fighter' : ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'C', 'C'],
                   'fight_number' : ['1', '2', '3', '4', '1', '2', '3', '1', '2'],
                   'win' : [0, 0, 1, 1, 1, 1, 0, 1, 1]})
  fighter  fight_number  win
0       A             1     0
1       A             2     0
2       A             3     1
3       A             4     1
4       B             1     1
5       B             2     1
6       B             3     0
7       C             1     1
8       C             2     1

I know that to calculate win streaks across all rows, I can adapt a previously proposed solution:

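# Label each run of identical consecutive values in 'win'
grouper = (df.win != df.win.shift()).cumsum()
# Select 'win' explicitly so that only that column is cumulatively summed
df['streak'] = df.groupby(grouper)['win'].cumsum()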

which produces:

  fighter fight_number  win  streak
0       A            1    0       0
1       A            2    0       0
2       A            3    1       1
3       A            4    1       2
4       B            1    1       3
5       B            2    1       4
6       B            3    0       0
7       C            1    1       1
8       C            2    1       2
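
To see why this works, it can help to look at the intermediate run labels. The following is a minimal sketch on the same win column; the names `changed` and `run_id` are illustrative choices and do not appear in the original code.

```python
import pandas as pd

df = pd.DataFrame({'fighter': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'C', 'C'],
                   'win': [0, 0, 1, 1, 1, 1, 0, 1, 1]})

# True wherever the value differs from the previous row
# (the first row always counts as a change because shift() yields NaN there)
changed = df['win'] != df['win'].shift()

# The cumulative sum of the change flags gives every run of equal values its own label
run_id = changed.cumsum()

# Summing 'win' cumulatively inside each run counts consecutive wins;
# runs of losses stay at 0 because they only add zeros
streak = df.groupby(run_id)['win'].cumsum()

print(pd.DataFrame({'win': df['win'], 'run_id': run_id, 'streak': streak}))
```

Because the run labels ignore the fighter column, the run of wins that starts with fighter A's last two fights continues into fighter B's first two, which is why the streak reaches 4 above.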

But what I need is to apply this approach to subgroups of the dataframe (i.e. to each fighter) and to exclude the outcome of the current fight from the streak count. In other words, each row should show the fighter's win streak going into that fight.

The target output in this example would therefore be:

  fighter fight_number  win  streak
0       A            1    0       0
1       A            2    0       0
2       A            3    1       0
3       A            4    1       1
4       B            1    1       0
5       B            2    1       1
6       B            3    0       2
7       C            1    1       0
8       C            2    1       1
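
For reference, the rule behind this target output can be written as a plain loop. This is only a sketch of the intended semantics, not an efficient solution; the dictionary name `running` is an illustrative choice.

```python
import pandas as pd

df = pd.DataFrame({'fighter': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'C', 'C'],
                   'win': [0, 0, 1, 1, 1, 1, 0, 1, 1]})

running = {}   # fighter -> consecutive wins going into their next fight
streak = []
for fighter, win in zip(df['fighter'], df['win']):
    # Record the streak before the current fight is taken into account
    streak.append(running.get(fighter, 0))
    # A win extends the streak, a loss resets it to zero
    running[fighter] = running.get(fighter, 0) + 1 if win else 0

df['streak'] = streak
print(df)
```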

I appreciate any advice I can get on this, as I am pretty new to Python.

Joe

One solution I came up with was inspired by an earlier answer posted (but deleted) by jezrael:

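# Label each run of identical consecutive values in 'win'
grouper = (df.win != df.win.shift()).cumsum()
# Count wins within each run, separately for each fighter
df['streak'] = df.groupby(['fighter', grouper])['win'].cumsum()
# Shift within each fighter so the current fight is excluded; a fighter's first fight gets 0
df['streak'] = df.groupby('fighter')['streak'].shift(1).fillna(0)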

which produces the target output:

  fighter fight_number  win  streak
0       A            1    0     0.0
1       A            2    0     0.0
2       A            3    1     0.0
3       A            4    1     1.0
4       B            1    1     0.0
5       B            2    1     1.0
6       B            3    0     2.0
7       C            1    1     0.0
8       C            2    1     1.0

and it also seems to work on other test examples:

df2 = pd.DataFrame({'fighter' : ['A', 'A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'C', 'C'],
                    'fight number' : ["1", "2", "3", "4", "5", "6", "1", "2", "3", "1", "2"],
                    'win' : [1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1]})

grouper = (df2.win != df2.win.shift()).cumsum()
df2['streak'] = df2.groupby(['fighter', grouper])['win'].cumsum()
df2['streak'] = df2.groupby('fighter')['streak'].shift(1).fillna(0)

   fighter fight number  win  streak
0        A            1    1     0.0
1        A            2    1     1.0
2        A            3    0     2.0
3        A            4    1     0.0
4        A            5    0     1.0
5        A            6    1     0.0
6        B            1    1     0.0
7        B            2    1     1.0
8        B            3    0     2.0
9        C            1    1     0.0
10       C            2    1     1.0
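
The same steps can be wrapped in a small helper. This is a sketch under two assumptions: each fighter's rows are contiguous and already in fight order, and the function name `streak_before_fight` and its parameters are hypothetical, chosen here for illustration. Passing `fill_value=0` to `shift` instead of calling `fillna(0)` avoids introducing NaN, so the result should stay integer rather than float in recent pandas versions.

```python
import pandas as pd

def streak_before_fight(frame, group_col='fighter', win_col='win'):
    """Consecutive wins immediately before each row, counted within each group."""
    # Label runs of identical outcomes; grouping by the fighter as well
    # ensures a run never spans two fighters
    run_id = (frame[win_col] != frame[win_col].shift()).cumsum()
    # Streak that still includes the current fight
    current = frame.groupby([group_col, run_id])[win_col].cumsum()
    # Shift by one fight within each fighter; the first fight gets 0
    return current.groupby(frame[group_col]).shift(1, fill_value=0)

df2 = pd.DataFrame({'fighter': ['A', 'A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'C', 'C'],
                    'win': [1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1]})
df2['streak'] = streak_before_fight(df2)
print(df2)
```

If the rows are not ordered this way, sorting by fighter and fight number first (after converting the fight number to an integer) would be needed before calling the helper.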
