How to calculate ratio of values in a pandas dataframe column?

Bankst

I'm new to pandas and decided to learn it by playing around with some data I pulled from my favorite game's API. I have a dataframe with two columns "playerId" and "winner" like so:

playerStatus:
______________________
   playerId   winner
0    1848      True
1    1988      False
2    3543      True
3    1848      False
4    1988      False
...

Each row represents a match the player participated in. My goal is to either transform this dataframe or create a new one such that the win percentage for each playerId is calculated. For example, the above dataframe would become:

playerWinsAndTotals
_________________________________________
   playerId   wins  totalPlayed   winPct
0    1848      1        2         50.0000
1    1988      0        2         0.0000
2    3543      1        1         100.0000
...

It took quite a while of reading pandas docs, but I actually managed to achieve this by essentially creating two different tables (one to find the number of wins for each player, one to find the total games for each player), and merging them, then taking the ratio of wins to games played.

Creating the "wins" dataframe:

temp_df = playerStatus[['playerId', 'winner']].value_counts().reset_index(name='wins')
onlyWins = temp_df[temp_df['winner'] == True][['playerId', 'wins']]
onlyWins
_________________________
    playerId    wins
1     1670       483
3     1748       474
4     2179       468
6     4006       434
8     1668       392
...

Creating the "totals" dataframe:

totalPlayed = playerStatus['playerId'].value_counts().reset_index(name='totalCount').rename(columns={'index': 'playerId'})
totalPlayed
____________________

   playerId   totalCount
0    1670        961
1    1748        919
2    1872        877
3    4006        839
4    2179        837
...

Finally, merging them and adding the "winPct" column.

playerWinsAndTotals = onlyWins.merge(totalPlayed, on='playerId', how='left')
playerWinsAndTotals['winPct'] = playerWinsAndTotals['wins']/playerWinsAndTotals['totalCount'] * 100
playerWinsAndTotals
_____________________________________________

   playerId   wins   totalCount     winPct
0    1670      483      961       50.260146
1    1748      474      919       51.577802
2    2179      468      837       55.913978
3    4006      434      839       51.728248
4    1668      392      712       55.056180
...

I'm posting this here because I know I'm not taking full advantage of what pandas has to offer. Creating and merging two different dataframes just to find the ratio of player wins seems unnecessary. I feel like I took the "scenic" route on this one.

To anyone more experienced than me, how would you tackle this problem?

Henry Ecker

We can take advantage of the fact that Boolean values behave as numbers (True is 1, False is 0) and apply three aggregation functions per group with groupby and agg: sum gives the number of wins, count gives the number of matches played, and mean gives the win ratio. Named aggregation lets us create and name the output columns in one step:

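# df holds the playerId/winner data (see "DataFrame and imports" at the end)
df = (
    df.groupby('playerId', as_index=False)
      .agg(wins=('winner', 'sum'),          # True counts as 1, so the sum is the number of wins
           totalCount=('winner', 'count'),  # non-null rows per player
           winPct=('winner', 'mean'))       # fraction of wins, between 0 and 1
)
# Scale the fraction up to a percentage
df['winPct'] *= 100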

df:

   playerId  wins  totalCount  winPct
0      1848     1           2    50.0
1      1988     0           2     0.0
2      3543     1           1   100.0
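
As a quick sanity check on why mean works here, the sketch below recomputes the three aggregates separately on the sample data and confirms that the mean of the Boolean column matches wins divided by matches played. The variable names (grouped, wins, total, win_pct) are only illustrative and are not part of the answer above.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'playerId': [1848, 1988, 3543, 1848, 1988],
    'winner': [True, False, True, False, False],
})

grouped = df.groupby('playerId')['winner']

wins = grouped.sum()            # True is treated as 1, False as 0
total = grouped.count()         # number of non-null rows per player
win_pct = grouped.mean() * 100  # mean of 0/1 values, scaled to a percentage

# The mean of a 0/1 column is the same ratio as wins / total
difference = (win_pct - wins / total * 100).abs().max()
assert difference < 1e-9

print(win_pct)
```

Because the mean already is the ratio, the separate wins and totalCount columns are only needed if you want to display them.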

DataFrame and imports:

import pandas as pd

df = pd.DataFrame({
    'playerId': [1848, 1988, 3543, 1848, 1988],
    'winner': [True, False, True, False, False]
})
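
One difference from the two-table approach in the question is worth noting: a left merge that starts from the wins-only table keeps only players with at least one win, so a player such as 1988 in the sample data would not appear in the result. The sketch below illustrates this on the sample data. It is a simplified stand-in for the question's code (it uses groupby(...).size() instead of value_counts()), and the names only_wins, total_played, left_from_wins and fixed are assumptions made for illustration.

```python
import pandas as pd

playerStatus = pd.DataFrame({
    'playerId': [1848, 1988, 3543, 1848, 1988],
    'winner': [True, False, True, False, False],
})

# Wins per player: only players with at least one True row appear here
only_wins = (
    playerStatus[playerStatus['winner']]
    .groupby('playerId').size()
    .reset_index(name='wins')
)

# Matches played per player: every player appears here
total_played = (
    playerStatus.groupby('playerId').size()
    .reset_index(name='totalCount')
)

# Merging from the wins table loses players without a win (1988)
left_from_wins = only_wins.merge(total_played, on='playerId', how='left')

# Merging from the totals table keeps everyone; missing wins become 0
fixed = (
    total_played.merge(only_wins, on='playerId', how='left')
    .fillna({'wins': 0})
    .astype({'wins': int})
)
fixed['winPct'] = fixed['wins'] / fixed['totalCount'] * 100

print(left_from_wins)
print(fixed)
```

The groupby version above does not have this problem, because sum simply returns 0 for a group that contains only False values.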
