Calculate sum of rows in pandas DataFrame grouped by date

hdries

I have a CSV file that I loaded into a pandas DataFrame.

I then select only the rows with duplicate dates in the DF:

df_dups = df[df.duplicated(['Date'])].copy()

I'm trying to get the sum of all the rows with the exact same date for 4 columns (all float values), like this:

df_sum = df_dups.groupby('Date')[["Received Quantity", "Sent Quantity", "Fee Amount", "Market Value"]].sum()

However, this does not give the desired result. When I examine the groups attribute of the groupby object, I notice that it does not include the first row for each date in the indices. So for two rows with the same date, there is only one index in the groups object.

pprint(df_dups.groupby('Date')[["Received Quantity", "Sent Quantity", "Fee Amount", "Market Value"]].groups)

I have no idea how to get the sum of all duplicates.

I've also tried:

df_sum = df_dups.groupby('Date')[["Received Quantity", "Sent Quantity", "Fee Amount", "Market Value"]].apply(lambda x: x.sum())

This gives the same result, which makes sense I guess, as the indices in the groupby object are not complete. What am I missing here?

Grinjero

Check the documentation for the method duplicated. By default (keep='first'), duplicates are marked with True except for the first occurrence, which is why the first row for each date is not included in your sums.

You only need to pass keep=False to duplicated to get your desired behaviour.

df_dups = df[df.duplicated(['Date'], keep=False)].copy()
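To see the difference between the two settings, here is a minimal sketch. The sample dates and values are made up for illustration and are not taken from the question's CSV.

```python
import pandas as pd

# Hypothetical sample data (not from the original CSV).
df = pd.DataFrame({
    "Date": ["2021-01-01", "2021-01-01", "2021-01-02", "2021-01-03", "2021-01-03"],
    "Fee Amount": [1.0, 2.0, 3.0, 4.0, 5.0],
})

# Default keep='first': the first occurrence of each date is NOT flagged.
mask_default = df.duplicated(["Date"])

# keep=False: every row whose date occurs more than once is flagged.
mask_all = df.duplicated(["Date"], keep=False)

print(mask_default.tolist())  # [False, True, False, False, True]
print(mask_all.tolist())      # [True, True, False, True, True]
```

With the default, only the second row of each duplicated date survives the filter, so each group ends up with one row too few.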

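After that, the sum can be calculated properly with the expression you wrote. Note that the columns should be selected with a list (double brackets), since indexing a groupby with a bare tuple of column names is no longer supported in recent pandas versions:

df_sum = df_dups.groupby('Date')[["Received Quantity", "Sent Quantity", "Fee Amount", "Market Value"]].apply(lambda x: x.sum())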
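Putting the pieces together, the following is a small end-to-end sketch under the assumption that the data looks roughly like the question describes. The column names come from the question, but the rows are invented for illustration. It uses .sum() directly, which should give the same result as the apply(lambda x: x.sum()) form and is the more direct way to write it.

```python
import pandas as pd

# Column names from the question; the rows below are hypothetical.
cols = ["Received Quantity", "Sent Quantity", "Fee Amount", "Market Value"]

df = pd.DataFrame({
    "Date": ["2021-01-01", "2021-01-01", "2021-01-02", "2021-01-03", "2021-01-03"],
    "Received Quantity": [1.0, 2.0, 3.0, 4.0, 5.0],
    "Sent Quantity": [0.5, 0.5, 1.0, 1.5, 2.5],
    "Fee Amount": [0.25, 0.25, 0.5, 0.5, 0.5],
    "Market Value": [10.0, 20.0, 30.0, 40.0, 50.0],
})

# Keep every row whose date appears more than once (including the first one).
df_dups = df[df.duplicated(["Date"], keep=False)].copy()

# Sum the four numeric columns per date; the list selects the columns.
df_sum = df_dups.groupby("Date")[cols].sum()
print(df_sum)
```

In this sample, 2021-01-02 occurs only once, so it is dropped by the filter, and the two remaining dates each sum over both of their rows.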
