Sum pandas dataframe column values based on condition of column name

Jimmy C

I have a DataFrame with column names of the form x.y, and I would like to sum all columns that share the same x without having to name them explicitly. That is, the value of column_name.split(".")[0] should determine each column's group. Here's an example:

import pandas as pd
df = pd.DataFrame({'x.1': [1,2,3,4], 'x.2': [5,4,3,2], 'y.8': [19,2,1,3], 'y.92': [10,9,2,4]})

df
Out[3]: 
   x.1  x.2  y.8  y.92
0    1    5   19    10
1    2    4    2     9
2    3    3    1     2
3    4    2    3     4

The result should match the following operation, except that I shouldn't have to list the column names and their groups explicitly.

pd.DataFrame({'x': df[['x.1', 'x.2']].sum(axis=1), 'y': df[['y.8', 'y.92']].sum(axis=1)})

   x   y
0  6  29
1  6  11
2  6   3
3  6   7

jezrael

You can first create a MultiIndex by splitting the column names, then group by the first level and aggregate with sum:

df.columns = df.columns.str.split('.', expand=True)
print (df)
   x      y    
   1  2   8  92
0  1  5  19  10
1  2  4   2   9
2  3  3   1   2
3  4  2   3   4

# groupby(axis=1) is deprecated in recent pandas releases, so group on the transpose instead
df = df.T.groupby(level=0).sum().T
print (df)
   x   y
0  6  29
1  6  11
2  6   3
3  6   7
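
If you would rather keep the original column names untouched, the grouping key can also be computed on the fly. The following is only a minimal sketch and not part of the original answer: the name `result` and the lambda are illustrative choices. It transposes the frame so that the column names become the index, groups by the part before the first dot, and transposes back. It also avoids the axis=1 argument of groupby, which newer pandas releases deprecate.

```python
import pandas as pd

df = pd.DataFrame({'x.1': [1, 2, 3, 4], 'x.2': [5, 4, 3, 2],
                   'y.8': [19, 2, 1, 3], 'y.92': [10, 9, 2, 4]})

# After transposing, each original column name is a row label.
# The function passed to groupby is called on every label, so
# 'x.1' and 'x.2' both map to 'x', and 'y.8' and 'y.92' map to 'y'.
# `result` is an illustrative name; df itself is left unchanged.
result = df.T.groupby(lambda name: name.split('.')[0]).sum().T

print(result)
```

Because df.columns is never reassigned here, the original x.1, x.2, ... columns remain available for later steps.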
