Sum of columns based on range of values of other columns in a Pandas dataframe


Amir

This is my dataframe:

import pandas as pd

df = pd.DataFrame({'sym': ['a', 'b'], 'vol_1': [100, 50], 'price_1': [5, 150], 'vol_2': [1500, 2000], 'price_2': [20, 175],
                   'vol_3': [123, 500], 'price_3': [22, 1000], 'min': [18, 150], 'max': [23, 176]})

I want to add a column that sums vol_1, vol_2 and vol_3 for each row, counting a volume only if its matching price (the column right after it) falls between the min and max columns, inclusive. For example, in the first row I want vol_2 and vol_3, because price_2 (20) and price_3 (22) are within the range 18 to 23, while price_1 (5) is not. My desired outcome looks like this:

  sym  vol_1  price_1  vol_2  price_2  vol_3  price_3  min  max  vol_sum
0   a    100        5   1500       20    123       22   18   23     1623
1   b     50      150   2000      175    500     1000  150  176     2050

sammywemmy

Reshape the data from wide to long with pd.wide_to_long so that vol, price, min and max each end up in a single column. Next, keep only the rows where price is between min and max (inclusive), group by the sym column, sum vol, and assign the result to df as vol_sum.

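df["vol_sum"] = (
    # Wide to long: one row per (sym, number) with single vol and price columns.
    pd.wide_to_long(df,
                    stubnames=["vol", "price"],
                    i=["sym", "min", "max"],
                    j="number",
                    sep="_")
    # Keep rows whose price lies within [min, max]; min and max are index levels here.
    .query("min <= price <= max", engine="python")
    # Total the remaining volume per sym.
    .groupby("sym")
    .vol
    .sum()
    # .array drops the index, so the values are assigned by position.
    .array
)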



  sym  vol_1  price_1  vol_2  price_2  vol_3  price_3  min  max  vol_sum
0   a    100        5   1500       20    123       22   18   23     1623
1   b     50      150   2000      175    500     1000  150  176     2050
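
One thing to watch with the snippet above: because it ends in .array, the sums are written back by position. That works for the sample data, but it relies on the rows of df being in the same order as the sorted sym groups, and on every sym having at least one price in range. A minimal sketch of a label-aligned variant follows. The names long_df, in_range and vol_by_sym are only illustrative, and filling unmatched syms with 0 is an assumption about what you would want in that case.

```python
import pandas as pd

df = pd.DataFrame({
    "sym": ["a", "b"],
    "vol_1": [100, 50], "price_1": [5, 150],
    "vol_2": [1500, 2000], "price_2": [20, 175],
    "vol_3": [123, 500], "price_3": [22, 1000],
    "min": [18, 150], "max": [23, 176],
})

# Reshape to one row per (sym, number) with plain vol/price/min/max columns.
long_df = pd.wide_to_long(
    df, stubnames=["vol", "price"], i=["sym", "min", "max"], j="number", sep="_"
).reset_index()

# Keep only rows whose price lies in [min, max], inclusive on both ends.
in_range = long_df[long_df["price"].between(long_df["min"], long_df["max"])]

# Sum per sym, then look each sym up by label instead of by position.
vol_by_sym = in_range.groupby("sym")["vol"].sum()

# Assumption: a sym with no in-range price should get 0 rather than NaN.
df["vol_sum"] = df["sym"].map(vol_by_sym).fillna(0).astype(int)

print(df[["sym", "vol_sum"]])
```

Mapping on sym keeps each total attached to its own row even if df is not sorted by sym. Like the original, this still assumes each (sym, min, max) combination identifies a single row, which pd.wide_to_long requires.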
