Sum one column over rows where another column's value is lower or equal (without a for loop)

Posted by debugcn in Dev

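The dataframe

I have a dataframe containing many items.

Each item is identified by a "type" code and a weight.

The last column gives the quantity.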

|-|------|------|---------|
| | type |weight|quantity |
|-|------|------|---------|
|0|100010|   3  |  456    |
|1|100010|   1  |  159    |
|2|100010|   5  |  735    |
|3|100024|   3  |  153    |
|4|100024|   7  |  175    |
|5|100024|   1  |  759    |
|-|------|------|---------|

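Compatibility rule

A given item "A" is "compatible" with another item if both conditions hold:

the other item has the same type
the other item's weight is equal to or lower than the weight of item "A"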
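
To make the rule concrete, here is a minimal sketch that expresses it as a boolean mask for a single reference row. The names `item_a`, `compatible` and `compatible_quantity` are illustrative only and do not appear in the original post. Note that the mask also matches item "A" itself, which is consistent with the expected totals.

```python
import pandas as pd

df = pd.DataFrame(
    [[100010, 3, 456], [100010, 1, 159], [100010, 5, 735],
     [100024, 3, 153], [100024, 7, 175], [100024, 1, 759]],
    columns=["type", "weight", "quantity"],
)

# Reference item "A": row 0 (type 100010, weight 3).
item_a = df.loc[0]

# Rule 1: same type. Rule 2: weight equal to or lower than A's weight.
compatible = (df["type"] == item_a["type"]) & (df["weight"] <= item_a["weight"])

# Total quantity of all rows compatible with item "A" (including A itself).
compatible_quantity = df.loc[compatible, "quantity"].sum()
print(df[compatible])
print(compatible_quantity)
```

For row 0 the mask selects rows 0 and 1, the two items of type 100010 weighing 3 or less.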

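Expected result

I want to add a "compatible quantity" column that gives, for each row, the total quantity of the items compatible with it.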

|-|------|------|---------|---------------------|
| | type |weight|quantity | compatible quantity |
|-|------|------|---------|---------------------|
|0|100010|   3  |  456    |        615          | 456 + 159
|1|100010|   1  |  159    |        159          | 159 only (the lightest items)
|2|100010|   5  |  735    |       1350          | 735 + 159 + 456 (the heaviest)   
|3|100024|   3  |  153    |        912          | 153 + 759
|4|100024|   7  |  175    |       1087          | ...
|5|100024|   1  |  759    |        759          | ...
|-|------|------|---------|---------------------|

I want to avoid using a for loop to get this result (the dataframe is large).

My code using a for loop

import pandas as pd 

df = pd.DataFrame([[100010, 3, 456],[100010, 1, 159],[100010, 5, 735], [100024, 3, 153], [100024, 7, 175], [100024, 1, 759]],columns = ["type", "weight", "quantity"])

print(df)

# Visit every row and sum the quantities of the rows compatible with it.
for inc in range(df["type"].count()):

    the_type = df["type"].iloc[inc]
    the_weight = df["weight"].iloc[inc]

    # Compatible rows: same type, weight equal to or lower than this row's weight.
    df.at[inc,"quantity_compatible"] = df.loc[(df["type"] == the_type) & (df["weight"] <= the_weight),"quantity"].sum()

print(df)

Some possible ideas

Would "apply" or "transform" help?
Can it be done by using loc inside loc?

奕奕

First sort the values by type and weight, then apply groupby and cumsum, and finally merge the result back on the index:

df = pd.DataFrame([[100010, 3, 456],[100010, 1, 159],[100010, 5, 735], [100024, 3, 153], [100024, 7, 175], [100024, 1, 759]],columns = ["type", "weight", "quantity"])

new_df = df.merge(df.sort_values(["type","weight"])
                  .groupby("type")["quantity"]
                  .cumsum(),left_index=True, right_index=True)

print (new_df)

#
     type  weight  quantity_x  quantity_y
0  100010       3         456         615
1  100010       1         159         159
2  100010       5         735        1350
3  100024       3         153         912
4  100024       7         175        1087
5  100024       1         759         759
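
As a possible variation on the approach above, the sketch below skips the merge and assigns the running total directly, relying on pandas aligning the assignment on the index. It also adds one step the original answer does not cover: if two rows of the same type share a weight, a plain cumsum gives them different totals, whereas the `<=` rule in the question gives them the same one. The helper name `add_compatible_quantity` is hypothetical, and the tie handling assumes quantities are never negative.

```python
import pandas as pd


def add_compatible_quantity(frame):
    """Return a copy of frame with a 'quantity_compatible' column (sketch)."""
    out = frame.copy()

    # Running total of quantity within each type, by increasing weight.
    running = (
        out.sort_values(["type", "weight"])
           .groupby("type")["quantity"]
           .cumsum()
    )

    # Assignment aligns on the index, so the original row order is kept.
    out["quantity_compatible"] = running

    # Tie handling (assumes non-negative quantities): rows sharing a type
    # and a weight all take the largest running total of their group.
    out["quantity_compatible"] = (
        out.groupby(["type", "weight"])["quantity_compatible"].transform("max")
    )
    return out


df = pd.DataFrame(
    [[100010, 3, 456], [100010, 1, 159], [100010, 5, 735],
     [100024, 3, 153], [100024, 7, 175], [100024, 1, 759]],
    columns=["type", "weight", "quantity"],
)

print(add_compatible_quantity(df))
```

On the sample data the weights are unique within each type, so the tie step changes nothing and the new column matches `quantity_y` above.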
