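I have a data frame that looks like the one below. I created 3 new columns that should take their values from the other columns. I want to split the Function column into separate columns and get the total hours worked on each function for each user.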
User   Function           Total hours  Damage Processing  problem solve  damages  sweeper
schae  Damage Processing  9.36
Julie  Problem solve      9.70
John   sweeper            18.9
Dan    Damages            1.83
Dan    Damages            1.83
Julie  Damages            1.83
Dan    Problem solve      1.83
The expected output looks like this:
User   Function           Total hours  Damage Processing  problem solve  damages  sweeper
schae  Damage Processing  9.36         9.36
Julie  Problem solve      9.70                            9.70
John   sweeper            18.9                                                    18.9
Dan    Damages            1.83                                           1.83
Dan    sweeper            1.83                                                    1.83
Julie  Damages            1.83                                           1.83
Dan    Problem solve      1.83                            1.83
I thought of pd.melt, but it throws an error saying the value var does not exist:
res = pd.melt(result, id_vars=['Function'], value_vars=['Total hours'])
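A note on the error itself: pd.melt raises a KeyError when a name listed in value_vars is not an actual column of the frame. The sketch below reproduces that under one assumed cause, a trailing space in the column name; the frame contents and the stray space are hypothetical, since the real `result` is not shown in the question. Even once the call succeeds, melt reshapes wide data into long form, which is the opposite direction from the one-column-per-function layout asked for here.

```python
import pandas as pd

# Hypothetical frame: the hours column is really named "Total hours " (trailing space).
result = pd.DataFrame({
    "Function": ["sweeper", "Damages"],
    "Total hours ": [18.9, 1.83],
})

# Asking melt for a column that does not exist raises KeyError.
try:
    pd.melt(result, id_vars=["Function"], value_vars=["Total hours"])
    melt_failed = False
except KeyError:
    melt_failed = True

# Inspecting and normalising the column names makes the same call work.
print(result.columns.tolist())
result.columns = result.columns.str.strip()
long_form = pd.melt(result, id_vars=["Function"], value_vars=["Total hours"])
print(long_form)
```

If the printed column list shows a different spelling than the one passed to value_vars, that mismatch is the likely source of the error.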
Here is a way to do it using get_dummies and df.assign:
import numpy as np
import pandas as pd
out = (df[['User', 'Function', 'Total hours']]
       .assign(**pd.get_dummies(df['Function']).mul(df['Total hours'], axis=0).replace(0, np.nan)))
print(out)
User Function Total hours Damage Processing Damages \
0 schae Damage Processing 9.36 9.36 NaN
1 Julie Problem solve 9.70 NaN NaN
2 John sweeper 18.90 NaN NaN
3 Dan Damages 1.83 NaN 1.83
4 Dan Damages 1.83 NaN 1.83
5 Julie Damages 1.83 NaN 1.83
6 Dan Problem solve 1.83 NaN NaN
Problem solve sweeper
0 NaN NaN
1 9.70 NaN
2 NaN 18.9
3 NaN NaN
4 NaN NaN
5 NaN NaN
6 1.83 NaN
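For anyone who wants to run this end to end, here is a minimal self-contained sketch that rebuilds the sample data from the question. It departs from the answer in two places, both of which are my own assumptions rather than part of the original: it masks with `where` instead of `replace(0, np.nan)`, so that a genuine 0-hour entry would not be blanked out, and it adds a `pivot_table` step for the "total hours for each function for each user" part of the question. The names `mask`, `hours` and `totals` are illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "User": ["schae", "Julie", "John", "Dan", "Dan", "Julie", "Dan"],
    "Function": ["Damage Processing", "Problem solve", "sweeper",
                 "Damages", "Damages", "Damages", "Problem solve"],
    "Total hours": [9.36, 9.70, 18.9, 1.83, 1.83, 1.83, 1.83],
})

# One indicator column per distinct Function value (True where the row matches).
mask = pd.get_dummies(df["Function"]).astype(bool)

# Put each row's hours into its own function column; every other cell becomes NaN.
hours = mask.astype(float).mul(df["Total hours"], axis=0).where(mask)
out = pd.concat([df, hours], axis=1)
print(out)

# Per-user totals: one row per user, one column per function, hours summed.
totals = df.pivot_table(index="User", columns="Function",
                        values="Total hours", aggfunc="sum")
print(totals)
```

The first frame keeps one row per original record, as in the answer above; the second collapses repeated user and function pairs, so Dan's two Damages rows are summed into a single cell.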