I have a pandas DataFrame that looks like this:
import pandas as pd

df = pd.DataFrame({
'job': ['football','football', 'football', 'basketball', 'basketball', 'basketball', 'hokey', 'hokey', 'hokey', 'football','football', 'football', 'basketball', 'basketball', 'basketball', 'hokey', 'hokey', 'hokey'],
'team': [4.0,5.0,9.0,2.0,3.0,6.0,1.0,7.0,8.0, 4.0,5.0,9.0,2.0,3.0,6.0,1.0,7.0,8.0],
'cluster': [0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1]
})
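Each cluster contains 9 teams: 3 teams for each of the sports football, basketball and hokey. I want to apply a shift to each cluster so that the teams appear in a very specific order (which I tried to highlight with colors).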
How can I perform this transformation (moving the rows in the way described above) on a much larger DataFrame?
Let's use groupby + cumcount to create a sequential counter based on the columns cluster and job, then use sort_values to sort the DataFrame by cluster and this counter:
df['j'] = df.groupby(['cluster', 'job']).cumcount()
df = df.sort_values(['cluster', 'j'], ignore_index=True).drop('j', axis=1)
           job  team  cluster
0     football   4.0        0
1   basketball   2.0        0
2        hokey   1.0        0
3     football   5.0        0
4   basketball   3.0        0
5        hokey   7.0        0
6     football   9.0        0
7   basketball   6.0        0
8        hokey   8.0        0
9     football   4.0        1
10  basketball   2.0        1
11       hokey   1.0        1
12    football   5.0        1
13  basketball   3.0        1
14       hokey   7.0        1
15    football   9.0        1
16  basketball   6.0        1
17       hokey   8.0        1
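For a larger DataFrame it may be convenient to wrap the same groupby + cumcount + sort_values idea in a small helper that leaves the input untouched. The following is only a sketch: the function name interleave_by_group, its parameters, and the temporary column name _counter are made up for illustration and do not come from the original answer. The sample data is cluster 0 from the question.

```python
import pandas as pd


def interleave_by_group(df, cluster_col="cluster", group_col="job"):
    """Return a reordered copy of df (hypothetical helper, not a pandas API).

    Within each cluster, rows are ordered by their position inside their
    (cluster, group) pair, so the 1st row of every group comes first,
    then the 2nd row of every group, and so on.
    """
    # 0, 1, 2, ... for each row within its (cluster, group) pair
    counter = df.groupby([cluster_col, group_col]).cumcount()
    return (
        df.assign(_counter=counter)  # temporary helper column
        .sort_values([cluster_col, "_counter"], ignore_index=True)
        .drop(columns="_counter")
    )


# Sample data: cluster 0 from the question
df = pd.DataFrame({
    "job": ["football"] * 3 + ["basketball"] * 3 + ["hokey"] * 3,
    "team": [4.0, 5.0, 9.0, 2.0, 3.0, 6.0, 1.0, 7.0, 8.0],
    "cluster": [0] * 9,
})

result = interleave_by_group(df)
print(result)
```

Because the counter is added with assign rather than by writing to df['j'], the original df keeps its columns and row order. Note that ignore_index in sort_values requires pandas 1.0 or later.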