Here is my df:
import pandas as pd

df = pd.DataFrame({'A': [1, 1, 1, 2],
                   'B': [2, 2, 2, 3],
                   'C': [3, 3, 3, 4],
                   'D': ['Cancer A', 'Cancer B', 'Cancer A', 'Cancer B'],
                   'E': ['Ecog 9', 'Ecog 1', 'Ecog 0', 'Ecog 1'],
                   'F': ['val 6', 'val 1', 'val 0', 'val 1'],
                   'measure_m': [100, 200, 500, 300]})
print(df)
   A  B  C         D       E      F  measure_m
0  1  2  3  Cancer A  Ecog 9  val 6        100
1  1  2  3  Cancer B  Ecog 1  val 1        200
2  1  2  3  Cancer A  Ecog 0  val 0        500
3  2  3  4  Cancer B  Ecog 1  val 1        300
When I pivot this df without specifying an index, I get the following:
In [1280]: df.pivot(index=None, columns = ['A', 'B', 'C', 'D', 'E', 'F'])
Out[1280]:
   measure_m
A          1                             2
B          2                             3
C          3                             4
D   Cancer A  Cancer B  Cancer A  Cancer B
E     Ecog 9    Ecog 1    Ecog 0    Ecog 1
F      val 6     val 1     val 0     val 1
0      100.0       NaN       NaN       NaN
1        NaN     200.0       NaN       NaN
2        NaN       NaN     500.0       NaN
3        NaN       NaN       NaN     300.0
Instead of 4 rows, I want just 1 row holding all the values of the measure_m column, like this:
   measure_m
A          1                             2
B          2                             3
C          3                             4
D   Cancer A  Cancer B  Cancer A  Cancer B
E     Ecog 9    Ecog 1    Ecog 0    Ecog 1
F      val 6     val 1     val 0     val 1
0      100.0     200.0     500.0     300.0
How can I do this?
Do you mean:
df.set_index(list(df.columns[:-1])).T
Output:
A                 1                             2
B                 2                             3
C                 3                             4
D          Cancer A  Cancer B  Cancer A  Cancer B
E            Ecog 9    Ecog 1    Ecog 0    Ecog 1
F             val 6     val 1     val 0     val 1
measure_m       100       200       500       300
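In case it helps, here is a self-contained sketch of the same idea that can be pasted into a fresh session. The names key_cols and wide are only illustrative, and it assumes, as in your example, that measure_m is the last column of df.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 1, 1, 2],
                   'B': [2, 2, 2, 3],
                   'C': [3, 3, 3, 4],
                   'D': ['Cancer A', 'Cancer B', 'Cancer A', 'Cancer B'],
                   'E': ['Ecog 9', 'Ecog 1', 'Ecog 0', 'Ecog 1'],
                   'F': ['val 6', 'val 1', 'val 0', 'val 1'],
                   'measure_m': [100, 200, 500, 300]})

# Assumption: every column except the last one is a key column.
key_cols = list(df.columns[:-1])

# Moving the key columns into the index leaves measure_m as the only
# column; transposing then turns it into the only row.
wide = df.set_index(key_cols).T

print(wide.shape)           # (1, 4)
print(wide.index.tolist())  # ['measure_m']
print(wide.columns.names)   # the six key columns, now column levels
```

Note that the single row is labelled measure_m here, not 0, and there is no measure_m level on top of the columns.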
Update: a few modifications to match your output:
cols = ['A', 'B', 'C', 'D', 'E', 'F']
(df.set_index(cols)
   [['measure_m']]        # only need this if you have more columns
   .unstack(level=cols)
   .to_frame().T
)
Output:
   measure_m
A          1                             2
B          2                             3
C          3                             4
D   Cancer A  Cancer B  Cancer A  Cancer B
E     Ecog 9    Ecog 1    Ecog 0    Ecog 1
F      val 6     val 1     val 0     val 1
0        100       200       500       300
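For anyone who wants to check the updated version step by step, here is a rough self-contained sketch. It only inspects the shape and the column levels instead of printing the whole frame. The names stacked and one_row are illustrative and not part of pandas.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 1, 1, 2],
                   'B': [2, 2, 2, 3],
                   'C': [3, 3, 3, 4],
                   'D': ['Cancer A', 'Cancer B', 'Cancer A', 'Cancer B'],
                   'E': ['Ecog 9', 'Ecog 1', 'Ecog 0', 'Ecog 1'],
                   'F': ['val 6', 'val 1', 'val 0', 'val 1'],
                   'measure_m': [100, 200, 500, 300]})

cols = ['A', 'B', 'C', 'D', 'E', 'F']

# Unstacking every index level leaves nothing to label the rows with,
# so the result is a Series keyed by (measure_m, A, B, C, D, E, F).
stacked = df.set_index(cols)[['measure_m']].unstack(level=cols)

# to_frame() makes it a one-column DataFrame; .T flips that into one row.
one_row = stacked.to_frame().T

print(type(stacked).__name__)   # Series
print(one_row.shape)            # (1, 4)
print(one_row.columns.nlevels)  # 7: measure_m on top, then A..F
```

Compared with the plain set_index(...).T version, this one puts measure_m on top of the columns and labels the single row 0, which is the layout asked for in the question.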