Duplicate rows in a DataFrame based on column values and output the column name


Silent Surfer

I have a DataFrame that looks like this:

import pandas as pd

data = [
    ['item 1', 'Some text', 0.0, 1, 0.25],
    ['item 2', 'Some other text', 0.5, 0.0, 0.0],
    ['item 3', 'Etc.', 0.0, 0.25, 0.0],
]

df = pd.DataFrame(data, columns=['item_name', 'description', 'class1', 'class2', 'class3'])

print(df)

  item_name      description  class1  class2  class3
0    item 1        Some text     0.0    1.00    0.25
1    item 2  Some other text     0.5    0.00    0.00
2    item 3             Etc.     0.0    0.25    0.00

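I would like to repeat each row once for every value greater than 0 found in columns class1 through class3, outputting item_name, description and the class name. The expected result is: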

  item_name      description    class
0    item 1        Some text   class2
1    item 1        Some text   class3
2    item 2  Some other text   class1
3    item 3             Etc.   class2

I managed to get some output that goes in the right direction using iterrows, but I can only access the class values, not their names:

data_transf = []
for index, row in df.iterrows():
    for col in row.loc['class1':'class3']:
        if col > 0:
            data_transf.append([row['item_name'], row['description'], col])

df_new = pd.DataFrame(data_transf, columns=['item_name', 'description', 'class'])

print(df_new)

  item_name      description  class
0    item 1        Some text   1.00
1    item 1        Some text   0.25
2    item 2  Some other text   0.50
3    item 3             Etc.   0.25

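The problem is that col is a float, and I can't find a way to access its index position in order to retrieve the class name. How can this be done? Maybe there is a more elegant way to do this with built-ins or a comprehension?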
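For reference, one way to keep the loop-based approach and still reach the column name is to iterate over the sliced row with Series.items(), which yields (label, value) pairs instead of bare values. This is only a sketch built on the sample data above; the names rows, class_name and value are chosen here for illustration.

```python
import pandas as pd

data = [
    ['item 1', 'Some text', 0.0, 1, 0.25],
    ['item 2', 'Some other text', 0.5, 0.0, 0.0],
    ['item 3', 'Etc.', 0.0, 0.25, 0.0],
]
df = pd.DataFrame(data, columns=['item_name', 'description', 'class1', 'class2', 'class3'])

# Series.items() yields (label, value) pairs, so the column name
# is available next to the float value.
rows = [
    [row['item_name'], row['description'], class_name]
    for _, row in df.iterrows()
    for class_name, value in row.loc['class1':'class3'].items()
    if value > 0
]

df_new = pd.DataFrame(rows, columns=['item_name', 'description', 'class'])
print(df_new)
```

Because the outer loop goes row by row, the result follows the original row order. iterrows is usually slow on large frames, so this is mainly useful for small data or for understanding what went wrong in the loop.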

Channel 3

An alternative using df.melt:

(df.melt(id_vars=['item_name', 'description'],var_name='class').
    query("value>0").drop(columns='value'))

  item_name      description   class
1    item 2  Some other text  class1
3    item 1        Some text  class2
5    item 3             Etc.  class2
6    item 1        Some text  class3
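Note that melt stacks the class columns one after another, so the rows come out grouped by class rather than by item, and the index (1, 3, 5, 6) is left over from the melted frame. If the exact ordering of the expected result matters, a possible variation (a sketch, not part of the answer above) is to carry the original row position through the melt and sort on it with a stable sort. The helper column name index comes from reset_index on an unnamed index; class_cols and result are names assumed here.

```python
import pandas as pd

data = [
    ['item 1', 'Some text', 0.0, 1, 0.25],
    ['item 2', 'Some other text', 0.5, 0.0, 0.0],
    ['item 3', 'Etc.', 0.0, 0.25, 0.0],
]
df = pd.DataFrame(data, columns=['item_name', 'description', 'class1', 'class2', 'class3'])

class_cols = ['class1', 'class2', 'class3']

result = (
    df.reset_index()  # original row position becomes a column named 'index'
      .melt(id_vars=['index', 'item_name', 'description'],
            value_vars=class_cols, var_name='class')
      .query('value > 0')                        # keep only positive class values
      .sort_values('index', kind='mergesort')    # stable: class order kept within each item
      .drop(columns=['index', 'value'])
      .reset_index(drop=True)                    # renumber rows 0..n-1
)
print(result)
```

The stable sort is what preserves the class1, class2, class3 order inside each item without relying on the alphabetical order of the column names.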
