Pandas wide to long, but with column values as new columns

Posted by debugcn in Dev

Ivan Parra

I need to convert the following data frame from wide to long:

    country_code    category    statistic       2000    2001    2002    2003    2004    2005    2006    2007    2008    2009    2010    2011    2012    2013    2014    2015
0   AFG Rural   Population using at least...    22.0    22.0    23.0    23.0    24.0    25.0    26.0    27.0    27.0    28.0    29.0    30.0    31.0    31.0    32.0    33.0
1   AFG Urban   Population using at least...    31.0    31.0    33.0    35.0    37.0    38.0    40.0    42.0    44.0    46.0    47.0    49.0    51.0    53.0    55.0    56.0
2   ARG Total   Population using at least...    24.0    24.0    25.0    26.0    27.0    28.0    29.0    30.0    31.0    32.0    34.0    35.0    36.0    37.0    38.0    39.0
3   ARG Total   Population using at least...    24.0    24.0    25.0    26.0    27.0    28.0    29.0    30.0    31.0    32.0    34.0    35.0    36.0    37.0    38.0    39.0
4   COL Total   Population using at least...    24.0    24.0    25.0    26.0    27.0    28.0    29.0    30.0    31.0    32.0    34.0    35.0    36.0    37.0    38.0    39.0

I need a new data frame that has country_code, category and year as columns, with each value of the statistic column turned into a new column, like this:

country_code  category year   Population using at least...  Population using safely...
AFG           Rural    2000   22.0                          31.0
AFG           Urban    2001   22.0                          31.0
ARG           Urban    2000   83.0                          80.0
COL           Rural    2000   75.0                          82.0

I have been trying melt, stack and other pandas functions, but I can't get it to work.

David Erickson

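You can melt the data frame like this, but you need to specify id_vars. Then set_index() puts the id columns and year into the index, so that the statistic level can be unstacked from rows into columns, as your output requires. Note that the statistic column of the sample data frame has only one unique value, but you will get several columns with your real data: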

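cols = ['country_code', 'category', 'statistic']
# value_name='' keeps the top column level empty, so the join below yields clean names
df = (df.melt(id_vars=cols, var_name='year', value_name='')  # year columns -> rows
        .set_index(cols+['year'])  # index levels: 0=country_code, 1=category, 2=statistic, 3=year
        .unstack(2)                # move level 2 (statistic) from rows to columns
        .reset_index())
df.columns = [''.join(col) for col in df.columns] # makes column names clean/single-level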
df
Out[1]: 
    country_code   category  year  Population using at least...
0              0  AFG Rural  2000                          22.0
1              0  AFG Rural  2001                          22.0
2              0  AFG Rural  2002                          23.0
3              0  AFG Rural  2003                          23.0
4              0  AFG Rural  2004                          24.0
..           ...        ...   ...                           ...
75             4  COL Total  2011                          35.0
76             4  COL Total  2012                          36.0
77             4  COL Total  2013                          37.0
78             4  COL Total  2014                          38.0
79             4  COL Total  2015                          39.0
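
Since the sample above contains only one statistic, here is a minimal self-contained sketch of the same melt / set_index / unstack idea with two statistics, so the effect on the columns is visible. The data frame, the names stat_a and stat_b, and the values are all made up for illustration and are not taken from the question's data.

```python
import pandas as pd

# Hypothetical wide data: two statistics per country/category, two year columns.
wide = pd.DataFrame({
    "country_code": ["AFG", "AFG", "AFG", "AFG"],
    "category": ["Rural", "Rural", "Urban", "Urban"],
    "statistic": ["stat_a", "stat_b", "stat_a", "stat_b"],
    "2000": [22.0, 10.0, 31.0, 15.0],
    "2001": [22.0, 11.0, 31.0, 16.0],
})

id_cols = ["country_code", "category", "statistic"]

tidy = (
    wide.melt(id_vars=id_cols, var_name="year", value_name="value")  # year columns -> rows
        .set_index(["country_code", "category", "year", "statistic"])["value"]
        .unstack("statistic")  # one column per distinct statistic
        .reset_index()
)
tidy.columns.name = None  # drop the leftover "statistic" label on the column axis

print(tidy)
```

Here the statistic level is unstacked by name rather than by position, which avoids having to count index levels. If the combination of id columns and year is not unique (the question's sample repeats the ARG Total row), unstack raises a ValueError about duplicate entries; calling drop_duplicates() on the data frame first is one way around that.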