I have two pandas DataFrames named df1 and df2, which are defined as follow:
>> df1 = pd.DataFrame( {
'id': ['A', 'A', 'A', 'B', 'B', 'C', 'C'],
'year': [2015, 2016, 2017, 2016, 2017, 2015, 2017],
'amount': [1, 2, 3, 4, 5, 6, 7] } )
>> df2 = pd.DataFrame( {
'id': ['A', 'B', 'C'],
'2015': [1, np.NaN, 6],
'2016': [2, 4, np.NaN],
'2017': [3, 5, 7] } )
>> df1
amount id year
0 1 A 2015
1 2 A 2016
2 3 A 2017
3 4 B 2016
4 5 B 2017
5 6 C 2015
6 7 C 2017
>> df2
2015 2016 2017 id
0 1.0 2.0 3 A
1 NaN 4.0 5 B
2 6.0 NaN 7 C
As you can see, these two DataFrames contain the same information. DataFrame df1 is a triplet while df2 is a DataFrame with no duplicate values in its field id.
My question is how can I use pandas to convert df1 to df2 in a effective way ( without a for loop ) ? And how about df2 to df1?
You can pivot df1 to get df2 like this:
pd.pivot_table(df1, index='id', columns='year', values = 'amount')
year 2015 2016 2017
id
A 1 2 3
B NaN 4 5
C 6 NaN 7
And melt to do the reverse
pd.melt(df2, id_vars=["id"],var_name="year", value_name="amount").dropna()
id year amount
0 A 2015 1
2 C 2015 6
3 A 2016 2
4 B 2016 4
6 A 2017 3
7 B 2017 5
8 C 2017 7
Collected from the Internet
Please contact [email protected] to delete if infringement.
Comments