If I want to create a new DataFrame with several columns, I can add all the columns at once -- for example, as follows:
data = {'col_1': [0, 1, 2, 3],
'col_2': [4, 5, 6, 7]}
df = pd.DataFrame(data)
But now suppose farther down the road I want to add a set of additional columns to this DataFrame. Is there a way to add them all simultaneously, as in
additional_data = {'col_3': [8, 9, 10, 11],
'col_4': [12, 13, 14, 15]}
#Below is a made-up function of the kind I desire.
df.add_data(additional_data)
I'm aware I could do this:
for key, value in additional_data.iteritems():
df[key] = value
Or this:
df2 = pd.DataFrame(additional_data, index=df.index)
df = pd.merge(df, df2, on=df.index)
I was just hoping for something cleaner. If I'm stuck with these two options, which is preferred?
Pandas has assign
method since 0.16.0
. You could use it on dataframes like
In [1506]: df1.assign(**df2)
Out[1506]:
col_1 col_2 col_3 col_4
0 0 4 8 12
1 1 5 9 13
2 2 6 10 14
3 3 7 11 15
or, you could directly use the dictionary like
In [1507]: df1.assign(**additional_data)
Out[1507]:
col_1 col_2 col_3 col_4
0 0 4 8 12
1 1 5 9 13
2 2 6 10 14
3 3 7 11 15
Collected from the Internet
Please contact [email protected] to delete if infringement.
Comments