I'm new to pandas so excuse me if I sound too naive. I have two dataframes df1 and df2,
df1 = pd.DataFrame({'key1': ['K0', 'K1', 'K2', 'K3'],
'key2': ['K5', 'K4', 'K5', 'K4']})
df2 = pd.DataFrame({'key1': ['K0', 'K1', 'K2', 'K3', 'K9', 'K8', 'K7'],
'key2': ['K5', 'K6', 'K5', 'K4', 'K6', 'K4', 'K5'],
'A':['1', '2', '3', '4', '5', '6', '7'],
'B':['8', '9', '10', '11', '12', '13', '14']})
I'd like to merge df2 on df1 like
final = df1.merge(df2, on=['key1', 'key2'], how='left')
and then have the leftover values in df2 as one dataframe.
Any help would be appreciated. Thanks.
IIUC you need outer join with parameter indicator
, then split by boolean indexing
:
final = df1.merge(df2, how='outer', indicator=True)
print (final)
key1 key2 A B _merge
0 K0 K5 1 8 both
1 K1 K4 NaN NaN left_only
2 K2 K5 3 10 both
3 K3 K4 4 11 both
4 K1 K6 2 9 right_only
5 K9 K6 5 12 right_only
6 K8 K4 6 13 right_only
7 K7 K5 7 14 right_only
print (final[final._merge == 'right_only'])
key1 key2 A B _merge
4 K1 K6 2 9 right_only
5 K9 K6 5 12 right_only
6 K8 K4 6 13 right_only
7 K7 K5 7 14 right_only
print (final[final._merge != 'right_only'])
key1 key2 A B _merge
0 K0 K5 1 8 both
1 K1 K4 NaN NaN left_only
2 K2 K5 3 10 both
3 K3 K4 4 11 both
print (final[final._merge == 'right_only'].drop('_merge', axis=1))
key1 key2 A B
4 K1 K6 2 9
5 K9 K6 5 12
6 K8 K4 6 13
7 K7 K5 7 14
print (final[final._merge != 'right_only'].drop('_merge', axis=1))
key1 key2 A B
0 K0 K5 1 8
1 K1 K4 NaN NaN
2 K2 K5 3 10
3 K3 K4 4 11
Collected from the Internet
Please contact [email protected] to delete if infringement.
Comments