pandas dataframe split on condition

Harshit saxena

I'm new to pandas, so excuse me if this sounds naive. I have two dataframes, df1 and df2:

import pandas as pd

df1 = pd.DataFrame({'key1': ['K0', 'K1', 'K2', 'K3'],
                    'key2': ['K5', 'K4', 'K5', 'K4']})

df2 = pd.DataFrame({'key1': ['K0', 'K1', 'K2', 'K3', 'K9', 'K8', 'K7'],
                    'key2': ['K5', 'K6', 'K5', 'K4', 'K6', 'K4', 'K5'],
                    'A': ['1', '2', '3', '4', '5', '6', '7'],
                    'B': ['8', '9', '10', '11', '12', '13', '14']})

I'd like to merge df2 onto df1 like this:

final = df1.merge(df2, on=['key1', 'key2'], how='left')

and then get the leftover rows of df2 (the ones with no match in df1) as a separate dataframe.

Any help would be appreciated. Thanks.

jezrael

IIUC, you need an outer join with the indicator parameter, then split the result by boolean indexing on the _merge column:

final = df1.merge(df2, how='outer', indicator=True)
print(final)
  key1 key2    A    B      _merge
0   K0   K5    1    8        both
1   K1   K4  NaN  NaN   left_only
2   K2   K5    3   10        both
3   K3   K4    4   11        both
4   K1   K6    2    9  right_only
5   K9   K6    5   12  right_only
6   K8   K4    6   13  right_only
7   K7   K5    7   14  right_only

print(final[final._merge == 'right_only'])
  key1 key2  A   B      _merge
4   K1   K6  2   9  right_only
5   K9   K6  5  12  right_only
6   K8   K4  6  13  right_only
7   K7   K5  7  14  right_only

print(final[final._merge != 'right_only'])
  key1 key2    A    B     _merge
0   K0   K5    1    8       both
1   K1   K4  NaN  NaN  left_only
2   K2   K5    3   10       both
3   K3   K4    4   11       both

print(final[final._merge == 'right_only'].drop('_merge', axis=1))
  key1 key2  A   B
4   K1   K6  2   9
5   K9   K6  5  12
6   K8   K4  6  13
7   K7   K5  7  14

print(final[final._merge != 'right_only'].drop('_merge', axis=1))
  key1 key2    A    B
0   K0   K5    1    8
1   K1   K4  NaN  NaN
2   K2   K5    3   10
3   K3   K4    4   11
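
For reference, the steps above can be collected into one self-contained script. This is only a minimal sketch, assuming the two sample frames from the question. The names merged, right_only and leftover are illustrative and not from the original post, and on=['key1', 'key2'] is spelled out here even though the merge above relies on the default of joining on the shared columns.

```python
import pandas as pd

# Sample frames copied from the question
df1 = pd.DataFrame({'key1': ['K0', 'K1', 'K2', 'K3'],
                    'key2': ['K5', 'K4', 'K5', 'K4']})

df2 = pd.DataFrame({'key1': ['K0', 'K1', 'K2', 'K3', 'K9', 'K8', 'K7'],
                    'key2': ['K5', 'K6', 'K5', 'K4', 'K6', 'K4', 'K5'],
                    'A': ['1', '2', '3', '4', '5', '6', '7'],
                    'B': ['8', '9', '10', '11', '12', '13', '14']})

# Outer join on both key columns; indicator=True adds a '_merge' column
# whose values are 'both', 'left_only' or 'right_only'
merged = df1.merge(df2, on=['key1', 'key2'], how='outer', indicator=True)

# Boolean mask (illustrative name): True for rows whose keys exist only in df2
right_only = merged['_merge'] == 'right_only'

# Rows of df2 with no match in df1 (the "leftover" frame)
leftover = merged[right_only].drop(columns='_merge').reset_index(drop=True)

# Everything else: the same rows a left merge of df1 with df2 would give
final = merged[~right_only].drop(columns='_merge').reset_index(drop=True)

print(final)
print(leftover)
```

The row order of an outer merge may differ between pandas versions, so sort on the key columns if a particular order matters.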
