R: merge two data frames when either of two criteria matches

Say I have two dataframes like the following:

n = c(2, 3, 5, 5, 6, 7) 
s = c("aa", "bb", "cc", "dd", "ee", "ff") 
b = c(2, 4, 5, 4, 3, 2) 
df = data.frame(n, s, b)
#  n  s b
#1 2 aa 2
#2 3 bb 4
#3 5 cc 5  
#4 5 dd 4
#5 6 ee 3
#6 7 ff 2

n2 = c(5, 6, 7, 6) 
s2 = c("aa", "bb", "cc", "ll") 
b2 = c("hh", "nn", "ff", "dd")  
df2 = data.frame(n2, s2, b2)

 #   n2 s2 b2
 #1  5 aa hh
 #2  6 bb nn
 #3  7 cc ff
 #4  6 ll dd

I want to merge them to achieve the following result:

 #n s  b n2 s2 b2
 #2 aa 2 5  aa hh
 #3 bb 4 6  bb nn
 #5 cc 5 7  cc ff
 #5 dd 4 6  ll dd

Basically, I want to merge the two data frames wherever a value in column s of df is found in either the s2 or the b2 column of df2.

I know that merge works when I specify one column from each data frame, but I am not sure how to add an OR condition to the merge call, or how to achieve this with functions from other packages such as dplyr.

Also, to clarify: there will be cases where both s2 and b2 in the same row of df2 match values in the s column. In that case, merge that row only once.

IRTFM

A couple of problems: 1) you have built both data frames with factors, which tends to break matching and indexing, so I used stringsAsFactors = FALSE in the data.frame calls (this is the default from R 4.0.0 onward). 2) you have an ambiguous situation with no stated resolution when both s2 and b2 have matches in the s column (as does occur in your example, where pmax picks the later match and pmin the earlier one):

> df2[c("s")] <- list( c( df$s[pmax( match( df2$s2 , df$s), match(df2$b2, df$s),na.rm=TRUE)]))
> df2
  n2 s2 b2  s
1  5 aa hh aa
2  6 bb nn bb
3  7 cc ff ff
4  6 ll dd dd
> df2[c("s")] <- list( c( df$s[pmin( match( df2$s2 , df$s), match(df2$b2, df$s),na.rm=TRUE)]))
> df2
  n2 s2 b2  s
1  5 aa hh aa
2  6 bb nn bb
3  7 cc ff cc
4  6 ll dd dd
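
To see where the two results diverge, it can help to look at the two match() vectors on their own. The following is a minimal sketch on the example data; the names i_s2, i_b2 and idx are made up for illustration. It also tries one explicit rule, preferring the s2 match and falling back to b2, which is an assumption about what the question intends rather than something it states.

```r
# Example data from the question, built without factors.
df  <- data.frame(n = c(2, 3, 5, 5, 6, 7),
                  s = c("aa", "bb", "cc", "dd", "ee", "ff"),
                  b = c(2, 4, 5, 4, 3, 2),
                  stringsAsFactors = FALSE)
df2 <- data.frame(n2 = c(5, 6, 7, 6),
                  s2 = c("aa", "bb", "cc", "ll"),
                  b2 = c("hh", "nn", "ff", "dd"),
                  stringsAsFactors = FALSE)

# Row positions in df matched by each candidate column (NA = no match).
i_s2 <- match(df2$s2, df$s)   # 1 2 3 NA
i_b2 <- match(df2$b2, df$s)   # NA NA 6 4

# Assumed rule: use the s2 match when there is one, otherwise the b2 match.
idx <- ifelse(is.na(i_s2), i_b2, i_s2)   # 1 2 3 4

df$s[idx]   # "aa" "bb" "cc" "dd"
```

Only the third row of df2 is ambiguous: s2 points at row 3 of df and b2 at row 6. On this data pmin gives the same answer as the "prefer s2" rule, but only because "cc" happens to sit above "ff" in df; with a different row order the two rules could disagree.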

Once you have resolved the ambiguity to your satisfaction, use the same method to extract the matching b values:

> df2[c("b")] <- list( c( df$b[pmin( match( df2$s2 , df$s), match(df2$b2, df$s),na.rm=TRUE)]))
> df2
  n2 s2 b2  s b
1  5 aa hh aa 2
2  6 bb nn bb 4
3  7 cc ff cc 5
4  6 ll dd dd 4

The modified data frames:

> dput(df)
structure(list(n = c(2, 3, 5, 5, 6, 7), s = c("aa", "bb", "cc", 
"dd", "ee", "ff"), b = c(2, 4, 5, 4, 3, 2)), .Names = c("n", 
"s", "b"), row.names = c(NA, -6L), class = "data.frame")
> dput(df2)
structure(list(n2 = c(5, 6, 7, 6), s2 = c("aa", "bb", "cc", "ll"
), b2 = c("hh", "nn", "ff", "dd"), s = c("aa", "bb", "cc", "dd"
), b = c(2, 4, 5, 4)), row.names = c(NA, -4L), .Names = c("n2", 
"s2", "b2", "s", "b"), class = "data.frame")

One-step solution (starting again from the original three-column df2; the b values land in a column named c):

> df2[c("s", "c")] <-  df[pmin( match( df2$s2 , df$s), match(df2$b2, df$s),na.rm=TRUE), c("s", "b")]
> df2
  n2 s2 b2  s c
1  5 aa hh aa 2
2  6 bb nn bb 4
3  7 cc ff cc 5
4  6 ll dd dd 4
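
If the goal is the exact column layout asked for in the question (n, s, b, n2, s2, b2), the same index can be used to bind whole rows together. This is a sketch under the pmin resolution used above, not a general OR-join; the names idx, keep and res are illustrative.

```r
# Example data from the question, built without factors.
df  <- data.frame(n = c(2, 3, 5, 5, 6, 7),
                  s = c("aa", "bb", "cc", "dd", "ee", "ff"),
                  b = c(2, 4, 5, 4, 3, 2),
                  stringsAsFactors = FALSE)
df2 <- data.frame(n2 = c(5, 6, 7, 6),
                  s2 = c("aa", "bb", "cc", "ll"),
                  b2 = c("hh", "nn", "ff", "dd"),
                  stringsAsFactors = FALSE)

# For each row of df2: position of the matching row in df,
# taking the earlier position when both s2 and b2 match.
idx <- pmin(match(df2$s2, df$s), match(df2$b2, df$s), na.rm = TRUE)

# Rows of df2 with no match in either column come back as NA; drop them.
keep <- !is.na(idx)

# Put the matched df rows next to the corresponding df2 rows.
res <- cbind(df[idx[keep], ], df2[keep, ])
rownames(res) <- NULL
res
#   n  s b n2 s2 b2
# 1 2 aa 2  5 aa hh
# 2 3 bb 4  6 bb nn
# 3 5 cc 5  7 cc ff
# 4 5 dd 4  6 ll dd
```

Two limits are worth keeping in mind: match() returns only the first hit, so duplicate values in df$s are not expanded into several rows, and each row of df2 contributes at most one row to the result.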

Edited at 2021-03-3
