How to merge create a column in R that only take one value based on update date

Ames Published at Dev

Ames

I am new R and have a matrix below:

earning

0  Name     usd/day  update_year  usd/day update_year
1  John     52.0     2011         NA       NA
2  Mary     44.0     2012         NA       NA
3  Nicole   44.5     2000 est.    49.2     2010 est.
4  Cynthia   38.1    2001 est.    44.0     2011

I wanted to clean up the data in R, to only have 3 columns - Name, usd/day and update_year and get something like this:

0  Name    usd/day  update_year
1  John    52.0     2011
2  Mary    44.0     2012
3  Nicole  49.2     2010
4  Cynthia 44.0     2011

How do I go about doing that in R please?

I don't want to manually combine them and the actual data is more than 100 rows.

milan

This should work. It seems that you want to extract the most recent date (i.e., year) and the highest value for USD. There's a couple of things you need to do.

First, only keep the year in update_year; it seems that you don't want ' est.' in your final table. We can do that using gsub.

df$update_year.x <- gsub("[^0-9]", "", df$update_year.x)
df$update_year.y <- gsub("[^0-9]", "", df$update_year.y)

Find the most recent year.

df$update_year <- apply(df[, c(4,6)], 1, max, na.rm=TRUE)

Find the highest USD value.

df$usd.day <- apply(df[, c(3,5)], 1, max, na.rm=TRUE)

Keep relevant columns.

df[, c("Name", "usd.day", "update_year")]

#     Name usd.day update_year
#1    John    52.0        2011
#2    Mary    44.0        2012
#3  Nicole    49.2        2010
#4 Cynthia    44.0        2011

Data

df <- read.table(text="
X0  Name     usd/day.x  update_year.x  usd/day.y update_year.y
1  John     52.0     2011         NA       NA
2  Mary     44.0     2012         NA       NA
3  Nicole   44.5     '2000 est.'    49.2     '2010 est.'
4  Cynthia   38.1    '2001 est.'    44.0     2011", header=TRUE,fill=TRUE,stringsAsFactors=FALSE)

As pointed out in the comments to your answer; there are duplicate column names and that's a problem. I solved that here by adding x/y at the end of the names.

Collected from the Internet

Please contact [email protected] to delete if infringement.

edited at2021-03-3

Comments

0 comments

From Dev

How do I need to modify this MERGE statement, for a mass update of one field based on the value of a field in another table?

From Dev

Using SQL how can I update field values of one column to match a different column's field value only if the values do NOT equal?

From Dev

Related Related

Article