How can I find and replace values between two dataframes in R?


jerH

I have a dataframe from tidytext that contains the individual words from some survey free-response comments. It has just shy of 500,000 rows. Being free-response data, it is riddled with typos. Using textclean::replace_misspellings took care of almost 13,000 misspelled words, but there were still ~700 unique misspellings that I manually identified.

I now have a second table with two columns: the first holds the misspelling and the second holds the correction.

For instance:

allComments <- data.frame("Number" = 1:5, "Word" = c("organization","orginization", "oragnization", "help", "hlp"))
misspellings <- data.frame("Wrong" = c("orginization", "oragnization", "hlp"), "Right" = c("organization", "organization", "help"))

How can I replace all the values of allComments$Word that match misspellings$Wrong with the corresponding value of misspellings$Right?

I feel like this is probably pretty basic and my R ignorance is showing....

GKi

You can use match to find, for each word in allComments$Word, its position in misspellings$Wrong (NA when there is no match), and then use that index to pull the corresponding corrections from misspellings$Right.

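# Position of each word in misspellings$Wrong (NA if the word is not a known misspelling)
tt <- match(allComments$Word, misspellings$Wrong)
# Overwrite only the matched words with their corrections
allComments$Word[!is.na(tt)] <- misspellings$Right[tt[!is.na(tt)]]
allComments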
#  Number         Word
#1      1 organization
#2      2 organization
#3      3 organization
#4      4         help
#5      5         help
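
The same match-based lookup can be wrapped in a small function so it is easy to reuse on other columns. This is only a sketch: the name replace_from_lookup is hypothetical (it does not come from any package), and the as.character() calls are an added assumption to keep the function safe when the inputs are factors.

```r
# Hypothetical helper: replace each element of `x` that appears in `wrong`
# with the element of `right` at the same position.
replace_from_lookup <- function(x, wrong, right) {
  x <- as.character(x)          # avoid factor-level problems
  right <- as.character(right)
  idx <- match(x, wrong)        # NA where x is not a known misspelling
  found <- !is.na(idx)
  x[found] <- right[idx[found]]
  x
}

allComments <- data.frame("Number" = 1:5, "Word" = c("organization", "orginization", "oragnization", "help", "hlp"))
misspellings <- data.frame("Wrong" = c("orginization", "oragnization", "hlp"), "Right" = c("organization", "organization", "help"))

allComments$Word <- replace_from_lookup(allComments$Word, misspellings$Wrong, misspellings$Right)
```

Because match is vectorised, this stays a single pass over the column rather than a loop over the ~700 corrections.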

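If allComments$Word is a factor (the default for data.frame() before R 4.0.0) and a corrected word is not already one of its levels, the assignment produces NA with a warning. In that case, convert the column to character before doing the replacement: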

allComments$Word <- as.character(allComments$Word)
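
To illustrate why the conversion matters, here is a small sketch with made-up data (the data frames comments and fixes are hypothetical, not the asker's). It forces factor columns with stringsAsFactors = TRUE, and it converts the replacement column as well, which is an extra precaution not stated in the answer.

```r
# Assigning a value that is not an existing level of a factor yields NA
f <- factor(c("hlp", "thx"))
suppressWarnings(f[1] <- "help")
is.na(f[1])

# Hypothetical data with factor columns, as data.frame() created before R 4.0.0
comments <- data.frame(Word = c("hlp", "thx"), stringsAsFactors = TRUE)
fixes <- data.frame(Wrong = c("hlp", "thx"), Right = c("help", "thanks"),
                    stringsAsFactors = TRUE)

# Convert to character first, then replace via match as above
comments$Word <- as.character(comments$Word)
fixes$Right <- as.character(fixes$Right)
idx <- match(comments$Word, fixes$Wrong)
comments$Word[!is.na(idx)] <- fixes$Right[idx[!is.na(idx)]]
comments$Word
```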


Edited 2021-06-12
