Convert written number to number in R

Henk

Does anybody know of a function that converts a text representation of a number into an actual number, e.g. 'twenty thousand three hundred and five' into 20305? I have written-out numbers in the rows of a data frame and want to convert them to numeric values.

The qdap package can replace numerals with words (e.g., 1001 becomes one thousand one), but not the other way around:

library(qdap)
replace_number("I like 346457 ice cream cones.")
[1] "I like three hundred forty six thousand four hundred fifty seven ice cream cones."

Thomas

Here's a start that should get you to hundreds of thousands.

word2num <- function(word){
    # Split into lower-case words; drop the filler word "and" so that
    # inputs like "three hundred and five" are not miscounted
    wsplit <- strsplit(tolower(word)," ")[[1]]
    wsplit <- wsplit[wsplit != "and"]
    # Lookup tables for single digits, teens and multiples of ten
    one_digits <- list(zero=0, one=1, two=2, three=3, four=4, five=5,
                       six=6, seven=7, eight=8, nine=9)
    teens <- list(eleven=11, twelve=12, thirteen=13, fourteen=14, fifteen=15,
                  sixteen=16, seventeen=17, eighteen=18, nineteen=19)
    ten_digits <- list(ten=10, twenty=20, thirty=30, forty=40, fifty=50,
                       sixty=60, seventy=70, eighty=80, ninety=90)
    doubles <- c(teens,ten_digits)
    out <- 0
    i <- 1
    while(i <= length(wsplit)){
        j <- 1  # number of words consumed in this step
        # Value of the current word
        if(i==1 && wsplit[i]=="hundred")
            temp <- 100
        else if(i==1 && wsplit[i]=="thousand")
            temp <- 1000
        else if(wsplit[i] %in% names(one_digits))
            temp <- as.numeric(one_digits[wsplit[i]])
        else if(wsplit[i] %in% names(teens))
            temp <- as.numeric(teens[wsplit[i]])
        else if(wsplit[i] %in% names(ten_digits))
            temp <- (as.numeric(ten_digits[wsplit[i]]))
        # Look ahead one word to decide how to scale the value
        if(i < length(wsplit) && wsplit[i+1]=="hundred"){
            if(i>1 && wsplit[i-1] %in% c("hundred","thousand"))
                out <- out + 100*temp
            else
                out <- 100*(out + temp)
            j <- 2  # skip the "hundred" that was just used
        }
        else if(i < length(wsplit) && wsplit[i+1]=="thousand"){
            if(i>1 && wsplit[i-1] %in% c("hundred","thousand"))
                out <- out + 1000*temp
            else
                out <- 1000*(out + temp)
            j <- 2  # skip the "thousand" that was just used
        }
        else if(i < length(wsplit) && wsplit[i+1] %in% names(doubles)){
            # Shorthand such as "four fifty seven": treat "four" as 400
            temp <- temp*100
            out <- out + temp
        }
        else{
            out <- out + temp
        }
        i <- i + j
    }
    # Return the input string together with its numeric value
    return(list(word,out))
}

Results:

> word2num("fifty seven")
[[1]]
[1] "fifty seven"

[[2]]
[1] 57

> word2num("four fifty seven")
[[1]]
[1] "four fifty seven"

[[2]]
[1] 457

> word2num("six thousand four fifty seven")
[[1]]
[1] "six thousand four fifty seven"

[[2]]
[1] 6457

> word2num("forty six thousand four fifty seven")
[[1]]
[1] "forty six thousand four fifty seven"

[[2]]
[1] 46457

> word2num("forty six thousand four hundred fifty seven")
[[1]]
[1] "forty six thousand four hundred fifty seven"

[[2]]
[1] 46457

> word2num("three forty six thousand four hundred fifty seven")
[[1]]
[1] "three forty six thousand four hundred fifty seven"

[[2]]
[1] 346457

I can already tell you that this won't work for word2num("four hundred thousand fifty"), because it doesn't know how to handle consecutive "hundred" and "thousand" terms, but the algorithm can probably be modified. Anyone should feel free to edit this with improvements or build on it in their own answer. I just thought this was a fun problem to play with (for a little while).
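One possible way to cope with consecutive "hundred" and "thousand" terms is to keep two running values: the group currently being read and the total of the groups already closed by a scale word. The following is only a minimal sketch of that idea, not a modification of word2num; the function name words_to_number and its lookup tables are made up for illustration, and it assumes fully spelled-out numbers (it does not understand shorthand such as "four fifty seven").

```r
# Hypothetical helper: accumulate a group total ("current") and fold it
# into "total" whenever a scale word (thousand, million, ...) is seen.
words_to_number <- function(text) {
    small <- c(zero=0, one=1, two=2, three=3, four=4, five=5, six=6, seven=7,
               eight=8, nine=9, ten=10, eleven=11, twelve=12, thirteen=13,
               fourteen=14, fifteen=15, sixteen=16, seventeen=17, eighteen=18,
               nineteen=19, twenty=20, thirty=30, forty=40, fifty=50,
               sixty=60, seventy=70, eighty=80, ninety=90)
    scales <- c(thousand=1e3, million=1e6, billion=1e9)
    # Lower-case, then split on spaces and hyphens ("forty-six" -> "forty" "six")
    tokens <- strsplit(tolower(trimws(text)), "[ -]+")[[1]]
    tokens <- tokens[!tokens %in% c("and", "")]
    total <- 0    # sum of all groups already closed by a scale word
    current <- 0  # value of the group being read
    for (tok in tokens) {
        if (tok %in% names(small)) {
            current <- current + small[[tok]]
        } else if (tok == "hundred") {
            # a bare "hundred" is read as one hundred
            current <- max(current, 1) * 100
        } else if (tok %in% names(scales)) {
            total <- total + max(current, 1) * scales[[tok]]
            current <- 0
        } else {
            return(NA_real_)  # unknown word: return NA rather than guess
        }
    }
    total + current
}

# Applying it to a data frame column (column names are made up)
df <- data.frame(written = c("twenty thousand three hundred and five",
                             "four hundred thousand fifty"),
                 stringsAsFactors = FALSE)
df$value <- vapply(df$written, words_to_number, numeric(1), USE.NAMES = FALSE)
df$value
# [1]  20305 400050
```

Returning NA for unrecognised words is a design choice in this sketch, so that bad rows in a data frame stand out instead of silently producing a wrong number.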

Edit: Apparently Bill Venables has a package called english that may achieve this even better than the above code.
