I need to find the duration of a large number of events by using the start and end time variables in a dataset, but both variables encode the time in the annoying format "mmddyyyyhhmm", with the cherry on top being that the first nine months are encoded as single digits (January is " 1" rather than "01"). At least the time uses a twenty-four-hour clock (assuming the people filling out each event did it right).
I know there has to be a fairly simple way to do this, but I can't think of one and suspect one of you fine folks has it memorized and can write it out in a couple of seconds.
If you have a vector x with character values for conversion ...
x <- c("41520092010", "11520092010", "121520092010")
... you can check each element of this vector for 11 characters, which means the month is missing its leading zero. If an element has 11 characters, we paste a zero on the front, then convert the whole vector to POSIXct.
as.POSIXct(
ifelse(nchar(x) == 11, paste0("0", x), x),
format = "%m%d%Y%H%M",
tz = "UTC"
)
# [1] "2009-04-15 20:10:00 UTC" "2009-01-15 20:10:00 UTC"
# [3] "2009-12-15 20:10:00 UTC"
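Since the original goal is the duration of each event, here is a minimal sketch of how the converted times could be subtracted. The `start` and `end` vectors and the `parse_stamp()` helper are made up for illustration, and it assumes the day, hour and minute are always two digits.

```r
# Hypothetical start and end stamps in "mmddyyyyhhmm" form, with the
# leading zero of the month missing for January through September.
start <- c("41520092010", "121520092010")
end   <- c("41520092240", "121620090010")

# Hypothetical helper: left-pad 11-character stamps, then parse as UTC.
parse_stamp <- function(x) {
  padded <- ifelse(nchar(x) == 11, paste0("0", x), x)
  as.POSIXct(padded, format = "%m%d%Y%H%M", tz = "UTC")
}

# difftime() returns the elapsed time in the requested units.
duration_mins <- difftime(parse_stamp(end), parse_stamp(start), units = "mins")
as.numeric(duration_mins)
# [1] 150 240
```

Fixing `units` explicitly avoids difftime() picking a unit automatically, which can differ from one dataset to the next.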
If you don't like ifelse(), you can use replace().
replace(x, nchar(x) == 11, paste0("0", x[nchar(x) == 11]))
Or use formatC(), which converts to numeric and zero-pads to a width of 12:
formatC(as.numeric(x), digits = 12, width = 12, flag = "0")
The most efficient of these is likely formatC().
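As a quick sanity check, the three padding approaches can be compared on the sample vector. This is only a sketch: the variable names are made up, and the formatC() route assumes every stamp is purely numeric, since it goes through as.numeric().

```r
x <- c("41520092010", "11520092010", "121520092010")

# TRUE where the month is missing its leading zero.
short <- nchar(x) == 11

padded_ifelse  <- ifelse(short, paste0("0", x), x)
padded_replace <- replace(x, short, paste0("0", x[short]))
padded_formatC <- formatC(as.numeric(x), digits = 12, width = 12, flag = "0")

# All three should yield the same 12-character strings.
all(padded_ifelse == padded_replace, padded_ifelse == padded_formatC)
```

If the data might contain stray non-numeric characters, the ifelse() or replace() versions are the safer choice, because they never leave character form.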