I have this data frame:
activity_type leg_mode route_distance
1 home access_walk 239.83275
2 pt interaction pt 15802.78756
3 pt interaction transit_walk 71.92245
4 pt interaction pt 2958.24598
5 pt interaction transit_walk 0.00000
6 pt interaction pt 9555.56836
My function works on vectors, so I paste the columns together to avoid losing any information and use the following df instead:
activity_type__leg_mode__route_distance
1 home@access_walk@239.83275
2 pt interaction@pt@15802.78756
3 pt interaction@transit_walk@71.92245
4 pt interaction@pt@2958.24598
5 pt interaction@transit_walk@0
6 pt interaction@pt@9555.56836
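The pasting step can be sketched roughly as follows. This is only an illustration: the data frame name df is taken from the code further down, the toy rows are the first three rows shown above, and the name df_new and the combined column name are assumptions based on the header shown.

```r
# Toy data frame with the first three rows from the example above.
df <- data.frame(
  activity_type  = c("home", "pt interaction", "pt interaction"),
  leg_mode       = c("access_walk", "pt", "transit_walk"),
  route_distance = c(239.83275, 15802.78756, 71.92245),
  stringsAsFactors = FALSE
)

# Paste the three columns into one string per row, separated by "@".
combined <- paste(df$activity_type, df$leg_mode, df$route_distance, sep = "@")

# df_new and its column name are assumed names for the single-column result.
df_new <- data.frame(
  activity_type__leg_mode__route_distance = combined,
  stringsAsFactors = FALSE
)
print(df_new)
```

Because "@" does not occur in the activity types shown, everything before the first "@" is still the original activity_type.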
I am trying to apply this piece of code to the new df:
r = rle(df$activity_type)
ix = c(
which(head(r$values, -1) == "pt interaction" & tail(r$values, -1) == "outside"), # p before o
which(head(r$values, -1) == "outside" & tail(r$values, -1) == "pt interaction") + 1) # o before p
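To make the run-length logic concrete, here is a small sketch on a made-up vector (the values in x are hypothetical, not taken from the data). rle collapses consecutive repeats into runs, and the two which calls pick out the "pt interaction" runs that sit directly before or after an "outside" run. The indices refer to runs in r$values, not to rows of the data frame.

```r
# Hypothetical activity sequence for illustration only.
x <- c("home", "pt interaction", "pt interaction", "outside", "outside",
       "pt interaction", "home")

r <- rle(x)
# r$values is now: "home" "pt interaction" "outside" "pt interaction" "home"

ix <- c(
  # a "pt interaction" run immediately followed by an "outside" run
  which(head(r$values, -1) == "pt interaction" & tail(r$values, -1) == "outside"),
  # a "pt interaction" run immediately preceded by an "outside" run
  which(head(r$values, -1) == "outside" & tail(r$values, -1) == "pt interaction") + 1
)
print(ix)  # run 2 and run 4
```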
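In the new df, the values are no longer just pt interaction or outside; other characters follow them, so the comparison needs to be more flexible. I only need to check the beginning of each string, though. I was thinking of using grep or something similar, but I don't know how to do this well.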
Mainly, I want to find a way to make this condition more flexible: which(head(r$values, -1) == "pt interaction" & tail(r$values, -1) == "outside"), so that it looks not for exactly "pt interaction" but for "pt interaction<some varying, but irrelevant stuff>".
Here is some data for you to try:
c("home@[email protected]", "pt interaction@[email protected]",
"pt interaction@[email protected]", "pt interaction@[email protected]",
"pt interaction@transit_walk@0", "pt interaction@[email protected]",
"pt interaction@[email protected]", "outside@outside@0",
"outside@[email protected]", "outside@[email protected]",
"pt interaction@[email protected]", "pt interaction@transit_walk@0",
"pt interaction@[email protected]", "pt interaction@[email protected]",
"pt interaction@[email protected]", "pt interaction@[email protected]",
"home@[email protected]", "leisure@[email protected]",
"other@[email protected]", "leisure@[email protected]",
"leisure@[email protected]", "other@[email protected]",
"leisure@[email protected]", "other@[email protected]",
"leisure@[email protected]", "home@[email protected]",
"adpt interaction@adpt@NaN", "leisure@[email protected]",
"adpt interaction@[email protected]", "home@adpt@NaN", "@[email protected]",
"home@@NA", "outside@transit_walk@0", "outside@[email protected]",
"outside@[email protected]", "outside@[email protected]",
"pt interaction@[email protected]", "pt interaction@[email protected]",
"pt interaction@[email protected]", "pt interaction@[email protected]",
"outside@outside@0", "outside@[email protected]",
"outside@[email protected]", "pt interaction@[email protected]",
"pt interaction@[email protected]", "pt interaction@[email protected]",
"pt interaction@transit_walk@0", "pt interaction@[email protected]",
"pt interaction@[email protected]", "pt interaction@[email protected]",
"pt interaction@[email protected]", "outside@@NA",
"outside@[email protected]", "leisure@[email protected]",
"work@[email protected]", "outside@@NA", "outside@[email protected]",
"outside@[email protected]", "outside@[email protected]",
"leisure@[email protected]", "outside@@NA", "outside@[email protected]",
"pt interaction@[email protected]", "pt interaction@[email protected]",
"pt interaction@[email protected]", "pt interaction@[email protected]",
"outside@[email protected]", "pt interaction@[email protected]",
"pt interaction@[email protected]", "pt interaction@[email protected]",
"pt interaction@[email protected]", "pt interaction@[email protected]",
"work@[email protected]", "outside@@NA", "outside@[email protected]",
"other@[email protected]", "outside@@NA", "outside@[email protected]",
"pt interaction@[email protected]", "pt interaction@transit_walk@0",
"pt interaction@[email protected]", "pt interaction@[email protected]",
"outside@[email protected]", "pt interaction@[email protected]",
"pt interaction@transit_walk@0", "pt interaction@[email protected]",
"pt interaction@[email protected]", "work@[email protected]",
"pt interaction@[email protected]", "pt interaction@[email protected]",
"outside@@NA", "outside@[email protected]", "pt interaction@[email protected]",
"pt interaction@transit_walk@0", "pt interaction@[email protected]",
"pt interaction@[email protected]", "outside@[email protected]",
"pt interaction@[email protected]", "pt interaction@[email protected]",
"pt interaction@[email protected]")
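This is a brute-force approach. It builds all pairs of activity_type values using expand.grid. It then uses apply to run through all of these pairs and applies the rle-based change-detection code to each one. This gives you a list of all change points, which you can then tidy up as needed.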
library(dplyr)  # provides %>%, mutate_if() and mutate()

# Run-length encode the activity types
r <- rle(data$activity_type)

# All ordered pairs of the unique run values
combinations <- expand.grid(unique(r$values), unique(r$values))
names(combinations) <- c("first", "second")
combinations <- combinations %>%
  mutate_if(is.factor, as.character) %>%
  mutate(labels = paste0(first, " <-> ", second))

# For each pair, collect the run indices where one value directly follows the other
ix_list <- apply(combinations, 1, function(x) c(
  which(head(r$values, -1) == x[1] & tail(r$values, -1) == x[2]),       # first before second
  which(head(r$values, -1) == x[2] & tail(r$values, -1) == x[1]) + 1))  # second before first
names(ix_list) <- combinations$labels

# remove empty list elements
ix_list <- Filter(length, ix_list)
With this result:
> glimpse(ix_list)
List of 26
$ pt interaction <-> home : num [1:2] 4 2
$ outside <-> home : num 20
$ leisure <-> home : num [1:2] 12 6
$ adpt interaction <-> home : num [1:2] 16 14
$ <-> home : num [1:2] 18 18
$ home <-> pt interaction : num [1:2] 1 5
$ outside <-> pt interaction : num [1:16] 3 20 22 29 31 36 38 42 44 3 ...
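For the narrower case in the question, where only the part before the first "@" matters, one possible sketch is to strip everything from the first "@" onward with sub before calling rle, so that the original condition works unchanged. This is an alternative to the brute-force approach, not part of it. The vector x below is hypothetical toy data in the pasted format, and the variable names are assumptions.

```r
# Hypothetical pasted strings for illustration only.
x <- c("home@access_walk@239.8", "pt interaction@pt@15802.8",
       "pt interaction@transit_walk@0", "outside@outside@0",
       "outside@car@120.5", "pt interaction@pt@9555.6")

# Keep only the activity type, i.e. the text before the first "@".
activity <- sub("@.*$", "", x)

r <- rle(activity)
ix <- c(
  which(head(r$values, -1) == "pt interaction" & tail(r$values, -1) == "outside"),
  which(head(r$values, -1) == "outside" & tail(r$values, -1) == "pt interaction") + 1
)

# Map run indices back to row ranges in x.
run_end   <- cumsum(r$lengths)
run_start <- run_end - r$lengths + 1
rows <- data.frame(run = ix, from = run_start[ix], to = run_end[ix])
print(rows)

# Prefix test without modifying the strings; "adpt interaction..." would not match.
is_pt <- startsWith(x, "pt interaction")
```

Since the full strings in x are left untouched, the leg mode and distance remain available for the rows that the run indices point to.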