Conditionally replace values in a data frame column based on several other columns - R

Posted by Dom in Dev


My data frame looks like this:

> tornado_frame
         tornado_names Level      value
1     node per cluster   low  -34.72222
2          TB per node   low  -52.08333
3  expense per cluster   low -104.16667
4             Total TB   low  -62.50000
5  revenue per cluster   low  -52.08333
6     node per cluster  high   20.83333
7          TB per node  high   41.66667
8  expense per cluster  high   52.08333
9             Total TB  high  145.83333
10 revenue per cluster  high  156.25000

I would like to turn the table into this:

> tornado_frame
         tornado_names Level      value
1     node per cluster   low   34.72222
2          TB per node   low   52.08333
3  expense per cluster   low  104.16667
4             Total TB   low  -62.50000
5  revenue per cluster   low  -52.08333
6     node per cluster  high  -20.83333
7          TB per node  high  -41.66667
8  expense per cluster  high  -52.08333
9             Total TB  high  145.83333
10 revenue per cluster  high  156.25000

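That is, for each tornado_names entry, when the absolute value of the "low" row is greater than the absolute value of the "high" row, the signs of both rows in "value" should be flipped.

I tried some nested ifs, but that quickly became confusing. Any help would be greatly appreciated!

Here is my data: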

> dput(tornado_frame)
structure(list(tornado_names = structure(c(2L, 4L, 1L, 5L, 3L, 
2L, 4L, 1L, 5L, 3L), .Label = c("expense per cluster", "node per cluster", 
"revenue per cluster", "TB per node", "Total TB"), class = "factor"), 
    Level = structure(c(2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L
    ), .Label = c("high", "low"), class = "factor"), value = c(34.72222, 
    52.08333, 104.16667, -62.5, -52.08333, -20.83333, -41.66667, 
    -52.08333, 145.83333, 156.25)), .Names = c("tornado_names", 
"Level", "value"), class = "data.frame", row.names = c(NA, -10L
))

David Arenburg

Here is a possible data.table solution:

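library(data.table)
# Within each tornado_names group (rows ordered low, then high), flip the
# signs when |high| - |low| < 0; otherwise keep the values unchanged.
setDT(df)[, value := if (diff(abs(value)) < 0) value * -1 else value,
          by = tornado_names]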
df
#           tornado_names Level     value
#  1:    node per cluster   low  34.72222
#  2:         TB per node   low  52.08333
#  3: expense per cluster   low 104.16667
#  4:            Total TB   low -62.50000
#  5: revenue per cluster   low -52.08333
#  6:    node per cluster  high -20.83333
#  7:         TB per node  high -41.66667
#  8: expense per cluster  high -52.08333
#  9:            Total TB  high 145.83333
# 10: revenue per cluster  high 156.25000

This checks your condition within each tornado_names group and changes the sign of the values only in the groups that meet it.
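
The grouped check above relies on each group's rows being ordered "low" then "high". As a rough sketch of the same rule in base R that compares the two levels by name instead of by position, something like the following might work. The input is rebuilt from the first printed table in the question (the dput output holds the values after the sign change), and the helper names low, high, flip_names and idx are illustrative assumptions, not part of the original answer.

```r
# Input as shown in the question's first table (before the sign change).
tornado_frame <- data.frame(
  tornado_names = rep(c("node per cluster", "TB per node", "expense per cluster",
                        "Total TB", "revenue per cluster"), times = 2),
  Level = rep(c("low", "high"), each = 5),
  value = c(-34.72222, -52.08333, -104.16667, -62.5, -52.08333,
            20.83333, 41.66667, 52.08333, 145.83333, 156.25),
  stringsAsFactors = FALSE
)

# Split the rows by level (illustrative helper objects).
low  <- tornado_frame[tornado_frame$Level == "low", ]
high <- tornado_frame[tornado_frame$Level == "high", ]

# Align each "low" row with the "high" row of the same tornado_names,
# then keep the names where |low| > |high|.
high_value <- high$value[match(low$tornado_names, high$tornado_names)]
flip_names <- low$tornado_names[abs(low$value) > abs(high_value)]

# Flip the sign of both rows (low and high) for those names only.
idx <- tornado_frame$tornado_names %in% flip_names
tornado_frame$value[idx] <- -tornado_frame$value[idx]

tornado_frame
```

Because the rows are matched by tornado_names and Level rather than by position, this sketch should not depend on how the data frame is sorted. It does assume exactly one "low" and one "high" row per name.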
