Compare the previous and next different values in a Pandas column

Posted by debugcn in Dev

托马斯·卡雷拉·德·苏扎

I have a DataFrame with a single column of floats, col1 (the example below uses integers for simplicity).
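The sample listing itself is not present in this copy. A minimal sketch that rebuilds it, assuming col1 holds exactly the values shown in the expected output further down:

```python
import pandas as pd

# Assumption: values copied from the col1 column of the expected output,
# since the original listing is not available here.
df = pd.DataFrame({"col1": [10, 10, 5, 5, 5, 10, 4, 4, 4, 4, 4, 5, 5]})
print(df)
```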

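I am trying to create a new column that, for each row, compares the previous and the next different value in the column and assigns a boolean depending on whether they are equal. For example, in row[2] the value is 5; the previous different value (not 5) is 10 in row[1], and the next different value is 10 in row[5]. In that case the value in the new column is True.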

For the example df, the output I am trying to get is:

    col1   col2
0     10    NaN
1     10  False
2      5   True
3      5   True
4      5   True
5     10  False
6      4  False
7      4  False
8      4  False
9      4  False
10     4  False
11     5  False
12     5    NaN

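I know how to compare against a fixed number of previous and next rows, but I don't know whether it is possible to search for the "first different value" instead.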
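For context, a fixed-offset comparison of the kind mentioned above might look like the sketch below. It assumes the sample values from the question and uses shift, which is one common way to do it, not necessarily the one the asker had in mind. The variable name fixed is illustrative.

```python
import pandas as pd

s = pd.Series([10, 10, 5, 5, 5, 10, 4, 4, 4, 4, 4, 5, 5], name="col1")

# shift(1) brings the previous row's value onto each row,
# shift(-1) brings the next row's value onto each row.
fixed = s.shift(1).eq(s.shift(-1))

# Row 2 comes out False here (10 above, 5 below), although the desired
# answer is True: a fixed offset cannot skip over repeated values.
print(fixed)
```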

Is there any way to do this?

Thanks a lot!

我想要一片T骨牛排

You can do this by extracting one value per run of consecutive equal values and then using reindex:

s = df['col1']  # shorthand for the column
# True where the value differs from the previous row, i.e. at the start of each run
m = s.diff().ne(0)
# one value per run of consecutive equal values
su = s[m].reset_index(drop=True)
print(su)
# 0    10
# 1     5
# 2    10
# 3     4
# 4     5
# Name: col1, dtype: int64

# m.cumsum() numbers the runs from 1 while su is indexed from 0, so
# su[run] is the value of the next run and su[run - 2] that of the previous run;
# labels that fall outside su become NaN
df['col1_after'] = su.reindex(m.cumsum().values).values
df['col1_before'] = su.reindex(m.cumsum().values-2).values
# col2 is True where the previous and next different values are equal
df['col2'] = df['col1_after'].eq(df['col1_before'])

You get:

print (df)
    col1  col1_after  col1_before   col2
0     10         5.0          NaN  False
1     10         5.0          NaN  False
2      5        10.0         10.0   True
3      5        10.0         10.0   True
4      5        10.0         10.0   True
5     10         4.0          5.0  False
6      4         5.0         10.0  False
7      4         5.0         10.0  False
8      4         5.0         10.0  False
9      4         5.0         10.0  False
10     4         5.0         10.0  False
11     5         NaN          4.0  False
12     5         NaN          4.0  False
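The offsets passed to reindex can look arbitrary at first. A small sketch, assuming the sample values from the question, that prints the run numbers produced by m.cumsum() next to the values they select (run_id, next_vals and prev_vals are illustrative names):

```python
import pandas as pd

s = pd.Series([10, 10, 5, 5, 5, 10, 4, 4, 4, 4, 4, 5, 5], name="col1")
m = s.diff().ne(0)
su = s[m].reset_index(drop=True)   # run values, indexed from 0

run_id = m.cumsum()                # run number of each row, counted from 1
print(run_id.tolist())

# Because su starts at 0 and run_id starts at 1:
#   label run_id     points one run ahead (the next different value)
#   label run_id - 2 points one run back  (the previous different value)
next_vals = su.reindex(run_id.values).values
prev_vals = su.reindex(run_id.values - 2).values
print(next_vals)
print(prev_vals)
```

Labels such as -1 or 5 do not exist in su, which is why the first and last runs end up with NaN.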

Note that df.drop(['col1_after','col1_before'], axis=1) returns a copy without the helper columns; I left them in here to show what is happening.
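If the helper columns are not needed at all, the same idea can be wrapped in a function. The sketch below is a variation and not the code above: it detects run starts with s.ne(s.shift()) instead of s.diff().ne(0), which is an assumption made so that it also works on non-numeric columns, and it returns a missing value where a row has no previous or no next different value. The name neighbours_equal is hypothetical.

```python
import pandas as pd


def neighbours_equal(s: pd.Series) -> pd.Series:
    """Test, per row, whether the previous and next different values are equal.

    Rows in the first or last run have no neighbour on one side and get <NA>.
    """
    starts = s.ne(s.shift())                  # True on the first row of each run
    runs = s[starts].reset_index(drop=True)   # one value per run, indexed from 0
    run_id = starts.cumsum().to_numpy() - 1   # run number per row, counted from 0
    before = runs.reindex(run_id - 1).to_numpy()
    after = runs.reindex(run_id + 1).to_numpy()
    result = pd.Series(before == after, index=s.index, dtype="boolean")
    # Blank out rows that lack a neighbour on either side.
    result[pd.isna(before) | pd.isna(after)] = pd.NA
    return result


s = pd.Series([10, 10, 5, 5, 5, 10, 4, 4, 4, 4, 4, 5, 5], name="col1")
out = neighbours_equal(s)
print(out)
```

This gives a missing value on rows 0, 1, 11 and 12, which is not an exact match for the expected output in the question (that listing shows False on rows 1 and 11).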
