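This is a pandas DataFrame. The "Direction" column contains only three possible values: Up, Flat, or Down. I only want to count the last run of identical values, hence the question in the title.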
Time Direction
id
0 16:59 Up
1 17:00 Flat
2 17:01 Up
3 17:02 Down
4 17:03 Down
5 17:04 Up
6 17:05 Up
7 17:06 Up
Assume the DataFrame is named panda. The result should look like this (this is the preferred form):
result = 0
result = panda.tail(?)['Direction'].count_last_values(#as the most last value[Up <- in this case])[0]
print(result)
3
Or like this:
Time Direction Series
id
0 16:59 Up 1
1 17:00 Flat 0
2 17:01 Up 1
3 17:02 Down 1
4 17:03 Down 2
5 17:04 Up 1
6 17:05 Up 2
7 17:06 Up 3
I can do this myself (but I would like a simpler way):
import pandas as pd
panda = pd.DataFrame({'Time':['16:59','17:00','17:01','17:02','17:03','17:04','17:05','17:06'], 'Direction':['Up','Flat','Up','Down','Down','Up','Up','Up']})
Time Direction
0 16:59 Up
1 17:00 Flat
2 17:01 Up
3 17:02 Down
4 17:03 Down
5 17:04 Up
6 17:05 Up
7 17:06 Up
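tail = panda.tail(1)['Direction'].iloc[0]
counter = 0
i = len(panda) - 1
if tail != 'Flat':
    # walk backwards while the value still matches the last one;
    # the i >= 0 check stops the loop when every row matches
    while i >= 0 and tail == panda.iloc[i]['Direction']:
        i -= 1
        counter += 1
print(counter)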
3
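Use shift to check whether the current value differs from the previous one, and use cumsum() on that comparison to create "groups", one per run of identical values. Then use .groupby and cumcount to create the new column. Here df is the DataFrame from the question.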
s = (df['Direction'] != df['Direction'].shift()).cumsum()
df['Series'] = df.groupby(s).cumcount()+1
#output:
Time Direction Series
id
0 16:59 Up 1
1 17:00 Flat 1
2 17:01 Up 1
3 17:02 Down 1
4 17:03 Down 2
5 17:04 Up 1
6 17:05 Up 2
7 17:06 Up 3
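To see why this works, it can help to print the intermediate values one step at a time. The following is a minimal sketch that rebuilds the question's data; the names changed and run_id are illustrative only and do not appear in the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    'Time': ['16:59', '17:00', '17:01', '17:02', '17:03', '17:04', '17:05', '17:06'],
    'Direction': ['Up', 'Flat', 'Up', 'Down', 'Down', 'Up', 'Up', 'Up'],
})

# True wherever the value differs from the previous row.
# The first row is always True because shift() leaves a missing value there.
changed = df['Direction'] != df['Direction'].shift()

# Running total of the change flags: each run of equal values gets its own id.
run_id = changed.cumsum()

# Position inside each run, starting at 1.
df['Series'] = df.groupby(run_id).cumcount() + 1

print(run_id.tolist())        # [1, 2, 3, 4, 4, 5, 5, 5]
print(df['Series'].tolist())  # [1, 1, 1, 1, 2, 1, 2, 3]
```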
If you need the count to start from zero when the "Direction" column is "Flat", use .loc:
df.loc[df['Direction'] == 'Flat', 'Series'] = df['Series'].subtract(1)
#output
Time Direction Series
id
0 16:59 Up 1
1 17:00 Flat 0
2 17:01 Up 1
3 17:02 Down 1
4 17:03 Down 2
5 17:04 Up 1
6 17:05 Up 2
7 17:06 Up 3
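For the single-number form that the question prefers, the same shift and cumsum idea can be reduced to the length of the final run. This is only a sketch: last_run_length is a hypothetical helper, not a pandas method, and returning 0 for a trailing "Flat" is an assumption chosen to mirror the question's own loop.

```python
import pandas as pd

def last_run_length(directions: pd.Series, ignore: str = 'Flat') -> int:
    """Length of the trailing run of equal values.

    Returns 0 if the Series is empty or its last value equals `ignore`.
    """
    if directions.empty or directions.iloc[-1] == ignore:
        return 0
    # Each run of equal values gets its own id.
    run_id = (directions != directions.shift()).cumsum()
    # Rows in the final run all share the last run id.
    return int((run_id == run_id.iloc[-1]).sum())

panda = pd.DataFrame({
    'Time': ['16:59', '17:00', '17:01', '17:02', '17:03', '17:04', '17:05', '17:06'],
    'Direction': ['Up', 'Flat', 'Up', 'Down', 'Down', 'Up', 'Up', 'Up'],
})

result = last_run_length(panda['Direction'])
print(result)  # 3
```

If the Series column has already been computed as above, its last entry carries the same number for this data, so panda['Series'].iloc[-1] would be an alternative.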