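I am writing a script with Python pandas in which I need to find the value and date of the first dip, then of the maximum that follows it, and then of the next dip after that, and so on for each subsequent dip. In the graph shown below I have marked with red circles the points whose dates and values I want. I already have a script, but it requires me to specify the date ranges by hand in order to get the values; I would like to extract both the dates and the values automatically. Any help would be much appreciated.
Code: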
import pandas as pd
df = pd.read_csv(r"D:\Data\2015_20.csv", parse_dates=["Date"])
df = df[["Date", "Mean"]]
df = df.set_index("Date")
z1 = df['2016-04-28' : '2017-02-22'].min()
z2 = df['2017-05-13' : '2018-02-02'].max()
z3 = df['2018-03-19' : '2019-03-04'].max()
print("2016", '%.2f'%z1)
print("2017", '%.2f'%z2)
print("2018", '%.2f'%z3)
You can use argrelextrema to find the local minima and maxima:
import numpy as np
import pandas as pd
from scipy.signal import argrelextrema
np.random.seed(0)
rs = np.random.randn(200)
xs = [0]
for r in rs:
xs.append(xs[-1] * 0.9 + r)
df = pd.DataFrame(xs, columns=['data'], index=pd.date_range('2000-01-01',periods=len(xs)))
n = 5 # number of points to be checked before and after
# Find local minima and maxima; each point is compared with n neighbours on each side
df['min'] = df.iloc[argrelextrema(df.data.values, np.less_equal,
                                  order=n)[0]]['data']
df['max'] = df.iloc[argrelextrema(df.data.values, np.greater_equal,
                                  order=n)[0]]['data']
df['min_date'] = df.index.where(df['min'].notna())
df['max_date'] = df.index.where(df['max'].notna())
print (df.head(15))
data min max min_date max_date
2000-01-01 0.000000 0.000000 NaN 2000-01-01 NaT
2000-01-02 1.764052 NaN NaN NaT NaT
2000-01-03 1.987804 NaN NaN NaT NaT
2000-01-04 2.767762 NaN NaN NaT NaT
2000-01-05 4.731879 NaN NaN NaT NaT
2000-01-06 6.126249 NaN 6.126249 NaT 2000-01-06
2000-01-07 4.536346 NaN NaN NaT NaT
2000-01-08 5.032800 NaN NaN NaT NaT
2000-01-09 4.378163 NaN NaN NaT NaT
2000-01-10 3.837128 NaN NaN NaT NaT
2000-01-11 3.864013 NaN NaN NaT NaT
2000-01-12 3.621656 3.621656 NaN 2000-01-12 NaT
2000-01-13 4.713764 NaN NaN NaT NaT
2000-01-14 5.003425 NaN NaN NaT NaT
2000-01-15 4.624757 NaN NaN NaT NaT
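The parameter n (passed as order) controls how many neighbours on each side a point must beat to count as an extremum, so it decides how many dips and peaks are reported. The following is a minimal sketch on the same synthetic series as above; the tried values 1, 5 and 20 are arbitrary choices for illustration, not recommendations.

```python
import numpy as np
from scipy.signal import argrelextrema

# Rebuild the synthetic series used in the example above
np.random.seed(0)
rs = np.random.randn(200)
xs = [0]
for r in rs:
    xs.append(xs[-1] * 0.9 + r)
values = np.array(xs)

# Count the local maxima detected for several window sizes (arbitrary choices)
counts = {}
for order in (1, 5, 20):
    idx = argrelextrema(values, np.greater_equal, order=order)[0]
    counts[order] = len(idx)

print(counts)
```

A larger order is a stricter test, so the count can only stay the same or shrink as order grows. For noisy real data it is usually worth trying a few values until only the dips and peaks of interest remain.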
Edit:
Solution for the real data:
df['Date'] = pd.to_datetime(df['Date'])
df = df.set_index('Date')
import numpy as np
from scipy.signal import argrelextrema
n = 5
s1 = df.iloc[argrelextrema(df.Mean.values, np.less_equal,
                           order=n)[0]]['Mean']
s2 = df.iloc[argrelextrema(df.Mean.values, np.greater_equal,
                           order=n)[0]]['Mean']
# Series.append was removed in pandas 2.0; pd.concat gives the same result
s = pd.concat([s1, s2]).sort_index()
print(s)
Date
2016-05-18 0.293171
2016-11-04 0.692509
2017-05-13 0.232963
2017-09-10 0.675797
2017-11-09 0.528592
2018-04-03 0.189523
2018-11-09 0.713351
Name: Mean, dtype: float64
s.to_csv('out.csv')
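The combined series above no longer records which rows are dips and which are peaks. A possible way to keep that information is to add a label column before concatenating. The sketch below uses a small made-up series as a stand-in for the real Mean column; the column name kind and the sample values are assumptions for illustration. It also uses the strict comparators np.less and np.greater rather than the np.less_equal and np.greater_equal used above, so that the end points of the series are not reported as extrema.

```python
import numpy as np
import pandas as pd
from scipy.signal import argrelextrema

# Hypothetical stand-in for the real 'Mean' column
dates = pd.date_range('2016-01-01', periods=9, freq='D')
mean = pd.Series([5.0, 3.0, 1.0, 3.0, 6.0, 4.0, 2.0, 4.0, 7.0],
                 index=dates, name='Mean')

n = 2  # neighbours checked on each side
min_idx = argrelextrema(mean.values, np.less, order=n)[0]
max_idx = argrelextrema(mean.values, np.greater, order=n)[0]

# One row per extremum: date in the index, value and label as columns
extrema = pd.concat([
    pd.DataFrame({'Mean': mean.iloc[min_idx], 'kind': 'min'}),
    pd.DataFrame({'Mean': mean.iloc[max_idx], 'kind': 'max'}),
]).sort_index()

print(extrema)
```

The resulting frame can be written out with to_csv in the same way as the series above.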