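I have the following DataFrame, and for each customer I am trying to find the first date (in ascending order) on which the flag column equals Y.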
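import pandas as pd

# Wrap the dict in a DataFrame so that sort_values and groupby are available
df = pd.DataFrame({
"customer_key": ["1","1","1","2","2","2"],
"date": ["2020-09-30", "2020-01-31", "2020-06-30","2020-01-31", "2020-02-29", "2020-03-31"],
"flag": ["Y","N","Y","N","N","Y"]
})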
Expected result:
First, I sort by date.
df.sort_values('date', inplace=True)
This is where I am stuck. I know I need to group by customer key and then find the first row where flag = Y, but I am not sure how to do this in a Pythonic way.
df['first_occurence_date'] = df.groupby(by='customer_key') ## i dunno...
Try this: keep only the rows where flag is Y, then take the earliest date per customer.
out = df.loc[df['flag'].eq('Y')].groupby('customer_key').date.min()
customer_key
1 2020-06-30
2 2020-03-31
Name: date, dtype: object
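If the goal is to get that date back onto every row as the first_occurence_date column from the question, one possible approach is to map the per-customer result onto customer_key. This is only a sketch: the name first_y is made up for illustration, and converting date with pd.to_datetime is an assumption (the ISO strings in the question happen to compare correctly as text, but real dates are safer).

```python
import pandas as pd

df = pd.DataFrame({
    "customer_key": ["1", "1", "1", "2", "2", "2"],
    "date": ["2020-09-30", "2020-01-31", "2020-06-30",
             "2020-01-31", "2020-02-29", "2020-03-31"],
    "flag": ["Y", "N", "Y", "N", "N", "Y"],
})

# Assumption: parse the strings into real datetimes before comparing them
df["date"] = pd.to_datetime(df["date"])

# Earliest date per customer among the rows where flag is "Y"
# (first_y is a hypothetical name, not from the original post)
first_y = df.loc[df["flag"].eq("Y")].groupby("customer_key")["date"].min()

# Broadcast the per-customer date back onto every row of that customer
df["first_occurence_date"] = df["customer_key"].map(first_y)
```

A customer with no Y rows at all would end up with NaT in the new column, since map finds no matching key for it.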