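I am working with a DataFrame from which I have already dropped every Feb 29 that falls in a leap year. I intend to group by the "Day of year" column (created with .dt.dayofyear), so I decided to ignore the extra day of leap years. To group by "Day of year" correctly, I now have to subtract 1 from the day number in leap years whenever the date is March 1 or later. Otherwise a leap year would still have day numbers running up to 366 instead of 365, even after the leap day has been dropped.
Here is my code: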
import pandas as pd
clim_rec = pd.read_csv("daily_climate_records.csv")
clim_rec['Date'] = pd.to_datetime(clim_rec['Date']) # converting "Date" column from string into datetime format
# Let's drop all leaping days by masking all Feb 29 days
feb_29_mask = ~((clim_rec.Date.dt.month == 2) & (clim_rec.Date.dt.day == 29))
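clim_rec = clim_rec[feb_29_mask].copy()  # boolean indexing drops only the Feb 29 rows; .where().dropna() would also drop rows with NaN in other columns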
# Let's add new column with the "day of year" in order to group by this column
clim_rec['Day of year'] = clim_rec['Date'].dt.dayofyear
print(clim_rec.head())
#print('---------------------------------------------------')
# Now, if the year is a leap year and the dayofyear is greater than the dayofyear of Feb-29
# we subtract 1 from dayofyear. After doing that we will get values 1-365 for dayofyear
leap_year_mask = (
    (clim_rec.Date.dt.year % 4 == 0)
    & ((clim_rec.Date.dt.year % 100 != 0) | (clim_rec.Date.dt.year % 400 == 0))
    & (clim_rec.Date.dt.month >= 3)
)
clim_rec['Day of year'] = clim_rec['Day of year'].apply(lambda x: x-1) # this line is not correct
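My question: how should I change the last line of my code so that the subtraction is applied only to the rows for which the boolean mask is True?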
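Use DataFrame.loc to select the rows by the mask and subtract 1 from them directly. This is better and faster than apply, because it avoids a loop (apply iterates under the hood):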
clim_rec.loc[leap_year_mask, 'Day of year'] -= 1
This works the same as:
clim_rec.loc[leap_year_mask, 'Day of year'] = clim_rec.loc[leap_year_mask, 'Day of year']-1
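To see the masked subtraction in isolation, here is a minimal sketch on a few made-up rows. The `sample` frame and its dates are hypothetical stand-ins for the CSV in the question, and the mask uses `Series.dt.is_leap_year` as an assumed shortcut for the hand-written `% 4 / % 100 / % 400` test; it is one way to build the same mask, not necessarily what the asker needs.

```python
import pandas as pd

# Hypothetical sample rows standing in for daily_climate_records.csv
sample = pd.DataFrame({
    "Date": pd.to_datetime([
        "2015-03-01",  # non-leap year, March
        "2016-02-28",  # leap year, before Feb 29
        "2016-03-01",  # leap year, after Feb 29
        "2016-12-31",  # leap year, last day
        "2015-12-31",  # non-leap year, last day
    ])
})
sample["Day of year"] = sample["Date"].dt.dayofyear

# Assumption: is_leap_year replaces the manual modulo-based leap year check
shift_mask = sample["Date"].dt.is_leap_year & (sample["Date"].dt.month >= 3)

# Subtract 1 only in the rows where the mask is True; other rows are untouched
sample.loc[shift_mask, "Day of year"] -= 1

print(sample)
```

After the subtraction, March 1 gets day number 60 in both 2015 and 2016, and December 31 gets 365 in both years, so grouping by "Day of year" lines up the same calendar dates.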