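I have a dataset with a time-series index, a few variables, and humidity readings. I have already trained an ML model to predict the Humidity value from X, Y and Z. Now, after loading the saved model with pickle, I want to fill in the missing Humidity values using X, Y and Z. However, this should only happen for rows where X, Y and Z themselves are not missing.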
Time            X    Y   Z    Humidity
1/2/2017 13:00  31   22  21   48
1/2/2017 14:00  NaN  12  NaN  NaN
1/2/2017 15:00  25   55  33   NaN
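In this example, the Humidity in the last row should be filled in by the model. The second row should not be predicted, because X and Z are missing there as well.
So far, I have tried this: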
with open('model_pickle','rb') as f:
    mp = pickle.load(f)
for i, value in enumerate(df['Humidity'].values):
    if np.isnan(value):
        df['Humidity'][i] = mp.predict(df['X'][i],df['Y'][i],df['Z'][i])
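This gives me the error "predict() takes from 2 to 5 positional arguments but 6 were given", and it does not check whether the X, Y and Z values are present. Below is the code I used to train the model and save it to a file: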
df = df.dropna()
dfTest = df.loc['2017-01-01':'2019-02-28']
dfTrain = df.loc['2019-03-01':'2019-03-18']
features = [ 'X', 'Y', 'Z']
train_X = dfTrain[features]
train_y = dfTrain.Humidity
test_X = dfTest[features]
test_y = dfTest.Humidity
model = xgb.XGBRegressor(max_depth=10,learning_rate=0.07)
model.fit(train_X,train_y)
predXGB = model.predict(test_X)
mae = mean_absolute_error(predXGB,test_y)
import pickle
with open('model_pickle','wb') as f:
    pickle.dump(model,f)
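I did not get any errors while training and saving the model.
For prediction, since you want to make sure that all of the X, Y and Z values are present, you can do this: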
df = df.dropna(subset = ["X", "Y", "Z"])
Now you can predict the values for the remaining valid rows as follows:
# where features = ["X", "Y", "Z"]
df['Humidity'] = mp.predict(df[features])
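mp.predict returns predictions for all rows in a single call, so there is no need to predict row by row. Note that this assignment replaces every Humidity value, including the ones that were already observed.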
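As a minimal sketch of why the loop in the question fails while a single call works: scikit-learn-style regressors such as XGBRegressor expect one 2-D feature table in predict and return one value per row. StandInModel below is a hypothetical placeholder for the unpickled mp, used only so the snippet runs without the saved model file; its formula is made up.

```python
import numpy as np
import pandas as pd

features = ["X", "Y", "Z"]


class StandInModel:
    """Hypothetical placeholder for the unpickled model `mp`."""

    def predict(self, data):
        # Made-up rule: mean of the three features, one value per row.
        return np.asarray(data, dtype=float).mean(axis=1)


mp = StandInModel()

# Two complete rows taken from the question's sample data.
df = pd.DataFrame({"X": [31.0, 25.0], "Y": [22.0, 55.0], "Z": [21.0, 33.0]})

# One call on the whole feature table: one prediction per row.
all_rows = mp.predict(df[features])

# A single row still has to be passed as a 2-D table (note the double
# brackets), not as three separate scalar arguments.
one_row = mp.predict(df.iloc[[0]][features])
```

With the real model, the same two call shapes should apply: a whole frame of features, or a one-row frame selected with double brackets.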
Edit:
For inference, assuming you have a dataframe df, you can do the following:
import pandas as pd

# Rows with missing Humidity: candidates for prediction.
df_inference = df[df.Humidity.isnull()]
# Remaining rows (Humidity already known).
df = df[df.Humidity.notnull()]
# df_inference might still have rows with missing features.
# Since you cannot infer with missing features, move those rows back to the remaining rows
# (pd.concat is used because DataFrame.append was removed in pandas 2.0)
missing_features = df_inference[features].isnull().any(axis=1)
df = pd.concat([df, df_inference[missing_features]])
# and remove them from df_inference (copy to avoid SettingWithCopyWarning on assignment)
df_inference = df_inference[~missing_features].copy()
# Now you can infer on these rows
df_inference['Humidity'] = mp.predict(df_inference[features])
# Merge the predictions back into the remaining rows to restore the original number of rows,
# then sort by index (sort_index returns a new frame, so assign the result)
df = pd.concat([df, df_inference])
df = df.sort_index()
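A more compact variant of the same idea, as a sketch only: build a boolean mask for the rows whose Humidity is missing and whose features are all present, then assign the predictions through df.loc, which keeps the original row order without splitting and re-concatenating. StandInModel is a hypothetical placeholder for the unpickled mp (its formula is invented purely so the snippet runs); with the real model, only the mask and the df.loc assignment would be needed.

```python
import numpy as np
import pandas as pd

features = ["X", "Y", "Z"]


class StandInModel:
    """Hypothetical placeholder for the unpickled model `mp`."""

    def predict(self, data):
        # Made-up rule so the snippet runs without the saved model file.
        return np.asarray(data, dtype=float).mean(axis=1)


mp = StandInModel()

# The three sample rows from the question.
df = pd.DataFrame(
    {
        "X": [31, np.nan, 25],
        "Y": [22, 12, 55],
        "Z": [21, np.nan, 33],
        "Humidity": [48, np.nan, np.nan],
    },
    index=["1/2/2017 13:00", "1/2/2017 14:00", "1/2/2017 15:00"],
)

# Rows where Humidity is missing but every feature is present.
fillable = df["Humidity"].isna() & df[features].notna().all(axis=1)

# Skip the call when nothing qualifies, then assign in place.
if fillable.any():
    df.loc[fillable, "Humidity"] = mp.predict(df.loc[fillable, features])
```

Here only the last row is filled; the second row keeps its NaN because X and Z are missing, and the observed value in the first row is left untouched.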