How are GridSearchCV() (and/or RandomizedSearchCV()) implemented in scikit-learn? When using one of these techniques, how are the following aspects handled? Here is a picture outlining my confusion:

[image: nested cross-validation diagram]

What happens when, and how often? For simplicity, let's assume a neural network as our model. My understanding so far:
In the first iteration, the model is fit on the training fold, which is split into different folds. Here I struggle already: Is the model trained on a single fold and then tested on the validation fold? What happens then with the next fold? Does the model keep the weights obtained from its first training fold, or is it re-initialized for the next training fold?
To be more precise: In the first iteration, is the model fit and tested four times, once per validation fold, independently of the other folds?
When the next iteration begins, the model keeps no information from the first iteration, right? Thus, all iterations and all folds are independent of each other? How are the hyperparameters tuned here?
In the above example, there are 25 folds in total. Is the model with a constant set of hyperparameters fit and tested 20 times? Let's say we have two hyperparameters to tune, learning rate and dropout rate, both with two levels:
Will the neural net now be fitted 80 times? And when we have not just a single model but e.g. two models (a neural network and a random forest), will the whole procedure be performed twice?
Is there a way to see how many folds GridSearchCV() will consider?
I have seen Does GridSearchCV perform cross-validation?, Model help using Scikit-learn when using GridSearch, and scikit-learn GridSearchCV with multiple repetitions, but I can't find a clear and precise answer to my questions.
So, the k-fold method:
You split your training set into k folds, for example 5. You take the first part as the validation set and the 4 other parts as the training set. You train, and this gives you a training/CV performance. You do this 5 times (once per fold): each fold becomes the validation set once, and the other folds form the training set. At the end, you take the mean of the performances to obtain the CV performance of your model. This is the k-fold part.
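The loop above can be sketched in plain Python. The `fit` and `score` callables here are hypothetical stand-ins for a real estimator; the key point for the question is that a fresh model is built for every fold, nothing is carried over from one fold to the next (scikit-learn achieves this by cloning the estimator):

```python
def k_fold_indices(n_samples, k):
    """Split indices 0..n_samples-1 into k contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_val_score_sketch(fit, score, data, k=5):
    """For each fold: train on the other k-1 folds, score on the held-out fold."""
    folds = k_fold_indices(len(data), k)
    scores = []
    for i, val_idx in enumerate(folds):
        train_idx = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        model = fit([data[j] for j in train_idx])   # fresh model every fold
        scores.append(score(model, [data[j] for j in val_idx]))
    return sum(scores) / k                          # mean = CV performance
```

So with k=5 the model is fit 5 times, each time from scratch, and the 5 validation scores are averaged.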
Now, GridSearchCV is a hyperparameter tuner that uses the k-fold method. The principle is: you give GridSearchCV a dictionary with all the hyperparameter values you want to test; it then tests every combination in that dictionary and selects the best set of hyperparameters (the one with the best CV performance). It can take a very long time.
You pass GridSearchCV the model (estimator), the set of parameters, and the number of folds you want. Example:

GridSearchCV(SVC(), parameters, cv=5)

where SVC() is the estimator, parameters is your hyperparameter dictionary, and cv is the number of folds.
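As for the counting questions above, the arithmetic can be sketched directly. Assuming the grid from the question (learning rate and dropout rate, two levels each; the values below are made up for illustration) and cv=5, grid search performs one fit per (parameter combination, fold) pair, plus one final refit of the best model on the whole training set when refit=True (the scikit-learn default):

```python
from itertools import product

# Hypothetical grid matching the question: 2 hyperparameters x 2 levels each.
param_grid = {"learning_rate": [0.01, 0.1], "dropout_rate": [0.2, 0.5]}
cv = 5

combos = list(product(*param_grid.values()))  # all hyperparameter combinations
n_fits = len(combos) * cv                     # 4 combinations x 5 folds = 20
n_fits_with_refit = n_fits + 1                # +1 refit of the best combination
```

So for a single (non-nested) GridSearchCV this grid means 20 fits, not 80; and yes, running a second estimator (e.g. a random forest) means repeating the whole procedure for it. After fitting, you can inspect the actual numbers via attributes such as grid.n_splits_ and grid.cv_results_.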