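I am working on a binary classification dataset and applying an XGBoost model to the problem. Once the model is trained, I plot the feature importance and one of the trees from the underlying tree ensemble. Please find these plots below.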
Questions
What do you mean by "datapoint"? Is a datapoint a single case/subject/patient/etc.? If so:
The feature importance plot and the tree you plotted both describe only the model; they are independent of the test set. Finding out which features were important in categorising a specific subject/case/datapoint in the test set is a more challenging task (see e.g. XGBoostExplainer / https://medium.com/applied-data-science/new-r-package-the-xgboost-explainer-51dd7d1aa211).
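The order and relative importance of the features is different for each subject/case/datapoint (see above), and there is no "class activation map" in xgboost: all of the data is analysed, and data deemed "not important" does not contribute to the final decision.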
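To make the difference between model-level importance and per-datapoint contributions concrete, here is a minimal sketch in plain Python. It is not xgboost's API and not the XGBoostExplainer implementation: the tree, the feature names (`age`, `bmi`, `smoker`), the thresholds and the node values are all made up for illustration. It assumes a path-based attribution, where each split credits its feature with the change in log-odds between the parent node and the child the row falls into. A real boosted model would sum this over many trees.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Node:
    # Hypothetical log-odds value at this node (made up for illustration).
    value: float
    # Split definition; feature is None for a leaf.
    feature: Optional[str] = None
    threshold: float = 0.0
    left: Optional["Node"] = None   # taken when row[feature] < threshold
    right: Optional["Node"] = None  # taken otherwise


def explain(tree: Node, row: Dict[str, float]) -> Tuple[float, Dict[str, float]]:
    """Return (bias, per-feature contributions) for one row.

    bias is the root value; each split adds (child value - parent value)
    to the feature it splits on. bias + sum(contributions) equals the
    value of the leaf the row ends up in.
    """
    contributions: Dict[str, float] = {}
    node = tree
    bias = node.value
    while node.feature is not None:
        child = node.left if row[node.feature] < node.threshold else node.right
        delta = child.value - node.value
        contributions[node.feature] = contributions.get(node.feature, 0.0) + delta
        node = child
    return bias, contributions


# A toy tree: the same model for every row.
TREE = Node(
    value=0.0, feature="age", threshold=50,
    left=Node(
        value=-1.0, feature="bmi", threshold=30,
        left=Node(value=-2.0), right=Node(value=0.5),
    ),
    right=Node(
        value=1.0, feature="smoker", threshold=0.5,
        left=Node(value=0.25), right=Node(value=2.0),
    ),
)

if __name__ == "__main__":
    rows = [
        {"age": 40, "bmi": 35, "smoker": 1},
        {"age": 60, "bmi": 35, "smoker": 1},
    ]
    for row in rows:
        bias, contributions = explain(TREE, row)
        print(row, "bias:", bias, "contributions:", contributions)
```

In this toy tree the first row is routed through the `age` and `bmi` splits, so `smoker` contributes nothing to its prediction, while the second row is routed through `age` and `smoker`, so `bmi` contributes nothing. The model is identical in both cases; only the path taken by each datapoint differs, which is why a single importance plot cannot answer the per-datapoint question.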