Model Interpretability: Feature Importance / SHAP / LIME
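1. Feature importance

    # Assumes X_train / X_test are pandas DataFrames and that
    # y_train and feature_names are already defined.
    from sklearn.ensemble import RandomForestClassifier
    import pandas as pd

    # Built-in feature importance of tree models
    rf = RandomForestClassifier(n_estimators=100, random_state=42)
    rf.fit(X_train, y_train)
    importance = pd.Series(rf.feature_importances_, index=feature_names)
    print(importance.nlargest(10))

    # XGBoost feature importance
    import xgboost as xgb

    xgb_clf = xgb.XGBClassifier().fit(X_train, y_train)
    xgb.plot_importance(xgb_clf, max_num_features=10)

2. SHAP values

    import shap

    # Compute SHAP values
    explainer = shap.TreeExplainer(rf)
    shap_values = explainer.shap_values(X_test)

    # Note: the [1] indexing below assumes shap_values is a list with one
    # array per class (older SHAP releases). Newer releases return a single
    # array of shape (n_samples, n_features, n_classes); in that case use
    # shap_values[:, :, 1] to select class 1.

    # Summary plot
    shap.summary_plot(shap_values[1], X_test, feature_names=feature_names)

    # Explanation for a single sample
    shap.force_plot(explainer.expected_value[1], shap_values[1][0], X_test.iloc[0])

    # Dependence plot ("feature_name" is a placeholder for a real column name)
    shap.dependence_plot("feature_name", shap_values[1], X_test)

3. LIME

    from lime.lime_tabular import LimeTabularExplainer

    explainer = LimeTabularExplainer(
        X_train.values,
        feature_names=feature_names,
        class_names=["class_0", "class_1"],
        mode="classification",
    )

    # Explain a single sample
    exp = explainer.explain_instance(
        X_test.iloc[0].values, rf.predict_proba, num_features=10
    )
    exp.show_in_notebook()

Summary

    Method               Applicable models   Granularity      Speed
    Feature importance   Tree models         Global           Fast
    SHAP                 Any model           Global + local   Medium
    LIME                 Any model           Local            Fast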
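To make the SHAP values from section 2 more concrete, here is a minimal sketch of an exact Shapley value computation using only the standard library. It is not how the shap library computes them (TreeExplainer relies on a tree-specific algorithm); the function `shapley_values`, the model `toy_model`, and the background rows are all hypothetical, made up for illustration. The sketch assumes that a "missing" feature is replaced by its values from a small background set, and it shows the additivity property behind the force plot: the base value plus the sum of the SHAP values equals the prediction.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, background):
    """Exact Shapley values for one sample x (illustrative sketch only).

    predict:    function mapping a list of feature values to a number
    x:          the sample to explain
    background: reference rows used to fill in "missing" features
    """
    n = len(x)

    def value(subset):
        # Average prediction when only the features in `subset` take x's
        # values; all other features come from each background row in turn.
        total = 0.0
        for row in background:
            mixed = [x[i] if i in subset else row[i] for i in range(n)]
            total += predict(mixed)
        return total / len(background)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for combo in combinations(others, size):
                s = set(combo)
                # Marginal contribution of feature i to coalition s
                phi[i] += weight * (value(s | {i}) - value(s))

    base_value = value(set())  # average prediction over the background
    return phi, base_value


# Hypothetical toy model: linear terms plus one interaction between a and c.
def toy_model(features):
    a, b, c = features
    return 2.0 * a + 1.0 * b + 0.5 * a * c


# Made-up background data and sample, for illustration only.
background = [
    [0.0, 0.0, 0.0],
    [1.0, 1.0, 1.0],
    [2.0, 0.0, 1.0],
    [1.0, 3.0, 0.0],
]
x = [3.0, 1.0, 2.0]

phi, base_value = shapley_values(toy_model, x, background)
prediction = toy_model(x)

print("base value:      ", base_value)
print("SHAP values:     ", phi)
print("base + sum(phi): ", base_value + sum(phi))
print("prediction:      ", prediction)
```

The loop visits every subset of the other features, so the cost grows exponentially with the number of features. That is why this brute-force approach only works for toy cases and why practical explainers rely on model-specific algorithms or sampling.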