Machine Learning (5): Model Evaluation and Optimization

2. Predicting chip quality (good vs. bad)

2.1 Find and remove anomalies using the Gaussian probability density function

Task 1: Using data_class_raw.csv, find the anomalous points with the Gaussian probability density function and remove them.

    # 1. load the data
    import pandas as pd
    import numpy as np
    data = pd.read_csv('data_class_raw.csv')
    data.head()

    # 2. define X and y
    X = data.drop(['y'], axis=1)
    y = data.loc[:, 'y']

    # 3. visualize the data
    %matplotlib inline
    from matplotlib import pyplot as plt
    fig1 = plt.figure(figsize=(5, 5))
    bad = plt.scatter(X.loc[:, 'x1'][y == 0], X.loc[:, 'x2'][y == 0])
    good = plt.scatter(X.loc[:, 'x1'][y == 1], X.loc[:, 'x2'][y == 1])
    plt.legend((good, bad), ('good', 'bad'))
    plt.title('raw data')
    plt.xlabel('x1')
    plt.ylabel('x2')
    plt.show()

    # 4. anomaly detection (fitted on the y == 0 samples only)
    from sklearn.covariance import EllipticEnvelope
    ad_model = EllipticEnvelope(contamination=0.02)
    ad_model.fit(X[y == 0])
    y_predict_bad = ad_model.predict(X[y == 0])  # -1 marks an anomaly, 1 an inlier
    print(y_predict_bad)

    # 5. visualize the detected anomalies
    fig2 = plt.figure(figsize=(5, 5))
    bad = plt.scatter(X.loc[:, 'x1'][y == 0], X.loc[:, 'x2'][y == 0])
    good = plt.scatter(X.loc[:, 'x1'][y == 1], X.loc[:, 'x2'][y == 1])
    plt.scatter(X.loc[:, 'x1'][y == 0][y_predict_bad == -1],
                X.loc[:, 'x2'][y == 0][y_predict_bad == -1],
                marker='x', s=150)
    plt.legend((good, bad), ('good', 'bad'))
    plt.title('raw data')
    plt.xlabel('x1')
    plt.ylabel('x2')
    plt.show()

Output: fig1, fig2.

2.2 Run PCA on data_class_processed.csv to determine the important dimensions and components

Task 2: Using data_class_processed.csv, run PCA and determine which dimensions and components matter.

    # 1. load the data
    data = pd.read_csv('data_class_processed.csv')
    data.head()

    # 2. define X and y
    X = data.drop(['y'], axis=1)
    y = data.loc[:, 'y']

    # 3. principal component analysis (PCA)
    from sklearn.preprocessing import StandardScaler
    from sklearn.decomposition import PCA
    X_norm = StandardScaler().fit_transform(X)
    pca = PCA(n_components=2)  # keep 2 principal components, i.e. a 2-D dataset
    X_reduced = pca.fit_transform(X_norm)
    var_ratio = pca.explained_variance_ratio_
    print(var_ratio)
    fig3 = plt.figure(figsize=(5, 5))
    plt.bar([1, 2], var_ratio)
    plt.show()

Output: fig3.

2.3 Split the data into training and test sets with sklearn

Task 3: Split the data using the parameters random_state=4, test_size=0.4.

    # train and test split: random_state=4, test_size=0.4
    from sklearn.model_selection import train_test_split
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=4, test_size=0.4)
    print(X_train.shape, X_test.shape, X.shape)
    # (21, 2) (14, 2) (35, 2)

2.4 Build a KNN model, classify, compute the accuracy, and visualize the decision boundary

Task 4: Build a KNN classifier with n_neighbors=10, compute the classification accuracy, and visualize the decision boundary.

    # 1. knn model
    from sklearn.neighbors import KNeighborsClassifier
    knn_10 = KNeighborsClassifier(n_neighbors=10)
    knn_10.fit(X_train, y_train)
    y_train_predict = knn_10.predict(X_train)
    y_test_predict = knn_10.predict(X_test)

    # calculate the accuracy
    from sklearn.metrics import accuracy_score
    accuracy_train = accuracy_score(y_train, y_train_predict)
    accuracy_test = accuracy_score(y_test, y_test_predict)
    print('training accuracy:', accuracy_train)
    print('testing accuracy:', accuracy_test)

    # 2. build the coordinate matrices of the grid points
    xx, yy = np.meshgrid(np.arange(0, 10, 0.05), np.arange(0, 10, 0.05))
    print(yy.shape)

    # 3. flatten the grid into one array of (x1, x2) rows
    x_range = np.c_[xx.ravel(), yy.ravel()]
    print(x_range.shape)

    # 4. predict the class of every grid point
    y_range_predict = knn_10.predict(x_range)

    # 5. visualize the knn result and boundary
    fig4 = plt.figure(figsize=(10, 10))
    knn_bad = plt.scatter(x_range[:, 0][y_range_predict == 0], x_range[:, 1][y_range_predict == 0])
    knn_good = plt.scatter(x_range[:, 0][y_range_predict == 1], x_range[:, 1][y_range_predict == 1])
    bad = plt.scatter(X.loc[:, 'x1'][y == 0], X.loc[:, 'x2'][y == 0])
    good = plt.scatter(X.loc[:, 'x1'][y == 1], X.loc[:, 'x2'][y == 1])
    plt.legend((good, bad, knn_good, knn_bad), ('good', 'bad', 'knn_good', 'knn_bad'))
    plt.title('prediction result')
    plt.xlabel('x1')
    plt.ylabel('x2')
    plt.show()

Output: fig4.

2.5 Compute the confusion matrix for the test set, then the accuracy, recall, specificity, precision, and F1 score

Task 5: Compute the confusion matrix for the test set, then the accuracy, recall, specificity, precision, and F1 score.

    # 1. import the library and build the confusion matrix
    from sklearn.metrics import confusion_matrix
    cm = confusion_matrix(y_test, y_test_predict)
    print(cm)

    # 2. define TP, TN, FP, FN (rows are true labels, columns are predictions)
    TP = cm[1, 1]
    TN = cm[0, 0]
    FP = cm[0, 1]
    FN = cm[1, 0]
    print(TP, TN, FP, FN)

    # 3. accuracy: Accuracy = (TP + TN) / (TP + TN + FP + FN)
    accuracy = (TP + TN) / (TP + TN + FP + FN)
    print(accuracy)

    # 4. recall (sensitivity): Recall = TP / (TP + FN)
    recall = TP / (TP + FN)
    print(recall)

    # 5. specificity: Specificity = TN / (TN + FP)
    specificity = TN / (TN + FP)
    print(specificity)

    # 6. precision: Precision = TP / (TP + FP)
    precision = TP / (TP + FP)
    print(precision)

    # 7. F1 score: F1 = 2 * Precision * Recall / (Precision + Recall)
    f1 = 2 * precision * recall / (precision + recall)
    print(f1)

2.6 Try different values of n_neighbors, compute the accuracy on the training and test sets, and visualize

Task 6: Try n_neighbors from 1 to 20, compute the accuracy on the training and test sets for each value, and plot the results.

    # 1. try different k and calculate the accuracy for each
    n = [i for i in range(1, 21)]
    accuracy_train = []
    accuracy_test = []
    for i in n:
        knn = KNeighborsClassifier(n_neighbors=i)
        knn.fit(X_train, y_train)
        y_train_predict = knn.predict(X_train)
        y_test_predict = knn.predict(X_test)
        accuracy_train_i = accuracy_score(y_train, y_train_predict)
        accuracy_test_i = accuracy_score(y_test, y_test_predict)
        accuracy_train.append(accuracy_train_i)
        accuracy_test.append(accuracy_test_i)
    print(accuracy_train, accuracy_test)

    # 2. visualize
    fig5 = plt.figure(figsize=(12, 5))
    plt.subplot(121)
    plt.plot(n, accuracy_train, marker='o')
    plt.title('training accuracy vs n_neighbors')
    plt.xlabel('n_neighbors')
    plt.ylabel('accuracy')
    plt.subplot(122)
    plt.plot(n, accuracy_test, marker='o')
    plt.title('testing accuracy vs n_neighbors')
    plt.xlabel('n_neighbors')
    plt.ylabel('accuracy')
    plt.show()

Output: fig5.
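
The metric formulas from section 2.5 divide by counts that can be zero on a small test set (for example, precision is undefined when the model never predicts the positive class). The following is a minimal sketch of one way to wrap those formulas. The helper name binary_metrics and the choice to return None for an undefined ratio are assumptions made for illustration and are not part of the original notebook; the counts in the usage lines are made up and are not results from the dataset.

```python
def binary_metrics(tp, tn, fp, fn):
    """Compute the section 2.5 metrics from raw confusion-matrix counts.

    Hypothetical helper, not from the original notebook. Any ratio whose
    denominator is zero is reported as None instead of raising an error.
    """
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else None

    accuracy = ratio(tp + tn, tp + tn + fp + fn)
    recall = ratio(tp, tp + fn)        # also called sensitivity
    specificity = ratio(tn, tn + fp)
    precision = ratio(tp, tp + fp)

    # F1 is only defined when precision and recall both exist and are not both 0.
    if precision is None or recall is None or precision + recall == 0:
        f1 = None
    else:
        f1 = 2 * precision * recall / (precision + recall)

    return {
        "accuracy": accuracy,
        "recall": recall,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Made-up counts for illustration only; not results from the dataset.
metrics = binary_metrics(tp=6, tn=4, fp=3, fn=1)
for name, value in metrics.items():
    print(name, value)
```

With the variables from section 2.5, the call would be binary_metrics(TP, TN, FP, FN). Returning None rather than 0 keeps an undefined metric distinguishable from a genuinely poor one; a different convention may suit other workflows better.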