Environmental Data Analysis Courseware (11)
Chapter 4 Pre-treatment

Be slow in choosing a friend, slower in changing.

Content
  Pretreatment
  Missing Value
  Standardization
  Outliers

Standardization
  Meet statistical model assumptions
  Make all the units comparable
  Interested in relative variables
  (Slide example with values in meters and seconds; the numbers are not recoverable from the extraction.)

Outliers
  What constitutes a true “outlier” depends on the question being asked and the analysis being conducted.
  Keep or delete accordingly; there is no general rule.
  Run the analysis with and without the suspect points.

Box plot
  Two of the most common graphical ways of detecting outliers are the boxplot and the scatterplot, as well as the histogram we mentioned before.

Ways to deal with outliers
  Set up a filter in your testing tool
  Remove or change outliers during post-test analysis
  Change the value of outliers
  Consider the underlying distribution
  Consider the value of mild outliers
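The boxplot check and the "with and without the suspect points" comparison can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the readings are made-up values, the function name iqr_outliers is hypothetical, and the 1.5 x IQR whisker rule is the usual boxplot convention rather than something the slides prescribe.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values that fall outside the boxplot whiskers.

    Assumption: whiskers at Q1 - k*IQR and Q3 + k*IQR with k = 1.5,
    the common boxplot convention (not specified in the slides).
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical concentration readings; the last one is the suspect point.
readings = [4.1, 3.9, 4.3, 4.0, 4.2, 3.8, 4.4, 12.5]

suspects = iqr_outliers(readings)
kept = [v for v in readings if v not in suspects]

# Run the same summary with and without the suspect points.
mean_with = statistics.mean(readings)
mean_without = statistics.mean(kept)

print("suspect points:", suspects)
print("mean with suspects:   ", round(mean_with, 3))
print("mean without suspects:", round(mean_without, 3))
```

If the two summaries differ enough to change the conclusion, the suspect points deserve a closer look before any decision to keep or delete them.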
General rules
  First, examine extreme values at all stages of analysis.
  Second, be aware of the potential impact of extreme values in the chosen statistical analysis.
  Third, delete extreme values only if that is justifiable on environmental grounds.

Good analysts have to continue to grow!

Homework
  1. The ways of data transformation and standardization
  2. The methods for treatment of “outliers”
  3. Standardization
  4. Pretreatment of data

Tips:
  Data standardization results from mapping the source data into a target structural representation.
  Data transformation is often rule-based: transformations are guided by mappings of data values from their derived position and values in the source into their intended position and values in the target.
  Standardization is a special case of transformation, employing rules that capture context, linguistics, and idioms that have been recognized as common over time through repeated analysis by the rules analyst or tool vendor.
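In the statistical sense used earlier in this chapter (making all the units comparable), standardization is also a rule applied to every value. A minimal sketch, assuming the common z-score rule z = (x - mean) / sd with the sample standard deviation; the slides do not state which rule they use, and the variable names and numbers below are hypothetical.

```python
import statistics


def standardize(values):
    """Map each value to its z-score: (x - mean) / sd.

    Assumption: sd is the sample standard deviation (n - 1 denominator).
    """
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


# Hypothetical variables measured in different units.
elevation_m = [120.0, 450.0, 980.0, 1500.0]   # meters
travel_time_s = [0.2, 0.5, 0.9, 1.4]          # seconds

z_elevation = standardize(elevation_m)
z_time = standardize(travel_time_s)

# After standardization both variables are unitless,
# with mean 0 and standard deviation 1.
print([round(z, 3) for z in z_elevation])
print([round(z, 3) for z in z_time])
```

Once both variables are on this scale, a difference of one unit means one standard deviation in either variable, so they can be compared or combined in one model.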