Slide 1 (22:29)
Introduction to Logistic Regression
Yu Chuanhua (宇传华)
Department of Epidemiology and Health Statistics, School of Public Health, Wuhan University
2011-05-31

Slide 2: New Words
LPM: linear probability model (线性概率模型)
Odds ratio (优势比)
Nominal variables (名义变量)
Dummy variable (哑变量)
Multiple logistic regression (多重Logistic回归)

Slide 3: CONTENTS
1. Review the Types of Variables
2. Variables in Logistic Regression
3. Why Can We Not Use Linear Regression for a Categorical Response?
4. Logistic Regression Model
5. What Is an Odds Ratio?
6. Multiple Logistic Regression

Slide 4: 1. Review the Types of Variables

Slide 5: Choosing the Scale of Measurement
Before analyzing, select the measurement scale for each variable.

Slide 6
Categorical (qualitative) variables: nominal variables, ordinal variables
Numerical (quantitative) variables: discrete variables, continuous variables

Slide 7: Nominal Variables

Slide 8: Ordinal Variables

Slide 9: Binomial Variables
Weather: good or bad?
Male or female?

Slide 10: Continuous Variables

Slide 11: 2. Variables in Logistic Regression

Slide 12
Predicted; Outcome; Dependent variable (应变量)

Slide 13: Types of Logistic Regression
1. Binary logistic regression
2. Multinomial logistic regression
3. Ordinal logistic regression

Slide 14: What Does Logistic Regression Do?
Logistic regression predicts the probability of specific outcomes.

Independent variables (自变量)     Binary dependent variable (二分类应变量)
Predictor variables                Predicted variable
Explanatory variables              Response variable
Covariables                        Outcome variable
Independent variables              Dependent variable

Slide 15: Independent Variables of Logistic Regression
Continuous variables
Dummy variables for nominal variables

Slide 16: 3. Why Can We Not Use Linear Regression for a Categorical Response?

Slide 17: Example: Failing or Passing an Exam
Let us define a variable Outcome:
    Outcome = 0 if the individual fails the exam
    Outcome = 1 if the individual passes the exam
Predictor variable: the number of hours spent studying.
Linear Probability Model (LPM):
    $\mathrm{Prob}(\mathrm{Outcome} = 1) = \alpha + \beta \times (\text{hours of study})$

Slide 18: Linear Probability Models (LPM)
[Figure: LPM plot annotated with a question mark.]

Slide 19: 4. Logistic Regression Model

Slide 20: Logistic Regression Curve
[Figure: logistic curve; horizontal axis x (1 to 21), vertical axis Probability (0.0 to 1.0).]

Slide 21: Logit Transformation
Logistic regression models work with transformed probabilities, called logits:
    $\mathrm{logit}(p_i) = \ln\left(\dfrac{p_i}{1 - p_i}\right)$
where
    $i$ indexes all cases (observations),
    $p_i$ is the probability that the event (a sale, for example) occurs in the $i$th case,
    $\ln$ is the natural log (to the base $e$).

Slide 22: Assumption
[Figure: axis marked 1 and 0.]

Slide 23: Logistic Regression Model
    $\mathrm{logit}(p_i) = b_0 + b_1 X_1$
where
    $\mathrm{logit}(p_i)$ is the logit transformation of the probability of the event,
    $b_0$ is the intercept of the regression line,
    $b_1$ is the slope of the regression line.
Annotation: linear relationship (线性关系).

Slide 24: LOGISTIC Procedure
SAS:
    PROC LOGISTIC DATA=SAS-data-set;
        CLASS variables;
        MODEL response=predictors;
        OUTPUT OUT=SAS-data-set keyword=name;
    RUN;
SPSS:
    Analyze > Regression > Binary Logistic
    Dependent: y
    Covariates: x
    Method: Forward: Wald
    Save: Predicted Values (Probabilities, Group membership)
    Options: CI for exp(B): 95%; Probability for Stepwise: Entry 0.1, Removal 0.15
Maximum likelihood estimation is a statistical method for estimating the coefficients of a model.
The likelihood function:
    $L = p_1 \times p_2 \times \cdots \times p_n$

Slide 25: SPSS Output Result
[Figure: SPSS output, with the odds ratio highlighted.]

Slide 26: LPM and Logistic Regression Models

Slide 27: Comparing LPM and the Logistic Curve

Slide 28: 5. What Is an Odds Ratio?
An odds ratio indicates how much more likely, with respect to odds, a certain event is to occur in one group relative to its occurrence in another group.

Slide 29: Probabilities from Odds
The odds, calculated as
    $\mathrm{odds} = \dfrac{p}{1 - p}$
can be rearranged to express the probability of an event in terms of the odds:
    $p = \dfrac{\mathrm{odds}}{1 + \mathrm{odds}}$

Slide 30: Probabilities and Odds

Slide 31: Probability of Outcome

Slide 32: Odds

Slide 33: Odds Ratio

Slide 34: Properties of the Odds Ratio
[Figure: two aligned scales. Odds ratio: from 0 through 1 to $\infty$. Regression coefficient $b$: from $-\infty$ through 0 to $+\infty$. Left of the midpoint: Group B more likely. Midpoint (odds ratio 1, $b = 0$): no association. Right of the midpoint: Group A more likely.]

Slide 35: Odds Ratio from a Logistic Regression Model
Estimated logistic regression model:
    $\mathrm{logit}(\hat{p}) = -8.469 + 0.495 \times (\text{study hours})$
Estimated odds ratio (for each additional study hour):
    $\text{odds ratio} = \dfrac{e^{-8.469 + 0.495(a+1)}}{e^{-8.469 + 0.495a}} = e^{b_1} = e^{0.495} = 1.640$

Slide 36: 6. Multiple Logistic Regression
    $\mathrm{logit}(p_i) = b_0 + b_1 X_1 + b_2 X_2 + b_3 X_3$

Slide 37: Backward Elimination Method
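
The per-hour odds ratio on slide 35 can be checked numerically. The following is a minimal Python sketch, assuming only the two coefficients quoted on that slide (-8.469 and 0.495); the function names are illustrative and do not come from the slides.

```python
import math

# Coefficients quoted on slide 35: logit(p) = -8.469 + 0.495 * study_hours
B0 = -8.469
B1 = 0.495


def odds(hours):
    """Odds of passing at a given number of study hours: exp(b0 + b1 * hours)."""
    return math.exp(B0 + B1 * hours)


def probability(hours):
    """Probability recovered from the odds: p = odds / (1 + odds)."""
    o = odds(hours)
    return o / (1.0 + o)


def odds_ratio_per_hour(a):
    """Odds at a + 1 hours divided by odds at a hours."""
    return odds(a + 1) / odds(a)


# The ratio is the same at any starting point a, because the intercept
# and the b1 * a term cancel, leaving exp(b1).
for a in (5, 10, 20):
    print(a, round(odds_ratio_per_hour(a), 3), round(probability(a), 3))
```

The intercept cancels in the ratio, which is why the slide can report a single odds ratio of 1.640 without fixing a particular number of study hours. The predicted probability, by contrast, does depend on the hours.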

Slide 38: Adjusted Odds Ratio

Slide 39: Interaction in Multiple Logistic Regression

Slide 40: Interaction Plot
[Figure: predicted logit by income level (Low, Medium, High), plotted separately for males and females.]

Slide 41: Backward Elimination Method

Slide 42: Multicollinearity in Multiple Logistic Regression
Multicollinearity does not bias the coefficients, but it inflates their standard errors. If a variable that you expect to be statistically significant is not, check the correlation coefficients among the predictors. If two variables are correlated above a chosen cutoff (for example 0.6, 0.7, or 0.8), try dropping the less theoretically important of the two.

Slide 43: Sample Sizes
Sample size = 15 to 20 times the number of variables.

Slide 44
Thanks for your attention!
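
The correlation check described on the multicollinearity slide (slide 42) could be sketched as follows. This is only an illustration: the predictor names and data are hypothetical, and the 0.7 cutoff is just one of the values the slide mentions, not a fixed rule.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def flag_correlated(predictors, threshold=0.7):
    """Return (name1, name2, r) for every predictor pair with |r| > threshold."""
    flagged = []
    for (name1, x1), (name2, x2) in combinations(predictors.items(), 2):
        r = pearson(x1, x2)
        if abs(r) > threshold:
            flagged.append((name1, name2, r))
    return flagged


# Hypothetical predictors, made up for this illustration only.
predictors = {
    "study_hours": [2, 4, 6, 8, 10, 12],
    "library_visits": [1, 2, 3, 4, 5, 7],
    "age": [21, 19, 23, 20, 22, 20],
}

flagged = flag_correlated(predictors, threshold=0.7)
for name1, name2, r in flagged:
    print(f"{name1} and {name2} are highly correlated (r = {r:.2f})")
```

A flagged pair is only a prompt to look closer. Which variable to drop remains the theoretical judgment the slide describes.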