Statistics for Business and Economics (商务与经济统计.ppt), 106 pages. Shared by a member on taowenge.com and available to read online.
© 2011 Pearson Education, Inc

Statistics for Business and Economics
Chapter 10
Simple Linear Regression

Contents
10.1 Probabilistic Models
10.2 Fitting the Model: The Least Squares Approach
10.3 Model Assumptions
10.4 Assessing the Utility of the Model: Making Inferences about the Slope \(\beta_1\)
10.5 The Coefficients of Correlation and Determination
10.6 Using the Model for Estimation and Prediction
10.7 A Complete Example

Learning Objectives
- Introduce the straight-line (simple linear regression) model as a means of relating one quantitative variable to another quantitative variable
- Introduce the correlation coefficient as a means of relating one quantitative variable to another quantitative variable
- Assess how well the simple linear regression model fits the sample data
- Employ the simple linear regression model for predicting the value of one variable from a specified value of another variable

10.1 Probabilistic Models

Models
- Representation of some phenomenon
- A mathematical model is a mathematical expression of some phenomenon
- Often describe relationships between variables
- Types: deterministic models and probabilistic models

Deterministic Models
- Hypothesize exact relationships
- Suitable when prediction error is negligible
- Example: force is exactly mass times acceleration, F = ma

© 1984-1994 T/Maker Co.

Probabilistic Models
- Hypothesize two components: deterministic and random error
- Example: sales volume (y) is 10 times advertising spending (x) plus random error, \(y = 10x + \varepsilon\)
- Random error may be due to factors other than advertising

General Form of Probabilistic Models
y = Deterministic component + Random error
where y is the variable of interest. We always assume that the mean value of the random error equals 0. This is equivalent to assuming that the mean value of y, E(y), equals the deterministic component of the model; that is,
E(y) = Deterministic component

A First-Order (Straight Line) Probabilistic Model
\[ y = \beta_0 + \beta_1 x + \varepsilon \]
where
y = Dependent or response variable (variable to be modeled)
x = Independent or predictor variable (variable used as a predictor of y)
\(E(y) = \beta_0 + \beta_1 x\) = Deterministic component
\(\varepsilon\) (epsilon) = Random error component
\(\beta_0\) (beta zero) = y-intercept of the line, that is, the point at which the line intercepts or cuts through the y-axis
\(\beta_1\) (beta one) = slope of the line, that is, the change (amount of increase or decrease) in the deterministic component of y for every 1-unit increase in x

Note: A positive slope implies that E(y) increases by the amount \(\beta_1\) for each unit increase in x. A negative slope implies that E(y) decreases by the amount \(|\beta_1|\).

Five-Step Procedure
Step 1: Hypothesize the deterministic component of
the model that relates the mean, E(y), to the independent variable x.
Step 2: Use the sample data to estimate unknown parameters in the model.
Step 3: Specify the probability distribution of the random error term and estimate the standard deviation of this distribution.
Step 4: Statistically evaluate the usefulness of the model.
Step 5: When satisfied that the model is useful, use it for prediction, estimation, and other purposes.

10.2 Fitting the Model: The Least Squares Approach

Scattergram
1. Plot of all \((x_i, y_i)\) pairs
2. Suggests how well model will fit
[Figure: scattergram of y versus x]

Thinking Challenge
- How would you draw a line through the points?
- How do you determine which line fits best?
[Figure: scattergram of y versus x]

Least Squares Line
The least squares line is one that has the following two properties:
1. The sum of the errors equals 0, i.e., mean error = 0.
2. The sum of squared errors (SSE) is smaller than for any other straight-line model, i.e., the error variance is minimum.

Formula for the Least Squares Estimates
Slope: \(\hat{\beta}_1 = SS_{xy} / SS_{xx}\)
y-intercept: \(\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}\)
where \(SS_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y})\), \(SS_{xx} = \sum (x_i - \bar{x})^2\), and n = sample size

Interpreting the Estimates of \(\beta_0\) and \(\beta_1\) in Simple Linear Regression
y-intercept: \(\hat{\beta}_0\) represents the predicted value of y when x = 0 (Caution: This value will not be meaningful if the value x = 0 is nonsensical or outside the range of the sample data.)
slope: \(\hat{\beta}_1\) represents the increase (or decrease) in y for every 1-unit increase in x (Caution: This interpretation is valid only
for x-values within the range of the sample data.)

Least Squares Graphically
[Figure: least squares line, y versus x]

Least Squares Example
You're a marketing analyst for Hasbro Toys. You gather the following data:

Ad Expenditure (100$)   Sales (Units)
1                       1
2                       1
3                       2
4                       2
5                       4

Find the least squares line relating sales and advertising.

[Figure: Scattergram, Sales vs. Advertising]

Parameter Estimation Solution

Parameter Estimation Computer Output

Parameter Estimates
Variable   DF   Parameter Estimate   Standard Error   T for H0: Param=0   Prob > |T|
INTERCEP   1    -0.1000              0.6350           -0.157              0.8849
ADVERT     1     0.7000              0.1914            3.656              0.0354

The INTERCEP estimate is \(\hat{\beta}_0\) and the ADVERT estimate is \(\hat{\beta}_1\).

Coefficient Interpretation Solution
1. Slope (\(\hat{\beta}_1\)): Sales volume (y) is expected to increase by 0.7 units for each $100 increase in advertising (x), over the sampled range of advertising expenditures from $100 to $500.
2. y-intercept (\(\hat{\beta}_0\)): Since 0 is outside of the range of the sampled values of x, the y-intercept has no meaningful interpretation.

[Figure: Regression Line Fitted to the Data, Sales vs. Advertising]

Least Squares Thinking Challenge
You're an economist for the county cooperative. You gather the following data:

Fertilizer (lb.)   Yield (lb.)
4                  3.0
6                  5.5
10                 6.5
12                 9.0

Find the least squares line relating crop yield and fertilizer.

[Figure: Scattergram, Crop Yield vs. Fertilizer*, Yield (lb.) against Fertilizer (lb.)]
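The least squares lines asked for in the Hasbro example and in the thinking challenge above can be computed directly. The following is a minimal sketch, assuming the standard least squares formulas (slope = SS_xy / SS_xx, intercept = mean of y minus slope times mean of x). The function name `least_squares` and the variable names are illustrative and do not come from the slides.

```python
def least_squares(x, y):
    """Return (b0, b1): least squares intercept and slope for y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of squares and cross-products about the means
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = ss_xy / ss_xx          # slope estimate
    b0 = y_bar - b1 * x_bar     # intercept estimate
    return b0, b1


# Hasbro example: ad expenditure (in $100) and sales (units)
b0_hasbro, b1_hasbro = least_squares([1, 2, 3, 4, 5], [1, 1, 2, 2, 4])
print(f"Hasbro:     y-hat = {b0_hasbro:.4f} + {b1_hasbro:.4f} x")

# Thinking challenge: fertilizer (lb.) and yield (lb.)
b0_fert, b1_fert = least_squares([4, 6, 10, 12], [3.0, 5.5, 6.5, 9.0])
print(f"Fertilizer: y-hat = {b0_fert:.4f} + {b1_fert:.4f} x")
```

For the Hasbro data the sketch gives an intercept close to -0.1 and a slope of 0.7, in line with the computer output; for the fertilizer data the slope is 0.65.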
Parameter Estimation Solution*

Coefficient Interpretation Solution*
1. Slope (\(\hat{\beta}_1\)): Crop yield (y) is expected to increase by .65 lb. for each 1 lb. increase in fertilizer (x).
2. y-intercept (\(\hat{\beta}_0\)): Since 0 is outside of the range of the sampled values of x, the y-intercept has no meaningful interpretation.

[Figure: Regression Line Fitted to the Data, Yield (lb.) against Fertilizer (lb.)]

10.3 Model Assumptions

Basic Assumptions of the Probability Distribution
Assumption 1: The mean of the probability distribution of \(\varepsilon\) is 0; that is, the average of the values of \(\varepsilon\) over an infinitely long series of experiments is 0 for each setting of the independent variable x. This assumption implies that the mean value of y, E(y), for a given value of x is \(E(y) = \beta_0 + \beta_1 x\).
Assumption 2: The variance of the probability distribution of \(\varepsilon\) is constant for all settings of the independent variable x. For our straight-line model, this assumption means that the variance of \(\varepsilon\) is equal to a constant, say \(\sigma^2\), for all values of x.
Assumption 3: The probability distribution of \(\varepsilon\) is normal.
Assumption 4: The values of \(\varepsilon\) associated with any two observed values of y are independent; that is, the value of \(\varepsilon\) associated with one value of y has no effect on the values of \(\varepsilon\) associated with other y values.

Estimation of \(\sigma^2\) for a (First-Order) Straight-Line Model
\[ s^2 = \frac{SSE}{n - 2}, \qquad SSE = \sum (y_i - \hat{y}_i)^2 \]
To estimate the standard deviation \(\sigma\) of \(\varepsilon\), we calculate
\[ s = \sqrt{s^2} = \sqrt{\frac{SSE}{n - 2}} \]
We will refer to s as the estimated standard error of the regression model.

Calculating SSE, \(s^2\), s Example
You're a marketing analyst for Hasbro Toys. You gather the following data:

Ad Expenditure (100$)   Sales (Units)
1                       1
2                       1
3                       2
4                       2
5                       4

Find SSE, \(s^2\), and s.

Calculating \(s^2\) and s Solution

10.4 Assessing the Utility of the Model: Making Inferences about the Slope \(\beta_1\)

Sampling Distribution of \(\hat{\beta}_1\)
If we make the four assumptions about \(\varepsilon\), the sampling distribution of the least squares estimator \(\hat{\beta}_1\) of the slope will be normal with mean \(\beta_1\) (the true slope) and standard deviation
\[ \sigma_{\hat{\beta}_1} = \frac{\sigma}{\sqrt{SS_{xx}}}, \qquad SS_{xx} = \sum (x_i - \bar{x})^2 \]
We estimate
\(\sigma_{\hat{\beta}_1}\) by \(s_{\hat{\beta}_1} = s / \sqrt{SS_{xx}}\) and refer to this quantity as the estimated standard error of the least squares slope \(\hat{\beta}_1\).

A Test of Model Utility: Simple Linear Regression
One-Tailed Test
\(H_0: \beta_1 = 0\)
\(H_a: \beta_1 < 0\) (or \(H_a: \beta_1 > 0\))
Test statistic: \(t = \hat{\beta}_1 / s_{\hat{\beta}_1}\)
Rejection region: \(t < -t_\alpha\) (or \(t > t_\alpha\) when \(H_a: \beta_1 > 0\))
where \(t_\alpha\) is based on (n - 2) degrees of freedom

Two-Tailed Test
\(H_0: \beta_1 = 0\)
\(H_a: \beta_1 \neq 0\)
Test statistic: \(t = \hat{\beta}_1 / s_{\hat{\beta}_1}\)
Rejection region: \(|t| > t_{\alpha/2}\)
where \(t_{\alpha/2}\) is based on (n - 2) degrees of freedom

Interpreting p-Values for Coefficients in Regression
Almost all statistical computer software packages report a two-tailed p-value for each of
the parameters in the regression model. For example, in simple linear regression, the p-value for the two-tailed test \(H_0: \beta_1 = 0\) versus \(H_a: \beta_1 \neq 0\) is given on the printout. If you want to conduct a one-tailed test of hypothesis, you will need to adjust the p-value reported on the printout as follows:

Upper-tailed test (\(H_a: \beta_1 > 0\)): p-value = p/2 if t > 0, and 1 - p/2 if t < 0
Lower-tailed test (\(H_a: \beta_1 < 0\)): p-value = p/2 if t < 0, and 1 - p/2 if t > 0
where p is the p-value reported on the printout and t is the value of the test statistic.

Parameter Estimates
Variable   DF   Parameter Estimate   Standard Error   T for H0: Param=0   Prob > |T|
INTERCEP   1    -0.1000              0.6350           -0.157              0.8849
ADVERT     1     0.7000              0.1914            3.656              0.0354

In the ADVERT row, 0.7000 is \(\hat{\beta}_1\), 0.1914 is \(s_{\hat{\beta}_1}\), 3.656 is \(t = \hat{\beta}_1 / s_{\hat{\beta}_1}\), and 0.0354 is the p-value.

10.5 The Coefficients of Correlation and Determination

Correlation Models
- Answers "How strong is the linear relationship between two variables?"
- Coefficient of correlation
- Sample correlation coefficient denoted r
- Values range from -1 to +1
- Measures degree of association
- Does not indicate cause-effect relationship

Coefficient of Correlation
\[ r = \frac{SS_{xy}}{\sqrt{SS_{xx}\, SS_{yy}}} \]
where
\(SS_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y})\)
\(SS_{xx} = \sum (x_i - \bar{x})^2\)
\(SS_{yy} = \sum (y_i - \bar{y})^2\)

Coefficient of Correlation Example
You're a marketing analyst for Hasbro Toys.

Ad Expenditure (100$)   Sales (Units)
1                       1
2                       1
3                       2
4                       2
5                       4

Calculate the coefficient of correlation.

Coefficient of Correlation Solution

A Test for Linear Correlation
One-Tailed Test
\(H_0: \rho = 0\)
\(H_a: \rho < 0\) (or \(H_a: \rho > 0\))
Test statistic: \(t = r\sqrt{n - 2} / \sqrt{1 - r^2}\)
Rejection region: \(t < -t_\alpha\) (or \(t > t_\alpha\))
Two-Tailed Test
\(H_0: \rho = 0\)
\(H_a: \rho \neq 0\)
Rejection region: \(|t| > t_{\alpha/2}\)
where the distribution of t depends on (n - 2) degrees of freedom

Condition Required for a Valid Test of Correlation
The sample of (x, y) values is randomly selected from a normal population.

Coefficient of Correlation Thinking Challenge
You're an economist for the county cooperative. You gather the following data:

Fertilizer (lb.)   Yield (lb.)
4                  3.0
6                  5.5
10                 6.5
12                 9.0

Find the coefficient of correlation.

Coefficient of Correlation Solution

Coefficient of Determination
It represents the proportion of the total sample variability around \(\bar{y}\) that is explained by the linear relationship between y and x.
\(0 \le r^2 \le 1\)
\(r^2 = (\text{coefficient of correlation})^2\)

Coefficient of Determination Example
You're a marketing analyst for Hasbro Toys. You know r = .904.

Ad Expenditure (100$)   Sales (Units)
1                       1
2                       1
3                       2
4                       2
5                       4

Calculate and interpret the coefficient of determination.
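The Hasbro quantities used in Sections 10.3 to 10.5 (SSE, s, the t statistic for the slope, r, and r squared) can be checked numerically. The following is a minimal sketch, assuming the standard definitions s^2 = SSE / (n - 2), t = b1 / (s / sqrt(SS_xx)), and r = SS_xy / sqrt(SS_xx * SS_yy). All variable names are illustrative and do not come from the slides.

```python
import math

# Hasbro data: ad expenditure (in $100) and sales (units)
x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_yy = sum((yi - y_bar) ** 2 for yi in y)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Least squares estimates
b1 = ss_xy / ss_xx
b0 = y_bar - b1 * x_bar

# Sum of squared errors and estimated standard error of the model
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))

# t statistic for H0: beta1 = 0
t_slope = b1 / (s / math.sqrt(ss_xx))

# Coefficient of correlation, coefficient of determination,
# and the t statistic for H0: rho = 0
r = ss_xy / math.sqrt(ss_xx * ss_yy)
r2 = r ** 2
t_corr = r * math.sqrt(n - 2) / math.sqrt(1 - r2)

print(f"SSE = {sse:.4f}, s = {s:.5f}")
print(f"t (slope) = {t_slope:.3f}, t (correlation) = {t_corr:.3f}")
print(f"r = {r:.3f}, r^2 = {r2:.4f}")
```

In simple linear regression the two t statistics coincide, so testing whether the slope is zero and testing whether the correlation is zero lead to the same conclusion.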
Coefficient of Determination Solution
\(r^2 = (\text{coefficient of correlation})^2\)
\(r^2 = (.904)^2\)
\(r^2 = .817\)
Interpretation: About 81.7% of the sample variation in Sales (y) can be explained by using Ad $ (x) to predict Sales (y) in the linear model.

\(r^2\) Computer Output
Root MSE    0.60553   R-square   0.8167
Dep Mean    2.00000   Adj R-sq   0.7556
C.V.       30.27650

R-square is \(r^2\); Adj R-sq is \(r^2\) adjusted for number of explanatory variables & sample size.

10.6 Using the Model for Estimation and Prediction

Regression Modeling Steps
1. Hypothesize deterministic component
2. Estimate unknown model parameters
3. Specify probability distribution of random error term; estimate standard deviation of error
4. Evaluate model
5. Use model for prediction and estimation

Prediction With Regression Models
Types of predictions
- Point estimates
- Interval estimates
What is
predicted
- Population mean response E(y) for given x: a point on the population regression line
- Individual response (\(y_i\)) for given x

What Is Predicted
[Figure: mean y, \(E(y) = \beta_0 + \beta_1 x\), and the individual prediction \(\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x\), both at \(x = x_p\)]

A 100(1 - \(\alpha\))% Confidence Interval Estimate for the Mean Value of y at \(x = x_p\)
\[ \hat{y} \pm t_{\alpha/2}\, s \sqrt{\frac{1}{n} + \frac{(x_p - \bar{x})^2}{SS_{xx}}} \]
df = n - 2

Factors Affecting Interval Width
1. Level of confidence (1 - \(\alpha\)): width increases as confidence increases
2. Data dispersion (s): width increases as variation increases
3. Sample size: width decreases as sample size increases
4. Distance of \(x_p\) from mean \(\bar{x}\): width increases as distance increases

Why Distance from Mean?
[Figure: Sample 1 line and Sample 2 line plotted against x; greater dispersion at \(x_1\) than at \(\bar{x}\)]

Confidence Interval Estimate Example
You're a marketing analyst for Hasbro Toys. You find \(\hat{\beta}_0 = -.1\), \(\hat{\beta}_1 = .7\) and s = .6055.

Ad Expenditure (100$)   Sales (Units)
1                       1
2                       1
3                       2
4                       2
5                       4

Find a 95% confidence interval for the mean sales when advertising is $400 (x = 4).

Confidence Interval Estimate Solution
x to be predicted: \(x_p = 4\)

A 100(1 - \(\alpha\))% Prediction Interval for an Individual New Value of y at \(x = x_p\)
\[ \hat{y} \pm t_{\alpha/2}\, s \sqrt{1 + \frac{1}{n} + \frac{(x_p - \bar{x})^2}{SS_{xx}}} \]
Note! df = n - 2
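The two interval estimates can be compared side by side on the Hasbro data at x = 4. The following is a minimal sketch, assuming the standard interval formulas for simple linear regression (the prediction interval has an extra 1 under the square root) and a tabled critical value t = 3.182 for a 95% interval with n - 2 = 3 degrees of freedom. The function name `interval_estimates` and the hard-coded critical value are illustrative assumptions, not part of the slides.

```python
import math


def interval_estimates(x, y, x_p, t_crit):
    """Return (y_hat, ci, pi) at x_p for the least squares line of y on x.

    ci is the confidence interval for the mean of y at x_p;
    pi is the prediction interval for an individual new y at x_p.
    t_crit must be the t critical value with n - 2 degrees of freedom.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = ss_xy / ss_xx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))

    y_hat = b0 + b1 * x_p
    leverage = 1 / n + (x_p - x_bar) ** 2 / ss_xx
    half_mean = t_crit * s * math.sqrt(leverage)        # mean response
    half_pred = t_crit * s * math.sqrt(1 + leverage)    # individual response
    ci = (y_hat - half_mean, y_hat + half_mean)
    pi = (y_hat - half_pred, y_hat + half_pred)
    return y_hat, ci, pi


# Hasbro data, 95% intervals at x_p = 4 (advertising of $400)
T_025_DF3 = 3.182  # assumed tabled value of t(.025) with 3 degrees of freedom
y_hat, ci, pi = interval_estimates([1, 2, 3, 4, 5], [1, 1, 2, 2, 4], 4, T_025_DF3)
print(f"y-hat = {y_hat:.3f}")
print(f"95% CI for mean sales:        ({ci[0]:.3f}, {ci[1]:.3f})")
print(f"95% PI for an individual y:   ({pi[0]:.3f}, {pi[1]:.3f})")
```

The prediction interval is always wider than the confidence interval at the same x, because it also accounts for the random error of the individual observation.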
Why the Extra S?
[Figure: at \(x = x_p\), the expected (mean) y on the line \(E(y) = \beta_0 + \beta_1 x\), the prediction \(\hat{y}\) on the line \(\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i\), and the y we're trying to predict, which differs from its mean by the error \(\varepsilon\)]

Prediction Interval Example
You're a marketing analyst for Hasbro Toys. You find \(\hat{\beta}_0 = -.1\), \(\hat{\beta}_1 = .7\) and s = .6055.

Ad Expenditure (100$)   Sales (Units)
1                       1
2                       1
3                       2
4                       2
5                       4

Predict the sales when advertising is $400. Use a 95% prediction interval.

Prediction Interval Solution
x to be predicted: \(x_p = 4\)

Interval Estimate Computer Output

Obs   Dep Var SALES   Pred Value   Std Err Predict   Low95% Mean   Upp95% Mean   Low95% Predict   Upp95% Predict
1     1.000           0.600        0.469             -0.892        2.092         -1.837           3.037
2     1.000           1.300        0.332              0.244        2.355         -0.897           3.497
3     2.000           2.000        0.271              1.138        2.861         -0.111           4.111
4     2.000           2.700        0.332              1.644        3.755          0.502           4.897
5     4.000           3.400        0.469              1.907        4.892          0.962           5.837

The row for Obs 4 gives the predicted y when x = 4. Std Err Predict is \(s_{\hat{y}}\), the Mean columns give the confidence interval, and the Predict columns give the prediction interval.

Confidence Intervals v. Prediction Intervals
[Figure: confidence and prediction intervals around the line \(\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i\), y versus x]

10.7 A Complete Example

Example
Suppose a fire insurance company wants to relate the amount of fire damage in major residential fires to the distance between the burning house and the nearest fire station. The study is to be conducted in a large suburb of a major city; a sample of 15 recent fires in this suburb is selected. The amount of damage, y, and the distance between the fire and the nearest fire station, x, are recorded for each fire.

Example