Elementary Statistics

Copyright 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley.
59 pages in total; edited in 2022.

Chapter 1: Introduction to Statistics
1-1 Overview
1-2 Types of Data
1-3 Critical Thinking
1-4 Design of Experiments

Created by Tom Wegleitner, Centreville, Virginia

Section 1-1 Overview

Overview
A common goal of studies, surveys, and other data-collecting tools is to collect data from a small part of a larger group so that we can learn something about the larger group. In this section we will look at some of the ways to describe data.

Definition
- Data: observations (such as measurements, genders, survey responses) that have been collected

Definition
- Statistics: a collection of methods for planning studies and experiments, obtaining data, and then organizing, summarizing, presenting, analyzing, interpreting, and drawing conclusions based on the data

Descriptive Statistics and Inferential Statistics
- Descriptive statistics: statistical methods that describe data using graphs, tables, and summary numbers; the conclusions do not go beyond the data at hand.
- Inferential statistics (statistical inference): statistical methods that use sample information to estimate, test hypotheses about, predict, or otherwise draw inferences about the population.

Application areas of statistics
actuarial work, agriculture, animal science, anthropology, archaeology, auditing, crystallography, demography, dentistry, ecology, econometrics, education, election forecasting and projection, engineering, epidemiology, finance, fisheries research, gambling, genetics, geography, geology, historical research, human genetics

Interesting Statistics
- Women blink nearly twice as much as men.
- Right-handed people live, on average, nine years longer than left-handed people.
- You share your birthday with at least nine million other people.
- China has more English-speaking people than the United States.
- American Airlines saves $40,000 in a year by eliminating one olive from each salad served in first class.

Definition
- Population: the complete collection of all elements (scores, people, measurements, and so on) to be studied; the collection is complete in the sense that it includes all subjects to be studied
- Sample: a subcollection of members selected from a population

Chapter Key Concepts
- Sample data must be collected in an appropriate way, such as through a process of random selection.
- If sample data are not collected in an appropriate way, the data may be so completely useless that no amount of statistical torturing can salvage them.

Section 1-2 Types of Data

Key Concept
The subject of statistics is largely about using sample data to make inferences (or generalizations) about an entire population. It is essential to know and understand the definitions that follow.

Definition
- Parameter: a numerical measurement describing some characteristic of a population (population ↔ parameter)

Definition
- Statistic: a numerical measurement describing some characteristic of a sample (sample ↔ statistic)

Definition
- Quantitative data: numbers representing counts or measurements
Example: the weights of supermodels
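
The parameter/statistic distinction defined above can be made concrete with a small sketch. The population here is hypothetical (1,000 simulated weights in kilograms, generated only for illustration), and the helper name `sample_mean` is an assumption, not something from the slides.

```python
import random
import statistics


def sample_mean(population, n, seed=0):
    """Draw n members at random (without replacement) and return their mean."""
    rng = random.Random(seed)
    return statistics.mean(rng.sample(population, n))


# Hypothetical population: simulated weights (kg) of 1,000 people.
_rng = random.Random(42)
population = [round(_rng.gauss(70, 10), 1) for _ in range(1000)]

# A parameter describes the whole population.
parameter = statistics.mean(population)

# A statistic describes one sample drawn from that population.
statistic = sample_mean(population, n=50)

print(f"parameter (population mean): {parameter:.2f}")
print(f"statistic (sample mean, n=50): {statistic:.2f}")
```

The two numbers will usually be close but not identical; using the statistic to say something about the parameter is the inference step that the Key Concept slide refers to.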

Definition
- Qualitative (or categorical or attribute) data: data that can be separated into different categories that are distinguished by some nonnumeric characteristic
Example: the genders (male/female) of professional athletes

Working with Quantitative Data
Quantitative data can be further described by distinguishing between discrete and continuous types.

Definition
- Discrete data: result when the number of possible values is either a finite number or a countable number
Example: the number of eggs that a hen lays

Definition
- Continuous (numerical) data: result from infinitely many possible values that correspond to some continuous scale that covers a range of values without gaps, interruptions, or jumps
Example: the amount of milk that a cow produces, e.g. 2.343115 gallons per day

Levels of Measurement
Another way to classify data is to use levels of measurement. Four of these levels are discussed in the following slides.

Definition
- Nominal level of measurement: characterized by data that consist of names, labels, or categories only; the data cannot be arranged in an ordering scheme (such as low to high)

Characteristics
1. The lowest level of measurement
2. Sorts things into parallel classes or groups; the order of the groups or categories can be changed freely
3. Each category can be assigned a numeric code
4. The categories must be exhaustive and mutually exclusive
5. Supports only the comparisons = and ≠
Examples: gender (male, female); type of enterprise (state-owned, collective, private, foreign-funded)

Definition
- Ordinal level of measurement: involves data that can be arranged in some order, but differences between data values either cannot be determined or are meaningless

Characteristics
- Classifies things and also gives the order of the categories
- Does not measure the exact difference between categories
- Data take the form of categories, but the categories are ordered
- Supports the comparisons < and >
Examples: product grade (first, second, third); exam grade (excellent, good, fair, pass, fail); attitude (strongly agree, agree, neutral, disagree, strongly disagree)

Definition
- Interval level of measurement: like the ordinal level, with the additional property that the difference between any two data values is meaningful; however, there is no natural zero starting point (where none of the quantity is present)

Characteristics
1. An exact measurement of things
2. More precise than the ordinal scale, and can be converted to an ordinal scale
3. Data take the form of numerical values
4. Supports the operations + and −
Examples: calendar years (1000, 2004); temperature in degrees Celsius or Fahrenheit

Definition
- Ratio level of measurement: the interval level with the additional property that there is also a natural zero starting point (where zero indicates that none of the quantity is present); for values at this level, differences and ratios are meaningful

Characteristics
1. An exact measurement of things
2. Grouped at the same level as interval data
3. Data take the form of numerical values
4. Has an absolute zero point
5. Supports the operations +, −, ×, and ÷
Examples: income, height, output

Summary - Levels of Measurement
- Nominal: categories only
- Ordinal: categories with some order
- Interval: differences but no natural starting point
- Ratio: differences and a natural starting point
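
One way to see why ratios are meaningful only at the ratio level is to change units and check whether a ratio survives. The sketch below is illustrative only: the temperatures and weights are made-up values, the helper names are assumptions, and 2.20462 is an approximate kilogram-to-pound factor.

```python
def celsius_to_fahrenheit(c):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return c * 9 / 5 + 32


def kg_to_lb(kg):
    """Convert kilograms to pounds (approximate factor)."""
    return kg * 2.20462


def ratio(a, b):
    """Return how many times larger b is than a."""
    return b / a


# Interval level: temperature has no natural zero, so the ratio
# depends on the unit chosen and carries no real meaning.
temp_ratio_c = ratio(10.0, 20.0)
temp_ratio_f = ratio(celsius_to_fahrenheit(10.0), celsius_to_fahrenheit(20.0))

# Ratio level: weight has a natural zero, so the ratio is the same
# (up to floating-point rounding) in any unit.
weight_ratio_kg = ratio(40.0, 80.0)
weight_ratio_lb = ratio(kg_to_lb(40.0), kg_to_lb(80.0))

print(temp_ratio_c, temp_ratio_f)
print(weight_ratio_kg, weight_ratio_lb)
```

In Celsius, 20 degrees looks "twice" 10 degrees, but the same two temperatures in Fahrenheit (50 and 68) give a different ratio, whereas 80 kg is twice 40 kg in either unit.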

Recap
In this section we have looked at:
- Basic definitions and terms describing data
- Parameters versus statistics
- Types of data (quantitative and qualitative)
- Levels of measurement

Section 1-3 Critical Thinking

Key Concepts
- Success in the introductory statistics course typically requires more common sense than mathematical expertise.
- This section is designed to illustrate how common sense is used when we think critically about data and statistics.

Misuses of Statistics

Misuse #1 - Bad Samples
- Voluntary response sample (or self-selected sample): one in which the respondents themselves decide whether to be included
In this case, valid conclusions can be made only about the specific group of people who agree to participate.

Misuse #2 - Small Samples
Conclusions should not be based on samples that are far too small.
Example: basing a school suspension rate on a sample of only three students
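
The three-student example above can be checked by listing every rate such a sample can produce. This is a minimal sketch; the function name `possible_rates` is an assumption made for illustration.

```python
from fractions import Fraction


def possible_rates(n):
    """List every proportion k/n that a sample of n students can yield."""
    return [Fraction(k, n) for k in range(n + 1)]


# With n = 3 the estimated rate can only be 0, 1/3, 2/3, or 1,
# so a single student moves the estimate by a third.
for rate in possible_rates(3):
    print(f"{rate} = {float(rate):.1%}")
```

With only three students the estimate jumps in steps of about 33 percentage points, so it cannot come close to most true suspension rates.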

Misuse #3 - Graphs
To correctly interpret a graph, you must analyze the numerical information given in the graph, so as not to be misled by the graph's shape.

Misuse #4 - Pictographs
Part (b) is designed to exaggerate the difference by increasing each dimension in proportion to the actual amounts of oil consumption.

Misuse #5 - Percentages
Misleading or unclear percentages are sometimes used. For example, if you take 100% of a quantity, you take it all, so 110% of an effort does not make sense.

Other Misuses of Statistics
- Loaded Questions
- Order of Questions
- Refusals
- Self-Interest Study
- Precise Numbers
- Partial Pictures
- Deliberate Distortions

Recap
In this section we have:
- Reviewed 12 misuses of statistics
- Illustrated how common sense can play a big role in interpreting data and statistics

Section 1-4 Design of Experiments

Key Concept
- If sample data are not collected in an appropriate way, the data may be so completely useless that no amount of statistical torturing can salvage them.

Definitions
- Observational study: observing and measuring specific characteristics without attempting to modify the subjects being studied
- Experiment: applying some treatment and then observing its effects on the subjects (subjects in experiments are called experimental units)

Definition
- Confounding: occurs in an experiment when the experimenter is not able to distinguish between the effects of different factors

Controlling Effects of Variables
- Blinding: the subject does not know whether he or she is receiving a treatment or a placebo (in a double-blind experiment, the experimenter does not know either)
- Blocks: groups of subjects with similar characteristics
- Completely Randomized Experimental Design: subjects are assigned to treatment groups through a process of random selection
- Rigorously Controlled Design: subjects are very carefully chosen

Replication and Sample Size
- Replication: repetition of an experiment when there are enough subjects to recognize the differences from different treatments
- Sample Size: use a sample size that is large enough to see the true nature of any effects, and obtain the sample using an appropriate method, such as one based on randomness
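
A completely randomized design, as described under Controlling Effects of Variables, can be sketched as a shuffle followed by dealing subjects into groups. The function name, the group labels, and the subject IDs below are hypothetical and chosen only for illustration.

```python
import random


def randomize(subjects, groups=("treatment", "placebo"), seed=None):
    """Randomly assign subjects to groups of (nearly) equal size."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)  # chance alone decides the order
    assignment = {group: [] for group in groups}
    # Deal the shuffled subjects out to the groups in turn.
    for i, subject in enumerate(shuffled):
        assignment[groups[i % len(groups)]].append(subject)
    return assignment


# Hypothetical subject IDs 1..10.
result = randomize(range(1, 11), seed=1)
for group, members in result.items():
    print(group, sorted(members))
```

Because chance alone decides who lands in which group, characteristics of the subjects tend to balance out across groups, which helps guard against confounding.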

Definitions
- Random Sample: members of the population are selected in such a way that each individual member has an equal chance of being selected
- Simple Random Sample (of size n): subjects are selected in such a way that every possible sample of the same size n has the same chance of being chosen
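
As a minimal sketch of a simple random sample, Python's standard `random.sample` draws n distinct members without replacement. The wrapper name `simple_random_sample` and the five-letter population are assumptions made for illustration.

```python
import random
from math import comb


def simple_random_sample(population, n, seed=None):
    """Draw n distinct members without replacement."""
    rng = random.Random(seed)
    return rng.sample(list(population), n)


# Hypothetical population of five labelled members.
population = ["A", "B", "C", "D", "E"]
n = 2

# Number of distinct samples of size n that could be chosen.
print("possible samples:", comb(len(population), n))
print("one draw:", simple_random_sample(population, n, seed=7))
```

With five members and n = 2 there are 10 possible samples, and under simple random sampling each of them should have the same chance of being the one drawn.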

Stratified Sampling
Subdivide the population into at least two different subgroups that share the same characteristics, then draw a sample from each subgroup (or stratum).
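
Stratified sampling can be sketched in two steps: group the units by stratum, then draw a simple random sample inside each stratum. The function name, the `stratum_of` callback, and the student records below are hypothetical and serve only as an illustration.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, n_per_stratum, seed=None):
    """Draw up to n_per_stratum units at random from every stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)  # group units by stratum
    return {
        name: rng.sample(members, min(n_per_stratum, len(members)))
        for name, members in strata.items()
    }


# Hypothetical population: (student id, class year) pairs.
students = [(i, "freshman" if i % 3 else "senior") for i in range(1, 31)]
picked = stratified_sample(students, lambda s: s[1], n_per_stratum=4, seed=5)
for stratum, members in picked.items():
    print(stratum, members)
```

Drawing the same number from each stratum is one design choice among several; allocating in proportion to stratum size is another common option.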

Requirement: the differences among units within each stratum should be as small as possible, and the differences between strata should be as large as possible.
Advantages:
(1) Besides estimating the whole population, stratified sampling also allows estimates for the subpopulation in each stratum.
(2) Strata can follow natural or administrative regions, which makes the sampling easier to organize and carry out.
(3) The sample is spread across every stratum, so it is distributed fairly evenly over the population.
(4) When the stratification is done well, it can improve the precision of the sample.

Systematic Sampling
Select some starting point and then select every kth element in the population.
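
Systematic sampling as defined above reduces to a slice with step k. In this minimal sketch the starting point is chosen at random among the first k positions, which is an assumption (the slide only says to select some starting point), and the function name is hypothetical.

```python
import random


def systematic_sample(population, k, seed=None):
    """Pick a start among the first k positions, then take every kth element."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # assumed: a random starting position in 0..k-1
    return list(population)[start::k]


# Hypothetical population: units numbered 0..99, sampling every 10th unit.
sample = systematic_sample(range(100), k=10, seed=3)
print(sample)
```

Every selected unit is exactly k positions after the previous one, so the sample is spread evenly through the list.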
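
(1) Systematic sampling with ordering by an unrelated characteristic
The characteristic used to order the units has no direct relationship to what the survey is measuring.
Feature: the order of the population is in effect still random, so the method is equivalent to simple random sampling.
Advantages:
(1) Simple and easy to carry out.
(2) The sample is distributed fairly evenly over the population, and the sampling error is usually smaller than with simple random sampling.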