Spatial Distribution Patterns of Populations and Sampling
1. Significance
An ecological property of a population is whether its individuals are aggregated or randomly distributed in space. Knowing the pattern guides the choice of sampling method and provides its theoretical basis.

2. Classification
Random distribution: Poisson distribution
Aggregated distribution: negative binomial distribution, Neyman distribution, Poisson-binomial distribution

The simplest view of spatial patterning can be obtained by adopting an individual orientation and asking the question: given the location of one individual, what is the probability that another individual is nearby? There are three possibilities:
1. This probability is increased: aggregated pattern
2. This probability is reduced: uniform pattern
3. This probability is unaffected: random pattern

[Figure panels: Random, Aggregated, Uniform]
Figure 4.3 Three possible types of spatial patterning
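of individual animals or plants in a population.

3. Theoretical formulas for frequency distributions

(1) Poisson distribution

$$p(k;\lambda) = \frac{\lambda^{k}}{k!}\,e^{-\lambda}, \qquad k = 0, 1, 2, \ldots$$

This is called the Poisson distribution; $\lambda$ is its parameter.

Example: the field distribution of locust nymphs.

Example: in a survey of bus passenger flow, the number of passenger groups arriving in each 20-second interval was counted from about 10:30 to 11:47 one morning, giving 230 records.

Groups arriving, i             0      1      2      3      4    Total
Frequency, n_i               100     81     34      9      6      230
Relative frequency, n_i/n   0.43   0.35   0.15   0.04   0.03
Poisson probability, p_i    0.42   0.36   0.16   0.05   0.01

Here $p_i = \frac{\lambda^{i}}{i!}e^{-\lambda}$ with $\lambda$ taken as 0.87, from

$$\frac{100 \times 0 + 81 \times 1 + 34 \times 2 + 9 \times 3 + 6 \times 4}{230} = 0.869$$

Many random phenomena have been found to follow the Poisson distribution:
(1) Social life and service industries, e.g. the number of calls arriving at a telephone exchange, or the number of passengers arriving at a bus stop
(2) Physics: the number of particles from radioactive decay that fall in a given region
(3) The spatial distribution of individual insects

Taking calls at a telephone exchange as the example, the process has three properties:
(1) Stationarity: the number of calls arriving in $[t_0, t_0 + t)$ depends only on the interval length $t$, not on the starting time $t_0$.
(2) Independent increments (no after-effect): the event that $k$ calls arrive in $[t_0, t_0 + t)$ is independent of events that occurred before $t_0$.
(3) Ordinariness: in a sufficiently small time interval, at most one call arrives.

Example (locust nymphs):

Insects per sample, x   Frequency, f   f·x
0                       225              0
1                       130            130
2                        40             80
3                        10             30
4                         3             12
Total                   408            252

$$\bar{x} = \frac{\sum fx}{\sum f} = \frac{252}{408} = 0.618, \qquad p_0 = \frac{0.618^{0}}{0!}e^{-0.618} = 0.5391$$

Theoretical number of samples with zero insects: $n p_0 = 408 \times 0.5391 = 219.9$
Theoretical number of samples with one insect: $n p_1 = 135.9$

Insects per sample, x   Observed (o)   Theoretical (c)   (o-c)²/c
0                       225            219.9             0.11
1                       130            135.9             0.26
2                        40             42.2             0.09
3                        10              8.7             0.21
4                         3              1.3             2.22
Total                                                    2.89

$$\chi^2 = \sum \frac{(o - c)^2}{c} = 2.89$$

From the table, $\chi^2_{0.05} = 7.815$ (3 degrees of freedom). The computed $\chi^2 = 2.89 < \chi^2_{0.05}$, so the result is not a small-probability event ($p > 0.05$) and there is no reason to reject the hypothesis.

The $\chi^2$ test for discrete data

In 1900, Pearson proposed $\chi^2$ as a measure of the deviation between actual numbers (observed values) and predicted numbers (theoretical values), defined as

$$\chi^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$$

The expected number in every class must be at least 5. When the expected number in a class is below 5, pool that class with one or more adjacent classes until the expected number exceeds 5, and then compute $\chi^2$ with the formula above.

Theory and method of the $\chi^2$ test

1. Formula

$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$

$O$ is the observed value and $E$ is the theoretically derived value. The basic principle is that the size of $\chi^2$ is determined by how far the observed values deviate from the theoretical values.
$H_0$: the frequencies of the observed distribution equal those of the theoretical distribution.
$H_1$: they differ, i.e. the two samples come from different populations.

2. Properties of the $\chi^2$ distribution
[Figure: $\chi^2$ density curves for df = 1, df = 3, df = 5]
(1) The $\chi^2$ distribution lies on the interval $[0, +\infty)$. Its skewness increases as the degrees of freedom decrease; when df = 1 the curve has the vertical axis as an asymptote.
(2) As df increases, the $\chi^2$ distribution becomes more symmetric; when df > 30 it approaches the normal distribution.

3. Basic steps of the $\chi^2$ test
(1) State the hypotheses $H_0$ and $H_1$ and set the test level, $\alpha = 0.05$.
(2) Compute the test statistic $\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$.
(3) Determine the probability $P$: compute the degrees of freedom, df = k - 1, and look up the critical value of $\chi^2$ in a statistical table using $\alpha$ and df.
(4) Judge the result from the relation between the statistic and the critical value:

χ² value               P         H0             Conclusion
χ² < χ²(0.05, df)      > 0.05    not rejected   difference not significant
χ² ≥ χ²(0.05, df)      ≤ 0.05    rejected       difference significant

Example: suppose the male-to-female ratio of infants born in a certain place is 1:1. A researcher drew a sample of 10,000 infants, of whom 5,100 were boys and 4,900 were girls,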
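and asks whether this confirms or rejects the hypothesis.

$H_0$: the sex ratio of infants born there is 1:1
$H_1$: the sex ratio is not 1:1

$$\chi^2 = \frac{(5100 - 5000)^2}{5000} + \frac{(4900 - 5000)^2}{5000} = 4$$

Since $\chi^2 = 4 > \chi^2_{0.05,1} = 3.84$, $H_0$ is rejected.

Note: when df = 1, a continuity correction is needed. The corrected statistic $\chi^2_c$ is

$$\chi^2_c = \sum_{i=1}^{k} \frac{(|O_i - E_i| - 0.5)^2}{E_i}$$

Goodness-of-fit test
A hypothesis test that compares whether observed numbers agree with theoretical numbers is called a goodness-of-fit test. In genetics, for example, the $\chi^2$ test is often used to determine whether results conform to Mendel's law of segregation, the law of independent assortment, and so on.

Example: in a carp genetics experiment, Hebao red carp (red) were crossed with Xiangjiang wild carp (blue-grey). The body-colour segregation counts obtained in the $F_2$ generation are listed in Table 5-2. Do the observed values agree with Mendel's blue-grey : red = 3:1 ratio for a single pair of alleles?

Table: observed F2 results of the carp genetics experiment

Body colour          Blue-grey   Red   Total
F2 observed count    1503        99    1602

(1) $H_0$: body-colour segregation in the carp $F_2$ conforms to the 3:1 ratio. $H_A$: it does not.
(2) Take the significance level $\alpha = 0.05$.
(3) Compute $\chi^2_c$.
Theoretical number of blue-grey fish: $E_1 = 1602 \times \frac{3}{4} = 1201.5$
Theoretical number of red fish: $E_2 = 1602 \times \frac{1}{4} = 400.5$

$$\chi^2_c = \frac{(|1503 - 1201.5| - 0.5)^2}{1201.5} + \frac{(|99 - 400.5| - 0.5)^2}{400.5} = 301.63$$

(4) Look up the $\chi^2$ table. For df = 1, $\chi^2_{0.05} = 3.84$. Since $\chi^2_c > \chi^2_{0.05,1}$, reject $H_0$ and accept $H_A$: body-colour segregation in the carp $F_2$ does not conform to the 3:1 ratio.

(2) Negative binomial distribution

The positive binomial distribution consists of the terms of the expansion of $(p + q)^n$, where $n$ is the total number of individuals and $p$, $q$ are the expected proportions of the two contrasting classes. Student (1907). The negative binomial distribution corresponds to the expansion of

$$(q - p)^{-k}, \qquad q - p = 1, \quad m = pk$$

where $m$ is the population mean. Expanding this expression, the probability that a sample unit contains $r$ individuals is

$$P_r = \frac{(k + r - 1)!}{r!\,(k - 1)!} \cdot \frac{p^{r}}{q^{k+r}}$$

The parameters $p$ and $k$ can be estimated by the method of moments:

$$p = \frac{s^2}{\bar{x}} - 1, \qquad k = \frac{\bar{x}}{p}$$

From this it follows that

$$V = m + \frac{m^2}{k}$$

where $V$ is the variance and $m$ the mean. When $k \to \infty$, $V \to m$ and the negative binomial tends to the Poisson; when $k \to 0$, $V \to \infty$.

Variance-to-mean ratio

$$C = \frac{s^2}{\bar{x}} = \frac{\sum (x_i - \bar{x})^2}{\bar{x}(n - 1)}$$

$C$ approximately follows a normal distribution with mean 1 and variance $\frac{2n}{(n-1)^2}$, so the 95% confidence interval of $C$ is

$$1 \pm 2\sqrt{\frac{2n}{(n - 1)^2}}$$

If $C$ falls inside the interval, the distribution is random; if it falls outside, the distribution is aggregated.

In the locust-nymph example above:

$$I = \frac{s^2}{\bar{x}} = \frac{0.669}{0.618} = 1.08, \qquad 2\sqrt{\frac{2n}{(n-1)^2}} = 2\sqrt{\frac{816}{165649}} = 2 \times 0.07 = 0.14$$

Since $0.86 < 1.08 < 1.14$, the locust nymphs follow a Poisson distribution.

Index of clumping

$$I = \frac{V}{m} - 1$$

$I = 0$: random distribution; $I > 0$: aggregated distribution. When random mortality reduces the population to a given fraction of its original size, the degree of aggregation $I$ is reduced to the same fraction.

Index of Dispersion Test. We define an index of dispersion I to be

$$I = \frac{\text{Observed variance}}{\text{Observed mean}} = \frac{s^2}{\bar{x}}$$

For the theoretical Poisson distribution, the variance equals the mean, so the expected value of I is always 1.0 in a Poisson world. The simplest test statistic for the index of dispersion is a chi-squared one:

$$\chi^2 = I\,(n - 1)$$

where
I = Index of dispersion (as defined in equation 4.3)
n = Number of quadrats counted
$\chi^2$ = value of chi-squared with (n-1) degrees of freedom.

Example: 25 samples were taken to investigate the field distribution of earthworms.

Earthworms per sample   0   1   2   3   4   5   6   Total
Frequency               4   8   2   5   2   3   1   25

$$n = 25, \quad \bar{x} = 2.24, \quad s = 1.809$$

$$I = \frac{s^2}{\bar{x}} = \frac{3.27}{2.24} = 1.46, \qquad \chi^2 = I\,(n - 1) = 1.46 \times (25 - 1) = 35.0$$

With 24 degrees of freedom, $\chi^2_{0.975} = 12.40$ and $\chi^2_{0.025} = 39.36$. Because the observed chi-squared satisfies $12.40 < 35.0 < 39.36$, we accept the null hypothesis: the field distribution of the earthworms fits a Poisson distribution.

The parameter k of the negative binomial distribution has been proposed as an index:

$$V = m + \frac{m^2}{k}$$

When $k \to \infty$, $V = m$: the negative binomial tends to the Poisson, and the individuals are distributed completely at random.
When $k \to 0$, $V \to \infty$: the population is distributed extremely unevenly and aggregation is extremely high.
$\frac{1}{k}$ therefore serves as a measure of aggregation.
Properties of k: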
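When population density decreases because of random mortality, k stays unchanged. It therefore expresses an intrinsic feature of the population's spatial distribution that is independent of density.

Taylor's power law
From a large body of biological data, Taylor derived the following formula, known as the power law:

$$s^2 = a\,m^{b}, \qquad \log s^2 = \log a + b \log m$$

When $\log a = 0$ and $b = 1$, $s^2 = m$: the population is randomly distributed at all densities.
When $\log a > 0$ and $b = 1$: the population is aggregated at all densities, but the degree of aggregation does not depend on density.
When $\log a > 0$ and $b > 1$: the population is aggregated at all densities, and the degree of aggregation is density dependent.
When $\log a < 0$ and $b < 1$: $\frac{s^2}{m} = a\,m^{b-1}$ falls as $m$ rises, so the higher the density, the more uniform the distribution (the lower the aggregation).

Mean crowding

$$m^{*} = \frac{\sum_{j=1}^{n} x_j (x_j - 1)}{\sum_{j=1}^{n} x_j}$$

Example: four quadrats a, b, c, d contain 1, 0, 2 and 3 individuals, so $x_1 = 1$, $x_2 = 0$, $x_3 = 2$, $x_4 = 3$ and $n = 4$.
A: one individual living alone, $1 \times (1 - 1) = 0$
B: no individuals, so no neighbours
C: two individuals, each with the other as neighbour, $2 \times (2 - 1) = 2$
D: each individual has two neighbours, $3 \times (3 - 1) = 6$
Total number of "neighbours": 0 + 0 + 2 + 6 = 8

$$m^{*} = \frac{8}{6} = 1.33$$

On average each individual has 1.33 neighbours.

It can be shown that

$$m^{*} = m + \left(\frac{s^2}{m} - 1\right)$$

For a random distribution $s^2 = m$, so $m^{*} = m$.

Index of aggregation $\frac{m^{*}}{m}$:
$\frac{m^{*}}{m} = 1$: random distribution
$\frac{m^{*}}{m} < 1$: uniform distribution
$\frac{m^{*}}{m} > 1$: aggregated distribution

6. Iwao (1968, 1972) $m^{*}$-$m$ regression method
Iwao found the linear relation

$$m^{*} = \alpha + \beta m$$

in which the intercept $\alpha$ is compared with 0 and the slope $\beta$ with 1 ($\alpha = 0$, $\beta = 1$ for a random distribution).

Sampling

The article states: "Any experiment may be regarded as forming an individual of a 'population' of experiments which might be performed under the same conditions. A series of experiments is a sample drawn from this population."

1. Population and sample
Suppose a cotton field contains N cotton plants, and the numbers of a certain pest on each plant are $X_1, X_2, \ldots, X_N$.

Population mean: $\bar{X} = \frac{1}{N}\sum_{i=1}^{N} X_i$
Population variance: $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (X_i - \bar{X})^2$
Population standard deviation: $\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N} (X_i - \bar{X})^2}$

From the population of N plants, draw a random sample of n plants (n < N), with pest counts $x_1, x_2, \ldots, x_n$.

Sample mean: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
Sample variance: $S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$
Sample standard deviation: $S = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2}$

Purpose: to draw inferences about the population from the sample.

In 1908, "Student" published the t distribution:

$$t = \frac{\bar{x} - \mu}{S_{\bar{x}}}, \qquad \text{d.f.} = n - 1$$

where $\mu$ is the population mean.
(1) As $n \to \infty$, the t distribution tends to the normal distribution.
(2) The t distribution is symmetric, and the centre line of its curve is at $t = 0$.

Example: 50 cotton plants were surveyed at random to estimate the number of pests in the field.

Mean number of insects per plant: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{287}{50} = 5.74$
Variance: $S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2 = 20.95$
Standard deviation: $S = \sqrt{20.95} = 4.57$
Standard error: $S_{\bar{x}} = \frac{S}{\sqrt{n}} = \frac{4.57}{\sqrt{50}} = 0.646$

$$\bar{x} \pm t_{0.05} S_{\bar{x}} = 5.74 \pm 2 \times 0.64, \qquad 4.46 < \mu < 7.02$$

Sample size
Let d be the allowable error. Then

$$d = t_{0.05}\, S_{\bar{x}} = \frac{t\,s}{\sqrt{n}}, \qquad n = \frac{t^2 s^2}{d^2}$$

At 95% probability $t \approx 2$, so $n = \frac{4 s^2}{d^2}$.

Example: Hongze Lake locust area

Insects, x   Samples, f   f·x
0            17             0
1            53            53
2            18            36
3            10            30
4             2             8
Total        100          127

$$\bar{x} = \frac{127}{100} = 1.27, \qquad s^2 = 0.865, \qquad t_{0.05} = 2$$

If the allowable error is $d = 0.5$: $n = \frac{4 \times 0.865}{0.25} = 13.84$
If the allowable error is $d = 0.1$: $n = \frac{4 \times 0.865}{0.01} = 346$

Now introduce the coefficient of variation,

$$CV = \frac{s}{\bar{x}}$$

where s = standard deviation and $\bar{x}$ = observed mean. The absolute error $d = t\,s_{\bar{x}}$ can then be written as a relative error r (as a percentage):

$$r = \frac{100\, t\, s_{\bar{x}}}{\bar{x}} = \frac{100\, t\, s}{\bar{x}\sqrt{n}}, \qquad n = \frac{100^2\, t^2 s^2}{\bar{x}^2 r^2}$$

With $t_{0.05} \approx 2$ and $CV = \frac{s}{\bar{x}}$:

$$n = \left(\frac{100\, CV\, t}{r}\right)^2 \approx \left(\frac{200\, CV}{r}\right)^2 \qquad \text{(equation 1)}$$

Comparison of two means
Suppose, for example, that we want to compare whether the weight of the same fish species differs between two ponds. The typical method is to draw a certain number of samples from each pond and use a t-test to check whether the two sample means differ.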
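The two sample-size formulas for estimating a mean, $n = t^2 s^2 / d^2$ for an absolute error and $n = (100\,CV\,t/r)^2$ for a relative error, can be checked numerically. The following is a minimal sketch rather than a definitive implementation: the function names are hypothetical, and $t = 2$ is assumed as the 95% approximation used in the text. The inputs are the Hongze Lake figures ($\bar{x} = 1.27$, $s^2 = 0.865$).

```python
import math


def n_for_absolute_error(s2, d, t=2.0):
    """Sample size n = t^2 * s^2 / d^2 for an allowable absolute error d.

    s2 is the sample variance; t=2.0 is the assumed 95% approximation.
    """
    return t ** 2 * s2 / d ** 2


def n_for_relative_error(cv, r_percent, t=2.0):
    """Sample size n = (100 * CV * t / r)^2 for a relative error r in percent."""
    return (100.0 * cv * t / r_percent) ** 2


# Hongze Lake locust example: mean 1.27, variance 0.865.
mean, s2 = 1.27, 0.865
n_coarse = n_for_absolute_error(s2, d=0.5)   # about 13.84
n_fine = n_for_absolute_error(s2, d=0.1)     # about 346

# A 10% relative error corresponds to an absolute error of 0.1 * mean,
# so both formulas should give the same sample size.
cv = math.sqrt(s2) / mean
n_rel = n_for_relative_error(cv, r_percent=10.0)
n_abs = n_for_absolute_error(s2, d=0.10 * mean)

print(round(n_coarse, 2), round(n_fine, 2), round(n_rel, 1), round(n_abs, 1))
```

Because the result is a number of sampling units, in practice it would be rounded up to the next whole number.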
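But how can we decide, before sampling, how many samples to take? Snedecor and Cochran (1967, p. 113) proposed the following approximate formula:

$$n \approx \frac{2\,(Z_\alpha + Z_\beta)^2\, s^2}{d^2}$$

generally for $n > 50$, where
n = sample size drawn from each of the two populations
$Z_\alpha$ = standard normal deviate for level $\alpha$ ($Z_{0.05} = 1.96$; $Z_{0.01} = 2.576$)
$Z_\beta$ = standard normal deviate for the Type II error probability $\beta$ (see the table below)
$s^2$ = variance of the measurements (known or guessed)
$d = |\mu_A - \mu_B|$ = smallest difference between the two means that you wish to detect with probability $1 - \beta$

Type II error (β)   Power (1 - β)   Two-tailed Z_β
0.40                0.60            0.25
0.20                0.80            0.84
0.10                0.90            1.28
0.05                0.95            1.64
0.01                0.99            2.33
0.001               0.999           2.58

Decisions
The greater the power, the more reliable the decision.

                Do not reject H0                        Reject H0
H0 is true      Correct decision (probability 1 - α)    Type I error (probability α)
H0 is false     Type II error (P = β)                   Correct decision (P = 1 - β, power)

Example. Suppose that in the example above the difference in means we wish to detect is $d = |\mu_A - \mu_B| = 4$ g, and $s = 9.4$ g is known from previous studies. If $\alpha = 0.01$ and $\beta = 0.05$, then

$$n \approx \frac{2\,(2.576 + 1.64)^2\,(9.4)^2}{4^2} = 196.3 \text{ fish}$$

2. SAMPLE SIZE FOR DISCRETE VARIABLES
Counts of the numbers of plants in a quadrat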
or the numbers of eggs in a nest differ from continuous variables in their statistical properties. The frequency distribution of counts will often be described by either the binomial distribution, the Poisson distribution or the negative binomial distribution (Elliott 1977). The sampling properties
of these distributions differ, so we require a different approach to estimating sample sizes needed for counts.

(1) Proportions and Percentages
Proportions like the sex ratio or fraction of juveniles in a population are described statistically by the binomial distribution. All the organisms are classified into two classes, and the distribution has only two parameters:
p = Proportion of one type in the population
q = 1 - p = Proportion of the other type in the population

If sample size is above 20, we can use the normal approximation to the confidence interval:

$$\hat{p} \pm t_\alpha\, s_{\hat{p}}$$

where
$\hat{p}$ = Observed proportion
$t_\alpha$ = Value of Student's t-distribution for n-1 degrees of freedom
$s_{\hat{p}}$ = Standard error of $\hat{p}$ = $\sqrt{\hat{p}\hat{q}/n}$

Thus the desired margin of error is

$$d = t_\alpha\, s_{\hat{p}} = t_\alpha \sqrt{\frac{\hat{p}\hat{q}}{n}}$$

Solving for n, the sample size required is

$$n = \frac{t_\alpha^2\, \hat{p}\hat{q}}{d^2}$$

where
n = Sample size needed for estimating the proportion p
d = Desired margin of error in our estimate

As a first approximation for $\alpha = 0.05$ we can use $t_\alpha = 2.0$.
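The sample-size formula for a proportion, $n = t_\alpha^2\,\hat{p}\hat{q}/d^2$, can be sketched in a few lines. This is an illustration under stated assumptions, not a definitive implementation: the function name is hypothetical, $t_\alpha = 2.0$ is the first approximation for $\alpha = 0.05$, and the inputs (p = 0.40, d = 0.02) are those of the deer sex-ratio example in this section.

```python
def n_for_proportion(p, d, t=2.0):
    """Sample size n = t^2 * p * q / d^2 for estimating a proportion.

    p is the anticipated proportion, d the desired margin of error,
    and t=2.0 the assumed approximation for alpha = 0.05.
    """
    q = 1.0 - p
    return t ** 2 * p * q / d ** 2


# Deer sex-ratio example: p about 0.40, error limit 0.02.
n_deer = n_for_proportion(0.40, 0.02)        # about 2400

# Choosing p closer to 0.5 gives a larger, more conservative sample size.
n_conservative = n_for_proportion(0.50, 0.02)

print(round(n_deer), round(n_conservative))
```

The product $pq$ peaks at p = 0.5, which is why moving a guessed p toward 0.5 can only increase the computed sample size.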
We need to have an approximate value of p to use in this equation. Prior information, or a guess, should be used; the only rule-of-thumb is that when in doubt, pick a value of p closer to 0.5 than you guess. This will make your answer conservative.

As an example, suppose you wish to estimate the sex ratio of a deer population. You expect p to be about 0.40, and you would like to estimate p within an error limit of $\pm 0.02$ with $\alpha = 0.05$, so $t_\alpha = 2.0$. From the equation above,

$$n \approx \frac{(2.0)^2\,(0.40)(1 - 0.40)}{(0.02)^2} = 2400 \text{ deer}$$

(2) Counts from a Poisson Distribution
Sample size estimation is very simple for any variable that can be described
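by the Poisson distribution, in which the variance equals the mean. From this it follows that

$$CV = \frac{s}{\bar{x}} = \frac{\sqrt{\bar{x}}}{\bar{x}} \qquad \text{or} \qquad CV = \frac{1}{\sqrt{\bar{x}}}$$

Thus from equation (1), assuming $\alpha = 0.05$:

$$n \approx \left(\frac{200\, CV}{r}\right)^2 = \left(\frac{200}{r}\right)^2 \frac{1}{\bar{x}} \qquad (2)$$

where
n = Sample size required for a Poisson variable
r = Desired relative error (as percentage)
CV = Coefficient of variation = $1/\sqrt{\bar{x}}$

For example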
, if you are counting eggs in starling nests and know that these counts fit a Poisson distribution and that the mean is about 6.0, then if you wish to estimate this mean with precision of $\pm 5\%$ (width of confidence interval), you have:

$$n \approx \left(\frac{200}{5}\right)^2 \frac{1}{6.0} = 266.7 \text{ nests}$$

Equation (2) can be simplified for the normal range of relative errors as follows:
For $\pm 10\%$ precision: $n \approx \frac{400}{\bar{x}}$

3. STATISTICAL POWER ANALYSIS

                                      Decision
State of real world                   Do not reject null hypothesis              Reject the null hypothesis
Null hypothesis is actually true      Correct decision (probability = 1 - α)     Type I error (probability = α)
Null hypothesis is actually false     Type II error (probability = β)            Correct decision (probability = (1 - β) = power)

Most ecologists worry about $\alpha$, the probability of a Type I error, but there is abundant evidence now that we should worry just as much or more about $\beta$, the probability of a Type II error (Peterman 1990; Fairweather 1991). Power analysis
can be carried out before you begin your study (a priori, or prospective power analysis) or after you have finished (retrospective power analysis). Here we discuss a priori power analysis as it is used for the planning of experiments. Thomas (1997) discussed retrospective power analysis. The key point
you should remember is that there are four variables affecting any statistical inference:
Sample size
Probability of a Type I error = $\alpha$
Probability of a Type II error = $\beta$
Magnitude of the effect = effect size
These four variables are interconnected, and once any three of them are fixed, the fourth is automatically
determined. Looked at from another perspective, given any three of these, you can determine the fourth.

Figure 7.3 An example of how power calculations can be visualized. In this simple example, a t-test is to be carried out to determine if the plant nitrogen level has changed from the base level of 3.0% (the null hypothesis) to the improved level of 3.3% (the alternative hypothesis). Given n=100, s

SUMMARY
The most common question in ecological research is, how large a sample should I take? This chapter attempts to give a general answer to this question by providing a series of equations from which sample size may be calculated. You must always know something about the population you wish to analyze, whether from prior observations or from informed guesswork. You must also make some explicit decision about how much error you will allow in your estimates (or how small a confidence interval you wish
to have). For continuous variables like weight or length, we can assume a normal distribution and calculate the required sample sizes for means and for variances quite precisely. For counts, we need to know the underlying statistical distribution (binomial, Poisson, or negative binomial) before we can
specify sample sizes needed. Power analysis explores the relationships between the four interconnected variables $\alpha$ (probability of Type I error), $\beta$ (probability of Type II error), effect size, and sample size. Fixing three of these automatically fixes the fourth, and ecologists should explore these relationships before they begin their experiments. Significant effect sizes should be specified on ecological grounds before a study is begun.

Sampling Designs: Random, Adaptive and Systematic Sampling
(1) Simple Random Sampling
(2) Stratified Random Sampling
(3) Adaptive Sampling
(4) Systematic Sampling

Simple random sampling is the easiest and most common sampling design. Each possible sample unit must have an equal chance of being selected to obtain a random sample. All the formulas of statistics are based on random sampling, and probability theory is the foundation of statistics. Thus you should always sample randomly when you have a choice.

In some cases the statistical population is finite in size, and the idea of a finite population correction must be added into formulas for variances and standard errors. These formulas are reviewed for measurements, ratios, and proportions.

Often a statistical population can be subdivided into homogeneous subpopulations, and random sampling can be applied to each subpopulation separately. This is stratified random sampling, and represents the single most powerful sampling design that ecologists can adopt in the field with relative ease. Stratified sampling is
almost always more precise than simple random sampling, and every ecologist should use it whenever possible. Sample size allocation in stratified sampling can be determined using proportional or optimal allocation. To use optimal allocation, you need rough estimates of the variances in each of the
strata and the cost of sampling each stratum. Optimal allocation is more precise than proportional allocation, and is to be preferred. Some simple rules are presented to allow you to estimate the optimal number of strata you should define in setting up a program of stratified random sampling.

If organisms are rare and patchily distributed, you should consider using adaptive cluster sampling to estimate abundance. When a randomly placed quadrat contains a rare species, adaptive sampling adds quadrats in the vicinity of the original quadrat to sample the potential cluster. This additional nonrandom sampling requires special formulas to estimate abundance without bias.

Systematic sampling is easier to apply in the field than random sampling, but may produce biased estimates of means and confidence limits if there are periodicities in the data. In field ecology this is usually not the case, and systematic samples seem to be the equivalent of random samples in many field situations. If a gradient exists in the ecological community, systematic sampling will be better than random sampling for describing it.

What is the likelihood that problems like periodic variation will occur in actual field data? Milne (1959) attempted to answer this question by looking at systematic samples taken on biological populations that had been completely enumerated. He analyzed data from 50 populations and found that, in practice, there was no error introduced by assuming that a centric systematic sample is a simple random sample, and using all the appropriate formulas from random sampling theory.

Step 1. Calculate the average abundance of each of the networks:

$$w_i = \frac{\sum_{j=1}^{k} y_j}{m_i} \qquad (8.35)$$

where
$w_i$ = Average abundance of the i-th network
$y_j$ = Abundance of the organism in each of the k quadrats in the i-th network
$m_i$ = Number of quadrats in the i-th network

Step 2. From these values we obtain an estimator of the mean abundance as follows:

$$\bar{x} = \frac{\sum_{i=1}^{n} w_i}{n} \qquad (8.36)$$

where
$\bar{x}$ = Unbiased estimate of mean abundance from adaptive cluster sampling
n = Number of initial sampling units selected via random sampling

If the initial sample is selected with replacement, the variance of this mean is given by: (8.37) where the left-hand side is the estimated variance of mean abundance for sampling with replacement and all other terms are as defined above.

If the initial sample is selected without replacement, the variance of the mean is given by: (8.38) where N = Total number o