Nature guideline
Sun Yat-sen University, Evidence-Based Medicine

Challenge to Nature Medicine

An editorial in Nature Medicine (2005): "Some of the articles published in Nature and Nature Medicine were criticized due to the deficiency in statistical issues".

What happened?
Emili García-Berthou and Carles Alcaraz (Girona Univ., Spain) published an article in BMC Medical Research Methodology (May 2004). They reviewed 181 research papers in Nature (2001) and found that 38% of them had at least one mistake in statistics.

Since then, a series of critical articles have been published, of which one, written by Robert Matthews (The Financial Times), analyzed the statistical methodology of the articles in Nature Medicine (2000). It found that 31% of the authors had misunderstood the meaning of the P-value, and some had even reported the P-value with unnecessary precision (0.002387).

Independent statistical "audit"

Nature Medicine invited two experts from Columbia University to carry out a "statistical audit", specifically to evaluate 21 articles published in 2003 against a list of consolidated criteria on statistics.

They found that some papers had almost no quantitative analysis and some contained very complicated statistical and mathematical issues, while most used only a little statistical testing, but with descriptions so incomplete that one could hardly assess whether the tests were appropriate or not.

Checklist of statistical adequacy

1. Reported n at start of study and for each analysis
2. Provided sample size calculation or justification

Examples

We believed that ... the incidence of symptomatic deep venous thrombosis or pulmonary embolism or death would be 4% in the placebo group and 1.5% in the ardeparin sodium group. Based on 0.9 power to detect a significant difference (P = 0.05, two-sided), 976 patients were required for each study group. To compensate for non-evaluable patients, we planned to enroll 1000 patients per group.

To have an 85% chance of detecting as significant (at the two-sided 5% level) a five-point difference between the two groups in the mean SF-36 general health perception scores, with an assumed standard deviation of 20 and a loss to follow-up of 20%, 360 women in each group (720 in total) were required.

3. Identified all statistical methods unambiguously
4. If statistical methods were described adequately, were any of them clearly inappropriate?

Example

All data analysis was carried out according to a pre-established analysis plan. Proportions were compared by chi-squared tests with continuity correction or Fisher's exact test when appropriate. Mean serum retinol concentrations were compared by t test. Two-sided significance tests were used throughout.

Multivariate analyses were conducted with logistic regression. The durations of episodes and signs of disease were compared by using proportional hazards regression.

Methods for additional analyses, such as subgroup analyses and adjusted analyses:
Example

Proportions of patients responding were compared between treatment groups with the Mantel-Haenszel chi-squared test, adjusted for the stratification variable, methotrexate use.

... it was planned to assess the relative benefit of CHART in an exploratory manner in subgroups: age, sex, performance status, stage, site, and histology. To test for differences in the effect of CHART, a chi-squared test for interaction was performed, or when appropriate a chi-squared test for trend (131).

5. Provided alpha for all statistical tests
6. Specified whether tests were one-sided or two-sided
7. Stated whether the data met the assumptions of the test
8. Reported actual P values for primary analyses

Example

The data of the two samples were adequately normally distributed (Shapiro-Wilk test: P1 = 0.466; P2 = 0.482) and the two population variances were equal at the 0.10 significance level (F = 1.345; P = 0.261), so the two independent samples t test was used (t = 4.137; df = 18; P = 0.001). The results indicated a statistically significant difference between the effects of the two drugs at the two-tailed 0.05 significance level, and the average increase in Hb concentration was higher in patients taking the new drug, which could also be observed from the 95% confidence interval of the difference of the two population means (3.829, 11.731).
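The arithmetic behind two of the worked examples above can be checked with a short script. This is a minimal sketch, not the original authors' calculation. It assumes the usual normal-approximation formula for comparing two means (for the SF-36 sample-size example under item 2) and a tabulated two-sided 5% critical value of Student's t with 18 degrees of freedom, 2.101 (for the t test example). The function names are hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group_for_means(delta, sd, alpha=0.05, power=0.85, loss=0.0):
    """Per-group n for comparing two means (normal approximation),
    inflated for an expected proportion lost to follow-up."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided test
    z_power = NormalDist().inv_cdf(power)
    n_evaluable = math.ceil(2 * (z_alpha + z_power) ** 2 * sd ** 2 / delta ** 2)
    # round() guards against floating-point noise before taking the ceiling
    return math.ceil(round(n_evaluable / (1 - loss), 9))


def estimate_and_se_from_ci(low, high, t_crit):
    """Recover the point estimate and its standard error from a t-based CI."""
    estimate = (low + high) / 2
    std_error = (high - low) / 2 / t_crit
    return estimate, std_error


# SF-36 example: 5-point difference, SD 20, 85% power, 20% loss to follow-up.
n_sf36 = n_per_group_for_means(delta=5, sd=20, alpha=0.05, power=0.85, loss=0.20)
print("SF-36 example, women per group:", n_sf36)

# t test example: 95% CI (3.829, 11.731) with df = 18.
T_CRIT_DF18 = 2.101  # assumption: value read from a standard t table
estimate, std_error = estimate_and_se_from_ci(3.829, 11.731, T_CRIT_DF18)
implied_t = estimate / std_error
print(f"mean difference = {estimate:.3f}, SE = {std_error:.3f}, implied t = {implied_t:.3f}")
```

Under these assumptions the first call gives the 360 women per group quoted in the SF-36 example, and the t statistic implied by the confidence interval agrees with the reported t = 4.137. Checking that a reported sample size, t value and confidence interval are mutually consistent is the kind of check the checklist makes possible when items 2 and 5 to 8 are reported in full.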
9. Were the statistical measures (mean, standard error, standard deviation, etc.) reported, and were they clearly labeled?

Example

The results show that the mean ± SD of IL-2 for the experimental group (n = 31) was 16.00 IU/ml ± 7.50 IU/ml and for the control group (n = 30) was 20.00 IU/ml ± 8.00 IU/ml; the difference between the two group means was 4.00 IU/ml, and the 95% CI of the difference was (0.0304, 7.9696) IU/ml.

10. Was the unit of analysis clearly stated in all comparisons?
11. Are mean and standard deviation used to describe data sets that may be non-normally distributed or when the sample size is very small?

Table 4-5. Blood gas analysis results of the experimental and control groups before treatment

What are the problems?

12. Explanation of unusual or complex statistical methods

Example

In order to compare the effects of common feed, feed with plasma protein and feed with bioprotein on the weight gain of weaning young pigs, 30 weaning young pigs were matched into 10 blocks by gender, age in days and baseline weight. Then the 3 individuals in each block were randomly assigned to 1 of 3 treatment groups. After 10 days, the changes in weight from baseline were measured.
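The allocation step in this design can be sketched as follows. This is only an illustration of randomizing within blocks: the pig identifiers, the seed and the function name are hypothetical, and the source does not say how the randomization was actually carried out.

```python
import random

TREATMENTS = ["common feed", "plasma protein feed", "bioprotein feed"]


def assign_within_blocks(blocks, treatments, seed=None):
    """Randomly assign each treatment to exactly one animal per block."""
    rng = random.Random(seed)
    allocation = {}
    for block_id, animals in blocks.items():
        if len(animals) != len(treatments):
            raise ValueError("each block needs one animal per treatment")
        shuffled = list(treatments)
        rng.shuffle(shuffled)  # independent random order in every block
        for animal, treatment in zip(animals, shuffled):
            allocation[animal] = (block_id, treatment)
    return allocation


# Hypothetical IDs: 10 blocks of 3 pigs matched on gender, age and weight.
blocks = {b: [f"pig-{b}-{i}" for i in (1, 2, 3)] for b in range(1, 11)}
allocation = assign_within_blocks(blocks, TREATMENTS, seed=2024)

for animal, (block_id, treatment) in list(allocation.items())[:3]:
    print(block_id, animal, treatment)
```

Because every block contains each treatment exactly once, differences between blocks (gender, age, baseline weight) can be separated from the treatment effect in the analysis.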
(Randomized block design)

The mean change in weight ± SD was 3.33 kg ± 0.48 kg for the common feed group, 3.83 kg ± 0.61 kg for the plasma protein group, and 4.10 kg ± 0.68 kg for the bioprotein group. Results of two-way ANOVA at the 0.05 significance level indicated statistically significant differences among the 3 treatment groups (F = 6.8112, P = 0.0063). Similar results were found among the 10 blocks (F = 2.7407, P = 0.0328).

(Results of ANOVA)

13. Explanation of data exclusions, if any

Example

The primary analysis was intention-to-treat and involved all patients who were randomly assigned.

One patient in the alendronate group was lost to follow-up; thus data from 31 patients were available for the intention-to-treat analysis. Five patients were considered protocol violators. Consequently, 26 patients remained for the per-protocol analyses.

Protocol deviations

Authors should report all departures from the protocol, including unplanned changes to interventions, examinations, data collection, and methods of analysis.

The nature of the protocol deviation and the exact reason for excluding participants after randomization should always be reported.

14. Explained reasons for any discrepancy between initial n and n for each analysis

Example

Initially, the 60 rats were randomly divided into 3 groups, 15 for each, to receive 3 levels of doses respectively. However, at the end of the first week, 2 rats in the low-dose group escaped; on the 40th day, 1 rat in the high-dose group and 1 in the control group escaped.

15. Explained method of treatment assignment (randomization, if any)

Example

Determination of whether a patient would be treated by streptomycin and bed-rest (S case) or by bed-rest alone (C case) was made by reference to a statistical series based on random sampling numbers drawn up for each sex at each centre by Prof. Bradford Hill; the details of the series were unknown to any of the investigators or to the coordinator and were contained in a set of sealed envelopes, each bearing on the outside only the name of the hospital and a number. After acceptance of a patient by the panel, the envelope was opened at the central office; the card inside told the medical officer of the centre if the patient was to be an S or a C case.

16. Explained any data transformation

Example

18 patients with acute encephalitis B in a clinic were randomly allocated into 3 groups. Each group received a different treatment (A, B or C), and the number of days with fever was measured as the treatment effect. Make an inference, from the differences in the mean number of fever days among the three groups, as to whether the treatments had different effects.
Consider the two assumptions of one-way ANOVA. The fever days are positively skewed relative to the normal distribution, and the ratio of the variances is close to 10, so the assumption of homogeneity of variances is also rejected. Therefore, a square-root transformation of the scale for the fever days is applied.
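The effect of such a transformation on the variance ratio can be illustrated with a small sketch. The fever-day counts below are hypothetical (the raw data of the example are not given), chosen only so that the largest sample variance is roughly ten times the smallest; the helper name is also an assumption.

```python
import math
import statistics

# Hypothetical fever-day counts, 6 patients per treatment (not the source data).
fever_days = {
    "A": [1, 2, 2, 3, 3, 4],
    "B": [2, 3, 4, 5, 6, 9],
    "C": [3, 5, 6, 8, 9, 12],
}


def variance_ratio(groups):
    """Largest sample variance divided by the smallest."""
    variances = [statistics.variance(values) for values in groups.values()]
    return max(variances) / min(variances)


# Apply the square-root transformation to every observation.
sqrt_days = {
    name: [math.sqrt(x) for x in values] for name, values in fever_days.items()
}

ratio_raw = variance_ratio(fever_days)
ratio_sqrt = variance_ratio(sqrt_days)
print(f"variance ratio on the raw scale:         {ratio_raw:.2f}")
print(f"variance ratio on the square-root scale: {ratio_sqrt:.2f}")
```

For right-skewed data whose spread grows with the mean, the ratio shrinks noticeably on the square-root scale, which is why the ANOVA is then run on the transformed values. Whichever transformation is used, the report should state it, as item 16 requires.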
The transformed values were used in the one-way ANOVA. It showed no significant difference in the average fever days (on the square-root scale) among the three kinds of treatment.

17. Discussed adjustments for multiple testing

Example

Multiple comparison with Bonferroni adjustment (alpha level of 0.0167) revealed that the effects of the two treatments with protein were significantly higher than that of common feed, while the difference between the two treatments with protein was not statistically significant.

(Multiple comparison)

For graphs

18. Were effect sizes distorted? (by truncation of y axis, etc.)

What is the problem?

19. Were error bars unlabeled?
20. Were error bars absent?

What is the height for? What are the bars for? What are the stars for?

Summary

Three errors are particularly common:

Multiple comparisons: When making multiple statistical comparisons on a single data set, authors should explain how they adjusted the alpha level to avoid an inflated Type I error rate, or they should select statistical tests appropriate for multiple groups (such as ANOVA rather than a series of t-tests).

Normal distribution: Many statistical tests require that the data be approximately normally distributed; when using these tests, authors should explain how they tested their data for normality. If the data do not meet the assumptions of the test, then a non-parametric alternative should be used instead.

Small sample size: When the sample size is small (less than about 10), authors should use tests appropriate to small samples or justify their use of large-sample tests.

Thanks