Methods for Summarizing the Evidence: Meta-Analyses and Pooled Analyses
Donna Spiegelman, Sc.D.
Departments of Epidemiology and Biostatistics
Harvard School of Public Health

Methods for Summarizing the Evidence
- Narrative review
- Meta-analysis of published data
- Pooled analysis of primary data (meta-analysis of individual data)
  - Retrospectively planned
  - Prospectively planned

Narrative Review

Fruits, Vegetables and Breast Cancer: Summary of Case-Control Studies
- Total number of studies: 12
- Studies showing a statistically significant protective association: 8 (67%)
- Studies showing no statistically significant protective association: 4 (33%)
WCRF/AICR 1997

Specific Fruits and Vegetables and Breast Cancer: Summary of Case-Control Studies

                                 % of total
Food group      # of studies    Decreased risk   Null   Increased risk
Fruits          4               75               0      25
Citrus fruits   3               33               0      66
Vegetables      <3              --               --     --
Green veg.      6               83               17     0
Carrots         4               75               25     0
WCRF/AICR 1997

Fruits, Vegetables and Breast Cancer: Conclusion of Narrative Review
A large amount of evidence has accumulated regarding vegetable and fruit consumption and the risk of breast cancer. Almost all of the data from epidemiological studies show either decreased risk with higher intakes or no relationship; the evidence is more abundant and consistent for vegetables, particularly green vegetables, than for fruits. Diets high in vegetables and fruits probably decrease the risk of breast cancer.
WCRF/AICR 1997

Narrative Review
Strengths
- Short timeframe
- Inexpensive
Limitations
- Provide qualitative summary only
  - frequently tabulate results
- Subjective
- Selective inclusion of studies
- May be influenced by publication bias

Publication Bias: Funnel Plots
[Figure: funnel plots. Sterne 2001]

Where Can Publication Bias Occur?
- Project dropped when preliminary analyses suggest results are negative
- Authors do not submit negative study
- Results reported in small, non-indexed journal
- Editor rejects manuscript
- Reviewers reject manuscript
- Author does not resubmit rejected manuscript
- Journal delays publication of negative study
- Results not reported by news, policy makers, or narrative reviews
Montori 2000

Meta-Analyses

Rationale for Combining Studies
Many of the groups ... are far too small to allow of any definite opinion being formed at all, having regard to the size of the probable error involved.
Pearson, 1904

Definition
Meta-analysis refers to the analysis of analyses ... the statistical analysis of a large collection of analysis results from individual studies for the purpose of integrating findings. It connotes a rigorous alternative to the casual, narrative discussions of research studies which typify our attempts to make sense of the rapidly expanding literature.
Glass, 1976

Meta-Analyses: # of Publications by Year
[Figure: number of publications (y-axis, 0 to 5000) by year of publication (x-axis, 1986 to 2001)]
Why conduct a meta-analysis?
- Summarize published literature
- A more objective summary of literature than narrative review
- Estimate average effect from all available studies
- Increase statistical power: larger sample size, more precise estimate of effect
- Identify between-study heterogeneity
- Identify research needs

Outline for Conducting Meta-Analyses
- Objective, hypothesis
- Define outcome, exposure, population
- Study inclusion criteria
- Search strategy
- Data extraction
- Quality assessment
- Statistical analysis

Meta-Analyses: Study Sources
- Published literature
  - citation indexes
  - abstract databases
  - reference lists
  - contact with authors
- Unpublished literature
- Uncompleted research reports
- Work in progress

Meta-Analyses: Data Extraction
- Publication year
- Performing year
- Study design
- Characteristics of study population (n, age, sex)
- Geographical setting
- Assessment procedures
- Risk estimate and variance
- Covariates

Meta-Analyses: Quality Assessment
- Study components
  - Study design
  - Outcome measurement
  - Exposure measurement
  - Response rate/follow-up rate
  - Analytic strategy
  - Adjustment for confounding
- Quality of reporting

Meta-Analyses: Statistical Analyses
- Define analytic strategy
- Investigate between-study heterogeneity
- Decide whether results can be combined
- Estimate summary effect, if appropriate
- Conduct sensitivity analysis
  - Stratification
  - Meta-regression analyses

Fixed Effects
- Assumes that all studies are estimating the same true effect
- Variability only from sampling of people within each study
- Precision depends mainly on study size

Fixed Effects Model
\[ \hat{\beta}_s = \beta + \epsilon_s, \quad s = 1, \ldots, S \text{ studies} \]
- \(\beta\) = true common log relative risk
- \(\epsilon_s\) = random within-study (sampling) variability
- \(\hat{\beta}_s\) = estimated log relative risk in study s
\[ E(\epsilon_s) = 0, \quad \mathrm{Var}(\hat{\beta}_s) = \mathrm{Var}(\epsilon_s), \quad s = 1, \ldots, S \]
\[ \hat{\beta}_{Pooled} = \frac{\sum_{s=1}^{S} w_s \hat{\beta}_s}{\sum_{s=1}^{S} w_s}, \quad w_s = [\mathrm{Var}(\hat{\beta}_s)]^{-1} \quad \text{(minimum variance weights)} \]
\[ \mathrm{Var}(\hat{\beta}_{Pooled}) = \left[ \sum_{s=1}^{S} [\mathrm{Var}(\hat{\beta}_s)]^{-1} \right]^{-1} = \left[ \sum_{s=1}^{S} w_s \right]^{-1} \]

Random Effects
- Studies allowed to have different underlying or true effects
- Allows variation between studies as well as within studies

Random Effects Model
\[ \hat{\beta}_s = \beta + b_s + \epsilon_s, \quad s = 1, \ldots, S \text{ studies} \]
- \(\beta\) = true common log relative risk
- \(b_s\) = random between-studies variability
- \(\epsilon_s\) = random within-study (sampling) variability
- \(\hat{\beta}_s\) = estimated log relative risk in study s
\[ E(\epsilon_s) = E(b_s) = 0, \quad s = 1, \ldots, S \]
\[ \mathrm{Var}(\epsilon_s) = \mathrm{Var}(\hat{\beta}_s) \text{ (the within-study variance reported by study } s\text{)}, \quad s = 1, \ldots, S \]
\[ \mathrm{Var}(b_s) = \sigma_B^2, \quad s = 1, \ldots, S \]
\[ \hat{\beta}_{Pooled} = \frac{\sum_{s=1}^{S} w_s \hat{\beta}_s}{\sum_{s=1}^{S} w_s}, \quad w_s = [\mathrm{Var}(\hat{\beta}_s) + \sigma_B^2]^{-1} \quad \text{(minimum variance weights)} \]
\[ \mathrm{Var}(\hat{\beta}_{Pooled}) = \left[ \sum_{s=1}^{S} [\mathrm{Var}(\hat{\beta}_s) + \sigma_B^2]^{-1} \right]^{-1} = \left[ \sum_{s=1}^{S} w_s \right]^{-1} \]

Random Effects Model, continued
Under the random effects model \(\hat{\beta}_s = \beta + b_s + \epsilon_s\), \(s = 1, \ldots, S\) studies, with \(\mathrm{Var}(b_s) = \sigma_B^2\):
\[ \hat{\sigma}_B^2 = \max\left\{ 0, \; \frac{Q - (S - 1)}{\sum_{s=1}^{S} w_s - \sum_{s=1}^{S} w_s^2 \big/ \sum_{s=1}^{S} w_s} \right\}, \quad w_s = [\mathrm{Var}(\hat{\beta}_s)]^{-1} \]

Test for Heterogeneity
Under the random effects model,
\[ Q = \sum_{s=1}^{S} w_s (\hat{\beta}_s - \hat{\beta})^2, \quad w_s = [\mathrm{Var}(\hat{\beta}_s)]^{-1} \]
- \(\hat{\beta}\) = pooled log relative risk, fixed effects model
- \(\hat{\beta}_s\) = study-specific estimated log relative risk
\[ Q \sim \chi^2_{S-1} \text{ under } H_0: \sigma_B^2 = 0 \]

Fixed Effects vs Random Effects Model
- Random effects generally yield larger variances and CI
  - Why? Incorporate \(\sigma_B^2\)
- If heterogeneity between studies is large, \(\sigma_B^2\) will dominate the weights and all studies will be weighted more equally
- Model weight for large studies is less in the random effects than in the fixed effects model

Sources of Between Study Heterogeneity
- Different study designs
- Different incidence rates among unexposed
- Different length of follow-up
- Different distributions of effect modifiers
- Different statistical methods/models used
- Different sources of bias
- Study quality
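The fixed effects, heterogeneity, and random effects formulas above can be sketched in a few lines of Python. This is a minimal illustration, not the %meta macro used later in these notes: the function names are hypothetical, the data are the relative risks and upper 95% limits listed in the lactose and ovarian cancer example, and the variance is backed out of the upper confidence limit in the same way as in that example's SAS data step.

```python
import math

# (study, relative risk, upper 95% confidence limit) from the lactose example
LACTOSE = [
    ("AHS", 0.496, 1.23), ("BCDDP", 0.873, 1.46), ("CNBSS", 0.659, 2.76),
    ("CPS2", 1.033, 1.48), ("IWHS", 1.225, 1.95), ("NLCS", 1.568, 3.32),
    ("NYS", 0.767, 3.53), ("NYU", 0.787, 2.69), ("NHSa", 1.619, 2.90),
    ("NHSb", 1.240, 1.90), ("NHSII", 1.573, 3.51), ("SMC", 1.346, 2.07),
    ("WHS", 1.699, 3.23),
]

def log_scale(rr, upper, z=1.96):
    """Log relative risk and its variance, derived from the upper 95% limit."""
    beta = math.log(rr)
    var = ((math.log(upper) - beta) / z) ** 2
    return beta, var

def pool(betas, variances, tau2=0.0):
    """Inverse-variance pooled estimate and its standard error.
    tau2 = 0 gives the fixed effects estimate; tau2 > 0 adds the
    between-studies variance to every study's variance."""
    weights = [1.0 / (v + tau2) for v in variances]
    total = sum(weights)
    estimate = sum(w * b for w, b in zip(weights, betas)) / total
    return estimate, math.sqrt(1.0 / total)

def q_and_tau2(betas, variances):
    """Heterogeneity statistic Q and the moment estimate of sigma_B^2."""
    w = [1.0 / v for v in variances]
    fixed, _ = pool(betas, variances)
    q = sum(wi * (b - fixed) ** 2 for wi, b in zip(w, betas))
    denom = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(betas) - 1)) / denom)
    return q, tau2

betas, variances = zip(*(log_scale(rr, ub) for _, rr, ub in LACTOSE))
fixed_est, fixed_se = pool(betas, variances)
q, tau2 = q_and_tau2(betas, variances)
random_est, random_se = pool(betas, variances, tau2)
z = fixed_est / fixed_se
p_two_sided = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value

print(f"fixed:  beta={fixed_est:.6f} se={fixed_se:.6f} RR={math.exp(fixed_est):.3f}")
print(f"Q={q:.4f} on {len(betas) - 1} df, tau2={tau2:.6f}")
print(f"random: beta={random_est:.6f} se={random_se:.6f}")
```

With these inputs the sketch should reproduce the pooled log relative risk of about 0.1646 (SE about 0.0822) and Q of about 10.55 reported in the SAS output for the same data. Because Q is below its degrees of freedom, the estimated between-studies variance is truncated at 0 and the random effects result coincides with the fixed effects one.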
Meta-Analyses: Sensitivity Analyses
- Exclude studies with particularly heterogeneous results
- Conduct separate analyses based on
  - Study design
  - Geographic location
  - Time period
  - Study quality

Meta-regression (Stram, Biometrics, 1996)
- Purpose: to identify heterogeneity of effects by covariates that are constant within study (e.g. gender, smoking status)
- Model:
\[ \hat{\beta}_s = \beta_0 + \beta_1 \mathrm{GENDER}_s + \beta_2 \mathrm{CURRENT}_s + \beta_3 \mathrm{PAST}_s + b_s + \epsilon_s \]
  - GENDER_s = 1 if study s is male; 0 if female
  - CURRENT_s = 1 if study s has current smokers only, 0 otherwise
  - PAST_s = 1 if study s has past smokers only, 0 otherwise
- \(H_0: \beta_1 = 0\): no effect modification by gender
- Standard methods for mixed effects models can be used to test \(H_0\)
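A meta-regression with a single study-level covariate can be sketched as a weighted least squares fit. The example below is a simplified illustration under stated assumptions: the four study results are hypothetical, the between-studies variance `tau2` is treated as known (0 by default) rather than estimated jointly as a mixed effects model would do, and `wls_meta_regression` is an illustrative name, not a library function.

```python
import math

def wls_meta_regression(betas, variances, x, tau2=0.0):
    """Weighted least squares fit of beta_s = b0 + b1 * x_s.
    Weights are 1 / (within-study variance + tau2).
    Returns (b0, b1, standard error of b1)."""
    w = [1.0 / (v + tau2) for v in variances]
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, betas))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, betas))
    det = sw * swxx - swx ** 2          # determinant of X'WX
    b1 = (sw * swxy - swx * swy) / det  # slope: effect modification
    b0 = (swy - b1 * swx) / sw          # intercept: effect when x = 0
    se_b1 = math.sqrt(sw / det)         # model-based SE of the slope
    return b0, b1, se_b1

# Hypothetical study-specific log relative risks, variances, and GENDER_s
betas = [0.10, 0.30, 0.20, 0.50]
variances = [0.04, 0.02, 0.05, 0.01]
gender = [0, 0, 1, 1]  # 1 = male study, 0 = female study

b0, b1, se_b1 = wls_meta_regression(betas, variances, gender)
z = b1 / se_b1  # Wald statistic for H0: beta_1 = 0
print(f"b0={b0:.4f} b1={b1:.4f} se(b1)={se_b1:.4f} z={z:.2f}")
```

With a binary covariate the intercept equals the inverse-variance pooled estimate among the studies coded 0, and the slope equals the difference between the two subgroup pooled estimates, which ties meta-regression back to the stratified analyses listed above.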
Analysis of between-studies heterogeneity
- The p-value for the test for heterogeneity is a function of the power of the pooled analysis to detect between-studies differences. This power is believed to be low.
- A simulation study was conducted and published (Takkouche, Cadarso-Suárez, Spiegelman, AJE, 1999) which investigated the power of several old and some newly developed test statistics to detect heterogeneity of different plausible magnitudes, as quantified by CV and R² (to be defined), where \( CV = (\sigma_B^2)^{1/2} / \beta \), with S ranging from 7 to 33

Analysis of between-studies heterogeneity
- Explored maximum likelihood methods for estimation of \(\sigma_B^2\) and testing \(H_0: \sigma_B^2 = 0\) (i.e. no between-studies heterogeneity)
- ML methods have power roughly equivalent to D&L's, but assume \( b_s \sim N(0, \sigma_B^2) \) and \( \epsilon_s \sim N[0, \mathrm{Var}(\hat{\beta}_s)] \)
- The LRT for \(H_0: \sigma_B^2 = 0\) has no known asymptotic distribution because the hypothesis is on the boundary of the parameter space
- A simulation-based bootstrap approach for constructing the empirical distribution function of the test statistic was developed
Quantification of heterogeneity
\[ CV_B = \sqrt{\sigma_B^2} \,/\, \beta \]
- between-studies variance expressed relative to the magnitude of the overall association
- if the association is small, CV_B "blows up"
\[ R^2 = \hat{\sigma}_B^2 \Big/ \left[ \hat{\sigma}_B^2 + \sum_{s=1}^{S} \mathrm{Var}(\hat{\beta}_s) / S \right] \]
- proportion of the variance of the pooled estimate due to between-studies variation
- Useful when \(\beta\) is near 0 as well as when it is far from it
- Heterogeneity is evaluated relative to the within-studies contribution to the variance, and can appear large if the participating studies yield precise estimates
- Further experience with these measures will give us more insight as to their relative merits
- Confidence intervals for CV_B, R² (identical to I² (Higgins et al., 2006))

Power of Test of Heterogeneity (%)

R_I     S     Q        LRT     Q*      LRT*    [?]² bootstrap
0.25    7     14.38    7.85    12.18   14.68   15.32
0.25    20    25.90    17.90   24.18   25.97   24.92
0.25    40    38.12    30.64   35.10   37.81   37.94
0.5     7     37.54    27.14   27.60   39.10   38.50
0.5     20    70.56    64.11   58.54   69.21   68.62
0.5     40    90.88    88.61   79.76   90.21   91.14
0.75    7     77.66    70.21   58.84   78.31   75.02
0.75    20    98.66    97.51   93.64   99.12   98.34
0.75    40    100.00   99.93   99.64   99.94   99.94

* Parametric bootstrap version of the test. Odds ratio = 2. WLS, weighted least squares; LRT, likelihood ratio test. R², proportion of the total variance due to between-study variance: \( \sigma_B^2 / (\sigma_B^2 + (S \times \mathrm{Var}(\hat{\beta}))) \).

Fruit & Vegetable and Breast Cancer Meta-Analysis
- Objective: analyze published results that explore the relationship between breast cancer risk and the consumption of fruits and vegetables
- Search strategy
  - MEDLINE search of studies published January 1982 to April 1997
  - Review of reference lists
Gandini, 2000

Fruit & Vegetable and Breast Cancer Meta-Analysis: Inclusion Criteria
- Relative risks and confidence intervals reported or could be estimated
- Comparisons: tertiles, quartiles, quintiles
- Studies were independent
- Diet assessed by food frequency questionnaire
- Populations were homogeneous, not limited to specific subgroup
Gandini, 2000

Fruit & Vegetable and Breast Cancer Meta-Analysis: Data Extraction and Analysis
- Selected risk estimate for total fruits and total vegetables, when possible
  - Otherwise, selected nutrient dense food
- Extracted the most adjusted relative risk comparing the highest vs. lowest intake
  - Comparisons: tertiles, quartiles, quintiles
- Used random effects model to calculate summary estimate
- Sensitivity analyses
Gandini, 2000

Fruit and Vegetable and Breast Cancer Meta-Analysis: Results

                    RR (95% CI), high vs low   p for het
Total fruits        0.94 (0.79-1.11)           <0.001
Total vegetables    0.75 (0.66-0.85)           <0.001
Gandini, 2000

Fruits and Vegetables and Breast Cancer Meta-Analysis: Sensitivity Analysis

                          # of studies   RR     Significance of factors
Study design                                    0.30
  Case-control            14             0.71
  Cohort                  3              0.73
Validated FFQ                                   0.13
  Yes                     6              0.85
  No                      11             0.66
Gandini, 2000

Fruits and Vegetables and Breast Cancer: Conclusion of Meta-Analysis
The quantitative analysis of the published studies ... suggests a moderate protective effect for high consumption of vegetables. For fruit intake, study results were less clear. Only two studies show a significant protective effect of high fruit intake for breast cancer. This analysis confirms the association between intake of vegetables and, to a lesser extent, fruits and breast cancer risk from published sources.
Gandini, 2000

Limitations of Meta-Analyses
- Heterogeneity across studies
- Difficult to evaluate dose-response associations
- Difficult to examine population subgroups
- Errors in original work cannot be checked
- Errors in data extraction
- Limited by quality of the studies included
- May be influenced by publication bias
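A meta-analysis of published data works from the reported relative risk and confidence interval, so the data-extraction step usually has to recover the log relative risk and its standard error. The sketch below shows one common way to do this, assuming the interval is symmetric on the log scale; `log_rr_and_se` is an illustrative helper, and the input is the published summary result for total vegetables quoted above, used only as a convenient worked number.

```python
import math

def log_rr_and_se(rr, lower, upper, z=1.96):
    """Log relative risk and its standard error from a reported 95% CI,
    assuming the interval is symmetric on the log scale."""
    beta = math.log(rr)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se

# Total vegetables, highest vs lowest intake: RR 0.75 (0.66-0.85)
beta, se = log_rr_and_se(0.75, 0.66, 0.85)
z_stat = beta / se
print(f"log RR={beta:.4f}  SE={se:.4f}  z={z_stat:.2f}")
```

The squared standard error is the variance that enters the inverse-variance weights. When only one confidence limit is legible or reported, the same idea applies using the distance from the point estimate to that limit divided by 1.96.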
Pooled Analyses

Outline for Conducting Pooled Analyses
- Search strategy
- Study inclusion criteria
- Obtain primary data
- Prepare data for pooled analysis
- Estimate study-specific effects
- Examine whether results are heterogeneous
- Estimate pooled result
- Conduct sensitivity analyses
Friedenreich 1993

Pooling Project of Prospective Studies of Diet and Cancer
- Collaborative project to re-analyze the primary data in multiple cohort studies using standardized analytic criteria to generate summary estimates
- Retrospectively planned meta-analysis of individual patient data
- Established in 1991
http://www.hsph.harvard.edu/poolingproject/about.html

Pooling Project of Prospective Studies of Diet and Cancer: Inclusion Criteria
- Prospective study with a publication on diet and cancer
- Usual dietary intake assessed
- Validation study of diet assessment method
- Minimum number of cases specific for each cancer site examined

Cohort Studies in the Pooling Project of Prospective Studies of Diet and Cancer
- Canadian National Breast Screening Study
- New York University Women's Health Study
- CA Teachers Study
- Sweden Mammography Cohort
- Netherlands Cohort Study
- Health Professionals Follow-up Study, Nurses' Health Study, Women's Health Study, Nurses' Health Study II
- New York State Cohort
- ORDET
- Alpha-Tocopherol Beta-Carotene Cancer Prevention Study
- Adventist Health Study
- Iowa Women's Health Study
- Cancer Prevention Study II Nutrition Cohort
- Breast Cancer Detection Demonstration Project
Total = 948,983

Analytic Strategy
[Diagram. Steps: Data collection, Study-specific analyses, Pooling, Annual meeting. Exposures: Nutrients, Food groups, Foods, Non-dietary risk factors. Analyses: Main effect, Population subgroups, Effect modification]

Pooling Project: Data Management
- Receive primary data
  - Different media
  - Different formats
- Check data; follow up with investigators
- Apply exclusion criteria
- Calculate energy-adjusted nutrient intakes
- Create standardized name and format for each variable

Pooling Project: Data Checks
- Dates
  - Questionnaire return
  - Follow-up time
  - Diagnosis
  - Death
- Non-dietary variables
  - Frequency distributions, means
- Consistency checks
  - Parity and age at first birth
  - Smoking status and # cigarettes
- Nutrients and foods: means, range

Data Management: Inconsistent Units
- Example: Vitamin A, Carotene
- Units:
  - µg retinol equivalents
  - µg
  - International units
- How to determine units
  - Original data sheets
  - Published values
  - Correspondence

Data Management: Standardizing Food Data
- Created standardized name for each food on the FFQs (range: 46-276 items)
  - Multiple foods on same line
  - Other categories
- Standardized quantity information as grams/d
  - For some studies, calculated gram intakes by frequency * portion size * gram weight
- Defined food groups

Pooling Project: Foods

Food                               Std Name   Std Serv
Orange juice                       juior      6 oz
Orange or grapefruit juice         juici      6 oz
Oranges                            orang      one
Oranges, grapefruits               cifru      average
Oranges, orange juice              orang      one orange
Oranges, tangerines                orang      one
Oranges, tangerines, grapefruit    cifru      average

Pooling Project: Primary Analysis Programs
- Read data
  - Study name, exposure, covariates
- Data management
- Analysis: Cox proportional hazards model (SAS, Epicure)
  - Continuous
  - Categorical
  - Splines
  - Interactions

Pooling Project: Primary Analysis Programs, cont'd
- Output for each study
  - File with number of cases and covariates included in the model
  - File identifying whether any of the relative risks > 10
  - For quantile analyses, file with the mean, range and number of cases for each quantile
  - Data set with beta, variance, covariance, likelihood for each variable

Pooling Project: Pooling Programs
- Read study-specific output data sets
- Calculate summary relative risk by using random effects model
  - Weight studies by the inverse of their variance
- Test for between-studies heterogeneity
- Test for effect modification by sex
- Output table

Pooling Project: Analytic Strategy for Breast Cancer Analyses
- Analysis
  - Cohorts analyzed using Cox proportional hazards model
  - Canadian National Breast Screening Study is a nested case-control study with a 1:2 ratio of cases:controls
  - Netherlands Cohort Study is a case-cohort study
    - Subcohort: 1812 women

Pooling Project: Exposures
- Total fruits
  - Fruits, excluding fruit juice
  - Fruit juice
- Total vegetables
- Total fruits and vegetables
- Botanical groups
- Specific foods

Pooled Multivariate Relative Risks for Breast Cancer and Fruits and Vegetables

Quartile         Total Fruits RR (95% CI)   Total Vegetables RR (95% CI)
1                1.00 (ref)                 1.00 (ref)
2                0.94 (0.87-1.01)           0.99 (0.90-1.08)
3                0.92 (0.86-0.99)           0.97 (0.90-1.05)
4                0.93 (0.86-1.00)           0.96 (0.89-1.04)
p for trend      0.08                       0.54
p for hetero.    0.94                       0.73
Smith-Warner, et al. 2001

Contrast to Fruit and Vegetable and Breast Cancer Meta-Analysis: Results

                    RR (95% CI), high vs low   p for het
Total fruits        0.94 (0.79-1.11)           <0.001
Total vegetables    0.75 (0.66-0.85)           <0.001
Gandini, 2000

Specific Fruits and Vegetables and Breast Cancer: Summary of Case-Control Studies

                                  % of total
Food group       # of studies    Decreased risk   Null   Increased risk
All fruits       4               75               0      25
Citrus fruits    3               33               0      66
All vegetables   <3              --               --     --
Green veg.       6               83               17     0
Carrots          4               75               25     0
WCRF/AICR 1997

CONTRAST TO Fruits, Vegetables and Breast Cancer: Conclusion of Narrative Review
A large amount of evidence has accumulated regarding vegetable and fruit consumption and the risk of breast cancer. Almost all of the data from epidemiological studies show either decreased risk with higher intakes or no relationship; the evidence is more abundant and consistent for vegetables, particularly green vegetables, than for fruits. Diets high in vegetables and fruits probably decrease the risk of breast cancer.
WCRF/AICR 1997

Pooled Multivariate Relative Risks of Breast Cancer and Botanically Defined Fruit and Vegetable Groups and Individual Foods

               RR (95% CI)
Cruciferae     0.96 (0.87-1.06)
Leguminosae    0.97 (0.87-1.08)
Rutaceae       0.99 (0.97-1.01)
Carrots        0.95 (0.81-1.12)
Potatoes       1.03 (0.98-1.08)
Apples         0.96 (0.92-1.01)
Increment = 100 g/d
Smith-Warner 2001

Pooled Multivariate Relative Risks of Breast Cancer and F & V by Menopausal Status

                    Multivariate RR (95% CI)   p for
                    (increment = 100 g/d)      interaction
Total fruits                                   0.54
  Premeno.          0.98 (0.94-1.02)
  Postmeno.         0.99 (0.98-1.01)
Total vegetables                               0.80
  Premeno.          0.99 (0.93-1.06)
  Postmeno.         1.00 (0.97-1.02)
Smith-Warner, et al. 2001

Pooling Study-Specific Results vs. Combining Primary Data into One Dataset
- Difficult to distinguish population-specific differences in true intake from artifactual differences due to differences in dietary assessment methods
  - exception: when the unit of measurement is standard (e.g. alcohol, body mass index)
- Pooling allows for study-specific differences in the adjustment for confounders

Multivariate meta-analysis for data consortia, individual patient meta-analysis, and pooling projects
Ritz J, Demidenko E, Spiegelman D. Journal of Statistical Planning and Inference, 2008; 138: 1919-1933
- Maximum likelihood and estimating equations methods for combining results from multiple studies in pooling projects
- The univariate meta-analysis model is generalized to a multivariate method, and efficiency advantages are investigated
- The test for heterogeneity is generalized to a multivariate one
Multivariate Pooling
- Let \( \hat{\beta}_s = \beta + b_s + e_s \), \( s = 1, \ldots, S \) studies, \( \dim(\beta) = p \) model covariates
\[ E(b_s) = E(e_s) = 0, \quad \mathrm{Var}(b_s) = \Sigma_B, \quad \mathrm{Var}(e_s) = \mathrm{Cov}(\hat{\beta}_s) \quad (p \times p) \]
- If \(\Sigma_B\) and \(\mathrm{Cov}(\hat{\beta}_s)\) are not diagonal, more efficient estimates of \(\beta\) can be obtained
- Weighted estimating equation ideas are used
- Test statistics for \(H_0: \Sigma_B = 0\) and major submatrices are given, assuming normality in \(b_s\) and \(e_s\) and asymptotically

Results
From Smith-Warner, et al. 2001. Types of dietary fat and breast cancer: a pooled analysis of cohort studies

ARE (compared to corresponding univariate estimate)

                        MLE               EE
                        p=3     p=18*     p=3      p=18*
Saturated fat           86%     81%       93%      84%
Mono-unsaturated fat    76%     72%       36%      32%
Poly-unsaturated fat    91%     88%       1.35%    1.21%

* Using estimates adjusted for total energy, % protein intake, bmi, parity x age at first birth, height (4 levels), dietary fiber (4 quintiles)

Measurement error correction for pooled estimates
- Correct each study-specific estimate for measurement error using the measurement error model developed from that study's validation data
- Pool the measurement error corrected \(\hat{\beta}\)'s in the usual way; note that the pooled variance will reflect the additional uncertainty in the study-specific measurement error corrected estimates

Pooled Analyses vs. Meta-Analyses
Strengths
- Increased standardization
- Can examine rare exposures
- Can analyze population subgroups
- Reduce publication bias
Limitations
- Expensive
- Time-consuming
- Requires close cooperation with many investigators
- Errors in study design multiplied (prospective)

Another example: Dairy intake and ovarian cancer incidence (Genkinger et al., 2005)
- From the Pooling Project of Prospective Studies of Diet and Cancer
- 12 studies, 2,087 epithelial ovarian cancer cases, 560,035 women
- Located in North America and Europe
- Follow-up between 1976 and 2002, lasting 7 to 20 years
- Studies had to have 50 or more cases to be included (range 52 to 315)

Table 1. Daily Mean Intakes of Dairy Nutrients and Foods by Cohort Study in the Ovarian Cancer Analyses in the Pooling Project of Prospective Studies of Diet and Cancer

Mean (SD) intake [3]

Cohort [1]  Follow-up   Baseline     Number    Dietary     Total        Lactose   Dietary     Total          Total       Hard      Yogurt
            years       cohort       of cases  calcium     calcium [4]  (g/day)   vitamin D   vitamin D [4]  milk        cheese    (g/day)
                        size [2]               (mg/day)    (mg/day)               (IU/day)    (IU/day)       (g/day)     (g/day)
AHS         1976-1988   18,402       53        832 (124)   878 (139)    18 (14)   ---         ---            419 (349)   8 (8)     ---
BCDDP       1987-1999   32,885       142       862 (369)   1186 (2979)  19 (14)   206 (122)   341 (279)      260 (269)   13 (20)   ---
CNBSS       1980-2000   56,837       223       672 (252)   ---          8 (7)     ---         ---            199 (198)   22 (23)   29 (60)
CPS II      1992-2001   60,796       233       888 (381)   1140 (585)   18 (14)   197 (119)   343 (259)      269 (269)   6 (12)    44 (71)
IWHS        1986-2001   28,486       208       748 (285)   1029 (483)   15 (11)   223 (111)   382 (292)      275 (265)   11 (13)   12 (39)
NLCS        1986-1995   62,412       208       869 (261)   ---          14 (8)    ---         ---            187 (153)   23 (18)   52 (56)
NYSC        1980-1987   22,550       77        828 (209)   873 (220)    15 (9)    203 (68)    371 (227)      137 (87)    ---       ---
NYU         1985-1998   12,401       65        810 (307)   867 (317)    14 (11)   ---         ---            202 (243)   17 (22)   38 (61)
NHS80       1980-1986   80,195       120       722 (298)   731 (310)    14 (11)   167 (107)   279 (262)      215 (241)   14 (15)   21 (54)
NHS86       1986-2000   59,538       315       718 (254)   1056 (492)   13 (10)   182 (100)   319 (243)      221 (230)   13 (13)   28 (55)
NHS II      1991-2000   91,502       52        787 (271)   910 (381)    16 (11)   223 (109)   351 (231)      268 (255)   12 (12)   31 (55)
SMC         1987-2003   61,103       287       913 (256)   ---          16 (10)   199 (130)   ---            156 (130)   27 (19)   104 (108)
WHS         1993-2002   32,466       104       729 (258)   940 (442)    14 (10)   217 (104)   324 (216)      215 (222)   9 (11)    36 (64)

[1] AHS=Adventist Health Study, BCDDP=Breast Cancer Detection Demonstration Project, CNBSS=Canadian National Breast Screening Study, CPS II=Cancer Prevention Study II Nutrition Cohort, IWHS=Iowa Women's Health Study, NLCS=Netherlands Cohort Study, NYSC=New York State Cohort, NYU=New York University Women's Health Study, NHS80=Nurses' Health Study (part a), NHS86=Nurses' Health Study (part b), NHS II=Nurses' Health Study II, SMC=Sweden Mammography Cohort, WHS=Women's Health Study
[2] Baseline cohort size determined after specific exclusions (i.e., prior cancer diagnosis other than non-melanoma skin cancer at baseline, bilateral oophorectomy prior to baseline, or loge-transformed energy intake beyond three standard deviations from the study-specific loge-transformed mean energy intake of the population)
[3] Studies which have a --- did not estimate that nutrient or did not ask on their questionnaire about the consumption of that food item
[4] Total calcium and vitamin D include dietary and supplemental sources.

Multivariate Adjusted Relative Risks (RR) and 95% Confidence Intervals (CI) for Ovarian Cancer According to Lactose Intake (>30g/day compared to <10g/day)
by Study

Figure 2. Multivariate [1] Adjusted Relative Risks (RR) and 95% Confidence Intervals (CI) for Ovarian Cancer According to Lactose Intake (>30 g/day compared to <10 g/day) by Study
[Forest plot. Rows: Adventist Health Study; Canadian National Breast Screening Study; New York State Cohort; New York University Women's Health Study; Breast Cancer Detection Demonstration Project; Cancer Prevention Study II; Iowa Women's Health Study; Nurses' Health Study 1986; Sweden Mammography Cohort; Netherlands Cohort Study; Nurses' Health Study II; Nurses' Health Study 1980; Women's Health Study; Combined. x-axis: Relative Risk, with ticks at .2, .5, 1, 2, 5]
[1] Multivariate relative risks were adjusted for age at menarche (<13, 13, >13 years), menopausal status at baseline, oral contraceptive use (ever, never), hormone replacement therapy use among post-menopausal women (never, past, current), parity (0, 1, 2, >2), body mass index (<23, 23-24.9, 25-29.9, >30 kg/m2), smoking status (never, past, current), physical activity (low, medium, high), and energy intake (continuously), modeled identically across studies.
The black squares and horizontal lines correspond to the study-specific relative risks and 95% confidence intervals for >30 g/day lactose intake. The area of the black squares reflects the study-specific weights (inverse of the variance), which is related to sample size and intake variation. The diamond represents the pooled multivariate relative risk and 95% confidence interval. The dashed line represents the pooled multivariate relative risk.

Data and SAS Program to analyze Lactose and Ovarian Cancer Pooling Project

Data (study, relative risk, upper 95% confidence limit):

    AHS    0.496  1.23
    BCDDP  0.873  1.46
    CNBSS  0.659  2.76
    CPS2   1.033  1.48
    IWHS   1.225  1.95
    NLCS   1.568  3.32
    NYS    0.767  3.53
    NYU    0.787  2.69
    NHSa   1.619  2.90
    NHSb   1.240  1.90
    NHSII  1.573  3.51
    SMC    1.346  2.07
    WHS    1.699  3.23

SAS program:

    options ps=58 ls=120 nodate nonumber;
    %include "newmeta.mac";                  /* defines the %meta macro */

    data dat;
      *infile "no_conf.dat" lrecl=2000;
      infile "lactose.dat" lrecl=2000;
      input study $ beta ub;                 /* label, relative risk, upper 95% limit */
      beta = log(beta);                      /* log relative risk */
      var  = ((log(ub) - beta)/1.96)**2;     /* variance from the upper limit */
      w    = 1/var;                          /* inverse-variance weight */
    run;

    %meta(beta = beta, var = var, data = _last_, labels=study, wt=1,
          name='pooling project -- lactose and ov ca', modlab=1, out=metaout);

    /* Random effects: residual variance held at 1, study variance estimated */
    proc mixed data=dat;
      class study;
      model beta = / s;
      random study / s;
      repeated;
      weight w;
      parms (0) (1) / eqcons=2;
    run;

    /* Fixed effects: both covariance parameters held at their starting values */
    proc mixed data=dat;
      class study;
      model beta = / s;
      random study / s;
      repeated;
      weight w;
      parms (0) (1) / eqcons=1 to 2;
    run;

SAS Output for Analysis of Lactose Data

    The SAS System
    Meta-Analysis for variable : 'pooling project -- lactose and ov ca'
    Model : 1
    Weight used for Odds Ratios etc., is 1

    Inverse-variance weighted average of estimates
    ------------------------------------------------------------------------------
            Pooled                  Lower       Upper       Z-score for   Chi-sq
            Estimate    S.E.        95% CL      95% CL      H0: OR=1      Prob
    ------------------------------------------------------------------------------
    Beta    0.164614    0.082208    0.003487    0.325741    2.002423      0.045239
    OR      1.178938                1.003494    1.385057
    ------------------------------------------------------------------------------

    Test of heterogeneity
    =======================
    Q    : 10.550490
    df   : 12
    Prob : 0.567783
    =======================
    Estimate of among-study variability = 0

    Weighted average of estimates using the DerSimonian and Laird random effects model
    ------------------------------------------------------------------------------
            Pooled                  Lower       Upper       Z-score for   Chi-sq
            Estimate    S.E.        95% CL      95% CL      H0: OR=1      Prob
    ------------------------------------------------------------------------------
    Beta    0.164614    0.082208    0.003487    0.325741    2.002423      0.045239
    OR      1.178938                1.003494    1.385057
    ------------------------------------------------------------------------------

    Input data:
    ========================================================================================
    Label   Beta        Variance    OR          Lower 95%   Upper 95%   Fixed wt   Random wt
    ========================================================================================
    AHS     -0.701179   0.214706    0.496000    0.200013    1.230000    0.03       0.03
    BCDDP   -0.135820   0.068841    0.873000    0.522006    1.460000    0.10       0.10
    CNBSS   -0.417032   0.533990    0.659000    0.157348    2.760000    0.01       0.01
    CPS2     0.032467   0.033656    1.033000    0.721006    1.480000    0.20       0.20
    IWHS     0.202941   0.056258    1.225000    0.769551    1.950000    0.12       0.12
    NLCS     0.449801   0.146487    1.568000    0.740549    3.320000    0.05       0.05
    NYS     -0.265268   0.606623    0.767000    0.166654    3.530000    0.01       0.01
    NYU     -0.239527   0.393224    0.787000    0.230249    2.690000    0.02       0.02
    NHSa     0.481809   0.088446    1.619000    0.903849    2.900000    0.08       0.08
    NHSb     0.215111   0.047405    1.240000    0.809263    1.900000    0.14       0.14
    NHSII    0.452985   0.167695    1.573000    0.704937    3.510000    0.04       0.04
    SMC      0.297137   0.048223    1.346000    0.875225    2.070000    0.14       0.14
    WHS      0.530040   0.107438    1.699000    0.893685    3.230000    0.06       0.06
    ========================================================================================

SAS Output: Random effects regression model

    The SAS System
    The Mixed Procedure

    Model Information
    Data Set                     WORK.DAT
    Dependent Variable           beta
    Weight Variable              w
    Covariance Structure         Variance Components
    Estimation Method            REML
    Residual Variance Method     Parameter
    Fixed Effects SE Method      Model-Based
    Degrees of Freedom Method    Containment

    Class Level Information
    Class    Levels    Values
    study    13        AHS BCDDP CNBSS CPS2 IWHS NHSII NHSa NHSb NLCS NYS NYU SMC WHS

    Dimensions
    Covariance Parameters     2
    Columns in X              1
    Columns in Z              13
    Subjects                  1
    Max Obs Per Subject       13
    Observations Used         13
    Observations Not Used     0
    Total Observations        13

    Parameter Search
    CovP1    CovP2     Res Log Like    -2 Res Log Like
    0        1.0000    -5.3070         10.6140

    Iteration History
    Iteration    Evaluations    -2 Res Log Like    Criterion
    1            1              10.61404477        0.00000000
    Convergence criteria met.

SAS Output: Random effects regression model

    The SAS System
    The Mixed Procedure

    Covariance Parameter Estimates
    Cov Parm    Estimate
    study       0
    Residual    1.0000

    Fit Statistics
    -2 Res Log Likelihood       10.6
    AIC (smaller is better)     10.6
    AICC (smaller is better)    10.6
    BIC (smaller is better)     10.6
    PARMS Model Likelihood Ratio Test
    DF    Chi-Square    Pr > ChiSq
    0     0.00          1.0000

    Solution for Fixed Effects
    Effect       Estimate    Standard Error    DF    t Value    Pr > |t|
    Intercept    0.1646      0.08221           12    2.00       0.0684

SAS Output: Fixed effects regression model

    fixed effects model
    The Mixed Procedure

    Model Information
    Data Set                     WORK.DAT
    Dependent Variable           beta
    Weight Variable              w
    Covariance Structure         Variance Components
    Estimation Method            REML
    Residual Variance Method     Parameter
    Fixed Effects SE Method      Model-Based
    Degrees of Freedom Method    Containment

    Class Level Information
    Class    Levels    Values
    study    13        AHS BCDDP CNBSS CPS2 IWHS NHSII NHSa NHSb NLCS NYS NYU SMC WHS

    Dimensions
    Covariance Parameters     2
    Columns in X              1
    Columns in Z              13
    Subjects                  1
    Max Obs Per Subject       13
    Observations Used         13
    Observations Not Used     0
    Total Observations        13

    Parameter Search
    CovP1    CovP2     Res Log Like    -2 Res Log Like
    0        1.0000    -5.3070         10.6140

    Iteration History
    Iteration    Evaluations    -2 Res Log Like    Criterion
    1            1              10.61404477        0.00000000
    Convergence criteria met.

SAS Output: Fixed effects regression model

    The SAS System
    The Mixed Procedure

    Covariance Parameter Estimates
    Cov Parm    Estimate
    study       0
    Residual    1.0000

    Fit Statistics
    -2 Res Log Likelihood       10.6
    AIC (smaller is better)     10.6
    AICC (smaller is better)    10.6
    BIC (smaller is better)     10.6

    PARMS Model Likelihood Ratio Test
    DF    Chi-Square    Pr > ChiSq
    0     0.00          1.0000

    Solution for Fixed Effects
    Effect       Estimate    Standard Error    DF    t Value    Pr > |t|
    Intercept    0.1646      0.08221           12    2.00       0.0684

Prospectively Planned Pooled Analyses
- Protocol standardized across studies for
  - hypotheses
  - data collection methods
  - definition of variables
  - analyses

European Prospective Investigation into Cancer and Nutrition (EPIC): Study Design
  - Multicenter prospective study
      - 22 centers in 9 countries
      - Initiated 1993-1998
  - Objective:
      - Improve scientific knowledge on nutritional factors involved in cancer
      - Provide scientific bases for public health interventions
  - Baseline cohort: 484,042
Riboli 2001

EPIC: Study Design, cont'd
  - Measures
      - Questionnaires
      - Anthropometry
      - Blood samples (n=387,256)
  - Outcomes
      - Cancer registry (n=6)
      - Combination (n=3)
          - Health insurance records
          - Cancer and pathology registries
          - Active follow-up
      - Mortality registries (n=9)
Riboli 2001

EPIC: Diet Assessment Methods
  - Self-administered questionnaire (n=7 countries)
      - 300-350 foods
  - Interviewer-administered questionnaire (n=2 centers)
      - Similar to self-administered questionnaire
  - Food frequency questionnaire + 7-day diet record (n=2 centers)
  - 24-hr recall from 8-10% random sample from each cohort
Riboli 2001

Why Are Pooled Analyses Time-Consuming?
  - Data management
  - Add updated case information
  - Errors found in data
  - Individual study wants to publish its findings prior to submitting pooled results
  - Manuscript review
  - Signature sheets

Pooling decreases the variation caused by random error (increasing the sample size) but does not eliminate any bias (systematic error).
Blettner 1999

REFERENCES
1. Blettner M, Sauerbrei W, Schlehofer B, Scheuchenpflug T, Friedenreich C. Traditional reviews, meta-analyses and pooled analyses in epidemiology. International Journal of Epidemiology, 1999; 28:1-9.
2. Costa-Bouzas J, Takkouche B, Cadarso-Suárez C, Spiegelman D. HEpiMA: Software for the identification of heterogeneity in meta-analysis. Computer Methods and Programs in Biomedicine, 2000; 64(2):101-107.
3. DerSimonian R, Laird N. Meta-analysis in clinical trials. Controlled Clinical Trials, 1986; 7:177-188.
4. Friedenreich CM. Methods for pooled analyses of epidemiologic studies. Epidemiology, 1993; 4:295-302.
5. Gandini S, Merzenich H, Robertson C, Boyle P. Meta-analysis of studies on breast cancer risk and diet: The role of fruit and vegetable consumption and the intake of associated micronutrients. European Journal of Cancer, 2000; 36:636.
6. Smith-Warner S, Spiegelman D, Adami H, et al. Intake of fruits and vegetables and risk of breast cancer: A pooled analysis of cohort studies. JAMA, 2001; 285:769-776.
7. Steinberg KK, Smith SJ, Stroup DF, et al. Comparison of effect estimates from a meta-analysis of summary data from published studies and from a meta-analysis using individual patient data for ovarian cancer studies. American Journal of Epidemiology, 1997; 145:917-925.
8. Stram DO. Meta-analysis of published data using a linear mixed-effects model. Biometrics, 1996; 52:536-544.
9. Takkouche B, Cadarso-Suárez C, Spiegelman D. An evaluation of old and new tests for heterogeneity in meta-analysis for epidemiologic research. American Journal of Epidemiology, 1999; 150:206-215.
10. World Cancer Research Fund, American Institute for Cancer Research Expert Panel (J.D. Potter, Chair). Food, nutrition and the prevention of cancer: A global perspective. Washington DC: American Institute for Cancer Research, 1997.
11. Smith-Warner SA, et al. Types of dietary fat and breast cancer: A pooled analysis of cohort studies. International Journal of Cancer, 2001; 92:767-774.
12. Ritz J, Demidenko E, Spiegelman D. Multivariate pooling for efficiency. Journal of Statistical Planning and Inference, 2008; 138:1919-1933.
13. Higgins JPT, Thompson SG, Deeks JJ, Altman DG. Measuring inconsistency in meta-analyses. BMJ, 2003; 327:557-560.
14. Sterne J, Egger M. Funnel plots for detecting bias in meta-analysis: Guidelines on choice of axis. Journal of Clinical Epidemiology, 2001; 54:1046-1055.
15. Genkinger JM, Hunter DJ, Spiegelman D, Anderson KE, Arslan A, Beeson WL, Buring JE, Fraser GE, Freudenheim JL, Goldbohm RA, Hankinson SE, Jacobs DR Jr, Koushik A, Lacey JV Jr, Larsson SC, Leitzmann M, McCullough ML, Miller AB, Rodriguez C, Rohan TE, Schouten LJ, Shore R, Smit E, Wolk A, Zhang SM, Smith-Warner SA. Dairy products and ovarian cancer: A pooled analysis of 12 cohorts. Cancer Epidemiology, Biomarkers & Prevention, 2006; 15:364-372.
16. Riboli E. The European prospective investigation into cancer and nutrition (EPIC): Plans and progress. Journal of Nutrition, 2001; 131(1):170S-175S.
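
Supplementary note: checking the fixed-effects SAS output by hand

The Solution for Fixed Effects table in the fixed-effects SAS output reports a pooled estimate of 0.1646 with standard error 0.08221 on 12 degrees of freedom. The sketch below is one way to check the reported t value and two-sided p-value from those three numbers alone. It is written in Python rather than SAS and is not the code that produced the output; the helper names t_pdf and two_sided_p are hypothetical, and the p-value is obtained by numerically integrating the Student's t density with Simpson's rule, which is an assumption of convenience, not a description of how PROC MIXED computes it.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    norm = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return norm * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t_value, df, n=2000):
    """Two-sided p-value P(|T| > |t|) using Simpson's rule (n must be even)."""
    upper = abs(t_value)
    h = upper / n
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    inside = 2 * (h / 3) * total  # P(-|t| < T < |t|), using symmetry about 0
    return 1 - inside


# Values copied from the Solution for Fixed Effects table.
estimate = 0.1646
std_error = 0.08221
df = 12

t_value = estimate / std_error
p_value = two_sided_p(t_value, df)
print(f"t = {t_value:.2f}, p = {p_value:.4f}")  # t = 2.00, p = 0.0684
```

Because the inputs are the rounded values printed by SAS, agreement is expected only to the number of digits shown in the table.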