— CH. 1 · ORIGINS AND EARLY HISTORY —

Meta-analysis

  • The term meta-analysis entered the statistical lexicon in 1976, when Gene Glass defined it as the analysis of analyses. The coinage marked a shift toward aggregating measures of relationships and effects across independent studies. Although Glass is credited with authoring the first modern meta-analysis, the historical record shows earlier precedents. In 1904 Karl Pearson published a paper in the British Medical Journal that collated data from several typhoid inoculation studies, aggregating the outcomes of multiple clinical studies decades before the formal method existed. Other early examples appeared in occupational aptitude testing in the mid-20th century, and agricultural research used similar aggregation techniques to synthesize field data. The field expanded rapidly after 1978, when Mary Lee Smith and Gene Glass published their model meta-analysis on the effectiveness of psychotherapy. The article drew immediate pushback over its validity and usefulness: Hans Eysenck responded the same year with an article calling the approach "an exercise in mega-silliness," and he later referred to the technique as "statistical alchemy." Adoption grew regardless. By 1991 there were 334 published meta-analyses, and by 2014 the number had risen to 9,135, reflecting broad acceptance across disciplines such as psychology, medicine, and ecology.

  • Choosing databases efficiently is the foundation of a rigorous literature search. Researchers must identify appropriate keywords and apply specific search limits to filter results, and Boolean operators help narrow thousands of potential matches in databases such as PubMed or Embase. Many scientists run the same search terms in two or more databases to ensure comprehensive coverage. The reference lists of eligible studies can also be searched for further relevant articles, a technique known as snowballing. Initial searches often return a large volume of studies, which must then be screened strictly against pre-specified inclusion criteria. A title or abstract frequently shows that a study is ineligible, so it can be discarded immediately; if any doubt remains, the full paper is retained for closer inspection. The search results should be documented in a PRISMA flow diagram showing the flow of information through every stage of the review, together with the date range of the included studies and the date on which the search was conducted. A data collection form provides a standardized means of extracting data from eligible studies. For correlational data, Pearson's r usually serves as the effect size measure. Partial correlations often inflate relationships compared with zero-order correlations, which leads many meta-analyses to exclude them entirely. As a last resort, plot digitizers can be used to scrape data points from published scatterplots.
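
Once Pearson's r has been extracted from each eligible study, the correlations still have to be combined. The text above does not say how; the sketch below assumes one common approach, Fisher's z transformation with weights of n - 3, and uses made-up correlations and sample sizes purely for illustration.

```python
import math


def pool_correlations(rs, ns):
    """Pool Pearson correlations via Fisher's z (an assumed approach).

    rs: correlation coefficients, one per study
    ns: sample sizes, one per study (each must exceed 3)
    Returns the pooled r and an approximate 95% confidence interval.
    """
    # Fisher's z = atanh(r) has sampling variance of roughly 1 / (n - 3).
    zs = [math.atanh(r) for r in rs]
    weights = [n - 3 for n in ns]  # inverse of each variance
    total = sum(weights)
    z_bar = sum(w * z for w, z in zip(weights, zs)) / total
    se = math.sqrt(1.0 / total)
    lower, upper = z_bar - 1.96 * se, z_bar + 1.96 * se
    # Transform back to the correlation scale.
    return math.tanh(z_bar), (math.tanh(lower), math.tanh(upper))


if __name__ == "__main__":
    # Hypothetical extracted values, not taken from any real study.
    pooled_r, interval = pool_correlations([0.21, 0.35, 0.28], [40, 120, 75])
    print(f"pooled r = {pooled_r:.3f}, 95% CI = "
          f"({interval[0]:.3f}, {interval[1]:.3f})")
```

Pooling on the z scale rather than on r directly is a design choice: the variance of z depends only on the sample size, which keeps the weights simple.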

  • Two distinct types of evidence can be used in a meta-analysis: individual participant data (IPD) and aggregate data (AD). Aggregate data are typically summary estimates, such as odds ratios or relative risks, taken from the published literature. They can be synthesized directly across conceptually similar studies using several established approaches. Indirect aggregate data, by contrast, measure the effects of two treatments that were each compared separately against a similar control group. One-stage methods model the IPD from all studies simultaneously while accounting for the clustering of participants within studies. Two-stage methods first compute summary statistics (AD) for each study and then calculate an overall weighted average. Contrary to conventional belief, recent studies show that one-stage and two-stage methods may occasionally lead to different conclusions. Fixed-effect models provide a weighted average in which the inverse of each study's variance commonly serves as its weight. Larger studies therefore contribute more than smaller ones, and when one very large study dominates, the findings of the small studies are practically ignored. Most importantly, the fixed-effect model assumes that all included studies investigate the same population and use the same variable definitions. Random-effects models treat heterogeneity purely as random variability among the true effects. The greater this variability, the more the weights are equalized, until the result approaches a simple unweighted average across all studies. When the effect sizes are similar enough that no between-study variance is estimated, the random-effects analysis defaults to a fixed-effect analysis. Confidence intervals from random-effects models generally do not retain their coverage probability at the specified nominal level, so they tend to underestimate the statistical error.
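
The weighting behavior described above can be sketched in a few lines. The text does not name an estimator for the between-study variance; this sketch assumes the DerSimonian-Laird moment estimator, and the effect sizes and variances are invented for illustration.

```python
import math


def fixed_effect(effects, variances):
    """Inverse-variance weighted mean and its standard error."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total
    return pooled, math.sqrt(1.0 / total)


def between_study_variance(effects, variances):
    """DerSimonian-Laird estimate of tau^2 (assumed estimator), floored at 0."""
    if len(effects) < 2:
        return 0.0
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled, _ = fixed_effect(effects, variances)
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    scale = total - sum(w * w for w in weights) / total
    degrees_of_freedom = len(effects) - 1
    return max(0.0, (q - degrees_of_freedom) / scale)


def random_effects(effects, variances):
    """Random-effects mean: add tau^2 to every variance, then re-weight."""
    tau2 = between_study_variance(effects, variances)
    pooled, se = fixed_effect(effects, [v + tau2 for v in variances])
    return pooled, se, tau2


if __name__ == "__main__":
    # Hypothetical effect sizes and sampling variances.
    effects = [0.10, 0.30, 0.35, 0.65]
    variances = [0.01, 0.04, 0.02, 0.09]
    print("fixed effect:  ", fixed_effect(effects, variances))
    print("random effects:", random_effects(effects, variances))
```

Because tau^2 is added to every study's variance, a large tau^2 swamps the differences between studies and pushes the weights toward equality. When tau^2 is estimated as zero, `random_effects` returns the same pooled value as `fixed_effect`.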

  • Internal-external cross validation (IOCV), a form of leave-one-out cross validation, offers a way to measure the statistical validity of meta-analysis results. Each of the k included studies is omitted in turn and compared with the summary estimate derived from the remaining k-1 studies. A general validation statistic, Vn, based on IOCV measures how well individual studies align with the aggregated outcome. Qualitative appraisal with established tools can uncover potential biases, but it does not quantify the aggregate impact of those biases. Comparing meta-analysis results with independent prospective primary studies is often impractical because of resource constraints. Prediction error estimation approaches also exist for test accuracy and multivariate effects. The meta-analysis estimate is a weighted average across studies, and when there is heterogeneity it may not be representative of individual studies. Outcomes that vary more than would be expected from sampling different numbers of research participants call for careful scrutiny. Study characteristics, such as the measurement instrument used or the population sampled, are coded and used to reduce the variance of the estimator. Methodological weaknesses in the original studies can sometimes be corrected statistically in this way. The development and validation of clinical prediction models benefit significantly from combining individual participant data across centers, and aggregating existing prediction models allows researchers to assess generalizability across diverse populations.
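
The leave-one-out procedure described above can be sketched as follows. This is not the Vn statistic, whose definition is not given here. As a simple stand-in, the sketch assumes a standardized difference between each omitted study and the inverse-variance pooled estimate of the rest, with hypothetical data.

```python
import math


def pool(effects, variances):
    """Inverse-variance weighted mean and the variance of that mean."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total
    return pooled, 1.0 / total


def leave_one_out(effects, variances):
    """Omit each study in turn and compare it with the pooled remainder.

    Returns one (index, pooled_without_study, z) tuple per study, where z
    is the standardized difference (an illustrative choice, not Vn).
    """
    effects, variances = list(effects), list(variances)
    rows = []
    for i, (y, v) in enumerate(zip(effects, variances)):
        rest_effects = effects[:i] + effects[i + 1:]
        rest_variances = variances[:i] + variances[i + 1:]
        pooled, pooled_var = pool(rest_effects, rest_variances)
        # The omitted study is independent of the estimate from the rest,
        # so the variance of the difference is the sum of the variances.
        z = (y - pooled) / math.sqrt(v + pooled_var)
        rows.append((i, pooled, z))
    return rows


if __name__ == "__main__":
    # Hypothetical data: the last study sits far from the others.
    effects = [0.20, 0.25, 0.18, 0.90]
    variances = [0.01, 0.02, 0.015, 0.02]
    for index, pooled, z in leave_one_out(effects, variances):
        print(f"study {index}: pooled without it = {pooled:.3f}, z = {z:+.2f}")
```

A study with a large absolute z disagrees with the summary of the remaining k-1 studies and is a natural candidate for closer inspection.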

  • Relying on the available body of published studies can exaggerate outcomes, because publication is systematically biased against negative results. Pharmaceutical companies have been known to hide negative studies, and researchers may overlook unpublished work such as dissertations or conference abstracts. This is the file drawer problem: negative or non-significant results are tucked away in a cabinet, which creates a serious base rate fallacy. The significance of the published studies is overestimated because other studies were either never submitted or were rejected. A funnel plot visualizes the distribution of effect sizes as a scatter plot of standard error against effect size. Smaller studies have larger standard errors and scatter widely and symmetrically at the base of the funnel, while large studies cluster tightly at the tip. Asymmetry suggests that many negative studies went unpublished, leaving unjustifiably favorable results among those that remain. Statistical methods for detecting publication bias remain controversial because they typically have low power and can produce false positives. Small-study effects, where smaller and larger studies differ methodologically, can also cause asymmetry that resembles publication bias. One estimate suggests that 25% of meta-analyses in the psychological sciences may have suffered from publication bias, though the true figure is likely higher. Questionable research practices, such as reworking statistical models until significance is achieved, also favor statistically significant findings that support the researchers' hypotheses. Finally, studies often do not report effects that fail to reach statistical significance, which excludes those effects in much the same way as publication bias.
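
Funnel plot asymmetry is usually judged by eye, but it can also be summarized numerically. The text above does not name a specific test; the sketch below assumes Egger's regression, which regresses each study's standardized effect on its precision and reads a non-zero intercept as a sign of asymmetry. The data are hypothetical, and, as noted above, such tests have low power and cannot distinguish publication bias from other small-study effects.

```python
import math


def egger_regression(effects, standard_errors):
    """Ordinary least squares of (effect / se) on (1 / se).

    Returns (intercept, slope, intercept_standard_error). An intercept far
    from zero suggests funnel plot asymmetry. The slope estimates the
    underlying effect.
    """
    n = len(effects)
    if n < 3:
        raise ValueError("need at least three studies")
    xs = [1.0 / se for se in standard_errors]                 # precision
    ys = [y / se for y, se in zip(effects, standard_errors)]  # standardized
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual variance with n - 2 degrees of freedom.
    residual_ss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    s2 = residual_ss / (n - 2)
    intercept_se = math.sqrt(s2 * (1.0 / n + x_bar ** 2 / sxx))
    return intercept, slope, intercept_se


if __name__ == "__main__":
    # Hypothetical studies in which smaller studies report larger effects.
    effects = [0.31, 0.38, 0.52, 0.75, 1.10]
    standard_errors = [0.05, 0.10, 0.15, 0.25, 0.40]
    intercept, slope, intercept_se = egger_regression(effects, standard_errors)
    print(f"intercept = {intercept:.3f} (se {intercept_se:.3f}), "
          f"slope = {slope:.3f}")
```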

  • The most severe faults occur when the person or people doing the meta-analysis have an economic, social, or political agenda, such as the passage or defeat of legislation. People with such agendas may abuse meta-analysis through personal bias, selecting researchers whose findings are favorable while ignoring those whose findings are not. The favored authors may themselves be biased or paid to produce results that support the overall political, social, or economic goal. Selecting small favorable data sets without incorporating larger unfavorable ones distorts the conclusions significantly. A 2011 study of conflicts of interest in the research underlying medical meta-analyses reviewed 29 meta-analyses and found that financial ties were rarely disclosed. The 29 comprised 11 from general medicine journals, 15 from specialty medicine journals, and three from the Cochrane Database of Systematic Reviews, and together they covered 509 randomized controlled trials (RCTs). Of these, 318 RCTs reported funding sources, with 219 (69%) receiving industry funding, and 132 reported author conflict of interest disclosures, with 91 (69%) disclosing financial ties to industry. Yet only two of the meta-analyses reported RCT funding sources, and none reported RCT author-industry ties. In 1998 a US federal judge found that the Environmental Protection Agency had abused the meta-analysis process in claiming cancer risks from environmental tobacco smoke, with the aim of influencing policy makers. The methodology is malleable enough that agenda-driven choices can distort the apparent scientific consensus.

  • Seed-based d mapping analyzes differences in brain activity or structure using neuroimaging techniques such as fMRI, VBM, or PET. Meta-analyses of microRNA expression profiles identify microRNAs that are differentially expressed in specific cell types, disease conditions, or treatment settings. Meta-analysis of whole genome sequencing studies offers a solution to the problem of collecting sample sizes large enough to discover rare variants associated with complex phenotypes. In biobank-scale cohorts, efficient approaches to storing summary statistics enable functionally informed rare variant association meta-analysis. Network meta-analyses can examine patterns in a fuller panorama of more accurately estimated results and consider broader contexts, such as how personality-intelligence relations vary by trait family. Forest plots display the results, showing the weighted average across all studies and emphasizing practical importance over statistical significance alone. The inverse variance method computes the average effect size as a weighted mean in which each study's weight is the inverse of the variance of its effect estimate. The Mantel-Haenszel and Peto methods are common alternatives, used frequently in healthcare research. Meta-analysis techniques for single-subject designs remain in considerable dispute as to which method is most appropriate. Megastudies investigate the efficacy of many different interventions designed in an interdisciplinary manner by separate teams; one such study recruited thousands of participants through a fitness chain. Standardization, reproduction of experiments, open data, and open protocols often fail to mitigate such problems when relevant factors remain unknown or unrecorded. Modern statistical meta-analysis tests whether outcomes show more variation than would be expected from sampling different numbers of research participants.
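
For binary outcomes, the Mantel-Haenszel method mentioned above pools 2x2 tables directly rather than pooling per-study estimates. A minimal sketch of the Mantel-Haenszel pooled odds ratio follows; the table layout and the counts are assumptions made for illustration.

```python
def mantel_haenszel_odds_ratio(tables):
    """Mantel-Haenszel pooled odds ratio across 2x2 tables.

    Each table is a tuple (a, b, c, d), an assumed layout:
      a = events in the treatment group
      b = non-events in the treatment group
      c = events in the control group
      d = non-events in the control group
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d  # total participants in this study
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical trial counts, not drawn from any real study.
    trials = [
        (12, 88, 20, 80),
        (30, 170, 45, 155),
        (5, 45, 9, 41),
    ]
    print(f"pooled odds ratio = {mantel_haenszel_odds_ratio(trials):.3f}")
```

Because the sums are taken over raw cell counts, the method remains usable when some studies have few events, a situation in which per-study variances for the inverse variance method are poorly estimated.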

Common questions

When did the term meta-analysis enter the statistical lexicon?

The term meta-analysis entered the statistical lexicon in 1976 when Gene Glass defined it as the analysis of analyses. This coinage marked a shift toward aggregating measures of relationships and effects across independent studies.

Who published the first known paper collating data from multiple typhoid inoculation studies?

Karl Pearson published a paper in the British Medical Journal in 1904 that collated data from several typhoid inoculation studies. That work aggregated the outcomes of multiple clinical studies decades before the formal method existed.

How many meta-analyses were published globally by 2014 compared to 1991?

By 1991 there were 334 published meta-analyses. By 2014 the number had risen to 9,135, reflecting broad acceptance across disciplines such as psychology, medicine, and ecology.

What is the file drawer problem in relation to publication bias?

The file drawer problem refers to negative or non-significant results being tucked away in a cabinet rather than published, which creates a serious base rate fallacy. One estimate suggests that 25% of meta-analyses in the psychological sciences may have suffered from publication bias, though the true figure is likely higher.

Which US federal agency was found to have abused meta-analysis in 1998 regarding environmental tobacco smoke?

In 1998 a US federal judge found that the Environmental Protection Agency had abused the meta-analysis process in claiming cancer risks from environmental tobacco smoke, with the aim of influencing policy makers. The methodology is malleable enough that agenda-driven choices can distort the apparent scientific consensus.