  • Review Article

Recommendations for the design and analysis of epigenome-wide association studies

Abstract

Epigenome-wide association studies (EWAS) hold promise for the detection of new regulatory mechanisms that may be susceptible to modification by environmental and lifestyle factors affecting susceptibility to disease. Epigenome-wide screening methods cover an increasing number of CpG sites, but the complexity of the data poses a challenge to separating robust signals from noise. Appropriate study design, a detailed a priori analysis plan and validation of results are essential to minimize the danger of false positive results and contribute to a unified approach. Epigenome-wide mapping studies in homogeneous cell populations will inform our understanding of normal variation in the methylome that is not associated with disease or aging. Here we review concepts for conducting a stringent and powerful EWAS, including the choice of analyzed tissue, sources of variability and systematic biases, outline analytical solutions to EWAS-specific problems and highlight caveats in interpretation of data generated from samples with cellular heterogeneity.


Figure 1: Descriptive analysis of prior EWAS.
Figure 2: Steps toward a successful EWAS.

Acknowledgements

We are grateful to the Radcliffe Institute for Advanced Study at Harvard University for providing support for the workshop “Challenges of Epigenome-wide Association Studies—Optimizing Analytic Methods to Identify Important DNA Methylation Marks” held in Cambridge, Massachusetts, USA, on 3–5 June, 2012.

Author information

Corresponding author

Correspondence to Karin B Michels.

Ethics declarations

Competing interests

E.A.H. and K.T.K. are inventors on a pending international patent application, WO 2012/162660, entitled "Methods Using DNA Methylation for Identifying a Cell or a Mixture of Cells for Prognosis and Diagnosis of Diseases, and for Cell Remediation Therapies."

Supplementary information

Supplementary Table 1

Review of previously published EWAS. (PDF 870 kb)

About this article

Cite this article

Michels, K., Binder, A., Dedeurwaerder, S. et al. Recommendations for the design and analysis of epigenome-wide association studies. Nat Methods 10, 949–955 (2013). https://doi.org/10.1038/nmeth.2632

  • DOI: https://doi.org/10.1038/nmeth.2632
