Article Text
Abstract
Background Systemic lupus erythematosus (SLE) is a complex disease caused by the interplay of genetic predisposition and environmental triggers. A strong genetic component in lupus is supported by familial clustering, where 10–12% of SLE patients have an affected first-degree relative, and by higher concordance rates in monozygotic twins (24–69%) than in dizygotic twins and non-twin siblings (2–9%). In the majority of SLE cases, excluding rare monogenic forms, the genetic component consists of many polymorphisms with small effects, acting additively, which together explain about 44% to 66% of the disease etiology. The prevalence and severity of the disease differ across populations, with Africans, Hispanics and East Asians having a three to four times higher incidence than Europeans, which may be partially explained by genetic differences, as we show here. We summarized all the genetic findings from candidate gene and genome-wide association studies (GWAS) for non-monogenic forms of lupus across different populations now available in the medical literature.
Methods A literature search for GWAS and candidate gene studies in SLE was performed in the electronic article database PubMed (www.ncbi.nlm.nih.gov/pubmed), the NHGRI-EBI GWAS Catalog, and the references of selected original publications and review articles. We included 127 studies that reported SLE-associated polymorphisms reaching the significance threshold of p≤5×10⁻⁸. Variants in linkage disequilibrium (LD) with r²≥0.8 were grouped into the same locus. The leading variant of a locus was defined as the variant with the lowest p value. Loci were considered independent if r²<0.2 between their leading variants or if the literature supported two (or more) distinguishable contributions to genetic risk.
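The grouping rules described in the Methods can be sketched in Python. This is a minimal illustration under stated assumptions, not the procedure actually used for the review: the variant identifiers, p values and pairwise r² values are hypothetical, grouping is assumed to be transitive (single linkage), and a pair with no r² entry is assumed to be unlinked.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]


def group_into_loci(
    p_values: Dict[str, float],
    ld_r2: Dict[Pair, float],
    threshold: float = 0.8,
) -> List[dict]:
    """Group variants with r2 >= threshold and pick the lowest-p variant as lead.

    Assumption: grouping is transitive (A~B and B~C puts A, B, C in one locus).
    """
    parent = {variant: variant for variant in p_values}

    def find(variant: str) -> str:
        # Union-find lookup with path compression.
        while parent[variant] != variant:
            parent[variant] = parent[parent[variant]]
            variant = parent[variant]
        return variant

    for (a, b), r2 in ld_r2.items():
        if r2 >= threshold and a in parent and b in parent:
            parent[find(a)] = find(b)

    groups: Dict[str, List[str]] = {}
    for variant in p_values:
        groups.setdefault(find(variant), []).append(variant)

    loci = []
    for members in groups.values():
        lead = min(members, key=lambda v: p_values[v])  # lowest p value
        loci.append({"lead": lead, "members": sorted(members)})
    return sorted(loci, key=lambda locus: p_values[locus["lead"]])


def are_independent(
    lead_a: str, lead_b: str, ld_r2: Dict[Pair, float], max_r2: float = 0.2
) -> bool:
    """Two loci count as independent if r2 between their lead variants < max_r2.

    Assumption: a missing pair is treated as r2 = 0 (unlinked).
    """
    r2 = ld_r2.get((lead_a, lead_b), ld_r2.get((lead_b, lead_a), 0.0))
    return r2 < max_r2


# Hypothetical input, for illustration only.
example_p = {"var_A": 1e-12, "var_B": 3e-9, "var_C": 2e-10, "var_D": 4e-8}
example_ld = {
    ("var_A", "var_B"): 0.92,
    ("var_B", "var_C"): 0.85,
    ("var_A", "var_D"): 0.10,
}
example_loci = group_into_loci(example_p, example_ld)
```

With this hypothetical input, var_A, var_B and var_C fall into one locus led by var_A, while var_D forms its own locus that passes the r²<0.2 independence check against var_A. The alternative independence criterion based on literature support is a manual judgment and is not modeled here.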
Results In total we found 730 polymorphisms with p values between 5×10⁻⁸ and 2.2×10⁻²⁹⁸ associated with SLE in Europeans (EU), Asians (AS), African-Americans (AA) and Mixed Americans/Hispanics (MA). Reported odds ratios (OR) varied between 1.1 and 5, but the majority of the associations have weak effects (~77% have OR<1.5, while ~7.7% have OR≥2). We grouped these variants into 315 independent loci: 106 loci in EU, 216 loci in AS, 11 loci in AA, 18 loci in MA, and 28 loci reported only in multiancestral groups. Many loci are ancestry specific: 174 (80.56%) loci have been found to date only in AS, 60 (56.60%) loci only in EU, 3 (27.27%) loci only in AA, and 3 (16.67%) loci only in MA (figure 1). This finding may be explained by the different allele frequencies in the populations studied. This population specificity of the disease loci could influence SLE prevalence and be a source of heterogeneity in symptoms and disease severity across populations. Identifying the true causal variants and predicting their function is not a trivial task, as genetic variants lie in linkage blocks; therefore, the variant with the lowest p value may not be causal. Among the 730 SLE-associated polymorphisms, only 21 (4.52%) lead to an amino acid change, 484 (66.3%) lie within gene coding regions of 272 genes, and the rest are intergenic, which suggests that the majority of SLE associations affect gene regulation rather than protein sequence. These variants may regulate a gene or genes near or far, making the identification of the targets through which they change lupus risk a major challenge.
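The ancestry-specific shares quoted above are simple ratios of ancestry-specific loci to all loci reported in that ancestry. A short Python sketch reproduces them from the counts given in the text; the variable and function names are illustrative only.

```python
# (ancestry-specific loci, total loci reported in that ancestry), from the text.
loci_counts = {
    "AS": (174, 216),
    "EU": (60, 106),
    "AA": (3, 11),
    "MA": (3, 18),
}


def ancestry_specific_share(specific: int, total: int) -> float:
    """Percentage of an ancestry's loci that are reported only in that ancestry."""
    return round(100 * specific / total, 2)


shares = {
    ancestry: ancestry_specific_share(specific, total)
    for ancestry, (specific, total) in loci_counts.items()
}

for ancestry, share in shares.items():
    print(f"{ancestry}: {share:.2f}% ancestry specific")
```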
The pathway analysis for the 272 genes intersected by SLE variants shows that they are involved in immune processes including responses to pathogens and transcription (MHC class II receptor activity, toll-like receptor signaling, response to cytokines and regulation of their production, immune cell activation and proliferation, DNA binding, transcription regulation, Epstein-Barr virus infection and many others).
Conclusions GWAS and candidate gene studies have discovered more than 300 lupus-associated loci that explain more than half of the heritability in SLE, presumably leaving many SLE loci to be discovered in the future. Extensive studies have been performed in East Asians and Europeans, whereas studies in other populations have involved fewer subjects, so we know less about SLE genetics in those populations. Many SLE-associated variants are ancestry specific and have the potential to contribute to differences in clinical manifestations and disease prevalence between ancestries. The great majority of SLE-associated variants are located in non-coding regions; therefore, they more likely affect gene regulation, making this mechanism key to understanding SLE genetics and pathogenesis. Surrogate candidate genes potentially affected by SLE-associated polymorphisms appear to be mainly involved in immune processes and the regulation of transcription.
Acknowledgments Support is appreciated from the US Department of Veterans Affairs Merit Award (I01 BX001834) and the National Institutes of Health (R01 AI24717 & U01 AI130830).
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.