Background Systemic lupus erythematosus (SLE) is a disease that involves the complex interplay of many genes, reflected in the more than one hundred loci linked with disease risk by genome-wide association studies (GWAS). Decoding GWAS is therefore a promising strategy to identify novel drug targets in SLE. However, most of the identified disease-associated hits are noncoding single-nucleotide polymorphisms (SNPs) and cannot be distinguished from others that reside incidentally within risk loci. To address this longstanding challenge of finding the true regulatory functional SNPs (fSNPs) among GWAS hits in SLE, we used an unbiased high-throughput screening method.
Methods From five GWAS for SLE (Gateva, Sandling et al. 2009, Bentham, Morris et al. 2015, Armstrong et al. 2015, Morris, Sheng et al. 2016, Langefeld, Ainsworth et al. 2017), 87 disease-associated SNPs were chosen as lead SNPs, and SNPs in linkage disequilibrium (LD; R2 > 0.8) with them were also included in the screening library. In total, 2176 SNPs were screened by three different high-throughput methods: SNP-seq (Li, Martinez-Bonet et al. 2018), H3K4me3 epigenetic modification (Trynka, Sandor et al. 2013), and Combined Annotation Dependent Depletion (CADD) (Rentzsch, Witten et al. 2018). Top candidates from the screen were further tested for regulatory function by electrophoretic mobility shift assay (EMSA) and luciferase reporter assay to define the final fSNP candidates. Through bioinformatic binding motif prediction and mass spectrometry after oligo pulldown, transcription factors (TFs) that might bind to the fSNPs were prioritized for validation by ChIP-qPCR, Western blot after oligo pulldown, and supershift assay. The association between selected fSNPs and the risk gene/associated gene was measured in genetically modified Daudi B cells (CRISPR homology-directed repair). The association between the TF and the associated gene was measured in TF CRISPR-knockout Daudi B cells.
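The library construction described in the Methods (lead SNPs plus their LD proxies at R2 > 0.8) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `build_screening_library`, the input layout (tuples of lead SNP, proxy SNP, and R2), and the placeholder SNP identifiers are all assumptions made for illustration only.

```python
from typing import Iterable, Set, Tuple

# Threshold stated in the Methods; everything else here is hypothetical.
R2_THRESHOLD = 0.8


def build_screening_library(
    lead_snps: Iterable[str],
    ld_pairs: Iterable[Tuple[str, str, float]],
    r2_threshold: float = R2_THRESHOLD,
) -> Set[str]:
    """Return the lead SNPs plus every proxy with R2 strictly above the threshold.

    ld_pairs holds (lead_snp, proxy_snp, r2) tuples; this layout is an
    assumption, not a format described in the abstract.
    """
    leads = set(lead_snps)
    library = set(leads)  # lead SNPs are always part of the library
    for lead, proxy, r2 in ld_pairs:
        # Keep a proxy only if it is tied to a chosen lead SNP and passes the cutoff.
        if lead in leads and r2 > r2_threshold:
            library.add(proxy)
    return library


if __name__ == "__main__":
    # Placeholder identifiers, not real variants or real LD values.
    example_leads = ["lead_A", "lead_B"]
    example_ld = [
        ("lead_A", "proxy_1", 0.95),
        ("lead_A", "proxy_2", 0.60),  # below threshold, excluded
        ("lead_B", "proxy_3", 0.81),
        ("lead_C", "proxy_4", 0.99),  # lead_C was not selected, excluded
    ]
    print(sorted(build_screening_library(example_leads, example_ld)))
```

In practice the R2 values would come from a reference panel matched to the GWAS population; the abstract does not state which panel was used.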
Results Fifty-four of the 2176 SNPs were identified as candidate fSNPs and tested for regulatory function. In EMSA, 9 SNPs showed allele-specific binding to proteins in nuclear extracts from both BL2 cells (a B cell line) and PBMCs. Six of these 9 SNPs showed allele-dependent differences in gene expression in a luciferase reporter assay in a B cell line (Daudi). Following bioinformatic prediction and mass spectrometry after oligo pulldown, two fSNPs (rs2297550 and rs9907966) were found to bind the transcription factors IKZF1 and YBX1, respectively, in B cells. Specifically, IKZF1 preferentially binds the G allele (risk allele) of rs2297550, and YBX1 preferentially binds the A allele (reference allele) of rs9907966. SNP rs2297550 was further analyzed for its association with the risk gene IKBKE, and homozygosity for the risk G allele was found to be associated with lower IKBKE expression in B cells. Considering the role of IKBKE in preventing DNA damage-induced cell death, IKBKE deficiency plausibly increases SLE risk. Through CRISPR knockout, we found that deficiency of the TF IKZF1 leads to increased IKBKE expression. These data suggest a plausible mechanism in which the risk allele of rs2297550 binds IKZF1 to suppress the expression of IKBKE in B cells, thereby increasing SLE risk.
Conclusions Our unbiased high-throughput screen of SLE GWAS hits, followed by step-wise validation, identified functional regulatory SNPs (fSNPs) that bind transcription factors and regulate gene expression. This establishes a working model to bridge the gap between SLE GWAS and disease mechanism.
Acknowledgments This work was funded by a Target Identification in Lupus grant from the Lupus Research Alliance.
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.