Using functional genomics to advance the understanding of psoriatic arthritis

Abstract Psoriatic arthritis (PsA) is a complex disease where susceptibility is determined by genetic and environmental risk factors. Clinically, PsA involves inflammation of the joints and the skin, and, if left untreated, results in irreversible joint damage. There is currently no cure and the few treatments available to alleviate symptoms do not work in all patients. Over the past decade, genome-wide association studies (GWAS) have uncovered a large number of disease-associated loci but translating these findings into functional mechanisms and novel targets for therapeutic use is not straightforward. Most variants have been predicted to affect primarily long-range regulatory regions such as enhancers. There is now compelling evidence to support the use of chromatin conformation analysis methods to discover novel genes that can be affected by disease-associated variants. Here, we will review the studies published in the field that have given us a novel understanding of gene regulation in the context of functional genomics and how this relates to the study of PsA and its underlying disease mechanism.


Introduction
Psoriatic arthritis (PsA) is a chronic autoimmune disease with a high disease burden [1][2][3][4], with major impacts on patients' quality of life and substantial economic and social costs to society. It is characterized by a combination of psoriasis and arthritis. The disease is clinically heterogeneous, causing it to be frequently confused with similar diseases such as other forms of arthritis or psoriasis [5,6]. The biological mechanism behind the disease is not well understood, but it is known that PsA has a strong, complex genetic component with a very high genetic heritability [7,8]. Family studies have shown that people with a first-degree relative affected
Rheumatology key messages
• It is challenging to translate GWAS results into patient benefit.
• Functional genomics can elucidate the role of GWAS variants in disease.
• Understanding the function of GWAS variants can lead to the discovery of new treatments.
Genome-wide association studies (GWAS) have identified a significant part of the genetic factors that lead to the disease [14][15][16][17][18][19] (Table 1). The main association is with HLA class I genes, which had already been discovered by earlier family studies [21]. A meta-analysis published in 2015 subsequently identified the loci that distinguish PsA from cutaneous psoriasis [14]. For example, PsA lacked the associations with TNFRSF9 and LCE3C/B that are present in psoriasis and has different HLA-C associations. Importantly, the authors found that, in the IL23R and TNFAIP3 loci, PsA had independent signals compared with psoriasis, suggesting a different mechanism acting in the same loci. Nevertheless, the majority of the loci still overlap with psoriasis. In the same year, a larger study with almost 2000 PsA patients was published [19]. This study used the custom genotyping chip Immunochip, which targeted specific autoimmune-associated loci with a much higher resolution, and identified many genome-wide significant loci, such as the 5q31 locus that is specific to PsA (Table 1).

Linking variants to function
Despite the success of GWAS studies in identifying the genetic variants that are linked to PsA, understanding how the associated genetic variants affect the underlying biological mechanisms is not straightforward. Only a small proportion of the variants associated with complex traits identified by GWAS affect the coding sequence of proteins. Farh et al. produced an important study in this field, mapping GWAS signals from 21 autoimmune diseases and estimating that 90% of them affected noncoding regulatory regions, with the majority (60%) affecting enhancer regions in immune-related cells [22]. This makes it difficult to understand the disruptive effect of disease-associated variants, because many of these regulatory elements can act at long range through chromatin interaction mechanisms [23][24][25][26][27][28][29][30]. Other mechanisms are also possible in a minority of the loci, such as variants affecting long non-coding RNAs (lncRNA) and microRNAs (miRNA). For example, in rheumatoid arthritis there have been reports of genetic variants affecting the miRNA miR-146a [31] and the lncRNA C5T1lncRNA [32].
The simplest method to link variants to functional effect and their target genes is correlating the genetic makeup of different individuals with the expression levels of genes in a specific tissue or cell type (expression quantitative trait loci or eQTL) (Fig. 1B). The most comprehensive study in this regard is the Genotype-Tissue Expression (GTEx) project, which analysed a large number of tissues from recently deceased people and correlated RNA expression levels with their genotype [33]. More specific studies with larger sample sizes were done with whole blood [34][35][36] and other immune cells [37][38][39], discovering tens of thousands of genetic variants that regulate gene expression. Although there are several limitations, such as the need for large sample sizes to draw statistically significant conclusions and the high cost, eQTL studies have been fundamental in describing many principles of gene regulation, including:
• Some eQTLs are stimulation responsive and cell-type specific [38,40-44]. For example, Schmiedel et al.

FIG. 1
Using functional genomics to describe GWAS loci. (A) A typical GWAS locus usually consists of many variants in high linkage disequilibrium, frequently far away from any genes, which can make the interpretation of the association challenging. (B) It is possible to use a combination of functional genomics techniques to study these loci, such as: chromatin activity to identify which SNPs are functionally relevant and in which cell types; eQTLs to correlate genotype with changes in gene expression; and chromatin conformation to identify regulatory domains that determine which genes can be affected. (C) These methods combined with others allow us to identify the functional importance of GWAS associations in the disease.
autoimmune diseases. Because eQTLs are context specific, some studies have used cells derived from patients to discover specific links with disease-associated loci. For example, Thalayasingam et al. mapped eQTLs in CD4+ T cells and B cells from 344 patients with untreated RA, identifying a number of candidate genes linked to variants associated with RA [45]. In PsA, Bowes et al. mapped their novel disease-associated loci using cell-type specific eQTLs from CD4+ and CD8+ primary T cells [19]. Another study mapped eQTLs in skin tissues from psoriasis patients and looked at overlap with psoriasis GWAS results, finding significant enrichment of psoriasis GWAS SNPs in their eQTL dataset, with effects on the expression of genes such as FUT2, RPS26 and ERAP2 [46]. Immune cells are of particular interest in autoimmune diseases and a large number of disease-associated variants have been found to overlap eQTLs in these cells [38,47]. A summary of eQTLs overlapping disease-associated variants in PsA is reported in Table 2.
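As an illustration of the eQTL principle described above, the following minimal sketch regresses the expression of one gene on the genotype dosage (0, 1 or 2 copies of the alternative allele) at one variant. The function name `eqtl_association` and the toy values are assumptions made for this example only; published eQTL studies additionally adjust for covariates, population structure and multiple testing, none of which is shown here.

```python
from math import sqrt


def eqtl_association(dosages, expression):
    """Return (slope, pearson_r) for expression regressed on genotype dosage.

    dosages:    allele counts per individual (0, 1 or 2)
    expression: expression level of one gene in the same individuals
    """
    n = len(dosages)
    if n != len(expression) or n < 3:
        raise ValueError("need matched genotype and expression values (n >= 3)")

    mean_g = sum(dosages) / n
    mean_e = sum(expression) / n

    # Sums of squares and cross-products around the means
    sxx = sum((g - mean_g) ** 2 for g in dosages)
    syy = sum((e - mean_e) ** 2 for e in expression)
    sxy = sum((g - mean_g) * (e - mean_e) for g, e in zip(dosages, expression))

    if sxx == 0 or syy == 0:
        raise ValueError("genotype or expression has no variance")

    slope = sxy / sxx            # change in expression per extra allele copy
    r = sxy / sqrt(sxx * syy)    # strength of the linear association
    return slope, r


# Toy data (hypothetical): six individuals, two per genotype class
toy_dosages = [0, 0, 1, 1, 2, 2]
toy_expression = [5.0, 5.2, 6.1, 5.9, 7.0, 7.2]

slope, r = eqtl_association(toy_dosages, toy_expression)
print(f"effect per allele: {slope:.2f}, correlation: {r:.3f}")
```

A strong association of this kind still only indicates correlation between genotype and expression in the sampled tissue; it does not by itself identify the causal variant among those in linkage disequilibrium.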
However, eQTL studies have failed to capture all GWAS loci and, although GWAS SNPs are significantly enriched in eQTLs [34,38,44,48], only 20-50% of GWAS SNPs overlap with an eQTL. Moreover, a major drawback of eQTLs is that they demonstrate only correlation, not causation, and are biased towards large effect sizes.
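The notion of enrichment used above can be made concrete with a hypergeometric tail probability: if a set of GWAS SNPs were drawn at random from all tested SNPs, how likely would it be to see at least the observed number of eQTL overlaps? The sketch below is one simple way to compute this, not the method used in the cited studies, which typically also match on allele frequency and linkage disequilibrium. The function name and all counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment(n_total, n_eqtl, n_gwas, n_overlap):
    """P(overlap >= n_overlap) under random sampling without replacement.

    n_total:   number of SNPs tested overall
    n_eqtl:    number of those SNPs that are eQTLs
    n_gwas:    number of GWAS SNPs examined
    n_overlap: number of GWAS SNPs that are also eQTLs
    """
    upper = min(n_eqtl, n_gwas)
    # Count the ways of drawing exactly k eQTL SNPs, for every k >= n_overlap
    tail = sum(
        comb(n_eqtl, k) * comb(n_total - n_eqtl, n_gwas - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / comb(n_total, n_gwas)


# Hypothetical counts: 10% of tested SNPs are eQTLs, yet 8 of 20 GWAS SNPs (40%) overlap
p_value = hypergeom_enrichment(n_total=1000, n_eqtl=100, n_gwas=20, n_overlap=8)
print(f"overlap fraction: {8 / 20:.0%}, enrichment p-value: {p_value:.2e}")
```

In this toy setting the overlap is clearly enriched, yet 60% of the GWAS SNPs remain unexplained by an eQTL, which mirrors the gap described in the text.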
An alternative to eQTL analysis is to functionally describe the mechanisms by which variants can affect genes; for example, using chromatin conformation techniques.
Using chromatin conformation methods to map target genes
As stated previously, the majority of disease-associated variants are predicted to affect regulatory regions such as enhancers [22]. Enhancers are regulatory elements that are bound by transcription factors and have long been known to regulate genes by long-range effects [29,54-56]. These elements were initially identified from viral sequences and were found to 'enhance' transcription of nearby elements [57]. Identifying enhancers is challenging due to the lack of accurate computational prediction methods, and due to the fact that they are very context and cell-type specific [40,41,58]. Identification of enhancers is frequently done by probing the presence of histone tail modifications or the presence of bound transcription factors [58,59]. Tools such as RegulomeDB [60] and HaploReg [61] have annotated all known SNPs with known functional elements in a variety of cell lineages and produce a score based on the likelihood that a particular SNP affects a functional element.
Nevertheless, identifying the targets of enhancers has been challenging. Although it was established very early on that enhancers regulate genes at a distance, it was not immediately clear how this activity was mediated. With the development of chromatin conformation techniques, it was demonstrated that enhancers interact physically with their target promoters and that these interactions are cell-type specific. A summary of chromatin conformation techniques is presented in Table 3.
With the development of more advanced techniques and higher resolution Hi-C data, it was shown that the contact domains and loops are highly variable and cell-type specific [68][69][70]. Other studies have shown that it is possible to reconstruct the lineage of primary blood cell types by interactions alone, and that these interactions are highly cell-type specific [23] and change with differentiation [71,72] and activation [73]. Moreover, they found that the number or intensity of interactions with active enhancers correlated with expression levels of genes [23,73,74]. We now know that multiple enhancers can interact simultaneously with a single promoter [75] and that a single enhancer can regulate multiple genes at the same time within the same chromatin domain [76]. Live imaging studies have also shown that interactions are highly dynamic and transient [76,77]. Recent evidence gathered using ChIP-Seq and RNA-Seq data from a large set of genotyped cell lines has shown that regulatory activity is highly associated within well-delimited cis-regulatory domains. These domains respect many features found in chromatin conformation, such as topologically associating domains (TADs), interaction intensity and compartments [78]. Moreover, the activity and structure of these cis-regulatory domains are partly determined by the genetic makeup of individuals [78]. Similarly, a study has shown that genetic variants are associated with differences in chromatin conformation [79]. These studies have also begun to link disease-associated variants to specific interacting genes and have shown that it is possible to use this information to prioritize genetic targets from non-coding GWAS SNPs. Importantly, this list of variant-gene interactions rediscovers 25% of previously reported eQTLs [23].
Because the 3D chromatin conformation and enhancer-promoter interactome is highly context specific, with many enhancers regulating different genes in different tissues, many studies have interrogated the enhancer-promoter interactome in disease-specific tissues such as cardiomyocytes [80,81] for cardiovascular diseases and pancreatic islets [82,83] for diabetes, and have shown that disease-associated loci are significantly enriched in regulatory regions active in those tissues. In their recent publication, Montefiori et al. linked 1999 cardiovascular disease-associated SNPs to 347 target genes in human-induced pluripotent stem cell derived cardiomyocytes, remarkably showing that 90% of variants did not target the nearest gene [80]. More recently, a group has shown that genetic variants associated with Type 1 Diabetes alter the chromatin conformation at disease-associated loci in a mouse model [84].
Other studies used region CHi-C [24,25,28,65,85,86], targeting specific disease-associated loci to better identify causal genes that interacted with non-coding regulatory elements that could be disrupted by the variants. To date, only one study has investigated PsA-associated loci [24]; in two cell lines (B and T cells) 116 regions associated with autoimmune diseases including PsA were found to interact with at least one gene promoter. For example, the locus 6q23, which contains variants linked to PsA, RA, SLE, celiac disease, T1D, IBD and Ps, was previously assigned to TNFAIP3. In their work, McGovern et al. showed that this region also interacts with IL20RA, as well as showing a significant eQTL effect [25]. Another variant associated with PsA within the DENND1B gene was linked to PTPRC, a gene previously shown to be involved in RA [24]. A summary of the results from this paper for PsA-associated loci is available in Table 4. Although previous studies have shown strong evidence for the use of disease-relevant tissues, no study to date has used tissues derived from PsA patients.
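The logic of assigning variants to genes from capture Hi-C data can be sketched as an interval lookup: a variant is linked to a gene when it falls inside a restriction fragment that was found to interact with that gene's promoter. The sketch below assumes a simplified, hypothetical interaction table of (chromosome, fragment start, fragment end, promoter gene) records; the variant names, gene names and coordinates are placeholders, and real pipelines also score interaction significance and account for linkage disequilibrium.

```python
from collections import defaultdict


def genes_for_variants(variants, interactions):
    """Map each variant to the genes whose promoters contact its fragment.

    variants:     dict of variant name -> (chromosome, position)
    interactions: list of (chromosome, frag_start, frag_end, promoter_gene),
                  one record per significant fragment-promoter contact
                  (half-open interval: frag_start <= position < frag_end)
    """
    hits = defaultdict(set)
    for name, (chrom, pos) in variants.items():
        for i_chrom, start, end, gene in interactions:
            if chrom == i_chrom and start <= pos < end:
                hits[name].add(gene)
    # Variants without any contact are kept, with an empty gene list
    return {name: sorted(hits[name]) for name in variants}


# Placeholder data, for illustration only
toy_variants = {
    "snp_A": ("chr1", 1500),   # inside two interacting fragments
    "snp_B": ("chr1", 9000),   # outside every fragment
}
toy_interactions = [
    ("chr1", 1000, 2000, "GENE_1"),
    ("chr1", 1200, 1800, "GENE_2"),
    ("chr1", 8000, 8500, "GENE_3"),
]

for snp, genes in genes_for_variants(toy_variants, toy_interactions).items():
    print(snp, "->", ", ".join(genes) if genes else "no promoter contact")
```

Because the lookup depends only on physical contacts, a variant can be linked to several genes, or to a gene that is not the nearest one, which is the pattern reported for loci such as 6q23.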
Experimental validation is required to confirm the effect of the putative enhancers on the identified genes. Currently available methods include eQTL analysis, measuring the effect that deletions of the enhancers can have on the expression of interacting target genes, and other gene editing (CRISPR) derived methods that use fusion proteins to specifically activate or repress enhancers [41,87,88]. In a recent study, Mumbach et al. used a deactivated Cas9 protein (dCas9) fused with a KRAB domain that functions as a repressor (CRISPRi) to target three enhancers and showed that this reduced transcription from the interacting genes [27].
Another limitation that is often present in chromatin conformation studies is the separation of functional annotation and chromatin conformation analysis. Most use publicly available annotation [58,59] to annotate results and the majority use either cell lines or primary cells from healthy donors instead of patient samples. This can lead to missing important regulatory elements that could be disease specific. Nevertheless, combining different forms of functional genomics studies has the potential to translate the results from GWAS studies into understanding of the mechanisms of disease (Fig. 1).

TABLE 3
3C: First technique developed, on which future techniques were based. The chromatin is first digested with enzymes and then re-ligated such that interacting regions are ligated together. The resulting products are analysed by quantitative polymerase chain reaction (qPCR) to quantify the frequency of interactions. Scope: one to one.
4C [63]: Same as 3C, but resulting products are analysed by microarray to test the interactions originating from one region with the rest of the genome. Scope: one to all.
Hi-C [64]: Same as 3C, but the resulting products are fragmented and sequenced. This produces the most comprehensive analysis genome wide, but requires significant sequencing efforts to map all possible interactions across the whole genome. Scope: all to all.
Capture Hi-C [65]: Same as Hi-C, but the library is first enriched for specific regions to focus the sequencing efforts on regions of interest such as promoters or disease-associated loci. Scope: many to all.
ChIA-PET and HiChIP [26,27,66,67]: Same as Hi-C, but the library is enriched using a chromatin immunoprecipitation step; for example, markers of active regions of the genome. HiChIP is similar to ChIA-PET but provides significant improvements over it. Scope: many to many/all.

Impact on drug discovery
Treatment options for PsA and other autoimmune diseases are limited and often not effective for all patients.
In particular, most current treatment options are composed of broad-spectrum anti-inflammatory drugs or target very few pathways (Table 5). This is a result of the poor understanding of the mechanisms and pathways involved in the disease and the high failure rate in drug development. Currently, only 10% of drugs that start clinical trials reach patients, with >50% failing at late stages [90,91], primarily due to inadequate efficacy [90]. This has led to extremely high drug development costs and to stagnation of new development due to the high economic risk.
Recently, the genetic dissection of complex diseases has brought a new wave of optimism. Contrary to traditional methods of target identification, which use phenotypic data that are subject to confounding and environmental effects, genetic susceptibility factors are stable biomarkers that can provide clues to causality as well as providing information about pathways that are perturbed in disease and therefore could be targets for therapy. Moreover, GWAS studies have been designed from the ground up to obtain high statistical confidence thanks to factors such as adequate correction for multiple testing and appropriate sample sizes. This has resulted in high reproducibility of results [92], which is in contrast to the current medical research trend [93,94].
About 22% of the protein-coding genes are druggable by conventional drugs [95], and the percentage could become even higher as methods based on RNA inhibition are developed. Moreover, repurposing of available drugs can significantly speed the path to patient benefit because they have already been safety tested and chemically characterized.
Different studies have begun to exploit GWAS results to produce a new list of possible drug targets for coronary artery disease (CAD) [96], Parkinson's disease [97] and RA [98], often rediscovering most of the drugs currently in use for treating these disorders. Promising results have been already obtained with a number of diseases. In PsA, for example, the identification of genetic associations in the IL-23 pathway provided genetic evidence for the repositioning of biologic drugs targeting components of this pathway in PsA [99]. Another inhibitor of IL-17A originally developed for Ps, RA and uveitis has been repurposed for use in ankylosing spondylitis [100].
Most studies to date have not used functional genomics to link variants to candidate genes, relying often on empirical methods such as choosing the closest or overlapping genes. As explained previously, this can often lead to incorrect conclusions about the functional gene and, more frequently, to missing genes that could potentially have been targeted. This is starting to change with the development of new methods that utilize functional annotation and chromatin conformation to discover genes and pathways that can be targeted in disease. A recent publication from Martin et al. has used CHi-C data to identify potential drug targets in RA, JIA and PsA [101]. Using their approach, they have rediscovered 48 known drug targets and 87 potential new drugs for PsA. Another recent publication has developed a new way to prioritize target genes using a network connectivity metric [102] and, by analysing genetic and functional data from 30 other immune traits, rediscovered many known targets and predicted activity in high-throughput screens.

TABLE 4
[...]: COQ10A, BAZ2A, ANKRD52, NABP2, ORMDL2, ZC3H10, SMARCC2, RP11, SLC39A5, CS, RNF41, PAN2, PA2G4, RN7SL770P, ESYT1, GLS2, MIP, TIMELESS, SPRYD4, SARNP
rs33980500: RP11, REV3L, KIAA1919, TUBE1
rs4921482: ADRA1B, GAPDHP40
rs76956521: ANXA6
Data from CD4 T cells and B cells obtained from Martin et al. [24].

Conclusion
Over the last decade, genetic studies have identified a significant number of loci associated with PsA that have greatly improved our understanding of the disease mechanism. However, PsA has a low prevalence, is clinically heterogeneous and is difficult to distinguish from psoriasis. For this reason, it has been understudied compared with other diseases such as RA, T2D and IBD, for which a great number of new loci and many more potential drug targets have been identified. Moreover, there are still many obstacles to overcome to realize the full potential of functional genomics, such as limitations in current techniques and analysis methods, but recent discoveries, especially in the field of basic biology (such as chromatin conformation regulation), are likely to provide a new wave of functional targets (or discoveries) for complex diseases.
Thanks to these, we will gain a better understanding of the underlying mechanism of the disease and of the cell types actively involved in disease progression, and will discover novel drug targets that can expand our repertoire of tools to combat PsA. A better understanding of the disease and its genetics will also aid in the stratification of patients for existing therapies, which is going to be particularly important in a disease that is as varied as PsA.