Toward an elucidation of the molecular genetics of inherited retinal degenerations

Abstract While individually classed as rare diseases, inherited retinal degenerations (IRDs) are the major cause of registered visual handicap in the developed world. Given their hereditary nature, some degree of intergenic heterogeneity was expected, with genes segregating in autosomal dominant, autosomal recessive, X-linked recessive, and more rarely in digenic or mitochondrial modes. Today, it is recognized that IRDs, as a group, represent one of the most genetically diverse of hereditary conditions, with at least 260 genes having been implicated and 70 genes identified in the most common IRD, retinitis pigmentosa (RP). However, targeted sequencing studies of exons from known IRD genes have resulted in the identification of candidate mutations in only approximately 60% of IRD cases. Given recent advances in the development of gene-based medicines, characterization of IRD patient cohorts for known IRD genes and elucidation of the molecular pathologies of disease in the remaining unresolved cases have become endeavors of the highest priority. Here, we provide an outline of progress in this area.


Genetic Heterogeneity in IRDs
Inherited retinal degenerations (IRDs) represent the most frequent cause of visual dysfunction in those of working age, and therefore have a highly significant impact on quality of life and health economics. The more common IRDs include Retinitis Pigmentosa (RP), Choroideremia, Leber congenital amaurosis (LCA), Usher syndrome, Congenital stationary night blindness (CSNB), Vitelliform macular dystrophy, Stargardt Macular Dystrophy, Best disease, Retinoschisis, cone-rod dystrophy and myocilin-based hereditary forms of open angle glaucoma (see www.sph.uth.tmc.edu/retnet/ for details of disorders, causative genes and mutations). Typically, IRDs result in gradual photoreceptor loss and compromised vision, often leading to registered blindness. Patterns of inheritance include autosomal recessive, autosomal dominant and X-linked; however, rarer mitochondrial and digenic forms of retinal dystrophies have also been characterized.
IRD genes and encoded proteins are involved in a wide array of functions. It is of note that mutations in essentially all genes encoding components of the visual transduction and retinoid cycles have been implicated in disease etiology, as have mutations in genes encoding major structural components of photoreceptors (for extensive reviews of gene function see 26,27; Tables 1-3). RP is the most common form of IRD, affecting approximately 1 in 3,000 people (4,5). The disease is characterized by progressive loss of rod photoreceptor function and the demise of rods, followed by the death of the cone photoreceptors; initial symptoms of nyctalopia (night blindness) are usually followed by extensive loss of daytime vision as cones degenerate. Mutations within around 70 disease-causing genes have so far been implicated in RP (www.sph.uth.tmc.edu/retnet/; date last accessed April 25, 2017), while other genes still remain to be identified, indicating that this condition may, perhaps, be better regarded as a cluster of related conditions with similar clinical presentations.

Next-Generation Sequencing of IRD Populations
Future exome and whole genome sequencing, in conjunction with emerging methods to define the function of candidate mutations in regulatory and intronic regions and to better characterize copy number variations (CNVs), all of which have been implicated as causative of some forms of IRD, will undoubtedly advance our understanding of the molecular genetics of IRDs to a new level. The situation for IRDs is further complicated by the diversity of clinical presentations that can be caused by mutations even within a single IRD gene, as well as by overlapping clinical phenotypes that may result from mutations in entirely distinct genes, making it impossible for a diagnosis to be given in the majority of cases on the basis of disease phenotype alone (17,28-30). By way of example, in a recent NGS study of an Irish IRD patient cohort, new disease phenotypes were associated with the GNAT1 and SLC24A1 genes (28). In both cases a severe homozygous mutation in a known congenital stationary night blindness (CSNB) gene caused a late-onset form of RP involving photoreceptor loss (Fig. 1). Furthermore, IRD phenotypes can be influenced by modifier loci (31-33). Given the limitations associated with the clinical diagnosis of specific forms of IRD, it is no surprise that the majority of patients with IRDs do not know the mutation(s) responsible for their condition. Recent advances in the development of gene-specific therapies for IRDs have made this issue all the more pressing, since these offer, for the first time, the prospect of modulating, halting or reversing the degeneration associated with some of these conditions (34). However, such therapies cannot be administered without precise knowledge of the underlying mutation to be treated.
High throughput NGS technologies offer an opportunity to rapidly characterize causative mutations in the growing number of IRD genes that have now been identified. These technologies have greatly accelerated DNA sequencing by providing a means of simultaneously sequencing many small fragments of DNA from different regions of the genome in a single reaction, and moreover by barcoding patient DNA to enable sequencing of samples from multiple patients in parallel. The short sequence reads from such fragments are then reconstructed by comparing the sequence to a reference genome, hg38 being currently the reference genome of choice for many such studies (hg38; GCA_000001405.15). NGS can be undertaken on the whole genome, or limited to expressed (coding) sequences (whole exome), or targeted to particular regions of the genome (employing target capture panel NGS), where genes of interest are, in essence, captured and sequenced. This form of sequencing has advanced at an exponential pace, with the cost per megabase of sequence halving regularly (35) and set to reduce further with the introduction of the new NovaSeq series from Illumina, among other innovations. IRDs have an ideal profile for NGS studies (14,18,36), representing a group of Mendelian conditions where many (to date approx. 260 genes), but not all of the causative genes, have been characterized. Given this scenario, NGS studies enable identification of known or novel mutations in known genes and the stratification of IRD patient cohorts into those whose genetic pathogenesis has been resolved, and those for whom new genes, or variants in regulatory sequences, splice variants or CNVs associated with disease genes, may be causative of disease and therefore additional analyses deploying whole exome or whole genome sequencing would be appropriate.
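The barcoding step described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each read begins with a short inline barcode identifying the patient sample; the barcode sequences, sample names and the function name demultiplex are hypothetical, and sequencing platforms typically handle this step with their own index reads and software.

```python
from collections import defaultdict

# Hypothetical sample barcodes; real kits define their own index sequences.
BARCODES = {"ACGT": "patient_A", "TGCA": "patient_B"}

def demultiplex(reads, barcodes=BARCODES):
    """Group reads by the barcode found at the start of each read.

    The barcode prefix is trimmed off; reads with no exact barcode match
    are collected under the key "undetermined".
    """
    bins = defaultdict(list)
    for read in reads:
        for barcode, sample in barcodes.items():
            if read.startswith(barcode):
                bins[sample].append(read[len(barcode):])
                break
        else:
            # No barcode matched this read.
            bins["undetermined"].append(read)
    return dict(bins)

# Pooled reads from a single run, to be assigned back to each patient.
pooled = ["ACGTGGGTTT", "TGCACCCAAA", "ACGTTTTGGG", "NNNNACGTAC"]
result = demultiplex(pooled)
```

After this step, each patient's reads would be aligned separately to the reference genome.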
A variety of NGS studies, many employing a target panel-based NGS approach focused on sequencing the exons of known IRD genes, have been undertaken (10-25; Table 1). These studies support the view that IRDs represent an exemplar group of disorders for the application of panel-based NGS or WES as effective tools for detection of causative mutations. From such studies, some common patterns in findings have been observed, as have some unique findings which, at present, remain specific to individual studies. Of note, using target-capture based NGS, approximately 50-60% of IRD patients were found to carry disease causing or likely disease causing mutations in many studies. Given additional family studies indicating segregation of the disease, and replication of the findings in a certified diagnostic laboratory, these patients would be categorised as 'resolved' in terms of the genetic etiology of the retinal pathology. Among recent examples of capture-panel NGS for IRDs are a study of 537 IRD patients, in which the variants identified were deemed to account, or likely account, for the disease in 51% of cases, and a recent study of Italian IRD patients, in which 59% of cases were resolved, similar to the identification levels obtained in many other NGS studies (17,19; Tables 1-3). Detection rates were significantly affected by the range of conditions under study, the number of genes included in the capture panel, as well as whether the rate was corrected based on previous screening of the same population, making precise comparisons of detection rates between studies challenging. Nonetheless, there is a consistent finding that 25-50% of cases were not solved by targeted sequencing or WES. It is of interest that studies employing WES, rather than capture-panel NGS, have resulted in the identification of the underlying  Fig. 2). 
However, even within such comparatively homogenous forms of IRD, the level of intragenic mutational heterogeneity present is still being characterised. For example, in a recent study in which 148 pathogenic or likely pathogenic mutations in the ABCA4 gene were identified, about a third of these (n = 48) represented new Stargardt associated disease alleles (37).

Deciphering the Genetic Pathogenesis of Unresolved IRDs
It is evident from this body of research that currently the genetic pathogenesis of IRDs remains unresolved in 40% of cases (10-25; Tables 1-3). IRD mutation detection may remain elusive for a variety of reasons, among them inadequate sequence coverage, inappropriate filtering of data (and loss of the pathogenic variant), variants in intronic sequences which affect splicing (such sequences do not form part of most capture panels), structural variants such as inversions and duplications, which can be difficult to resolve with NGS, regulatory variants, and new, as yet uncharacterized, IRD genes that therefore were not included in capture panels. However, from some NGS studies there are indicators that support the hypothesis that, within the known IRD genes included in many target capture panels, additional significant levels of pathological genetic variants are present but as yet remain undetected. For example, it has been found that there can be a preponderance of patients with one causative mutation in a recessive IRD gene; such over-representation of heterozygotes compared to a control population has been observed in a number of NGS studies, including our own study of the Irish IRD population, for example, for the ABCA4 gene causative of Stargardt disease. The majority of clinically defined Irish Stargardt cases who did not have two ABCA4 mutations in the coding sequence had a single ABCA4 mutation, and this was highly significantly enriched compared to the frequency of ABCA4 mutant heterozygotes in a control Irish population (Carrigan M, unpublished data). Recent studies have focused on revealing the genetic variants underlying such observations. Various tools have been developed to detect copy number variations (CNVs) in NGS datasets and have been employed to successfully detect CNVs in IRD patient cohorts (38-40). Indeed, CNVs were identified in a recent WES study in approx. 10% of the 60 IRD patients assessed by exon coverage data analysis and confirmed by PCR (39). Alternative methods for CNV detection that previously have been employed extensively in diagnostic laboratories for disorders other than IRDs include quantitative PCR (qPCR), multiplex ligation-dependent probe amplification (MLPA) and comparative genomic hybridization. Using the latter, Van Cauwenbergh and colleagues developed a custom microarray (arrEYE) with coding and noncoding sequence from 166 known and candidate IRD genes and 196 noncoding RNAs for CNV detection in IRD patients (40); the CNV detection rate obtained with arrEYE from a first study of 57 IRD patients was 3.5%. In a recent study, 18% of unresolved cases (n = 28) were resolved by CNV mapping (38). It is clear that CNVs may represent a significant contributor to the unresolved cases of IRDs and that methods of CNV analysis, both computational and experimental, should, where possible, be included in future IRD studies, which has not always been the case thus far. In addition to CNVs, a proportion of the remaining unresolved IRD cases may be caused by variants that affect RNA splicing and thereby contribute to disease. Mutations that may have an impact on pre-mRNA splicing can be predicted using various in silico tools such as Human Splicing Finder (www.umd.be/HSF/; date last accessed April 25, 2017) among others, and can be confirmed by transcript analysis. RNA sequencing technologies are greatly enabling high throughput transcriptomics and undoubtedly will be applied to a greater extent to explore splice variants in IRDs in the future. A transcriptomic approach to assess the effects of IRD mutations on splicing may in principle be readily adopted for ubiquitously expressed IRD genes, and for tissue specific IRD genes, using iPS cells differentiated into appropriate cell lineages as per the example above. 
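The principle behind CNV detection from exon coverage data can be sketched as follows. This is a simplified illustration under stated assumptions, not the method used in the cited studies: read depths are invented, the function name copy_ratio_calls is hypothetical, and the thresholds of 0.7 and 1.3 are illustrative rather than validated cut-offs. Each sample is normalised to its own median depth, and each exon is then compared with the median of the control samples; a heterozygous deletion would be expected to give a ratio near 0.5.

```python
from statistics import median

def copy_ratio_calls(patient, controls, loss=0.7, gain=1.3):
    """Flag exons whose normalised read depth departs from the controls.

    patient:  {exon: read depth} for one sample
    controls: list of such dicts for reference samples
    loss/gain are illustrative thresholds, not validated cut-offs.
    """
    def normalise(depths):
        # Divide by the sample's median depth to correct for sequencing yield.
        m = median(depths.values())
        return {exon: d / m for exon, d in depths.items()}

    norm_patient = normalise(patient)
    norm_controls = [normalise(c) for c in controls]
    calls = {}
    for exon, value in norm_patient.items():
        # Expected normalised depth for this exon across the controls.
        reference = median(c[exon] for c in norm_controls)
        ratio = value / reference
        if ratio < loss:
            calls[exon] = "possible deletion"
        elif ratio > gain:
            calls[exon] = "possible duplication"
    return calls

# Invented read depths for four exons in three controls and one patient.
controls = [
    {"ex1": 100, "ex2": 110, "ex3": 90, "ex4": 100},
    {"ex1": 200, "ex2": 220, "ex3": 180, "ex4": 200},
    {"ex1": 50, "ex2": 55, "ex3": 45, "ex4": 50},
]
patient = {"ex1": 100, "ex2": 55, "ex3": 90, "ex4": 100}
calls = copy_ratio_calls(patient, controls)
```

In this toy example only ex2 is flagged, with a ratio of roughly 0.53. Any such call would still require experimental confirmation, for example by PCR, qPCR or MLPA as noted above.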
One approach to identify intronic variants with effects on RNA splicing in IRD patients involves deep intronic sequencing. For example, in a recent study of the USH2A gene, the most frequent cause of Usher syndrome type II (USH2) involving RP and sensorineural hearing loss, an analysis of the whole 800 kb of the USH2A gene was prompted by the identification of patients with a USH2 phenotype and a single exonic mutation in the USH2A gene. Deep sequencing of intronic sequences enabled the identification of candidate splice mutations, and subsequent functional validation of these mutations using reporter minigene assays led to the resolution of 3 out of 5 USH2 patients with a single exonic mutation (41). In another study, the functional effect of a commonly observed mutation in the ABCA4 gene (the c.5461-10T>C variant) causative of Stargardt disease was analysed using patient-derived fibroblasts reprogrammed into induced pluripotent stem (iPS) cells and then differentiated into photoreceptor progenitor cells. This ABCA4 variant was found to induce skipping of exon 39 or exon 39 and 40 in the mature transcript, again using a minigene assay (42). Recently, homozygosity mapping and whole genome sequencing revealed a variant deep in intron 18 of the PROM1 gene causative of a recessive cone-rod dystrophy, resulting in the inclusion of a pseudo-exon in the mutant transcript, which was functionally validated again using a minigene assay (43). Detection of pathogenic IRD intronic variants will elude many of the current capture panel NGS strategies for IRDs, which are focused solely on exons; hence the extension of screening studies in the future to include intronic sequences will aid in elucidating what proportion of IRD cases involve aberrant splicing. 
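Why exon-focused capture panels can miss intronic variants may be illustrated with a short sketch. It assumes variants written as simple HGVS coding substitutions, in which an offset such as -10 in c.5461-10T>C gives the distance into the intron from the nearest exonic base. The flank parameter is a hypothetical capture margin around each exon, the variant c.100+1500A>G used in the test is invented, and other HGVS forms are deliberately not handled.

```python
import re

# Matches simple substitutions such as c.5461-10T>C (intronic, offset -10)
# or c.100A>G (exonic, no offset). Other HGVS forms are not handled here.
HGVS_SUBSTITUTION = re.compile(r"^c\.(\d+)([+-]\d+)?([ACGT])>([ACGT])$")

def intronic_offset(variant):
    """Return the signed distance into the intron, or 0 for an exonic change."""
    match = HGVS_SUBSTITUTION.match(variant)
    if match is None:
        raise ValueError(f"unsupported variant description: {variant}")
    offset = match.group(2)
    return int(offset) if offset else 0

def covered_by_panel(variant, flank=20):
    """True if the variant lies in an exon or within `flank` bases of one.

    `flank` is a hypothetical capture margin; actual panels differ.
    """
    return abs(intronic_offset(variant)) <= flank
```

Under these assumptions, a near-exon variant such as c.5461-10T>C would fall within a 20 base flank, whereas a deep intronic variant hundreds or thousands of bases from the exon would not be captured and would require intronic or whole genome sequencing.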
In addition to CNVs and splice variants as a source of as yet 'unresolved' pathogenic IRD mutations, variants in regulatory sequences, such as promoter sequences or miRNAs and their associated target sites, may also be implicated in some forms of IRD. Indeed, a mutation in the seed region of microRNA-204 has been found to segregate in a family with autosomal dominantly inherited retinal dystrophy and bilateral coloboma (44), a 5'UTR sequence in the NMNAT1 gene has been implicated in a form of LCA (45), and recently a single base mutation in the promoter region of the CHM gene has been implicated as causative of choroideremia (46), all highlighting the potential role of regulatory mutations in some forms of IRD. The implementation of more extensive NGS strategies in the future will aid in characterising such IRD regulatory mutations.
In conclusion, while significant advances in high throughput genomic and transcriptomic NGS have greatly facilitated an elucidation of much of the genetic architecture of IRD patient cohorts, substantial levels of unresolved IRD cases still remain. Recently it has become evident that a significant proportion of these will be accounted for by CNVs, splicing defects and regulatory variants. It still remains to be established what proportion of these unresolved cases will be caused by new, as yet uncharacterised, retinal disease genes.

A Diagnostic Imperative Driven by Developments in Gene Therapy
There is an obvious rationale for pursuing such studies to their conclusion, in that gene therapies for a growing number of ocular disorders are now in clinical trial (47-51). To date, approximately 300 IRD patients have been treated with ocular gene therapies (www.clinicaltrials.gov). In some of these trials, a continued retinal degeneration has been observed (52) and the parameters determining this still need to be elucidated. It is of note, however, that an adeno-associated virus (AAV)-RPE65 therapy (voretigene neparvovec) has successfully progressed through to Phase III clinical trial (CHOP/Spark Therapeutics; www.sparktx.com), as has an AAV-ND4 therapy for Leber hereditary optic neuropathy (LHON) (51). Data from many ocular gene therapy trials support the view that intraocular delivery of AAV is well tolerated in the human eye and hence represents a safe platform for gene delivery (48-51). A large number of gene therapies employing AAV vectors are also at the stage of preclinical evaluation in animal IRD models, in which benefit has been demonstrated (51,53-57). Thus far many gene therapies have been directed towards recessive forms of IRD, although strategies such as suppression and replacement or genome editing are being considered to address the 30% of IRDs that are dominantly inherited (53,58). The efficacy of the therapeutic approaches will be determined in part by features specific to individual IRDs. For example, the target cell type within the retina, whether photoreceptors, RPE cells or retinal ganglion cells (51,53-57), among others, will influence vector choice and route of administration. Strategies such as directed evolution are also expediting the generation of AAV serotypes with a predilection for specific retinal cell types (59,60). 
Additionally, the severity of disease and whether a relatively intact target cell population is retained over years, thereby providing a significant timeframe for therapeutic intervention, will also greatly influence therapeutic efficacy. For severe degenerative IRDs, where few or no photoreceptors remain, optogenetic therapies may provide an alternative option by enabling remaining cells to become photosensitive (61,62). While AAV has shown substantial promise for a number of ocular indications, preclinical and clinical studies have also been undertaken with a variety of other non-viral and viral vectors, for example, nanoparticles and lentiviral vectors amongst others (63,64). Moreover, methodologies for systemic delivery of potentially therapeutic compounds into the retina by modulation of permeability at the inner blood-retina barrier have been extensively tested in animal systems and have employed AAV vectors (65). Given the rapid pace at which therapies are being developed, it is imperative to establish the genetic architecture of IRDs in different national and ethnic groups, in principle thereby facilitating participation of patients in future human clinical trials or access in the future to marketed gene therapies. In parallel with genotyping of IRDs, it is vital that patients are clinically profiled over time, as this will greatly augment our understanding of the natural histories of these ocular disorders and genotype-phenotype correlations. Knowledge regarding the natural history of a disorder is of direct value to patients and, moreover, will in principle facilitate future patient participation in clinical trials and will aid in informing the choice of appropriate primary and secondary endpoints for clinical trial design. As such, identification of the disease-causing mutations is of immediate clinical relevance to the entire IRD patient population.
Conflict of Interest statement. NC, GJF, PH, PK, AP & SMW are shareholders in Spark Therapeutics.

Funding
Vision research at the Ocular Genetics Unit at TCD is supported by awards from Fighting Blindness Ireland (FB Irl), the Health