Monoallelic expression, including genomic imprinting, X-chromosome inactivation and random monoallelic expression of autosomal genes are epigenetic phenomena. Genes that are expressed in a monoallelic way may be more vulnerable to genetic or epigenetic mutations. Thus, comprehensive exploration of monoallelic expression in human brains may shed light on complex brain disorders. Autism-related disorders are known to be associated with imprinted genes on chromosome 15. However, it is not clear whether other imprinted regions or other types of monoallelic expression are associated with autism spectrum disorder (ASD). Here, we performed a genome-wide survey of allele expression imbalance (AEI) in the human brain using single-nucleotide polymorphisms (SNPs), in 18 individuals with ASD and 15 controls. Individuals with ASD had the most extreme number of monoallelic expressed SNPs in both the autosomes and the X chromosome. In two cases that were studied in detail, the monoallelic expression was confined to specific brain region or cell type. Using these data, we were also able to define the allelic expression status of known imprinted genes in the human brain and to identify abnormal imprinting in an individual with ASD. Lastly, we developed an analysis of individual-level expression, focusing on the difference of each individual from the mean. We found that individuals with ASD had more genes that were up- or down-regulated in an individual-specific manner. We also identified pathways perturbed in specific individuals. These results underline the heterogeneity in gene regulation in ASD, at the level of both allelic and total expression.

## INTRODUCTION

Autism spectrum disorders (ASDs) are defined by behavioral deficits in three core domains, social interactions, communication and stereotyped or repetitive behaviors, but the severity in each domain varies greatly from person to person. The etiology of autism has been puzzling medical researchers for several decades. It is clear that there is a substantial genetic basis for ASD, but for the majority of cases, we still cannot identify the precise genetic risk variants. One of the main reasons why it is so difficult to study the genetic basis of ASD is due to the large genetic heterogeneity. Based on recent genome-wide search for de novo mutations, ASD can be triggered by mutations in hundreds of different genes (14). Furthermore, each risk gene is altered in only a small proportion of cases. Consequently, any two individuals with ASD rarely have mutations in the same gene. This may explain why no unifying structural or neuropathological features have been conclusively identified in the brains of individuals with ASD (5). Despite the fact that the underlying biological causes of ASD are diverse, a recent study found that gene expression in the brains of individuals with ASD shows common patterns that are different from control individuals (6). The changes in gene expression observed in the brains of individuals with ASD may result from genetic variations, epigenetic mutation (without change to the DNA sequence) and developmental processes that are the result of the abnormal behaviors associated with ASD.

The involvement of epigenetic processes in ASD has been previously suggested based on the incomplete concordance between identical twins (7). It is assumed that environmental factors that may increase the risk of ASD may affect the brain thorough epigenetic processes such as changes to methylation patterns that are associated with changes in gene expression (8). Until now most of the efforts to identify risk genes for ASD relied on DNA sequences, whereas epigenetic mutations were less explored. The connection between ASD and epigenetic regulatory mechanisms stems from the identification of genetic mutations in imprinted regions and genes that control epigenetic processes, such as MECP2 and other chromatin regulators (9,10).

One of the most common chromosomal abnormalities in ASD is maternally derived duplications of the imprinted region on chromosome 15q11–13 (11,12). Still it is not clear how many other imprinted regions are associated with autism or other neurodevelopmental disorders. In humans, there are ∼80 protein and RNA genes that are known to be imprinted (based on the Geneimprint database). However, the number of imprinted genes recently became a matter of lively debate, when a study reported more than a thousand loci with a parent-of-origin effect in the mouse brain, and hundreds of genes with a sex-specific effect based on RNA sequencing (RNA-seq) (13,14). A recently published report challenged these results, stressing the importance of validation and the potential pitfalls when discovering allele-specific expression from RNA-seq (15). The genes so far found and validated to be imprinted are expressed in a wide range of tissues but are particularly highly expressed in the placenta and in the brain (16). Furthermore, a recent genome-wide linkage study of >1200 families has demonstrated a parent-of-origin linkage to autism, with strong signals in regions on the paternal copies of chromosomes 4, 15 and 20, as well as suggestive linkage on maternal copies of chromosomes 5, 6, 7 and 9 (17).

Imprinting is not a cause of autism itself, but rather is a normal process which, when disrupted, can increase the risk of developing autism. There are several explanations for the possible involvement of imprinted genes in autism. The imprinted brain theory of autism suggests that autism is a disorder of the extreme imprinted brain (18). Based on this theory, autism is caused by imbalances that involve increased effects of the ‘paternal brain’ relative to the ‘maternal brain’. Imprinting has also been hypothesized to explain the sex difference in autism, through the proposed action of unknown paternally imprinted loci on the X chromosome (19). In addition to these theories, autism has been strongly associated with chromosomal abnormalities in the imprinted region in chromosome 15q (7,20). These include Angelman and Prader–Willi Syndromes, as well the 15q duplication syndrome, which occurs in up to 5% of individuals with ASD (20). Imprinted genes may also contribute to autism indirectly as targets of other genes. Evidence for such a regulatory connection has been observed between MECP2 (the gene associated with Rett syndrome) and the imprinted gene UBE3A (associated with Angelman syndrome) (21). While different possible mechanisms may underlie the association between imprinted genes and autism, imprinted genes are promising candidates for association with autism because of their functional haploid state. This feature may make them extremely vulnerable to rare genetic mutations because the gene may become utterly inactive. Moreover, a single epigenetic change may lead to loss of imprinting, leading to biallelic expression and to gene dysregulation. Both genetic and environmental factors can affect the imprinting process and alter the level of expression of imprinting genes. Thus, epigenetic alterations may be the biological targets through which environmental factors can cause autism. In summary, imprinted genes may be associated with autism because they are involved in brain development and also because they may be more vulnerable to genetic or epigenetic mutations. But we still do not know the full contribution of genomic imprinting to normal brain development and function and how important the role of imprinting is in increasing the risk for autism.

Genomic imprinting is a unique phenomenon wherein genes are expressed in a monoallelic way, and the choice of which allele is expressed is determined by the parental origin of the allele. For most genes the maternally and paternally derived copies of each gene are simultaneously expressed at comparable levels (termed biallelic expression). In most other cases of monoallelic expression, the choice is random in each cell. The best studied example of random monoallelic expression, which is also linked to disease, is X-chromosome inactivation. X-chromosome inactivation involves the silencing of almost the entire X chromosome, early in development, in most female mammals. The choice of which of the two X chromosomes will be inactivated is totally random, and once initiated it is propagated to the daughter cells. Women who carry an X-linked mutation in a heterozygote state have two populations of cells: cells that express the wild-type allele, and cells that express only the mutant allele. Although the expected proportion of cells expressing the mutated gene should be 50%, in many cases there could be skewing of the inactivation owing to positive or negative cell selection. The degree of skewing can modulate the severity of the phenotypes in women who are carriers of X-linked mutations (22). One study reported an excess of skewed X-inactivation in blood cells from individuals with ASD, but this was not replicated by other studies (23,24).

Beside X inactivation and parentally imprinted genes, other autosomal genes have also been shown to be expressed in a monoallelic way. Among these are genes encoding for immunoglobulins, T-cell receptors, interleukins (25), odorant receptors (26) and protocadherin genes (27,28). Furthermore, a genome-wide study for monoallelic expression suggests that monoallelic expression is a widespread phenomenon, especially for genes involved in cell–cell communication in the immune system and genes that are expressed in the brain (29). So far, screening for monoallelic expression has been performed on lymphoblastoid cell lines (LCLs) or stem cells (30) or was focused on a small number of genes (31). We previously performed a screen for monoallelic expression in LCLs from ASD individuals, identified an excess of monoallelic regions in cases and identified a rare duplication that is likely to be the cause of one of the monoallelic expressed regions (32). Monoallelic expression was also found previously to be the result of epigenetic dysregulation in ASD. The allelic expression of the genes that encode the GABAA-receptor subunits has been shown to be monoallelic or strongly imbalanced in samples from ASD but showed balanced expression in unaffected control samples (33).

In this study, we report a genome-wide screen for monoallelic expression in the brains of individuals with ASD and controls. Specifically we searched for abnormalities in different types of epigenetically driven monoallelic expression, including genomic imprinting, X-inactivation and autosomal random monoallelic expression. Usually, random monoallelic expression cannot be detected from the analysis of a mixture of cells, as each cell randomly expresses a different allele, resulting in a pattern that is indistinguishable from biallelic expression. However, if across different cells there is a strong skewing toward one of the alleles, this could be detected. We show that similarly to the genetic findings in ASD, which are very heterogeneous, the brains of individuals with ASD show diverse types of allelic expression abnormalities.

## RESULTS

### Rare events of monoallelic expression in ASD cases and controls

We performed a genome-wide scan of allele expression imbalance (AEI) in brain tissues from the prefrontal cortex (BA9) of 33 individuals, 18 with autism (13 males and 5 females, aged 2–60 years old) and 15 controls (12 males and 3 females, aged 0–64 years), obtained from the Autism Tissue Program (ATP) (Supplementary Material, Table S1). For these individuals, genotypes from both Affymetrix SNP 6.0 arrays and Illumina 1M arrays were available (34). We used a previously described paradigm (29,32,35), in which measurements from single-nucleotide polymorphism (SNP) arrays from both the genomic DNA and the cDNA were compared with discover events of AEI. To accurately identify monoallelic expression at the level of one SNP, we implemented a very strict quality control pipeline (detailed in Supplementary Material, Fig. S1 and Materials and Methods). First, we removed 4 out of the 33 individuals that showed a high rate of discordance between the genomic DNA and the cDNA as well as a high rate of novel alleles appearing in the cDNA (which we termed illogical genotypes). Our final sample of 29 individuals included 6 females and 23 males, among them 15 were cases and 14 were control. At the SNP level, we removed SNPs, which had <3 heterozygous genotypes. Furthermore, we imposed a cutoff for the total signal to be considered as an expressed SNP and removed SNPs in which <80% of the individuals were above this cutoff. Additionally, we excluded any SNPs in which illogical genotypes were found, and SNPs in which the allele ratios in the cDNA did not differ between different genotypes (as was determined based on genomic DNA). Lastly, to prevent false positives owing to erroneous calls in the genomic DNA, we used measurements from Illumina 1M SNP arrays to impute the Affymetrix array genotypes and excluded genotype calls, which were discordant between the array and the imputation. Following these stringent filters, out of 27 993 genes (including many RNA genes) in the UCSC hg19 build, 8050 were measured in our experiment. Each individual on average had 4287 genes with at least 1 informative SNP.

To identify rare events of autosomal monoallelic expression from the allelic expression data, we developed an algorithm that gives a score of AEI for each heterozygote measurement based on the distance from the homozygote clusters. To account for intra-SNP variance, we also calculated a Z-score for each individual based on all heterozygotes. A score of 0.75, indicating a measurement that is three-fourth of the way between the heterozygote and homozygote cluster, and a Z-score of >2 were taken as a threshold. We identified 1133 putative SNPs (2.8% of informative SNPs) under these criteria but filtered out manually distributions not fitting monoallelic expression, resulting in 338 SNPs (Supplementary Material, Table S2). Out of the 338 SNPs, 61 (18%) were outside of genes, and the rest were inside genes. Detection of monoallelic expression was not biased by transcript length, the distance of the SNP from the transcription start site, the expression level and GC content of the gene (Supplementary Material, Fig. S2). Thirty SNPs were chosen randomly for validation using Sanger sequencing (Supplementary Material, Table S3, Fig. 1A). Ten (33%) were found to be monoallelic (>10 allele ratio), eleven (30%) were biallelic with AEI (>1.5 allele ratio) and eight were balanced (27%). The remaining SNP was found to be homozygote in the genomic DNA. We used the DAVID bioinformatics tool (36) to look for functional enrichment in the genes identified with signals in agreement with monoallelic expression, but no specific groups were found. This is concordant with the results of a previous study in mice, which found random monoallelic expression to affect a heterogeneous group of genes with no clear functional enrichment (37). We could not find a significant association between the relative number of AEI events and confounding factors, such as age, tissue and RNA quality (Supplementary Material, Fig. S3A). Reassuringly, the relative number of suspected events also remained fairly constant across samples when we used different thresholds to detect expressed SNPs (Supplementary Material, Fig. S3B and C).

Figure 1.

Rare monoallelic expression in the prefrontal cortex (BA9) and the cerebellum of individuals with autism and controls. (A) Rare monoallelic expression in sample AN13295. The results for SNP rs573890 are visualized by comparing the array signal level from allele B against the signal of allele A, in the genomic DNA (gDNA—left) and the cDNA (right). Color of points corresponds to gDNA genotypes, AN13295 is highlighted in turquoise. The effect observed in the array is validated using Sanger sequencing of gDNA (left) and cDNA (right). (B) Rates of monoallelic expression in different individuals. Specific individuals, in particular AN01570, show a very high load of rare monoallelic expression, whereas others show very few events or none at all. Colors correspond to diagnosis. (C) Comparison of monoallelic events between the cerebellum and BA9. None of the events were shared between cerebellum and BA9 for AN00764. The high rate of monoallelic expression in AN01570 appears to be restricted to BA9 region.

Figure 1.

Rare monoallelic expression in the prefrontal cortex (BA9) and the cerebellum of individuals with autism and controls. (A) Rare monoallelic expression in sample AN13295. The results for SNP rs573890 are visualized by comparing the array signal level from allele B against the signal of allele A, in the genomic DNA (gDNA—left) and the cDNA (right). Color of points corresponds to gDNA genotypes, AN13295 is highlighted in turquoise. The effect observed in the array is validated using Sanger sequencing of gDNA (left) and cDNA (right). (B) Rates of monoallelic expression in different individuals. Specific individuals, in particular AN01570, show a very high load of rare monoallelic expression, whereas others show very few events or none at all. Colors correspond to diagnosis. (C) Comparison of monoallelic events between the cerebellum and BA9. None of the events were shared between cerebellum and BA9 for AN00764. The high rate of monoallelic expression in AN01570 appears to be restricted to BA9 region.

We proceeded to test whether monoallelic expression was enriched in the autism cohort. While in the autism cohort, there were 244 events, and in the control cohort only 94, plotting the distribution of these events across the samples revealed that specific individuals harbored many events, and others few or none at all (Fig. 1B). Correspondingly, on average, individuals with autism did not have significantly more events of AEI than controls (P = 0.27, two-sided Student's t-test). This further suggested that the observed difference does not reflect a global trend. Strikingly, however, one individual with autism, labeled AN01570, harbored 98 events. In the validation attempt using Sanger sequencing, four out of nine (44%) tested SNPs for that sample were monoallelic and an additional three (33%) were imbalanced (2.4–3.5 allele ratios). To test whether this widespread monoallelic expression is specific to BA9, we performed a genome-wide scan for AEI with cerebellar tissue from four individuals, including AN01570, and compared it with the results from BA9. To account for differences in expression between the cerebellum and BA9, which may affect our power to detect events in the cerebellum, we applied stringent filters to include only SNPs that are expressed in similar levels in both tissues. In these four samples, we found 108 signals in both brain regions, which passed the above thresholds and quality control (Supplementary Material, Table S4). In individual AN01570, 69 events remained in BA9, but strikingly, only 7 were found in the cerebellum, 3 of them shared between the cerebellum and BA9 (Fig. 1C).

### Skewed X-inactivation

Skewed X-inactivation is a common feature of disorders with X-linked mutations (38,39). Skewed X-inactivation in ASD was tested before mainly in lymphocytes (38,39). We determined the degree of skewed X-inactivation in the brain of eight female samples, including five cases and three controls. To this end, we calculated the concordance rate between cDNA and genomic DNA for heterozygote genotypes for SNPs in the X chromosome and compared it with the rate in the autosomes (Fig. 2A). We found that the concordance was generally lower in the X chromosome relative to autosomes and was showing a non-significant trend (P = 0.25, Wilcoxon rank-sum test) to be lower in cases compared with control brain samples. Among the brain samples, we focused on one individual with autism (AN03632), which had the most extreme difference in concordance rate in the X chromosome compared with the autosomes (Fig. 2A). Validation by Sanger sequencing of an SNP located in the X chromosome showed monoallelic expression in individual AN03632 and confirmed the small degree of skewed inactivation for other individuals (Supplementary Material, Fig. S4). To further show that this reflects a skewed X-inactivation, we visualized the allelic signal ratio inside and outside the pseudoautosomal region, as genes in that regions are known to escape X-inactivation (40). This includes two regions, named PAR1 and PAR2. As PAR2 harbored only 25 SNPs in the SNP 6.0 array and no SNPs were heterozygote in AN03632, we concentrated on PAR1, located on Xp22 (chrX:60001-2699520) (Fig. 2B). We found that heterozygous SNPs inside the PAR1 region retained equal expression of both alleles, whereas many of the SNPs outside PAR1 showed a distribution that fits monoallelic expression.

Figure 2.

Biased X-inactivation in an individual with autism. (A) The rate of heterozygote genotypes in the genomic DNA that changed to homozygote in the cDNA is shown for the autosomes (distribution plot) and the X chromosome (indicated by ‘X’). (B) The distribution of the proportion of allele A was plotted for heterozygote SNPs residing inside (turquoise) and outside (Red) PAR1, a region known to escape inactivation. SNPs outside the region showed monoallelic expression whereas SNPs inside the region expressed from both alleles, suggesting a bias in X-inactivation. (C) Biased X-inactivation shows cell-type specificity in AN03632. (Top) The proportion of allele A in heterozygote SNPs residing in genes enriched in specific neural cell types plotted for genomic DNA (gDNA—green) and cDNA (purple). The number of informative SNPs in each group of genes is indicated by N. Monoallelic expression was observed for oligodendrocytes, astrocytes and neurons, whereas a more balanced signal was observed in the microglia genes. (Bottom) Validation of these effects by Sanger sequencing of specific SNPs in the genomic DNA compared with the cDNA recapitulated the strong monoallelic signal in oligodendrocytes, astrocytes and neurons compared with the microglia.

Figure 2.

Biased X-inactivation in an individual with autism. (A) The rate of heterozygote genotypes in the genomic DNA that changed to homozygote in the cDNA is shown for the autosomes (distribution plot) and the X chromosome (indicated by ‘X’). (B) The distribution of the proportion of allele A was plotted for heterozygote SNPs residing inside (turquoise) and outside (Red) PAR1, a region known to escape inactivation. SNPs outside the region showed monoallelic expression whereas SNPs inside the region expressed from both alleles, suggesting a bias in X-inactivation. (C) Biased X-inactivation shows cell-type specificity in AN03632. (Top) The proportion of allele A in heterozygote SNPs residing in genes enriched in specific neural cell types plotted for genomic DNA (gDNA—green) and cDNA (purple). The number of informative SNPs in each group of genes is indicated by N. Monoallelic expression was observed for oligodendrocytes, astrocytes and neurons, whereas a more balanced signal was observed in the microglia genes. (Bottom) Validation of these effects by Sanger sequencing of specific SNPs in the genomic DNA compared with the cDNA recapitulated the strong monoallelic signal in oligodendrocytes, astrocytes and neurons compared with the microglia.

Given that not all the SNPs on the X chromosome were monoallelic in this brain sample (AN03632), we hypothesized that the skewed X-inactivation may be cell type specific. The brain consists of different cell types and if some cell types have random X-inactivation, the genes that are specific to those cells will show biallelic expression. As the RNA that we used for the allelic expression (AE) measurements is extracted from a heterogeneous brain sample, the measurements are averages across the different cell types and according to the pattern of expression of each gene. To test our hypothesis, we first used an ‘in silico dissection’ approach to divide the AE results into different cell types. SNPs were assigned to genes and to specific cell types based on our previously published gene co-expression network analysis in the human brain that identified modules of genes that correspond to different neural cell types (41). Using this approach, we assembled the SNPs into groups corresponding to oligodendrocytes, astrocytes, neurons or microglia and plotted the distribution of the allele ratios in the cDNA and the genomic DNA in heterozygote SNPs (Fig. 2C). We found that in all cell types except microglia, the distribution of AE fitted a case of monoallelic expression. To validate these results, we performed Sanger sequencing of genes that show a cell-type-specific expression pattern (Fig. 2C). The genes were chosen based on a survey of cell-type-specific gene expression in the mouse brain (HEPH for astrocytes, KLHL4 for oligodendrocytes and ATP2B3 for neurons) (42) or based on a gene co-expression network of human brain (RGN for microglia) (41). The results of the sequencing were in agreement with the ‘in silico dissection’, showing a strong bias toward one allele (allele ratio >4.0) in marker genes for astrocytes, oligodendrocytes and neurons, but a much more modest bias in the marker for microglia. As our analysis also showed that not all SNPs in microglia-specific genes are monoallelic (Fig. 2C), we validated also a monoallelic expressed SNP in a microglia-specific gene (P2RY10), along with an additional monoallelic SNP in the other cell types (Supplementary Material, Fig. S5). The results of the sequencing again corroborated the array, with all SNPs found to be strongly biased or monoallelic expressed. Extending the analysis to other females with ASD showed a similar trend of skewed X-inactivation being more dominant in genes that correspond to oligodendrocytes, astrocytes and neurons relative to microglia (Supplementary Material, Fig. S6). In contrast, in the control sample (AN02456) that showed the strongest signal for skewed X-inactivation (among the non-ASD samples), the bias was most prominent in the microglia module.

### A screen of imprinted genes shows loss of imprinting in an individual with ASD

It was previously argued that genomic imprinting can help to explain the origin of ASD. To test this hypothesis, we screened all BA9 brain samples for events of abnormal imprinting. We first examined the imprinting status of all known imprinted genes in the BA9 samples. A gene was considered imprinted if it was monoallelic expressed across most samples. To define the direction of imprinting (maternal or paternal expression), we used SNP arrays to genotype the parents of two individuals (AN16115 and AN17138). Out of 131 known imprinted genes, 55 were non informative and 16 were not expressed. Among the 60 expressed and informative genes, 23 showed AE patterns consistent with imprinting in a parental direction consistent with previous studies (examined when parents were informative) and 37 were biallelic (Supplementary Material, Table S5). The biallelic expression could theoretically result from SNPs being included in both the sense and antisense of the transcript. To examine this possibility, we assessed the strand-specific expression pattern of all the imprinted genes based on strand-specific RNA-seq data from two individuals (AN12457 and AN11989). Out of the 37 biallelic genes, only three genes were expressed from both strands. Furthermore, in 10 genes, we could identify polymorphic SNPs in the strand-specific RNA-seq data and validate the biallelic expression.

Among the genes found to be imprinted, several genes were showing strong AEI, but were not completely monoallelic (MEST, ZNF597, MEG3 and GRB10) (Supplementary Material, Fig. S7). For three genes (KCNQ1, INPP5F and ZNF331), the monoallelic expression was confined to specific isoforms (Fig. 3). For INPP5F, we found a pattern similar to findings in mice and in human fetus (43,44), where only the short isoforms of INPP5F are imprinted. Another gene, ZNF331, which was recently reported to be imprinted (45,46) showed a more complex pattern. The short isoforms was monoallelic, the long isoform was biallelic and the intermediate isoforms showed incomplete silencing. Lastly, combining the array results with strand-specific RNA-seq data, we found that all monoallelic SNPs in KCNQ1 resided inside a noncoding RNA, expressing stronger in this tissue and on the opposite strand.

Figure 3.

Complex imprinting patterns in the dorsolateral prefrontal cortex (BA9). Variable allelic expression patterns in INPP5F (left), ZNF331 (middle) and KCNQ1 (right). (Top figure) Strand-specific RNA-seq from one representative individual. The y-axis is the log10-transformed coverage in the region (green is expression from the positive strand, and purple is from the negative strand). (Middle figure) The allelic expression status of SNPs within the gene regions based on the SNP arrays. (Bottom) Annotated transcripts based on RefSeq. In INPP5F, AEI is observed only for SNPs located within the short transcript (NM_001243194, known as INNP5f_v2). The short transcript also shows the highest expression in the RNA-seq. For ZNF331, an SNP within the private intron of the largest transcript is biallelic, whereas SNPs within the two shortest transcripts are monoallelic. SNPs in the middle, located in regions shared by the largest transcript and the intermediate transcripts, show AEI. In KCNQ1, monoallelic expression is observed for a noncoding RNA expressing strongly from the antisense.

Figure 3.

Complex imprinting patterns in the dorsolateral prefrontal cortex (BA9). Variable allelic expression patterns in INPP5F (left), ZNF331 (middle) and KCNQ1 (right). (Top figure) Strand-specific RNA-seq from one representative individual. The y-axis is the log10-transformed coverage in the region (green is expression from the positive strand, and purple is from the negative strand). (Middle figure) The allelic expression status of SNPs within the gene regions based on the SNP arrays. (Bottom) Annotated transcripts based on RefSeq. In INPP5F, AEI is observed only for SNPs located within the short transcript (NM_001243194, known as INNP5f_v2). The short transcript also shows the highest expression in the RNA-seq. For ZNF331, an SNP within the private intron of the largest transcript is biallelic, whereas SNPs within the two shortest transcripts are monoallelic. SNPs in the middle, located in regions shared by the largest transcript and the intermediate transcripts, show AEI. In KCNQ1, monoallelic expression is observed for a noncoding RNA expressing strongly from the antisense.

We next examined each individual sample for loss of imprinting based on biallelic expression of SNPs within the imprinted genes. We found only one case of loss of imprinting, which was in an individual with ASD (AN00493). In that individual, several SNPs in the imprinted region of chromosome 15q11–13 were expressed in a biallelic way (Fig. 4A). The biallelic SNPs were in two clusters, one within the SNRPN gene and the other within or near the UBE3A gene. Based on the results of copy number variation analysis, the 15q11–13 region has a normal copy number in individual AN00493 (34). Sanger sequencing of an SNP in the UBE3A gene showed that the gene is indeed expressed from both alleles in AN00493, while being monoallelic expressed in other individuals (Fig. 4B). We also noted for AN00493 that the signals from both alleles at many SNPs within the 15q11–13 region were considerably lower compared with other individuals (Fig. 4A). This reduction in total expression was more extreme [<2 standard deviations (SD) from the mean] particularly at the two regions (SNRPN and UBE3A genes) where SNPs were expressed in a biallelic way (Fig. 4A).

Figure 4.

Loss of imprinting in individual AN00493 in chromosome 15. (A) (Top) The status of heterozygote SNPs at the cDNA level for the AN00493 individual. Biallelic expressed SNPs (red) are observed in UBE3A and SNRPN, whereas the region between them shows monoallelic expression (turquoise). (Middle) Relative expression based on the total signal (sum of both alleles) of SNPs on the array for individual AN00493. For each SNP, values are the number of SDs from the mean expression of the other brain samples (Z-score). Signals below two SDs from the mean are in teal (bluish-green). Decreased expression is more noticeable in UBE3A-ATS and SNRPN. (Bottom) Normal expression patterns in the region based on strand-specific RNA-seq of representative unrelated individuals. The y-axis is the log10-transformed coverage in the region (green is expression from the positive strand and purple is from the negative strand). Most expression signal (even inside UBE3A gene) derives from the positive strand. (B) Biallelic expression of SNP rs4906708, which resides inside UBE3A. (Top) Data from the SNP array using genomic DNA (gDNA) and cDNA. Colors correspond to genotypes in the gDNA. (Bottom) Sanger sequencing validation of the biallelic expression in AN00493, along with two other samples showing monoallelic expression. (C) An illustration of the imprinted region in chromosome 15, delineating regions of paternal expression from the forward strand (turquoise) and maternal expression from the reverse strand (red) (D) (Left) Expression microarray measurements for AN00493 (red) compared with other brain samples (green) shows a decrease in expression of SNRPN and SNURF genes in AN00493, but not for UBE3A. (Right) Quantitative real-time PCR validation, plotted as the Δct between the gene of interest and a reference control gene, shows decreased expression of the UBE3A-ATS in AN00493, but not for the UBE3A gene.

Figure 4.

Loss of imprinting in individual AN00493 in chromosome 15. (A) (Top) The status of heterozygote SNPs at the cDNA level for the AN00493 individual. Biallelic expressed SNPs (red) are observed in UBE3A and SNRPN, whereas the region between them shows monoallelic expression (turquoise). (Middle) Relative expression based on the total signal (sum of both alleles) of SNPs on the array for individual AN00493. For each SNP, values are the number of SDs from the mean expression of the other brain samples (Z-score). Signals below two SDs from the mean are in teal (bluish-green). Decreased expression is more noticeable in UBE3A-ATS and SNRPN. (Bottom) Normal expression patterns in the region based on strand-specific RNA-seq of representative unrelated individuals. The y-axis is the log10-transformed coverage in the region (green is expression from the positive strand and purple is from the negative strand). Most expression signal (even inside UBE3A gene) derives from the positive strand. (B) Biallelic expression of SNP rs4906708, which resides inside UBE3A. (Top) Data from the SNP array using genomic DNA (gDNA) and cDNA. Colors correspond to genotypes in the gDNA. (Bottom) Sanger sequencing validation of the biallelic expression in AN00493, along with two other samples showing monoallelic expression. (C) An illustration of the imprinted region in chromosome 15, delineating regions of paternal expression from the forward strand (turquoise) and maternal expression from the reverse strand (red) (D) (Left) Expression microarray measurements for AN00493 (red) compared with other brain samples (green) shows a decrease in expression of SNRPN and SNURF genes in AN00493, but not for UBE3A. (Right) Quantitative real-time PCR validation, plotted as the Δct between the gene of interest and a reference control gene, shows decreased expression of the UBE3A-ATS in AN00493, but not for the UBE3A gene.

Imprinting in the 15q imprinted region is under the control of the Prader–Willi-Syndrome Imprinting Control Region (PWS-ICR). Thus, loss of imprinting could be a result of aberrant methylation. We determined the methylation status of the PWS-ICR in individual AN00493, compared with three samples, which showed normal monoallelic expression pattern in UBE3A (Supplementary Material, Table S6). All samples showed patterns consistent with hemimethylation, with no significant differences in methylation between samples.

Silencing of the paternal copy of UBE3A is believed to be mediated by a noncoding RNA, maternally transcribed from the opposite strand throughout most of the UBE3A gene (UBE3A-ATS) (47,48). The UBE3A-ATS is part of a larger transcript that initiates at the SNURF-SNRPN gene (49) (Fig. 4C). We reasoned that the observed biallelic expression may be the result of the transcription from both strands together with an abnormally low expression of the antisense noncoding RNA. Using strand-specific RNA-seq from another brain BA9 sample, we determined the normal patterns of expression in the region (Fig. 4A). As expected, we found that the entire region is expressed from the forward strand (+strand), with the highest expression at the snoRNA clusters (SNORD116 also called HBII-85 and SNORD115 also called HBII-52). On the negative strand, we found the expected expression at the UBE3A locus, but surprisingly we also found expression on the negative strand within the SNRPN gene, at the same region that contains the biallelic expressed SNPs in individual AN00493. Thus, the biallelic SNPs in individual AN00493 are found in regions that show both low expression of the large noncoding RNA and are also expressed from both strands.

To further test this observation, we analyzed the expression pattern of three genes (SNRPN, SNURF and UBE3A), which were targeted by an expression array (6). We compared the level of expression in AN00493 relative to the other brain samples (Fig. 4D). The AN00493 individual had one of the lowest expression levels for probes located in the SNRPN and SNURF genes, but with close to average expression for probe within the UBE3A gene. We also determined the expression levels of both the UBE3A gene and the UBE3A-ATS, using quantitative Real-Time PCR (Fig. 4D). For UBE3A, we designed primers on an exon junction, to measure only the gene. We found that despite the biallelic expression, UBE3A was not elevated in AN00493 compared with other individuals. However, the UBE3A-ATS was found to be strongly down-regulated.

### Per-individual expression patterns show specific abnormalities in individuals with ASD

Changes of gene expression in autism brains were previously shown to converge on specific groups of genes (6), but our allelic expression analysis points to a large heterogeneity. To examine this apparent contradiction, we developed a method to study individual-level expression based on expression profiling. Unlike the common practice of searching for differences between groups of cases and controls, in this analysis, we determined up- and down-regulated genes and pathways in an individual-specific manner. To that end, we analyzed total gene expression levels in 33 individuals, including 19 that were also studied for allelic expression. For each individual and each gene, we calculated the fold-change difference between the individual expression and the mean expression across all other samples (Fig. 5). We found that ASD sample tended to show higher variability in this measure compared with controls (P = 2.75 × 10−4, two-sided Wilcoxon rank-sum test) (Fig. 5A). Based on this measure, we extracted, for each individual, a list of genes that are up- or down-regulated compared with the rest of the samples. To account for the variability in the expression, an individualized Z-score was calculated. Only genes with fold change of >2 (or <0.5) and |Z-score| of >1 were considered to be up-regulated or down-regulated. Individuals with autism had significantly more differentially expressed genes than controls, including more up-regulated and down-regulated genes (Table 1). Thus, this analysis shows divergence in gene expression across ASD samples.

Table 1.

Per-individual differentially expressed genes analysis

Direction ASD (average) Controls (average) P-value (Wilcoxon rank-sum test)
Up-regulated 118 13.44 9.04 × 10−4
Down-regulated 111.69 24.86 6.95 × 10−3
Both 229.69 38.31 7.92 × 10−4
Direction ASD (average) Controls (average) P-value (Wilcoxon rank-sum test)
Up-regulated 118 13.44 9.04 × 10−4
Down-regulated 111.69 24.86 6.95 × 10−3
Both 229.69 38.31 7.92 × 10−4
Figure 5.

Individual-specific gene expression in BA9 of individuals with autism. (A) Heterogeneity of total gene expression levels. For each individual, the SD across all fold changes is shown. Autism samples (red dots) had significantly higher SD than controls (turquoise dots) (P = 2.75 × 10−4) (B and C) Functional enrichment of differentially expressed genes. For each category, the change is shown as the average log fold change. (B) Genes down-regulated in AN01570 were enriched with oligodendrocyte-associated functional terms including myelination and glial cell differentiation. (C) Genes up-regulated in AN00493 were enriched with genes involved in chemokine-mediated immune response.

Figure 5.

Individual-specific gene expression in BA9 of individuals with autism. (A) Heterogeneity of total gene expression levels. For each individual, the SD across all fold changes is shown. Autism samples (red dots) had significantly higher SD than controls (turquoise dots) (P = 2.75 × 10−4) (B and C) Functional enrichment of differentially expressed genes. For each category, the change is shown as the average log fold change. (B) Genes down-regulated in AN01570 were enriched with oligodendrocyte-associated functional terms including myelination and glial cell differentiation. (C) Genes up-regulated in AN00493 were enriched with genes involved in chemokine-mediated immune response.

We next focused on two individuals: AN00493 that showed loss of imprinting in 15q and AN01570 that had the largest number of monoallelic events. Up- and down-regulated genes in these individuals were separately analyzed for functional enrichment using the GO-Elite program (Fig. 5B and C). AN00493 harbored 218 up-regulated and 50 down-regulated genes. The up-regulated genes were highly enriched in genes involved in immune response, and especially chemokines (Fig. 5B). The down-regulated genes showed nominal enrichment for processes related to synaptic function, but this was not significant after multiple testing. AN01570 harbored 190 down-regulated genes and 27 up-regulated genes. The down-regulated genes were enriched with genes involved in myelination, and other processes related to oligodendrocytes, which may reflect a depletion of this cell type in that sample (Fig. 5C). Up-regulated genes were too few for functional enrichment analysis.

## DISCUSSION

Using SNP arrays, we measured allelic expression in the human brain across the genome and identified SNPs showing monoallelic expression in brain samples from individuals with ASD and controls. We concentrated on three epigenetic processes that result in monoallelic expression: autosomal monoallelic expression, X-inactivation and genomic imprinting. We found that individuals with ASD showed abnormalities in the above-mentioned three types of monoallelic expression. It is important to note that the causes for the changes observed in monoallelic expression among ASD individuals are unknown. The source of variation could be genetic or environmental. Similar to recent genetic studies of ASD that could not identify genes or mutations that account for >1–2% of cases, we did not find a specific type of epigenetic abnormality that is shared across different ASD individuals. Similarly, by studying total gene expression using a new individual-specific analysis, we could show the divergence of ASD brains from the mean expression levels. This suggests that while ASD brains are different from controls, they tend to be dissimilar among themselves. Although the sample size is relatively small, our study reemphasizes that ASD is etiologically heterogeneous not only at the genetic basis but probably also in epigenetic and environmental factors.

We concentrated here on three ASD subjects that had the most extreme or abnormal types of monoallelic expression. In one individual with ASD, we found 10 times more instances of monoallelic expression across the genome than the average count. Surprisingly, in that individual, most of the monoallelic expressed SNPs were restricted to the BA9 but were biallelic in the cerebellum. What might cause such widespread monoallelic expression is not clear. One possible explanation could be that this is the result of a general abnormality in epigenetic regulation (50). The other possible explanation might be related to the fact that this individual exhibited an extremely large brain size (2100 g at the age of 18 years). Random monoallelic expression should not be normally seen in a large number of polyclonal cells. However, if the macrocephaly is caused by rapid proliferation of a small subset of cells, it may lead to a more clonal cell population. Individual-specific analysis of gene expression suggested the depletion of oligodendrocytes and myelination in the prefrontal cortex of that individual. The reduction of myelin-associated genes may suggest that the increase in brain volume is caused by excessive numbers or rates of growth of neurons. It could also reflect aberrant connectivity in that sample. Autism has been long associated with aberrant connectivity, although it has mostly been suggested to be associated with global increase in white matter (51). Recently, however, it was found that individuals with autism had reduced intrinsic wiring inside cortical columns, and this reduction was associated with the severity of autistic features (52).

We also found that females with ASD tended to have a more skewed X-inactivation (although as a group, this was not significant). This observation is not surprising because skewed X-inactivation was found previously to be more common in X-linked mental retardation disorders including Rett syndrome (38,39). The observations of skewed X-inactivation were previously mainly based on blood cells, but the relationship with the brain is less studied. Here, using a new ‘in silico dissection’ approach, we found that the extreme skewed X-inactivation observed in the brain of an individual with ASD was confined to neural cells but was less prominent in microglia-specific genes. A similar trend was observed in two other brain samples from females with ASD. This finding of cell type-specific skewed X-inactivation suggests two important conclusions. First, it shows that the degree of skewed X-inactivation measured in blood cells does not necessarily reflect the degree in the brain. Second, it suggests that the skewed X-inactivation is caused by selection against or in favor of alleles that may be important specifically during the proliferation of neural progenitor cells. Still it is not clear how skewed X-inactivation influences ASD risk and severity.

We screened all the known imprinted genes and identified which of these genes are expressed in a monoallelic or biallelic way in the human brain (BA9 & cerebellum). Using microarrays alone to determine the imprinting status of genes may be confounded by transcription from the antisense that could result in a false biallelic expression signal. Using strand-specific RNA-seq, we were able to exclude or identify such antisense transcripts, and for several genes even identify strand-specific biallelic expressed SNPs, affirming the biallelic expression of those genes. Among the known imprinted genes confirmed in our study to be monoallelic expressed in our samples, we then searched for specific individuals with abnormal imprinting. To the best of our knowledge, our study is the first to screen across the genome for the status of imprinted genes in the human brain and to search systematically for imprinting abnormalities in ASD. Despite previous theories about the possible involvement of imprinted genes in ASD (18), we found only one case of loss of imprinting in an individual with ASD. In that individual, two clusters of SNPs within the 15q11–13 imprinted region were biallelic, a region that is known to be associated with ASD risk. The best explanation for the biallelic expression is a combination of strong reduction in the expression of the large noncoding RNA that is transcribed across the region from the paternal allele. We identified biallelic expressed SNPs only at regions that showed expression on the opposite strand. This observation is in agreement with a recent study that showed that UBE3A-ATS silence the paternal copy of UBE3A in mice (48).

In summary, our study identified diverse epigenetic abnormalities in the brain of individuals with ASD, which include both the increase of monoallelic expression in genes that are normally expressed in a biallelic way and the loss of monoallelic expression in imprinted genes. This heterogeneity was reflected both at the level of allelic regulation and at the level of total gene expression, with autism samples showing more extreme individualized differences and globally a transcription profile dissimilar from each other and the controls. Altogether, these results suggest that while much effort is concentrated on discovering genetic convergence, there is considerable heterogeneity in the allelic and global genetic regulation between individuals with autism.

## MATERIALS AND METHODS

### Sample acquisition, DNA and RNA extraction from BA9 and cerebellum

Postmortem tissue samples were obtained from the ATP and the NICHD Brain and Tissue Bank for Developmental Disorders. Genomic DNA was extracted with a QIAamp DNA Blood Mini Kit (QIAGEN, Valencia, CA). DNA from parents of two individuals was obtained from Autism Genetic Research exchange, and genotypes using the Affymetrix SNP 6.0 array. In the postmortem brain tissue, RNA was extracted using the RNeasy® Lipid Tissue Mini Kit (Qiagen Ltd, Crawley, UK) according to the manufacturer's protocol. The RNA was treated with DNase to eliminate genomic DNA contamination. The absence of DNA contamination was verified by gel electrophoresis. RNA quality was evaluated using Agilent 2100 bioanalyzer (Agilent Technologies, Palo Alto, CA, USA), with an average of RNA integrity number of 8.1. mRNA was enriched from brain total RNA employing the MicroPoly(A) Purist Kit (Ambion, Austin, TX, USA). The RNA was then converted into double-stranded cDNA and tested for gDNA contamination using PCR of genic and nongenic regions (primers are available upon request). The double-stranded cDNA replaced the genomic DNA in the Affymetrix SNP array 6.0 genotyping protocol. Genotyping analysis was performed using the Affymetrix Genotyping Console (Affymetrix, Santa Clara, CA, USA).

### Allele expression imbalance at the level of one SNP

All statistical analyses were performed using the R-project for Statistical Computing (http://www.r-project.org), and plotting was performed using the R packages ggplot2 and ggbio (53,54). Discovering a true event of AEI rests upon determining that an SNP heterozygous in the genomic DNA is showing a bias toward one of the alleles in the cDNA. To assess the AEI at the level of one SNP, we developed strict quality control measures to limit errors at the SNP and individual level (Supplementary Material, Fig. S1). The per-individual rate of changes from heterozygous genotype in the genomic DNA to a homozygous genotype in the cDNA in all autosomes was determined. Four individuals showed a markedly higher rate, indicating that these calls may be the results of error. To further assess the origin of these global changes, we determined the rates of a novel allele, not found in the genomic DNA, appearing in the cDNA (a phenomenon hereafter referred to as ‘illogical genotypes’). The idea was that if a homozygous SNP in the genomic DNA is found to be heterozygous, or homozygous for the other allele, in the cDNA, such an illogical genotype is most likely the result of an error (excluding the possibility of RNA editing). Indeed, the four samples that showed a high global rate of heterozygote to homozygote transition also showed a higher rate of ‘illogical genotypes’. These samples were excluded from further analyses of identifying monoallelic expression at the level of one SNP, but the females were used for measuring skewed X-inactivation because it includes the average effect of all SNPs on the X chromosome and rests upon comparing the X chromosome with the autosomes.

We proceeded to apply a strict per-SNP filter to rule out non-expressing or otherwise error-prone SNPs. First, we removed any SNP in which even a single individual showed an illogical cDNA genotype, as defined earlier. Furthermore, a cutoff of 4000 for the total signal level from the two alleles was applied to determine expressing SNPs, and SNPs in which <80% of the individuals were above the cutoff were excluded from further analyses. As an additional cutoff for expression, we looked at the ratio between allele A and allele B and determined whether this ratio was different between the different genotypes using non-parametric Kruskal–Wallis test. We chose a stringent cutoff of P-value of 5 × 10−4, to decrease the number of false positives and distinguish truly expressing genes. Next, we performed an analysis to prevent rare cases when error in the genomic DNA genotyping could lead to a false-positive AEI signal. To that end, the Illumina array information available for these individuals was used to impute the genotypes in the Affymetrix array using the Beagle software (55). We first generated datasets in which each one individual's Affymetrix calls were converted to missing. Then, these missing calls were imputed, so each time only one individual's Affymetrix genotypes were imputed and the rest were used as reference. Genotype calls discordant between the array and the imputation were removed. Lastly, SNPs with <3 heterozygote calls were filtered out. Following all these filters, ∼10% of the SNPs in the array were taken for further analyses.

### A graph-based method for rare allelic expression imbalance

We designed an algorithm to identify rare AEI events in at the level of one SNP. It is based on the idea that when looking at the two-dimensional space defined by the levels of signal from allele A (Signal A) and levels of the signal from allele B (Signal B), the different genotypes tend to form clusters in that space (see Fig. 1A for an example). An SNP showing AEI would be expected to display a shift, in the cDNA microarray, from the heterozygous cluster toward a homozygous cluster. We exploited the fact that contrary to the array measurements in the genomic DNA, in which the points in each genotype cluster are spread around some centroid point, in the cDNA, measurements in each cluster tended to show a linear correlation between Signal A and Signal B. We therefore calculated a linear regression line for each of the genotype clusters and then measured the distance between each heterozygous measurement and the different lines. We defined two measures, one (dhet) denoted the distance of the SNP from the heterozygote cluster. The second (dhom) denoted the distance of the SNP from the homozygote cluster closest to it. The allelic expression signal was calculated as the proportion of the Euclidean distance of each heterozygous measurement from the heterozygous line, from the Euclidean distance between both the heterozygous line and the homozygous line closest to it:

$$\hbox{AEI} = \displaystyle{{{d_{{\rm het}}}} \over {{d_{{\rm het}}} + {d_{\hom }}}}.$$

In cases where the measurement was more extreme than the homozygote line, this changed to $$\hbox{AEI} = {d_{{\rm het}}}\hbox{/}{d_{{\rm het}}} - {d_{\hom }}$$ allowing a score of >1.

We chose this measure as it provides an intuitive interpretation, whereby a result of <1 can be thought of as a bias in expression toward one of the alleles, whereas a result close to 1 or above it could indicate an event of monoallelic expression. To prevent a case where an allelic expressed sample would bias the heterozygous line and reduce our power to find true events, we looked at the distribution of the ratios of signal A to signal B and calculated the inter-quartile range (IQR). Samples that fell further than 1.5 IQRs from the 1st or 3rd quartile of the distribution were not used in calculating the heterozygous regression line. A drawback of our approach is that we inevitably miss events where a heterozygous SNP becomes homozygous in the cDNA, and no other homozygotes exist in the entire sample. However, we expected such events to be very rare, and in any case very difficult to identify accurately. To account for dispersion in the heterozygous cluster, which could potentially lead to false positives, we also calculated a Z-score for each heterozygote sample. As a threshold for declaring strong AEI, samples were chosen if they presented at least a score of 0.75 (reflecting a three-fourth shift toward a homozygote cluster) and a Z-score of 2 (reflecting a relative uniqueness of such event across the samples for that SNP). All putative results were plotted and manually visualized to determine true positives. A random sample of signals was chosen for validation using Sanger sequencing, and allele ratio was determined using the PeakPicker software (56).

### Comparing BA9 and cerebellum

To compare the AEI between BA9 and the cerebellum, we first refined the list of SNPs to include only those that show a similar pattern of expression in both tissues. To that end, we determined the maximum and minimum signal level of each allele for each genotype in the cerebellum and the BA9 samples. We required that for at least one genotype, all cerebellum samples would fall within this range, indicating a comparable level of expression and a similar behavior on the array. We then proceeded to analyze the cerebellum and BA9 samples together, using the same pipeline outlined above for the BA9 alone.

### Determining skewed X-inactivation

To infer the status of X-inactivation in the female samples, first the rate of heterozygote genomic DNA to homozygote cDNA conversion was calculated for the autosomal chromosomes as well as the X chromosome. To determine the allelic expression profile of genes within the PAR1 region of chromosome X, the boundaries of the region were extracted from the UCSC genome browser. A cutoff of 4000 was imposed, and the distribution of the proportion of allele A was plotted for heterozygote SNPs inside and outside of the PAR1 region.

### In silico dissection for cell-type-specific skewed X-inactivation

To determine whether the observed skewed X-inactivation showed cell-type specificity, we used the results of a previously published gene co-expression network based on adult human brain, which included modules highly enriched for specific neural cell types (41). For oligodendrocytes and astrocytes, one module showing a very strong enrichment was taken. For neurons, three different modules showing a strong enrichment were combined together for the analysis. SNPs in chromosome X were distributed between the modules, and the distribution of the proportion of allele A was plotted for the genomic DNA and the cDNA. For each of the cell types, a gene was chosen for validation using Sanger sequencing. For astrocytes, oligodendrocytes and neurons, a gene was chosen based on a published database of cell-type-specific gene expression in the mouse brain (42). For microglia, a gene from the microglia module was chosen. Allele ratios were determined using the PeakPicker software (56).

### Determining the allelic expression status of imprinted genes and screening for abnormal events

A list of genes with either a confirmed, conflicting or unknown imprinting status was downloaded from the Geneimprint database on 30 June 2013 and supplemented by mining the literature for reports of imprinting. For the list of genes, Refseq transcripts were downloaded from the UCSC genome browser. For each transcript, we chose all SNPs with at least two heterozygotes, and in which the total signal level (allele A + allele B) of at least half of the sample was >4000. The signal levels of SNPs in these genes were visualized, and imprinting status was manually determined. Genomic DNA from parents of two individuals, AN16115 and AN17138, was genotyped using Affymetrix SNP 6.0 Arrays, to determine direction of imprinting.

### Methylation analysis in PWS-IC region

Methylation in the Prader–Willi syndrome imprinting center (PWS-IC) was determined using pyrosequencing. Pyrosequencing was performed at EpigenDX (Worcester, MA, USA). The assay included a total of 20 CpG sites, located at chromosome 15:25,200,036–25,200,202 (Supplementary Material, Table S6). Four samples were assayed, including AN00493 and three which showed monoallelic expression in UBE3A.

### Gene expression analysis

RNA sequencing was performed on two individuals with autism, AN12457 and AN11989. Libraries were prepared using SOLiD™ Total RNA-Seq Kit (Applied Biosystems) following the Whole Transcriptome protocol. Size selection was performed using the E-Gel® EX Agarose Gels (Invitrogen). Libraries were quantified and validated using the Agilent 2100 Bioanalyzer (Agilent technologies). Library was quantified using the KAPA ABI SOLiD library quantification Kit (KAPA Biosciences). Sequencing of 50-bp reads was done using the ABI SOLiD 3 plus analyzer, following Applied Biosystems protocols. RNA-sequencing data were aligned to the human genome (UCSC build hg19) using the SHRiMP2 alignment software (57). For AN12457, 34621265 reads were generated, of which 24568591 (70.96%) were mapped. For AN11989, 30236974 reads were generated, of which 40% were mapped. In all figures, only the expression in AN12457 is visualized, as the expression pattern was similar between the two samples in all cases shown.

Previously published (6) microarray data were acquired from GEO and reanalyzed using the Lumi R package (58). We determined the relative expression of each individual in the genes SNRPN, UBE3A and SNURF using a Z-score.

Real-time PCR was performed using the PerfeCTa SYBR Green FastMix, Rox (Quanta). Fluorescence was monitored and analyzed in a Bio-Rad C1000 Thermal Cycler with a CFX96 real-time system. All experiments were performed in triplicate, and the values of each sample were first averaged across the triplicate. Subsequently, values of gene expression in each gene of interest were normalized first to the mean of a reference gene (CHL1), chosen as it was previously found to have stable expression in >100 brain samples (59). Reference normalized values were then normalized to the mean of all samples of the gene of interest.

### Per-individual expression patterns

To perform an individualized expression analysis, we compared each individual's microarray gene expression against the entire sample cohort. This was done using two separate measures. The first was an individualized fold change, which was defined for the expression E of each individual i and each gene g as the ratio $$E_i^g \hbox{/}({E^g}).$$ The second was a Z-score:

$$\displaystyle{{E_i^g - ({E^g})} \over {\sqrt {\sum\nolimits_i {{{(E_i^g - ({E^g}))}^2}} } }}.$$

Functional enrichment was determined using the GO Elite software(60). P-values were determined by 10 000 permutations.

## FUNDING

This research was supported by grants from the National Institute for Psychobiology in Israel, the Legacy Heritage Fund program of the Israel Science Foundation (grant no. 1998/08), and Israel Science Foundation (grant no. 688/12). E.B.D. is supported by the Dennis Weatherstone Pre-doctoral Fellowship from Autism Speaks (grant no. 8595).

## ACKNOWLEDGEMENTS

We are grateful to the families of individuals with autism for the thoughtful donation of postmortem brain tissues for research and for the efforts of the Autism Tissue Program at the Harvard Brain Tissue Resource Center, Belmont, MA, USA, and the NICHD Brain and Tissue Bank for Developmental Disorders at the University of Maryland, Baltimore, MD, USA.

Conflict of Interest statement. None declared.

## REFERENCES

1
Iossifov
I.
Ronemus
M.
Levy
D.
Wang
Z.
Hakker
I.
Rosenbaum
J.
Yamrom
B.
Lee
Y.
Narzisi
G.
Leotta
A.
et al.
De novo gene disruptions in children on the autistic spectrum
Neuron

2012
74
285
299
2
Neale
B.M.
Kou
Y.
Liu
L.
Ma'ayan
A.
Samocha
K.E.
Sabo
A.
Lin
C.-F.
Stevens
C.
Wang
L.-S.
Makarov
V.
et al.
Patterns and rates of exonic de novo mutations in autism spectrum disorders
Nature

2012
485
242
245
3
O'Roak
B.J.
Vives
L.
Girirajan
S.
Karakoc
E.
Krumm
N.
Coe
B.P.
Levy
R.
Ko
A.
Lee
C.
Smith
J.D.
et al.
Sporadic autism exomes reveal a highly interconnected protein network of de novo mutations
Nature

2012
485
246
250
4
Sanders
S.J.
Murtha
M.T.
Gupta
A.R.
Murdoch
J.D.
Raubeson
M.J.
Willsey
A.J.
Ercan-Sencicek
A.G.
DiLullo
N.M.
Parikshak
N.N.
Stein
J.L.
et al.
De novo mutations revealed by whole-exome sequencing are strongly associated with autism
Nature

2012
485
237
241
5
Amaral
D.G.
Schumann
C.M.
Nordahl
C.W.
Neuroanatomy of autism
Trends Neurosci.

2008
31
137
145
6
Voineagu
I.
Wang
X.
Johnston
P.
Lowe
J.
Transcriptomic analysis of autistic brain reveals convergent molecular pathology
Nature

2011
474
380
384
7
Persico
A.M.
Bourgeron
T.
Searching for ways out of the autism maze: genetic, epigenetic and environmental clues
Trends Neurosci.

2006
29
349
358
8
Lasalle
J.M.
Epigenomic strategies at the interface of genetic and environmental risk factors for autism
J. Hum. Genet.

2013
10.1038/jhg.2013.49
9
Amir
R.E.
Van den Veyver
I.B.
Wan
M.
Tran
C.Q.
Francke
U.
Zoghbi
H.Y.
Rett syndrome is caused by mutations in X-linked MECP2, encoding methyl-CpG-binding protein 2
Nat. Genet.

1999
23
185
188
10
Ben-David
E.
Shifman
S.
Combined analysis of exome sequencing points toward a major role for transcription regulation during brain development in autism
Mol. Psychiatry

2013
18
1054
1056
11
Cook
E.H.
Lindgren
V.
Leventhal
B.L.
Courchesne
R.
Lincoln
A.
Shulman
C.
Lord
C.
Courchesne
E.
Autism or atypical autism in maternally but not paternally derived proximal 15q duplication
Am. J. Hum. Genet.

1997
60
928
934
12
Abrahams
B.S.
Geschwind
D.H.
Advances in autism genetics: on the threshold of a new neurobiology
Nat. Rev. Genet.

2008
9
341
355
13
Gregg
C.
Zhang
J.
Weissbourd
B.
Luo
S.
Schroth
G.P.
Haig
D.
Dulac
C.
High-resolution analysis of parent-of-origin allelic expression in the mouse brain
Science

2010
329
643
648
14
Gregg
C.
Zhang
J.
Butler
J.E.
Haig
D.
Dulac
C.
Sex-specific parent-of-origin allelic expression in the mouse brain
Science

2010
329
682
685
15
DeVeale
B.
van der Kooy
D.
Babak
T.
Critical evaluation of imprinted gene expression by RNA-Seq: a new perspective
PLoS Genet.

2012
8
e1002600
16
Tycko
B.
Morison
I.M.
Physiological functions of imprinted genes
J. Cell. Physiol.

2002
192
245
258
17
D.
Cheslack-Postava
K.
C.
Newschaffer
C.
Chakravarti
A.
Arking
D.E.
Feinberg
A.
Fallin
M.D.
Parent-of-origin effects in autism identified through genome-wide linkage analysis of 16,000 SNPs
PLoS One

2010
5
8
18
C.
Crespi
B.
Imbalanced genomic imprinting in brain development: an evolutionary basis for the aetiology of autism
J. Evol. Biol.

2006
19
1007
1032
19
Skuse
D.H.
Imprinting, the X-chromosome, and the male brain: explaining sex differences in the liability to autism
Pediatr. Res.

2000
47
9
16
20
Schanen
N.C.
Epigenetics of autism spectrum disorders
Hum. Mol. Genet.

2006
15
Spec No
R138
R150
21
Samaco
R.C.
Hogart
A.
LaSalle
J.M.
Epigenetic overlap in autism-spectrum neurodevelopmental disorders: MECP2 deficiency causes reduced expression of UBE3A and GABRB3
Hum. Mol. Genet.

2005
14
483
492
22
Franco
B.
Ballabio
A.
X-inactivation and human disease: X-linked dominant male-lethal disorders
Curr. Opin. Genet. Dev.

2006
16
254
259
23
Z.
Bittel
D.C.
Veatch
O.J.
Kibiryeva
N.
Butler
M.G.
Brief report: non-random X chromosome inactivation in females with autism
J. Autism Dev. Disord.

2005
35
675
681
24
Gong
X.
Bacchelli
E.
Blasi
F.
Toma
C.
Betancur
C.
Chaste
P.
Delorme
R.
Durand
C.M.
Fauchereau
F.
Botros
H.G.
et al.
Analysis of X chromosome inactivation in autism spectrum disorders
Am. J. Med. Genet. B. Neuropsychiatr. Genet.

2008
147B
830
835
25
C.
Ferguson-Smith
A.C.
Epigenetics: monoallelic expression in the immune system
Curr. Biol.

2002
12
R108
R110
26
Chess
A.
Simon
I.
Cedar
H.
Axel
R.
Allelic inactivation regulates olfactory receptor gene expression
Cell

1994
78
823
834
27
Esumi
S.
Kakazu
N.
Taguchi
Y.
Hirayama
T.
Sasaki
A.
Hirabayashi
T.
Koide
T.
Kitsukawa
T.
S.
Yagi
T.
Monoallelic yet combinatorial expression of variable exons of the protocadherin-alpha gene cluster in single neurons
Nat. Genet.

2005
37
171
176
28
Chess
A.
Nat. Genet.

2005
37
120
121
29
Gimelbrant
A.
Hutchinson
J.N.
Thompson
B.R.
Chess
A.
Widespread monoallelic expression on human autosomes
Science

2007
318
1136
1140
30
Jeffries
A.R.
Perfect
L.W.
Ledderose
J.
Schalkwyk
L.C.
Bray
N.J.
Mill
J.
Price
J.
Stochastic choice of allelic expression in human neural stem cells
Stem Cells

2012
30
1938
1947
31
Buonocore
F.
Hill
M.J.
Campbell
C.D.
P.B.
Jeffries
A.R.
Troakes
C.
Hortobagyi
T.
Williams
B.P.
Cooper
J.D.
Bray
N.J.
Effects of cis-regulatory variation differ across regions of the adult human brain
Hum. Mol. Genet.

2010
19
4490
4496
32
Ben-David
E.
Granot-Hershkovitz
E.
Monderer-Rothkoff
G.
Lerer
E.
Levi
S.
Yaari
M.
Ebstein
R.P.
Yirmiya
N.
Shifman
S.
Identification of a functional rare variant in autism using genome-wide screen for monoallelic expression
Hum. Mol. Genet.

2011
20
3632
3641
33
Hogart
A.
Nagarajan
R.P.
Patzel
K.A.
Yasui
D.H.
Lasalle
J.M.
15q11-13 GABAA receptor genes are normally biallelically expressed in brain yet are subject to epigenetic dysregulation in autism-spectrum disorders
Hum. Mol. Genet.

2007
16
691
703
34
Wintle
R.F.
Lionel
A.C.
Hu
P.
Ginsberg
S.D.
Pinto
D.
Thiruvahindrapduram
B.
Wei
J.
Marshall
C.R.
Pickett
J.
Cook
E.H.
et al.
A genotype resource for postmortem brain samples from the autism tissue program
Autism Res.

2011
4
89
97
35
Ge
B.
Pokholok
D.K.
Kwan
T.
Grundberg
E.
Morcos
L.
Verlaan
D.J.
Le
J.
Koka
V.
Lam
K.C.L.
Gagne
V.
et al.
Global patterns of cis variation in human cells revealed by high-density allelic expression analysis
Nat Genet

2009
41
1216
1222
36
Huang
D.W.
Sherman
B.T.
Lempicki
R.A.
Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources
Nat. Protoc.

2008
4
44
57
37
Zwemer
L.M.
Zak
A.
Thompson
B.R.
Kirby
A.
Daly
M.J.
Chess
A.
Gimelbrant
A.A.
Autosomal monoallelic expression in the mouse
Genome Biol.

2012
13
R10
38
Plenge
R.M.
Stevenson
R.A.
Lubs
H.A.
Schwartz
C.E.
Willard
H.F.
Skewed X-chromosome inactivation is a common feature of X-linked mental retardation disorders
Am. J. Hum. Genet.

2002
71
168
173
39
Zoghbi
H.Y.
Percy
A.K.
Schultz
R.J.
Fill
C.
Patterns of X chromosome inactivation in the Rett syndrome
Brain Dev.

1990
12
131
135
40
Blaschke
R.J.
Rappold
G.
The pseudoautosomal regions, SHOX and disease
Curr. Opin. Genet. Dev.

2006
16
233
239
41
Ben-David
E.
Shifman
S.
Networks of neuronal genes affected by common and rare variants in autism spectrum disorders
PLoS Genet.

2012
8
e1002556
42
Cahoy
J.D.
Emery
B.
Kaushal
A.
Foo
L.C.
Zamanian
J.L.
Christopherson
K.S.
Xing
Y.
Lubischer
J.L.
Krieg
P.A.
Krupenko
S.A.
et al.
A transcriptome database for astrocytes, neurons, and oligodendrocytes: a new resource for understanding brain development and function
J. Neurosci.

2008
28
264
278
43
Choi
J.D.
Underkoffler
L.A.
Wood
A.J.
Collins
J.N.
Williams
P.T.
Golden
J.A.
Schuster
E.F.
Loomes
K.M.
Oakey
R.J.
A novel variant of Inpp5f is imprinted in brain, and its expression is correlated with differential methylation of an internal CpG island
Mol. Cell. Biol.

2005
25
5514
5522
44
Monk
D.
Arnaud
P.
Frost
J.M.
Wood
A.J.
Cowley
M.
Martin-Trujillo
A.
A.
Iglesias Platas
I.
Camprubi
C.
Bourc'his
D.
et al.
Human imprinted retrogenes exhibit non-canonical imprint chromatin signatures and reside in non-imprinted host genes
Nucleic Acids Res.

2011
39
4577
4586
45
Pollard
K.S.
Serre
D.
Wang
X.
Tao
H.
Grundberg
E.
Hudson
T.J.
Clark
A.G.
Frazer
K.
A genome-wide approach to identifying novel-imprinted genes
Hum. Genet.

2008
122
625
634
46
Daelemans
C.
Ritchie
M.E.
Smits
G.
Abu-Amero
S.
Sudbery
I.M.
Forrest
M.S.
Campino
S.
Clark
T.G.
Stanier
P.
Kwiatkowski
D.
et al.
High-throughput analysis of candidate imprinted genes and allele-specific gene expression in the human term placenta
BMC Genet.

2010
11
25
47
Chamberlain
S.J.
Lalande
M.
Angelman syndrome, a genomic imprinting disorder of the brain
J. Neurosci.

2010
30
9958
9963
48
Meng
L.
Person
R.E.
Beaudet
A.L.
Ube3a-ATS is an atypical RNA polymerase II transcript that represses the paternal expression of Ube3a
Hum. Mol. Genet.

2012
21
3001
3012
49
Runte
M.
The IC-SNURF-SNRPN transcript serves as a host for multiple small nucleolar RNA species and as an antisense RNA for UBE3A
Hum. Mol. Genet.

2001
10
2687
2700
50
Grafodatskaya
D.
Chung
B.H.Y.
Butcher
D.T.
Turinsky
A.L.
Goodman
S.J.
Choufani
S.
Chen
Y.-A.
Lou
Y.
Zhao
C.
Rajendram
R.
et al.
Multilocus loss of DNA methylation in individuals with mutations in the histone H3 lysine 4 demethylase KDM5C
BMC Med. Genomics

2013
6
1
51
Minshew
N.J.
Williams
D.L.
The new neurobiology of autism: cortex, connectivity, and neuronal organization
Arch. Neurol.

2007
64
945
950
52
Ecker
C.
Ronan
L.
Feng
Y.
Daly
E.
Murphy
C.
Ginestet
C.E.
Brammer
M.
Fletcher
P.C.
Bullmore
E.T.
Suckling
J.
et al.
Intrinsic gray-matter connectivity of the brain in adults with autism spectrum disorder

2013
110
13222
13227
53
Wickham
H.
ggplot2: Elegant Graphics for Data Analysis (Use R!)

2009
Springer, New York
54
Yin
T.
Cook
D.
Lawrence
M.
ggbio: an R package for extending the grammar of graphics for genomic data
Genome Biol.

2012
13
R77
55
Browning
B.L.
Browning
S.R.
A unified approach to genotype imputation and haplotype-phase inference for large data sets of trios and unrelated individuals
Am. J. Hum. Genet.

2009
84
210
223
56
Ge
B.
Gurd
S.
Gaudin
T.
Dore
C.
Lepage
P.
Harmsen
E.
Hudson
T.J.
Pastinen
T.
Survey of allelic expression using EST mining
Genome Res.

2005
15
1584
1591
57
David
M.
Dzamba
M.
Lister
D.
Ilie
L.
Brudno
M.
SHRiMP2: sensitive yet practical SHort Read Mapping
Bioinformatics

2011
27
1011
1012
58
Du
P.
Kibbe
W.A.
Lin
S.M.
lumi: a pipeline for processing Illumina microarray
Bioinformatics

2008
24
1547
1548
59
G.
Shifman
S.
The genetic variation of RELN expression in schizophrenia and bipolar disorder
PLoS One

2011
6
7
60
Zambon
A.C.
Gaj
S.
Ho
I.
Hanspers
K.
Vranizan
K.
Evelo
C.T.
Conklin
B.R.
Pico
A.R.
Salomonis
N.
GO-Elite: a flexible solution for pathway and ontology over-representation
Bioinformatics

2012
28
2209
2210