An emerging neuropathological theory of autism, referred to here as “the neural unreliability thesis,” proposes greater variability in moment-to-moment cortical representation of environmental events, such that the system shows general instability in its impulse response function. Leading evidence for this thesis derives from functional neuroimaging, a methodology ill-suited for detailed assessment of sensory transmission dynamics occurring at the millisecond scale. Electrophysiological assessments of this thesis, however, are sparse and unconvincing. We conducted detailed examination of visual and somatosensory evoked activity using high-density electrical mapping in individuals with autism (N = 20) and precisely matched neurotypical controls (N = 20), recording large numbers of trials that allowed for exhaustive time-frequency analyses at the single-trial level. Measures of intertrial coherence and event-related spectral perturbation revealed no convincing evidence for an unreliability account of sensory responsivity in autism. Indeed, results point to robust, highly reproducible response functions marked by their exceedingly close correspondence to those in neurotypical controls.
Reports have recently emerged pointing to the possibility that evoked sensory-neural responses might show greater trial-to-trial variability in individuals with an autism spectrum disorder (ASD) (Milne 2011; Dinstein et al. 2012, 2015; Edgar et al. 2013; Haigh et al. 2015). The general notion is that signal averaging procedures typically used in neurophysiological and neuroimaging studies may obscure the fact that there are ongoing and presumably relatively dramatic fluctuations in response stability to individual events. Precisely how such variability would contribute to the ASD phenotype is not clear, but one could certainly speculate that perceptual abilities might be degraded and that learning would in turn be impacted. One of the main reports to claim this response instability in ASD was provided by a recent functional magnetic resonance imaging (fMRI) study (Dinstein et al. 2012). These investigators examined the blood oxygenation level-dependent (BOLD) response to basic visual, auditory, and somatosensory stimulation in a cohort of 12 ASD adults while the participants performed a temporally uncorrelated central fixation task. They reported significantly greater standard deviations of responses across all 3 sensory systems, and lower overall signal-to-noise ratios (SNRs) in their ASD cohort relative to age-matched neurotypical (NT) controls, concluding that “evoked responses” were “unreliable” in autism. These studies have fueled an ongoing theoretical discussion on the role of noise in sensory processing in autism (Simmons et al. 2009; Brock 2012; Pellicano and Burr 2012; Uhlhaas and Singer 2012; Davis and Plaisted-Grant 2015; Dinstein et al. 2015; Simmons and Milne 2015).
However, we would hold that considerable caution should be exercised before inferring trial-to-trial variability of the evoked neural response on the basis of trial-to-trial instability in the BOLD response. The BOLD signal represents changes in blood flow over a period of seconds, whereas the electro-dynamics of neural transmission occur on the order of milliseconds (Schroeder et al. 1998; Foxe and Simpson 2002), and a complex and indirect relationship exists between these 2 signals (Harris et al. 2011). It cannot simply be assumed that variability in cerebral blood flow has direct implications for the integrity of the neuro-electric signal. To our knowledge, there has been only one study to date to directly assess what we will term the “neural unreliability thesis” using recordings of the brain's electrical response (Milne 2011). Participants in that study were asked to perform a visual animal matching task while task-irrelevant Gabor grating stimuli of varying spatial frequencies were also presented (0.5, 1, 4, and 8 cycles-per-degree). Single-trial analyses showed less reliable intertrial coherence (ITC) in the alpha-band range and greater median absolute deviation of the visual evoked potential (VEP) during processing of the 8 cycle conditions in ASD. The other stimulus conditions were not similarly analyzed for reliability. In an earlier paper using the same stimuli but where the basic averaged VEP was assessed, this research group had shown response differences between ASD and NT children for the 4 and 8 cycle conditions, but not for the 0.5 and 1 cycle conditions (Milne et al. 2009). Thus, taking both studies into consideration, there were conditions where the standard VEP was indistinguishable between groups and, so far as the analyses provided in the second of these papers allow, support for a neural unreliability account appears to have emerged in only one condition, and mostly within a restricted frequency band around alpha.
Presumably, a general deficit in response reliability would not be expected to distinguish between conditions in this way. Clearly, the case for the unreliability thesis requires further examination.
In work from our own research group using the high-density evoked potential technique across a number of large-cohort ASD studies, sensory-perceptual functioning across all 3 primary sensory modalities has been assessed and we have been struck by the general robustness and highly typical morphology of the evoked potential (Russo et al. 2010; Fiebelkorn et al. 2012; Brandwein et al. 2013; Frey et al. 2013). In most of these studies, early evoked potentials were decidedly similar across the two groups, although as was the case in the Milne et al. (2009) study, specific stimulus manipulations did tease the groups apart. Again, a reasonable question to ask is whether sensory responses across modalities could show such similar morphology in the averaged evoked potential if there were an underlying core deficit in response reliability (i.e., a more variable and noisy response at the single-trial level). If the unreliability thesis is correct, then a number of straightforward predictions can be made about the evoked response: 1) the averaged evoked potential (a) should be broader and should show delayed peaks for all components, and hence (b) the frequency decomposition of the evoked response should show a lower pass-band; 2) there should be greater variability in (a) phase and/or (b) amplitude dispersion across single trials.
Here, we examined these predictions in children on the autism spectrum (N = 20) and a precisely matched cohort of NT children using high-density electrical recordings of the VEP and the somatosensory evoked potential with large trial numbers per participant (average N = 240). To do this, we analyzed the evoked and frequency data at a group level and time frequency at an individual participant level. We then unfurl how the average evoked response and the single-trial predictions are related. Finally, we simulate both a temporal and amplitude jitter at the single-trial level on data from the NT children to illustrate the predictions of the unreliability thesis and to quantify the sensitivity and power of the current measures to small temporal and amplitude perturbations.
Materials and Methods
For each of the somatosensory and visual studies, data are reported from 20 children with ASD and 20 NT children matched for age, gender, verbal intelligence quotient (VIQ), and full scale intelligence quotient (FSIQ) (see Table 1). There was an overlap of 9 ASD and 7 NT participants between the visual and somatosensory conditions. All participants were aged between 7 and 15 years and had normal or corrected-to-normal vision. Exclusion criteria for the NT group included a history of developmental, psychiatric, or learning difficulties as assessed by a parent history questionnaire. All children were screened for attention deficit/hyperactivity disorder (ADHD). NT children were excluded if their parents endorsed 6 or more items of inattention or hyperactivity on a DSM-IV ADHD behavioral checklist. NT children were also excluded if they had a biological first-degree relative with a known developmental disorder. Children with ASD were not excluded for presenting symptoms of inattention and hyperactivity, as such symptoms are very common in ASD, and a diagnosis of an ASD precludes a comorbid diagnosis of ADHD. A list of the ASD participants' medications is included in … (Lord et al. 1994) and the Autism Diagnostic Observation Schedule (Lord et al. 1999), and clinical judgment. Intellectual functioning was assessed using the Wechsler Abbreviated Scales of Intelligence (WASI, Wechsler 1999).
|Age (mean ± SD)||11.2 ± 2.3||10.9 ± 2.3||0.7||11.0 ± 2.3||10.7 ± 2.3||0.7|
|VIQ (mean ± SD)||111.8 ± 15.7||101. ± 17.5||0.04||111.8 ± 12.0||108.4 ± 18.0||0.1|
|FSIQ (mean ± SD)||109.1 ± 12.4||108.4 ± 17.1||0.9||113.5 ± 13.3||105.7 ± 17.5||0.6|
|No. of males||19||19||18||18|
Before participation, informed written consent was obtained from every child's parent or legal guardian, and verbal or written assent was obtained from each child. All procedures were approved by the Institutional Review Boards of the City College of the City University of New York and the Albert Einstein College of Medicine. Participants were given $12.00 an hour for their time in the laboratory. All procedures were consistent with the ethical standards laid out in the Declaration of Helsinki.
Stimuli and Procedure
The visual and somatosensory data presented here are a subset of data from ongoing studies investigating visual and somatosensory processing in typical development and in autism.
Stimuli were 100% contrast black and white checkerboard annuli (6.5 cm diameter, 1 cm width, 4° × 4°, white luminance of 120 cd/m², black luminance of 0.2 cd/m²) centered against a gray (luminance = 25 cd/m²) background. A fixation-cross was always present on the screen, including during checkerboard presentation (see
Participants sat in a dark, sound-attenuated, electrically shielded booth (Industrial Acoustics Company), 90 cm from a 34 × 55 cm LCD computer screen (ViewSonic VP2655wb). They were instructed to minimize head movements and blinking while fixating on the cross at the center of the screen. To ensure fixation, they performed a change detection task in which they were asked to respond to a color change of the fixation-cross (from red to green, lasting 33 ms) with a mouse button press using the right index finger. The presentation of the checkerboard stimuli was temporally unrelated to this central fixation task. The visual stimuli were presented in blocks of 100 stimuli at an ISI of 1050 ± 50 ms (Andrade et al. 2015, 2016).
Somatosensory stimuli of 50 ms duration were generated using an in-house custom-built vibrotactile stimulator. The stimulator comprises a small (4 × 8 mm, 1.1 g) powerful (1.2 G-force, 200 Hz) vibration motor, as typically found in a cellphone, encased in a hard plastic tubing enclosure and affixed to the participant's right wrist along the median nerve using Velcro strapping. Each individual's wrist circumference was first measured and then an additional 2 cm of strapping was added to accommodate the stimulation device (see
Participants sat in a dark sound attenuated booth (Industrial Acoustics Company) while viewing a movie of their choice with sound on during stimulus delivery. They were instructed to ignore the somatosensory stimuli. In all, 500 somatosensory stimuli were presented in a single block, using an ISI of 1000 ms (Andrade et al. 2016; Uppal et al. 2016).
For both experiments, continuous electroencephalographic (EEG) data were recorded using a Biosemi ActiveTwo 70-channel (64 scalp channels and 6 external electrodes) system, at a digitization rate of 512 Hz with an open pass-band from DC to 150 Hz. The continuous EEG was recorded referenced to a common mode sense (CMS) active electrode and a driven right leg (DRL) passive electrode. CMS and DRL, which replace the ground electrodes used in conventional systems, form a feedback loop, thus rendering them references (for a description of the BioSemi active electrode system referencing and grounding conventions, visit www.biosemi.com/faq/cms&drl.htm).
EEG/Event-Related Potential Processing and Analyses
Using custom MATLAB scripts, the continuous data were band-pass filtered offline between 0.1 and 45 Hz (24 dB/octave). Epochs of 1000 ms with 400 ms prestimulus were extracted from the data. An automatic artifact rejection criterion of ±80 μV was applied across all electrodes in the array, and channels with a standard deviation of >0.5 μV were considered bad (Butler et al. 2011; Brandwein et al. 2013). Trials with more than 4 artifact channels were rejected. In trials with fewer than 4 such channels, any remaining bad channels were interpolated using the nearest neighbor spline (Perrin et al. 1987, 1989). The data were re-referenced to the average of all scalp channels (Nunez and Srinivasan 2006) and baseline-corrected from −50 to 0 ms.
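The trial-rejection and baseline steps described above can be sketched as follows. This is a minimal illustration in Python rather than the authors' MATLAB scripts; the function names and the nested-list data layout (a trial is a list of channels, each a list of samples in μV) are assumptions made for the example, and the standard-deviation criterion and spline interpolation are not shown.

```python
def count_artifact_channels(trial, threshold_uv=80.0):
    """Count channels whose absolute amplitude exceeds the threshold anywhere in the epoch."""
    return sum(1 for channel in trial if max(abs(v) for v in channel) > threshold_uv)


def should_reject(trial, threshold_uv=80.0, max_bad_channels=4):
    """Reject a trial when more than `max_bad_channels` channels contain artifacts."""
    return count_artifact_channels(trial, threshold_uv) > max_bad_channels


def baseline_correct(channel, times_ms, start_ms=-50.0, end_ms=0.0):
    """Subtract the mean of the baseline window (inclusive bounds) from every sample."""
    window = [v for v, t in zip(channel, times_ms) if start_ms <= t <= end_ms]
    baseline = sum(window) / len(window)
    return [v - baseline for v in channel]
```

For example, `baseline_correct([1, 1, 3, 5], [-50, 0, 50, 100])` subtracts the baseline mean of 1 from each sample.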
Overview of Analytic Approach
The same analyses were applied to each of the two data sets (visual and somatosensory) with the goal of investigating the consistency of the neural signal in ASD compared with NT controls. A multipronged approach was taken to characterize the data. We first considered the average evoked responses, assessing the presence of between-group differences in the amplitude of the evoked responses (addressing prediction #1a: that an unreliable signal would have broader and delayed components), SNRs (an approach that has been used in previous investigations to assess the reliability of the neural response in ASD: Martineau et al. 1992; Milne 2011; Weinger et al. 2014), and the averaged evoked power spectrum (addressing prediction #1b: that the frequency decomposition of an unreliable signal should have a lower pass-band, e.g., lower frequency peaks in the theta and alpha bands). Next, in our primary test of the reliability of the neural signal, we considered the single-trial data. We derived indices of the consistency of phase (addressing prediction #2a: an unreliable signal should show greater variability of phase dispersion across single trials) and power (addressing prediction #2b: an unreliable signal should have greater variability in amplitude across single trials) for frequencies between 1 and 40 Hz across trials, and assessed the presence of group differences. To illustrate the link between the average evoked response data from prediction 1 and the single-trial data from prediction 2, the individual participant SNR values were then correlated with the ITC values for 3 frequency bands, theta (5–8 Hz), alpha (9–14 Hz), and beta (15–30 Hz).
Analysis of Group Differences in the Average Evoked Responses
Amplitude of the evoked response
Statistical cluster plots (SCPs) comparing ASD with NT responses were generated to assess the presence of significant between-group differences in the amplitude of the responses. To generate SCPs, unpaired t-tests comparing the NT versus ASD evoked response at each time point, for each electrode, were performed. Significant data points (at the P ≤ 0.05 level) were then plotted as a function of time and electrode. To control for Type I errors, only data points that reached significance for at least 10 consecutive time points were included, which, given the 512-Hz digitization rate, excluded effects that did not last for at least 19.5 ms (Guthrie and Buchwald 1991; Brandwein et al. 2013).
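The consecutive-point criterion can be illustrated with a short sketch for a single electrode. The function name and the list-of-P-values input are assumptions for this example, not the authors' implementation. Ten samples at 512 Hz span 10 × 1000 / 512 ≈ 19.5 ms, which is the minimum effect duration stated above.

```python
def consecutive_run_mask(p_values, alpha=0.05, min_run=10):
    """Flag significant time points only when they belong to a run of
    at least `min_run` consecutive significant points."""
    significant = [p <= alpha for p in p_values]
    keep = [False] * len(significant)
    run_start = None
    # A trailing False acts as a sentinel so that a run ending at the last sample is closed.
    for i, is_sig in enumerate(significant + [False]):
        if is_sig and run_start is None:
            run_start = i
        elif not is_sig and run_start is not None:
            if i - run_start >= min_run:
                for j in range(run_start, i):
                    keep[j] = True
            run_start = None
    return keep


SAMPLING_RATE_HZ = 512
MIN_DURATION_MS = 10 * 1000 / SAMPLING_RATE_HZ  # duration spanned by 10 samples
```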
SNR was measured from the global field power (GFP) of the 64 scalp channels for each participant (Lehmann and Skrandies 1980). The background noise was estimated from the prestimulus period of the GFP (−100 to −50 ms). To represent the signal in the stimulus evoked response, GFP was taken from the first major peak of the response (90–140 ms for the visual response and 60–110 ms for the somatosensory response). The squared signal was divided by squared noise and converted to decibels in order to be scale invariant. For each condition, the resulting SNRs were compared between groups using a two-sample Kolmogorov–Smirnov test (Altschuler et al. 2012).
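As a rough sketch of this SNR measure, the example below computes GFP as the spatial standard deviation across channels at each time point and expresses the ratio of squared signal to squared noise in decibels. Summarizing each window by its mean GFP is an assumption of this example (the text does not say how the windows were summarized), and the function names are illustrative.

```python
import math


def global_field_power(channels):
    """GFP at each time point: standard deviation across channels.
    `channels` is a list of channels, each a list of samples."""
    n_channels = len(channels)
    n_times = len(channels[0])
    gfp = []
    for t in range(n_times):
        values = [channels[c][t] for c in range(n_channels)]
        mean = sum(values) / n_channels
        gfp.append(math.sqrt(sum((v - mean) ** 2 for v in values) / n_channels))
    return gfp


def snr_db(gfp, times_ms, noise_window, signal_window):
    """10 * log10(signal^2 / noise^2), with each term the mean GFP in its window (assumed)."""
    def window_mean(lo, hi):
        selected = [g for g, t in zip(gfp, times_ms) if lo <= t <= hi]
        return sum(selected) / len(selected)

    noise = window_mean(*noise_window)
    signal = window_mean(*signal_window)
    return 10.0 * math.log10(signal ** 2 / noise ** 2)
```

For the visual response, the windows given in the text would be passed as `noise_window=(-100, -50)` and `signal_window=(90, 140)`.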
Averaged evoked power spectrum
The evoked power spectrum F(f) at frequency (f) of the average response was calculated using a Fast Fourier Transform convolved with a Hanning window over the 1000-ms epoch centered at 100 ms. For each participant, this yielded a power value for 1–45 Hz with 1 Hz steps. A less reliable evoked response in the ASD participants would result in a broad frequency band. The power spectra of ASD and NT groups were compared using a nonparametric randomization procedure (Maris and Oostenveld 2007).
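A minimal sketch of a Hanning-tapered power spectrum is given below. It assumes the common practice of multiplying the time-domain epoch by the window before the Fourier transform, and it uses a direct discrete Fourier sum instead of an FFT library so that the example stays self-contained; the function names are illustrative. A 1000-ms epoch sampled at 512 Hz contains 512 samples, which gives the 1-Hz frequency spacing described above.

```python
import cmath
import math


def hann_window(n):
    """Symmetric Hanning (Hann) window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def power_spectrum(signal, fs_hz, freqs_hz):
    """Power at each requested frequency of the Hann-tapered signal (direct DFT sum)."""
    window = hann_window(len(signal))
    tapered = [s * w for s, w in zip(signal, window)]
    spectrum = {}
    for f in freqs_hz:
        coefficient = sum(
            x * cmath.exp(-2j * math.pi * f * i / fs_hz) for i, x in enumerate(tapered)
        )
        spectrum[f] = abs(coefficient) ** 2
    return spectrum
```

Calling `power_spectrum(avg_response, 512, range(1, 46))` on a hypothetical averaged response `avg_response` would return one power value per integer frequency from 1 to 45 Hz.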
Reliability of Neural Signal Transmission: Single-Trial Analysis
Between 4 and 40 Hz, the power spectrum Fk(f, t) at frequency (f) was calculated over a sliding time window centered at time (t) for each trial (k) using a Morlet wavelet with linearly increasing wavelet cycles from 1 cycle at 3 Hz to 3 cycles at 40 Hz (Delorme and Makeig 2004). The analysis resulted in 19 linearly spaced frequencies from 4 to 40 Hz with 2 Hz steps from −280 ms to 472 ms with 4 ms steps. From this, event-related spectral perturbations (ERSPs) and ITC were computed. ERSP represents power computed relative to prestimulus baseline for each trial. The ITC values are a measure of the consistency of the phase of the evoked response ranging between 0 and 1, and serve as the primary metric of intertrial reliability here (Tallon-Baudry et al. 1996; Delorme and Makeig 2004; Mercier et al. 2013, 2015). A result near 0 implies low reliability in the phase of the evoked response across trials and 1 implies a perfectly reliable response across epochs. For more detail, see
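To make the ITC metric concrete, the sketch below computes it for a single time-frequency bin as the length of the mean unit phase vector across trials, the standard definition of intertrial phase coherence. The function name and input format (one complex spectral coefficient per trial) are assumptions for illustration.

```python
import cmath


def intertrial_coherence(trial_coefficients):
    """ITC for one (frequency, time) bin.
    Each complex coefficient is normalized to unit length so that only phase
    contributes; the result is 1 for identical phases and near 0 for
    uniformly scattered phases. Coefficients are assumed to be nonzero."""
    unit_vectors = [c / abs(c) for c in trial_coefficients]
    return abs(sum(unit_vectors) / len(unit_vectors))


# The frequency axis described above: 4 to 40 Hz in 2-Hz steps gives 19 frequencies.
FREQS_HZ = list(range(4, 41, 2))
```

Because each trial is normalized before averaging, trial-to-trial amplitude differences do not change the ITC; amplitude variability is captured separately by the ERSP.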
Due to the tradeoff between frequency resolution and time resolution when conducting time-frequency analyses, we conducted a second trial-frequency analysis to investigate the reliability and the power of lower frequencies (1–15 Hz). The power spectrum Fk(f) at frequency (f) for each trial (k) was calculated using a Fast Fourier Transform convolved with a Hanning window over the 1000-ms epoch centered at 100 ms poststimulus. This yielded a power value and phase value for each trial for each frequency from 1 to 15 Hz with 1 Hz steps. As the power was calculated on the whole epoch, amplitude rather than ERSP is reported. The phase values were used to calculate the ITC, in the same way as the main time-frequency analysis. There is an overlap of frequencies from 4 to 15 Hz between the trial-frequency analysis and the time-frequency analysis.
Nonparametric Statistical Comparison of ITC and ERSP Values Across Groups
To statistically compare the ITC and ERSP across groups, a nonparametric randomization procedure was implemented (Maris and Oostenveld 2007). For each group, the average ITC values were calculated. The observed differences between the group averages at each time point and frequency were compared with a reference distribution. The reference distribution was derived by iteratively randomizing participants between the two original data sets and calculating new group averages, which were then subtracted from each other. This procedure was performed 100 000 times to “paint” the distribution. The P value for a randomization test was calculated from the proportion of values in the reference difference distribution that exceeded the observed difference (Fiebelkorn et al. 2011). The same procedure was used to compare the ERSP at each time point and frequency between the groups. A threshold of P < 0.05 at either tail was used to define significance as we made no a priori assumptions about the directionality of possible effects. To control for false positives resulting from the multiple comparisons, P values were corrected using the false discovery rate (FDR) (Benjamini and Hochberg 1995). The FDR is a sequential Bonferroni-type procedure. Because any such correction guards against false positives (Type I errors) at the cost of statistical power (more Type II errors), and because a central premise of the current investigation is that we do not expect to see differences in the reliability of the evoked response between groups, the uncorrected results are also reported in Supplementary materials.
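The randomization test and FDR correction can be sketched for a single time-frequency bin as follows. This is an illustrative version under stated assumptions: the function names are hypothetical, the two-tailed P value is computed from absolute differences, and the default number of permutations is kept small for speed (the text reports 100 000).

```python
import random


def permutation_p_value(group_a, group_b, n_permutations=10000, seed=0):
    """Two-sided randomization test on the difference of group means."""
    rng = random.Random(seed)
    n_a = len(group_a)
    observed = sum(group_a) / n_a - sum(group_b) / len(group_b)
    pooled = list(group_a) + list(group_b)
    n_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign participants to the two groups
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / (len(pooled) - n_a)
        if abs(diff) >= abs(observed):
            n_extreme += 1
    return n_extreme / n_permutations


def benjamini_hochberg(p_values, q=0.05):
    """Benjamini-Hochberg step-up procedure; returns one reject flag per P value."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_rank = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= q * rank / m:
            largest_rank = rank
    reject = [False] * m
    for rank, index in enumerate(order, start=1):
        if rank <= largest_rank:
            reject[index] = True
    return reject
```

In practice the permutation test would be run at every time point and frequency, and the resulting P values passed together to `benjamini_hochberg`.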
Event-Related Potential Analyses
Comparing the Amplitudes of the Average Evoked Responses
The evoked responses showed highly similar morphology between the ASD and NT groups for both visual and somatosensory conditions. Figure 1A shows the mean visual evoked response for ASD (light gray) and NT (dark gray) groups at 3 representative occipital electrode sites. Figure 1B shows the mean somatosensory evoked response for ASD (light gray) and NT (dark gray) groups at a frontal and a left central site (contralateral to the stimulated hand). The semitransparent shading depicts the standard error of the mean (SEM). There were no obvious differences in latency or broadness of the component responses. To highlight the similarity between participants and groups, average evoked response data and single-trial data are shown for two representative ASD participants and their age-matched NT controls in
SCPs comparing ASD with NT responses were generated to assess for the presence of significant differences in the amplitude of the response. Figure 2 shows the resulting SCPs for the visual (A) and somatosensory (B) conditions. This confirms the observation that there were only minimal differences between the groups for both conditions. For visual stimulation, differences were seen over posterior scalp at about 100 ms. For somatosensory stimulation, differences were present from about 70 to 100 ms over left central scalp.
Differences in SNRs have previously been interpreted to indicate reduced neural reliability in ASD (Milne 2011; Dinstein et al. 2012). While there are many possible reasons for lower SNRs, and thus such a finding should be interpreted with caution (see Discussion), for comparison with previous studies we also assessed group differences in SNR in our data. Since the number of trials going into an average can influence SNR, we first submitted the number of accepted trials for the NT and ASD groups to a two-sample t-test. There were no significant differences in the number of trials between the groups for either the visual (P = 0.44) or the somatosensory experiments (P = 0.51). For each condition, the SNRs (see Materials and Methods) were compared using a two-sample Kolmogorov–Smirnov test (Altschuler et al. 2012). These failed to reveal significant differences in SNR between groups for either the visual (P = 0.12) or the somatosensory (P = 0.13) responses. Table 2 shows the mean and standard deviation of the SNR, and number of accepted trials for the ASD and NT groups, for the visual and somatosensory responses.
|SNR||34.2 ± 9.2||29.3 ± 9.1||0.12||19.0 ± 6.2||16.4 ± 8.0||0.13|
|Acc. trials||256.6 ± 82.5||237.4 ± 91.3||0.44||366.2 ± 58.2||377.6 ± 50.5||0.51|
Evoked Power Spectrum
The evoked power spectrum F(f) at frequency (f) of the average response was calculated using a Fast Fourier Transform convolved with a Hanning window over the 1000-ms epoch centered at 100 ms, and compared between the ASD and NT groups using a nonparametric randomization procedure. For each participant, this yielded a power value for 1–45 Hz with 1 Hz steps. Randomization analysis revealed no significant differences between the groups’ evoked power spectra for either the visual or the somatosensory responses (Fig. 3A,B). The visual evoked power spectrum showed peaks in the low theta (4–8 Hz) and low alpha range at all 3 occipital sites for both groups (Fig. 3A). The somatosensory evoked power spectrum showed a peak in the delta and low theta (2–6 Hz) range at left parietal and fronto-central sites (Fig. 3B).
EEG Spectrum Analyses
From Figure 4, the similarity of the ITC values across the NT (top row) and ASD (middle row) groups for both visual (left side) and somatosensory (right side) conditions can be observed.
Visual response: Figure 4A shows the largest visual ITC values at ~100 ms, coinciding with the largest evoked peak at the central occipital sites. Uncorrected comparison of the ITC values between the groups resulted in very few statistical differences (see4A).
Somatosensory response: Figure 4B illustrates the somatosensory ITC values, which were largest at ~80 ms at the left parietal site for both groups, coinciding spatially and temporally with the largest evoked somatosensory response. FDR-corrected statistical comparison of the ITC values between the groups revealed no differences (third row). Overall, the comparison of the ITC data shows highly similar reliability of the evoked response for the ASD and NT groups for both the visual and somatosensory conditions. Uncorrected comparisons (
Examination of Figure 5A reveals that the visual ERSP responses were highly similar between the ASD and NT groups. The visual response for both groups showed an increase in ERSP from ~50 to ~180 ms between 4 and 40 Hz followed by a decrease in power from ~180 to 400 ms between 4 and 40 Hz at the occipital electrode sites (Fig. 5A). Examination of Figure 5B reveals that the somatosensory ERSP responses were highly similar between the ASD and NT groups, with an increase in ERSP from ~50 to ~180 ms between 4 and 40 Hz followed by a decrease in power from ~180 to 400 ms between 4 and 40 Hz. Consistent with these observations, statistical comparison of ERSP values between the two groups failed to reveal significant differences for either visual or somatosensory responses. The results here illustrate highly similar responses for specific channels; in two supplementary videos, the evoked, ITC, and ERSP data and the FDR-corrected comparison for both groups are presented for each of the 64 channels for the visual (
Supplementary Time-Frequency Analysis
To account for possible differences due to volume conduction (Milne 2011), the single-trial data were transformed using a second-order spatial filter (Butler et al. 2011), the current source density (CSD), and then ITC and ERSP analysis as described above was performed on NT and ASD groups (see
Because each participant had, on average, over 200 trials per condition, within-participant statistical analysis was possible. … (Makeig et al. 2002) for the visual and somatosensory conditions, respectively. The single participant ITC and ERSP results show a high number of individual participants in each group with statistically significant poststimulus activity, coinciding with the main ITC and ERSP responses shown in Figures 4 and 5. This further emphasizes the robust nature of the results at a single participant level and the similarities across the groups. In two supplementary videos, single participant ITC and ERSP significance data and the subtraction of the groups are presented for each of the 64 channels for the visual (
Testing the Robustness of the Null Effects in the ITC and ERSP Data
A Bayes factor analysis was conducted to investigate evidence for the null hypothesis (that there is no difference between the ASD and NT groups) or the alternative hypothesis (that there is a difference between ASD and NT). The Bayes factor analysis is an alternative to a post hoc power analysis but has the benefit that it takes into account the sensitivity of the data to distinguish between the null and alternative hypotheses (Dienes 2014, 2016). The Jeffreys–Zellner–Siow (JZS) Bayes factor was computed using the default scale factor of 0.707 for the effect size prior (Rouder et al. 2009). A JZS Bayes factor can be read such that a value greater than 3 favors the null hypothesis at least 3 times more than the alternative hypothesis, a value less than one-third favors the alternative at least 3 times more than the null, and values between one-third and 3 suggest that there is not enough evidence to favor either. To weigh the null against the alternative hypothesis for the ITC and ERSP data between the groups, an exploratory JZS Bayes factor was calculated for all frequencies at each time point (Rouder et al. 2009). The JZS Bayes factor analysis yielded no evidence for the alternative hypothesis at the peaks of activity, but showed periods of evidence in favor of the null hypothesis for the ITC and ERSP data, suggesting that the analysis is sufficiently sensitive to show similarities between the ASD and NT groups for the visual and somatosensory conditions.
Correlation Analysis Between the Single-Trial Data and the Mean Data
To investigate the relationship between average and single-trial measures of the reliability of the evoked response, Pearson's correlation coefficients were computed between the SNR and mean ITC values in the theta (4–8 Hz), alpha (8–14 Hz), and beta (14–30 Hz) bands from 90 to 140 ms for the visual response and 60–110 ms for the somatosensory response at 3 occipital electrode sites for the visual condition and at the left and frontal central electrode sites for the somatosensory condition. This analysis revealed significant correlations between SNR and ITC values (Fig. 6 and Table 3). For the visual condition, SNR was significantly correlated with both theta and beta ITC values for both groups at all electrode sites. Similarly, for the somatosensory condition SNR was significantly correlated with theta, alpha, and beta ITC values across both groups. These results show the strong link between single-trial variability and the subject mean data for the visual and somatosensory conditions across both groups. These data also attest that the lack of statistical differences between the groups observed at the average evoked level and the single-trial level is not due to variability in the single participant data but is due to correspondence of the groups.
|Left occipital||Central occipital||Right occipital||Left parietal||Frontal central|
†P < 0.1, *P < 0.05, **P < 0.01, ***P < 0.001.
Low Frequency ITC and Power Analysis
Low frequency ITC data
Figure 6 shows the ITC (top row) for frequencies between 1 and 15 Hz computed over a 1-s epoch for both groups at 3 occipital sites for the visual condition (A) and the left central and frontal central sites for the somatosensory condition (B).
In the visual condition, the largest ITC values were observed between 3 and 8 Hz for both groups at the 3 occipital sites, which coincided with the peak power of the average evoked spectrum (Fig. 3). The ASD and NT groups had similar ITC values. The unpaired randomization tests revealed uncorrected statistical differences in the alpha range at the right occipital site only, with larger ITC values in the NT group than in the ASD group at 3 frequencies (9 Hz, P = 0.02715; 13 Hz, P = 0.029; 14 Hz, P = 0.015).
Similar to the visual experiment, for the somatosensory experiment the largest ITC values were observed between 3 and 8 Hz for both groups, which coincided with the peak power for the average evoked spectrum (Fig. 3). The ASD and NT groups had very similar ITC values and the unpaired randomization tests revealed no statistical differences.
Low Frequency Trial Power data
Figure 7 shows the trial power values (bottom row) for frequencies between 1 and 14 Hz computed over 1-s epoch for both groups at 3 occipital sites for the visual response (A) and the left central and frontal central sites for the somatosensory response (B).
In the visual experiment, power values in the alpha range (8–12 Hz) deviated from the 1/f fall off. The unpaired randomization tests revealed no statistical differences in the power between the NT and ASD groups.
Similar to the visual experiment, in the somatosensory experiment the power values in the alpha range (8–12 Hz) deviated from the 1/f fall off. The unpaired randomization tests revealed no statistical differences in the power between the NT and ASD groups.
Simulation Analysis for Temporal and Amplitude Variability: Modeling the Unreliable Stimulus Evoked Response
To illustrate the effects of an unreliable evoked response, temporal and amplitude jitter were introduced to the observed NT data set. To simulate variability in the amplitude of the evoked response, a gain factor was applied to the observed NT data: each trial (k) response was convolved with a ramping function that had a value of 1 in the prestimulus interval, began to change at ~10 ms poststimulus, and ramped up or down to a gain value chosen randomly on a trial-by-trial basis, plateauing at ~100 ms. To simulate temporal variability, a temporal jitter was introduced to the epoched data for each trial (k):
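A minimal sketch of how such a simulation could be implemented is given below. It rests on several assumptions that are not taken from the text: latency shifts and gains are drawn from uniform distributions, the shift is a pure delay with edge padding, the ramp is linear, and the gain is applied as a pointwise multiplication rather than a convolution. All function and parameter names are hypothetical.

```python
import numpy as np

def gain_ramp(times_ms, gain, ramp_start_ms=10.0, ramp_end_ms=100.0):
    """Equals 1 up to ramp_start_ms, ramps linearly to `gain`, then plateaus."""
    frac = np.clip((times_ms - ramp_start_ms) / (ramp_end_ms - ramp_start_ms), 0.0, 1.0)
    return 1.0 + (gain - 1.0) * frac

def simulate_unreliable(trials, times_ms, sfreq,
                        jitter_ms=(2.0, 20.0), gain_range=(0.5, 1.5), seed=0):
    """Apply a random delay and a random gain ramp to every trial.

    trials   : array of shape (n_trials, n_times)
    times_ms : array of shape (n_times,), epoch time axis in ms (0 = stimulus)
    """
    rng = np.random.default_rng(seed)
    trials = np.asarray(trials, dtype=float)
    out = np.empty_like(trials)
    for k, trial in enumerate(trials):
        # Temporal jitter: delay the trial by a random number of samples.
        shift = int(round(rng.uniform(*jitter_ms) * sfreq / 1000.0))
        shifted = np.roll(trial, shift)
        shifted[:shift] = trial[0]  # pad the start instead of wrapping around
        # Amplitude variability: random plateau gain for this trial.
        gain = rng.uniform(*gain_range)
        out[k] = shifted * gain_ramp(times_ms, gain)
    return out
```

Setting `jitter_ms=(0.0, 0.0)` isolates the amplitude manipulation, and setting `gain_range=(1.0, 1.0)` isolates the temporal manipulation.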
Figure 9A,B shows the average and standard error of the visual and somatosensory evoked responses for the observed ASD (red) and NT (green) data and for the simulated data (blue), at the central occipital site for the visual condition (A) and the frontal central site for the somatosensory condition (B). Columns correspond to 4 latency jitter distributions: (1) 0 ms, (2) 2–10 ms, (3) 2–20 ms, and (4) 2–50 ms. Rows correspond to 4 amplitude gain ranges: (1) 1, (2) 0.75–1.25, (3) 0.5–1.5, and (4) 0.25–1.75. Each plot therefore represents one combination of temporal jitter and amplitude gain variability. The top row represents simulations of only temporal jitter and the first column represents simulations of only amplitude gain. For both conditions, as the width of the temporal jitter increased, the evoked response was smoothed, with broader peaks that were delayed in latency, while the amplitude variability did not have an observable impact on the average evoked response.
Single-Trial Analysis of Simulated data
Figure 10 shows mean ITC values for the 4 simulated latency jitters (columns) and 4 simulated amplitude variabilities (rows) at the central occipital electrode site for the visual condition (A) and the frontal central site for the somatosensory condition (B). Each plot represents one combination of temporal jitter and amplitude gain variability. The top row contains simulations of only temporal jitter and the first column contains simulations of only amplitude gain.
As in Figure 4, Figure 10 shows the largest mean ITC values for the visual condition (A) range from ~50 to ~200 ms while for the somatosensory condition (B) the largest ITC values range from ~80 to ~160 ms between 4 and ~40 Hz for the simulated latency jitter data. The consistency of the higher frequency ITC values between the observed and simulated data decreases as the range of the temporal jitter increases, while the amplitude variability did not have a visible impact on the ITC values of the simulated data.
Figure 11 shows FDR-corrected nonparametric statistical comparisons of the ITC values between the original observed ASD data and the NT data simulated with the 4 temporal jitters and 4 amplitude gains. The plots show a clear significant difference (indicated in white) in both conditions between the groups for the largest temporal jitter, from 2 to 50 ms (fourth column), at all 4 amplitude gains. Since temporal jitter should result in a delay in the peak of the initial evoked response, as well as a reduction in amplitude, it is not surprising that the significant differences began coincident with the first evoked response for both conditions. Uncorrected statistical comparison showed differences in ITC between the ASD and simulated data for the smallest temporal jitter of 2–10 ms coinciding with the first evoked peak (
Figure 12 shows the ERSP values for the NT data simulated with temporal jitter and amplitude variability. The temporal jitter (columns) did not have a large impact on the ERSP values, while the amplitude variability resulted in an increase in ERSP values from ~50 to ~140 ms between 4 and 40 Hz followed by a decrease in ERSP values from ~150 to 400 ms between 4 and 30 Hz for both conditions.
Figure 13 shows the FDR-corrected comparison of the simulated data with the ASD data; the analysis revealed larger ERSP values for the simulated data. For the visual response, there were significant differences between the observed data and the data simulated with the largest amplitude gain range (0.25–1.75) at frequencies above 20 Hz, onsetting at ~100 ms. For the somatosensory response, there were statistical differences onsetting at ~70 ms across all frequencies for the two simulated data sets with the variable amplitude gain ranges (0.5–1.5) and (0.25–1.75). The uncorrected comparisons illustrated in
A temporal jitter in the simulated data resulted in significant differences in single-trial reliability (ITC). Similarly, introducing amplitude variability in the simulated data resulted in significant differences in the single-trial power (ERSP). These simulated effects of signal variability clearly illustrate the theoretical predictions of an unreliable evoked response, predictions that are not supported by comparison between the ASD and NT responses for either visual or somatosensory stimulation.
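As a minimal sketch of how an FDR correction can be applied to a time-frequency map of P values, the block below implements the Benjamini-Hochberg step-up procedure. The choice of this particular procedure, the alpha level, and the function name are assumptions for illustration; the text does not state which FDR variant was used.

```python
import numpy as np

def fdr_bh(p_values, alpha=0.05):
    """Benjamini-Hochberg FDR control.

    p_values : array of any shape (e.g., frequencies x time points).
    Returns a boolean array of the same shape, True where the null is rejected.
    """
    p = np.asarray(p_values, dtype=float)
    flat = p.ravel()
    m = flat.size
    order = np.argsort(flat)
    ranked = flat[order]
    # Step-up thresholds: alpha * rank / m for rank = 1..m.
    thresholds = alpha * np.arange(1, m + 1) / m
    below = ranked <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        # Reject every hypothesis up to the largest rank that passes.
        k = np.nonzero(below)[0].max()
        reject[order[:k + 1]] = True
    return reject.reshape(p.shape)
```

The returned mask has the same shape as the input map, so it can be overlaid directly on a time-frequency plot to mark the points that survive correction.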
The electrophysiological data presented here point to a very high degree of reliability in the evoked responses of individuals with an ASD, a degree of reliability that is essentially indistinguishable from that measured in age-, gender-, and IQ-matched NT controls, contrary to the predictions of the neural unreliability thesis. We outline in the next sections that while there is certainly accumulating evidence for specific atypicalities in sensory processing in ASD, these processing differences do not conform to an account that proposes a straightforward, indiscriminate bottom-up failure, as implied by the unreliability thesis. Rather, more nuanced models will be required to account for the specificity of the sensory processing deficits that are emerging in ASD.
There are a number of models of autism that can be classed as domain general in that they propose system-wide neuropathological deficits of one variety or another. For example, the “central coherence” thesis proposes a deficit in gestalt or global processing, and as such, all assays of said functions would be expected to reveal deficits (Frith and Happe 1994). The unreliability thesis falls into this class of model in that a general deficit in producing consistent responses to events in the environment would surely be expected to impact the responsivity of all sensory systems and to be largely agnostic as to stimulus class or feature content. In the paper of Dinstein et al. (2012), this is precisely the implication, since they reported unreliability of the BOLD response across all 3 major sensory systems. The current work makes it clear, however, that trial-to-trial reliability of the sensory evoked response in ASD, in this case to simple visual and somatosensory inputs, is indistinguishable from that recorded in NT children. Clearly, when the goal of a study is to establish that a difference does not exist, there is the ever-present issue of embracing the null hypothesis. In turn, it is at least possible that small subtle effects will have been missed. The important question, however, is whether subtler differences in reliability would have any real implications for processing and if they could truly be responsible for the complex phenotype we observe in ASD. We consider this to be highly unlikely.
There are a number of other reasons to question a domain general unreliability account. For instance, it is difficult to reconcile such an account with a growing body of work that shows typical sensory thresholds, and in some cases, even more acute sensory processing skills in ASD when specific stimulus conditions pertain. For example, Blakemore et al. (2006) assessed somatosensory detection thresholds in individuals with Asperger syndrome, finding no differences in their thresholds relative to neurotypicals for a 30-Hz stimulus but greater sensitivity in the ASD group for a 200-Hz input. While studies claiming enhanced visual acuity in ASD have been rightly debunked, what is clear is that visual acuity appears to be substantially normal in ASD, a finding supported by a large number of studies (see Albrecht et al. 2014 for discussion). In our own work, we recently found that recognition of monosyllabic words buried in varying levels of background noise was entirely typical in a large cohort of ASD children ranging in age from 6 to 17 years (Foxe et al. 2015). It seems logical that neural unreliability would be expected to detrimentally impact such functions across the 3 major sensory systems, and so these and the many other studies reporting normal sensory functions simply do not fit well with the basic predictions of the model.
If we turn to the event-related potential (ERP) literature in ASD, we again find patterns of results inconsistent with the central premise of the unreliability theory. Clearly, a full exposition of ERP studies conducted in ASD populations to date is beyond the scope of this paper, but let us consider a few examples. Before doing so though, it is worth considering again what the unreliability thesis would presumably predict regarding the ERP. In most of the prior work, only the traditional averaged evoked potential was shown, and as Dinstein and colleagues argued in the context of hemodynamic responses, a lack of differences between groups in the average evoked response might belie the fact that there are real differences in within-subject trial-to-trial variability. However, in the case of the ERP, unlike the BOLD response, trial-to-trial variability would be fully expected to lead to observable differences in the averaged response. This can be readily seen in the simulation data we present here. As is intuitive, the temporally unreliable simulated data resulted in a broader averaged evoked response and lower reliability of the evoked response across single trials than the observed data, as measured by ITC. The simulation of variable amplitude resulted in no change of the average evoked responses but higher power across single trials than the observed data as measured by ERSP. Furthermore, the simulated data illustrate the sensitivity of such time-frequency analyses to capture temporal jitters and amplitude variability in the evoked response. These differences were simply not in evidence when we compared the NT and ASD groups’ data. Neither are we aware of any previous ERP study that claims to show slowing and broadening of the ERP. Rather, with relative consistency, the averaged evoked response shows a highly typical pass-band (i.e., frequency spectrum).
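The intuition behind these simulation outcomes can be made explicit with a brief idealized derivation. It is a sketch under assumptions introduced purely for illustration, not a description of the analysis: a fixed evoked waveform $s(t)$, zero-mean noise $n_k(t)$ with variance $\sigma_n^2$, a per-trial latency shift $\tau_k$ with density $p_\tau$, and a per-trial gain $g_k$ with mean 1 that is independent of the noise.

```latex
% Latency jitter only: x_k(t) = s(t - \tau_k) + n_k(t)
\mathbb{E}\left[x_k(t)\right]
  = \int s(t - \tau)\, p_\tau(\tau)\, d\tau
  = (s * p_\tau)(t)

% Amplitude variability only: x_k(t) = g_k \left( s(t) + n_k(t) \right)
\mathbb{E}\left[x_k(t)\right] = \mathbb{E}[g_k]\, s(t) = s(t)

\mathbb{E}\left[x_k(t)^2\right]
  = \mathbb{E}[g_k^2] \left( s(t)^2 + \sigma_n^2 \right)
  = \left( 1 + \mathrm{Var}(g_k) \right) \left( s(t)^2 + \sigma_n^2 \right)
```

Under these assumptions, jitter replaces the average response with a smoothed copy of itself, which is broader and, when all shifts are delays, later. Gain variability leaves the average unchanged but inflates single-trial power by the factor $1 + \mathrm{Var}(g_k)$.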
For example, in a recent study where we recorded the VEP to central and peripheral visual inputs, the VEP response to centrally fixated stimuli was of entirely typical amplitude, morphology and spectral content in ASD children. The main finding of the study, however, was that the VEP to more peripheral stimulation was actually significantly more robust in ASD, a finding that could well be interpreted to reflect more reliable neural responses (Frey et al. 2013). Another nice example comes from a study by Jemel et al. (2010) where they used the VEP to assess the contrast response function of the visual system to inputs of varying spatial frequency. They showed that the VEP was indistinguishable between ASD adults and their NT controls to both low and high spatial frequency inputs but that the ASD response to medium frequency inputs was similar to that recorded to high frequency inputs. Thus, the VEP was entirely robust under all testing conditions, but there were idiosyncratic response characteristics in ASD that pointed to highly specific differences in sensory processing. The point here, as above, is not to exhaustively survey all ERP studies in ASD, but to point to just a few examples of the many studies where highly typical and robust ERPs were recorded. In this context, it is just as important to note that there are also many reports of differences in the sensory processing of auditory and visual information in ASD (Gandal et al. 2010; Dinstein et al. 2011; McFadden et al. 2012; Edgar et al. 2013; Roberts et al. 2013; Brandwein et al. 2015; Murphy et al. 2014; Port et al. 2016). The point that we wish to make here is simply that such findings are not universal and thus differences between ASD and NTs cannot result from a blanket unreliable evoked response but are much more likely to result from more complex stimulus-specific interactions.
Limitations of Hemodynamic Imaging Techniques to Address the Unreliability Thesis
The most prominent of the studies to propose an unreliable evoked response in people with ASD was conducted using neuroimaging techniques (Dinstein et al. 2012). One obvious issue with inferring response unreliability from measures of cerebral blood flow relates to the complex sequence of neurovascular and metabolic events that is intermediary between the primary neural electrical activity and coupled changes in blood flow. It is certainly conceivable that this coupling could be more variable in ASD than it is in NTs, and it is also entirely plausible that variability in this coupling could occur without any measurable loss in the fidelity of the primary signal. Certainly, the cascade of signals that result in increased blood flow following neural activity undergoes significant changes across development (Harris et al. 2011). Another issue pertains to the manner in which the shape of the BOLD response is typically estimated, with the vast majority of studies relying on standard general linear models and assumptions about canonical hemodynamic response morphologies. Ongoing work suggests that this is almost certainly an oversimplification (e.g., Magri et al. 2011). This is not an issue in EEG, as no assumption is made about the shape of the evoked response in the analysis.
The BOLD response represents an integrated proxy measure of all activity, including feedforward, feedback, and later cognitive processes, which are collapsed across time given the limited temporal resolution of the measure. Thus, one could well imagine a scenario where the ERP teases apart early from late processes. One way to study this issue going forward would be to make simultaneous hemodynamic and EEG recordings to assess whether variability in the BOLD response within an individual has real impact on the simultaneously recorded electrical response.
With regard to the Dinstein study, it is also worth pointing out that 5 of their ASD participants were taking medication, which could certainly have impacted neurovascular coupling, although a limited analysis of the remaining 7 participants supported the main findings of the paper. Nonetheless, selective serotonin reuptake inhibitors, a relatively commonly used class of drugs in ASD, have been repeatedly associated with significant changes in the BOLD response (Windischberger et al. 2010). Again, it seems entirely possible that medication-driven changes in neurovascular responsivity could be uncoupled from the feedforward electro-cortical sensory response.
Considerations When Comparing NT and ASD Differences
In the current study, the signal-to-noise ratio, number of accepted trials, ERSP, and ITC were not statistically different between the ASD and NT groups for either the somatosensory or visual responses. Furthermore, correlation analysis showed a relationship between the SNR and the ITC values for different frequency bands, which was highly similar across the groups. While this relationship is not surprising and points to SNR being an indirect measure of single-trial consistency, it is not a one-to-one mapping. In future studies where there are SNR differences between the groups, it will be very important to note the number of accepted trials, as this has an impact on both ITC and SNR (see Cohen 2014). Furthermore, it is important to address the source of the differences and whether they are cortical in nature or result from artifacts such as muscle tension or participant movement. One way of addressing this in electrophysiological recordings is employing signal processing methods such as independent component analysis to separate artifactual activity from cortical activity (Delorme and Makeig 2004; Milne et al. 2009, 2012; Nolan et al. 2010; Delorme et al. 2012).
Another consideration when investigating differences between ASD and NT groups pertains to differences in neuropsychological data such as diagnosis severity or IQ (Dinstein et al. 2012; Brandwein et al. 2015; Flevaris and Murray 2014). For example, Dinstein et al. (2012) found a correlation between FSIQ and the SNR of the sensory response in ASD participants, but they did not conduct the analyses on the NT participants. We did not find a significant correlation between FSIQ and SNR for either group (see
We employed sensitive measures capable of detecting small variations in neural reliability and were unable to find differences in the neural responses to visual and somatosensory stimuli in a large sample of NT and ASD children. We conclude that support for a domain general neural unreliability thesis of Autism is not compelling at this stage and that sensory processing deficits in this population are likely to be specific to particular stimulus classes and task contexts.
This work was supported by a grant from the U.S. National Institute of Mental Health (NIMH RO1 MH085322 to J.J.F. and S.M.). Additional support was provided by The Nathan Gantcher Foundation (J.J.F. and S.M.) and The Women's Division of The Albert Einstein College of Medicine. Participants in this study were recruited and evaluated at The Human Clinical Phenotyping Core, a facility of the Rose F. Kennedy Intellectual and Developmental Disabilities Research Center (IDDRC) which is funded through a center grant from the Eunice Kennedy Shriver National Institute of Child Health & Human Development (NICHD P30 HD071593).
We extend our deepest thanks to the families involved in this research for their time, patience, and care. We thank Frantzy Acluche, Victor Del Bene, and Alice Brandwein, and all the members of our research team who assisted in data collection for this project.
Conflict of Interest: None declared.