Predicting the fMRI signal fluctuation with echo-state neural networks trained on vascular network dynamics

Resting-state functional MRI (rs-fMRI) studies have revealed specific low-frequency hemodynamic signal fluctuations (<0.1 Hz) in the brain, which could be related to oscillations in neural activity through several mechanisms. Although the vascular origin of the fMRI signal is well established, the neural correlates of global rs-fMRI signal fluctuations are difficult to separate from other confounding sources. Recently, we reported that single-vessel fMRI slow oscillations are directly coupled to brain state changes. Here, we used an echo-state network (ESN) to predict the future temporal evolution of the rs-fMRI slow oscillatory feature from both rodent and human brains. rs-fMRI signals from individual blood vessels that were strongly correlated with neural calcium oscillations were used to train an ESN to predict brain state-specific rs-fMRI signal fluctuations. The ESN-based prediction model was also applied to recordings from the Human Connectome Project (HCP), which classified variance-independent brain states based on global fluctuations of rs-fMRI features. The ESN revealed brain states with global synchrony and decoupled internal correlations within the default-mode network.


INTRODUCTION
Neural oscillations have been extensively studied in both animal and human brains from cellular to systems levels [1][2][3][4] . Power profiles of EEG signals, as well as slow cortical potentials (SCP), exhibit a slow oscillation feature (<1 Hz), which is related to brain states mediating memory, cognition and task-specific behaviors [5][6][7] . Resting-state functional MRI (rs-fMRI) studies have revealed low-frequency hemodynamic signal fluctuations (<0.1 Hz) [8][9][10][11] , which have been confirmed by intrinsic optical imaging 12 , laser-doppler-flowmetry 13 , and near-infrared spectroscopy 14 . In particular, specific spatial correlation patterns can be observed in the slow oscillation of the rs-fMRI signal, e.g. the default-mode network (DMN) [15][16][17] . Concurrent fMRI and electrophysiology studies have shown a correlation of the fMRI signal fluctuation with the EEG signal power profile and SCP low-frequency oscillations, which are candidates for neural correlates of the rs-fMRI signal [18][19][20][21][22][23] . In addition, the slow oscillation of rs-fMRI and hemodynamic signals from vessels are highly correlated to simultaneously acquired intracellular Ca 2+ signal fluctuations in rodents [24][25][26] , which are higher-resolution correlates of the hemodynamic rs-fMRI signal.
Efforts have been made to interpret functional indications of the rs-fMRI spatial correlation patterns, including the dynamic correlation mapping scheme [27][28][29] , and arousal state-dependent global fMRI signal fluctuation studies [30][31][32] . Because of the high variability in different dynamic states, physiological and non-physiological confounding factors also contribute to the rs-fMRI lowfrequency oscillation [33][34][35] . In particular, global fMRI signal fluctuations are one of the most controversial oscillatory features to be linked to dynamic brain signals [36][37][38][39][40][41][42][43][44] . For example, the rs-fMRI signal from the white-matter tract has been used as a nuisance regressor to remove the global noise contribution 45,46 . Interestingly, simultaneous fMRI and EEG studies in the monkey brain demonstrate a strong linkage of brain state changes to the global rs-fMRI signal fluctuations. This phenomenon has been observed at the level of single-vessel fMRI dynamic mapping with concurrent calcium recordings, which show stronger neural correlation with the fMRI signal detected from individual penetrating vessels than the rest of voxels through the whole rodent cortex 24 . This highly coherent vessel-specific fMRI signal fluctuation is a direct signal source that is closely linked to global brain state changes. Here, we applied the artificial state-encoding neural network system in a prediction scheme to better model the brain state-specific coherent oscillatory features from the vessel voxels.
The echo state network 47 (ESN), a recurrent neural network (RNN) based on reservoir computing 48,49 , provides a computational framework for temporally predicting dynamic brain signals. The ESN's main component is a dynamic reservoir consisting of recurrently connected computational nodes (neurons) that encode temporal patterns of input signals, i.e. the vesselspecific rs-fMRI signal, into a state matrix. The second element of an ESN is a linear decoder, which generates predictions based on reservoir's internal states. ESNs have been successfully used for time series prediction 50 , estimating directed connectivity 51 modeling nonlinear systems 52 and superimposed oscillators 53 , and local cortical dynamics 54 . Other artificial neural networks have been applied to fMRI data to encode brain dynamics with the goal of characterizing psychiatric diseases 55,56 , modeling task or sensory-evoked activation [57][58][59] and decoding task or stimuli properties from fMRI activity 60,61 . Artificial neural networks can depict dynamic brain signals over a range of time scales and contexts [62][63][64][65][66] .
In the present study, ESNs were trained to predict dynamic changes of single-vessel rs-fMRI signal fluctuations from the brains of rats and humans. fMRI recordings from single blood vessels 24,67 were used to extract highly correlated vessel-specific fMRI signals from venules or veins, which have been shown to be directly correlated to the underlying intracellular Ca 2+ signal fluctuation from neurons in rat brains 24 . Vessel-specific fMRI signals were used as training data to extract highly correlated slow oscillation features with varied noise profiles and extract brain state-dependent global fMRI signals. The ESN reservoirs and decoders predicted the temporal evolution of slow oscillations of the fMRI signal 10 seconds into the future. The trained network differentiated the global fMRI signal fluctuation from the DMN-specific temporal dynamic patterns in the Human Connectome Project (HCP) data 68 .

RESULTS
Two datasets were used in our study, one from rodents and another from humans. In the first part of the study, we trained an ESN to encode temporal dynamics of BOLD-fMRI signals from individual vessels in anesthetized rat brains to estimate the prediction efficiency. In the second part of the study, we trained an ESN to predict the slow oscillation of fMRI signals from occipital lobe sulcus veins of awake human subjects and applied the ESN trained on human data to classify the brain-state changes from rs-fMRI data acquired by the Human Connectome Project.

Extracting slow oscillatory features of the single-vessel fMRI signal from rat brains
We used recordings obtained from a balanced steady-state free precession (bSSFP) sequence 69 on single-vessel fMRI data from anesthetized rats 24 . Arteriole-venule (A-V) maps based on the multi-gradient-echo (MGE) sequence were acquired to localize individual venules penetrating the cortex, which were shown as dark dots due to the fast T2* decay of the deoxygenated blood (Fig. 1a) 67 . After registering functional data with the A-V map, fMRI time courses from individual venules were extracted and analyzed using independent component analysis (ICA) [70][71][72] . Fig. 1b shows the time series of the largest ICA component, which is dominated by the low frequency fluctuation (<0.1 Hz). The superposition of this ICA component with the single-vessel fMRI signal fluctuation on the A-V map overlapped with venule-dominated patterns (Fig. 1c). Fig. 1d shows the raw bSSFP-fMRI signal fluctuation from three venules, as well as their power spectral density (PSD) plots. These data presented highly coherent oscillatory features of single-vessel fMRI signals, which can be used as a training set. Fig. 2 illustrates the basic schematic of the ESN-based prediction. The single-vessel fMRI signals showing a strong slow oscillatory correlation (Fig. 1) were used as input time series for the supervised training. As described in the Methods section, a recurrent network-based reservoir was predefined to encode the state of the input signal's temporal dynamics. As the key component of the reservoir, the state-weighting vector was optimized to produce output predictions based on supervised training. The targets of the output were bandpass-filtered fMRI signals from the voxels of the same vessel with a 10 s time shift. Pearson correlation analysis was performed to estimate the correlation coefficient (CC) between ESN's output predictions and the filtered target signals, to measure of the ESN's performance. We used random search 73 and cross validation to find the set of hyperparameter values that produced the best performing echo state network (ESN) (Fig.   S1, see details in the Methods section).

Fig. 2 | Prediction system operation pipeline.
Raw vascular data are extracted from fMRI data using venule and ICA masks. These temporal signals are inputs to the ESN; they are also bandpass filtered and shifted by 10 seconds to become target outputs of the network. The reservoir encodes the temporal dynamics of input signals into state vectors. The decoder interprets these states and generates a prediction of the slow fluctuation's value 10 seconds ahead. After generating the full predicted time series, the prediction is compared with the target output using Pearson's correlation coefficient.

ESN-based single-vessel fMRI slow oscillation prediction in anesthetized rats
We first illustrate the predictive capacity of the trained ESN by analyzing the correlation coefficients across all cross-validation tests. Fig. 3a demonstrates the CC of the slow oscillation prediction of all vessels from a representative rat. For each vessel, we generated a surrogate control time course that mimicked the frequency power profile of the fMRI signal. To differentiate the control dataset from true brain dynamic signals, we randomized the phase distribution of its ). c, The signal from a single vessel with the best prediction score (CC = 0.53, t lag =-2 s; black -raw data, green -target signal, blue -network prediction). d, Surrogate signal created to match the real vascular signal shown in c (CC = 0.32, t lag =-1 s; black -raw data, green -target, red -network prediction). e, Mean prediction scores for trials extracted from five rats (blue) and their corresponding controls (red). f, Significantly higher mean of different rats' real data prediction scores (CC=0.29 ± 0.01 s.e.m.) than controls (CC = 0.24 ± 0.01 s.e.m.; paired-sample t-test, p=9.5*10 -26 ). g, Predictions of singlevessel signals from two different rats (v1, CC = 0.55, t lag =-3 s ; v2, CC = 0.55, t lag = 0 s). frequency components 74,75 (Fig. S2, see Methods). The ESN prediction performance showed significantly higher mean CC for fMRI data (CC = 0.34 ± 0.01 s.e.m.) than surrogate controls (CC = 0.24 ± 0.01 s.e.m.) (Fig. 3b). Fig. 3c shows the predicted ESN time course from the vessel with the highest prediction score (CC = 0.53, tlag = -2 s) in contrast to the surrogate control signal corresponding to the same vessel (CC = 0.32, tlag= -1 s). This shows that the trained ESN was better at predicting the fMRI signal fluctuations.
In addition, the ESN trained on one rat was used to predict the fMRI fluctuation of five different rats. Fig. 3e demonstrates trial-specific plots of mean CCs from all vessels in comparison to their surrogate controls (380 vessels from 5 rats), showing significantly higher CC of the fMRI signal than that of surrogate controls (Fig. 3f).

ESN-based single-vessel fMRI slow oscillation prediction in awake human subjects
As previously reported 24,76 , the fMRI signal from sulcus veins of the occipital lobe demonstrated highly correlated slow-oscillatory features (Fig. 4). The vein-specific rs-fMRI signal fluctuations were recorded with high-resolution EPI-fMRI with 840 x 840 µm in-plane resolution In contrast to the multi-trial single-vessel rat fMRI studies, only one trial (15 min) was acquired from each human subject (159 veins from 6 subjects). To perform the supervised training, we designed the 5+1 cross-subject validation process (trials from 5 subjects were used for training, and the sixth trial was used for test validation). Specific surrogate control time courses were created based on PSD profiles of fMRI signals acquired from individual veins in the human brain. Using the trained ESN, higher CC values were obtained by predicting slow oscillatory fMRI signals of individual veins compared to their surrogate controls (Fig. 5a), demonstrating a significantly higher mean CC value for brain dynamic signals (CC = 0.33 ± 0.01 s.e.m.) than for control datasets (CC = 0.27 ± 0.01 s.e.m.) (Fig. 5b). Also, the histogram of the cross-correlation lag times of the predicted and reference time courses showed the median of the lag time equal to 0, demonstrating the effective prediction. Figure 5d shows an example of a predicted slow oscillatory time course from a human subject based on the trained ESN (CC = 0.54, tlag = -2 s).  The trained ESNs predicted artificial time courses with a range of peak frequencies and spectral widths (Fig. S4a,b). The predicted spread of the signal spectra preference for the ESNhuman was greater than for ESNrat as shown in the two-dimensional graphs of peak vs. width of the CC distribution (Fig. S4c,d). These species differences may reflect the difference in their rs-fMRI. Interestingly, the harmonic patterns had negative correlations for the preferred frequency, which could be a consequence of the trained ESNs favoring the dominating frequency ranges with the 10 s prediction interval. ). c, Histogram of lags at which the correlation between target outputs and network prediction was the highest. Distribution centered on 0 s (median = 0 s) indicates that the prediction wasn't simply the filtered input. d, Prediction plot of the signal that obtained the highest score among all training human vessels (CC=0.54, t lag =-2 s; black -raw data, green -target, blue -network prediction). e, Prediction plot of the surrogate control signal created based on the real vascular signal shown in d (CC=0.30, t lag =-2 s; blackraw data, green -target prediction, red -network output).

ESN-based prediction of the fMRI slow oscillation in the visual cortex (V1) of HCP data
Previously, we showed that smoothed single-vessel rs-fMRI correlation maps mimic conventional correlation maps in the human occipital area 24 . As shown in the PSD plots (Fig. 4), the vessel-specific fMRI slow oscillation dominates the 0.01-0.1 Hz frequency range. To examine whether the ESN trained by the single-vessel fMRI scheme can be used to predict the fMRI slow oscillation of a broader range of datasets, we applied the trained ESN to predict the rs-fMRI signals from the V1 of HCP data (a total of 3,279 rs-fMRI sessions; V1 signal extracted from left and right hemispheres separately, yielding 6,558 time courses resampled at 1 s TR, details in the Methods section). To examine the predictive capacity of the ESN on each trial of the HCP dataset, the CC of all prediction trials were plotted in a histogram. The CC distribution resembled a normal distribution centered on 0.28 (median) (Fig. 6a).
Next, we selected two clusters of the HCP dataset based on their CC (top 5%, bottom 5%), showing the top 5% trials with high power levels and the bottom 5% trials with low power levels at the 0.01 -0.1 Hz frequency range (Fig. 6b). The lag time distribution of the top 5% group is centered at zero, unlike the bottom 5% group covering the whole range of lag values (Fig. 6c).
In particular, many lag values of the poorly predicted sessions show a delay of more than the full wavelength of ESN's preferred frequency. Fig. 6d shows three predicted slow oscillatory time courses from the HCP rs-fMRI sessions (top 5% group) (CC1 = 0.54, tlag,1= -2; CC2 = 0.61, tlag,2 =2; CC3 = 0.61, tlag, 3= 2). The predictions of the ESN were dominated by the low-frequency power in the rs-fMRI signals from individual trials.

ESN-based brain state classification from HCP data
Here, we analyzed whole-brain correlation patterns of the HCP dataset, mainly focusing on the top and bottom 5% datasets from ESN predictions. Fig. 7 shows flattened cortical difference maps of seed-based correlations calculated for the two groups of HCP datasets. First, rs-fMRI time courses from the V1 ROI, the whole cortex (global mean), and the DMN were used to calculate voxel-wise correlation maps for all trials in the top and bottom 5% groups. These were then group-averaged and subtracted to create the correlation difference maps (Fig. 7a-f, the representative time courses from 4 subjects in each group were shown in Fig. S5). Similar differential patterns were detected for V1 ROI and global mean time courses, showing significantly stronger correlation pattern for the top 5% group covering most of the cortical regions ( Fig. 7a-d).
Although there is higher correlation to other cortical regions in the maps with the DMN-specific seed, no significant correlation differences were detected among DMN areas between the top and bottom 5% groups (Fig. 7e, f).
We used DMN and visual ICA component time courses to compare their seed-based difference maps in order to better characterize the brain-state specific differences among restingstate networks classified by ESN predictions. The visual ICA-based map resembled the V1 ROI seed-based result (Fig. 8a-c). The DMN ICA signals are free of global signal contributions and represent intrinsic DMN activity. Using them enables the characterization of DMN-specific a, The V1 seed region is marked with a blue border. Visual, sensorimotor and auditory areas display high increases in correlation. Nodes in which the difference was insignificant are masked. b, Same as a but without the mask. c, The average time course of the whole cortex served as the seed. The map resembles the result generated using V1 seeds (shown in a), suggesting that V1 signals extracted from the "top" group were mostly driven by the global signal. Nodes in which the differences were insignificant are masked. d, Same as c but without the mask. e, The average time course of the DMN ROIs served as the seed (marked with white borders). Nodes in which the difference was insignificant are masked. Despite showing increased synchrony with the areas dominated by the global signal, ROIs constituting the DMN don't show significant differences between the groups. f, Same as e but without the mask.
correlation patterns independent of the increase in global synchrony. The DMN ICA-based correlation maps of the top 5% subjects had reduced correlation, in particular showing significantly lower correlation features inside the DMN nodes compared to the bottom 5% sessions (Fig. 8df). The distinction of DMN-specific inter-network correlation was also presented in the correlation matrices based on the 360 ROIs predefined from the brain atlas 77 (Fig. S6a). Correlation matrices computed for subcortical ROIs show increased correlation between the hippocampus and the brainstem with the global signal (Fig. S6b). These results indicate that the ESN-based classification can be used to differentiate the brain-state dependent rs-fMRI signal fluctuations in the HCP datasets. ( Fig. S7a-c). The CC values of the rs-fMRI variance-dependent groups for ESN-based predictions cover the total distribution range. In particular, the top and bottom 5% variance groups had much broader CCESN values and largely overlapped each other in the histogram plot (Fig. S7a).
Similarly, the variance values of trials with the top and bottom 5% of ESN-predicted CC scores overlap and cover the whole range of the variance distribution (Fig. S7c). Also, no significant reduction of the internal DMN correlations were observed between the two variance-based groups ( Fig. S7d-f). These results indicate that the ESN trained with vessel-specific rs-fMRI signals encodes specific brain state dynamic changes, which is less dependent on the variance of the rs-fMRI signal fluctuation.
In addition, we compared the ESN-based predictions with two other prediction schemes, the autoregressive moving average with exogenous input (ARMAX) modeling 78,79 and RNNs trained with the backpropagation algorithm 80,81 . The best ARMAX models were found using an exhaustive grid search. The architectures and hyperparameters of the backpropagation-RNNs, the gated recurrent unit (GRU) 82  showing similar distributions of the correlation coefficients of the top and bottom 5%, which was not the case with the ARMAX method (Fig. S8c,d). Also, the DMN ICA-based correlation differential maps from ESN and GRU methods showed a similar spatial pattern with significantly reduced correlation inside the DMN nodes of top 5% in comparison to the bottom 5% sessions, which was not detected by the ARMAX method (Fig. S8e). These results further confirmed the reliability of the RNN methods (both ESN and the backpropagation based RNNs) to classify the brain state-specific rs-fMRI signal fluctuations.

DISCUSSION
We used the time courses of single-vessel rs-fMRI signals as inputs to train ESN networks to predict the rs-fMRI signal 10 s ahead in both rodents and humans. We also showed that the single-vessel fMRI-based training leads to ESN encoding specific to global fMRI signal fluctuations. The trained network was used to analyze HCP datasets with diverse brain states.
For example, it allowed us to identify sessions with strong global synchrony and to decouple the global signal fluctuations from internal DMN correlations.
We selected the input fMRI time series from individual vessel voxels based on a previously established single-vessel fMRI mapping method 24,67 . The BOLD fMRI signal has a direct vascular origin based on the oxy/deoxy-hemoglobin ratio changes [87][88][89] . The high-resolution single-vessel mapping method allows us to directly extract the venule-dominated BOLD signals with a much higher contrast-to-noise ratio (CNR) than the conventional EPI-fMRI integrating the BOLD signal from both tissue and vessels in large voxels 24,67,90,91 . Although different vessel voxels may present cardiorespiratory noises, e.g. the respiratory volume change 33,92 or the heartbeat variability 93, 94 , a recent simultaneous fMRI and fiber-optic calcium recording study showed strong correlation of the major ICA vascular component of the rs-fMRI signal fluctuation ( Fig. 1) with the calcium signal oscillation 24 . Also, these global hemodynamic signal changes are directly correlated with the calcium signal fluctuation through the whole cortex based on optical imaging 25,26 . Thus, the global fMRI signal fluctuation detected from individual vessels represents changing brain states, and not the non-physiological confounding artifacts uniformly distributed through the brain, e.g. the respiration-induced B0 offset 95 or other sources 34,96 . In comparison to the voxelwise or ROI-based time courses from low-resolution EPI images, the single-vessel rs-fMRI signal provides highly selective datasets for the supervised ESN training to encode brain-state dependent global fMRI signal fluctuations.
The predictions from the trained ESN's vary across vessels as well as across trials. To validate this measurement, we used surrogate controls designed using the IAAFT method 75 . For every vessel, we generated an artificial signal showing a similar frequency power profile (Fig. S2) to its corresponding single-vessel rs-fMRI time course, but with randomized phases of complex Fourier components. It has been shown that high-frequency EEG power profiles are highly correlated to the low-frequency EEG signal fluctuation, i.e. phase-amplitude coupling (PAC), in both cortical and subcortical regions for a variety of brain states [97][98][99][100][101][102] . This feature has also been used for the correlation analysis of the concurrent EEG and rs-fMRI signal recordings from animals and humans 18,19,21,23,[103][104][105] . Our analysis confirms that the phases of the slow oscillatory rs-fMRI signal carry critical dynamic brain state features 3 . By randomizing the phases, the surrogate control excludes dynamic brain features but preserves a high similarity in terms of the signal amplitude/power spectral distribution and autocorrelation structure for the verification of the ESN encoding. Also, the spectral characteristics of the ESNs demonstrate different preference maps in terms of the center frequency and the bandwidth depending on the training data from either rat or human data (Fig. S4). These training data showed differences in frequency power profiles given the inter-species diversity 106 and the presence of anesthetics [24][25][26][107][108][109] .
The global rs-fMRI signal is a critical confound of correlation analysis with many contributing factors from both physiological and non-physiological sources. In particular, whether the global mean fMRI signal should be removed before the analysis, which can create spurious correlation features, has been debated [36][37][38][39][40][41][42]44 . Also, the global rs-fMRI signal can over-shadow specific intrinsic RSN features, e.g. the anti-correlation of the DMN and task-positive RSNs [110][111][112] .
One intriguing observation based on ESN-predicted results shows that the internal DMN correlations of the top 5% ESN performance group are reduced compared to the bottom 5% group ( Fig. 8d-f), which is opposite to changes in the global correlation through the whole brain (Fig. 7,   8). It is noteworthy that the decreased internal DMN correlations are not visible through variancebased approaches (Fig. S7d-f). Thus, independent of the variance analysis method, the ESNbased approach reveals brain-state specific rs-fMRI signal fluctuations in the HCP datasets.
The contrast between internal DMN correlations and whole brain correlation patterns supports other sources of evidence that the global signals are dissociated from intrinsic brain network correlations 32  recording study has shown that the rs-fMRI fluctuation can be regulated by the arousal ascending pathway through the central thalamic nuclei and midbrain reticular formation 31 , implicating the subcortical regulation of the rs-fMRI signal fluctuation as previously reported from both nonhuman primate and human rs-fMRI studies 30,32,43 . Importantly, we also observed that the singlevessel rs-fMRI signal is specifically coupled to the global neuronal signal fluctuation 24 , which supports our single-vessel ESN training scheme to encode the brain-state specific global rs-fMRI signal fluctuations.
Thus, the ESN-based approach provides a variance-independent scheme to differentiate the global rs-fMRI fluctuation of the dynamic brain states. A promising direction for future work involves applying the proposed method to study the predictability of slow fluctuations in brain regions other than sensory cortices and to investigate which factors, besides arousal-related brain state changes, drive the predictions. Extending the platform to process whole-brain signals would provide a more synoptic view of the regularities present in brain dynamics in different states.
Finally, the method could be integrated into a real-time fMRI platform to provide feedback stimuli and close the loop.

Echo state network
To encode the dynamics of spontaneous slow fluctuations we used the echo state network (ESN) 47

ESNstate
The It is important to note that the input might be extended by a constant input bias so that besides the vascular input, every neuron is also driven by a constant at every time step.

ESNreadout
To generate the prediction, the decoder processes the ESN's state. In the case of employing a linear readout, the output of the network is computed as: where is the output vector weighting the ESN's state, which is learned by supervision. Once the reservoir has been fixed, input data are fed in it to generate the states matrix according to eq. (4). The output vector is then computed using the target outputs and the matrix containing reservoir states generated using all the training inputs: where + is the Moore-Penrose pseudoinverse of the states matrix .

Random search optimization
For a given ESN, the weight matrices , , defining its reservoir remain fixed after being initialized. They are not changed through training. To initialize the reservoir, a few hyperparameters need to be specified 124 . The performance of a single ESN instance is largely dependent on the choice of these hyperparameters and the randomness involved in weight initialization. There is no set of parameter values that would lead to good ESN performance on all problems posed. As the process of training a single ESN is not computationally expensive compared to other methods like e.g. network training using backpropagation, many reservoirs might be evaluated. Random search is a simple yet effective way of exploring the hyperparameter space for a given task 73 . For every parameter whose value isn't fixed, a range of possible values is specified and every time an ESN is generated, the parameter values are selected at random from the specified ranges. Then, for the defined reservoir, the readout is optimized. Lastly, the trained ESN is evaluated, its performance is compared with other ESN instances and the best performing network is selected. The optimized hyperparameters are described in Table 1. A key parameter of the network is its spectral radius -the highest absolute eigenvalue of the internal connectivity matrix. The magnitude of this parameter influences the temporal length of the reservoir's memory property 125 . It needs to be set in a way that grants the reservoir the echo state property (ESP), meaning that the ESN's state will become independent of its initial conditions. It has been empirically observed that setting a value lower than 1 usually guarantees that the reservoir will possess the ESP 124 . o Probability of keeping a bidirectional connection.
Scale-freenetworks characterized by the power law decay of the probability P(k) that a given node is connected to k other elements. Directed adjacency matrices created using the Barabasi & Albert algorithm 127,128 have been generated and connection weights have been drawn from specified ranges.
Scale-free network parameters: o The initial number of vertices.
o The number of connections added with every new node.
Randomno structure is imposed on the connectivity pattern. The only parameter of the adjacency matrix is its connection density. Washout time The number of input signals' time points used to drive the reservoir into a dynamical state that is specific to a given input and independent of the initial state (due to the echo state property). The states generated for these points are not used for readout training and prediction. The necessary washout time might be reduced by using an appropriate initial state.

GRU and LSTM
The predictions of two other RNN models were compared with ESN's predictions. Gated recurrent unit (GRU) 82 and long short-term memory (LSTM) 83 where is the cell state, , , , are the input, forget, cell and output gates. As in the ESN, in both GRU and LSTM networks, a linear readout was used to generate the prediction based on the state vector: The networks were trained in PyTorch 129 . The same data and cross-validation procedure as during the ESN's training were used. The hyperparameters were found with Bayesian optimization using the tree of Parzen estimators algorithm (Hyperopt toolbox, n=600) 85,86 . The optimized hyperparameters have been described in Table 2.

ARMAX
The autoregressive-moving-average model with exogenous inputs (ARMAX) 79 was used as a comparative prediction method. ARMAX aims to model a time series using autoregressive, moving-average and exogenous input terms. This is depicted in the equation: MATLAB armax and forecast functions were used to find the coefficient values and evaluate the models. ARX and ARIMAX models were also tested but yielded worse performances, hence are not reported.

Experimental procedures
Single-vessel fMRI data acquired from 6 rats and 6 human subjects have been previously published 24 . The rats were imaged under alpha-chloralose anesthesia. For details related to the experimental procedures refer to 24,132 .

Rat MRI data acquisition
The measurements have been performed using a 14.1 T/26 cm horizontal bore magnet (Magnex) interfaced with an Avance III console (Bruker). To acquire the images a 6 mm (diameter) transceiver surface coil was used.

MGE A-V map acquisition in rats
To detect individual blood vessels a 2D Multiple Gradient-Echo (MGE) sequence was used. The The A-V maps enable identifying venule voxels as dark dots due to the fast T2* decay and arteriole voxels as bright dots because of the in-flow effect.
Subjects had their eyes closed during each 15 minute trial. Respiration and pulse oximetry were simultaneously monitored using the Siemens physiologic Monitoring Unit (PMU).

Data preprocessing
All data preprocessing was done using MATLAB and the Analysis of Functional Neuro Images (AFNI) software package 133 . The functional data were aligned with the A-V map using the mean bSSFP template and the 3dTagAlign AFNI function with 10 tags located in the venule voxels.
Other details of the preprocessing procedure are reported in a previous study 90 . No spatial smoothing was done at any point.

Localization of individual veins
To localize venule voxels in A-V maps, local statistics analysis and thresholding were performed using AFNI. First, for each voxel, the minimum value in a 1 voxel-wide rectangular neighborhood was found. Then, the resulting image was filtered with a 10 voxel rectangular rank filter and divided by the size of the filter. Finally, the image was thresholded to create a mask with vein locations. For human data, the mean of EPI time series was used instead of the A-V map.

ICA identification of vascular slow oscillations
To extract signals only from vessels exhibiting strong slow oscillations an additional independent components analysis (ICA)-based mask was combined with the described above vessel localization method. The functional rs-fMRI data were processed using the Group ICA of fMRI Toolbox (GIFT, http://mialab.mrn.org/software/gift) in MATLAB. First, principal component analysis (PCA) was employed to reduce the dimensionality of the data. PCA output was used to find 10 independent components and their spatial maps using Infomax ICA 70 . If a component exhibiting slow oscillations predominantly in individual vessels had been found, it was thresholded and used together with the vascular mask to identify vessels of interest and extract their signals.

Frequency normalization
To normalize the data, power density estimates of signals' high-frequency components were used. Every time course had its mean removed and was divided by the mean PSD of its frequency components higher than 0.2 Hz. The 0.2 Hz point was chosen, as above this value spectra of extracted signals were centered on a horizontal, non-decaying line. Performing the division brought the mean PSD of high-frequency components to a common unit baseline for all signals.
This allowed to better compensate for different signal strengths across trials than when scaling the data using minimal and maximal values. Additionally, the relative strength of flatter signals and those exhibiting stronger low-frequency oscillations was better preserved when compared to variance normalization. Ultimately it also improved prediction performance.

Power spectrum analysis
The spectral analysis was performed in MATLAB. To compute the power spectral density estimates (PSDs) of utilized signals we employed Welch's method with the following parameters:

Filtering
To obtain target signals, single-vessel time courses were bandpass filtered in MATLAB using butter and filtfilt functions. The frequency bands (0.01-0.1 for human and 0.01-0.05 for rat data) were chosen based on the PSD curves of single-vessel and ICA time courses.

Surrogate data generation
Surrogate data methods are primarily used to measure the degree of nonlinearity of a time series 74 . They allow creating artificial time courses that preserve basic statistics of original data like the mean, variance and autocorrelation structure. In this study, Fourier based surrogate signals were generated for each single-vessel time course using the iterative amplitude adjusted The iteratively generated signals had the same amplitude distribution as the source data and extremely similar amplitudes of the power spectrum. However, the phases of their complex Fourier components were randomized.

HCP datapreprocessing
Data from 3279 15-minute sessions of rs-fMRI acquired by the Human Connectome Project (HCP) 68 were used to extract V1 signals and compute whole-brain correlation maps. The data set was preprocessed 134,135 , had artifacts removed via ICA+FIX 136,137 and was registered to a common space 77, 138 by the HCP. The data were resampled from the original 0.72 s sampling rate to match the 1 s TR of our in-house datasets.

HCP data -ROI signal extraction
The multi-modal cortical parcellation 77 was used to extract 180 ROI signals per hemisphere. The DMN ROI was based on the DMN ROI specified in Yeo et al. 139 . Task-positive regions were labeled according to Glasser et al. 77 . Subcortical structures were extracted using the Connectome Workbench 140 .

HCP data -ICA parcellations
ICA spatial maps and their corresponding time courses for each rs-fMRI session were obtained from the S1200 Extensively Processed fMRI Data released by HCP. The spatial maps are based on group-PCA results generated using MIGP (MELODIC's Incremental Group-PCA) 141 . Spatial ICA was applied to the group-PCA output using FSL's MELODIC tool 142,143 . To derive componentspecific time courses for each session, the spatial maps were regressed against the rs-fMRI data 144 . In this work, we used results from the 15-component decomposition. Not all used rs-fMRI sessions had an ICA time course available (initial group sizes in the seed-based analysis were ntop=202 and nbottom=207; for ICA seed-based analysis the sizes were ntop=195 and nbottom=203).

Cross-correlation
MATLAB xcorr and zscore functions were used to compute cross-correlation. Lag times were computed between predictions and desired outputs. Positive lags correspond to delayed predictions and negative lags to too early predictions. The input signal has an additional 10 s shift.

Statistical tests
The statistical significance of the difference between real/surrogate and ESN/ARMAX prediction scores was verified using a paired t-test (MATLAB ttest function). To determine differences between seed-based correlation maps and PSDs two-sample t-tests were applied (MATLAB ttest2 function). The results have been controlled for false discovery rate with adjustment 145,146 .
Fisher's z-transform has been applied to all correlation values before conducting statistical tests. P values <0.05 were considered statistically significant.
Supplementary fig. 1 | ESN hyper-parameter optimization using random search. For each possible 3+1 trial division (3 training trials + 1 test trial) network parameter values are drawn randomly from pre-specified ranges 5000 times. For each drawn parameter set a reservoir is generated and an ESN is trained and evaluated. For each 3+1 division the 100 best performing ESNs are cross-validated to select a single ESN for predicting other rats' data.
Supplementary fig. 7 | Variance based classification. a, Prediction score histograms of the 5% best (light gray) and worst (dark gray) predicted sessions contrasted with prediction scores of signals with top 5% highest (blue) and lowest (red) variance. Variance levels aren't conclusive of ESNs performance. Top right: same data with groups merged. b, Mean PSDs of signals with the top 5% highest (blue) and lowest (red) variance. Shaded areas show standard deviations. c, Histogram of variance-based (red and blue) and ESNbased (gray) group variance values. Predictions scores of signals having a high or low variance are distributed across the whole range of CC values. d, The spatial ICA component (white borders highlighting the DMN), whose time courses have been used to generate the variance-based difference connectivity map. e, The variance based differential ("top"-"bottom") connectivity pattern. It doesn't resemble the ESN-based DMN internal connectivity reduction. Nodes in which the difference was insignificant are masked. f, Same as e but without the mask.