Brain connectivity estimators

Brain connectivity estimators represent patterns of links in the brain. Connectivity can be considered at different levels of hierarchy: from neurons, neural assemblies to brain structures. Brain connectivity involves different concepts such as: anatomical connectivity (pattern of anatomical links), functional connectivity (usually understood as statistical dependencies) and effective connectivity (referring to causal interactions). The distinction of functional and effective connectivity is not very sharp; sometimes causal or directed connectivity is called functional connectivity. Herein we shall consider the estimators, which evaluate connectivity from brain activity time series such as EEG, LFP or spike trains, with an impact on the directed connectivity. These estimators can be possibly applied to fMRI data, if the sequences of images are available. Among estimators of connectivity, there are linear and non-linear, bivariate and multivariate measures; among them we can distinguish estimators indicating directionality. Different methods of connectivity estimation, their highlights and flaws are described in, more extensively. Below is a short overview of these measures, with an emphasis on the most-effective methods.

Classical methods
Classical estimators of connectivity are correlation and coherence. The above measures provide information on the directionality of interactions in terms of delay (correlation) or coherence (phase), however the information does not imply causal interaction. Moreover it may be ambiguous, since phase is determined modulo 2π. It is also not possible to identify by means of correlation or coherence reciprocal connections.

Non-linear methods
Most frequently used nonlinear estimators of connectivity are: mutual information, Transfer Entropy, Generalised Synchronisation, Continuity Measure, Synchronization Likelihood, Phase Synchronization. First two of them rely on construction of histograms for estimation of probabilities. Continuity Measure, Generalized Synchronisations, Synchronisation Likelihood are very similar methods based on reconstruction of phase space of signals. Among these measures only Transfer Entropy allows for the determination of directionality. According to, nonlinear measures require long stationary segments of signals, are prone to systematic errors and above all are very sensitive to noise. The comparison of nonlinear methods with linear correlation in the presence of noise revealed the poorer performance of non-linear estimators. In the authors conclude that there must be good reason to think that there is non-linearity in the data to apply non-linear methods. In fact it was demonstrated by means of surrogate data test, and time series forecasting that non linearity in EEG and LFP is the exception rather than a rule. On the other hand linear methods perform quite well also for non-linear signals. Finally, non-linear methods are bivariate (calculated pair-wise), which has serious implication on their performance.

Bivariate versus multivariate estimators
Comparison of performance of bivariate and multivariate estimators of connectivity may be found in, where it was demonstrated that in case of interrelated system of channels, greater than two, bivariate methods supply misleading information, even reversal of true propagation may be found. Let us consider very common situation that the activity from a given source is measured at electrodes positioned at different distances, hence different delays between the recorded signals. This situation can be illustrated by the simulation shown in Fig.1.

When we apply bivariate measure we will obtain the propagation always when there is a delay between channels. In this way we get a lot of spurious flows. When we have two or three sources acting simultaneously, which is quite common situation we shall get dense and disorganized structure of connections, similar to random structure (at best some "small world" structure may be identified). This kind of pattern is usually obtained in case of application of bivariate measures. In fact, effective connectivity patterns yielded by EEG or LFP measurements are far from randomness, when proper multivariate measures are applied, as we shall demonstrate below.

Multivariate methods based on Granger causality
The testable definition of causality was introduced by Granger. Granger causality principle states that if some series Y(t) contains information in past terms that helps in the prediction of series X(t), then Y(t) is said to cause X(t). Granger causality principle can be expressed in terms of two-channel multivariate autoregressive model (MVAR). Granger in his later work pointed out that the determination of causality is not possible when the system of considered channels is not complete. The measures based on Granger causality principle are: Granger Causality Index (GCI), Directed Transfer Function (DTF) and Partial Directed Coherence (PDC). These measures are defined in the framework of Multivariate Autoregressive Model.

Multivariate Autoregressive Model
The AR model assumes that X(t)&mdash;a sample of data at a time t&mdash;can be expressed as a sum of p previous values of the samples from the set of k-signals weighted by model coefficients A plus a random value E(t):

The p is called the model order. For a k-channel process X(t) and E(t) are vectors of size k and the coefficients A are k&times;k-sized matrices. The model order may be determined by means of criteria developed in the framework of information theory and the coefficients of the model are found by means of the minimalization of the residual noise. In the procedure correlation matrix between signals is calculated. By the transformation to the frequency domain we get:

H(f) is a transfer matrix of the system, it contains information about the relationships between signals and their spectral characteristics. H(f) is non-symmetric, so it allows for finding causal dependencies. Model order may be found by means of criteria developed in the framework of information theory, e.g. AIC criterion.

Granger Causality Index
Granger causality index showing the driving of channel x by channel y is defined as the logarithm of the ratio of residual variance for one channel to the residual variance of the two-channel model: GCIy→x = ln (e/e1) This definition can be extended to the multichannel system by considering how the inclusion of the given channel changes the residual variance ratios. To quantify directed influence from a channel xj to xi for n channel autoregressive process in time domain we consider n and n&minus;1 dimensional MVAR models. First, the model is fitted to whole n-channel system, leading to the residual variance Vi,n(t) = var(Ei,n(t)) for signal xi. Next, a n&minus;1 dimensional MVAR model is fitted for n&minus;1 channels, excluding channel j, which leads to the residual variance Vi,n&minus;1(t) = var (Ei,n&minus;1(t)). Then Granger causality is defined as:

$$ \mathrm{GCI}_{i\rightarrow j}(t) = \ln \left( \frac{V_{i,n}(t)}{V_{i,n-1}(t)}\right) $$

GCI is smaller or equal 1, since the variance of n-dimensional system is lower than the residual variance of a smaller, n&minus;1 dimensional system. GCI(t) estimates causality relations in time domain. For brain signals the spectral characteristics of the signals is of interest, because for a given task the increase of propagation in certain frequency band may be accompanied by the decrease in another frequency band. DTF or PDC are the estimators defined in the frequency domain.

Directed Transfer Function
Directed Transfer Function (DTF) was introduced by Kaminski and Blinowska in the form:

Where Hij(f) is an element of a transfer matrix of MVAR model. DTF describes causal influence of channel j on channel i at frequency f. The above equation ($$) defines a normalized version of DTF, which takes values from 0 to 1 producing a ratio between the inflow from channel j to channel i to all the inflows to channel i. The non-normalized DTF which is directly related to the coupling strength is defined as:

DTF shows not only direct, but also cascade flows, namely in case of propagation 1→2→3 it shows also propagation 1→3. In order to distinguish direct from indirect flows direct Directed Transfer Function (dDTF) was introduced. The dDTF is defined as a multiplication of a modified DTF by partial coherence. The modification of DTF concerned normalization of the function in such a way as to make the denominator independent of frequency. The dDTFj→i showing direct propagation from channel j to i is defined as:

Where Cij(f) is partial coherence. The dDTFj→i has a nonzero value when both functions Fij(f) and Cij(f) are non-zero, in that case there exists a direct causal relation between channels j→i. Distinguishing direct from indirect transmission is essential in case of signals from implanted electrodes, for EEG signals recorded by scalp electrodes it is not really important.

The preprocessing of signals before DTF computation (and in general in case of estimators based on MVAR) merits special attention. The preprocessing should be limited to the subtraction of mean value and possibly division by variance of each signal. Possible filtering of signals should be performed forward and backward in order not to disturb the phases (in Matlab procedure filtfilt). The signals should be referenced in respect to inactive electrode (e.g. linked ears), no bipolar, common average or Hjorth derivation should be used. The fitting of MVAR coefficients is based on calculation of correlation matrix between channels and these relations should not be distorted in any way, since they reflect phase dependence between channels and estimators of directed connectivity are based on phase differences. This issue will be addressed more extensively in the Section 1.4. DTF may be used for estimation of propagation in case of point processes e.g. spike trains or for the estimation of causal relations between spike trains and Local Field Potentials. The software for calculation of DTF, dDTF and time-varying version of DTF&mdash;SDTF&mdash;may be found at portal http://eeg.pl. In this website also tutorial on DTF is available.

Partial Directed Coherence
The partial directed coherence (PDC) was defined by Baccala and Sameshima in the following form:

In the above equation Aij(f) is an element of A(f)&mdash;a Fourier transform of MVAR model coefficients A(t), where aj(f) is j-th column of A(f) and the asterisk denotes the transpose and complex conjugate operation. Although it is a function operating in the frequency domain, the dependence of A(f) on the frequency has not a direct correspondence to the power spectrum. From normalization condition it follows that PDC takes values from the interval [0,1]. PDC shows only direct flows between channels. Unlike DTF, PDC is normalized to show a ratio between the outflow from channel j to channel i to all the outflows from the source channel j, so it emphasizes rather the sinks, not the sources. The normalization of PDC has an impact on the detected intensities of flow as was pointed out in. Namely, adding further variables that are influenced by a source variable decreases PDC, although the relationship between source and target processes remains unchanged. In another words: the flow emitted in one direction will be enhanced in comparison to the flows of the same intensity emitted from a given source in several directions.

DTF and PDC are not influenced by volume conduction
Volume conduction influences the amplitudes of electric field measured on scalp, which is well established fact, however the situation is different, if we consider phases of the signals.

DTF and PDC are based on the detection of phase differences between channels, they take zero value when there are no phase difference. This fact was supported by the simulation. To the set of EEG signals the sinusoid of 20 Hz and constant phase was added. In the power spectra a distinct peak of 20 Hz was observed, while no trace of 20 Hz activity was observed in DTF and PDC. (Fig. 2).



Volume conduction is propagation of electromagnetic field with a speed of light &mdash; 3⋅1010 cm/s. For the distance of the order of centimeter the delay is roughly 3.3⋅10&minus;9 s. Such delay cannot be practically detected.

The propagation speed of electrical activity in cortex depends on axonal and dendritic conduction velocity and synaptic delays. Recordings in monkey showed that delays of activation of neighboring areas in cortex (e.g. V1 and V2) range between 10-20 ms. Similar delays are obtained by calculating the speed of propagation in neural fibers and taking into account synaptic delays (1-5 ms). The resulting delays in propagation are of the order of tenths of ms and can be detected by proper estimators assuming typical sampling rates.

The propagation shown by DTF or PDC is due to the delays in the neural tracts. The depolarization wave propagating along neural fiber causes along its way oscillatory activity delayed in respect to source. This activity is recorded on scalp and delays between the electrodes are detected by these phase sensitive estimators. The sensitivity of connectivity estimators, based on Granger causality, to the changes of phase was demonstrated in, where it was reported that filters influencing phases disturb connectivity values.

The pattern of phases should not be disturbed in any way. It is coded in the correlation matrix of MVAR model. Therefore the signals should be recorded in respect to some "neutral" electrode such as "linked ears", nose, etc. Common average, Hjorth or Laplace transform should not be used, since they change the correlation matrix between channels and hence phases. When calculating DTF or PDC the preprocessing should be limited to the subtraction of average and possibly filtering without changing the phases.

In the light of robustness of DTF and PDC to volume conduction the procedures performing projection of the scalp potentials on the cortex or application of ICA are not needed, on the contrary they spoil the correlation relation between channels and what they show are may be some causality relations between e.g. some abstract ICA components, but not the propagation of EEG activity. Inspecting papers where such preprocessing routines were used it is easy to see that the results are usually hard to interpret in terms of other neurophysiological evidence. The results obtained by DTF without heavy preprocessing demonstrate very good agreement with temporal and topographical evidence supplied by imaging methods or neurophysiological evidence (section Applications below).

Time-varying estimators of effective connectivity
In order to account for the dynamic changes of propagation, the method of adaptive filtering or the method based on the sliding window may be applied to the estimators of connectivity. Both methods require multiple repetition of the experiment to obtain statistically satisfactory results and they produce similar results. The adaptive methods e.g. Kalman filter are more computationally demanding, therefore the methods based on sliding window may be recommended.

In the case of parametric model the number of data points kNT (k&mdash;number of channels, NT&mdash;number of points in the data window) has to be bigger (preferably by order of magnitude) than the number of parameters, which in case of MVAR is equal to k2p (p&mdash;model order). In order to evaluate dynamics of the process, short data window has to be applied, which requires the increase of the number of the data points, which may achieved by means of the repetition of the experiment. We divide a non-stationary recording into shorter time windows, short enough to treat the data within a window as quasi-stationary. The estimation of MVAR coefficients is based on calculation of the correlation matrix Rij of k signals Xi from multivariate set. We calculate the correlation matrix between channels for each trial separately. The resulting model coefficients are based on the correlation matrix averaged over trials. The correlation matrix has a form:

The averaging concerns correlation matrices (model is fitted independently for each short data window); the data is not averaged in the process. The choice of the window size is always a compromise between quality of the fit and time resolution.

The errors of the SDTF may be evaluated by means of bootstrap method. This procedure corresponds to simulations of the another realizations of the experiment. The variance of the function value is obtained by repeated calculation of the results for a randomly selected (with repetitions) pool of the original data trials.

Applications
Directed transfer Function found multiple applications, the early ones involved: localization of epileptic foci, estimation of EEG propagation in different sleep stages and wakefulness, determination of transmission between brain structures of an animal during a behavioral test. The patterns of directed connectivity for wakefulness and different sleep stages averaged over subjects are shown in Fig. 3.

One may observe the shifting of sources toward the front in transition from wakefulness to the deeper sleep stages. In the deep sleep the source is over corpus callosum, presumably it is connected with feeding the cortex from the sub-cortical structures.

One of the first applications of SDTF was determination of the dynamic propagation during performance of finger movement and its imagination,. The results corresponded very well with the known phenomena of event related synchronization and desynchronization such as decrease of the activity in alpha and beta band and brief increase of activity in the gamma band during movement in the areas corresponding to primary motor cortex, beta rebound after movement and so-called surround effect. Especially interesting was comparison of real finger movement and its imagination. In case of real movement the short burst of gamma propagation was observed from the electrode positioned over finger primary motor cortex. In case of movement imagination this propagation started later and a cross-talk between different sites overlying motor area and supplementary motor area was found. (The dynamics of propagation may be observed in animations available at: http://brain.fuw.edu.pl/~kjbli/DTF_MOV.html).

Another applications of SDTF concerned evaluation of transmission during cognitive experiments. The results of the Continuous Attention Test (CAT) confirmed the engagement of pre-frontal and frontal structures in the task and supported the hypothesis of an active inhibition by Pre-Supplementary Motor Area and right Inferior Frontal Cortex. The animations of propagation during CAT test are available at URL http://brain.fuw.edu.pl/~kjbli/CAT_MOV.html.

The results obtained by means of SDTF in experiments involving working memory were compatible with fMRI studies on the localization of the active sites and supplied the information concerning the temporal interaction between them. The animation illustrating dynamics of the interaction are available at http://brain.fuw.edu.pl/~kjbli/Cognitive_MOV.html.

Conclusions
In all studies when DTF (without unnecessary preprocessing) was applied to EEG or LFP distinct pattern of active sources was observed and their localizations were compatible with existing evidence obtained by means of another techniques and with neurophysiological evidence or hypotheses. The existence of well defined sources of brain activity connected with particular experimental conditions are well established in fMRI experiments, by means of inverse solution methods and intracortical measurements. This kind of deterministic structure of brain activity should have an impact on functional connectivity, so reported in some works random or barely distinguished from random connectivity structure may be considered as a surprising phenomenon. This kind of results may be explained by methodological errors: 1) unrobust methods of connectivity estimation and, even more important,  2) application of bivariate methods. When multivariate robust measures of connectivity are applied for EEG analysis a clear picture of functional connectivity emerges.