
Neuroimaging acquisition techniques

Neuroimaging is a subfield of medical imaging which focuses on producing and making use of images of the central nervous system (CNS). The objectives of neuroimaging comprise the diagnosis of pathologies related to the CNS as well as its monitoring and understanding through the interpretation of visual representations of the structure and function of the brain and the spinal cord. Several technologies are available to produce images of the CNS, each providing different types of information. In this introductory section, we provide an overview of the most common acquisition techniques used in neuroimaging. They comprise standard technologies used in medical imaging as well as specific techniques that have been developed to better characterize the soft tissues of the brain or their functional properties.

Computed Axial Tomography

Computed Tomography (CT scan) is one of the oldest techniques available, directly derived from traditional radiography. It uses a series of X-ray scans to produce a three-dimensional image of the head through a computational reconstruction process that solves the inverse Radon transform. A CT scan therefore estimates the amount of X-rays absorbed at a given location, which is related to the tissue density. Even if this technique does not provide state-of-the-art image quality, it remains widely used in clinical settings because a scan can be performed in less than a minute. The main indications of the CT scan include the preparation of surgeries and the diagnosis of brain injuries, thanks to its ability to accurately detect and localize tissue swelling and bleeding.

Positron emission tomography

Positron emission tomography (PET) is a technique that uses an array of sensors to measure the emissions of positrons from a radioactive tracer that is injected into the body prior to the scan. A computational reconstruction makes it possible to image the concentration of the tracer as a three-dimensional volume, and hence to detect the locations where the chemical compound has accumulated. Depending on the chosen tracer, PET scanning can therefore highlight different properties of the body parts to be examined. When it comes to studying the CNS, the most commonly used tracer is fludeoxyglucose (FDG); this radioactive form of glucose makes it possible to directly study the metabolism of the brain, i.e. to quantitatively measure brain activity. As a functional imaging technique, its advantages include the image quality offered and the short acquisition time, but its main disadvantage lies in the fast decay of the radioactive compound concentration, which limits the field of application of PET to studying tasks that are accordingly short.

Magnetic resonance imaging

Magnetic resonance imaging (MRI) is a technique based on the use of a strong and homogeneous magnetic field in the imaging device. The energy of brief radio-frequency pulses sent by an emitting coil excites the hydrogen atoms (protons) in the target tissue, which in turn emit radio-frequency signals that are recorded during relaxation by a receiver coil. A modulation of the main magnetic field by gradient coils allows the position of the target tissue to be encoded, which makes it possible to reconstruct an image. Because the water content of different types of tissues varies, the recorded magnetic resonance signal produced by the water protons also changes, which makes it possible to produce images with strong contrasts between different tissues. MRI has the advantages of being non-invasive (i.e. it is possible to do MRI without injecting any tracer) and of avoiding exposure to X-rays. The design of pulse sequences, which define the successive changes in the operation of the gradient coils, leads to different image properties. We will now describe the main types of MR images used in neuroscience.
Anatomical MRI (also called structural MRI) uses pulse sequences that produce images where the structures of the brain and spinal cord are highly contrasted. This is the tool of choice to study the morphology of the brain in a quantitative manner, i.e. to perform morphometry studies. The most usual pulse sequences aim at measuring the difference in T1 relaxation time (spin-lattice relaxation time) between tissues, and in particular between the grey and white matter. See Fig. 1.1 for an example.
Functional MRI (fMRI) measures the so-called Blood-Oxygen-Level Dependent (BOLD) effect. A local increase in neural activity demands energy consumption, which requires oxygen; this oxygen demand is actually over-compensated, which results in an increased concentration of oxy-hemoglobin compared to deoxy-hemoglobin, and therefore in an increased MR signal measured with the T2* relaxation time contrast. Functional MRI therefore measures an indirect signature of neural activity – the precise links between neural activity and the fMRI signal remain to be elucidated. Since its discovery in the 1990s, fMRI has become widely used to study brain function, mostly because it provides very good spatial resolution over the whole brain in a non-invasive manner, which has made it contribute significantly to the field of brain mapping. See an example of an fMRI volume in Fig. 1.2.

Electroencephalography and magnetoencephalography

Electroencephalography (EEG) and magnetoencephalography (MEG) respectively measure the electric potentials and the magnetic fields over a set of sensors positioned on or above the scalp (electrodes for EEG, magnetometers for MEG). The recorded signals are induced by synchronized electrical currents over large populations of neurons which share the same orientation, thus creating local modulations of the electric and magnetic fields that are large enough to be detected outside the head. While EEG is a very old tool with origins dating from the 19th century, MEG is more recent since it was developed in the 1960s. Both EEG and MEG provide a very high temporal resolution (on the order of a few milliseconds), which makes them very useful in research settings, in particular to study the oscillatory behaviors of neuronal activity. Moreover, EEG is a standard tool in clinical settings, where it can help characterize epileptic seizures, diagnose psychiatric disorders or prognosticate the evolution of comatose patients. Although some research results are encouraging for a future adoption of MEG in hospitals, it is not, as of today, an approved tool for clinical applications.

Inference in neuroimaging

The main objectives of neuroimaging as a field of medical imaging can be categorized as follows:
• in clinical settings, the goal of neuroimaging is to help to decide whether a patient carries a neurological or psychiatric disease, or to predict his/her evolution with regard to such pathologies;
• in research settings, functional neuroimaging attempts to understand brain function in its normal healthy state;
• in pharmacology, the effectiveness of a drug can be quantified by examining its spread throughout the brain or its effect on the modulation of brain processes thanks to neuroimaging;
• finally, another objective is to build databases that describe the normal CNS.
Overall, reaching any of these four objectives requires examining groups of individuals and addressing two main questions:
• finding commonalities across subjects within a population;
• finding differences between subjects belonging to different populations.
These two main questions can be addressed using univariate or multivariate statistical tools in different ways that we describe in the following sections. In short, univariate methods examine a single voxel (the 3D equivalent of a pixel) of the images at a time, applying the same analysis model repeatedly and independently at each voxel. In contrast, multivariate methods consider groups of voxels – or even all the voxels – in a single model.

Univariate techniques

The General Linear Model

The tool of choice for building univariate methods in neuroimaging is the so-called General Linear Model (GLM), which writes Y = X β + ε. In neuroimaging, it is used as follows:
• Y is a vector of length n which contains n data points recorded at a given location, be it a pixel, a voxel or a single electrode;
• X is the so-called design matrix, of size n × m, which is composed of m regressors, each containing a variable that we believe contributes to explaining Y; it is specified by the experimenter;
• β is a weight vector of size m, the i-th value weighting the i-th regressor of X; it is the vector that needs to be estimated;
• ε is the residual vector of size n, which contains everything in Y that cannot be explained by X.
An example of application of the GLM is shown in Fig. 1.3. This model is called general because it encompasses several classical statistical models such as simple linear regression, multiple linear regression and the analysis of variance (ANOVA). It has been massively used in neuroimaging, mostly because of the success of the SPM software (see [Ashburner 2012] for a historical perspective on SPM). SPM, which stands for Statistical Parametric Mapping, first implemented the GLM for PET data before making it available for fMRI and anatomical MRI (aMRI). It falls into the realm of massively univariate methods: because it works on data from a single voxel, the model attempts to explain the behavior of a single variable, hence the term univariate; the term massive refers to the repeated use of the same model (i.e. the same design matrix X) on the very large number of voxels available in neuroimaging datasets.
The analysis then relies on a contrast, i.e. a linear combination c^T β of the parameters β of the model, in order to:
• obtain a statistical parametric map that covers the brain with t or F values, together with the associated map of p-values;
• perform inference on this statistical map to detect where the null hypothesis can be rejected, including corrections for the multiple comparison problem (see below).
The simplest contrast is a vector c = [0, …, 1, …, 0], where only the i-th weight is non-zero and equal to one. In this case, c^T β = β_i, and the null hypothesis is simply β_i = 0. When this null hypothesis is rejected, it means that the i-th regressor of the design matrix X actually contributes to explaining the data Y. For instance, if Y contains one data point per subject and X_i is the age of each subject, rejecting this null hypothesis is interpreted as age having a significant effect in explaining Y. In general, when c is a vector, the null hypothesis bears on a single linear combination of the β_i's and the associated statistical test is a Student t-test; when c is a matrix, thus testing several linear combinations of the β_i's at the same time, the associated statistic is Fisher's F.
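To make this concrete, here is a minimal sketch in Python (NumPy/SciPy) of the least-squares estimation of β and of a t-test for a contrast c; the function name, the toy data and the design matrix are illustrative assumptions, not taken from the thesis.

```python
import numpy as np
from scipy import stats

def glm_contrast_t(Y, X, c):
    """Fit the GLM Y = X beta + eps by ordinary least squares and return
    the t statistic and two-sided p-value for the null hypothesis c^T beta = 0."""
    n = X.shape[0]
    beta, _, _, _ = np.linalg.lstsq(X, Y, rcond=None)      # OLS estimate of beta
    residuals = Y - X @ beta
    dof = n - np.linalg.matrix_rank(X)                     # residual degrees of freedom
    sigma2 = residuals @ residuals / dof                   # residual variance
    var_contrast = sigma2 * (c @ np.linalg.pinv(X.T @ X) @ c)
    t = (c @ beta) / np.sqrt(var_contrast)
    p = 2 * stats.t.sf(np.abs(t), dof)
    return t, p

# Toy example: does the second regressor (standing here for age) explain Y?
rng = np.random.default_rng(0)
n = 40
X = np.column_stack([np.ones(n), rng.normal(size=n)])      # intercept + "age"
Y = X @ np.array([1.0, 0.5]) + rng.normal(scale=0.3, size=n)
print(glm_contrast_t(Y, X, np.array([0.0, 1.0])))
```

In a massively univariate analysis, the same function would simply be applied to the data of every voxel with the same design matrix X.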
Another important point lies in the fact that the application of the same model at all voxels implies performing a number of tests equal to the number of voxels, which can be on the order of 10^5 to 10^6 depending on the modality. Even if the null hypothesis is true everywhere, this will produce a large number of voxels that pass a test defined by p < 0.05. We therefore need to correct for this effect, which is called the multiple comparison problem. The simplest technique for voxel-wise inference is the Bonferroni procedure to control the family-wise error rate: the critical p-value is simply divided by the number of tests, which raises the threshold on the statistic. But it is known to lack power. A standard strategy consists in examining clusters of suprathreshold voxels (for a given fixed threshold that can be informed by uncorrected pointwise p-values) and performing statistical assessment on the clusters, which vastly reduces the number of tests. This can be done using Random Field Theory [Worsley et al. 1992], which parametrically links point-wise statistics with the expected size of suprathreshold clusters using smoothness assumptions on the statistical map. Besides these parametric approaches, one can also resort to non-parametric strategies for either voxel-wise or cluster-wise inference, for instance using permutation-based approaches [Bullmore et al. 1999; Nichols et al. 2002a].
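As a simple illustration, the sketch below computes a Bonferroni-corrected critical t value and, as one possible non-parametric alternative, a voxel-wise threshold based on the maximum statistic over sign-flipping permutations. The one-sample group setting and the function names are assumptions made purely for this example, not the procedures used in the thesis.

```python
import numpy as np
from scipy import stats

def bonferroni_threshold(alpha, n_tests, dof):
    """Critical t value obtained by dividing the significance level by the number of tests."""
    return stats.t.isf(alpha / n_tests, dof)

def permutation_max_threshold(contrast_maps, n_perm=1000, alpha=0.05, seed=0):
    """Voxel-wise family-wise error threshold from the distribution of the maximum
    statistic under sign-flipping permutations (one-sample group test)."""
    rng = np.random.default_rng(seed)
    n_subjects = contrast_maps.shape[0]
    max_t = np.empty(n_perm)
    for i in range(n_perm):
        signs = rng.choice([-1.0, 1.0], size=(n_subjects, 1))   # randomly flip each subject
        t, _ = stats.ttest_1samp(signs * contrast_maps, popmean=0.0, axis=0)
        max_t[i] = np.max(t)                                    # keep the maximum over voxels
    return np.quantile(max_t, 1.0 - alpha)
```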
We will now describe how this General Linear Model is implemented to address the two most common instances of the two types of questions described in Section 1.2.


Group analysis in fMRI

Functional MRI experiments consist in having the subject sequentially perform several repetitions of one or several tasks while lying in the scanner, in a predetermined manner that defines the experimental paradigm. During the several minutes of the experiment, fMRI volumes are acquired continuously, typically one volume every 2 to 3 seconds, to form a 3D+time dataset. Most often, the same experiment is performed on several subjects, and the objective of a group analysis is to find significant effects, i.e. locations for which we can reject the null hypothesis for a contrast of interest, that are common across the population. In this case, we use a two-level GLM.
The first level is the subject level. The fMRI data of each subject s is composed of time series available for each voxel v of the brain, which we denote Y_{v,s}(t). Some parts of the brain will be activated by the experimental paradigm and the BOLD response should then correlate with the paradigm, which we therefore use to define the design matrix X. If the subject is asked to alternately perform several tasks – or several variants of the same task – the time series of each task/variant is included as a regressor in the design matrix and the GLM implements a multiple regression: Y_{v,s}(t) = X β_{v,s} + ε_{v,s} is estimated at each voxel v and for each subject s. The subject-level contrast maps c_{v,s} = c^T β̂_{v,s} are then computed for a given contrast c (where c can for example implement the null hypothesis that two of the tasks produce the same BOLD response). This first-level GLM is illustrated in detail in Figs. 1.3 and 1.4.
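A minimal sketch of such a first-level analysis is given below (Python/NumPy): it builds simple boxcar regressors from the task onsets and returns the subject-level contrast map. The function name, the data layout and the use of plain boxcars (without convolution with a haemodynamic response function) are simplifying assumptions made for illustration only.

```python
import numpy as np

def first_level_contrast_map(bold, onsets, tr, c):
    """Per-voxel first-level GLM for one subject.

    bold   : array (n_scans, n_voxels) of fMRI time series
    onsets : dict mapping condition name -> list of (start, duration) in seconds
    tr     : repetition time in seconds
    c      : contrast vector over the conditions (plus intercept)
    Returns the contrast map c^T beta_hat at every voxel."""
    n_scans, _ = bold.shape
    times = np.arange(n_scans) * tr
    regressors = []
    for condition, events in onsets.items():
        box = np.zeros(n_scans)                              # boxcar regressor for this condition
        for start, duration in events:
            box[(times >= start) & (times < start + duration)] = 1.0
        regressors.append(box)
    X = np.column_stack(regressors + [np.ones(n_scans)])     # conditions + intercept
    beta, _, _, _ = np.linalg.lstsq(X, bold, rcond=None)     # (n_regressors, n_voxels)
    return c @ beta                                          # contrast map (n_voxels,)
```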
Then, a second-level GLM is estimated at the group level, where the data Y_g^v at voxel v contains one data point per subject, namely the contrast value c_{v,s} estimated on each subject with the first-level GLM (note that this requires that voxel v designates the same brain location for all subjects, which is achieved by a process called spatial normalization). The simplest question that can be asked at the group level is to determine where in the brain the contrast values are non-null over the population. This can be done by including a constant regressor (a column of ones) in the group-level design matrix X_g. The model estimated at each voxel v is then Y_g^v = X_g β_v + ε, which in fact implements a one-sample t-test. The resulting t map can then be processed by the spatial inference described previously in order to deal with the multiple comparison problem. The clusters that survive are locations where the contrast c is significantly non-null over the population, thus answering our initial question.
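Under this formulation, the second level reduces to a one-sample t-test across subjects at every voxel, as in the following minimal sketch (SciPy); the function name and data layout are illustrative assumptions.

```python
import numpy as np
from scipy import stats

def second_level_one_sample(contrast_maps):
    """Second-level GLM with only a constant regressor, i.e. a one-sample t-test.

    contrast_maps : array (n_subjects, n_voxels) of subject-level contrast values
    Returns the t map and p map testing a non-null group mean at every voxel."""
    return stats.ttest_1samp(contrast_maps, popmean=0.0, axis=0)
```

The resulting t map would then be thresholded with one of the multiple-comparison corrections described above.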

Voxel-based morphometry

When processing anatomical MR data, traditional morphometry consists in measuring – often manually – the volume of a given brain structure and performing a statistical analysis to estimate potential differences between subjects belonging to different populations: a simple two-sample t-test can then be used to assess the difference in volume between healthy controls and patients. For instance, the volume of the hippocampus is known to be smaller in patients with Alzheimer's disease than in healthy subjects [Schott et al. 2003].
In order to detect morphological differences smaller than what region-based approaches allow, the voxel-based morphometry framework (see [Ashburner 2009] for a review) starts by estimating the density of gray matter Y_v(s) at each voxel v of the brain of each subject s, after spatial normalization. By defining the vector Y_v containing the density values at voxel v for all subjects, the GLM Y_v = X β_v + ε_v can be used to assess differences between populations. In order to do so, X needs to encode the fact that our set of observations comprises two populations (for instance, with one regressor that takes the value 1 for subjects belonging to the first population and -1 for subjects of the other), and a contrast c should be defined to test the null hypothesis that there is no difference between the two populations. We then obtain a map of t values that is processed as previously, to determine at which locations of the brain the null hypothesis of no difference in gray-matter density across populations can be rejected.
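A possible sketch of this voxel-based morphometry GLM is given below (NumPy): the design matrix contains the +1/-1 group regressor described above plus an intercept, and the t map for the group regressor is computed at every voxel. The function name and data layout are illustrative assumptions, not the thesis code.

```python
import numpy as np

def vbm_group_difference_t(gm_density, group_labels):
    """Voxel-wise GLM for a two-population comparison of gray-matter density.

    gm_density   : array (n_subjects, n_voxels), spatially normalized density maps
    group_labels : array (n_subjects,) with +1 for one population and -1 for the other
    Returns the t map for the group-difference regressor at every voxel."""
    n_subjects, _ = gm_density.shape
    X = np.column_stack([group_labels, np.ones(n_subjects)])  # group regressor + intercept
    beta, _, _, _ = np.linalg.lstsq(X, gm_density, rcond=None)
    residuals = gm_density - X @ beta
    dof = n_subjects - np.linalg.matrix_rank(X)
    sigma2 = np.sum(residuals ** 2, axis=0) / dof              # per-voxel residual variance
    c = np.array([1.0, 0.0])                                   # test the group regressor only
    var_contrast = sigma2 * (c @ np.linalg.pinv(X.T @ X) @ c)
    return (c @ beta) / np.sqrt(var_contrast)
```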

Multivariate machine learning techniques

General setting: supervised learning

The goal of supervised learning is to learn a function that expresses as explicitly as possible the relationship between two spaces, an input space X and a target space Y. When Y is a discrete set such as {1, …, C}, this problem is known as classification; when Y is continuous, for instance when Y = R, it is known as regression.
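As a minimal illustration of the two settings, a classifier and a regressor can be fitted in the same way, for instance with scikit-learn; the toy data, targets and model choices below are purely illustrative assumptions.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))            # 100 samples in a 10-dimensional input space

# Classification: the target space is a discrete set {0, 1}
y_class = (X[:, 0] > 0).astype(int)
clf = SVC(kernel="linear").fit(X, y_class)
print(clf.predict(X[:5]))

# Regression: the target space is continuous (the real line)
y_reg = X @ rng.normal(size=10) + 0.1 * rng.normal(size=100)
reg = Ridge(alpha=1.0).fit(X, y_reg)
print(reg.predict(X[:5]))
```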

Table of contents

1 Neuroimaging: a primer 
1.1 Neuroimaging acquisition techniques
1.1.1 Computed Axial Tomography
1.1.2 Positron emission tomography
1.1.3 Magnetic resonance imaging
1.1.4 Electroencephalography and magnetoencephalography
1.2 Inference in neuroimaging
1.3 Univariate techniques
1.3.1 The General Linear Model
1.3.2 Group analysis in fMRI
1.3.3 Voxel-based morphometry
1.4 Multivariate machine learning techniques
1.4.1 General setting: supervised learning
1.4.2 Multi-Voxel Pattern Analysis of functional MRI data
1.4.3 Computer-aided diagnosis tools for aMRI
2 Inter-subject learning as a multi-source problem 
2.1 Multi-source learning
2.1.1 Multi-source setting
2.1.2 Link with multi-view and multi-task learning
2.2 A multi-source setting for inter-subject prediction
2.2.1 Dataset and probabilistic model
2.2.2 Addressed problems
3 State of the art
3.1 Constructing invariant representations
3.1.1 Feature engineering
3.1.2 Structured representations
3.1.3 Representation learning
3.2 Domain adaptation
3.2.1 Looking for shared representations
3.2.2 Instance weighting
3.2.3 Iterative approaches
3.3 Multi-source-specific methods
3.3.1 Multi-source domain adaptation
3.3.2 Boosting-based methods
3.3.3 Multi-task models
3.3.4 Other approaches
3.4 Other approaches for inter-subject learning
3.4.1 Hyperalignment
3.4.2 Spatial regularization
4 Graph-based Support Vector Classification for inter-subject decoding of fMRI data
4.1 Introduction
4.2 Materials and methods
4.2.1 Graph-based Support Vector Classification (G-SVC)
4.2.2 Graphical representation of fMRI patterns
4.2.3 Graph similarity
4.2.4 Datasets
4.2.5 Evaluation framework
4.3 Results
4.3.1 Results on artificial data: G-SVC vs. vector-based methods
4.3.2 Results on real data: G-SVC vs. vector-based methods
4.3.3 Results on real data: G-SVC vs parcel-based methods
4.3.4 Results on real data: G-SVC with variable number of nodes
4.3.5 Results on real data: influence of each graph attribute
4.3.6 Kernel parameters
4.4 Discussion
4.4.1 Hyper-parameters estimation
4.4.2 Linear vs nonlinear classifiers
4.4.3 Examining assumptions and potential applications
4.4.4 Which graph kernel for fMRI graphs?
4.5 Conclusion
4.6 Appendix – Within-subject G-SVC decoding results
4.7 Appendix – Testing pattern symmetry using G-SVC
4.8 Appendix – Inter-region decoding using G-SVC
5 Mapping cortical shape differences using a searchlight approach based on classification of sulcal pit graphs 
5.1 Introduction
5.2 Methods
5.2.1 Extracting sulcal pits
5.2.2 Representing patterns of sulcal pits as graphs
5.2.3 Graph-based support vector classification
5.2.4 Searchlight mapping
5.2.5 Multi-scale spatial inference
5.2.6 Interpretation-aiding visualization tools
5.3 Experiments
5.3.1 Mapping gender and hemispheric differences
5.3.2 Results: methodological considerations
5.3.3 Results: neuroscience considerations
5.4 Discussion
5.4.1 Exploring the relevance of our results
5.4.2 Searchlight statistical analysis
5.4.3 On the necessity of the multi-scale approach
5.4.4 A kernel-based multivariate classification model
5.5 Conclusion
6 Multi-source kernel mean matching for inter-subject decoding of MEG data
6.1 Introduction
6.2 A reminder on kernel mean matching
6.2.1 Instance weighting for domain adaptation
6.2.2 Kernel Mean Matching
6.2.3 A transductive domain adaptation classifier
6.3 Multi-source kernel mean matching
6.3.1 Multi-source setting
6.3.2 Multi-source kernel mean matching
6.3.3 Limiting cases of the model
6.4 Simulations
6.4.1 Dataset and pre-processing
6.4.2 Experiments
6.4.3 Results
6.5 Discussion and conclusion
6.6 Appendix – Solving the KMM optimization problem using cvxopt
6.7 Appendix – Solving the MSKMM optimization problem using cvxopt
7 Conclusion 
Bibliography
