Cluster-based permutation tests on time-frequency data
Introduction
The objective of this tutorial is to give an introduction to the statistical analysis of EEG and MEG data (denoted as MEEG data in the following) by means of cluster-based permutation tests.
The tutorial starts with a long background section that sketches the background of permutation tests. The next sections are more tutorial-like. They deal with the analysis of an actual MEG dataset. In a step-by-step fashion, it will be shown how to statistically compare the data observed in two experimental conditions in a between-trials, in a within-trials and in a within-subjects design. For this, we will use planar TFR's.
In this tutorial we will continue working on the dataset described in the preprocessing tutorials. Below we will repeat code to select the trials and preprocess the data as described in the first tutorials . We assume that the preprocessing (see Trigger-based trial selection), calculation of the planar gradient (see Event related averaging and planar gradient) and the time-frequency analysis (see Time-frequency analysis using Hanning window, multitapers and wavelets) are already clear for the reader.
This tutorial is not covering statistical test on event-related fields. If you are interested in that, you can read the Cluster-based permutation tests on event related fields tutorial. If you are interested how parametric statistical tests can be used with FieldTrip, you can read the Parametric and non-parametric statistics on event-related fields tutorial.
Background
Procedure
In this tutorial we will consider a between-trials experiment, in which we analyze the data of a single subject. For the statistical analysis for this experiment we calculate the planar TFRs. The steps we perform are as follows:
- Preprocessing with the ft_definetrial and with the ft_preprocessing functions
- Calculation of the planar gradient and time-frequency analysis with the ft_megplanar, with the ft_freqanalysis and ft_combineplanar functions
- Permutation test with the ft_freqstatistics function
- Plotting the result using the ft_freqdescriptives and the ft_clusterplot functions
Figure 1. Pipeline of statistical analysis of planar TFR's in a between trials design
Subsequently we will consider a within-trials experiment, in which we compare the pre-stimulus baseline to the post-stimulus activity time window. The steps we perform are as follows
- Preprocessing with ft_definetrial and with ft_preprocessing and selecting the appropriate time windows with the ft_redefinetrial function
- Calculation of the planar gradient and time-frequency analysis with the ft_megplanar, with the ft_freqanalysis and ft_combineplanar functions
- Permutation test with the ft_freqstatistics function
- Plotting the result using ft_clusterplot
Figure 2. Pipeline of statistical analysis of planar TFR's in a within-trials design
Finally we will consider a within-subjects experiment with the following steps:
- Preprocessing with ft_definetrial and with ft_preprocessing
- Calculation of the planar gradient and time-frequency analysis with the ft_megplanar, with the ft_freqanalysis and ft_combineplanar functions
- Calculation of the grandaverage with the ft_freqgrandaverage function
- Permutation test with ft_freqstatistics
- Plotting the result using ft_clusterplot
Figure 3. Pipeline of statistical analysis of planar TFR's in a within-subjects design
Between-trial experiments
In a between-trials experiment, we analyze the data of a single subject. By means of a statistical test, we want to answer the question whether there is a systematic difference in the MEG recorded on trials with a fully congruent and trials with a fully incongruent sentence ending.
Preprocessing
We first extract the trials of the fully incongruent condition.
Then we also extract the trails of the fully congruent condition.
Calculation of the planar gradient and time-frequency analysis
Before calculating the TFRs we calculate the planar gradient with ft_megplanar. This requires preprocessed data (see above), which is also available from the FieldTrip FTP server (dataFIC.mat and dataFC.mat).
load dataFIC load dataFC cfg = []; cfg.planarmethod = 'sincos'; % prepare_neighbours determines with what sensors the planar gradient is computed cfg_neighb.method = 'distance'; cfg.neighbours = ft_prepare_neighbours(cfg_neighb, dataFC); dataFIC_planar = ft_megplanar(cfg, dataFIC); dataFC_planar = ft_megplanar(cfg, dataFC);
Without a prior hypothesis (i.e., an hypothesis that is independent of the present data), we must test the difference between the FC and the FIC condition for all frequency bands that may be of interest. However, such an analysis is not suited as a first step in a tutorial. Therefore, we assume that we have a prior hypothesis about the frequency band in which the FC and FIC condition may differ. This frequency band is centered at 20 Hz. We will now investigate if there is a difference between the FC and the FIC condition at this frequency. To calculate the TFRs, we use ft_freqanalysis with the configuration below. See the tutorial Time-frequency analysis using Hanning window, multitapers and wavelets for an explanation of the settings. Note that cfg.keeptrials = ‘yes’, which is necessary for the subsequent statistical analysis.
cfg = []; cfg.output = 'pow'; cfg.channel = 'MEG'; cfg.method = 'mtmconvol'; cfg.taper = 'hanning'; cfg.foi = 20; cfg.toi = [-1:0.05:2.0]; cfg.t_ftimwin = 7./cfg.foi; %7 cycles cfg.keeptrials = 'yes';
Calculate the TFRs for the two experimental conditions (FC and FIC).
freqFC_planar = ft_freqanalysis(cfg, dataFC_planar); freqFIC_planar = ft_freqanalysis(cfg, dataFIC_planar);
Finally, we calculate the combined planar gradient and copy the gradiometer structure in the new datasets.
cfg = []; freqFIC_planar_cmb = ft_combineplanar(cfg, freqFIC_planar); freqFC_planar_cmb = ft_combineplanar(cfg, freqFC_planar); freqFIC_planar_cmb.grad = dataFIC.grad; freqFC_planar_cmb.grad = dataFC.grad;
To save:
save freqFIC_planar_cmb freqFIC_planar_cmb; save freqFC_planar_cmb freqFC_planar_cmb;
Permutation test
Now, run ft_freqstatistics to compare freqFIC_planar_cmb and freqFC_planar_cmb. Except for the field cfg.latency, the following configuration is identical to the configuration that was used for comparing event-related averages in the cluster-based permutation tests on event related fields tutorial. Also see this tutorial for a detailed explanation of all the configuration settings. You can read more about the ft_prepare_neighbours function in the FAQ's.
To load the planar gradient TFRs (also available on the FieldTrip FTP servers, freqFIC_planar_cmb.mat and freqFC_planar_cmb.mat), use:
load freqFIC_planar_cmb load freqFC_planar_cmb cfg = []; cfg.channel = {'MEG', '-MLP31', '-MLO12'}; cfg.latency = 'all'; cfg.frequency = 20; cfg.method = 'montecarlo'; cfg.statistic = 'ft_statfun_indepsamplesT'; cfg.correctm = 'cluster'; cfg.clusteralpha = 0.05; cfg.clusterstatistic = 'maxsum'; cfg.minnbchan = 2; cfg.tail = 0; cfg.clustertail = 0; cfg.alpha = 0.025; cfg.numrandomization = 500; % prepare_neighbours determines what sensors may form clusters cfg_neighb.method = 'distance'; cfg.neighbours = ft_prepare_neighbours(cfg_neighb, dataFC); design = zeros(1,size(freqFIC_planar_cmb.powspctrm,1) + size(freqFC_planar_cmb.powspctrm,1)); design(1,1:size(freqFIC_planar_cmb.powspctrm,1)) = 1; design(1,(size(freqFIC_planar_cmb.powspctrm,1)+1):(size(freqFIC_planar_cmb.powspctrm,1)+... size(freqFC_planar_cmb.powspctrm,1))) = 2; cfg.design = design; cfg.ivar = 1; [stat] = ft_freqstatistics(cfg, freqFIC_planar_cmb, freqFC_planar_cmb);
Save the output:
save stat_freq_planar_FICvsFC stat;
Plotting the results
The output can also be obtained from stat_freq_planar_FICvsFC.mat. If you need to reload the statistics output, use:
load stat_freq_planar_FICvsFC
By inspecting stat.posclusters and stat.negclusters, you will see that there is one large cluster that shows a negative effect and no large clusters showing a positive effect.
To show the topography of the negative cluster, we make use of ft_clusterplot. This is a function that displays the channels that contribute to large clusters, based on which the null-hypothesis can be rejected. First we use ft_freqdescriptives to average over the trials.
cfg = []; freqFIC_planar_cmb = ft_freqdescriptives(cfg, freqFIC_planar_cmb); freqFC_planar_cmb = ft_freqdescriptives(cfg, freqFC_planar_cmb);
Subsequently we add the raw effect (FIC-FC) to the obtained stat structure and plot the largest cluster overlayed on the raw effect.
stat.raweffect = freqFIC_planar_cmb.powspctrm - freqFC_planar_cmb.powspctrm; cfg = []; cfg.alpha = 0.025; cfg.parameter = 'raweffect'; cfg.zlim = [-1e-27 1e-27]; cfg.layout = 'CTF151_helmet.mat'; ft_clusterplot(cfg, stat);
Figure 1: Raw effect (FIC-FC) and channel-time cluster of planar gradient TFRs of subject 1
Within trial experiments
We will now show how to statistically test the difference between the TFRs in the pre-stimulus (baseline) and the post-stimulus (activation) period of the fully incongruent sentence endings. To perform this comparison by means of a permutation test, we have to select equal-length non-overlapping time intervals in the baseline and the activation period. For the baseline period we choose [-1 0], which is the time interval from 1 to 0 seconds before stimulus onset. And for the activation period we choose [0.6 1.6], which is the time interval from 0.6 to 1.6 seconds after stimulus onset.
It must be stressed that the time windows we choose to compare are nonoverlapping and of equal length. This constraint follows from the null hypothesis that is tested with a permutation test. This null hypothesis involves that the data (spatiotemporal matrices) observed in the two experimental conditions are drawn from the same probability distribution. This null hypothesis only makes sense if the dimensions of the data matrices in the two experimental conditions are equal. In other words, the number of channels and the number of time points of these spatiotemporal data matrices must be equal. This also applies to a within trials experiment in which a baseline and an activation condition are compared: the number of channels and time points of the baseline and the activation data matrices must be equal.
Preprocessing and freqanalysis on planar data
We first select equal-length non-overlapping time intervals in the baseline and the activation period. For the baseline period we choose [-1 0], the time interval from 1 to 0 seconds before stimulus onset. And for the activation period we choose [0.6 1.6], the time interval from 0.6 to 1.6 seconds after stimulus onset. We will again focus on the power in the beta band (around 20 Hz).
We now calculate the TFRs for the baseline and the activation period. We use again the preprocessed dataFIC (see above or download from ftp://ftp.fieldtriptoolbox.org/pub/fieldtrip/tutorial/cluster_permutation_freq/dataFIC.mat).
load dataFIC
To calculate the TFRs with the planar gradient, we first cut out the time intervals of interest with the function ft_redefinetrial.
cfg = []; cfg.toilim = [-1.0 0]; dataFIC_baseline = ft_redefinetrial(cfg, dataFIC); cfg = []; cfg.toilim = [0.6 1.6]; dataFIC_activation = ft_redefinetrial(cfg, dataFIC);
ft_freqstatistics will always compare data at the same time point, and therefore you get an error if you try to statistically test differences between data structures with non-overlapping time axes. This implies that, in order to statistically test the differences between baseline and activation period data, we have to trick the function by making the time axis the same. We do this by copying the time axis of dataFIC_activation onto dataFIC_baseline.
dataFIC_baseline.time = dataFIC_activation.time;
To calculate the TFRs for the synthetic planar gradient data, we must run the following code:
cfg = []; cfg.planarmethod = 'sincos'; % prepare_neighbours determines with what sensors the planar gradient is computed cfg_neighb.method = 'distance'; cfg.neighbours = ft_prepare_neighbours(cfg_neighb, dataFIC_activation); dataFIC_activation_planar = ft_megplanar(cfg, dataFIC_activation); dataFIC_baseline_planar = ft_megplanar(cfg, dataFIC_baseline); cfg = []; cfg.output = 'pow'; cfg.channel = 'MEG'; cfg.method = 'mtmconvol'; cfg.taper = 'hanning'; cfg.foi = 20; cfg.toi = [0.6:0.05:1.6]; cfg.t_ftimwin = 7./cfg.foi; %7 cycles cfg.keeptrials = 'yes'; freqFIC_activation_planar = ft_freqanalysis(cfg, dataFIC_activation_planar); freqFIC_baseline_planar = ft_freqanalysis(cfg, dataFIC_baseline_planar);
Finally, we combine the two planar gradients at each sensor and add the gradiometer definition:
cfg = []; freqFIC_baseline_planar_cmb = ft_combineplanar(cfg,freqFIC_baseline_planar); freqFIC_activation_planar_cmb = ft_combineplanar(cfg,freqFIC_activation_planar); freqFIC_baseline_planar_cmb.grad = dataFIC.grad; freqFIC_activation_planar_cmb.grad = dataFIC.grad;
To save:
save freqFIC_baseline_planar_cmb freqFIC_baseline_planar_cmb; save freqFIC_activation_planar_cmb freqFIC_activation_planar_cmb;
Permutation test
To load the planar gradient TFRs (download from FieldTrip FTP, freqFIC_baseline_planar_cmb.mat and freqFIC_activation_planar_cmb.mat), use:
load freqFIC_baseline_planar_cmb load freqFIC_activation_planar_cmb
To compare freqFIC_baseline_planar_cmb and freqFIC_activation_planar_cmb by means of ft_freqstatistics, we use the following configuration:
cfg = []; cfg.channel = {'MEG', '-MLP31', '-MLO12'}; cfg.latency = [0.8 1.4]; cfg.method = 'montecarlo'; cfg.frequency = 20; cfg.statistic = 'ft_statfun_actvsblT'; cfg.correctm = 'cluster'; cfg.clusteralpha = 0.05; cfg.clusterstatistic = 'maxsum'; cfg.minnbchan = 2; cfg.tail = 0; cfg.clustertail = 0; cfg.alpha = 0.025; cfg.numrandomization = 500; % prepare_neighbours determines what sensors may form clusters cfg_neighb.method = 'distance'; cfg.neighbours = ft_prepare_neighbours(cfg_neighb, freqFIC_activation_planar_cmb); ntrials = size(freqFIC_activation_planar_cmb.powspctrm,1); design = zeros(2,2*ntrials); design(1,1:ntrials) = 1; design(1,ntrials+1:2*ntrials) = 2; design(2,1:ntrials) = [1:ntrials]; design(2,ntrials+1:2*ntrials) = [1:ntrials]; cfg.design = design; cfg.ivar = 1; cfg.uvar = 2;
This configuration for a within-trials experiment is very similar to the configuration for the within-subjects experiment in the "Cluster-based permutation tests on event related fields" tutorial in which we compared the evoked responses to fully incongruent and fully congruent sentence endings. The main difference is the measure that we use to evaluate the effect at the sample level (cfg.statistic = 'ft_statfun_actvsblT' instead of cfg.statistic = 'ft_statfun_depsamplesT'). With cfg.statistic = 'ft_statfun_actvsblT', we choose the so-called activation-versus-baseline T-statistic. This statistic compares the power in every sample (i.e., a (channel,frequency,time)-triplet) in the activation period with the corresponding time-averaged power (i.e., the average over the temporal dimension) in the baseline period. The comparison of the activation and the time-averaged baseline power is performed by means of a dependent samples T-statistic.
You can read more about the ft_prepare_neighbours function in the FAQ's.
We can now run ft_freqstatistics
[stat] = ft_freqstatistics(cfg, freqFIC_activation_planar_cmb, freqFIC_baseline_planar_cmb);
Save the output:
save stat_freqFIC_ACTvsBL stat;
The output can also be obtained from the FieldTrip FTP server ( stat_freqFIC_ACTvsBL.mat). If you need to reload the statistics output, use:
load stat_freqFIC_ACTvsBL
By inspecting stat.posclusters and stat.negclusters, you will see that there is one substantially large cluster showing a positive effect and no large clusters showing a negative effect.
Plotting the results
This time we will plot the largest cluster on top of the statistics, which are present in the .stat field.
cfg = []; cfg.alpha = 0.025; cfg.parameter = 'stat'; cfg.zlim = [-4 4]; cfg.layout = 'CTF151_helmet.mat'; ft_clusterplot(cfg, stat);
Figure 2: Largest cluster that shows a difference between activation and baseline, plotted on top of the T-statistic of the difference.
Within subjects experiments
In this paragraph we describe permutation testing for TFRs obtained in experiments involving multiple subjects that are each observed in multiple experimental conditions. Every subject is observed in a large number of trials, each one belonging to one experimental condition. For every subject, averages are computed over all trials belonging to each of the experimental conditions. Thus, for every subject, the data are summarized in an array of condition-specific averages. The permutation test that will be described in the following informs us about the following null hypothesis: the probability distribution of the condition-specific averages is identical for all experimental conditions.
Preprocessing, planar gradient and grandaverage
To test the difference between the average TFRs for fully incongruent (FIC) and fully congruent (FC) sentence endings we use planar gradient data. To load the data structures containing time frequency grand averages of all ten subjects (available from ftp://ftp.fieldtriptoolbox.org/pub/fieldtrip/tutorial/cluster_permutation_freq/GA_TFR_orig.mat), use:
load GA_TFR_orig;
The averages of the TFRs for the fully incongruent and the fully congruent sentence endings are stored in GA_TFRFIC and GA_TFRFC. These averages were calculated using ft_freqgrandaverage .
Permutation test
We now perform the permutation test using ft_freqstatistics. The configuration setting for this analysis are almost identical to the settings for the within-subjects experiment in the "Cluster-based permutation tests on event related fields" tutorial. The only difference is a small change in the latency window (cfg.latency). You can read more about the ft_prepare_neighbours function in the FAQ's.
cfg = []; cfg.channel = {'MEG'}; cfg.latency = [0 1.8]; cfg.frequency = 20; cfg.method = 'montecarlo'; cfg.statistic = 'ft_statfun_depsamplesT'; cfg.correctm = 'cluster'; cfg.clusteralpha = 0.05; cfg.clusterstatistic = 'maxsum'; cfg.minnbchan = 2; cfg.tail = 0; cfg.clustertail = 0; cfg.alpha = 0.025; cfg.numrandomization = 500; % specifies with which sensors other sensors can form clusters cfg_neighb.method = 'distance'; cfg.neighbours = ft_prepare_neighbours(cfg_neighb, GA_TFRFC); subj = 10; design = zeros(2,2*subj); for i = 1:subj design(1,i) = i; end for i = 1:subj design(1,subj+i) = i; end design(2,1:subj) = 1; design(2,subj+1:2*subj) = 2; cfg.design = design; cfg.uvar = 1; cfg.ivar = 2; [stat] = ft_freqstatistics(cfg, GA_TFRFIC, GA_TFRFC)
Save the output:
save stat_freq_planar_FICvsFC_GA stat;
The output can also be obtained from stat_freq_planar_FICvsFC_GA.mat. If you need to reload the statistics output, use:
load stat_freq_planar_FICvsFC_GA
From inspection of stat.posclusters and stat.negclusters, we observe that there is one significant negative cluster.
Plotting the results
Plot again with ft_clusterplot:
cfg = []; cfg.alpha = 0.025; cfg.parameter = 'stat'; cfg.zlim = [-4 4]; cfg.layout = 'CTF151_helmet.mat'; ft_clusterplot(cfg, stat);
Figure 3: T-statistic of the difference (FIC-FC) (of the combined planar gradient TFRs) and largest channel-time clusters
Exercise
Summary and suggested further readings
In this tutorial, we showed how to do non-parametric statistical test, cluster-based permutation test on planar TFR's with between-trials, with within-trials and with within-subjects experimental designs. It was also shown how to plot the results.
If you are interested in parametric tests in FieldTrip, you can read the Parametric and non-parametric statistics on event-related fields tutorial. If you are interested in how to do the same statistics on event-related fields, you can read the Cluster-based permutation tests on event related fields tutorial.
If you would like to read more about issues related to statistical analysis, you can read the following as well:
FAQs:
2011/07/08 10:58 | Jörn M. Horschig | |
2011/07/08 10:54 | Jörn M. Horschig | |
2015/05/23 10:36 | ||
2015/11/02 20:51 | ||
2011/09/14 09:53 | Robert Oostenveld | |
2015/03/12 12:16 | Jim Herring | |
2011/07/08 10:56 | Jörn M. Horschig | |
2014/12/17 15:58 | Robert Oostenveld | |
2014/02/21 08:25 | ||
2009/12/10 09:28 | ||
2011/06/23 17:36 |
Example scripts:
2009/03/03 21:52 | ||
2017/10/09 09:46 | Robert Oostenveld | |
2009/03/05 16:56 | Saskia Haegens | |
2009/10/13 14:46 | ||
2016/11/11 09:45 | Robert Oostenveld |
This tutorial was last tested by Jörn with version r9460 (April 30 2014) of FieldTrip using MATLAB 2010b on a 64-bit Linux platform.