Getting started with Neuralynx data


The Neuralynx acquisition software writes a variety of file formats.

  • *.ncs single continuous channel file
  • *.nse single electrode waveform file
  • *.nts single stereotrode file
  • *.nst spike timestamps
  • *.ntt single tetrode file
  • *.nev event information

All files acquired during one recording are combined in a common directory. We refer to that directory as the “dataset”.

Neuralynx also writes a raw data file (*.nrd) in which all the channels are sampled at 32kHz and multiplexed. This file can get very large for the 256 channel recordings (up to 200GB). The standard output format for Neuralynx writes a single file for each channel. All channels together are combined in the “dataset directory”. All channels/files within that directory can be read simultaneously.


To get started, you should add the FieldTrip main directory to your path, and execute the ft_defaults function, which sets the defaults and configures up the minimal required path settings (see the faq):

addpath <full_path_to_fieldtrip>
ft_defaults


FieldTrip includes a number of low-level reading functions, located in fieldtrip/fileio/private:

read_neuralynx_bin.m
read_neuralynx_cds.m
read_neuralynx_dma.m
read_neuralynx_ds.m
read_neuralynx_ncs.m
read_neuralynx_nev.m
read_neuralynx_nse.m
read_neuralynx_nst.m
read_neuralynx_nts.m
read_neuralynx_ntt.m
read_neuralynx_sdma.m
read_neuralynx_tsh.m
read_neuralynx_tsl.m
read_neuralynx_ttl.m

These functions are used to read the actual files. If you are not sure whether your particular file is supported, you can “cd fieldtrip/fileio/private” and do

read_neuralynx_xxx('fullpath/filename.xxx')   % where XXX is the file format

This returns the content of a single-channel file as a Matlab structure.


To facilitate working with multichannel recordings, FieldTrip has an additional layer on top of the low level Neuralynx file reading functions. The idea is that all files belonging to a single recording are located in a single directory, which represents the “dataset” as a whole. The FieldTrip functions ft_read_header, ft_read_data, ft_read_event operate on the LFP and spike files in the dataset directory.

The LFP files are used for setting the sample “time” axis. If you only have spike files during a recording, you cannot merge them automatically. Merging is done by reading the LFP files (*.nsc), determining the first and last timestamp, and subsequently the spikes are represented as “1” in an other wise “0” channel. So the spike and LFP channels are jointly represented by ft_read_data in a nchan X nsamples matrix. This is also how the FieldTrip high level ft_preprocessing function accesses the collection of LFP and spike channels in the dataset.

>> ls dataset/
Events.Nev	csc01.ncs	csc02.ncs	csc03.ncs	sc1.nse		sc2.nse

>> hdr = ft_read_header('dataset')

hdr = 
                nChans: 5
                 label: {'csc028'  'csc028'  'csc028'  'Sc1'  'Sc1'}
              filename: {'dataset/csc01.ncs'  'dataset/csc02.ncs'  'dataset/csc03.ncs'  'dataset/sc1.nse'  'dataset/sc2.nse'}
               nTrials: 1
                    Fs: 32556
           nSamplesPre: 0
              nSamples: 97792
        FirstTimeStamp: 1007379572
         LastTimeStamp: 1010383712
    TimeStampPerSample: 30.7200
                  orig: [5x1 struct]

You can see from the header that some additional fields are included, such as the filename (needed to link a channel label to the corresponding file), the first timestamp, the last timestamp and the number of timestamps per channel.

It might be that you first only want to process the LFP channels and keep the spike channels separate. You can do that using the ft_read_spike function and the ft_appendspike function.


The events.nev file (which you probably use) only contains timestamps and not sample numbers. For writing trialfuns (see documentation) and using preprocessing to read the data, you should compute the corresponding sample numbers yourself by using hdr.FirstTimesStamp and hdr.TimeStampPerSample according to

  hdr   = ft_read_header('dataset_directory');
  event = ft_read_event('dataset_directory');
  
  for i=1:length(event)
    % the first sample in the datafile is 1
    event(i).sample = (event(i).timestamp-double(hdr.FirstTimeStamp))./hdr.TimeStampPerSample + 1;
  end

How can I deal with a discontinuous Neuralynx recording

It may occur that there are gaps in the Neuralynx recording traces, e.g. when the experimenter stops and re-starts the recording in the Cheetah software. The consequence is that the data samples do not form a continuous data representation any more. This can be detected offline, since the time-stamps that are stored along with the data will show gaps.

If there are gaps in the recording, the default way of linking timestamps to samples and vice versa will be incorrect. The default is is to assume a linear relationship, i.e.

timestamp = hdr.TimeStampPerSample * sample hdr.FirstTimeStamp

Due to the gaps in the recording, the timestamps that relate to events cannot be mapped like this onto sample numbers. Furthermore, ft_read_header computes the hdr.TimeStampPerSample without taking the gaps in the recording into account, so it will be incorrect.

The low-level read_neuralynx_ncs function detects the presence of gaps in the .Ncs file and issues a warning. If you get this warning, the solution is to read in the raw timestamps for all individual samples rather than relying on the monotonous linear relationship. This is possible for a single channel, the procedure does not yet work when reading a whole dataset at once (i.e. a directory containing multiple Ncs files).

One can then construct a time-stamp axis and interpolate in-between samples, and set samples occurring within gaps to NaNs. This is demonstrated in the following code:

% start with normal preprocessing of a single channel
cfg         = [];
cfg.dataset = 'CSC1.Ncs';
data        = ft_preprocessing(cfg);

Warning: discontinuous recording, predicted number of timestamps and observed number of timestamps differ by 1693523717.00
 Please consult the wiki on http://fieldtrip.fcdonders.nl/getting_started/neuralynx?&#discontinuous_recordings 
> In fileio/private/read_neuralynx_ncs at 94
  In ft_read_header at 1196
  In ft_preprocessing at 394 
  
>> disp(data)
           hdr: [1x1 struct]
         label: {'CSC1'}
          time: {[1x12902912 double]}
         trial: {[1x12902912 double]}
       fsample: 1893
    sampleinfo: [1 12902912]
           cfg: [1x1 struct]
           
figure
plot(data.time{1})
xlabel('sample number')
ylabel('time (s)')

This shows the default time axis of the data, which FieldTrip assumes to be continuous.

Now we continue with reading the actual timestamps and performing interpolation and gap-filling with NaNs across multiple channels. Your multiple channels can happen to have different start or end timestamps (which is another “feature” of the Cheetah software). This function then constructs a single timestamp-axis onto which all channels are represented.

function [data_all] = ft_read_neuralynx_interp(fname)

% FT_READ_NEURALYNX_INTERP reads a cell-array of NCS files and performs interpolation
% and gap-filling of the timestamp axis to correct for potentially different offsets
% across channels, and potential gaps within the recordings. All samples are being 
% read out.
%
% Inputs:
%  input FNAME is the list of CSC filenames
%  All of these channels should have the sample sampling frequency.
% Outputs:
%  data_all is a raw data structure containing interpolated data and NaNs at the gaps, 
%  based on all the available samples in a recording.
%

% Copyright (c) Martin Vinck, SILS, Center for Neuroscience, University of Amsterdam, 2013.

% first check if these are indeed ncs files
ftype = zeros(length(fname), 1);
for i=1:length(fname)
  if     ft_filetype(fname{i}, 'neuralynx_ncs')
    ftype(i) = 1;
  end
end
if ~all(ftype==1), error('some files do not correspond to ncs files'); end

% number of channels
nchans = length(fname);

% get the original headers
for i=1:nchans
  orig(i) = ft_read_header(fname{i});
end

% check if they all have the same sampling frequency, otherwise return error
for i=1:length(orig), SamplingFrequency(i) = orig(i).Fs; end
if any(SamplingFrequency~=SamplingFrequency(1))
  error('not all channels have the same sampling rate');
end

% get the minimum and maximum timestamp across all channels
for i = 1:nchans
  ts    = ft_read_data(fname{i}, 'timestamp', true);  
  mn(i) = ts(1);
  mx(i) = ts(end); 
  ts1   = ts(1);
  md(i) = mode(diff(double(ts-ts1)));

  % get the minimum and maximum across all channels
  if i>1 && mn(i)<min_all
    min_all = mn(i);
  else
    min_all = mn(i);
  end
  
  if i>1 && mx(i)>max_all
    max_all = mx(i);
  else
    max_all = mx(i); 
  end
end

% issue some warning if channels don't start or end at the same time
startflag = 0; endflag = 0;
if any(mn~=mn(1)), 
  warning('not all continuous channels start at the same time'); 
  startflag = 1;
end
if any(mx~=mx(1)), 
  endflag = 1; 
  warning('not all continuous channels end at the same time'); 
end
if any(md~=md(1)), warning('not all channels have same mode'); end

% take the mode of the modes and construct the interpolation axis
% the interpolation axis should be casted in doubles
mode_dts  = mode(md);
rng       = double(max_all-min_all); % this is small num, can be double
offset    = double(mn-min_all); % store the offset per channel
offsetmx  = double(max_all-mx);
tsinterp  = [0:mode_dts:rng]; % the timestamp interpolation axis

% loop over the channels if the channels have different timestamps
for i = 1:nchans
  cfg         = [];
  cfg.dataset = fname{i};
  data        = ft_preprocessing(cfg);
  ts          = ft_read_data(cfg.dataset, 'timestamp', true);  

  % original timestamaps in doubles, with the minimum ts subtracted
  ts          = double(ts-ts(1)) + offset(i); 
  
  % check if there are gaps to correct
  gaps     = find(diff(ts)>2*mode_dts); % skips at least a sample
  if isempty(gaps) && startflag==0 && endflag==0
    fprintf('there are no gaps and all channels start and end at same time, no interpolation performed\n');
  else  
    % interpolate the data
    datinterp   = interp1(ts, data.trial{1}, tsinterp);

    % you can use NaN to replace the data in the gaps
    gaps     = find(diff(ts)>2*mode_dts); % skips at least a sample
    for igap = 1:length(gaps)
      sel = tsinterp < ts(gaps(igap)+1) & tsinterp > ts(gaps(igap));
      datinterp(sel) = NaN;
    end
  
    % set data at the end and beginning to nans
    if startflag==1
      n = floor(offset(i)/mode_dts);
      if n>0
        datinterp(1:n) = NaN;
      end
    end
    
    % set data at the end and beginning to nans
    if endflag==1
      n = floor(offsetmx(i)/mode_dts);
      if n>0
        datinterp(end-n+1:end) = NaN;
      end
    end
    
    % update the FieldTrip data structure
    data.trial{1} = datinterp; clear datinterp
    data.time{1}  = [0:length(tsinterp)-1].*(1./data.hdr.Fs);
  end
  
  % append all the data 
  if i==1
    data_all = data;
    clear data
  else
    data_all      = ft_appenddata([], data_all, data);
    clear data
  end
end

% correct the header information and the sampling information
data_all.hdr.FirstTimeStamp     = min_all;
data_all.hdr.LastTimeStamp      = uint64(tsinterp(end)) + min_all;
data_all.hdr.TimeStampPerSample = mode_dts;
len = length(tsinterp);
data_all.hdr.nSamples           = len;
data_all.sampleinfo             = [1 len];
2013/03/17 12:22 · robert