Commit: Extended documentation
aecker committed May 2, 2014
1 parent bff6766 commit 15af2ec
Showing 1 changed file with 73 additions and 20 deletions.

Understanding the code
======================
General organization of the code
--------------------------------

We used a data management framework called [DataJoint](https://github.com/datajoint) to organize data and code. Under DataJoint the results of any analysis are stored in a relational database. Each result is stored along with the parameters that were used to obtain it. In addition, dependencies between subsequent analysis steps are kept track of and are enforced automatically. This process tremendously simplifies staying on top of complex analysis toolchains, such as the one for the current project. Details can be found in the [DataJoint documentation](https://github.com/datajoint/datajoint-matlab/wiki).

In DataJoint, every analysis consists of one or multiple database tables, each of which is associated with its own Matlab class. Such a group of tables is populated automatically via a specific class method called makeTuples(). This method is where the actual work is done.
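As a rough sketch of this pattern (the class, table, and attribute names below are hypothetical, not taken from this repository), a DataJoint table class in Matlab looks roughly like this:

```matlab
% Hypothetical sketch of an auto-populated DataJoint (Matlab) table class.
% All names here (DemoStats, mean_rate, etc.) are illustrative only.
classdef DemoStats < dj.Relvar & dj.AutoPopulate
    properties(Constant)
        table = dj.Table('nc.DemoStats')   % binds the class to a database table
        popRel = ephys.Spikes              % one tuple computed per spike set
    end
    methods(Access = protected)
        function makeTuples(self, key)
            % This is where the actual analysis happens.
            spikes = fetch1(ephys.Spikes & key, 'spike_times');
            key.mean_rate = numel(spikes) / recordingDuration;  % recordingDuration: placeholder
            self.insert(key)               % store result + parameters in the database
        end
    end
end
```

The table would then be filled by calling `populate(nc.DemoStats)`, which invokes `makeTuples()` once for each missing key.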

Below we provide an overview of the analysis tables used for the current project. In addition, the functions that create the figures are a good entry point to find out which classes/tables are relevant for a certain figure. These functions are located in the folder 'figures'. You will notice that those functions don't do much other than getting data/results from the database [fetch(...)], plot it and sometimes do some statistics on it.
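The fetch-and-plot pattern of those figure functions looks roughly like the following sketch (table and attribute names are illustrative, not the ones used in this repository):

```matlab
% Illustrative only: pull per-unit results from the database and plot them.
key = 'subject_id = 9';                                  % restrict to one subject
[rates, stab] = fetchn(nc.UnitStats & key, 'mean_rate', 'stability');  % attribute names hypothetical
scatter(rates, stab)
xlabel('mean firing rate (spikes/s)')
ylabel('rate stability')
```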

In addition to the classes defined in this repository, we use a number of general purpose libraries for the basic organization of our experimental data and tasks such as spike detection and sorting. These libraries can be found on Github as well:

* https://github.com/atlab/sessions -- meta data stored by acquisition system and processing toolchain (database schemas acq, detect, sort)
* https://github.com/atlab/spikedetection -- spike detection
* https://github.com/aecker/moksm -- spike sorting
* https://github.com/aecker/gpfa -- Gaussian Process Factor Analysis (GPFA)




Documentation of individual analysis steps
------------------------------------------

### Spike detection

Spike detection is done by a separate library; see https://github.com/atlab/spikedetection. The function `detectSpikesTetrodesV2.m` is used.


### Spike sorting

Parameter settings etc. are defined in `sessions/sort.KalmanAutomatic`. The actual spike sorting is done by a separate library. See https://github.com/aecker/moksm for more information.


### Basic single unit properties

Basic single unit statistics such as firing rates, variances, rate stability etc. are computed in the class `nc.UnitStats`. Quantities related to tuning properties such as orientation and direction tuning, their significance and visual responsiveness are computed in the class `nc.OriTuning`. For statistics of pairs of neurons, see next section (Noise Correlations).


### Noise correlations (Fig. 2)

Pairwise analyses such as computing signal and noise correlations, but also more basic properties such as geometric mean firing rate, distance between electrodes, maximum contamination or rate instability, are computed in the class `nc.NoiseCorrelationSet`.
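For reference, the noise correlation of a pair is the Pearson correlation of their spike counts across repetitions, after removing the stimulus-driven component. A minimal generic sketch (not the repository's implementation; variable names are made up):

```matlab
% Minimal sketch of pairwise noise correlations (illustrative).
% counts:    #trials x #neurons matrix of spike counts
% condition: #trials x 1 vector of stimulus condition labels
z = counts;
for c = unique(condition)'
    ndx = condition == c;               % trials of one stimulus condition
    z(ndx, :) = zscore(counts(ndx, :)); % z-score per condition removes the signal component
end
R = corrcoef(z);                        % off-diagonal entries: noise correlations
```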


### Fitting and evaluation of GPFA model (Fig. 3–5, 7)

Fitting the GPFA model is performed in the classes `nc.GpfaModelSet` (for evoked responses) and `nc.GpfaSpontSet` (for spontaneous responses). More precisely, these classes deal with preparing the data, i.e. partitioning for cross-validation, pre-transforming, etc. The actual work (model fitting and evaluation) is done by a separate library; see https://github.com/aecker/gpfa for details.

Computing variance explained and residual correlations is performed by the class `nc.GpfaResidCorrSet`. Again here, the actual work is done by the GPFA library.



### GLM with LFP as input (Fig. 8)

The generalized linear model (GLM) using the local field potential (LFP) as input to predict correlated variability is fit by the class `nc.LfpGlmSet`.



### Spectral analysis of LFP (Fig. 9)

Power spectrograms of the LFP are computed in the class `nc.LfpSpectrogram`. The correlation between the LFP power ratio and the overall level of correlations is computed by the classes `nc.LfpPowerRatioGpfaSet` and `nc.NetworkStateVar`.
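A generic LFP spectrogram computation can be sketched as follows (window lengths and other parameters here are illustrative, not those used by `nc.LfpSpectrogram`):

```matlab
% Generic power spectrogram of an LFP trace (illustrative parameters).
% lfp: vector of LFP samples, Fs: sampling rate in Hz.
win = round(0.5 * Fs);                              % 500 ms windows (assumed)
[S, f, t] = spectrogram(lfp, win, round(0.9 * win), [], Fs);  % 90% overlap (assumed)
imagesc(t, f, 10 * log10(abs(S) .^ 2))              % power in dB
axis xy, ylim([0 100])
xlabel('time (s)'), ylabel('frequency (Hz)')
```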



### Inclusion criteria

All sessions and cells that were included in the analysis are listed in the tables `nc.AnalysisStims` and `nc.AnalysisUnits`. The populate relation (property `popRel`) of those classes defines the restrictions that are applied.
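In DataJoint, such inclusion criteria are expressed by restricting the populate relation with relational operators; a hypothetical example of the pattern (the predicates and table names below are illustrative, not the actual criteria, which are defined in the classes above):

```matlab
% Hypothetical popRel: only keys satisfying the restrictions get populated.
properties(Constant)
    popRel = nc.Gratings & 'num_trials >= 20' & 'contamination < 0.1'
end
```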




General outline of database structure
-------------------------------------

Each Matlab package (+xyz) maps to a database schema and each Matlab class to a database table.
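Because of this mapping, relations from different schemas compose directly in queries; for example, a cross-schema join might look like this (assuming the tables share compatible primary key attributes):

```matlab
% Join spike data (schema ephys) with session metadata (schema acq).
rel = ephys.Spikes * acq.Sessions;   % relational join across schemas
n = count(rel);                      % number of matching tuples
```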


### Schema `acq`

This schema contains the metadata entered into the database by the recording software during data collection. The following tables are most relevant:

* `Subjects`: list of monkeys
* `Sessions`: experimental sessions (can contain multiple recordings and stimuli)
* `Ephys`: electrophysiological recording
* `Stimulation`: visual stimulus presentation
* `EphysStimulationLink`: links simultaneous ephys recordings and stimulus presentations



### Schema `detect`

Implements the spike detection part of the processing toolchain. This is highly customized code to be run on specific, optimized computers. The actual spike detection is done by an [external library](https://github.com/atlab/spikedetection).



### Schema `sort`

Implements the spike sorting toolchain. The `sort.Kalman*` classes implement the mixture of Kalman filters model we use (Calabrese & Paninski, 2011). The actual model fitting is done by an [external library](https://github.com/aecker/moksm).



### Schema `ephys`

The schema contains general purpose electrophysiology tables. Only `ephys.Spikes` is relevant here; it contains the spike times for each single unit.



### Schema `ae`

This schema contains my (AE) general purpose electrophysiology tables. The following tables are most relevant:

* `SpikesByTrial`: spike times relative to stimulus onset for each trial
* `SpikeCounts`: spike count in a certain window (see `SpikeCountParams`) for each cell and each trial
* `Lfp`: bandpass-filtered LFP trace (see `LfpParams` for parameters)
* `LfpByTrial`: LFP snippet for each trial aligned to stimulus onset



### Schema `nc`

This schema contains the concrete analyses done in the paper. The specific tables are listed in the previous sections for each analysis.



