[ENH] implement bids-schema - part 1 #124

Merged
merged 54 commits into from
Feb 15, 2021
54 commits
898ee1c
create test for load_schema
Remi-Gau Jan 16, 2021
0e08410
add function to load schema
Remi-Gau Jan 16, 2021
9a20988
fix conflict
Remi-Gau Feb 6, 2021
fce12de
update load_schema test
Remi-Gau Jan 16, 2021
4eec143
add functions to deal with schema extensions, suffixes and entities
Remi-Gau Jan 16, 2021
aacc804
add function to return regular expression for a certain data type
Remi-Gau Jan 16, 2021
e14da01
update return_datatype_entities to new schema structure
Remi-Gau Feb 4, 2021
c76bf48
move schema functions
Remi-Gau Feb 4, 2021
84393d6
update tests to moxunit format
Remi-Gau Feb 4, 2021
8dac584
add test for parse_filename
Remi-Gau Feb 4, 2021
951077c
use schema to parse anat
Remi-Gau Feb 4, 2021
fe38416
linting, cleaning, use valid bids example for parse_filename
Remi-Gau Feb 4, 2021
0e42c53
use schema to parse behavioral and refactor
Remi-Gau Feb 4, 2021
6748545
parse ASL with schema
Remi-Gau Feb 4, 2021
3795b6b
parse dwi using schema
Remi-Gau Feb 4, 2021
c830bd9
linting
Remi-Gau Feb 4, 2021
16072de
parse func with schema
Remi-Gau Feb 4, 2021
c0e1934
fix issue of regexp on Octave
Remi-Gau Feb 4, 2021
e092dad
prepare for rebase
Remi-Gau Feb 6, 2021
a3b0134
handle asl file parsing with schema
Remi-Gau Feb 6, 2021
1d4115c
refactor asl manage m0scan, json, context, labelling
Remi-Gau Feb 6, 2021
2656a5f
refactor managing intended_for for ASL
Remi-Gau Feb 6, 2021
b0a4f35
pass schema as argument instead of reloading it
Remi-Gau Feb 6, 2021
60c4cd4
refactor handling of loading of metadata in layout
Remi-Gau Feb 6, 2021
accf9d3
refactor fmap parsing to use schema
Remi-Gau Feb 6, 2021
cc82818
parse ieeg with schema
Remi-Gau Feb 6, 2021
327adad
use schema to parse eeg
Remi-Gau Feb 6, 2021
8de1143
parse meg with schema and refactor
Remi-Gau Feb 7, 2021
831843e
reorganize tests and make list of target files to parse for meeg more…
Remi-Gau Feb 7, 2021
888ad55
add test dwi
Remi-Gau Feb 7, 2021
236e3c2
make file listing most general for all modalities
Remi-Gau Feb 7, 2021
641c889
make sure .pos files and meg.ds folders are parsed
Remi-Gau Feb 7, 2021
6008e33
refactor and linting
Remi-Gau Feb 7, 2021
910d193
refactor layout to manage most of the parsing through the same subfun…
Remi-Gau Feb 7, 2021
3208ebb
rename function and variables
Remi-Gau Feb 7, 2021
9bbbc87
rename function
Remi-Gau Feb 7, 2021
d43ca19
refactor query
Remi-Gau Feb 7, 2021
1332dec
allow schemaless parsing and test on derivatives
Remi-Gau Feb 7, 2021
2fc8d77
refactor schema handling
Remi-Gau Feb 8, 2021
52e919a
refactor description light validation
Remi-Gau Feb 8, 2021
d50ef3e
start implementing querying for dependencies
Remi-Gau Feb 8, 2021
f1b924b
fix CI
Remi-Gau Feb 8, 2021
f1e186b
increase depth fetch (in the hope that codecov will be happy)
Remi-Gau Feb 8, 2021
f6986a6
upload CI coverage
Remi-Gau Feb 8, 2021
a0af0d3
move all entities to sub-structure
Remi-Gau Feb 10, 2021
1659acd
rename keep file function
Remi-Gau Feb 10, 2021
b000789
remove parse_pet
Remi-Gau Feb 10, 2021
7f34319
small refactoring
Remi-Gau Feb 10, 2021
c9b3af8
refactor matching structure content
Remi-Gau Feb 15, 2021
3fa4c13
Attempt to fix CI failures
Remi-Gau Feb 15, 2021
814a9a7
fix assert in test
Remi-Gau Feb 15, 2021
849dfa9
update tests and lint
Remi-Gau Feb 15, 2021
e8ea572
update test to new schema
Remi-Gau Feb 15, 2021
e318a43
update test for parsing and querying derivatives
Remi-Gau Feb 15, 2021
5 changes: 5 additions & 0 deletions +bids/+internal/add_missing_field.m
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
function structure = add_missing_field(structure, field)
if ~isfield(structure, field)
structure(1).(field) = '';
end
end
91 changes: 91 additions & 0 deletions +bids/+internal/append_to_layout.m
@@ -0,0 +1,91 @@
function subject = append_to_layout(file, subject, modality, schema)
%
% appends a file to the BIDS layout by parsing it according to the provided schema
%
% USAGE::
%
% subject = append_to_layout(file, subject, modality, schema == [])
%
% :param file:
% :type file: string
% :param subject: subject sub-structure from the BIDS layout
% :type subject: structure
% :param modality:
% :type modality: string
% :param schema:
% :type schema: structure
%
%
% Copyright (C) 2021--, BIDS-MATLAB developers

if ~exist('schema', 'var')
schema = [];
end

% Parse file first to identify the suffix group in the template.
% Then reparse the file using the entity-label pairs defined in the schema.
p = bids.internal.parse_filename(file);

idx = find_suffix_group(modality, p.suffix, schema);

if ~isempty(schema)

if isempty(idx)
warning('append_to_structure:noMatchingSuffix', ...
'Skipping file with no valid suffix in schema: %s', file);
return
end

entities = bids.schema.return_modality_entities(schema.datatypes.(modality)(idx), schema);
p = bids.internal.parse_filename(file, entities);

end

% Check any new entity field that needs to be added into the layout or the output
% of the parsing to make sure the 2 structures can be concatenated
if ~isempty(subject.(modality))

[subject.(modality), p] = bids.internal.match_structure_fields(subject.(modality), p);

end

if isempty(subject.(modality))
subject.(modality) = p;
else
subject.(modality)(end + 1, 1) = p;
end

end

function idx = find_suffix_group(modality, suffix, schema)

idx = [];

if isempty(schema)
return
end

% the following loop could probably be improved with some cellfun magic
% cellfun(@(x, y) any(strcmp(x,y)), {p.type}, suffix_groups)
for i = 1:size(schema.datatypes.(modality), 1)

this_suffix_group = schema.datatypes.(modality)(i);

% for CI
if iscell(this_suffix_group)
this_suffix_group = this_suffix_group{1};
end

if any(strcmp(suffix, this_suffix_group.suffixes))
idx = i;
break
end

end

if isempty(idx)
warning('findSuffix:noMatchingSuffix', ...
'No corresponding suffix in schema for %s for datatype %s', suffix, modality);
end

end
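The comment in `find_suffix_group` suggests the loop could be replaced by a vectorized call. A minimal sketch of that idea, assuming the suffix groups are stored as a struct array (not the cell array that the "for CI" branch handles); the `groups` variable and its values are hypothetical stand-ins for `schema.datatypes.(modality)`:

```matlab
% Hypothetical suffix groups standing in for schema.datatypes.anat
groups = [struct('suffixes', {{'T1w', 'T2w'}}); ...
          struct('suffixes', {{'defacemask'}})];

suffix = 'T2w';

% Index of the first group whose suffix list contains the requested suffix;
% idx is empty when no group matches.
idx = find(arrayfun(@(g) any(strcmp(suffix, g.suffixes)), groups), 1);
```

This keeps the "first match wins" behavior of the `break` in the loop, but it would need an extra branch to cope with cell input.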
1 change: 0 additions & 1 deletion +bids/+internal/file_utils.m
@@ -282,7 +282,6 @@
files = dirs;

else

t = regexp(files, expr);

if numel(files) == 1 && ~iscell(t)
27 changes: 19 additions & 8 deletions +bids/+internal/get_metadata.m
@@ -21,7 +21,7 @@
N = 3;

% -There is a session level in the hierarchy
if isfield(p, 'ses') && ~isempty(p.ses)
if isfield(p.entities, 'ses') && ~isempty(p.entities.ses)
N = N + 1;
end

@@ -31,7 +31,7 @@

% -List the potential metadata files associated with this file suffix type
% Default is to assume it is a JSON file
metafile = bids.internal.file_utils('FPList', pth, sprintf(pattern, p.type));
metafile = bids.internal.file_utils('FPList', pth, sprintf(pattern, p.suffix));

if isempty(metafile)
metafile = {};
@@ -44,13 +44,17 @@
for i = 1:numel(metafile)

p2 = bids.internal.parse_filename(metafile{i});
fn = setdiff(fieldnames(p2), {'filename', 'ext', 'type'});
entities = {};
if isfield(p2, 'entities')
entities = fieldnames(p2.entities);
end

% -Check if this metadata file contains the same entity-label pairs as its
% data file counterpart
ismeta = true;
for j = 1:numel(fn)
if ~isfield(p, fn{j}) || ~strcmp(p.(fn{j}), p2.(fn{j}))
for j = 1:numel(entities)
if ~isfield(p.entities, entities{j}) || ...
~strcmp(p.entities.(entities{j}), p2.entities.(entities{j}))
ismeta = false;
break
end
@@ -73,9 +77,15 @@

end

% ==========================================================================
% -Inheritance principle
% ==========================================================================
if isempty(meta)
warning('No metadata for %s', filename);
end

end

% ==========================================================================
% -Inheritance principle
% ==========================================================================
function s1 = update_metadata(s1, s2, file)
if isempty(s2)
return
@@ -88,3 +98,4 @@
s1.(fn{i}) = s2.(fn{i});
end
end
end
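The entity check in `get_metadata` can be read in isolation: a metadata file applies to a data file only if every entity in the metadata filename is also present, with the same label, in the data filename. A minimal standalone sketch of that rule, using hypothetical entity structures in place of real `parse_filename` output:

```matlab
% Hypothetical entities of a data file and of a candidate metadata file
data_entities = struct('sub', '01', 'ses', 'mri', 'task', 'rest');
meta_entities = struct('task', 'rest');

names = fieldnames(meta_entities);

% The metadata file is kept only if all of its entity-label pairs
% are found unchanged in the data file.
ismeta = true;
for j = 1:numel(names)
    if ~isfield(data_entities, names{j}) || ...
            ~strcmp(data_entities.(names{j}), meta_entities.(names{j}))
        ismeta = false;
        break
    end
end
```

A metadata file with fewer entities (for example one at the dataset root) still matches, which is what lets it act as a default for more specific files.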
32 changes: 32 additions & 0 deletions +bids/+internal/keep_file_for_query.m
@@ -0,0 +1,32 @@
function status = keep_file_for_query(file_struct, options)

status = true;

% suffix is treated separately as it is not one of the entities
for l = 1:size(options, 1)
if strcmp(options{l, 1}, 'suffix') && ~ismember(file_struct.suffix, options{l, 2})
status = false;
return
end
end

for l = 1:size(options, 1)

if ~strcmp(options{l, 1}, 'suffix')

if ~ismember(options{l, 1}, fieldnames(file_struct.entities))
status = false;
break
end

if isfield(file_struct.entities, options{l, 1}) && ...
~ismember(file_struct.entities.(options{l, 1}), options{l, 2})
status = false;
break
end

end

end

end
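To make the expected inputs concrete, here is a hedged sketch of the filtering rule applied to one file. It assumes `options` is an N-by-2 cell array with the key in column 1 and a cell of allowed values in column 2 (inferred from the indexing above, not confirmed by the PR); all values are hypothetical:

```matlab
% Hypothetical parsed file, shaped like the output of parse_filename
file_struct = struct('suffix', 'bold', ...
                     'entities', struct('sub', '01', 'task', 'rest', 'run', '1'));

% One row per query key; allowed values in the second column
options = {'suffix', {'bold'}; ...
           'task',   {'rest', 'nback'}};

status = true;
for l = 1:size(options, 1)
    key = options{l, 1};
    allowed = options{l, 2};
    if strcmp(key, 'suffix')
        % suffix lives at the top level, not among the entities
        value = file_struct.suffix;
    elseif isfield(file_struct.entities, key)
        value = file_struct.entities.(key);
    else
        % the file lacks the queried entity altogether
        status = false;
        break
    end
    if ~ismember(value, allowed)
        status = false;
        break
    end
end
```

The sketch merges the two loops of the function into one, which gives the same result because every failing condition sets `status` to false.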
14 changes: 14 additions & 0 deletions +bids/+internal/match_structure_fields.m
@@ -0,0 +1,14 @@
function [s1, s2] = match_structure_fields(s1, s2)

missing_fields = setxor(fieldnames(s1), fieldnames(s2));

if ~isempty(missing_fields)
for iField = 1:numel(missing_fields)

s1 = bids.internal.add_missing_field(s1, missing_fields{iField});
s2 = bids.internal.add_missing_field(s2, missing_fields{iField});

end
end

end
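The reason for `match_structure_fields` is that struct arrays can only grow by elements that carry the same set of fields. A small standalone sketch of the same steps, with hypothetical structures and the body of `add_missing_field` inlined:

```matlab
% Two hypothetical parsed files with different sets of fields
s1 = struct('filename', 'sub-01_T1w.nii', 'suffix', 'T1w');
s2 = struct('filename', 'sub-01_task-rest_bold.nii', 'suffix', 'bold', 'ext', '.nii');

% Fields that exist in only one of the two structures
missing_fields = setxor(fieldnames(s1), fieldnames(s2));

% Add each missing field, with an empty value, to the structure that lacks it
for iField = 1:numel(missing_fields)
    field = missing_fields{iField};
    if ~isfield(s1, field)
        s1(1).(field) = '';
    end
    if ~isfield(s2, field)
        s2(1).(field) = '';
    end
end

% Both structures now share the same fields, so s2 can be appended
s1(end + 1, 1) = s2;
```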
52 changes: 32 additions & 20 deletions +bids/+internal/parse_filename.m
@@ -1,23 +1,35 @@
function p = parse_filename(filename, fields)
%
% Split a filename into its building constituents
% FORMAT p = bids.internal.parse_filename(filename, fields)
%
% USAGE::
%
% p = bids.internal.parse_filename(filename, fields)
%
% :param filename: filename to parse that follows the pattern
% ``sub-label[_entity-label]*_suffix.extension``
% :type filename: string
% :param fields: cell of strings of the entities to use for parsing
% :type fields: cell
%
% Example:
%
% >> filename = '../sub-16/anat/sub-16_ses-mri_run-1_echo-2_FLASH.nii.gz';
% >> bids.internal.parse_filename(filename)
% filename = '../sub-16/anat/sub-16_ses-mri_run-1_acq-hd_T1w.nii.gz';
%
% ans =
% bids.internal.parse_filename(filename)
%
% ans =
%
% struct with fields:
%
% filename: 'sub-16_ses-mri_run-1_echo-2_FLASH.nii.gz'
% type: 'FLASH'
% ext: '.nii.gz'
% sub: '16'
% ses: 'mri'
% run: '1'
% echo: '2'
% 'filename', 'sub-16_ses-mri_run-1_acq-hd_T1w.nii.gz', ...
% 'suffix', 'T1w', ...
% 'ext', '.nii.gz', ...
% 'entities', struct('sub', '16', ...
% 'ses', 'mri', ...
% 'run', '1', ...
% 'acq', 'hd');
%
% __________________________________________________________________________

% Copyright (C) 2016-2018, Guillaume Flandin, Wellcome Centre for Human Neuroimaging
@@ -26,31 +38,31 @@
filename = bids.internal.file_utils(filename, 'filename');

% -Identify all the BIDS entity-label pairs present in the filename (delimited by "_")
% https://bids-specification.readthedocs.io/en/stable/99-appendices/04-entity-table.html
[parts, dummy] = regexp(filename, '(?:_)+', 'split', 'match'); %#ok<ASGLU>
p.filename = filename;

% -Identify the suffix and extension of this file
% https://bids-specification.readthedocs.io/en/stable/02-common-principles.html#file-name-structure
[p.type, p.ext] = strtok(parts{end}, '.');
[p.suffix, p.ext] = strtok(parts{end}, '.');

% -Separate the entity from the label for each pair identified above
for i = 1:numel(parts) - 1
[d, dummy] = regexp(parts{i}, '(?:\-)+', 'split', 'match'); %#ok<ASGLU>
p.(d{1}) = d{2};
p.entities.(d{1}) = d{2};
end

% -Extra fields can be added to the structure and ordered specifically.
if nargin == 2
for i = 1:numel(fields)
if ~isfield(p, fields{i})
p.(fields{i}) = '';
end
p.entities = bids.internal.add_missing_field(p.entities, fields{i});
end
try
p = orderfields(p, ['filename', 'ext', 'type', fields]);
p = orderfields(p, {'filename', 'ext', 'suffix', 'entities'});
p.entities = orderfields(p.entities, fields);
catch
warning('Ignoring file ''%s'' not matching template.', filename);
warning('bidsMatlab:noMatchingTemplate', ...
'Ignoring file %s not matching template.', filename);
p = struct([]);
end
end

end
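The effect of the optional `fields` argument can be sketched on its own: entities listed in `fields` but absent from the filename are added with an empty label, then the entities are reordered to follow `fields`. The entity list below is a hypothetical example, not the list the schema returns:

```matlab
% Entities as they would be parsed from
% 'sub-16_ses-mri_run-1_acq-hd_T1w.nii.gz'
entities = struct('sub', '16', 'ses', 'mri', 'run', '1', 'acq', 'hd');

% Hypothetical entity order requested by the caller
fields = {'sub', 'ses', 'acq', 'ce', 'run'};

% Add the entities that the filename does not contain
for i = 1:numel(fields)
    if ~isfield(entities, fields{i})
        entities(1).(fields{i}) = '';
    end
end

% Reorder; this requires fields to list exactly the fields of entities
entities = orderfields(entities, fields);
```

If the filename carried an entity that is not in `fields` (say `echo`), `orderfields` would raise an error, which is presumably why the call is wrapped in `try`/`catch` and the file is then ignored with a warning.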
19 changes: 19 additions & 0 deletions +bids/+internal/return_modality_extensions.m
@@ -0,0 +1,19 @@
function extensions = return_modality_extensions(modality)

extensions = '(';

% for CI
if iscell(modality)
modality = modality{1};
end

for iExt = 1:numel(modality.extensions)
if ~strcmp(modality.extensions{iExt}, '.json')
extensions = [extensions, modality.extensions{iExt}, '|']; %#ok<AGROW>
end
end

% Replace final "|" by a "){1}"
extensions(end:end + 3) = '){1}';

end
8 changes: 8 additions & 0 deletions +bids/+internal/return_modality_regular_expression.m
@@ -0,0 +1,8 @@
function regular_expression = return_modality_regular_expression(modality)

suffixes = bids.internal.return_modality_suffixes(modality);
extensions = bids.internal.return_modality_extensions(modality);

regular_expression = ['^%s.*' suffixes extensions '$'];

end
17 changes: 17 additions & 0 deletions +bids/+internal/return_modality_suffixes.m
@@ -0,0 +1,17 @@
function suffixes = return_modality_suffixes(modality)

suffixes = '_(';

% For CI
if iscell(modality)
modality = modality{1};
end

for iSuffix = 1:numel(modality.suffixes)
suffixes = [suffixes, modality.suffixes{iSuffix}, '|']; %#ok<AGROW>
end

% Replace final "|" by a "){1}"
suffixes(end:end + 3) = '){1}';

end
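Taken together, the three `return_modality_*` helpers build a pattern of the form `^%s.*_(suffixes){1}(extensions){1}$`. The sketch below rebuilds such a pattern for a hypothetical suffix group, using `strjoin` in place of the loops; filling the `%s` placeholder with a subject prefix via `sprintf` is an assumption about how the caller uses it:

```matlab
% Hypothetical suffix group, shaped like one entry of schema.datatypes.anat
suffix_group = struct('suffixes', {{'T1w', 'T2w'}}, ...
                      'extensions', {{'.nii', '.nii.gz', '.json'}});

% '_(T1w|T2w){1}'
suffixes = ['_(' strjoin(suffix_group.suffixes, '|') '){1}'];

% '.json' is left out, as in return_modality_extensions
keep = ~strcmp(suffix_group.extensions, '.json');
extensions = ['(' strjoin(suffix_group.extensions(keep), '|') '){1}'];

% Assumed usage: the placeholder receives the subject prefix
pattern = sprintf(['^%s.*' suffixes extensions '$'], 'sub-01');

is_data_file = ~isempty(regexp('sub-01_ses-mri_T1w.nii.gz', pattern, 'once'));
is_sidecar = ~isempty(regexp('sub-01_ses-mri_T1w.json', pattern, 'once'));
```

Note that the dots in the extensions are not escaped, so in the regular expression they match any character; escaping them would make the pattern stricter.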