@bendichter bendichter commented Oct 25, 2025

Add comprehensive support for audio and video recordings in behavioral
experiments:

- Add audio file extensions (mp3, wav) and video file extensions
  (mp4, mkv, avi) with corresponding _audio and _video suffixes
- Document usage of audio/video recordings in beh directory for
  capturing vocalizations, speech, facial expressions, and body movements
- Add metadata schema for audio/video device information and stream
  properties
- Include privacy warnings about personally identifiable information
  in human subject recordings
- Update behavioral experiments title to remove "with no neural
  recordings" restriction, clarifying data can be stored with or
  without neural recordings
- Add examples for file organization including multi-angle recordings
  and split files
- Define optional entities: task, acquisition, run, recording, split
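As a rough illustration of the naming scheme in the list above, the following Python sketch assembles a filename from the listed entities. The helper name `build_recording_filename`, the short keys (`acq` for acquisition, plus the standard `ses`), and the entity order are assumptions made for illustration; the schema's own entity ordering is authoritative.

```python
# Extensions and suffixes as listed in the PR description above.
AUDIO_EXTENSIONS = {".mp3", ".wav"}
VIDEO_EXTENSIONS = {".mp4", ".mkv", ".avi"}

# Assumed entity order (hypothetical; defer to the schema's entity table).
ENTITY_ORDER = ["sub", "ses", "task", "acq", "run", "recording", "split"]


def build_recording_filename(entities: dict, extension: str) -> str:
    """Build a name such as sub-01_task-speech_audio.wav."""
    if extension in AUDIO_EXTENSIONS:
        suffix = "audio"
    elif extension in VIDEO_EXTENSIONS:
        suffix = "video"
    else:
        raise ValueError(f"unsupported extension: {extension}")
    if "sub" not in entities:
        raise ValueError("the sub entity is required")
    unknown = set(entities) - set(ENTITY_ORDER)
    if unknown:
        raise ValueError(f"unknown entities: {sorted(unknown)}")
    # Emit only the entities that were provided, in the assumed order.
    parts = [f"{key}-{entities[key]}" for key in ENTITY_ORDER if key in entities]
    return "_".join(parts + [suffix]) + extension
```

For a multi-angle recording, one might call `build_recording_filename({"sub": "01", "task": "reach", "recording": "front"}, ".mp4")` once per camera, changing only the `recording` label.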
@yarikoptic changed the title from "SCHEMA: Add audio video" to "SCHEMA: Add audio video behavioral data support" Oct 25, 2025
@yarikoptic added the "schema" label (issues related to the YAML schema representation of the specification; patch version release) Oct 25, 2025
…ee macros

- Change section title from 'Behavioral experiments' to 'Behavioral recordings'
- Convert file tree examples to use MACROS___make_filetree_example for consistent rendering
- Address review comments from @yarikoptic in PR #2231
Collaborator

@effigies effigies left a comment

Overall this makes sense to me. It would be good to get some feedback from contributors to related BEPs, such as eye-tracking (20), motion (29), stimuli (44) and physio (45). Even if this PR doesn't propose adding this as an associated file to those data types, the potential is there and it's worth getting opinions and identifying potential conflicts.

cc @bids-standard/bep029 @bids-standard/bep044
cc @mszinte @julia-pfarr @oesteban (BEP020)
cc @m-miedema @smoia @SouravKulkarni (?) (BEP045)

@effigies changed the title from "SCHEMA: Add audio video behavioral data support" to "[ENH] Add audio/video recordings to behavioral experiments" Oct 28, 2025
@codecov

codecov bot commented Oct 28, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 82.81%. Comparing base (d97bcf9) to head (98eea5e).

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #2231   +/-   ##
=======================================
  Coverage   82.81%   82.81%           
=======================================
  Files          22       22           
  Lines        1693     1693           
=======================================
  Hits         1402     1402           
  Misses        291      291           


Introduce `AudioBitDepth` and `CameraPosition` object definitions, and allow them as optional fields in BEH sidecar `AudioVideoStreams` to capture richer audio recording parameters and camera setup context.
@bendichter
Contributor Author

See bids-standard/bids-examples#523 for an example dataset that validates under this schema

@yarikoptic
Collaborator

@neuromechanist this PR has lots of overlap with your work on

Please review here and also the

since we should aim for consistency regarding the metadata on audio/video files etc

@neuromechanist
Member

My first suggestion (🤓) would be to request a BEP number for this PR from the maintainers, as this is quite a change and arguably/conceptually adds a new modality (although audio and video are being added in BEP044), which would require a BEP. I'll bring it up at this week's maintainer meeting.

@bendichter
Contributor Author

bendichter commented Dec 15, 2025

@neuromechanist

OK, I have submitted a PR to the website to make this an official BEP here: bids-standard/bids-website#759

@bendichter
Contributor Author

bendichter commented Jan 6, 2026

Comparison: this PR vs PR #2022

Summary

  • This PR: Adds audio/video recordings of subjects behaving to the beh/ datatype
  • PR #2022 (BEP044 - Stim-BIDS): Adds organized stimulus files to the root-level /stimuli directory

Key Differences

| Aspect | This PR | PR #2022 (Stim-BIDS) |
| --- | --- | --- |
| Location | sub-XX/beh/ (subject-scoped) | /stimuli/ (root-level, shared) |
| Suffixes | _audio, _video | _audio, _video, _audiovideo, _image |
| Audio formats | .flac, .mp3, .ogg, .wav | .wav, .mp3, .aac, .ogg |
| Video formats | .mp4, .mkv, .avi | .mp4, .avi, .mkv, .webm |
| Unique formats | .flac (audio) | .aac (audio), .webm (video), image formats (.jpg, .png, .svg, .webp) |
| Key entity | recording-<label> (for multiple angles) | stim-<label> (stimulus identifier) |
| Catalog files | None (uses scans.tsv for timing) | stimuli.tsv, annotations.tsv |
| Events linking | _events.tsv alongside recordings | stim_id column in events.tsv |
| Metadata focus | Technical (AudioSampleRate, FrameRate, Height, Width, Duration, CameraPosition, AudioBitDepth) | Attribution (License, Copyright, URL, Description) |
| Privacy concern | Explicit PII warning for human subjects | Not emphasized (stimuli, not subjects) |
| Part splitting | split-<index> for continuous recordings | part-<label> for stimulus segments |
| Annotation system | Standard _events.tsv | Dedicated _annot-<label>_events.tsv |
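The "Unique formats" row can be re-derived from the audio and video format rows with plain set differences. This is only a sanity-check sketch: the variable and function names are made up here, and the image formats listed for PR #2022 are left out because this PR has no image suffix to compare against.

```python
# Format lists copied from the "Audio formats" and "Video formats" rows.
THIS_PR = {
    "audio": {".flac", ".mp3", ".ogg", ".wav"},
    "video": {".mp4", ".mkv", ".avi"},
}
STIM_BIDS = {
    "audio": {".wav", ".mp3", ".aac", ".ogg"},
    "video": {".mp4", ".avi", ".mkv", ".webm"},
}


def unique_formats(ours: dict, theirs: dict) -> dict:
    """Return, per media kind, the extensions present in `ours` only."""
    return {kind: sorted(ours[kind] - theirs[kind]) for kind in ours}


only_in_this_pr = unique_formats(THIS_PR, STIM_BIDS)
only_in_stim_bids = unique_formats(STIM_BIDS, THIS_PR)
print(only_in_this_pr)    # {'audio': ['.flac'], 'video': []}
print(only_in_stim_bids)  # {'audio': ['.aac'], 'video': ['.webm']}
```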

Notable Distinctions

  1. _audiovideo suffix: Only in PR #2022 (BEP044), explicitly distinguishes files with both audio and video streams from video-only files

  2. _image suffix: Only in PR #2022 (BEP044), adds support for static visual stimuli

  3. Reusability: PR #2022 emphasizes stimulus reuse across subjects/studies (centralized in /stimuli), while PR #2231 ties recordings to specific subjects

  4. Annotation richness: PR #2022 has a more elaborate annotation system with annotations.tsv and annot-<label> entity for multiple annotation sets per stimulus

  5. New columns: PR #2022 adds stim_id column to events.tsv; PR #2231 doesn't add new event columns

  6. Timing alignment: PR #2231 references scans.tsv for synchronization with other modalities; PR #2022 focuses on stimulus onset/duration in events files

These PRs are complementary—one captures what the subject does (behavioral recordings), the other captures what's shown to the subject (stimuli).

@neuromechanist , thoughts?

description: |
  Width of the video in pixels (for example, `1920`).
type: integer
minimum: 1
Collaborator

I think it would be quite valuable to expose at least some details on the underlying codec(s) used for audio/video within the files, so that we could assess, for example, whether a browser would be able to play them.
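One possible way to obtain such codec details is to query the `ffprobe` command-line tool that ships with FFmpeg. The sketch below is not part of this PR: it assumes `ffprobe` is installed, the function names are hypothetical, and how (or whether) codec names would be stored in the sidecar is not settled in this thread.

```python
import json
import subprocess


def run_ffprobe(path: str) -> dict:
    """Call the ffprobe CLI (must be installed) and return its JSON report."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-show_streams", "-of", "json", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)


def summarize_codecs(probe: dict) -> list:
    """Reduce an ffprobe report to (stream type, codec name) pairs."""
    return [
        (stream.get("codec_type"), stream.get("codec_name"))
        for stream in probe.get("streams", [])
    ]
```

A caller could then run `summarize_codecs(run_ffprobe("sub-01_task-speech_video.mp4"))` (a made-up filename) and compare the result against the codecs a given browser supports.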

@satra
Collaborator

satra commented Jan 7, 2026

thanks @bendichter - you have described the issue. if beh contains different types of acquisitions about subject behavior then func should not just be BOLD and ASL, it should include EEG/MEG/iEEG, etc. i think some clarity on how the organizational principles differ in behavior versus the others would be good to add. the last part of your proposed sentence doesn't match up with what i wrote about func above. bids clearly says modality specific files in its documentation.

@oesteban
Collaborator

oesteban commented Jan 7, 2026

if beh contains different types of acquisitions about subject behavior then func should not just be BOLD and ASL, it should include EEG/MEG/iEEG, etc.

+1000

Indeed, add PET to the list. Looking beyond func/, BIDS seems to be choosing to define differences between modalities so that data type folders and suffixes end up encoding the same thing, and most BEPs choose this path to move forward.

@bendichter
Contributor Author

@oesteban, @satra, Thanks for engaging on this. I'm having trouble understanding what changes you're suggesting to this PR though. Are you asking for specific modifications to the proposed specification text, or is this a broader concern about BIDS organizational principles?

If there's something actionable I can do here, I'm happy to consider it. But if the concern is about how BIDS has historically organized modalities, that seems like a separate discussion from whether audio/video/events should live together in beh/.

@oesteban
Collaborator

oesteban commented Jan 7, 2026

is this a broader concern about BIDS organizational principles?

Yes it is in my case. I haven't been able to follow on this particular PR so please don't take my message for an objection.

@satra
Collaborator

satra commented Jan 8, 2026

is this a broader concern about BIDS organizational principles?

same here. and it is not an objection, but people coming to this will see two conflicting grouping mechanisms. a note added by this PR to the general section (or to the beh section) about these two grouping mechanisms would help people understand the difference and perhaps help some group address it in the future.

@yarikoptic
Collaborator

yarikoptic commented Jan 8, 2026

Here are the issues relating to potentially changing the folder organization within BIDS, I think it is better to discuss it among those

and leave this BEP in alignment with the current state of allowing func/ or other potentially "functional" data modalities ('eeg/' etc.) to contain associated behavioral data, with beh/ absorbing behavioral data when instrumental data is absent or where it would make more sense to keep it separate.

P.S. Also, in general let's prefer commenting on the diff instead of using the main thread here, since we cannot easily group related comments within the main thread.

@oesteban
Collaborator

oesteban commented Jan 8, 2026

Here are the issues relating to potentially changing the folder organization within BIDS, I think it is better to discuss it among those

IMHO, this is a problem of today. It'd be great if BIDS 2.0 had an elegant/more consistent response. However, BIDS needs to address this for future BEPs (cc/ @ericearl)

@yarikoptic
Collaborator

Let's continue on that in

@bendichter
Contributor Author

@neuromechanist

  1. On the audio/video vs. audiovideo labeling, we went back and forth a bit in the issue. For us it doesn't make a huge difference, since you would be able to parse that from the metadata about the streams in the JSON sidecar anyway. If you feel strongly about audiovideo, I would be fine with changing it in the interest of consistency.

  2. _image. I suppose one could take a picture of a subject performing a task. I don't know if that's recording behavior per se, but I'd be fine with adding it if you think we should.

  3. I think one of our biggest differences is with the metadata in the sidecar files. Yours is attribution (License, Copyright, URL, Description) and this one is technical (AudioSampleRate, FrameRate, Height, Width, Duration, CameraPosition, AudioBitDepth). I don't think adding attribution to ours makes much sense. It will generally share the license of the rest of the dataset. However, I do think it might make sense for you to adopt our technical attributes. Maybe not CameraPosition, but it might be nice to be able to get AudioSampleRate, FrameRate, Height, Width, Duration without reading the data file.

  4. Our splitting is different. I think split is more consistent with existing usage. The only mention I see of part in the existing schema is:

part
Full name: Part

Format: part-<label>

Allowed values: mag, phase, real, imag

Definition: This entity is used to indicate which component of the complex representation of the MRI signal is represented in voxel data. The part-<label> entity is associated with the DICOM Tag 0008, 9208. Allowed label values for this entity are phase, mag, real and imag, which are typically used in part-mag/part-phase or part-real/part-imag pairs of files.

Phase images MAY be in radians or in arbitrary units. The sidecar JSON file MUST include the "Units" of the phase image. The possible options are "rad" or "arbitrary".

When there is only a magnitude image of a given type, the part entity MAY be omitted.

whereas split already has to do with splitting large files:

split
Full name: Split

Format: split-<index>

Definition: In the case of long data recordings that exceed a file size of 2Gb, .fif files are conventionally split into multiple parts. Each of these files has an internal pointer to the next file. This is important when renaming these split recordings to the BIDS convention.

Instead of a simple renaming, files should be read in and saved under their new names with dedicated tools like MNE-Python, which will ensure that not only the filenames, but also the internal file pointers, will be updated.

It is RECOMMENDED that .fif files with multiple parts use the split-<index> entity to indicate each part. If there are multiple parts of a recording and the optional scans.tsv is provided, all files MUST be listed separately in scans.tsv and the entries for the acq_time column in scans.tsv MUST all be identical, as described in Scans file.

though I can see in your case why you might want to use part, if you are splitting the stimulus up into logical components, like chapters of an audiobook. I don't mind terribly if we use different approaches for this.
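The scans.tsv rule quoted above for split files (every part listed separately, all with an identical acq_time) can be checked mechanically. The sketch below assumes the same rule would carry over to split audio/video recordings, which the quoted text only states for .fif files; the function name and the regular expression for the split entity are illustrative, not taken from the schema.

```python
import csv
import io
import re
from collections import defaultdict

# Matches a split entity such as "_split-1" inside a filename (assumed pattern).
SPLIT_ENTITY = re.compile(r"_split-[0-9A-Za-z]+")


def inconsistent_split_groups(scans_tsv_text: str) -> dict:
    """Return groups of split files whose acq_time values differ."""
    reader = csv.DictReader(io.StringIO(scans_tsv_text), delimiter="\t")
    acq_times = defaultdict(set)
    for row in reader:
        filename = row["filename"]
        if SPLIT_ENTITY.search(filename):
            # Files that differ only in their split entity form one recording.
            acq_times[SPLIT_ENTITY.sub("", filename)].add(row["acq_time"])
    return {
        name: sorted(times)
        for name, times in acq_times.items()
        if len(times) > 1
    }
```

An empty result means every group of split files shares a single acq_time; any entry in the returned dictionary names a recording whose parts disagree.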


Development

Successfully merging this pull request may close these issues.

BEP for audio/video capture of behaving subjects