-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Feature]: Output to BIDS #750
Comments
How do you mean? Like a 'dandi organize' helper function to name files in folder in a particular way? Are there existing tools/libraries for that so we don't have to be the ones to maintain it (again, like |
Yes, Basically DANDI is just a user of NWB and BIDS, not a provider of NWB/BIDS tools (nwbinspector and bidsschematools are also not part of DANDI), so why then would BIDS formatting of NWB files be contingent on DANDI? By analogy to e.g. MRI workflows, usually -to-NIfTI converters offer the capability to either create BIDS-NIfTI, or just output NIfTI files with whatever names. In the case of NWB, the top-level conversion package is neuroconv, so it would make a lot of sense to offer the option for BIDS-formattable output in addition to just a user-specified string for the file. Since it's still early enough I could move the feature PR from DANDI to here. The up-side is that neuroconv could then advertise that it can comply with BIDS. |
I agree that separation of concerns is a good idea - but passing the buck to NeuroConv specifically may not be the best location for all that logic Input to NeuroConv is source data from proprietary formats (files/folders in structure of those formats) and output, by default, NWB files at whatever paths with whatever structure the user specifies. The basic recommendation (and tests) is to output as a flat collection of uniquely named NWB files to make it easy for I'd be happy to add convenience wrappers/options/instructions here to make it easy for someone to take their flat list of NWB files created through NeuroConv and organize them into DANDI or BIDS (likely as an extra option to the I would advocate for the creation of a specific standalone package for file organization strategies that can be used as a modular step in the workflow after creation of the NWB files and before DANDI upload. It's own dependency stack would also benefit tremendously from that isolation away from NeuroConv (e.g., what about people making their own files without NeuroConv?) |
There is already a package which specifically deals with the sort of data we're looking to reformat, https://github.com/INT-NIT/BEP032tools/ , we could indeed extend that. Though BIDS supports NWB even outside of BEP032, so in the end it might be best to just have another package. Not sure how long-term-maintenance would look on that.
This would be a good solution if you really don't want BIDS-reformatting functionality mixed in your code. |
To be clear, I'd be happy to include the functionality; I just don't want to be the one responsible for maintaining the code and keeping it up to date with BIDS standard (unless this gets negotiated in the upcoming DANDI renewal scope, so we'll see). We have enough to maintain here as it is lol Would appreciate a minimal example of how to call/use whatever function you're planning to extend whenever it's available |
@CodyCBakerPhD ok, I have a draft package for this → https://github.com/con/nwb2bids I'd like to set up a test for it, and I'd like to have a “blank” NWB file (with all the metadata, but lengt-0 data arrays) to test this on... but I can't seem to generate one from my data:
Do you maybe have some empty example data files somewhere? |
The specific error there comes from the Otherwise I got the following to work import pynwb
import numpy
from pynwb.testing.mock.file import mock_NWBFile
nwbfile = mock_NWBFile()
time_series = pynwb.TimeSeries(name="Test", data=numpy.array([], dtype="uint8"), unit="n.a.", rate=1.0)
nwbfile.add_acquisition(time_series)
with pynwb.NWBHDF5IO(path="./test_empty.nwb", mode="w") as io:
io.write(nwbfile) |
@CodyCBakerPhD thanks for the example code. I guess in a pinch I could go with that, but since the nwb2bids conversion focuses on handling metadata, I'd like to take one of the files I have, and just replace the data arrays from it with empty arrays. Is there any way to take a file and remove the “data” from it? |
I would not recommend this approach It would also be more explicit/readable to specify the metadata on the test files rather than inheriting from an implicit source |
A helper script to make some test ecephys NWB files using simulated data in a semi-realistic manner (SpikeGLX inspired) If the synthetic recording/spiking is overkill for your purposes we can also just use the import datetime
import tempfile
import neuroconv
import spikeinterface
from neuroconv.datainterfaces import PhySortingInterface
from neuroconv.tools.spikeinterface import add_recording
from spikeinterface.exporters import export_to_phy
# Define NeuroPixel-like values for sampling rates and conversion factors
duration_in_s = 0.5 # Shortened from ~3 for demo purposes. Can't be too short or else errors in extraction step
number_of_units = 10
number_of_channels = 385 # Have to include 'sync' channel to be proper SpikeGLX. TODO: artificiate sync pulses
ap_conversion_factor_to_uV = 2.34375
ap_sampling_frequency = 30_000.0
lf_sampling_frequency = 2_500.0
downsample_factor = int(ap_sampling_frequency / lf_sampling_frequency)
# Generate synthetic spiking and voltage traces with waveforms around them
artificial_ap_band, spiking = spikeinterface.generate_ground_truth_recording(
durations=[duration_in_s],
sampling_frequency=ap_sampling_frequency,
num_channels=number_of_channels,
dtype="float32",
num_units=number_of_units,
seed=0, # Fixed seed for reproducibility
)
artificial_ap_band.set_channel_gains(gains=ap_conversion_factor_to_uV)
waveform_extractor = spikeinterface.extract_waveforms(recording=artificial_ap_band, sorting=spiking, mode="memory")
int16_artificial_ap_band = artificial_ap_band.astype(dtype="int16")
# Kind of messy that this intermediate step is needed, but can't figure out direct out to NWB
temporary_folder = tempfile.TemporaryDirectory()
export_to_phy(
waveform_extractor=waveform_extractor, output_folder=temporary_folder.name, remove_if_exists=True, copy_binary=False
)
phy_interface = PhySortingInterface(folder_path=temporary_folder.name)
# Write to NWB
# This is the part where the BIDS or other organization comes into play?
# Or start from existing files and make symlinks to them?
metadata = dict(
NWBFile=dict(session_id="testabc", session_start_time=datetime.datetime(1970, 1, 1).isoformat()),
Subject=dict(subject_id="test123", species="Mus musculus", age="P30D", sex="O"),
)
nwbfile_path = "C:/Users/Raven/Downloads/test_data.nwb"
with neuroconv.tools.nwb_helpers.make_or_load_nwbfile(
nwbfile_path=nwbfile_path, metadata=metadata, overwrite=True
) as nwbfile:
add_recording(recording=int16_artificial_ap_band, nwbfile=nwbfile)
phy_interface.add_to_nwbfile(nwbfile=nwbfile, metadata=phy_interface.get_metadata())
# add_sorting(sorting=spiking, nwbfile=nwbfile) |
What would you like to see added to NeuroConv?
Would you be interested in having neuroconv output BIDS-formatted NWB files? NWB is already supported as a format by BIDS.
Is your feature request related to a problem?
No response
Do you have any interest in helping implement the feature?
Yes.
Code of Conduct
The text was updated successfully, but these errors were encountered: