Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add RemoveBadChannelsRecording class and remove_bad_channels #3685

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

chrishalcrow
Copy link
Collaborator

Part of plan to class-ifty all preprocessing steps. Here's an implementation of a RemoveBadChannelsRecording class, inheriting from ChannelSliceRecording.

Users can now use

recording_with_bad_channels_removed = remove_bad_channels(recording)

and this will detect and remove the bad channels.

Most difficult thing was the kwargs. RemoveBadChannelsRecording shares much of the same signature as detect_bad_channels, and I didn't want to duplicate the default parameters, but did want to save them in the RemoveBadChannelsRecording._kwargs. Ended up using inspect from signature, as is also done in the motion module. Happy to discuss different implementations.

@chrishalcrow chrishalcrow added enhancement New feature or request preprocessing Related to preprocessing module labels Feb 12, 2025
Copy link
Collaborator

@JoeZiminski JoeZiminski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me, few suggestions / questions. Really love this as it solves a big problem for me!

Spatial radius below which two channels are considered neighbors in the neighborhood_r2 method.
seed : int or None, default: None
The random seed to extract chunks
{}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this required for the doc formatting?

I think this works nicely and makes sense not to duplicate the docs, but could have a note here for users reading the code that the full params list is at RemoveBadChannelRecording (though then this might be strange for docs readers).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's quite hard to do this, since we can't add comments to docstrings. The pattern is used a fair bit in the package, and I do remember getting confused. Hmmm.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is required for the doc formatting. Basically f-strings were decided to never be used for docstrings by the powers that be so you have to use .format for docstring injection for this, so this has been the long-term solution here (and at Neo --so maybe this Sam) for ages.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

# make sure they are removed
assert len(set(new_rec._kwargs["bad_channel_ids"]).intersection(new_rec.channel_ids)) == 0
# and that the kwarg is propogatged to the kwargs of new_rec.
assert new_rec._kwargs["noisy_channel_threshold"] == 0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there anything to worry about channel ordering here? I guess not (and this is a question for ChannelSliceRecording tests anyways. But as I have no understanding of how the ordering works I thought worth asking 😆

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ordering of the channel ids?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

erm like order of the channels on the recording itself (I'm not sure how exactly this is represented 😅 ). But like the default order when you do plot_traces without order_channel_by_depth

The recording with bad channels removed
"""

params_doc = """Different methods are implemented:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would move this outside of the class and just call it _bad_channel_detection_kwargs_doc

updated_detect_bad_channels_kwargs = {k: v.default for k, v in sig.parameters.items() if k != "recording"}
updated_detect_bad_channels_kwargs.update(detect_bad_channels_kwargs)

bad_channel_ids, channel_labels = detect_bad_channels(recording=recording, **detect_bad_channels_kwargs)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
bad_channel_ids, channel_labels = detect_bad_channels(recording=recording, **detect_bad_channels_kwargs)
bad_channel_ids = detect_bad_channels_kwargs.get("bad_channel_ids")
if bad_channel_ids is None:
bad_channel_ids, channel_labels = detect_bad_channels(recording=recording, **detect_bad_channels_kwargs)

We don't have to rerun bad channel detecion if bad channels have already been computed. Otherwise, if we parallelize this then each process/thread will rerun detection.

We should also keep the ProcessingPipeline in mind. In case we have a processing pipeline, we should differentiate somehow the lazy/non-lazy computation steps.
An option could be to extend the self._kwargs mechanism, and add a self._precomputed_kwargs (in this case this would have the bad_channel_ids).
When loading an object, we could then specify whether we want to use or ignore the precomputed kwargs. When applying the pipeline to another object, you would ignore the precomputed, so that channel detection will be repreformed on the new recording.

Does it make sense?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh you're very right, great point. For the ProcessingPipeline I was thinking that the main distinction is if the parameter is recording-specific or not. If the kwarg (e.g. bad_channel_ids or whitening_matrix - these might be the only examples??) depends on which recording it's applied to, it shouldn't be used by the pipeline. I reckon this is the exact same set of kwargs that could be precomputed, since you'll always need a recording to do a precomputation?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_kwargs are handled by the subclasses, so would we need to add _precomputed_kwargs to all the preprocessing subclasses?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be done in the base object. It will need to be there because we will have to handle it upon loading and give the option to add them to the main _kwargs (default), or skip them. @samuelgarcia what do you think?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh no, we could use hasattr to check if a preprocessing class has possible _precomputed_kwargs, nice.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should never change the self._kwargs yourself :) In case someone does, we can't expect that function behave correctly

Copy link
Collaborator

@JoeZiminski JoeZiminski Feb 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Edit: sorry I think I am confusing two cases and these are two separate issues:

  1. Might passing a last of channels to remove in detect_bad_channel_kwargs be a bit confusing? Can this be a separate argument? must pass either bad_channel_ids or detect_bad_channel_kwargs?

  2. If I understand correctly the idea would be to store the bad channel ids in self._kwargs and not re-detect in case they were already computed. I meant change the public function kwargs sorry, would this case cause a problem:

chan_removed_rec = remove_bad_channels(recording, noisy_channel_threshold=5)

# do some plotting or something to check the remove channels, and decide want to remove a few more

chan_removed_rec = remove_bad_channels(chan_removed_rec, noisy_channel_threshold=1)

Copy link
Collaborator Author

@chrishalcrow chrishalcrow Feb 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, having bad_channel_ids as a separate argument sounds ok to me.

I think your example wouldn't cause a problem because they are two independent RemoveBadChannelRecording classes each with their own kwargs. Will check...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To summarise the problem (and clarify in my head...)

The difficult bit is that there are two things we want. First, if you load a recording from the recording.json file, it should quickly be able to reconstruct the preprocessed recording. For this to work and be fast, we need to save the bad_channel_ids in the kwargs and apply them when we load the recording, rather than re-computing.

On the other hand, thinking about the PreprocessingPipeline, we want people to be able to share their pipelines and apply a pipeline from another lab to their own data. To do this, we need to be able to share which kwargs were used to detect the bad channels, but not the bad channels themselves; since different recordings have different bad channels.

So we need to mark out that as a special kwarg: apply it when you load the recording, but don't add it to a PreprocessingPipeline. Does that make sense?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that makes total sense

Copy link
Member

@alejoe91 alejoe91 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's figure out how to deal with precomputed kwargs before merging this

@samuelgarcia
Copy link
Member

We need a strong refactoring to separate kwargs in 2 parts, the parametrsation and caching some heavy computation.
This is on my todo list since 3 years now...
Lets do it before.

@h-mayorquin
Copy link
Collaborator

We need a strong refactoring to separate kwargs in 2 parts, the parametrsation and caching some heavy computation. This is on my todo list since 3 years now... Lets do it before.

+1 on discussing this.

Sam mentioned it on the last meeting. Would be good to bring it back.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request preprocessing Related to preprocessing module
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants