Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH, MRG] Add interpolation per epoch #12334

Closed
wants to merge 9 commits into from

Conversation

alexrockhill
Copy link
Contributor

Complementing the functionality of autoreject, it would be nice to interpolate channels only on some epochs. For instance, if you epochs.drop_bad and notice that there are say 15 epochs all with a different bad channel, it could be just pops or other artifact that interpolation would save you an epoch and/or a channel.

@alexrockhill alexrockhill requested a review from larsoner as a code owner January 4, 2024 00:12
@mscheltienne
Copy link
Member

I like the idea to save a couple of Epochs, but I'm -1 on adding an extra method specifically for this. It is simple, but will further congest the API and creating the input interp_chs will be confusing for some users. I would prefer that epochs.interpolate_bads(...) becomes smart enough to support this use-case.
The "nice" feature of interpolate_bads is that it knows what are the bad channels. You do not need to provide an argument containing the channels to interpolate. Similarly, we could extend the .info["bads"] field for Epochs object to support storing bad channels per epochs. Maybe info["bads"]: list[str] | dict[int, list[str]]?
As a second and third step, we could add a method to generate this dictionary from a threshold and enable interactive bad channel selections in the 2D browser (clicking on a trace will mark the channel as bad in the epochs, clicking on the background will mark the epoch as bad).

It's definitely more work, but I think it would be worth in the long run.

@alexrockhill
Copy link
Contributor Author

Really good points I hadn't considered.

If we change the info['bads'] it will have to survive roundtrip to disk which might involve modifying the FIF standard unless it just works, I'll check.

I think if info['bads'] is modified to be a list, that's going to break some things, namely plotting is the first that comes to mind so that would have to be fixed first or at the same time I think.

It might be a lot of work to store the bad channels per epoch in info['bads'] if it involves modifying the FIF standard, so much so that I know other similar situations have been worked around because it is ideal to leave FIF be. If that's the case, we can just add interp_chs overloads in the two cases that come to mind (interpolation and plotting) (and any more of those are helpful) without making new API entries, good call on that.

@larsoner
Copy link
Member

larsoner commented Jan 4, 2024

Agreed I don't think modifying info['bads'] is a good idea.

I think a tractable way forward is with channel-specific annotations, as these exist in raw and then carry over to epochs. See for example mne-tools/mne-qt-browser#204 which would provide interactive marking (in addition to any automated ones people want to use), then we can write a function to look through for channel-specific BAD_ annotations (by default, descriptions to interp could be an arg) and interpolate. And ideally this interpolation bit could be part of interpolate_bads somehow, maybe with interp_annots=() by (backward-compatible) default but you could pass interp_annots="BAD_*" or similar to interpolate any channel-specific bad annotations as well as info['bads'].

@alexrockhill
Copy link
Contributor Author

I actually checked and it wasn't too crazy to make info['bads'] with a list of lists survive roundtrip to disk. I think the intractable issue in principle is that info is below epochs in the object hierarchy and in order to check that you have the right number of bad entries (one for each epoch), you need access to an information specific to the epochs -- the length of the epochs-- so in principle I think this breaks a nice object hierarchy and so isn't the way to go. I'll push the commit doing this for the record though.

@larsoner
Copy link
Member

larsoner commented Jan 4, 2024

I don't object to info['bads'] on implementation grounds but rather conceptual and ecosystem compliance. info['bads'] has been list of str and has worked this way for years (decades?) and other software expects it to be this. Moreover the measurement info is not designed to store time-varying information. We have already have annotations for this sort of thing, it's much better suited to it.

@alexrockhill
Copy link
Contributor Author

Hmmm @larsoner that all sounds reasonable but it sounds like a GSoC project or something, it's even more broad in scope than what @mscheltienne was suggesting. Perhaps we can live with a simple overload of interp_chs and then plan to depcrecate that for interp_annots when that gets written.

That all sounds very complicated with tagging bad epochs and channels with specific strings by the way, I'm not sure that will be implemented as such, it would make more sense to me to have a matrix with bad/interpolate entries like autoreject.RejectLog.labels as an attribute of annotations only for epochs. Since annotations already store events, you should be able to compare just within the annotations without needing access to the parent epochs like this info problem. Just my two cents.

@larsoner
Copy link
Member

larsoner commented Jan 4, 2024

That all sounds very complicated with tagging bad epochs and channels with specific strings by the way, I'm not sure that will be implemented as such

We have built / are building tools around channel-specific EDIT: annotations (not epochs), so I don't see that as a problem. Then the mapping from the bad annotations to which channels to interpolate in which time intervals of which epochs is like a two-level loop or similar (< 10 lines probably). Where do you see the complication specifically?

Note that what I propose is also more general in the sense that it doesn't have to be specific to Epochs, though it would work thenre. You build the interpolation logic once and it works almost as easily for Epochs as it does for Raw. That to me is a big advantage. Something like find_bad_channels_maxwell could do time-varying bad channel marking (e.g., for flux jumps) or something and interpolate_bads would work on Raw or Epochs, up to you when you do it.

it would make more sense to me to have a matrix with bad/interpolate entries like autoreject.RejectLog.labels as an attribute of annotations only for epochs

Rather than sticking with this binary mask idea these labels could easily be converted to annotations on the epochs object, again with a few lines of code (we could even write a convenience function for it). Then in addition to interpolation with interpolate_bads working, we additional nice standard stuff like being able to visualize the channel specific badness/annotations in epochs.plot for example, which is a work in progress.

To me this interpolation stuff is way smaller scope than GSoC and fits within a more general framework if we conceptualize it the right way from the beginning. Using a binary mask like you suggest is the easiest way to do it perhaps but doesn't fit as nicely with existing tools/frameworks in MNE, so it misses out on some opportunities.

@alexrockhill
Copy link
Contributor Author

alexrockhill commented Jan 4, 2024

That sounds very convincing and reasonable other than my ignorance of the internal working of annotations. Sounds like you've been thinking about it a bit so I don't insist on sticking to what I would consider my initial impression of how to solve this.

I'll just leave this PR at this state (working with interp_chs) and it sounds like maybe should just be closed to avoid confusion until the annotations are implemented to be specific to epochs and then we can use that to reimplement this.

For notes/record, see d9b4b83 for modifying epochs.info['bads'] to be a list. Not that hard but unnecessarily complicated, annotations is the place to store this.

@drammock
Copy link
Member

drammock commented Jan 4, 2024

I hope you don't mind me saying this @alexrockhill but this is a great example of why we encourage all our contributors to open an issue first to discuss a proposed implementation before putting in the work to code it up. (I know you can code quite fast and for you this probably wasn't a ton of time lost, but I think it's still worth making the point; those lost minutes do add up!)

@alexrockhill
Copy link
Contributor Author

I don't mind you saying it. I had already written the code for my own analysis so it was only a couple minutes extra work in this case and I was copying how Mainak did it in autoreject so I thought it would be pretty straightforward otherwise I would have opened an issue. I wasn't aware of the active development of annotations which would fit this so good to know and it ended up basically like an issue discussion but in the future I'll keep that in mind.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants