Skip to content

add_channels() documentation is difficult to understand #11106

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
hoechenberger opened this issue Aug 28, 2022 · 15 comments · Fixed by #13051
Closed

add_channels() documentation is difficult to understand #11106

hoechenberger opened this issue Aug 28, 2022 · 15 comments · Fixed by #13051
Assignees

Comments

@hoechenberger
Copy link
Member

hoechenberger commented Aug 28, 2022

I just wanted to use Raw.add_channels(), so I looked at the API docs and even though I'm a developer, it actually took me a couple of seconds to even understand what could probably be meant, and still I think I'll have to take a look at the code to get a better understanding:

Screen Shot 2022-08-28 at 17 00 00

  • What is self
  • What is meant by, "Must contain the same type as the current object" (actually I'm still unsure, I'll probably have to look at the code to figure it out)
  • "force the info objects ... to match the values in self". I have absolutely no clue what that means.
  • Why does it say it can return Raw, Evoked, or Epochs – I was looking it up for Raw at https://mne.tools/stable/generated/mne.io.Raw.html#mne.io.Raw.add_channels, and I don't think my Raw is suddenly gonna get turned into an Epoch by this method, no?

I believe we're making bad use of the docdict here, avoiding redundancy but making the docs effectively useless because they're too weird and hard to understand.

Didn't we recently have a somewhat similar case where we ended up with a way to improve things?

@agramfort
Copy link
Member

agramfort commented Aug 28, 2022 via email

@hoechenberger
Copy link
Member Author

hoechenberger commented Aug 28, 2022

Oh wow only after actually looking at the code I realize I totally misunderstood what this method is even supposed to do, based on what I expected from the method name and its docstring.

The docstring says,

Append new channels to the instance.

So what I thought I could do is, just pass an array of data for additional channels (and channel names), and they will be appended to the existing instance.

But actually what this method is supposed to do is in fact a concatenation of multiple objects of the same type, right?

But we already have .append() and concatenate_*() for this. The only issue with these is that they currently only work on the time domain. But if we added a switch to allow for concatenation on the channel domain as an alternative way of operation, we wouldn't even need .add_channels() anymore.

I guess I'm thinking sort of like in terms of pd.concat(), which has an axis parameter.

Any thoughts here?

@agramfort
Copy link
Member

agramfort commented Aug 28, 2022 via email

@hoechenberger
Copy link
Member Author

hoechenberger commented Aug 28, 2022

it needs to be more than channel name and array as a channel info["chs"] is a lot more than this.

Exactly! That's why naming the method add_channels() is misleading – effectively, you're concatenating multiple "complete" objects.

Look just how much code I had to produce to add two channels to an existing raw:

# Add GSR and temperature channels
gsr_data = np.array([2.1e-6] * len(raw.times))
temperature_data = np.array([36.5] * len(raw.times))

gsr_and_temp_data = np.concatenate([
    np.atleast_2d(gsr_data),
    np.atleast_2d(temperature_data),
])
gsr_and_temp_info = mne.create_info(
    ch_names=["GSR", "Temperature"],
    sfreq=raw.info["sfreq"],
    ch_types=["gsr", "temperature"],
)
gsr_and_temp_info["line_freq"] = raw.info["line_freq"]
gsr_and_temp_info["subject_info"] = raw.info["subject_info"]
with gsr_and_temp_info._unlock():
    gsr_and_temp_info["lowpass"] = raw.info["lowpass"]
    gsr_and_temp_info["highpass"] = raw.info["highpass"]
gsr_and_temp_raw = mne.io.RawArray(
    data=gsr_and_temp_data,
    info=gsr_and_temp_info,
    first_samp=raw.first_samp,
)
raw.add_channels([gsr_and_temp_raw])
del gsr_and

https://github.com/mne-tools/mne-bids/blob/036dc7e08c5053e218f4aeff7c1786ffbdf0a6b4/mne_bids/tests/data/tiny_bids/code/make_tiny_bids_dataset.py#L46-L69

@agramfort
Copy link
Member

agramfort commented Aug 29, 2022 via email

@sappelhoff
Copy link
Member

+1 to improve the docs based on the 4 points in the OP

once that is clarified, I think it's easier to live with the fact that add_channels actually concatenates raws (or other instances).

I am not sure what you suggest here.

not sure either, but perhaps that add_channels would be a function that takes params like data, ch_name, ... (and all else that is needed) instead of a raw/... object?

@drammock
Copy link
Member

Look just how much code I had to produce to add two channels to an existing raw:

Agree that this is way too difficult. Comments below:

# Add GSR and temperature channels
gsr_data = np.array([2.1e-6] * len(raw.times))
temperature_data = np.array([36.5] * len(raw.times))
gsr_and_temp_data = np.concatenate([
    np.atleast_2d(gsr_data),
    np.atleast_2d(temperature_data),
])

The above could be simplified a bit; still 3 expressions but much shorter:

gsr = np.full((1, len(raw.times)), 2.1e-6)
temp = np.full((1, len(raw.times)), 36.5)
gsr_and_temp_data = np.vstack([gsr, temp])

This next bit seems unaviodable but not so bad:

gsr_and_temp_info = mne.create_info(
    ch_names=["GSR", "Temperature"],
    sfreq=raw.info["sfreq"],
    ch_types=["gsr", "temperature"],
)

Are the below lines necessary when creating RawArray, or only when adding the channels to the existing Raw? (I suspect the latter, but haven't checked). Regardless, we should make it possible to skip line_freq, lowpass, highpass, and subject_info for cases like this --- perhaps by having "wildcard" defaults that will automatically take on the values of the Raw they are concatenated with? This is (I think) what force_update_info does already: uses the existing Raw's info to update/override the info of the concatenated objects. But it would be nice if that were possible without having to first explicitly set values for those info fields.

gsr_and_temp_info["line_freq"] = raw.info["line_freq"]
gsr_and_temp_info["subject_info"] = raw.info["subject_info"]
with gsr_and_temp_info._unlock():
    gsr_and_temp_info["lowpass"] = raw.info["lowpass"]
    gsr_and_temp_info["highpass"] = raw.info["highpass"]

The rest seems unavoidable, but if the two simplifications above were made, I think we're down to something reasonable/acceptable.

gsr_and_temp_raw = mne.io.RawArray(
    data=gsr_and_temp_data,
    info=gsr_and_temp_info,
    first_samp=raw.first_samp,
)
raw.add_channels([gsr_and_temp_raw])

As for the intuitiveness of the API name... I think I would agree that something like raw.add_channels(data, ch_name, ch_type, ...) ought to work too. I can imagine an API where data can be a NumPy array (in which case you must pass other args), or alternatively data can be a list of Raw instances (or epochs or evoked).

@hoechenberger
Copy link
Member Author

hoechenberger commented Aug 30, 2022

I am not sure what you suggest here.

not sure either, but perhaps that add_channels would be a function that takes params like data, ch_name, ... (and all else that is needed) instead of a raw/... object?

I am not sure myself either 😆

But I suppose my feeling was that:

  • add_channels() doesn't live up to its name, because
  • it performs a concatenation of two same-typed objects, hence
  • this functionality could (should) live in concatenate_*(), meaning that
  • concatenate_*() should gain an axis-like parameter, and
  • add_channels() could be subsequently deprecated or "legacy-lized" (and intenally call concatenate_*()),
  • but it would still be super nice if I could just pass an array of data, channel names, channel types, and perhaps locations to add_channels() or a similar method to simply append those channels without having to create an entirely new Raw/Epochs/Evoked container first; but I do realize that things can get out of hand here quickly

Thanks @drammock for the simplifications, that's helpful!

@agramfort
Copy link
Member

agramfort commented Aug 31, 2022 via email

@hoechenberger hoechenberger added this to the 1.2 milestone Sep 14, 2022
@larsoner
Copy link
Member

As for the intuitiveness of the API name... I think I would agree that something like raw.add_channels(data, ch_name, ch_type, ...) ought to work too.

I would rather not add this. To me it's replicating RawArray without much gain. I don't think this extra line is worth the API duplication.

Can we just live with a documentation improvement for now?

@hoechenberger
Copy link
Member Author

Can we just live with a documentation improvement for now?

Yep, let's do that!

@hoechenberger
Copy link
Member Author

hoechenberger commented Sep 15, 2022

Hey! who said I'd be volunteering? 😅😅😅😅

@larsoner
Copy link
Member

I believe @agramfort calls this being "volun-told" :)

More seriously if you won't have time in the next few days @drammock or I could give it a shot. But I think you're the best equipped having most recently suffered through the doc badness :)

hoechenberger added a commit to hoechenberger/mne-python that referenced this issue Sep 15, 2022
hoechenberger added a commit to hoechenberger/mne-python that referenced this issue Sep 16, 2022
@hoechenberger hoechenberger modified the milestones: 1.2, 1.3 Sep 16, 2022
@larsoner larsoner added EASY and removed EASY labels Oct 21, 2022
@larsoner larsoner modified the milestones: 1.3, 1.4 Dec 8, 2022
@larsoner larsoner modified the milestones: 1.4, 1.5 Apr 21, 2023
@larsoner larsoner modified the milestones: 1.5, 1.6 Jul 31, 2023
@larsoner larsoner modified the milestones: 1.6, 1.7 Nov 7, 2023
@larsoner larsoner modified the milestones: 1.7, 1.8 Apr 9, 2024
@larsoner larsoner added the EASY label Jul 12, 2024
@larsoner
Copy link
Member

Milestone on this has been bumped six times, removing milestone until someone champions this issue. But also marking as "easy" so it hopefully gets visibility / higher likelihood of being tackled

@skjerns
Copy link
Contributor

skjerns commented Jan 8, 2025

As for the intuitiveness of the API name... I think I would agree that something like raw.add_channels(data, ch_name, ch_type, ...) ought to work too.

I would rather not add this. To me it's replicating RawArray without much gain. I don't think this extra line is worth the API duplication.

I have added some documentation in #13051, however, having this as a convenience function without having to worry about potentially overwriting things in the info would be nice. I tried adding a simple RawArray with a minimal info and there was an exception thrown about inconsistencies in the infos, which I only solved via force_update_info=True, which was odd as the info was mainly empty.

Having a simple function add_channels(data, ch_name) would be more what I expected from the function

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants