Multidimensional non-dimension coordinates in DataArray + Dataset #9579
Comments
This is the same issue as #8005: the data model for
In the issue above it has been proposed to extend the data model to use indexes to determine the content of
As a workaround, you could use a single-variable
cc @dcherian
@keewis thanks for the heads-up on the older topic. Hmm... I'll have to use other workarounds for now then. :)
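For what it's worth, I am currently unpacking the `affinities` coordinate into one coordinate per thread:

    threads = ds.sizes["thread"]  # length of the "thread" dimension
    place_names = [f"place_{i}" for i in range(threads)]
    # One 1-D coordinate along "affinity" per thread; assign_coords returns a new Dataset.
    ds = ds.assign_coords(
        {name: ("affinity", aff)
         for name, aff in zip(place_names, ds.affinities.values.T)}
    )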
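A self-contained sketch of the same unpacking idea, in case it helps others landing here. The data values, the variable name `runtime`, and the sizes are made up for illustration; only the names `size`, `affinity`, `thread`, `affinities` and `place_{i}` come from this thread. It assumes `affinities` has dimensions `("affinity", "thread")`.

```python
import numpy as np
import xarray as xr

# Made-up example data: 2 sizes, 4 experiments (affinity), 3 threads.
ds = xr.Dataset(
    {"runtime": (("size", "affinity"), np.random.rand(2, 4))},  # hypothetical variable
    coords={
        "size": [128, 256],
        # 2-D non-dimension coordinate: thread placement for each experiment
        "affinities": (
            ("affinity", "thread"),
            np.array([[0, 1, 2], [0, 2, 4], [0, 1, 2], [1, 3, 5]]),
        ),
    },
)

# Unpack the 2-D coordinate into one 1-D coordinate per thread.
threads = ds.sizes["thread"]
place_names = [f"place_{i}" for i in range(threads)]
unpacked = ds.assign_coords(
    {
        name: ("affinity", column)
        for name, column in zip(place_names, ds["affinities"].values.T)
    }
)

# The 1-D coordinates only use the "affinity" dimension, so they are
# carried over when a single variable is extracted.
da = unpacked["runtime"]
print(sorted(da.coords))

# They can also be used for grouping on the extracted DataArray.
mean_by_place = da.groupby("place_1").mean()
print(mean_by_place.dims)
```

The trade-off is one coordinate per thread, so the placement of a given experiment is spread across several 1-D coordinates instead of living in a single 2-D one.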
Good to hear you found a workaround. Thinking about it a bit more, #8005 would require setting an index on the variable to carry over, which means this might not be suitable for every use case, and I don't think we will extend the data model to allow non-indexed coordinates with additional dimensions on
Either way, I'm closing in favor of #8005.
What is your issue?
I wish to do data analysis on some data in a seemingly weird format. I had hoped I could use `xarray` for this.
I am using a `Dataset` variable looking like this:
Let's explain the details of why it looks like this:
- `size`: a dimension coordinate, which is simple and straightforward.
- `affinity` and `thread`. The `affinity` is a simple index, equivalent to a linear experiment index. The `thread` dimension is the number of threads I ran the experiment on. So this is not a dimension of the data, but intrinsic information for each experiment (`affinity` index).
- `affinities`. This affinities [1] is a coordinate on the two dimensions above.

This construction seemed natural to me because:
size
size
I ran the experiment `affinity` times with 10 threads, and I stored the unique placements of the `threads` in the `affinities` coordinate.
I.e. the thread placements are an intrinsic part of the experiment index, and not a dimension of the data.
The nice thing is that I can do:
The problem comes when I need to extract only one of the variables.
so I lose all information related to the `affinity`. Now, I can understand how this works, because any dimension in a `DataArray` is tightly bound to the variable, so the coordinates must be as well.
But is this the wrong way to structure things?
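The behaviour described above can be reproduced with a small sketch. The variable name `runtime`, the sizes, and the values are assumptions made up for illustration; only `size`, `affinity`, `thread` and `affinities` come from the description. When a variable is extracted from a `Dataset`, xarray keeps only the coordinates whose dimensions are a subset of that variable's dimensions, so a coordinate that also spans `thread` is left behind.

```python
import numpy as np
import xarray as xr

# Hypothetical sizes; the real data uses 10 threads as well.
n_size, n_affinity, n_thread = 3, 4, 10

ds = xr.Dataset(
    data_vars={
        # "runtime" is a made-up data variable on (size, affinity)
        "runtime": (("size", "affinity"), np.random.rand(n_size, n_affinity)),
    },
    coords={
        "size": [128, 256, 512],
        # 2-D non-dimension coordinate on (affinity, thread)
        "affinities": (
            ("affinity", "thread"),
            np.arange(n_affinity * n_thread).reshape(n_affinity, n_thread),
        ),
    },
)

# Extracting a single variable yields a DataArray with dims (size, affinity).
da = ds["runtime"]

print(sorted(ds.coords))  # the Dataset holds both "affinities" and "size"
print(sorted(da.coords))  # the DataArray keeps only "size"
```

The `thread` dimension is not part of `runtime`, which is why `affinities` cannot follow it into the `DataArray`.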
I considered turning `affinities` into an attribute, but that has the problem that it won't get carried over when extracting the variable (I also tried `with xr.set_options(keep_attrs=True)` to no avail), and it surely won't select the `affinity` index.
I.e. I wouldn't be able to do the above `groupby` action easily...

Footnotes
1. The affinity here refers to process/thread affinity.