MNT: Enable numpy 2+ installs #758

Merged 1 commit on Aug 20, 2024
1 change: 0 additions & 1 deletion imap_processing/cdf/config/imap_hi_variable_attrs.yaml
```diff
@@ -287,7 +287,6 @@ hi_de_nominal_bin:
   <<: *default_uint8
   CATDESC: Corresponding histogram angle bin for this Direct Event
   FIELDNAM: Histogram Bin Number
-  FILLVAL: 511
   FORMAT: I2
   LABLAXIS: Hist Bin \#
   VALIDMIN: 0
```
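As context for the deletion above: 511 does not fit in the variable's uint8 type (inherited from `*default_uint8`), and out-of-range values like this are exactly what NumPy 2 rejects rather than silently wrapping. A minimal sanity check:

```python
import numpy as np

# uint8 spans 0..255, so a fill value of 511 is out of range for a
# variable whose attributes declare it as uint8.
info = np.iinfo(np.uint8)
print(info.min, info.max)  # 0 255
assert not (info.min <= 511 <= info.max)
```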
4 changes: 4 additions & 0 deletions imap_processing/mag/l1a/mag_l1a_data.py
```diff
@@ -336,6 +336,10 @@ def to_signed16(n: int) -> int:
     primary_vectors = []
     secondary_vectors = []
 
+    # To avoid overflows, we need to cast the potentially 8 bit signed integers to
+    # int32 before the bitshifting operations below.
+    vector_data = vector_data.astype(np.int32)
```
Collaborator Author:
@maxinelasp, I'm not sure how you want to handle this, so let me know if you want something else. You are already calling astype(np.int32) at the call site, but in the tests you pass in a frombuffer() result, which is only int8. We could drop the cast in the calling routine and rely on this one, or we could cast the frombuffer result in the test immediately, if you'd prefer.

Contributor:

Annoyingly, this is also complicated by the compression code that I'm working on, which also takes in vector data. Is the problem that the 8 bit ints are signed?

Contributor:

If this change passes the tests then I'm ok with it. The bitshifting is unfortunately complicated with different types, but I am also looking to rework the types to be more consistent in my newest change, so that it works with the uncompressed and compressed algorithms.

Collaborator Author:

> Is the problem that the 8 bit ints are signed?

The problem is here:

```python
x = (
    ((vector_data[pos + 0] & 0xFF) << 8)
    | ((vector_data[pos + 1] & 0xFF) << 0)
) & 0xFFFF
```

where you are doing math with 0xFFFF, which is out of range for those types, so numpy 2+ warns and requires us to be explicit about which types we want the math done in.
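For illustration (not part of the diff), the safe pattern is a sketch like the following, with a hypothetical two-byte input standing in for the test's frombuffer() data:

```python
import numpy as np

# Hypothetical two-byte sample, as the tests produce with np.frombuffer.
vector_data = np.frombuffer(bytes([0xAB, 0xCD]), dtype=np.int8)

# Casting to int32 up front keeps every intermediate of the mask/shift
# math in range, so the expression behaves the same on NumPy 1.x and 2.x.
vector_data = vector_data.astype(np.int32)

pos = 0
x = (
    ((vector_data[pos + 0] & 0xFF) << 8)
    | ((vector_data[pos + 1] & 0xFF) << 0)
) & 0xFFFF
assert x == 0xABCD
```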

Within the code you are already casting here:

```python
mag_l0.VECTORS.astype(dtype=np.int32),  # type: ignore[union-attr]
```

But in the tests, you aren't:

```python
(primary_vectors, secondary_vectors) = MagL1a.process_vector_data(
    input_data, total_primary_vectors, total_secondary_vectors
)
```

So I was saying that we could also update it in those two locations, if you'd like.

Contributor:

Yeah, ok, in that case either change is fine, and I'll go back once the compression algorithm is done and make sure it's consistent with the types I'm using there.


```diff
     # Since the vectors are stored as 50 bit chunks but accessed via hex (4 bit
     # chunks) there is some shifting required for processing the bytes.
     # However, from a bit processing perspective, the first 48 bits of each 50 bit
```
2 changes: 1 addition & 1 deletion poetry.lock


2 changes: 1 addition & 1 deletion pyproject.toml
```diff
@@ -38,7 +38,7 @@ space_packet_parser = ">=4.2.0"
 spiceypy = ">=6.0.0"
 xarray = '>=2023.0.0'
 pyyaml = "^6.0.1"
-numpy = "^1.26.4"
+numpy = "<=3"
```
Contributor:

We probably want to have this pinned to a minor version, no? Something like numpy = "^2.1.0". I'm not sure we want random bumps up to, for example, 2.2.0.
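Worth noting: Poetry's caret operator still allows minor bumps, so "^2.1.0" would in fact admit 2.2.0; pinning to a minor version is the tilde form. A quick sketch of what each candidate constraint admits, using the packaging library with an illustrative release list:

```python
# Poetry's caret "^2.1.0" expands to ">=2.1.0,<3.0.0" (minor bumps allowed),
# while the tilde "~2.1.0" expands to ">=2.1.0,<2.2.0" (patch bumps only).
from packaging.specifiers import SpecifierSet
from packaging.version import Version

releases = [Version(v) for v in ["1.26.4", "2.0.1", "2.1.0", "2.2.0"]]

caret = SpecifierSet(">=2.1.0,<3.0.0")   # numpy = "^2.1.0"
tilde = SpecifierSet(">=2.1.0,<2.2.0")   # numpy = "~2.1.0"
open_ended = SpecifierSet("<=3")         # numpy = "<=3" (this PR)

assert [str(v) for v in releases if v in caret] == ["2.1.0", "2.2.0"]
assert [str(v) for v in releases if v in tilde] == ["2.1.0"]
assert all(v in open_ended for v in releases)
```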

Collaborator Author:

I don't think we should require numpy 2+, because that would likely cause issues with other packages that only support numpy <2, since those packages may not have made the compatibility fixes themselves yet.

I'm personally a fan of more open-ended ranges for packages, so that we don't get pinned back and then have to do a massive update later on. When deploying or installing the package as a user, we can and should use explicit package locks, but in general I prefer leaving the testing of the package open so that we don't get stuck way behind the times and miss things.

Contributor:

Ok, since this uses a lockfile I'm alright with that. Did you see if requiring numpy 2+ actually caused issues with packages?

Collaborator Author:

Yes, it does not cause issues right now; we are able to install numpy 2+. I think this is a difference of whether we are the final "user" (what I think you're getting at with the lockfile for install), or whether we expect this package to be used by others (my assumption), potentially alongside packages beyond the ones we require. Pinning to numpy 2+ would affect others, not just ourselves. If we work with numpy 1.26 right now, we should let that keep working, IMO.


```diff
 # Optional dependencies
 numpydoc = {version="^1.5.0", optional=true}
```