Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MNT: space_packet_parser update #1007

Merged
merged 31 commits into from
Oct 18, 2024

Conversation

tech3371
Copy link
Contributor

@tech3371 tech3371 commented Oct 10, 2024

Change Summary

closes #1008

Overview

This PR updates space_packet_parser to new release of space_packet_parser==5.0.0. Gavin helped fix bug and release patch version 5.0.1. This PR is large but very important to get this done early to clear roadblocks for all of us. Almost all instrument code had to change due to the new release.

This new release includes tons of breaking changes. Please see its release notes for more details. It's posted on xtce Slack channel as well. These changes effect us most:

  • Create only a definition from your XTCE document and instead of my_parser.generator, use my_packet_definition.packet_generator
  • The ParsedDataItem class has been removed and the derived values are being returned now.
    The raw_value is stored as an attribute on the returned object. (My clarification of what this means to us) - The returned "value" defaults to raw value if there is no different derived value but if there is derived value then it uses that as default value. Therefore, we need to specific value.raw_value if we do want raw value of variable that also has derived value. Please reach out to Gavin and me if you have question.
  • Replace bitstring objects with native Python bytes objects. (My clarification of how this effects us) - This means that it returns <class bytes> instead of binary string. Before it was returning binary string and many of our algorithm was written with that information. Now, we need to be aware of this new change. I tried to keep lot of your code same by adding below code blocks that converts bytes to binary string. We will need to revisit as needed. This allowed me to keep scope of this PR to only cdflib update.
            # TODO: improve this as needed
            science_values = "".join(f"{byte:08b}" for byte in science_values)
  • (This was not included in release note but important to us) Before, unpacked data can be accessed using value.header and value.data. In new release, it has changed to value.user_data instead of value.data.

Updated Files

  • updates every instruments files with changes mentioned above. I like request everyone to review this PR.

Testing

@tech3371 tech3371 added enhancement New feature or request CDF Related to CDF files labels Oct 10, 2024
@tech3371 tech3371 requested a review from a team October 10, 2024 17:05
@tech3371 tech3371 self-assigned this Oct 10, 2024
Copy link
Contributor

@maxinelasp maxinelasp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some questions about the types that are returned as part of CcsdsPacket....

value = (
item.derived_value if item.derived_value is not None else item.raw_value
)
for key, value in packet_header.items():
if key in attributes:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we still need to get value.raw_value here... won't value be of type packets.IntParameter instead of int?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understood the original if statement, it seems like we want to get derived value if it's available or else get raw value. Now with new changes, we get derived value if it's available or else we get raw value. so getting the value will do what the if statement was doing before, if I understood correctly. That's how I understood at least when Gavin explain in the Slack thread. should we double check to be safe?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you could just print out the type and validate that it's a python type, that's good enough for me

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

packets.IntParameter is a subclass of int

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in this case, are we good with current change?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, if it's a different type, that's going to mess with some code, right? Some of my code depends on the types being accurate because of MyPy

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess, if you make the change and all the tests pass, then it's probably fine... but I'm just worried about changing all the types out from underneath everyone like this, even if they're basically the same as the original types. But perhaps that is just Python offending my delicate, type-loving sensibilities again

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, then I think you can change the types to IntParameter now. But note that raw_value is not the same as derived_value which was being used before if it was present.

@@ -88,7 +88,7 @@ def add_metadata_to_array(packet: space_packet_parser, metadata_arrays: dict) ->
for key, value in packet.header.items():
metadata_arrays.setdefault(key, []).append(value.raw_value)

for key, value in packet.data.items():
for key, value in packet.user_data.items():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, are the types here correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we check with Gavin?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to get clarification from him

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't this be removed entirely because it is in packet_file_to_datasets() now? I thought codice was using that...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes. you are right Greg. I can delete this function. It's not used anymore. Sorry for including you to this PR by mistake.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To your question Maxine, data type of both header and user_data are dict from this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But what is inside the dicts? Python types? I did go look at that code but I'm still confused about the parameter types

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may be a Gavin question :)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

subclasses of built-in types.

An IntParameter is simply an int with a raw_value attribute attached to it. You should be able to use it exactly like an int, it just has an extra attribute to help you get at the raw_value if you really want it.

else item.raw_value
for item in packet.data.values()
]
values = [item for item in packet.user_data.values()]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
values = [item for item in packet.user_data.values()]
values = [item.raw_value for item in packet.user_data.values()]

Missed that one! Sorry about that

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment above. I think this shouldn't be raw_value based on the previous logic.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was giving me errors before because it was assigning everything as packet.IntParameter and messing with the types

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nvm, you did explain here.

event_number = packet.data["IDX__SCI0EVTNUM"].derived_value
if "IDX__SCI0TYPE" in packet.user_data:
scitype = packet.user_data["IDX__SCI0TYPE"].raw_value
event_number = packet.user_data["IDX__SCI0EVTNUM"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
event_number = packet.user_data["IDX__SCI0EVTNUM"]
event_number = packet.user_data["IDX__SCI0EVTNUM"].raw_value

?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The packet items are the derived value now. If there was no "derived" / "calibration" going on, then it is identical to the raw_value. So this should be left without the raw_value attribute.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is my understanding as well. That's why I removed .derived_value where it was used before.

@@ -312,9 +312,9 @@ def _set_impact_time(self, packet: space_packet_parser.parser.Packet) -> None:
testing.
"""
# Number of seconds since epoch (nominally the launch time)
seconds_since_launch = packet.data["SHCOARSE"].derived_value
seconds_since_launch = packet.user_data["SHCOARSE"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need .raw_value or .derived_value here? I don't fully understand when the _Parameter type is returned...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The _Parameter is the derived value. But, SHCOARSE is never derived it is just a counter, so it is the same as the raw value too.

imap_processing/utils.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@greglucas greglucas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should be able to remove almost all .header and .user_data attribute accessors and rely on just the packet[lookup_value] directly instead.

imap_processing/glows/l0/decom_glows.py Outdated Show resolved Hide resolved
else item.raw_value
for item in packet.data.values()
]
values = [item.raw_value for item in packet.user_data.values()]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before this was getting the derived value if it is not None. The if item.derived_value is not None else item.raw_value is taken care of for you by space_packet_parser now.

I'm pretty sure this is the equivalent to before now:

Suggested change
values = [item.raw_value for item in packet.user_data.values()]
values = list(packet.user_data.values())

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@maxinelasp what do you think here? I thought same as Greg but you helped me debug some of GLOWS' error and I saw that you had to get raw_value. Is that right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, when I printed out the types here it was all IntParameter (etc) instead of int.

Copy link
Collaborator

@greglucas greglucas Oct 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

class IntParameter(int):

IntParameter is a subclass of int so if you type it as int IntParameter should be acceptable as well? Alternatively, could you type it as IntParameter where this is needed?

item.raw_value is very different from item.

item.derived_value from v4.x is the same as item in v5.x

else item.raw_value
for item in packet.data.values()
]
values = [item for item in packet.user_data.values()]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment above. I think this shouldn't be raw_value based on the previous logic.

imap_processing/hit/l0/utils/hit_base.py Outdated Show resolved Hide resolved
imap_processing/idex/idex_l1a.py Outdated Show resolved Hide resolved
imap_processing/tests/glows/test_glows_decom.py Outdated Show resolved Hide resolved
imap_processing/ultra/l0/decom_ultra.py Outdated Show resolved Hide resolved
imap_processing/utils.py Outdated Show resolved Hide resolved
imap_processing/utils.py Outdated Show resolved Hide resolved
imap_processing/utils.py Outdated Show resolved Hide resolved
pyproject.toml Outdated Show resolved Hide resolved
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@laspsandoval After reverting some of " changes, I came across this error similar to what we saw couple days ago.

E                   ValueError: Unable to convert data to CDF format, data object cannot be of type string.

I did make changes to this file again since it seems like you did want your data in derived value. Please do revisit this in future PR if that's not what you want. It seems to involve more changes and I am worried I will mess something.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now I understand why it failed last time. It needed support_data type instead of metadata type since that's what it was. That's why I made that change in this PR.

@tech3371
Copy link
Contributor Author

pre-commit.ci autofix

imap_processing/hit/l0/decom_hit.py Outdated Show resolved Hide resolved
imap_processing/idex/idex_l1a.py Outdated Show resolved Hide resolved
imap_processing/idex/idex_l1a.py Outdated Show resolved Hide resolved
imap_processing/lo/l0/lo_science.py Outdated Show resolved Hide resolved
imap_processing/utils.py Outdated Show resolved Hide resolved
@tech3371 tech3371 merged commit 3fe722f into IMAP-Science-Operations-Center:dev Oct 18, 2024
17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CDF Related to CDF files enhancement New feature or request
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

Update space_packet_parser to version 5.0.0
4 participants