Error in arming the detector after beam dump #282

Open
DominicOram opened this issue Feb 19, 2024 · 5 comments · Fixed by DiamondLightSource/dodal#410

@DominicOram

We had a beam dump of ~1.5 hours here. Before the beam dump we had armed the Eiger, but when the beam came back the Eiger was no longer armed, and we then failed to re-arm it. Likely there is a timeout in odin/eiger that makes it stop after some time, but only partially, and we struggle to recover from this state. Potentially this is the same state as in DiamondLightSource/hyperion#1167?

Acceptance Criteria

  • Hyperion recovers and correctly re-arms the detector after a beam dump
rtuck99 self-assigned this Mar 4, 2024
rtuck99 removed their assignment Mar 13, 2024
@DominicOram

DominicOram commented Mar 26, 2024

  • First thing to do is work out exactly what state the Eiger was in. Can we see this from the logs? Or can we reproduce it by going for lunch? Write the state on this issue.
  • Add logic to check this state and re-arm if necessary, right after the beam check. Could we do this by just calling stage again, so that it re-arms if required?
  • Think of ways we can test this. Can we reproduce it by doing a disarm manually?
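
The second and third bullets could be sketched roughly as below. This is a minimal, self-contained Python illustration and not Hyperion or dodal code: `FakeEiger`, `is_armed`, `arm`, `disarm` and `ensure_armed_after_beam_check` are hypothetical names invented for the example, and the real device would be driven through bluesky plans rather than direct method calls.

```python
class FakeEiger:
    """Hypothetical stand-in for the detector; NOT the dodal EigerDetector."""

    def __init__(self) -> None:
        self._armed = False
        self.arm_calls = 0  # counts how often arming was actually performed

    def is_armed(self) -> bool:
        return self._armed

    def arm(self) -> None:
        self.arm_calls += 1
        self._armed = True

    def disarm(self) -> None:
        # Simulates a manual disarm, or the detector stopping during a beam dump.
        self._armed = False

    def stage(self) -> None:
        # Idempotent: only arms if the detector is not already armed.
        if not self.is_armed():
            self.arm()


def ensure_armed_after_beam_check(detector: FakeEiger) -> bool:
    """Stage again after the beam check; return True if a re-arm was needed."""
    needed_rearm = not detector.is_armed()
    detector.stage()
    return needed_rearm


if __name__ == "__main__":
    eiger = FakeEiger()
    eiger.stage()    # initial arm before the wait for beam
    eiger.disarm()   # reproduce the problem by disarming manually
    print(ensure_armed_after_beam_check(eiger), eiger.is_armed())
```

Because `stage` is idempotent in this sketch, calling it unconditionally after the beam check costs nothing when the detector is still armed, and the manual `disarm` gives a cheap way to exercise the re-arm path in a test.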

@DominicOram

We think it's a Medium

rtuck99 self-assigned this Apr 2, 2024
@rtuck99

rtuck99 commented Apr 2, 2024

From what I can determine from the logs, the following is the sequence of events (against hyperion 8.9.0)

Sequence of events:
10:49:15: Hyperion call from GDA starts
eiger.do_arm set called from start_preparing_data_collection_then_do_plan()...

10:49:16.630: set_detector_z_position executed
10:49:17: in pin_tip_centre_plan()
10:49:19.415: _wait_for_odin_status() called in EigerDetector
10:49:22.063: _wait_fan_ready() called in EigerDetector
10:49:23.281: _finish_arm() called in EigerDetector

10:49:24.123: reaches log message "Setting aperture position..." in detect_grid_and_do_gridscan()
Call set_detector_parameters. use_panda unset
10:49:24.595: Generate and log FGS parameters
10:49:24.597: Starts to enter run_gridscan_and_move_and_tidy
10:49:24.721: Gets start document for ZocaloCallback
10:49:24.762: Hyperion sits waiting for XBPM stable in _check_and_pause_feedback

<...> beam dump

12:09:26.803 beam restored, various callbacks pre-collection
12:09:27.767: run_gridscan - waiting for read_for_data_collection group (arming to finish)
bps.stage

arming_status.done == True
not armed - either fan not ready or cam not acquired?
12:09:41.333: Failed to get odin metadata when re-arming the eiger

As far as re-arming the detector goes, it seems that at this point we already re-attempt to arm the Eiger synchronously when the device is staged. However, for whatever reason, the re-arming failed. We never reach cam acquire, so it seems that some other state needs to be tidied up first. Arming completed successfully the first time, so if the device isn't armed when we reach staging, we should assume that the Eiger is not in a consistent state and needs to be reset (somehow?).
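
The "reset before re-arming" idea can be sketched as follows. This is a toy model under stated assumptions, not the actual fix: the class, the `stale` flag, and the `reset` method are all hypothetical, and what a real reset of the Eiger/odin would involve is exactly the open question above.

```python
class ArmingError(RuntimeError):
    """Raised by the toy detector when arming is blocked (hypothetical)."""


class FakeEigerWithStaleState:
    """Toy model: a detector that can be left half-stopped after a long wait."""

    def __init__(self) -> None:
        self.armed = False
        self.was_armed = False  # True once an arm has ever completed
        self.stale = False      # hypothetical leftover state that blocks arming
        self.log = []           # records the operations performed, in order

    def time_out(self) -> None:
        # Simulates the detector partly stopping itself during the beam dump.
        self.armed = False
        self.stale = True

    def reset(self) -> None:
        # Placeholder for whatever tidy-up the real detector would need.
        self.log.append("reset")
        self.stale = False

    def arm(self) -> None:
        if self.stale:
            raise ArmingError("arming blocked by stale state (toy model)")
        self.log.append("arm")
        self.armed = True
        self.was_armed = True

    def stage(self) -> None:
        if self.armed:
            return
        # Armed earlier but not armed now: assume inconsistent state, reset first.
        if self.was_armed:
            self.reset()
        self.arm()


if __name__ == "__main__":
    eiger = FakeEigerWithStaleState()
    eiger.stage()     # first arm succeeds
    eiger.time_out()  # detector drops out of the armed state
    eiger.stage()     # resets, then re-arms
    print(eiger.log)
```

The design choice here is to treat "previously armed, now unarmed" as the signal for an inconsistent state, rather than trying to detect the specific failure, since the logs do not show which state the detector was actually in.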

@DominicOram

Test this once DiamondLightSource/hyperion#1465 is done.

@rtuck99

rtuck99 commented Jul 17, 2024

Attempted to reproduce the issue on the beamline using the injection flag; however, this did not reproduce the problem and the gridscan proceeded normally. Also could not reproduce it by setting the feedback threshold to 0 to force a wait: after a delay of more than 10 minutes there was no ill effect (although, strangely, GDA itself did appear to time out even though Hyperion was fine).

However, the code has been running with the fix for some time now, so I will raise a PR to remove the feature flag and make the fix permanent.
