This repository has been archived by the owner on Sep 2, 2024. It is now read-only.

Failed smargon move on pin tip centre causes detector not to clean up [Timebox: 2 days] #1167

Closed
DominicOram opened this issue Feb 19, 2024 · 5 comments · Fixed by DiamondLightSource/dodal#403

@DominicOram
Collaborator

DominicOram commented Feb 19, 2024

We had a hardware issue where the smargon failed to move (https://graylog2.diamond.ac.uk/messages/logs_8656/56894842-ce7f-11ee-8ab3-1866dafae904). This left the eiger in a bad state: the next collection (https://ispyb.diamond.ac.uk/dc/visit/cm37235-1/dcg/11281776) failed to arm.

Note: If we don't make any progress within the 2 days, we should instead propose converting the eiger to ophyd-async next sprint.

Acceptance Criteria

  • When the pin tip centre fails like this, the next collection is still able to run smoothly
@DominicOram DominicOram added the needed_for_release label (Issues that must be complete before the next release) Feb 27, 2024
@DominicOram DominicOram self-assigned this Feb 29, 2024
@DominicOram
Collaborator Author

DominicOram commented Mar 1, 2024

Tried to reproduce by putting:

    global TEST_FAIL
    if TEST_FAIL:
        TEST_FAIL = False
        raise Exception()

in move_smargon_warn_on_out_of_range, but couldn't reproduce the failure. I think this might only occur when the exception is raised at a specific point during the arming procedure.
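
If the failure is timing-dependent, a one-shot injector that can be pointed at different stages may be more useful than a single global flag. The following is only a sketch under that assumption: `OneShotFailure`, `STAGES` and `run_once` are hypothetical names for illustration and are not part of hyperion or dodal.

```python
class OneShotFailure:
    """Raise once when a chosen stage is reached, then stay quiet."""

    def __init__(self, fail_at: str):
        self.fail_at = fail_at
        self.armed = True

    def check(self, stage: str) -> None:
        # Fire only the first time the chosen stage is reached.
        if self.armed and stage == self.fail_at:
            self.armed = False
            raise RuntimeError(f"injected failure at stage {stage!r}")


# Hypothetical stage names, not taken from the real plan.
STAGES = ["before_arm", "during_arm", "after_arm"]


def run_once(injector: OneShotFailure) -> list[str]:
    """Walk the stages and return the ones completed before any failure."""
    reached = []
    try:
        for stage in STAGES:
            injector.check(stage)
            reached.append(stage)
    except RuntimeError:
        pass
    return reached
```

Running the same scenario once per stage name would show whether the bad state depends on where the exception lands.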

@DominicOram DominicOram removed the needed_for_release label (Issues that must be complete before the next release) Mar 8, 2024
@DominicOram DominicOram removed their assignment Mar 8, 2024
@DominicOram
Collaborator Author

We need to test calling eiger.stop() at various points in the arming chain and confirm it tidies up. Also, with more logging we might be able to find out where in the chain we were when we got the error.
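
A rough sketch of what such a test could look like, using a stand-in detector rather than the real eiger device. Everything here is an assumption for illustration: `FakeDetector`, the step names in `ARMING_STEPS`, and `stop_leaves_clean_state` are hypothetical and do not mirror the actual dodal arming chain.

```python
import logging

LOGGER = logging.getLogger("arming_sketch")

# Hypothetical arming steps, in order; the real chain will differ.
ARMING_STEPS = ["configure", "start_writer", "arm", "wait_ready"]


class FakeDetector:
    """Stand-in detector that tracks the state stop() must tidy up."""

    def __init__(self):
        self.completed = []
        self.writer_running = False
        self.armed = False

    def arm(self, interrupt_before=None):
        for step in ARMING_STEPS:
            if step == interrupt_before:
                # Log where in the chain the interruption happened.
                LOGGER.info("interrupted before step %s", step)
                raise RuntimeError(f"interrupted before {step}")
            LOGGER.info("running step %s", step)
            if step == "start_writer":
                self.writer_running = True
            elif step == "arm":
                self.armed = True
            self.completed.append(step)

    def stop(self):
        # Cleanup must be valid no matter how far arming got.
        self.armed = False
        self.writer_running = False

    def is_clean(self) -> bool:
        return not self.armed and not self.writer_running


def stop_leaves_clean_state(interrupt_before) -> bool:
    """Interrupt arming at one point, call stop(), report the result."""
    detector = FakeDetector()
    try:
        detector.arm(interrupt_before=interrupt_before)
    except RuntimeError:
        pass
    detector.stop()
    return detector.is_clean()
```

Iterating `stop_leaves_clean_state` over every entry in `ARMING_STEPS` (plus `None` for an uninterrupted run) covers each interruption point, and the log lines record the last step reached.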

@DominicOram DominicOram changed the title from "Failed smargon move on pin tip centre causes detector not to clean up" to "Failed smargon move on pin tip centre causes detector not to clean up [Timebox: 2 days]" Mar 11, 2024
@olliesilvester
Contributor

eiger.stop() does not wait on odin.stop(), although it looks like it should. It's hard to know whether this is causing the issue without some beamline testing.
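
To illustrate the difference between firing a stop and waiting for it, here is a minimal sketch with stand-in classes built on `threading` only. `FakeOdin` and `FakeEiger` are hypothetical and are not the dodal devices; the 0.2 s delay is an arbitrary assumption standing in for the time the writer takes to shut down.

```python
import threading


class FakeOdin:
    """Stand-in writer whose stop() completes asynchronously."""

    def __init__(self, delay: float = 0.2):
        self.delay = delay
        self.stopped = threading.Event()

    def stop(self) -> threading.Event:
        # Completion is signalled later, from a timer thread.
        threading.Timer(self.delay, self.stopped.set).start()
        return self.stopped


class FakeEiger:
    def __init__(self, odin: FakeOdin):
        self.odin = odin

    def stop_without_waiting(self) -> bool:
        # Requests the stop and returns straight away; the writer
        # may still be shutting down when the caller carries on.
        self.odin.stop()
        return self.odin.stopped.is_set()

    def stop_and_wait(self, timeout: float = 2.0) -> bool:
        # Blocks until the writer reports it has stopped (or times out).
        done = self.odin.stop()
        return done.wait(timeout)
```

In the first variant the next arm could start while the writer is still stopping; in the second the caller only proceeds once the stop has finished or the timeout expires.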

@rtuck99
Contributor

rtuck99 commented Apr 3, 2024

This looks to fail at the same point as DiamondLightSource/mx-bluesky#282, with the same message:

XXX - [2024-02-18 17:29:07,159] ophyd status ERROR: SubscriptionStatus(device=eiger_odin_meta_ready, done=True, success=False) encountered an error during _handle_failure()
Traceback (most recent call last):
  File "/dls_sw/i03/software/bluesky/hyperion_v8.9.0/hyperion/.venv/lib/python3.10/site-packages/ophyd/status.py", line 260, in _run_callbacks
    self._handle_failure()
  File "/dls_sw/i03/software/bluesky/hyperion_v8.9.0/hyperion/.venv/lib/python3.10/site-packages/ophyd/status.py", line 760, in _handle_failure
    return super()._handle_failure()
  File "/dls_sw/i03/software/bluesky/hyperion_v8.9.0/hyperion/.venv/lib/python3.10/site-packages/ophyd/status.py", line 654, in _handle_failure
    self.device.stop()
AttributeError: 'EpicsSignalRO' object has no attribute 'stop'

@DominicOram
Collaborator Author

This looks to fail at the same point as DiamondLightSource/mx-bluesky#282 with the same message

I believe this error happens after the initial failure, while it's trying to clean up. See DiamondLightSource/dodal#294.
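
The traceback shows the failure handler calling `self.device.stop()` on an `EpicsSignalRO`, which has no `stop` attribute. The sketch below reproduces that shape with stand-in classes only; `ReadOnlySignal`, `Motor` and both handler functions are hypothetical, and the guarded variant is just one possible way to avoid the secondary error, not necessarily what dodal#294 does.

```python
class ReadOnlySignal:
    """Stand-in for a read-only signal: it has no stop() method."""


class Motor:
    """Stand-in for a device that can be stopped."""

    def __init__(self):
        self.stopped = False

    def stop(self):
        self.stopped = True


def handle_failure_unguarded(device) -> None:
    # Same shape as the call in the traceback: raises AttributeError
    # if the device has no stop(), masking the original failure.
    device.stop()


def handle_failure_guarded(device) -> bool:
    # Only call stop() when the device provides it.
    stop = getattr(device, "stop", None)
    if callable(stop):
        stop()
        return True
    return False
```

With the unguarded handler, a status attached to a read-only signal turns one failure into two, which makes the log harder to read when hunting for the original cause.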
