[BFT] Recoverable Random Beacon State Machine #6725
@AlexHentschel @jordanschalm I've made an iteration on your state machine representation which I think reduces complexity and makes reasoning a bit simpler. It has the same number of states (except I think yours has one extra recovery state, unless I missed something). Regarding DKG recovery, I have tried to make it explicit that before we can execute recovery we must have the previous epoch committed, and that it's not possible to commit a DKG and then enter the recovery state while staying in the same epoch.
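For illustration, here is a minimal sketch of that recovery precondition (all names below are hypothetical and only loosely mirror `flow.DKGEndState`; this is not flow-go's actual API):

```go
package dkgstate

import "errors"

// DKGState enumerates the per-epoch states of the sketched machine.
// Names are illustrative, not flow-go's actual constants.
type DKGState int

const (
	StateUninitialized DKGState = iota
	StateStarted
	StateSuccess // DKG completed and key committed
	StateFailure
	StateRecovered // key injected via epoch recovery
)

var ErrRecoveryNotAllowed = errors.New("recovery not allowed from current state")

// EnterRecovery models the constraints described above: recovery requires
// the previous epoch to be committed, and a committed DKG (StateSuccess)
// cannot transition into recovery within the same epoch.
func EnterRecovery(current DKGState, previousEpochCommitted bool) (DKGState, error) {
	if !previousEpochCommitted || current == StateSuccess {
		return current, ErrRecoveryNotAllowed
	}
	return StateRecovered, nil
}
```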
Very nice work, Yurii. I agree with all your revisions to the state machine: …
**Minor suggestions**

There are some minor thoughts / suggestions for extensions, which weren't part of my initial state machine sketch either ... my thinking has also evolved a little (maybe I just forgot some details already 😅 since reviewing your PR). (ii) wording below the …
**🤯 I would desire stronger isolation between the business logic of `ReactorEngine` and `BeaconKeyRecovery`**
I have spent multiple hours on the possible interactions of the `ReactorEngine` with the `BeaconKeyRecovery` and I am still very much worried about open edge cases.

**Aspects that possibly overlap across both components**
- `ReactorEngine` declares `flow.DKGEndStateSuccess` (… `SafeBeaconPrivateKeys`).
- Weak isolation between `ReactorEngine` and `BeaconKeyRecovery` despite them both accessing the same database fields (`ReactorEngine` acts upon `EpochCommittedPhaseStarted` while `BeaconKeyRecovery` listens to `EpochFallbackModeExited`, which are possibly emitted at the same time).

I have been digging into the code and my gut feeling is that it's correct. But the problem is that there is a whole set of hypothetical edge cases and scenarios where we have to confirm that things can't go wrong. I don't think there are any broken edge cases, but we also provide no exhaustive argument why. In my opinion, the logic is distributed over two asynchronous components with weak isolation modifying the same state, which makes it too time-intensive to affirm correctness just implicitly from the code.
**Thoughts / my mindset**
I finally understand now the critical role of the `DKGState` as an isolation layer. What I am struggling with is the precise specification of which state transitions are allowed.

**Concrete suggestion:** `DKGState` should enforce: … Let's review that and make sure we all agree.
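Since the concrete transition list from the original comment did not survive here, the following fragment only sketches a plausible shape for such an enforcement table, reusing the hypothetical `DKGState` constants from the sketch further above:

```go
// allowedTransitions makes the permitted state transitions explicit;
// the concrete edges below are an assumption, not flow-go's actual rules.
var allowedTransitions = map[DKGState][]DKGState{
	StateUninitialized: {StateStarted, StateRecovered},
	StateStarted:       {StateSuccess, StateFailure},
	StateFailure:       {StateRecovered},
	// StateSuccess and StateRecovered are terminal: no outgoing edges.
}

func transitionAllowed(from, to DKGState) bool {
	for _, next := range allowedTransitions[from] {
		if next == to {
			return true
		}
	}
	return false
}
```

Making the terminal states have no outgoing edges is what lets the storage layer, rather than the callers, detect repeated or conflicting writes.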
Currently, `DKGState` permits all those state transitions, but also a few more state transitions that shouldn't be allowed. For example, once the `flow.DKGEndState` reaches `Success` or `Recovered`, no further state changes should be accepted (we should silently swallow identical write requests in the spirit of building an information-driven system; only inconsistent changes of `MyBeaconPrivateKey` would be refused with an exception). At this point, a "safe read" of the key could already succeed and we can't change the value anymore without risking slashing. This should be documented and enforced in the `DKGState` (this removes interdependencies between `ReactorEngine` and `BeaconKeyRecovery` for correct behaviour).
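A minimal sketch of those terminal-state semantics, continuing the hypothetical types from the sketches above (`PrivateKey` is a stand-in for the real key type):

```go
import (
	"bytes"
	"errors"
)

// PrivateKey is a stand-in for the real beacon key type.
type PrivateKey []byte

var ErrInconsistentKey = errors.New("refusing to overwrite committed beacon key")

// KeyStore sketches the proposed terminal-state semantics for DKGState.
type KeyStore struct {
	state DKGState
	key   PrivateKey
}

// CommitMyBeaconPrivateKey silently swallows identical writes once a
// terminal state (Success or Recovered) is reached, and refuses any
// inconsistent change of the key with an error.
func (s *KeyStore) CommitMyBeaconPrivateKey(key PrivateKey, newState DKGState) error {
	if s.state == StateSuccess || s.state == StateRecovered {
		if bytes.Equal(s.key, key) {
			return nil // identical information: idempotent no-op
		}
		return ErrInconsistentKey
	}
	if !transitionAllowed(s.state, newState) {
		return errors.New("disallowed state transition")
	}
	s.key, s.state = key, newState
	return nil
}
```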
Furthermore, I would suggest renaming the `DKGState` implementation to `RecoverablePrivateBeaconKeyState`.
This PR is already large enough. I would appreciate it if we could create an issue and address this in a follow-up PR.
Originally posted by @AlexHentschel in #6424 (comment)