ADR-004 proposal on consumer expiration #712

albttx · 2023-02-06T18:00:20Z

Description

Proposal to set an expiration date on the consumer chain updated on each VSC Packet received to ensure the chain is up-to-date.

Type of change

Docs: Adds documentation

albttx · 2023-02-06T18:00:47Z

I set ADR-003 because it came after #711

shaspitz · 2023-02-06T21:55:59Z

I set ADR-003 because it came after #711

See #711 (comment), lets change the ADR number to 4 for this one, thanks!

Co-authored-by: Thomas Bruyelle <[email protected]>

albttx · 2023-02-07T15:55:13Z

There is a PR of the code on my fork albttx#1

Once the ADR passed, i'll create the PR here on this repo

jtremback · 2023-02-09T21:39:31Z

Hey @albttx very interested in the potential this has to make things much simpler, but we'll probably table this for now and pick it back up after launch.

docs/architecture/adr-003-consumer-expiration

mpoke · 2023-03-08T11:58:16Z

@albttx Thank for the contribution. I have some concerns with the proposed solution.

Benefits of the existing protocol:

The default action is to have the collateral (that enables validators to produce blocks) locked. This collateral is unlocked in specific cases that we can clearly reason about.
Consumer chains have their own unbonding period, which doesn't depend on the provider unbonding period. This simplifies how IBC clients to the consumers are updated: Consumer chains are treated just like other chains.
The current ICS protocol is asynchronous. As a result, the assumptions concerning time delays are minimized.
The current ICS protocol is very robust. Given that ICS is crucial infrastructure in Cosmos, it's important to have a very secured protocol.

All these properties will be removed by the proposed solution.

The default action would be to have the collateral unlocked after the provider unbonding elapses, without any information of the state of consumer chains. This means that we would need to reason about the potential states consumer chains can be in. I find it hard to be confident that all relevant states were considered.
Consumer chains would not have their own unbonding period. IBC clients to these chains (even clients on other chains) would need to query the provider unbonding period and then estimate a lower bound on the consumer unbonding. This complicates the interaction with consumer chains.
More assumptions on the synchrony between the provider and consumers are introduced, which will clearly reduce the robustness of the protocol.

IMO, the only benefits of the proposed solutions are:

Simplify the protocol, which I agree that in general is good, but not at the cost of robustness. Plus, I do believe that the extra logic needed to deal with edge-cases of the proposed solution will result in a much more complex protocol than the one proposed.
Reduce the of packets that need to be relayed by a factor of 2x. This can be solved much easier by just introducing epochs which will have a much bigger impact on the relaying cost. For example, sending VSCPackets every 10m, with reduce the number of packets by a factor of ~100x.

cc @josef-widder @AdityaSripal @jtremback

jtremback · 2023-03-09T01:53:19Z

@mpoke All of your points sounds good on their own, but allow me to introduce some counterpoints-

We cannot actually lock collateral for much longer than the provider chain's own unbonding period, it is not tenable from a financial or PR perspective to have people's money locked up with no end in sight. For this reason, we will be setting consumer chain unbonding periods to shorter than the provider's unbonding. We also introduced a timeout, so that if a consumer chain does not acknowledge VSCs for somewhat longer than the provider's unbonding period, that consumer chain is removed from the provider.

Let's briefly consider not what is described in this ADR but a hypothetical alternative where we just don't care about the consumer's unbonding period at all. In this hypothetical protocol VSCs are sent to the consumer and are not acknowledged. The only packets sent from the consumer to the provider would be slashing/jailing packets. Assuming perfect and synchronous transmission of VSCs to the consumer, this protocol would have no problem since the validator sets would always be perfectly in sync.

But in the real world, transmission over IBC is not perfect and synchronous. There are two possible sources of asynchrony that I am aware of- clock drift and IBC relaying delays.

First, let's look at clock drift. The clocks on two IBC connected chains cannot drift more than a few seconds (or minutes? I don't remember) apart before the IBC connection breaks. So I wouldn't consider clock drift to be significant. We should probably confirm this more rigorously, but I'm pretty certain that we can set an upper bound on clock drift to be a few minutes, or even an hour if we want to be extremely generous. But I do concede that this may need more thought.

Next, let's look at relaying delay. The delay we are concerned about is a delay in relaying a VSC from the provider to the consumer. If a validator begins unbonding on the provider, but the VSC containing this change is delayed, then they may double sign on the consumer during this delay. After this double signing, there will be less time on the provider to submit the equivocation evidence. For example, if there is a relaying delay of one week, and then a validator double signs on the consumer, they will already be through one week of their unbonding period on the provider. Then there will only be two weeks remaining to submit the evidence.

Stated simply, VSC relaying delays reduce the effective unbonding period of the consumer by the duration of the delay.

This is why this ADR makes sense. The amount of time it takes for the consumer chain to stop after it has stopped receiving VSCs is an upper bound on how much its unbonding period can effectively be reduced by relaying delays.

The benefit of simplifying the protocol is a big one. It will make future development much faster and is something to be strived for. You're talking about edge cases, but you haven't named any. Even one surprising edge case would convince me that this change would indeed not simplify the protocol. But as it is, it currently seems simpler and perhaps superior to our current protocol.

jtremback · 2023-03-09T21:58:27Z

In conclusion, I guess I have a few concrete questions-

Is it true that VSC relaying delays reduce the effective unbonding period of the consumer by the duration of the delay?
Does clock drift have the same effect or is it more complicated?
Can we disregard clock drift because IBC itself cannot work in the presence of clock drift anyway?
What are some of the edge cases you are thinking of?

josef-widder · 2023-03-09T22:39:51Z

There is another argument for the asynchronous explicit mature packet design (compared to the synchronous design with additional assumptions on delays and relative clock drifts etc). To the best of my understanding, the mature packet design it is aligned with incentives. If a validator wants to unbond, it is in its interest

to behave nicely on the consumer chain
keep the channel open and relay the mature packet to get its stake back

If you don't need an explicit acknowledgement to get the stake back, this alignment is gone. This may lead to one possible edge case is that some +1/3 of the validators freeze the channel on the provider side by a lightclient attack. At this point we need to figure out what to do about the consumer chain. Even if we slash the +1/3, the consumer chain keeps running and is not secured.

In addition, every synchronous design based on assumption on delays (e.g., packet delays)

either breaks if the assumption is violated at run-time
needs to have machinery in place to check the assumption and go to a safe mode when the assumption is broken (edge cases)

More generally, I think that discussions about the possible solution are a bit tricky without understanding what are the precise properties we need to ensure under what assumptions. @mpoke has written about the benefits of this design. I think they are all undisputed. I added the alignment of the incentives here as additional benefit. If some of the benefits are not needed, we should identify which. Then I think we can have a better discussion about alternative solutions.

mpoke · 2023-03-23T17:22:28Z

As a side note, to avoid the delays stacking up, each VSCPacket should have the timestamp of when the provider sent it. Under the assumption that the clock drift is bounded, the consumer can use this timestamp to decide whether to accept or not the VSCPacket. This would be similar to the IBC timeout timestamp.

mpoke · 2023-05-05T10:23:18Z

Converting to draft as more discussion is needed.

jaekwon · 2023-12-04T20:52:46Z

docs/architecture/adr-003-consumer-expiration

+
+This time must be:
+
+- Shorter than 1/2 of the provider unbonding period.


why does it have to be less than 1/2?

To ensure someone don't have the time to unbond on the provider chain and harm the consumer chain then, if there is no VSC package.

jaekwon · 2023-12-04T21:12:46Z

The intent is to halt the chain until the relevant updates come from the provider chain, so that operations on provider chains can be timely. In that sense, 8 days might be quite a while. And this is mostly about halting logic on the consumer side, but the unbonding considerations came later. I agree that it needs to be specified out first, but what is clear to me is that consumer chains will need to halt if it hasn't heard an update from the hub in a while because other things will depend on synchronicity, including the app itself.

Arguably it should be a per-app configuration, but where the default (maximum) allowable delay is set by the hub.
But if the hub doesn't have a constitution that describes the spec and rails for safety, then governance can become a threat to safety as well.

mpoke · 2024-08-06T09:30:31Z

@albttx Closing in favor of https://github.com/cosmos/interchain-security/blob/main/docs/docs/adrs/adr-018-remove-vscmatured.md. The ADR was partially implemented by #2098 (the provider-side changes).

albttx requested review from jtremback, mpoke, sainoe, shaspitz, MSalopek, mmulji-ic, glnro and yaruwangway as code owners February 6, 2023 18:00

albttx mentioned this pull request Feb 6, 2023

ADR-004: Consumer expiration albttx/interchain-security#1

Open

5 tasks

feat: ADR proposal on consumer expiration

a6e6cb5

Co-authored-by: Thomas Bruyelle <[email protected]>

albttx force-pushed the adr/vsc-expiration branch from c388ba7 to a6e6cb5 Compare February 7, 2023 10:04

yaruwangway reviewed Feb 10, 2023

View reviewed changes

docs/architecture/adr-003-consumer-expiration Outdated Show resolved Hide resolved

docs/architecture/adr-003-consumer-expiration Outdated Show resolved Hide resolved

docs/architecture/adr-003-consumer-expiration Outdated Show resolved Hide resolved

fix: typo

f912faa

albttx changed the title ~~feat: ADR proposal on consumer expiration~~ ADR-004 proposal on consumer expiration Feb 13, 2023

mpoke marked this pull request as draft May 5, 2023 10:23

jaekwon reviewed Dec 4, 2023

View reviewed changes

tbruyelle mentioned this pull request Dec 7, 2023

Define ICS1.5 atomone-hub/genesis#66

Open

mpoke closed this Aug 6, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ADR-004 proposal on consumer expiration #712

ADR-004 proposal on consumer expiration #712

albttx commented Feb 6, 2023

albttx commented Feb 6, 2023

shaspitz commented Feb 6, 2023

albttx commented Feb 7, 2023

jtremback commented Feb 9, 2023

mpoke commented Mar 8, 2023 •

edited

Loading

jtremback commented Mar 9, 2023

jtremback commented Mar 9, 2023

josef-widder commented Mar 9, 2023

mpoke commented Mar 23, 2023

mpoke commented May 5, 2023

jaekwon Dec 4, 2023

albttx Dec 5, 2023

jaekwon commented Dec 4, 2023

mpoke commented Aug 6, 2024


		This time must be:

		- Shorter than 1/2 of the provider unbonding period.

ADR-004 proposal on consumer expiration #712

ADR-004 proposal on consumer expiration #712

Conversation

albttx commented Feb 6, 2023

Description

Type of change

albttx commented Feb 6, 2023

shaspitz commented Feb 6, 2023

albttx commented Feb 7, 2023

jtremback commented Feb 9, 2023

mpoke commented Mar 8, 2023 • edited Loading

jtremback commented Mar 9, 2023

jtremback commented Mar 9, 2023

josef-widder commented Mar 9, 2023

mpoke commented Mar 23, 2023

mpoke commented May 5, 2023

jaekwon Dec 4, 2023

Choose a reason for hiding this comment

albttx Dec 5, 2023

Choose a reason for hiding this comment

jaekwon commented Dec 4, 2023

mpoke commented Aug 6, 2024

mpoke commented Mar 8, 2023 •

edited

Loading