Summary
The Casper Labs engineering team requests that the Max TTL setting for mainnet be reduced from 24 hours to 18 hours. This change would be proposed as part of the 1.5.x release. The changes in 1.5 that enable new nodes to join from the tip of the chain exacerbate the pressure that a large maximum TTL setting places on the nodes and the network at large. Nodes must retain all deploys in the system that have an unexpired TTL. Since nodes can join much faster in 1.5, the pressure on the network and the nodes increases with a longer TTL. Reducing this maximum TTL setting will relieve the strain on the nodes and the network in the following ways:
Background
Definitions
Creation of a Deploy
Sending the Deploy to a Network
Such a deploy may then be sent to one or more nodes on the target network with one or more signatures attached. Each receiving node validates the deploy and enforces various rules (including a dead-on-arrival check):
If the deploy passes all of these checks, it is accepted; the node then schedules that deploy to be gossiped to the rest of the network.
Inclusion in a Block
When a given validator is selected as the leader of a round and proposes a block, the validator’s node produces a set of 0 or more deploys that it has buffered for inclusion and sends that proposal onward for consideration / consensus. The protocol dictates various rules about what does and does not constitute a valid proposed block;
Clearing up a common TTL misconception
All versions of the Casper node software released to date use a FIFO (first in, first out) scheme. A leader node proposes the deploys it has received, whether directly or via the gossip mechanism, in the order it received them. The TTL setting of a deploy does not indicate a deferment of execution; setting a deploy to have a TTL of 24 hours and sending it to the network immediately does not result in the deploy being buffered for 24 hours and then included in a proposed block. Rather, if a deploy were created with a 24-hour TTL, held for 23+ hours, and then signed and submitted to the network, it would still be a viable deploy, assuming all other validity checks pass. It would be gossiped and buffered, and would be eligible for inclusion in a proposed block right up until its TTL expired. Similarly, in a multisig scenario, if a deploy were created with a 24-hour TTL and signed by the initiating entity, then sent onward off-chain to one or more other signing entities, and the final signer submitted the deploy to one or more nodes on the network, it would remain viable as described previously, provided the TTL had not yet expired.
Impact of Maximum TTL on a Network
The maximum TTL setting of a given Casper network is a load-bearing chainspec setting with multiple implications. Recall that the node must perform validity checks on deploys before including them in a block. Deploys expire only when their TTL has elapsed. Therefore, as the max TTL increases, so does the potential memory pressure on the deploy buffer and the overhead involved in enforcing the deploy replay protection rules. Beyond the increased pressure on the nodes under normal conditions, this is also a scaling challenge, as longer durations eventually present a resource exhaustion attack vector.
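To make the TTL semantics and the buffer pressure described above concrete, here is a minimal Python sketch. It is not taken from the node software: the function names, the strict "less than" boundary at expiry, and the rate of 10 deploys per second are all assumptions chosen for illustration, and `now` stands for whatever time the node compares against.

```python
from datetime import datetime, timedelta, timezone


def is_viable(deploy_timestamp: datetime, ttl: timedelta, now: datetime) -> bool:
    """A deploy stays eligible until deploy_timestamp + ttl has passed.

    The TTL is measured from the deploy's own timestamp, not from the
    moment it is submitted, so time spent off-chain counts against it.
    (Treating the exact expiry instant as expired is an assumption.)
    """
    return now < deploy_timestamp + ttl


def max_buffered_deploys(deploys_per_second: float, max_ttl: timedelta) -> int:
    """Rough upper bound on unexpired deploys a node may need to track."""
    return int(deploys_per_second * max_ttl.total_seconds())


created = datetime(2023, 1, 1, 0, 0, tzinfo=timezone.utc)
ttl = timedelta(hours=24)

# Held off-chain for 23.5 hours before submission: still viable.
print(is_viable(created, ttl, created + timedelta(hours=23, minutes=30)))  # True
# Submitted one minute after the TTL elapsed: no longer viable.
print(is_viable(created, ttl, created + timedelta(hours=24, minutes=1)))  # False

# Hypothetical sustained rate of 10 deploys per second (illustrative only).
print(max_buffered_deploys(10, timedelta(hours=24)))  # 864000
print(max_buffered_deploys(10, timedelta(hours=18)))  # 648000
```

Under these assumptions the bound scales linearly with the maximum TTL, so moving from 24 to 18 hours removes a quarter of the worst-case tracking burden at any given deploy rate.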
In addition, new nodes cannot become validators and participate in consensus until they can perform deploy validation for new deploys. This means that every newly joining node must have all the deploys with an unexpired TTL (up to the maximum TTL), and a contiguous segment of complete blocks for the same time period. Without this information, the node cannot validate blocks or propose new blocks.
Changes coming with 1.5
The 1.5 version of the protocol joins new nodes at the tip of the network, rather than starting them at genesis and forcing them to grind forward for weeks to catch up to the current state of the chain before they can participate in the network. Nodes now join at the tip of the chain and are able to follow along and act as participating nodes relatively quickly. Such nodes do not have all blocks on the chain; rather, they start filling in a tip-centric sequence of the chain as they go. They naturally obtain new blocks as these come into existence, simply by participating.
In 1.4.x and prior versions, every node (validator or not) was required to have all data of all blocks from genesis onward. In 1.5 and onward, this requirement no longer applies. Nodes may still opt to attempt to acquire all historical data back to genesis, but a new option called SyncToTTL is offered which allows nodes to fully participate in the network (including becoming a validator) if they at least have a continuous chain of complete blocks covering the time window from now back to the network’s maximum TTL setting (actually implemented with a small additional safety margin to avoid fencepost concerns).
The 1.5 version of the node also provides a mechanism for such nodes to acquire historical data for past blocks from other nodes on the network that have it. However, all such work, on both the requesting node and the responding nodes, is prioritized below all essential functions of the network.
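The SyncToTTL requirement described above (a continuous chain of complete blocks reaching back one maximum TTL plus a safety margin) can be sketched as a simple coverage check. This is an illustrative Python model, not the node's implementation: the `Block` type, the `covers_ttl_window` function, the 5-minute margin, and the assumption that the highest held block is the chain tip are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Block:
    height: int
    timestamp: datetime
    complete: bool  # True when the node holds all data for this block


def covers_ttl_window(blocks, now, max_ttl, margin=timedelta(minutes=5)):
    """Return True if a gap-free run of complete blocks, walking back from
    the highest held block, reaches now - (max_ttl + margin) or genesis.

    Conservative by design: the boundary block must itself be complete.
    """
    window_start = now - (max_ttl + margin)
    previous_height = None
    for block in sorted(blocks, key=lambda b: b.height, reverse=True):
        if not block.complete:
            return False
        if previous_height is not None and block.height != previous_height - 1:
            return False  # gap in the held segment
        if block.timestamp <= window_start or block.height == 0:
            return True
        previous_height = block.height
    return False  # ran out of blocks before reaching the window start


now = datetime(2023, 1, 2, 0, 0, tzinfo=timezone.utc)
# Hypothetical node holding 21 complete hourly blocks ending at the tip.
held = [Block(125 - i, now - timedelta(hours=i), True) for i in range(21)]

print(covers_ttl_window(held, now, timedelta(hours=24)))  # False
print(covers_ttl_window(held, now, timedelta(hours=18)))  # True
```

In this toy scenario the same 20 hours of held blocks are insufficient for a 24-hour maximum TTL but sufficient for an 18-hour one, which is the effect the proposal relies on.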
On a network that has sufficient capacity, filling in historical blocks happens semi-constantly in the background as cycles are available. On a busy network, it happens sporadically on an as-able basis. For nodes that have no intent to validate, there is no particular urgency to this eventually consistent process. However, a node that intends to participate in the validation process must acquire sufficient data to cover the time window determined by the max TTL of that network.
There are two ways to accomplish this. The first is the above-mentioned historical data acquisition process, which will eventually fill in this data, subject to that network’s available cycles. The other is simply to run the node for a length of time equal to that time window; i.e. if the max TTL of the network is 24 hours, then a node that has been joined to the network and following new blocks for 24 hours will naturally acquire a contiguous segment of complete blocks. These two approaches are not mutually exclusive; in the best case such a node will acquire the necessary state relatively quickly via the historical synchronization process, and in the worst case it will acquire the necessary state as the time window advances, until the applicable period has been directly observed. Either way, the longer the maximum TTL setting of the network, the larger the burden of work and the more time it takes to reach the desired state of being able to enforce the deploy replay protection rules.
As mentioned, the current maximum TTL setting in mainnet is 24 hours, and the software is designed and tested against that value. However, as illustrated, a shorter maximum TTL reduces system overhead and reduces the burden on both newly joining nodes and the other nodes servicing their requests for sufficient historical data to satisfy TTL awareness. It also shortens the worst-case time frame for a node attempting to enter the validation process.
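The worst case described above, where a node obtains no help from historical sync and must directly observe the whole window, reduces to simple arithmetic. The sketch below is illustrative only: `worst_case_wait` is a hypothetical helper, and the small safety margin mentioned earlier is ignored.

```python
from datetime import timedelta


def worst_case_wait(max_ttl, already_covered=timedelta(0)):
    """Time a node must keep following the tip before the window it
    holds spans max_ttl, given how much of that window it already has."""
    return max(max_ttl - already_covered, timedelta(0))


current = timedelta(hours=24)
for hours in (24, 18, 12):
    candidate = timedelta(hours=hours)
    wait_hours = worst_case_wait(candidate) / timedelta(hours=1)
    reduction = 1 - candidate / current
    print(f"max TTL {hours}h: worst-case wait {wait_hours:.0f}h, "
          f"{reduction:.0%} shorter than with 24h")
```

Any data obtained through historical synchronization shortens the wait further, which `already_covered` models; for example, a node that has already filled in 6 hours of an 18-hour window needs at most 12 more hours.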
Thus, the recommendation is to shorten the maximum TTL in mainnet to 12 or 18 hours as part of the 1.5 release. This would have no effect on the large majority of users of the chain. However, some entities that have batch processing middleware and/or multisig processes may be negatively impacted, particularly in multisig scenarios where signatories are geographically distant or otherwise temporally asynchronous and need time to collect sufficient signatures.
Analysis of On-Chain Transactions
An analysis of mainnet reveals that over 99% of deploys are included within 2 hours of the timestamp of the deploy. A chart with the raw data is available here.
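For readers who want to reproduce this kind of analysis on their own data, the following Python sketch shows one way to compute the share of deploys included within a given delay. The function name and the four sample pairs are made up for illustration; they are placeholders, not mainnet data, and the figure they produce says nothing about the real chain.

```python
from datetime import datetime, timedelta, timezone


def fraction_included_within(samples, threshold):
    """samples: (deploy_timestamp, block_timestamp) pairs.

    Returns the fraction whose inclusion delay is at most `threshold`.
    """
    delays = [block_ts - deploy_ts for deploy_ts, block_ts in samples]
    if not delays:
        raise ValueError("no samples to analyse")
    return sum(delay <= threshold for delay in delays) / len(delays)


t0 = datetime(2023, 1, 1, 12, 0, tzinfo=timezone.utc)
# Placeholder pairs only; substitute timestamps queried from a node.
samples = [
    (t0, t0 + timedelta(minutes=1)),
    (t0, t0 + timedelta(minutes=30)),
    (t0, t0 + timedelta(hours=1, minutes=59)),
    (t0, t0 + timedelta(hours=5)),
]

print(fraction_included_within(samples, timedelta(hours=2)))  # 0.75
```

Running the same function with thresholds of 12 and 18 hours against real deploy and block timestamps would show how many deploys each candidate maximum TTL would actually affect.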
Replies: 2 comments 23 replies
We recommend 18 hours for the new max TTL setting.
My understanding so far was that with a tool like That leads to a few questions: