docs: upgrading instructions (#3135)
* update upgrading instr.

* release notes

* update height
mpoke committed Jun 5, 2024
1 parent 63f2c8e commit 17f2ba0
Showing 2 changed files with 103 additions and 134 deletions.
3 changes: 3 additions & 0 deletions RELEASE_NOTES.md
@@ -10,6 +10,9 @@ The upgrade height is [20740970](https://www.mintscan.io/cosmos/block/20740970).

Check out the [changelog](https://github.com/cosmos/gaia/blob/v17.2.0/CHANGELOG.md) for a list of relevant changes or [compare all changes](https://github.com/cosmos/gaia/compare/v17.1.0...v17.2.0) from last release.

<!-- Add the following line for major releases -->
Refer to the [upgrading guide](https://github.com/cosmos/gaia/blob/release/v17.2.x/UPGRADING.md) when migrating from `v17.1.x` to `v17.2.x`.

## 🚀 Highlights

<!-- Add any highlights of this release -->
234 changes: 100 additions & 134 deletions UPGRADING.md
@@ -1,194 +1,160 @@
# Upgrading Gaia
# Upgrade Gaia from v17.1.0 to v17.2.0

This guide provides instructions for upgrading Gaia from v16.x to v17.1.x.
## This is a coordinated upgrade. IT IS CONSENSUS BREAKING, so please apply the fix only at height 20740970.

This document describes the steps for validators, full node operators and relayer operators, to upgrade successfully for the Gaia v17 release.
### Release Details
* https://github.com/cosmos/gaia/releases/tag/v17.2.0
* Chain upgrade height: `20740970`. The exact upgrade time can be checked [here](https://www.mintscan.io/cosmos/block/20740970).
* The Go version has been frozen at `1.21`. If you are going to build the `gaiad` binary from source, make sure you are using the right Go version!
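
As a minimal sketch of the Go version check before building from source, the helper below tests whether a `go version` output string reports a `1.21` toolchain. The function name `is_go_1_21` is hypothetical and not part of Gaia or Go; it only pattern-matches the version string.

```shell
# Sketch: succeed only if the given `go version` output reports Go 1.21 or 1.21.x.
# `is_go_1_21` is a hypothetical helper name used for illustration.
is_go_1_21() {
  case "$1" in
    *" go1.21 "*|*" go1.21."*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (assumes `go` is on PATH):
# is_go_1_21 "$(go version)" || echo "unexpected Go version" >&2
```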

For more details on the release, please see the [release notes](https://github.com/cosmos/gaia/releases/tag/v17.1.0)
# Performing the coordinated upgrade

**Relayer Operators** for the Cosmos Hub and consumer chains, will also need to update to use [Hermes v1.8.0](https://github.com/informalsystems/hermes/releases/tag/v1.8.0) or higher. You may need to restart your relayer software after a major chain upgrade.
This coordinated upgrade requires validators to stop their nodes at `halt-height`, switch their binary to `v17.2.0`, and restart their nodes with the new version.

## Release Binary
The exact sequence of steps depends on your configuration. Please take care to modify your configuration appropriately if your setup is not included in the instructions.

Please use the correct release binary: `v17.1.0`.
# Manual steps

## Instructions
## Step 1: Configure `halt-height` using v17.1.0 and restart the node.

- [Upgrading Gaia](#upgrading-gaia)
- [Release Binary](#release-binary)
- [Instructions](#instructions)
- [On-chain governance proposal attains consensus](#on-chain-governance-proposal-attains-consensus)
- [Upgrade date](#upgrade-date)
- [Preparing for the upgrade](#preparing-for-the-upgrade)
- [Backups](#backups)
- [Testing](#testing)
- [Current runtime](#current-runtime)
- [Target runtime](#target-runtime)
- [Upgrade steps](#upgrade-steps)
- [Method I: Manual Upgrade](#method-i-manual-upgrade)
- [Method II: Upgrade using Cosmovisor](#method-ii-upgrade-using-cosmovisor)
- [Manually preparing the binary](#manually-preparing-the-binary)
- [Preparation](#preparation)
- [Expected upgrade result](#expected-upgrade-result)
- [Auto-Downloading the Gaia binary](#auto-downloading-the-gaia-binary)
- [Upgrade duration](#upgrade-duration)
- [Rollback plan](#rollback-plan)
- [Communications](#communications)
- [Risks](#risks)
- [Reference](#reference)
This upgrade requires `gaiad` to halt execution at a pre-selected `halt-height`. Failing to stop at `halt-height` may cause a consensus failure during chain execution at a later time.

## On-chain governance proposal attains consensus
There are two mutually exclusive options for this stage:

Once a software upgrade governance proposal is submitted to the Cosmos Hub, both a reference to this proposal and an `UPGRADE_HEIGHT` are added to the [release notes](https://github.com/cosmos/gaia/releases/tag/v17.1.0).
If and when this proposal reaches consensus, the upgrade height will be used to halt the "old" chain binaries. You can check the proposal on one of the block explorers or using the `gaiad` CLI tool.
### Option 1: Set the halt height by modifying `app.toml`

## Upgrade date
* Stop the gaiad process.

The date/time of the upgrade is subject to change as blocks are not generated at a constant interval. You can stay up-to-date by checking the estimated time until the block is produced on one of the block explorers (e.g. https://www.mintscan.io/cosmos/blocks/`UPGRADE_HEIGHT`).
* Edit the application configuration file at `~/.gaia/config/app.toml` so that `halt-height` reflects the upgrade plan:

## Preparing for the upgrade

### Backups

Prior to the upgrade, validators are encouraged to take a full data snapshot. Snapshotting depends heavily on infrastructure, but generally this can be done by backing up the `.gaia` directory.
If you use Cosmovisor to upgrade, by default, Cosmovisor will backup your data upon upgrade. See below [upgrade using cosmovisor](#method-ii-upgrade-using-cosmovisor) section.
```toml
# Note: Commitment of state will be attempted on the corresponding block.
halt-height = 20740970
```
* Restart the gaiad process.

It is critically important for validator operators to back-up the `.gaia/data/priv_validator_state.json` file after stopping the gaiad process. This file is updated every block as your validator participates in consensus rounds. It is a critical file needed to prevent double-signing, in case the upgrade fails and the previous chain needs to be restarted.
* Wait for the upgrade height and confirm that the node has halted
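
The `app.toml` edit in Option 1 can also be scripted. The following is a sketch only, assuming the file already contains a top-level `halt-height = ...` line; `set_halt_height` is a hypothetical helper name, and the default path is the one used in this guide. It keeps a `.bak` copy of the original file and prints the resulting line so the change can be confirmed by eye.

```shell
# Sketch: rewrite the `halt-height` line in an app.toml file.
# `set_halt_height` is a hypothetical helper; adjust the path to your setup.
set_halt_height() {
  height="$1"
  file="${2:-$HOME/.gaia/config/app.toml}"
  cp "$file" "$file.bak"        # keep a backup of the original configuration
  # Avoid `sed -i` so the same command works with both GNU and BSD sed.
  sed "s/^halt-height *=.*/halt-height = $height/" "$file.bak" > "$file"
  grep '^halt-height' "$file"   # print the resulting line; fails if none exists
}

# Usage, run while the gaiad process is stopped:
# set_halt_height 20740970
```

Calling the same helper with `0` is one way to remove the constraint again before starting `v17.2.0`.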

### Testing
### Option 2: Restart the `gaiad` binary with command line flags

For those validator and full node operators that are interested in ensuring preparedness for the impending upgrade, you can run a [v17 Local Testnet](https://github.com/cosmos/testnets/tree/master/local) or join in our [Cosmos Hub Public Testnet](https://github.com/cosmos/testnets/tree/master/public).
* Stop the gaiad process.

### Current runtime
* Do not modify `app.toml`. Restart the `gaiad` process with the flag `--halt-height`:
```shell
gaiad start --halt-height 20740970
```

The Cosmos Hub mainnet network, `cosmoshub-4`, is currently running [Gaia v16.0.0](https://github.com/cosmos/gaia/releases/v16.0.0). We anticipate that operators who are running on v16.0.0, will be able to upgrade successfully. Validators are expected to ensure that their systems are up to date and capable of performing the upgrade. This includes running the correct binary and if building from source, building with the appropriate `go` version.
* Wait for the upgrade height and confirm that the node has halted
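
While waiting, it can help to watch how close the node is to the halt height. The sketch below assumes the node's RPC endpoint is enabled on the default `localhost:26657` and that its `/status` response contains a `latest_block_height` field; `latest_height` is a hypothetical helper that only extracts that number from JSON on standard input. Once the node has halted, the process is expected to exit, so the RPC query itself will stop responding.

```shell
# Sketch: extract latest_block_height from a /status JSON document on stdin.
# `latest_height` is a hypothetical helper name used for illustration.
latest_height() {
  grep -o '"latest_block_height": *"[0-9]*"' | grep -o '[0-9][0-9]*'
}

# Usage (assumes the default RPC address and that curl is installed):
# curl -s http://localhost:26657/status | latest_height
```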

### Target runtime
Upon reaching the `halt-height` you need to replace the `v17.1.0` gaiad binary with the new `gaiad v17.2.0` binary and remove the `halt-height` constraint.
Depending on your setup, you may need to set `halt-height = 0` in your `app.toml` before resuming operations.
```shell
git clone https://github.com/cosmos/gaia.git
```

The Cosmos Hub mainnet network, `cosmoshub-4`, will run **[Gaia v17.1.0](https://github.com/cosmos/gaia/releases/tag/v17.1.0)**. Operators _**MUST**_ use this version post-upgrade to remain connected to the network. The new version requires `go v1.21` to build successfully.
## Step 2: Build and start the v17.2.0 binary

## Upgrade steps
### Remember to revert `gaiad` configurations
* Reset the `halt-height` option to `0` in `app.toml`, or
* Remove the `--halt-height` flag from the start parameters of the gaiad binary before restarting the node

There are 2 ways to upgrade a node:
We recommend you perform a backup of your data directory before switching to `v17.2.0`.

- Manual upgrade
- Upgrade using [Cosmovisor](https://pkg.go.dev/cosmossdk.io/tools/cosmovisor)
- Either by manually preparing the new binary
- Or by using the auto-download functionality (this is not yet recommended)
```shell
cd $HOME/gaia
git pull
git fetch --tags
git checkout v17.2.0
make install

# verify install
gaiad version
# v17.2.0
```

If you prefer to use Cosmovisor to upgrade, some preparation work is needed before upgrade.
```shell
gaiad start # starts the v17.2.0 node
```

### Method I: Manual Upgrade
# Cosmovisor steps

Make sure **Gaia v16.0.0** is installed by either downloading a [compatible binary](https://github.com/cosmos/gaia/releases/tag/v16.0.0), or building from source. Check the required version to build this binary in the `Makefile`.
## Prerequisite: Alter systemd service configuration

Run Gaia v16.0.0 till upgrade height, the node will panic:
Disable automatic restart of the node service. To do so, edit your `gaiad.service` file and set the relevant lines to the following values.

```shell
ERR UPGRADE "v17" NEEDED at height: <UPGRADE_HEIGHT>: upgrade to v17 and applying upgrade "v17" at height:<UPGRADE_HEIGHT>
```
Restart=no
Stop the node, and switch the binary to **Gaia v17.1.0** and re-start by `gaiad start`.
Environment="DAEMON_ALLOW_DOWNLOAD_BINARIES=false"
Environment="DAEMON_RESTART_AFTER_UPGRADE=false"
```

It may take several minutes to a few hours for validators with a total voting power > 2/3 to complete their node upgrades. After that, the chain can continue to produce blocks.
After that you will need to run `sudo systemctl daemon-reload` to apply changes in the service configuration.

### Method II: Upgrade using Cosmovisor
There is no need to restart the node yet; these changes will get applied during the node restart in the next step.

#### Manually preparing the binary
## Setup Cosmovisor
### Create the updated gaiad binary of v17.2.0

##### Preparation
### Remember to revert `gaiad` configurations
* Reset the `halt-height` option to `0` in `app.toml`, or
* Remove the `--halt-height` flag from the start parameters of the gaiad binary before starting the node

- Install the latest version of Cosmovisor (`1.5.0`):
#### Go to the gaia directory if present, otherwise clone the repository

```shell
go install cosmossdk.io/tools/cosmovisor/cmd/cosmovisor@latest
cosmovisor version
# cosmovisor version: v1.5.0
git clone https://github.com/cosmos/gaia.git
```

- Create a `cosmovisor` folder inside `$GAIA_HOME` and move Gaia `v16.0.0` into `$GAIA_HOME/cosmovisor/genesis/bin`:
#### Follow these steps if the gaia repository is already present

```shell
mkdir -p $GAIA_HOME/cosmovisor/genesis/bin
cp $(which gaiad) $GAIA_HOME/cosmovisor/genesis/bin
cd $HOME/gaia
git pull
git fetch --tags
git checkout v17.2.0
make install
```

- Build Gaia `v17.1.0`, and move gaiad `v17.1.0` to `$GAIA_HOME/cosmovisor/upgrades/v17/bin`

#### Check the new gaiad version, verify the latest commit hash
```shell
mkdir -p $GAIA_HOME/cosmovisor/upgrades/v17/bin
cp $(which gaiad) $GAIA_HOME/cosmovisor/upgrades/v17/bin
$ gaiad version --long
name: gaiad
server_name: gaiad
version: 17.2.0
commit: <commit-hash>
...
```

At this moment, you should have the following structure:
#### Or check the checksum of the binary if you decided to download it

```shell
.
├── current -> genesis or upgrades/<name>
├── genesis
│ └── bin
│ └── gaiad # old: v16.0.0
└── upgrades
└── v17
└── bin
└── gaiad # new: v17.1.0
```
Checksums can be found on the official release page:
* https://github.com/cosmos/gaia/releases/tag/v17.2.0

- Export the environmental variables:
The checksums file is located in the `Assets` section:
* e.g. [SHA256SUMS-v17.2.0.txt](https://github.com/cosmos/gaia/releases/download/v17.2.0/SHA256SUMS-v17.2.0.txt)

```shell
export DAEMON_NAME=gaiad
# please change to your own gaia home dir
# please note `DAEMON_HOME` has to be absolute path
export DAEMON_HOME=$GAIA_HOME
export DAEMON_RESTART_AFTER_UPGRADE=true
$ shasum -a 256 gaiad-v17.2.0-linux-amd64
<checksum> gaiad-v17.2.0-linux-amd64
```
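
The comparison against the published checksums can be automated. This is a sketch under the assumption that the checksums file uses the usual `<checksum>  <filename>` line format produced by `shasum`/`sha256sum`; `verify_sha256` is a hypothetical helper name, and it uses whichever of the two tools is installed.

```shell
# Sketch: compare a file's SHA-256 digest with its entry in a checksums file.
# `verify_sha256` is a hypothetical helper; the file format is an assumption.
verify_sha256() {
  file="$1"; sums="$2"
  name=$(basename "$file")
  # Look up the expected digest by file name (a leading "*" marks binary mode).
  expected=$(awk -v n="$name" '{ f = $2; sub(/^\*/, "", f); if (f == n) print $1 }' "$sums")
  if command -v sha256sum >/dev/null 2>&1; then
    actual=$(sha256sum "$file" | awk '{ print $1 }')
  else
    actual=$(shasum -a 256 "$file" | awk '{ print $1 }')
  fi
  # Fail when the file is not listed or the digests differ.
  [ -n "$expected" ] && [ "$expected" = "$actual" ]
}

# Usage:
# verify_sha256 gaiad-v17.2.0-linux-amd64 SHA256SUMS-v17.2.0.txt && echo "checksum OK"
```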

- Start the node:

### Copy the new gaiad (v17.2.0) binary to the Cosmovisor `current` directory
```shell
cosmovisor run start --x-crisis-skip-assert-invariants --home $DAEMON_HOME
cp $GOPATH/bin/gaiad ~/.gaiad/cosmovisor/current/bin
```

Skipping the invariant checks is strongly encouraged since it decreases the upgrade time significantly and since there are some other improvements coming to the crisis module in the next release of the Cosmos SDK.
### Restore service file settings

##### Expected upgrade result

When the upgrade block height is reached, Gaia will panic and stop:

This may take a few minutes.
After upgrade, the chain will continue to produce blocks when validators with a total sum voting power > 2/3 complete their node upgrades.

#### Auto-Downloading the Gaia binary

## Upgrade duration

The upgrade may take a few minutes to complete because cosmoshub-4 participants operate globally with differing operating hours and it may take some time for operators to upgrade their binaries and connect to the network.

## Rollback plan

During the network upgrade, core Cosmos teams will be keeping an ever vigilant eye and communicating with operators on the status of their upgrades. During this time, the core teams will listen to operator needs to determine if the upgrade is experiencing unintended challenges. In the event of unexpected challenges, the core teams, after conferring with operators and attaining social consensus, may choose to declare that the upgrade will be skipped.

Steps to skip this upgrade proposal are simply to resume the cosmoshub-4 network with the (downgraded) v16.0.0 binary using the following command:

```shell
gaiad start --unsafe-skip-upgrade <UPGRADE_HEIGHT>
If you are using a service file, restore the previous `Restart` settings in your service file:
```
Restart=on-failure
```
Reload the systemd configuration with `sudo systemctl daemon-reload`.

Note: There is no particular need to restore a state snapshot prior to the upgrade height, unless specifically directed by core Cosmos teams.

Important: A social consensus decision to skip the upgrade will be based solely on technical merits, thereby respecting and maintaining the decentralized governance process of the upgrade proposal's successful YES vote.

## Communications

Operators are encouraged to join the `#cosmos-hub-validators-verified` channel of the Cosmos Hub Community Discord. This channel is the primary communication tool for operators to ask questions, report upgrade status, report technical issues, and to build social consensus should the need arise. This channel is restricted to known operators and requires verification beforehand. Requests to join the `#cosmos-hub-validators-verified` channel can be sent to the `#general-support` channel.

## Risks

As a validator, performing the upgrade procedure on your consensus nodes carries a heightened risk of double-signing and being slashed. The most important piece of this procedure is verifying your software version and genesis file hash before starting your validator and signing.

The riskiest thing a validator can do is discover that they made a mistake and repeat the upgrade procedure again during the network startup. If you discover a mistake in the process, the best thing to do is wait for the network to start before correcting it.
# Revert `gaiad` configurations

## Reference
Depending on which path you chose for Step 1, either:

[Join Cosmos Hub Mainnet](https://github.com/cosmos/mainnet)
* Reset the `halt-height` option to `0` in `app.toml`, or
* Remove the `--halt-height` flag from the start parameters of the gaiad binary and start the node again
