
DCL MainNet Deployment

Alexander Shcherbakov edited this page Feb 10, 2022 · 38 revisions

Summary

This is WIP

DCL is in a better position regarding DDoS protection compared to public Cosmos networks.

DCL is a permissioned network consisting of a rather limited set of trusted or semi-trusted nodes (validators and observers), and not every node has to be publicly reachable (a company may make its nodes accessible to that company's applications only). Public Cosmos networks, in contrast, are permissionless: they may have any number of nodes, and those nodes need to be reachable by anyone in the world.

Moreover, DCL nodes do not compete to propose blocks; they do not play a "game of stake". There is no tokenomics in the permissioned DCL network (at least so far). So if one node dies or is unavailable for some time, this is not a catastrophe: a Node Admin can repair it. But a crashed/unavailable node can be a problem in a staking-based network (Cosmos), as the node cannot propose new blocks and may lose its "clients" (delegators) and their tokens.

In other words, DCL is more collaborative, while staking-based networks (like Cosmos) usually consist of competing entities.

Options for Validators

All options assume that the validator node is not public and accepts incoming connections from trusted validators and observers only (see Options for network protection).

  • Option 1: Cloud, no HSM
    • Option 1A: no Sentry, private keys and secrets at the Validator machine
    • Option 1B: with Sentry, private keys and secrets at the Validator machine
    • Option 1C: no Sentry, private keys and secrets are not at Validator machine (tmkms, HashiCorp Vault)
    • Option 1D: with Sentry, private keys and secrets are not at Validator machine (tmkms, HashiCorp Vault)
  • Option 2: Physical machine, HSM, with Sentries

Recommended Options for DCL 1.0

Option 2 (Physical machine, HSM, with Sentries) or Option 1B (Cloud, no HSM, with Sentry, private keys and secrets at the Validator machine). See DCL Deployment.

Why use Sentries:

  • Harder to DDoS the real Validator node (in case malicious Validators are present)
  • Hides the real Validator node's IP, so it's harder to attack the real Validator
  • Can support HSMs and Validators on physical machines without Internet access (if not from the beginning, HSM support can be added in the future)
  • Public Sentries are essentially Observers, so no need for more Observers
  • Can potentially auto-scale Sentry nodes (create new Sentries when attack is detected)

Why use separate KMS for Validator Keys:

  • Security best practice: do not keep secrets on the Validator machine, so that if the Validator is compromised, the secrets are not accessible
  • In particular, it helps prevent double-signing by Validators (see https://kb.certus.one/hsm.html#double-signing)
  • Please note, though, that double-signing is not as critical for DCL as for permissionless proof-of-stake networks (Cosmos). In DCL, nodes don't hold any tokens and don't manage public reputation and clients. So, if a node tries to double-sign, it will simply be slashed (removed from the network). Later on, Node Admins and Trustees can investigate the reason.

Why use HSM for Validator Keys:

  • The most secure key management
  • Not as critical for DCL as for permissionless proof-of-stake networks (Cosmos); see the previous item.

Options for the network protection (Trust Link between Private Sentries)

https://kb.certus.one/peers.html#private-nodes

Note 1: Option 1 (firewall) is expected to be applied in some form in any case as part of the DDoS mitigation strategy. So the main benefit of a P2P VPN is an additional level of encryption (on top of Tendermint's P2P authenticated encryption).

Note 2: Although some unification is desirable, every pair of nodes (peers) may independently decide how to create a trusted link: WireGuard, a site-to-site VPN, or just firewall whitelisting (no additional encryption).

  • Option 1: no VPN, just whitelist/blacklist via firewall rules
    • pros:
      • Seems sufficient and quite easy to do, since
        • We can expect/assume that all IPs are static
        • We don't need encryption at the IP level, as authenticated encryption is done at the Tendermint P2P level in any case
          • Done, for example, in link, Sections 6.6 and 6.7
      • no additional cost (cloud providers)
      • no additional resources (gateways)
    • cons:
      • there might be some concerns about Tendermint's P2P authenticated encryption (one may not trust Tendermint's implementation)
  • Option 2: IPSec site-to-site VPN (Cloud providers)
    • pros:
      • managed VPN gateway resources (highly available)
      • IPSec - old, trusted technology
      • IPSec encryption best practices (certificates, their rotation, key exchange IKEv2)
      • mature service (access control, configuration automation)
    • cons:
      • p2p cross-cloud connections will require VPN-to-VPN configuration, which is not straightforward even if the documentation is good, e.g. Google to AWS
        • A single connection routine requires additional resources (gateways), even if they are managed for HA
        • Plus multiple VPN connections per cloud (gateway)
        • Plus additional firewall rules
        • Certificates management
        • Additional costs:
          • E.g. AWS pricing:
            • Site-to-Site VPN connection fee (per hour)
            • Data transfer fee (per Gb)
            • (optionally) accelerated connection and data transfer fees
  • Option 3: WireGuard P2P VPN
    • pros:
      • Encryption
      • easy to install (part of the Linux, FreeBSD, and Android kernels; Windows, ...) and configure; in short:
        • each peer
          • installs the package
          • creates a net interface (ip link add ...) and assigns an address with a mask (ip address add ...)
          • creates a pub/priv key pair (wg genkey and wg pubkey)
        • peers share their pubkeys along with the endpoint and private (local) IPs assigned to the created interfaces
        • each peer:
          • for each (other) peer, creates a record (pubkey, private IP, endpoint IP) in the configuration file and applies it using the wg setconf command
          • brings the net interface up (ip link set up)
          • configures nodes to send to the private IPs (WireGuard will route the encrypted packets to the corresponding public endpoints)
      • fast (e.g. vs OpenVPN; some IPSec comparison)
      • Real p2p (full mesh)
      • additional protection
      • No additional cost
      • not a big codebase (~4000 lines), so a security audit is more realistic than for other protocols
    • cons:
      • Young technology
        • but Linus Torvalds accepted it into the Linux kernel (preferring it over OpenVPN and IPSec), and other well-known developers work on the technology (e.g. the tailscale project)
      • known (self-acknowledged) trade-offs (but likely no showstoppers)
    • comparison to IPSec and OpenVPN
  • Option 4: tinc P2P VPN
    • Mentioned as an option in https://docs.tendermint.com/master/spec/p2p/node.html#validator-node for validators that trust each other (which is actually our DCL case)
    • pros:
      • May handle IP changes better (?)
      • no additional cost and resources
    • cons:
      • one more VPN protocol: it had some vulnerabilities in the past, releases are infrequent, and likely no deep security audit has been done
      • requires additional software that we need to trust
      • May be trickier to configure, especially in a heterogeneous environment (different cloud providers, etc.)
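
The WireGuard peer setup outlined in Option 3 can be sketched as shell commands (all names, keys, and addresses below are placeholders; the commands need root):

```shell
# On peer A (private IP 10.10.0.1); peer B mirrors this with its own key/address.
ip link add dev wg0 type wireguard                 # create the WireGuard interface
ip address add 10.10.0.1/24 dev wg0                # assign the private (local) IP
wg genkey | tee privatekey | wg pubkey > publickey # generate the key pair
wg set wg0 private-key ./privatekey
# After exchanging pubkeys and endpoints out of band, register peer B:
wg set wg0 peer <peer-B-pubkey> \
    allowed-ips 10.10.0.2/32 \
    endpoint 203.0.113.2:51820
ip link set up dev wg0                             # bring the interface up
```

Equivalently, all peers can be listed in one configuration file and applied at once with wg setconf, as described in the steps above.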

Options for Nodes Discovery

  • Persistent peers between all Validators (or private Sentries if Validator is behind a Sentry Node)
    • This is how our current TestNet is deployed
    • May need to maintain and update the list of peers
  • One or multiple Seed nodes that all nodes use for discovery. The node can be managed by CSA for example.
    • All nodes have to trust and rely on that seed node
  • Every Validator starts up its own Seed Node
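
For reference, both discovery approaches map to the same [p2p] section of Tendermint's config.toml (node IDs and addresses below are placeholders):

```toml
# config.toml fragment; IDs and IPs are placeholders
[p2p]
# Persistent peers between all Validators / private Sentries
persistent_peers = "<node-id-1>@203.0.113.1:26656,<node-id-2>@203.0.113.2:26656"
# Or: one or more Seed Nodes used for discovery
seeds = "<seed-node-id>@198.51.100.1:26656"
```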

Options for Account Keys

Details

Goals

  1. DDoS Protection
  2. Private Key and secrets security
  3. Trusted relationship (can trust query results, no MITM)
  4. Health and monitoring
  5. Stability and performance
  6. High Availability and scalability

1. DDoS Protection

1.1 General notes

  • there are numerous types of attacks: e.g. at different OSI levels, long-lived, highly distributed (some references: wiki, cloudflare.com), ...
  • cloud providers cope well with layer 3 and 4 attack mitigation and provide services for other, more sophisticated attacks
    • e.g. AWS:
      • AWS Shield is active by default (at no additional charge) for all users and mitigates layer 3 and 4 attacks
      • for a fee, additional protection (AWS Shield Advanced) is provided to help with more sophisticated attacks, including layer 7 attacks
      • a list of additional techniques helps to make that even better
    • Google Cloud:
  • firewall rules are a good measure, but only a partial one, and not sufficient to mitigate all possible (known) types of attacks
  • the same goes for the embedded Cosmos SDK/Tendermint protection logic: it mostly prevents application-level attacks only
    • Only valid txns are broadcasted to other nodes
    • Read requests are not broadcasted to other nodes
    • Tendermint/Cosmos TPS is quite high
    • Need to attack a lot of ONs
    • It is possible to disallow random ONs from connecting to your ON
  • Cosmos SDK targets of attacks:
    • p2p connections
    • client connections (client service)
  • Conclusions (mitigation directions):
    • (in the cloud case) don't ignore cloud-provider anti-DDoS services and consider the non-free ones as well
    • separate the client and p2p channels (so p2p keeps working even if all client services are down) and protect them with different sets of techniques
    • hide Validators from the public as the most important part of the system, so that even if all public relays are down (being restarted to come back online), Validators stay healthy and don't require recovery
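
As a minimal sketch of the "separate client and p2p channels" direction, a host firewall on a Validator might whitelist only trusted peers on the p2p port (addresses below are placeholders; 26656 is the Tendermint p2p default):

```shell
# Example ufw rules on a Validator (IPs are placeholders)
ufw default deny incoming                                # drop everything by default
ufw allow from 203.0.113.2 to any port 26656 proto tcp   # trusted Sentry/peer 1
ufw allow from 203.0.113.3 to any port 26656 proto tcp   # trusted Sentry/peer 2
# client ports (26657 RPC, 9090 gRPC, 1317 REST) stay closed on the Validator
ufw enable
```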

1.2 Recommendations

2. Private Keys and secrets security

3. Trusted relationship

  • [MUST] gRPC/REST over HTTPS (not HTTP)
  • [MUST] Tendermint RPC over HTTPS (not HTTP)
  • [MUST] Clients connect to trusted Observer nodes only. If there is no trusted Observer to connect to, clients should use Tendermint RPC queries and verify proofs via light client
    • There is support for Light Client Proxy Node, so that clients can run a Proxy node, send all RPC queries to that Proxy, and the Proxy will verify the proofs automatically.

3.1 TLS support

  • TLS 1.3 is supported by Tendermint RPC and the CLI client (it uses that endpoint)
    • Note: there is no way to configure the CLI to work with a self-signed certificate (e.g. for testing purposes)
  • TLS is not supported by gRPC/REST endpoints (see details)

Thus, a DCL (Cosmos SDK/Tendermint) node doesn't have full TLS support, and we need a reverse proxy with SSL termination here. Options:

  • Option 1:
    • reverse proxy on each edge DCL node (e.g. nginx)
    • pros:
      • most secure: the proxy communicates with the node process directly (localhost), so the unencrypted traffic between the proxy and the backend is not visible externally
    • cons:
      • more securing effort: more secrets to manage on each machine
      • additional configuration routine (proxy installation and configuration)
      • additional point of failure: the node would become unresponsive even if the validator process is healthy
  • Option 2:
    • dedicated node with a reverse proxy for each edge DCL node
    • pros:
      • no noticeable ones
    • cons:
      • the same as for Option 1
      • plus the unencrypted proxy-to-DCL-node connection requires additional securing efforts
        • but likely not so much in cloud VPC environment
  • Option 3:
    • cloud resource: cloud providers usually provide few options to implement a reverse proxy pattern
    • pros:
      • certificate management is easy and secure (e.g. AWS ACM)
      • quite straightforward configuration in one place
        • no need to add a certificate on each server
        • no need to install and configure proxy server on each node as well
      • unencrypted communication with DCL node might be considered as secure enough (as a cloud provider declares)
      • cloud resources can be made highly available so no additional point of failure is introduced
    • cons:
      • additional cost (TODO real numbers)
    • sub-options:
      • AWS: docs
        • The AWS Amplify Console offers a rewrites and redirects feature:
          • the simplest to set up and manage
          • but with higher volumes of outgoing traffic this can get expensive
        • Amazon API Gateway's REST API type allows users to set up HTTP proxy integrations
          • a higher degree of customization (compared to the Amplify Console)
          • pricing is based on the number of API calls as well as any external data transfers; external data transfers are charged at the EC2 data transfer rate
        • Amazon CloudFront is able to route incoming requests by configuring its cache behavior rules
          • integration with AWS Lambda@Edge functions helps to make it highly customizable
          • offers most control over caching behavior and customization
          • Amazon CloudFront is charged by request and by Lambda@Edge invocation. The data traffic out is charged with the CloudFront regional data transfer out pricing
        • (additionally) Application Load Balancer
          • it only supports static targets (fixed IP addresses), not dynamic targets (domain names), but that might be OK for our case

4. Health and monitoring

  • [SHOULD] Monitor performance: prometheus
  • [SHOULD] Monitor logs: ELK stack

4.1 Performance (metrics) monitoring

  • what
    • server metrics:
      • CPU usage
      • memory usage
      • disk utilization
      • network performance
      • IO performance
    • application metrics:
      • metrics endpoints exposed by cosmos-sdk
        • blocks:
          • good ones / missed / missed rate (%)
          • time perspective: all time / 1w / 1d / 1h
          • current block height:
            • sentries
            • validators
        • validator / KMS performance
          • sync latency
          • sign latency
          • signatures per minute
        • number of peers
    • ??? RPC endpoints (to provide supplemental data regarding the networks)
  • how
    • tools:
    • notes:
      • HA setup should be considered
      • monitor (Prometheus) should be monitored itself (e.g. by cloud service)
      • alerts for:
        • downtime (e.g. more than 1 min)
        • ...
  • TODO
    • explore more metrics to consider
    • explore and define metrics thresholds
    • work on tools options
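
The cosmos-sdk/Tendermint metrics endpoints mentioned above are enabled via config.toml (the values below are the Tendermint defaults):

```toml
# config.toml: expose Prometheus metrics
[instrumentation]
prometheus = true
prometheus_listen_addr = ":26660"   # Prometheus scrapes this endpoint
```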

4.2 Logs monitoring

  • what
    • application logs
    • system logs
    • authentication logs
  • how:
    • tools:
    • notes:
      • logs are pushed to a high-availability queuing service
      • log processor (Fluentd or Logstash) pops, parses, tokenizes, indexes and stores in a search engine (e.g. Elasticsearch)
      • alerts may be triggered on specific phrases (e.g. CONSENSUS FAILURE or Disk Full)
      • debugging facilities
  • TODO
    • explore and prepare a map: phrase-event
    • list of events for alerts
    • list of events for debugging
    • work on tools options

5. Stability and performance

  • [MUST] Recommended config
    • disable PEX for private nodes
    • adjust timeouts
  • [SHOULD] State-Sync for new Nodes
  • [SHOULD] Seed Nodes for peer discovery??
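
The recommended config items above correspond to the following config.toml fragment (the timeout value is an example and should be tuned per network):

```toml
# config.toml fragment
[p2p]
pex = false               # disable peer exchange on private nodes

[consensus]
timeout_commit = "5s"     # example timeout adjustment

[statesync]
enable = true             # let new nodes catch up via State-Sync
# rpc_servers, trust_height and trust_hash must also be set on the syncing node
```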

6. High Availability and scalability

  • [SHOULD] Multiple Observers (Sentries)
  • [SHOULD] Load Balancers for Observers (Public Sentries)

References