services/horizon/ingest: Use buffered storage backend for reingest command #5374

Merged: 27 commits, Jul 18, 2024



@sreuland sreuland commented Jul 5, 2024

PR Checklist

PR Structure

  • This PR has reasonably narrow scope (if not, break it down into smaller PRs).
  • This PR avoids mixing refactoring changes with feature changes (split into two PRs
    otherwise).
  • This PR's title starts with name of package that is most changed in the PR, ex.
    services/friendbot, or all or doc if the changes are broad or impact many
    packages.

Thoroughness

  • This PR adds tests for the most critical parts of the new functionality or fixes.
  • I've updated any docs (developer docs, .md
    files, etc... affected by this change). Take a look in the docs folder for a given service,
    like this one.

Release planning

  • I've updated the relevant CHANGELOG (here for Horizon) if
    needed with deprecations, added features, breaking changes, and DB schema changes.
  • I've decided if this PR requires a new major/minor version according to
    semver, or if it's mainly a patch change. The PR is targeted at the next
    release branch if it's not a patch change.

What

Change the db reingest command to accept additional configuration parameters that select a new buffered storage ledger backend, which consumes precomputed tx meta files generated by the ledger exporter, instead of the captive core backend. Only one implementation of the buffered storage reader is currently available, for Google Cloud Storage (GCS), so it is referenced frequently throughout.

Refer to the new Reingestion readme docs for details on invoking db reingest range <from> <to> with the new configuration via the --ledgerbackend and --datastore-config parameters.

Captive core remains the default ledger backend in all cases, both for reingestion commands and for live/forward ingestion. Only reingestion allows configuration to override the ledger backend type.

Coauthors: @urvisavla

Why

Faster reingestion. Ingesting from precomputed tx meta files is faster than consuming the same tx meta from the captive core pipe.
Closes #4911

Known limitations

Only affects the reingestion command. Requires an authenticated GCP connection, and the cloud data store must be kept up to date (receiving data files) by a running ledger exporter process, which is managed out of band from this command.

@sreuland sreuland marked this pull request as ready for review July 9, 2024 16:01
urvisavla and others added 3 commits July 11, 2024 00:27
* services/horizon: Reingest from precomputed TxMeta

* Global network config for reingestion

* Add unit test for new --ledgerbackend flag

* Fix unit test

* Add unittest for network flags validation

@tamirms tamirms left a comment

LGTM!

@sreuland sreuland merged commit d5767df into master Jul 18, 2024
23 checks passed
Development

Successfully merging this pull request may close these issues.

Add support for ingesting via precomputed TxMeta in Horizon
4 participants