[8.x] TBS: Replace badger with pebble (backport #15235) #15452

Open
wants to merge 6 commits into
base: 8.x

mergify[bot]
Contributor

@mergify mergify bot commented Jan 29, 2025

Motivation/summary

This PR replaces badger with pebble as the database for tail-based sampling.

// The database of choice is Pebble, which does not have TTL handling built-in,
// and we implement our own TTL handling on top of the database:
// - TTL is divided up into N parts, where N is partitionsPerTTL.
// - A database holds N + 1 + 1 partitions.
// - Every TTL/N we will discard the oldest partition, so we keep a rolling window of N+1 partitions.
// - Writes will go to the most recent partition, and we'll read across N+1 partitions
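
The rolling window described in the comment above can be sketched as a small ring of partition IDs. This is a minimal illustration, not the code from the PR: the `Partitioner` type and its methods are hypothetical names, and treating the extra "+ 1" partition as a spare slot that is cleared before it becomes the write target again is an assumption about why the database holds N + 1 + 1 partitions.

```go
package main

import "fmt"

// Partitioner is a hypothetical ring of partition IDs (not the PR's type).
// It assumes N+2 slots: N+1 readable partitions plus one spare slot that is
// emptied before it is reused for writes.
type Partitioner struct {
	total  int // N + 2, where N is partitionsPerTTL
	active int // ID of the partition currently receiving writes
}

// NewPartitioner creates a ring for the given partitionsPerTTL (N).
func NewPartitioner(partitionsPerTTL int) *Partitioner {
	return &Partitioner{total: partitionsPerTTL + 2}
}

// Active returns the ID of the partition that receives writes.
func (p *Partitioner) Active() int { return p.active }

// ReadIDs returns the N+1 partition IDs to read from, newest first.
func (p *Partitioner) ReadIDs() []int {
	ids := make([]int, 0, p.total-1)
	for i := 0; i < p.total-1; i++ {
		ids = append(ids, (p.active-i+p.total)%p.total)
	}
	return ids
}

// Rotate moves writes to the next slot and returns the ID of the partition
// that just fell out of the read window and can be deleted and compacted.
func (p *Partitioner) Rotate() (expired int) {
	p.active = (p.active + 1) % p.total
	return (p.active + 1) % p.total
}

func main() {
	p := NewPartitioner(1) // partitionsPerTTL = 1
	for i := 0; i < 3; i++ {
		fmt.Printf("write=%d read=%v\n", p.Active(), p.ReadIDs())
		expired := p.Rotate() // would be called every TTL/N by the GC loop
		fmt.Printf("rotated: partition %d can be deleted\n", expired)
	}
}
```

In this sketch, `Rotate` would be driven by a ticker firing every TTL/N, and the returned ID would be handed to whatever deletes and compacts that key prefix.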

Benchmarks

TL;DR: +3000% indexed events/s and +76% intakev2 event rate, with 75% lower memory usage and 44% lower disk usage.

See comment for details.

Major design changes

  • As pebble does not support TTL, this PR introduces a partitioning method to enforce it. TTL is handled by assigning a rotating partition key (not timestamp based) as the database key prefix. There's also a background thread that runs a "TTL GC loop": every TTL (more generally, every TTL/partitionsPerTTL), the key prefix is rotated, and the expired prefix is deleted and compacted. This explicit TTL handling enables precise deletion, and the lifetime of an entry in the database is strictly bounded by 2*TTL. The knob partitionsPerTTL adjusts the number of available prefixes to trade storage overhead against read amplification: e.g. partitionsPerTTL=1 keeps up to 2*TTL worth of entries with 2 partition reads per key read, while partitionsPerTTL=2 keeps up to 1.5*TTL worth of entries with 3 partition reads per key read.
  • Not using a timestamp-based prefix allows TTL adjustments without data loss on EA hot reload / apm-server restart, because the prefixes are fixed whereas TTL-truncated timestamps are not.
  • Sampling decisions and events are stored in separate pebble databases, as they have vastly different characteristics and therefore different optimal pebble options. Compression is enabled for events but disabled for decisions.
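
The partitionsPerTTL trade-off in the first bullet follows from simple arithmetic: with N+1 readable partitions, each covering TTL/N, an entry stays readable for at most (N+1)/N times the TTL, and each key read touches N+1 partitions. The helper names below are hypothetical and only restate the figures quoted above.

```go
package main

import (
	"fmt"
	"time"
)

// exampleTTL is an arbitrary TTL chosen for illustration only.
const exampleTTL = 30 * time.Minute

// maxLifetime returns the upper bound on how long an entry stays readable:
// N+1 partitions are kept, each covering TTL/N.
func maxLifetime(ttl time.Duration, partitionsPerTTL int) time.Duration {
	n := time.Duration(partitionsPerTTL)
	return ttl * (n + 1) / n
}

// readsPerLookup returns how many partitions one key read has to check.
func readsPerLookup(partitionsPerTTL int) int {
	return partitionsPerTTL + 1
}

func main() {
	for _, n := range []int{1, 2, 4} {
		fmt.Printf("partitionsPerTTL=%d: max lifetime %v, %d partition reads per key read\n",
			n, maxLifetime(exampleTTL, n), readsPerLookup(n))
	}
}
```

Raising partitionsPerTTL shrinks the storage kept beyond the TTL but makes every lookup check more partitions, which is the storage overhead versus read amplification trade described above.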


Other implied changes

  • As TTL is enforced by deleting entries and compacting actively in the TTL GC Loop, there is no TTL validation on stored entries at read time, i.e. it is possible to run apm-server TBS, stop it, wait for 1 year, then restart apm-server, and old entries will be respected. In contrast, badger checks for expiry at read time. As trace IDs are supposed to be unique, the downside of not doing read-time TTL validation should be minimal.
  • Pebble does not use a vlog like badger. Together with the event and decision DB separation, this requires changes to the structure of the monitoring metrics. In the draft PR, the table size and WAL size across the 2 DBs are summed and reported in sampling.tail.storage.lsm_size, while sampling.tail.storage.vlog_size is always 0. This change has not been decided yet.
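
To illustrate the first point above, here is a minimal sketch of a lookup over prefixed keys. It uses an in-memory map as a stand-in for the database; the `store` type, the one-byte prefix layout, and the function names are assumptions made for the example, not the PR's implementation. The point is that the lookup only consults partition prefixes and never a timestamp, so an entry remains visible until its partition is deleted.

```go
package main

import "fmt"

// store is a hypothetical in-memory stand-in for a key-value database.
type store map[string][]byte

// prefixedKey prepends a one-byte partition ID to the trace ID.
// The one-byte layout is an assumption made for this sketch.
func prefixedKey(partitionID byte, traceID string) string {
	return string([]byte{partitionID}) + traceID
}

// lookup checks each readable partition in order and returns the first hit.
// There is no expiry check: visibility depends only on which partitions
// are still in the read window.
func lookup(db store, readIDs []byte, traceID string) ([]byte, bool) {
	for _, id := range readIDs {
		if v, ok := db[prefixedKey(id, traceID)]; ok {
			return v, true
		}
	}
	return nil, false
}

func main() {
	db := store{}
	db[prefixedKey(2, "trace-abc")] = []byte("sampled")

	// Partition 2 is still in the read window: the entry is found, no
	// matter how long ago it was written.
	v, ok := lookup(db, []byte{0, 2}, "trace-abc")
	fmt.Println(string(v), ok)

	// After rotation the window no longer includes partition 2.
	_, ok = lookup(db, []byte{1, 0}, "trace-abc")
	fmt.Println(ok)
}
```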

TODO:

Useful but not necessary, out of scope of this PR:

Checklist

For functional changes, consider:

  • Is it observable through the addition of either logging or metrics?
  • Is its use being published in telemetry to enable product improvement?
  • Have system tests been added to avoid regression?

How to test these changes

Enable TBS, try various sampling policies, send events, and keep it running for over 2 * TTL. Ensure that disk usage is bounded and memory usage is as expected.

Related issues

Fixes #15246


This is an automatic backport of pull request #15235 done by [Mergify](https://mergify.com).

This PR replaces badger with pebble as the database for tail-based sampling. Significant performance gains.

The database of choice is Pebble, which does not have TTL handling built-in,
and we implement our own TTL handling on top of the database:
- TTL is divided up into N parts, where N is partitionsPerTTL.
- A database holds N + 1 + 1 partitions.
- Every TTL/N we will discard the oldest partition, so we keep a rolling window of N+1 partitions.
- Writes will go to the most recent partition, and we'll read across N+1 partitions

(cherry picked from commit 0ca58b8)

# Conflicts:
#	go.mod
#	go.sum
#	internal/beater/monitoringtest/opentelemetry.go
#	x-pack/apm-server/main.go
#	x-pack/apm-server/main_test.go
#	x-pack/apm-server/sampling/processor.go
#	x-pack/apm-server/sampling/processor_bench_test.go
#	x-pack/apm-server/sampling/processor_test.go
@mergify mergify bot requested a review from a team as a code owner January 29, 2025 14:59
@mergify mergify bot added the backport and conflicts ("There is a conflict in the backported pull request") labels Jan 29, 2025
Contributor Author

mergify bot commented Jan 29, 2025

Cherry-pick of 0ca58b8 has failed:

On branch mergify/bp/8.x/pr-15235
Your branch is up to date with 'origin/8.x'.

You are currently cherry-picking commit 0ca58b8c.
  (fix conflicts and run "git cherry-pick --continue")
  (use "git cherry-pick --skip" to skip this patch)
  (use "git cherry-pick --abort" to cancel the cherry-pick operation)

Changes to be committed:
	modified:   NOTICE.txt
	modified:   changelogs/head.asciidoc
	modified:   internal/beater/config/config_test.go
	modified:   internal/beater/config/sampling.go
	modified:   x-pack/apm-server/sampling/config.go
	modified:   x-pack/apm-server/sampling/config_test.go
	deleted:    x-pack/apm-server/sampling/eventstorage/badger.go
	new file:   x-pack/apm-server/sampling/eventstorage/doc.go
	deleted:    x-pack/apm-server/sampling/eventstorage/logger.go
	new file:   x-pack/apm-server/sampling/eventstorage/partition_rw.go
	new file:   x-pack/apm-server/sampling/eventstorage/partitioner.go
	new file:   x-pack/apm-server/sampling/eventstorage/partitioner_test.go
	new file:   x-pack/apm-server/sampling/eventstorage/pebble.go
	new file:   x-pack/apm-server/sampling/eventstorage/pebble_test.go
	new file:   x-pack/apm-server/sampling/eventstorage/prefix.go
	new file:   x-pack/apm-server/sampling/eventstorage/prefix_test.go
	new file:   x-pack/apm-server/sampling/eventstorage/rw.go
	new file:   x-pack/apm-server/sampling/eventstorage/rw_test.go
	deleted:    x-pack/apm-server/sampling/eventstorage/sharded.go
	deleted:    x-pack/apm-server/sampling/eventstorage/sharded_bench_test.go
	modified:   x-pack/apm-server/sampling/eventstorage/storage.go
	modified:   x-pack/apm-server/sampling/eventstorage/storage_bench_test.go
	modified:   x-pack/apm-server/sampling/eventstorage/storage_manager.go
	new file:   x-pack/apm-server/sampling/eventstorage/storage_manager_bench_test.go
	modified:   x-pack/apm-server/sampling/eventstorage/storage_manager_test.go
	deleted:    x-pack/apm-server/sampling/eventstorage/storage_test.go
	deleted:    x-pack/apm-server/sampling/eventstorage/storage_whitebox_test.go

Unmerged paths:
  (use "git add <file>..." to mark resolution)
	both modified:   go.mod
	both modified:   go.sum
	both modified:   internal/beater/monitoringtest/opentelemetry.go
	both modified:   x-pack/apm-server/main.go
	both modified:   x-pack/apm-server/main_test.go
	both modified:   x-pack/apm-server/sampling/processor.go
	both modified:   x-pack/apm-server/sampling/processor_bench_test.go
	both modified:   x-pack/apm-server/sampling/processor_test.go

To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally

@carsonip
Member

There are a lot of conflicts because #15094 was not backported to 8.x, but they should be fixed now.

@carsonip
Member

run docs-build

Contributor Author

mergify bot commented Feb 3, 2025

This pull request is now in conflicts. Could you fix it @mergify[bot]? 🙏
To fixup this pull request, you can check out it locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b mergify/bp/8.x/pr-15235 upstream/mergify/bp/8.x/pr-15235
git merge upstream/8.x
git push upstream mergify/bp/8.x/pr-15235

Contributor Author

mergify bot commented Feb 3, 2025

This pull request has not been merged yet. Could you please review and merge it @carsonip? 🙏
