TBS: set default sampling.tail.storage_limit to 0 but limit disk usage to 90% (#15467)
@@ -16,6 +16,7 @@ import (
    "time"

    "github.com/cockroachdb/pebble/v2"
    "github.com/cockroachdb/pebble/v2/vfs"
    "go.opentelemetry.io/otel/metric"
    "golang.org/x/sync/errgroup"

@@ -46,6 +47,10 @@ const (

    // diskUsageFetchInterval is how often disk usage is fetched which is equivalent to how long disk usage is cached.
    diskUsageFetchInterval = 1 * time.Second

    // dbStorageLimitFallback is the default fallback storage limit in bytes
    // that applies when disk threshold cannot be enforced due to an error.
    dbStorageLimitFallback = 5 << 30
)

type StorageManagerOptions func(*StorageManager)
@@ -62,6 +67,19 @@ func WithMeterProvider(mp metric.MeterProvider) StorageManagerOptions {
    }
}

// WithGetDiskUsage configures getDiskUsage function used by StorageManager.
// For testing only.
func WithGetDiskUsage(getDiskUsage func() (DiskUsage, error)) StorageManagerOptions {
    return func(sm *StorageManager) {
        sm.getDiskUsage = getDiskUsage
    }
}

// DiskUsage is the struct returned by getDiskUsage.
type DiskUsage struct {
    UsedBytes, TotalBytes uint64
}

// StorageManager encapsulates pebble.DB.
// It assumes exclusive access to pebble DB at storageDir.
type StorageManager struct {
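As a usage illustration, here is a test-style sketch of stubbing disk statistics through the WithGetDiskUsage option and the DiskUsage struct above. The package name, the use of t.TempDir(), and the final NewReadWriter call are assumptions for illustration only, not part of this diff.

```go
package eventstorage // assumed package name for illustration

import "testing"

func TestStubbedDiskUsage(t *testing.T) {
    // Inject a fake disk-usage reader: 95 GiB used out of 100 GiB,
    // i.e. already above a 0.9 disk usage threshold.
    sm, err := NewStorageManager(t.TempDir(), WithGetDiskUsage(func() (DiskUsage, error) {
        return DiskUsage{UsedBytes: 95 << 30, TotalBytes: 100 << 30}, nil
    }))
    if err != nil {
        t.Fatal(err)
    }

    // With storage_limit=0 and a 0.9 disk threshold ratio, writes through
    // this read writer would be expected to be rejected by the disk threshold check.
    rw := sm.NewReadWriter(0, 0.9)
    _ = rw
}
```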
@@ -83,6 +101,13 @@ type StorageManager struct {
    // cachedDBSize is a cached result of db size.
    cachedDBSize atomic.Uint64

    getDiskUsage func() (DiskUsage, error)

Review comment: Can we call it …?
Reply: Should have called it cachedDiskStat and changed diskStatFailed to getDiskUsageFailed. Keeping the call as getDiskUsage as the pebble function is called GetDiskUsage.

    // diskStat is disk usage statistics about the disk only, not related to the databases.
    diskStat struct {
        used, total atomic.Uint64
    }
    diskStatFailed atomic.Bool

    // runCh acts as a mutex to ensure only 1 Run is actively running per StorageManager.
    // as it is possible that 2 separate Run are created by 2 TBS processors during a hot reload.
    runCh chan struct{}
@@ -100,6 +125,13 @@ func NewStorageManager(storageDir string, opts ...StorageManagerOptions) (*Stora
        runCh:  make(chan struct{}, 1),
        logger: logp.NewLogger(logs.Sampling),
        codec:  ProtobufCodec{},
        getDiskUsage: func() (DiskUsage, error) {
            usage, err := vfs.Default.GetDiskUsage(storageDir)
            return DiskUsage{
                UsedBytes:  usage.UsedBytes,
                TotalBytes: usage.TotalBytes,
            }, err
        },
    }
    for _, opt := range opts {
        opt(sm)
@@ -211,6 +243,28 @@ func (sm *StorageManager) dbSize() uint64 {

func (sm *StorageManager) updateDiskUsage() {
    sm.cachedDBSize.Store(sm.eventDB.Metrics().DiskSpaceUsage() + sm.decisionDB.Metrics().DiskSpaceUsage())

    if sm.diskStatFailed.Load() {
        // Skip GetDiskUsage under the assumption that
        // it will always get the same error if GetDiskUsage ever returns one,
        // such that it does not keep logging GetDiskUsage errors.
        return
    }
    usage, err := sm.getDiskUsage()
    if err != nil {
        sm.logger.With(logp.Error(err)).Warn("failed to get disk usage")
        sm.diskStatFailed.Store(true)
        sm.diskStat.total.Store(0) // setting total to 0 to disable any running disk threshold checks
        return
    }
    sm.diskStat.used.Store(usage.UsedBytes)
    sm.diskStat.total.Store(usage.TotalBytes)
}

// diskUsed returns the actual used disk space in bytes.
// Not to be confused with dbSize which is specific to database.
func (sm *StorageManager) diskUsed() uint64 {
    return sm.diskStat.used.Load()
}

// runDiskUsageLoop runs a loop that updates cached disk usage regularly.
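The body of runDiskUsageLoop is outside this hunk. Purely to illustrate how updateDiskUsage and diskUsageFetchInterval fit together, a refresh loop could look roughly like the sketch below; the method name and the context-based shutdown are assumptions, not the PR's actual implementation.

```go
// Illustrative sketch only; not the PR's runDiskUsageLoop.
// Assumes "context" and "time" are imported in the same package.
func (sm *StorageManager) diskUsageRefreshSketch(ctx context.Context) {
    ticker := time.NewTicker(diskUsageFetchInterval)
    defer ticker.Stop()
    for {
        select {
        case <-ctx.Done():
            return
        case <-ticker.C:
            sm.updateDiskUsage() // refresh cachedDBSize and diskStat
        }
    }
}
```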
@@ -341,30 +395,61 @@ func (sm *StorageManager) WriteSubscriberPosition(data []byte) error {
// NewUnlimitedReadWriter returns a read writer with no storage limit.
// For testing only.
func (sm *StorageManager) NewUnlimitedReadWriter() StorageLimitReadWriter {
    return sm.NewReadWriter(0)
    return sm.NewReadWriter(0, 0)
}

// NewReadWriter returns a read writer with storage limit.
func (sm *StorageManager) NewReadWriter(storageLimit uint64) StorageLimitReadWriter {
// NewReadWriter returns a read writer configured with storage limit and disk threshold ratio.
func (sm *StorageManager) NewReadWriter(storageLimit uint64, diskThresholdRatio float64) StorageLimitReadWriter {
    splitRW := SplitReadWriter{
        eventRW:    sm.eventStorage.NewReadWriter(),
        decisionRW: sm.decisionStorage.NewReadWriter(),
    }

    dbStorageLimit := func() uint64 {
        return storageLimit
    }
    if storageLimit == 0 {
        sm.logger.Infof("setting database storage limit to unlimited")
    var dbStorageLimit func() uint64
    if sm.diskStatFailed.Load() && storageLimit == 0 && diskThresholdRatio > 0 {
Review comment: This section feels a bit confusing: what if the disk usage hasn't run yet and fails after …? How about we add the disk usage fallback in the …?
Reply: getDiskUsage is called on StorageManager initialization, the first time a TBS processor is created, and before any processor is Run. I can add a comment to make it clear.
Reply: Hmm, I see. It is still a bit complicated to maintain; what do you think about refactoring that bit to ensure the fallback is encapsulated within the …?
Reply: ++ on refactoring in a follow-up PR when my head is clearer. getDiskUsage is designed to mock pebble GetDiskUsage without modifying internal state.
        dbStorageLimit = func() uint64 {
            return dbStorageLimitFallback
        }
        sm.logger.Warnf("failed to get disk usage; overriding storage_limit to fallback default %.1fgb", float64(dbStorageLimitFallback))
    } else {
        sm.logger.Infof("setting database storage limit to %.1fgb", float64(storageLimit))
        dbStorageLimit = func() uint64 {
            return storageLimit
        }
        if storageLimit == 0 {
            sm.logger.Infof("setting database storage limit to unlimited")
        } else {
            sm.logger.Infof("setting database storage limit to %.1fgb", float64(storageLimit))
        }
    }

    // To limit db size to storage_limit
    dbStorageLimitChecker := NewStorageLimitCheckerFunc(sm.dbSize, dbStorageLimit)
    dbStorageLimitRW := NewStorageLimitReadWriter("database storage limit", dbStorageLimitChecker, splitRW)

    return dbStorageLimitRW
    // diskThreshold returns max used disk space in bytes.
    // After which, writes should be rejected.
    var diskThreshold func() uint64
    if sm.diskStatFailed.Load() {
        diskThreshold = func() uint64 {
            return 0 // unlimited
        }
        sm.logger.Warnf("failed to get disk usage; disabling disk threshold check")
    } else {
        diskThreshold = func() uint64 {
            return uint64(float64(sm.diskStat.total.Load()) * diskThresholdRatio)
        }
        sm.logger.Infof("setting disk threshold ratio to %.2f", diskThresholdRatio)
    }

    // To limit actual disk usage percentage to diskThresholdRatio
    diskThresholdChecker := NewStorageLimitCheckerFunc(sm.diskUsed, diskThreshold)
    diskThresholdRW := NewStorageLimitReadWriter(
        fmt.Sprintf("disk usage ratio exceeding threshold of %.2f", diskThresholdRatio),
Review comment: How is this used? The method def for …
Reply: It first started as a name of the StorageLimitReadWriter, but now I'm using it more like a description. It is prepended to the error. While changing it to "disk usage threshold" makes it become a name again, having the value in it seems to help understanding, because …
Reply: Hmm, tbh, it feels a bit of a smell. Maybe we can refactor it later to abstract the error outside …
Reply: No need to do this now - we can do this in a future PR.
Reply: Updated it slightly to make the passed-in argument more like a name. Abstracting the error outside StorageLimitReadWriter to create a DiskThresholdReadWriter is fine, but they would look too similar apart from the error / name.
        diskThresholdChecker,
        dbStorageLimitRW,
    )

    return diskThresholdRW
}

// wrapNonNilErr only wraps an error with format if the error is not nil.
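Reading the chain above: the split read writer is wrapped first by the database storage limit check and then by the disk threshold check. The check semantics implied by the diff (a limit getter returning 0 effectively disables the check) can be summarized by the following sketch; it is not the actual StorageLimitReadWriter code.

```go
// exceedsLimit is an illustrative helper, not part of the PR: a check
// rejects writes only when the limit is non-zero and usage has reached it.
func exceedsLimit(usage, limit func() uint64) bool {
    l := limit()
    return l != 0 && usage() >= l
}
```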
Review comment: [to reviewer] Q: @lahsivjar suggested that it doesn't need to be configurable. I've made it configurable but undocumented atm. As the StorageLimit default is changed, we'll have to update observability docs anyway. I'm thinking we still have to either document this config, or describe this new default behavior as if it is not configurable. I'm inclined to document it, but it also means we'll have to support it.

Reply: Having this as a config option is okay for now. If we want to expose this (either as a config or documentation) we should discuss a bit how we want both options to interact and the user experience it entails. To start with, I would recommend documenting this (using up to 90% of the available disk) as default TBS behaviour, with a mention of how to tweak the max using the storage_limit option.

Reply: Once we expose it, we cannot simply revert this decision. With the current implementation, one needs to be aware of both config options and understand how they play together. While also checking for not exceeding a certain threshold is a guardrail, it also adds complexity for the user. I am not a huge fan of this complexity. If the DiskThresholdRatio would only be checked if no StorageLimit is set, I think it would be easier to grasp: either the APM Server always tries to write up to X GB, or it writes up to x% of the overall disk size, but only if available.

Reply: Hmm, I was thinking that if it is exposed as undocumented configuration (i.e. only documenting the behaviour) we could revert the decision, but if this is not true then I would be reluctant to release this config. This sounds good to me; however, I would advocate for starting with the x% of overall disk size as a fixed default behaviour for now, because I am not sure if this config is the best way forward. So the behavior would be: if storage_limit is set then that is obeyed, otherwise 90% of the overall disk size would be used for TBS. Does this sound reasonable?

Reply: I am fine with keeping it an undocumented config, and disabling the disk usage threshold check whenever StorageLimit is set. The edge case we'll have to handle is when the user sets it to 0, i.e. are we happy with … Alternatives: … I'm happy with no more truly unlimited storage limit, but just checking if we want an escape hatch for it. I think it is ok to tell users who want truly unlimited storage to set StorageLimit to maxUInt64.

Reply: Updated the PR to implement what @simitt suggested. Any non-0 storage limit disables disk usage threshold checks. A storage limit of 0 enables disk usage threshold checks using the undocumented default disk_usage_threshold=0.9, unless getDiskUsage errored, which would then fall back to the 3GB db storage limit.
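To summarize the behaviour discussed in this thread, the sketch below expresses the interaction between storage_limit, the disk usage threshold, and the getDiskUsage failure fallback as a pure function. It is a hypothetical summary mirroring the NewReadWriter logic in the diff, not code from the PR; the function name, signature, and the 5 GiB fallback constant (dbStorageLimitFallback from the diff) are illustrative assumptions.

```go
// effectiveCaps sketches the agreed behaviour:
//   - storage_limit > 0: obey it as a database size cap; the disk threshold check is disabled.
//   - storage_limit == 0: no database cap; reject writes once disk usage exceeds
//     disk_usage_threshold (default 0.9) of the total disk.
//   - storage_limit == 0 but disk usage cannot be determined: fall back to a fixed
//     database cap instead of running unlimited.
func effectiveCaps(storageLimit uint64, diskUsageThreshold float64, diskTotal uint64, diskStatKnown bool) (dbCap, diskCap uint64) {
    if storageLimit > 0 {
        return storageLimit, 0 // a disk cap of 0 means the disk check is disabled
    }
    if !diskStatKnown {
        const fallback = 5 << 30 // dbStorageLimitFallback from the diff
        return fallback, 0
    }
    return 0, uint64(float64(diskTotal) * diskUsageThreshold)
}
```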