Skip to content

[backport 8.19] add tbs storage limit metrics#20682

Open
ericywl wants to merge 5 commits into8.19from
backport-tbs-monitoring
Open

[backport 8.19] add tbs storage limit metrics#20682
ericywl wants to merge 5 commits into8.19from
backport-tbs-monitoring

Conversation

@ericywl
Copy link
Contributor

@ericywl ericywl commented Mar 12, 2026

Closes #20531.

This is a manual backport of #20464. Due to differences in TBS implementation between 8.19 and 9.x, the disk-related metrics are not backported.

Related PRs:

@ericywl ericywl self-assigned this Mar 12, 2026
@ericywl ericywl requested a review from a team as a code owner March 12, 2026 12:12
@github-actions
Copy link
Contributor

🤖 GitHub comments

Just comment with:

  • run docs-build : Re-trigger the docs validation. (use unformatted text in the comment!)
@ericywl ericywl requested a review from carsonip March 12, 2026 12:13
@carsonip carsonip requested review from Copilot and removed request for carsonip March 12, 2026 12:13
@ericywl ericywl requested review from carsonip and removed request for Copilot March 12, 2026 12:13
@ericywl ericywl changed the title [backport] add tbs storage limit metrics [backport 8.19] add tbs storage limit metrics Mar 12, 2026
Copy link
Member

@carsonip carsonip left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, 1 nit

Comment on lines +121 to +122
storageLimit := getGauge(t, reader, "apm-server.sampling.tail.storage.storage_limit")
assert.Zero(t, storageLimit)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: the test might give better confidence if storage limit is non-zero. Also non-zero tests the *0.9 quirk

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added 98ef2ad.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Backports tail-based sampling (TBS) storage-limit observability to the 8.19 APM Server by adding a new storage_limit metric and consolidating TBS storage metric registration within the Badger storage manager.

Changes:

  • Add an opt-in OpenTelemetry MeterProvider to eventstorage.StorageManager and register TBS storage metrics there (including new apm-server.sampling.tail.storage.storage_limit).
  • Move metric callback registration/unregistration responsibility from main.go into StorageManager lifecycle (NewStorageManager/Close), and record the effective storage limit in Run.
  • Update monitoring tests to assert the new metric and loosen metric assertions to “contains” to avoid brittleness.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File Description
x-pack/apm-server/sampling/eventstorage/storage_manager.go Adds MeterProvider-based metric instruments/callback registration, records effective storage limit, and unregisters callback on close.
x-pack/apm-server/main.go Passes MeterProvider into NewStorageManager via an option and removes the previous global metric callback registration/unregistration.
x-pack/apm-server/main_test.go Updates monitoring assertions to “contain” and adds coverage for the new storage_limit metric.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

Comment on lines +38 to +44
type StorageManagerOptions func(*StorageManager)

func WithMeterProvider(mp metric.MeterProvider) StorageManagerOptions {
return func(sm *StorageManager) {
sm.meterProvider = mp
}
}
Copy link

Copilot AI Mar 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

StorageManagerOptions is a functional-option type but is named in plural form. The prevailing pattern in this repo uses singular ...Option for these (e.g. LoadConfigOption in internal/beatcmd/config.go and ConfigOption in internal/telemetry/metric_exporter_config.go). Consider renaming this to StorageManagerOption and updating NewStorageManager to accept ...StorageManagerOption for consistency.

Copilot uses AI. Check for mistakes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's keep it as StorageManagerOptions to be consistent with main. It was a typo in retrospect. Alternatively, fix it in both 9.x and 8.19

@elasticmachine
Copy link
Collaborator

💚 Build Succeeded

History

cc @ericywl

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

4 participants