
chore(datadog_metrics sink): switch series v2 and sketches to zstd compression#24956

Open
vladimir-dd wants to merge 4 commits into master from vladimir-dd/metrics-v2-zstd

Conversation

@vladimir-dd vladimir-dd commented Mar 18, 2026

Summary

Rationale: switch Series v2 (/api/v2/series) and Sketches (/api/beta/sketches) to zstd compression.

  • Add DatadogMetricsCompression enum (Zlib/Zstd) in config.rs with compressor(), content_encoding(), and max_compressed_size() methods
  • Add compression() method on DatadogMetricsEndpoint: Series v2 and Sketches → Zstd, Series v1 → Zlib
  • Add max_compressed_size(n) for each scheme: Zlib uses the DEFLATE stored-block worst-case formula; Zstd mirrors the ZSTD_compressBound C macro
  • Propagate content_encoding through DatadogMetricsRequest and the request builder instead of hardcoding "deflate"
  • Make DatadogMetricsEncoder::new() infallible — production limits from payload_limits() are always valid; remove CreateError and validate_payload_size_limits
  • Track buffered_bound for all compressor types (zstd 128KB blocks, zlib 4KB BufWriter) to avoid underestimating compressed payload size
  • Fix SMP regression benchmark (statsd_to_datadog_metrics): switch to ingress_throughput, which is a better default benchmark of overall throughput
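The compression selector described above can be sketched as follows. This is a minimal sketch: the enum, method names, and formulas come from this PR description, but the exact signatures and visibility in config.rs may differ.

```rust
/// Sketch of the per-endpoint compression scheme. The DEFLATE stored-block
/// worst case and the ZSTD_compressBound mirror follow the formulas quoted
/// in the summary above.
#[derive(Clone, Copy, Debug)]
pub enum DatadogMetricsCompression {
    Zlib,
    Zstd,
}

impl DatadogMetricsCompression {
    /// Value sent in the Content-Encoding header.
    pub const fn content_encoding(self) -> &'static str {
        match self {
            Self::Zlib => "deflate",
            Self::Zstd => "zstd",
        }
    }

    /// Worst-case compressed size for `n` uncompressed bytes.
    pub const fn max_compressed_size(self, n: usize) -> usize {
        match self {
            // DEFLATE stored-block worst case: 5 header bytes per 16 KB block.
            Self::Zlib => n + (1 + n.saturating_sub(6) / 16_384) * 5,
            // Mirror of the ZSTD_compressBound C macro: n + (n >> 8), plus a
            // small correction for inputs under one 128 KB block.
            Self::Zstd => {
                let margin = if n < 128 << 10 { ((128 << 10) - n) >> 11 } else { 0 };
                n + (n >> 8) + margin
            }
        }
    }
}
```

Note that the Codex review below questions whether the zstd token should be "zstd1" rather than "zstd" for the Datadog metrics v2 API.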

Compressed size capacity estimate:

The encoder needs to decide whether accepting one more metric would exceed the compressed payload limit, without being able to back out a compressor write. The estimate splits into two parts:

  1. Bytes already flushed to the output buffer (get_ref().len()) — exact compressed size
  2. Bytes still in the compressor's internal buffer — estimated via max_compressed_size(buffered_bound + n) (worst-case upper bound)

All compressors buffer internally before flushing (zstd: 128 KB per block, zlib: 4 KB BufWriter). buffered_bound tracks an upper bound on uncompressed bytes not yet visible in get_ref().len(), resetting to n when a flush is detected.
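The two-part estimate can be expressed as a single predicate. This is a hypothetical sketch with illustrative names; the real encoder folds this check into its encode path rather than exposing a free function.

```rust
/// Would writing `incoming` more uncompressed bytes risk exceeding the
/// compressed payload limit? `flushed_len` is the exact compressed size
/// already visible in the output buffer (get_ref().len()); bytes still
/// inside the compressor are covered by the worst-case bound.
fn would_fit(
    flushed_len: usize,
    buffered_bound: usize,
    incoming: usize,
    compressed_limit: usize,
    max_compressed_size: impl Fn(usize) -> usize,
) -> bool {
    flushed_len + max_compressed_size(buffered_bound + incoming) <= compressed_limit
}
```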

Tests added:

  • max_compressed_size_is_upper_bound: empirically validates that both the Zlib and Zstd formulas are true upper bounds using incompressible (Xorshift64) data, and that they are not overly conservative (slack ≤ 1% + 64 bytes)
  • zstd_v2_payload_never_exceeds_512kb_with_incompressible_data: end-to-end test with real 512KB limit, verifies payload ≤ 512KB (safety) and > 95% utilization (efficiency) using high-entropy printable ASCII metric names
  • compressed_limit_is_respected_regardless_of_compressor_internal_buffering: regression test for zstd's 128KB internal buffering — uses a 512-byte compressed limit where get_ref().len() stays 0 throughout, verifying the encoder stops after a handful of metrics (not 100)
  • zstd_buffered_bound_resets_to_last_metric_size_after_block_flush: white-box test directly verifying buffered_bound resets to exactly n (not 0) after a zstd block flush
  • encode_series_v2_breaks_out_when_limit_reached_compressed: verifies the hot-path compressed-limit check works correctly for the zstd path
  • encoding_check_for_payload_limit_edge_cases_v2: proptest that any Series v2 payload decompresses cleanly with zstd and stays within configured limits
  • v2_series_default_limits_split_large_batches: validates 120k metrics are correctly split across multiple batches with v2 limits
  • default_batch_config_uses_endpoint_specific_size_limits / v1_batch_config_uses_v1_size_limit / explicit_max_bytes_applies_to_both_endpoints: verify per-endpoint batch size limits
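For reference, a minimal Xorshift64 generator of the kind the upper-bound test describes. The exact shift triple here (13/7/17, the classic Marsaglia constants) is an assumption; the PR only names the algorithm.

```rust
/// Tiny xorshift64 PRNG; its output is high-entropy and effectively
/// incompressible, which is what the upper-bound test needs.
struct Xorshift64(u64);

impl Xorshift64 {
    fn next(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}
```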
Correctness analysis

V1/zlib path preserved

  • Series(V1).compression() and Sketches.compression() both return Zlib — no change in compressor selection
  • Zlib.content_encoding() returns "deflate" — same as the previously hardcoded Content-Encoding header
  • Zlib.compressor() returns Compression::zlib_default().into() — identical to the old get_compressor()
  • write_payload_header / write_payload_footer still emit JSON wrapping ({"series":[ / ]}) for V1, nothing for V2/Sketches
  • The zlib max_compressed_size(n) formula is algebraically identical to the old n + max_compressed_overhead_len(n):
    both compute n + (1 + n.saturating_sub(6) / 16384) * 5
  • The only behavioral change for zlib: buffered_bound now makes the compressed-size estimate slightly more conservative by accounting for the ~4 KB BufWriter buffer. This is more correct than before and the impact is negligible against the 3.2 MB compressed limit

V2/zstd path

  • The ZSTD_compressBound formula (n + (n >> 8) + correction for <128KB) matches the C library macro exactly
  • buffered_bound tracking is sound: accumulates on each write (+= n), resets to n (not 0) when a flush is detected — because the triggering write may straddle the block boundary, n is a safe upper bound on what remains buffered
  • Header/footer bytes written to the compressor are tracked in buffered_bound (header via try_encode, footer is 0 bytes for V2)
  • reset_state() creates the correct compressor for the endpoint (was previously always zlib via Default)
  • finish() retains its existing safety net: if the payload exceeds the compressed limit after finalization, it returns TooLarge with a recommended split count
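The accumulate/reset rule for buffered_bound can be sketched as below. This is illustrative: the flush-detection mechanism (comparing the output buffer length before and after a write) is how the description above reads, but the real encoder's bookkeeping may be structured differently.

```rust
/// Tracks an upper bound on uncompressed bytes the compressor is still
/// holding internally (not yet visible in get_ref().len()).
struct BoundTracker {
    buffered_bound: usize,
    last_flushed_len: usize,
}

impl BoundTracker {
    /// Record a write of `n` uncompressed bytes; `flushed_len_now` is the
    /// output buffer length observed after the write.
    fn record_write(&mut self, n: usize, flushed_len_now: usize) {
        if flushed_len_now > self.last_flushed_len {
            // A block flush happened. The triggering write may straddle the
            // block boundary, so reset to n (not 0): n is a safe upper bound
            // on what remains buffered.
            self.buffered_bound = n;
            self.last_flushed_len = flushed_len_now;
        } else {
            self.buffered_bound += n;
        }
    }
}
```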

Removed code

  • CreateError / FailedToBuild: construction is now infallible since limits always come from payload_limits()
  • validate_payload_size_limits: no longer needed — with_payload_limits() is gated behind #[cfg(test)], production code always uses well-known API limits
  • is_series(): only consumer was the removed validate_payload_size_limits
  • get_compressor() / max_compressed_overhead_len() / max_compression_overhead_len(): replaced by DatadogMetricsCompression::compressor() and max_compressed_size()

Vector configuration

```yaml
sinks:
  datadog_metrics:
    type: datadog_metrics
    inputs: [...]
    default_api_key: "${DD_API_KEY}"
    series_api_version: v2 # now correctly uses zstd
```

How did you test this PR?

  • Unit tests: all datadog metrics encoder tests pass (cargo test --no-default-features --features sinks-datadog_metrics).

End-to-end correctness test (branch)

Ran scripts/validate_dd_metrics_correctness.py against the real Datadog API. All 18 metric checks passed for both v1 and v2, with identical values:

| Metric | v1 | v2 |
| --- | --- | --- |
| counter | 50.0 | 50.0 ✅ |
| gauge | 42.5 | 42.5 ✅ |
| set | 1.0 | 1.0 ✅ |
| dist avg/count/sum/min/max | | |
| histogram count/avg | | |
| summary sum/count/ratio | | |
| multi-tag counter (group:a/b/*) | | |
| multi-tag gauge (group:a/b) | | |

All 18 metrics match between v1 and v2.

v1/zlib vs v2/zstd performance benchmark (branch)

Ran scripts/benchmark_dd_metrics_v1_v2.py against the real API at 50k events/sec, 2 repeats, 15s warmup, 60s measure:

| Metric | v1/zlib | v2/zstd | Delta |
| --- | --- | --- | --- |
| Sent events/s | 50,922 | 50,311 | -1.2% (≈equal) |
| Compressed bytes/s* | 3.33 MB/s | 1.11 MB/s | -66.6% (better compression) |
| Avg CPU % | 169.7 | 131.7 | -22.4% |
| Avg RSS (MB) | 7,334 | 2,478 | -66.2% |
| Peak RSS (MB) | 10,162 | 2,710 | -73.3% |
| Delivery ratio | 1.27 | 1.20 | ≈equal |
| HTTP requests/s | 10.4 | 124.4 | +1093% (expected: smaller 512KB batches vs 3.2MB) |

\* bytes_sent() in the DD metrics service was changed from request_encoded_size() (uncompressed) to request_wire_size() (compressed/on-the-wire).

Key takeaway: v2 delivers the same metric throughput as v1 while using 22% less CPU, 66% less memory, and 67% less bandwidth. The higher HTTP request rate is expected due to the smaller v2 payload limit (512KB vs 3.2MB).

SMP regression benchmark

The statsd_to_datadog_metrics SMP benchmark reported a -69% drop in egress_throughput (compressed bytes received by the blackhole), while ingress_throughput increased by ~75%:

ingress_throughput benchmark:
[screenshot: Screenshot 2026-03-20 at 08 10 46]

egress_throughput benchmark; the "regression" here is an improvement (OPW sends out 3x fewer bytes):
[screenshot: Screenshot 2026-03-20 at 08 20 03]

Change Type

  • New feature

Is this a breaking change?

  • No

Does this PR include user facing changes?

  • Yes. Please add a changelog fragment based on our guidelines.

References

Notes

  • Please read our Vector contributor resources.
  • Do not hesitate to use @vectordotdev/vector to reach out to us regarding this PR.
  • Some CI checks run only after we manually approve them.
    • We recommend adding a pre-push hook, please see this template
    • Alternatively, we recommend running the following locally before pushing to the remote branch:
      • make fmt
      • make check-clippy (if there are failures it's possible some of them can be fixed with make clippy-fix)
      • make test
  • After a review is requested, please avoid force pushes to help us review incrementally.
@github-actions github-actions bot added the domain: sinks Anything related to the Vector's sinks label Mar 18, 2026
@vladimir-dd vladimir-dd force-pushed the vladimir-dd/metrics-v2-zstd branch 14 times, most recently from c4c80b6 to fa052b6 Compare March 18, 2026 19:28
@vladimir-dd vladimir-dd changed the title feat(datadog_metrics sink): add zstd compression for series v2 endpoint feat(datadog_metrics sink): switch series v2 endpoint to zstd compression Mar 18, 2026
@vladimir-dd

@codex review


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 76fb1c59bd

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@vladimir-dd vladimir-dd force-pushed the vladimir-dd/metrics-v2-zstd branch 3 times, most recently from f5faf86 to 783f621 Compare March 19, 2026 09:34
@github-actions github-actions bot added the domain: releasing Anything related to releasing Vector label Mar 19, 2026
@vladimir-dd vladimir-dd changed the title feat(datadog_metrics sink): switch series v2 endpoint to zstd compression WIP: feat(datadog_metrics sink): switch series v2 endpoint to zstd compression Mar 19, 2026
@vladimir-dd vladimir-dd force-pushed the vladimir-dd/metrics-v2-zstd branch 5 times, most recently from 67c992a to eccada1 Compare March 19, 2026 16:41
@vladimir-dd vladimir-dd changed the title WIP: feat(datadog_metrics sink): switch series v2 endpoint to zstd compression chore(datadog_metrics sink): switch series v2 endpoint to zstd compression Mar 20, 2026
@vladimir-dd vladimir-dd force-pushed the vladimir-dd/metrics-v2-zstd branch from 4042a17 to d2df3d5 Compare March 20, 2026 07:56
@github-actions github-actions bot removed the domain: releasing Anything related to releasing Vector label Mar 20, 2026
vladimir-dd and others added 2 commits March 20, 2026 09:27
…ssion Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…hput The egress_throughput goal measures compressed bytes received by the blackhole. Switching v2 series from zlib to zstd produces smaller compressed payloads (better compression ratio), which registers as a false regression in egress bytes/sec. ingress_throughput measures how fast Vector consumes statsd data from the generator, which is compression-agnostic and reflects actual pipeline performance. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@vladimir-dd vladimir-dd force-pushed the vladimir-dd/metrics-v2-zstd branch from d2df3d5 to 48bdb12 Compare March 20, 2026 08:28
@vladimir-dd vladimir-dd marked this pull request as ready for review March 20, 2026 08:28
@vladimir-dd vladimir-dd requested a review from a team as a code owner March 20, 2026 08:28
@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 48bdb12f7e

vladimir-dd and others added 2 commits March 20, 2026 10:13
…zero limits Start proptest ranges at 1 instead of 0 for uncompressed_limit and compressed_limit. The old validate_payload_size_limits rejected zero limits, but with_payload_limits is now infallible, so finish() can panic on division-by-zero when computing recommended_splits. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…sion Sketches endpoint now uses zstd instead of zlib, matching Series v2. Only Series v1 remains on zlib. Validated against real Datadog API: 36/36 correctness checks passed, all 18 metrics match between v1 and v2. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@vladimir-dd vladimir-dd changed the title chore(datadog_metrics sink): switch series v2 endpoint to zstd compression chore(datadog_metrics sink): switch series v2 and sketches to zstd compression Mar 20, 2026
@pront
Copy link
Member

pront commented Mar 20, 2026

@codex review

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a4f8e56d64

```rust
impl DatadogMetricsCompression {
    pub(super) const fn content_encoding(self) -> &'static str {
        match self {
            Self::Zstd => "zstd",
```

P1: Use Datadog’s expected zstd1 content encoding token

The new Zstd branch returns "zstd", but Datadog’s metrics v2 API documentation and generated clients use MetricContentEncoding::ZSTD1 / contentEncoding: "zstd1" for compressed submits. Because this value is propagated directly to the Content-Encoding header, Series v2 (the default) and sketches requests can be rejected with 4xx on environments that enforce the documented enum, causing dropped metrics instead of the intended compression improvement.

