fix: add structured outputs schema logging for Anthropic and Gemini #3454
Conversation
Add support for logging the `gen_ai.request.structured_output_schema` attribute for Anthropic Claude and Google Gemini APIs, completing coverage across all major LLM providers.

Changes:
- Anthropic: Log the `output_format` parameter with the `json_schema` type. Supports Claude's new structured outputs feature (launched Nov 2025) for Sonnet 4.5 and Opus 4.1 models.
- Gemini: Log `response_schema` from the `generation_config` parameter. Supports both `generation_config.response_schema` and the direct `response_schema` kwarg.
- OpenAI: Already supported (no changes needed).

Sample apps added to demonstrate structured outputs for all three providers.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Walkthrough

This PR extends OpenTelemetry instrumentation for multiple LLM providers (Anthropic, Google Generative AI, OpenAI) to track structured output schemas via a new span attribute, includes demo scripts and comprehensive test coverage, and updates dependencies.
Sequence Diagram

```mermaid
sequenceDiagram
    participant App as Application
    participant Instr as Instrumentation<br/>(span_utils)
    participant LLMClient as LLM Client<br/>(Anthropic/Google/OpenAI)
    participant Span as Span Exporter
    App->>Instr: Call LLM with structured<br/>output schema
    activate Instr
    Instr->>Instr: Extract schema from<br/>output_format or<br/>generation_config
    Instr->>Span: Set LLM_REQUEST_<br/>STRUCTURED_OUTPUT_SCHEMA
    deactivate Instr
    Instr->>LLMClient: Forward API request
    activate LLMClient
    LLMClient-->>Instr: Response
    deactivate LLMClient
    Instr->>Span: Record span with<br/>schema attribute
    Span-->>App: Instrumented trace
```

Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 2
🧹 Nitpick comments (2)
packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/span_utils.py (1)
170-177: Consider logging `structured_output_schema` even when prompt capture is disabled

The `output_format` handling sits under `should_send_prompts()`, so `SpanAttributes.LLM_REQUEST_STRUCTURED_OUTPUT_SCHEMA` won't be set when prompt/content capture is turned off, even though this schema is typically configuration rather than user content. Consider moving this block outside the `should_send_prompts()` guard so the attribute is always populated when `output_format` is present, aligning with how other providers log this attribute.

packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py (1)
395-414: Avoid silent `try/except/pass` when serializing `response_schema`

Both blocks swallow all exceptions when calling `json.dumps(...)`, which makes schema/serialization issues hard to debug and triggers Ruff warnings (S110, BLE001). Consider narrowing the exception type and logging instead of passing silently, e.g.:

```diff
-    if generation_config and hasattr(generation_config, "response_schema"):
-        try:
-            _set_span_attribute(
-                span,
-                SpanAttributes.LLM_REQUEST_STRUCTURED_OUTPUT_SCHEMA,
-                json.dumps(generation_config.response_schema),
-            )
-        except Exception:
-            pass
+    if generation_config and hasattr(generation_config, "response_schema"):
+        try:
+            _set_span_attribute(
+                span,
+                SpanAttributes.LLM_REQUEST_STRUCTURED_OUTPUT_SCHEMA,
+                json.dumps(generation_config.response_schema),
+            )
+        except (TypeError, ValueError) as exc:
+            logger.debug(
+                "Failed to serialize generation_config.response_schema for span: %s",
+                exc,
+            )
@@
-    if "response_schema" in kwargs:
-        try:
-            _set_span_attribute(
-                span,
-                SpanAttributes.LLM_REQUEST_STRUCTURED_OUTPUT_SCHEMA,
-                json.dumps(kwargs.get("response_schema")),
-            )
-        except Exception:
-            pass
+    if "response_schema" in kwargs:
+        try:
+            _set_span_attribute(
+                span,
+                SpanAttributes.LLM_REQUEST_STRUCTURED_OUTPUT_SCHEMA,
+                json.dumps(kwargs.get("response_schema")),
+            )
+        except (TypeError, ValueError) as exc:
+            logger.debug(
+                "Failed to serialize kwargs['response_schema'] for span: %s",
+                exc,
+            )
```

This keeps failures non-fatal while giving observability into bad schemas.
Please verify with your supported `generation_config.response_schema`/`response_schema` types that `json.dumps(...)` (or any custom encoder you choose) behaves as expected across the Google Generative AI SDK versions you intend to support.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (5)
- packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/span_utils.py (1 hunks)
- packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py (1 hunks)
- packages/sample-app/sample_app/anthropic_structured_outputs_demo.py (1 hunks)
- packages/sample-app/sample_app/gemini_structured_outputs_demo.py (1 hunks)
- packages/sample-app/sample_app/openai_structured_outputs_demo.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Store API keys only in environment variables/secure vaults; never hardcode secrets in code
Use Flake8 for code linting and adhere to its rules
Files:
- packages/sample-app/sample_app/gemini_structured_outputs_demo.py
- packages/sample-app/sample_app/anthropic_structured_outputs_demo.py
- packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py
- packages/sample-app/sample_app/openai_structured_outputs_demo.py
- packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/span_utils.py
🧬 Code graph analysis (5)
packages/sample-app/sample_app/gemini_structured_outputs_demo.py (3)
packages/traceloop-sdk/traceloop/sdk/__init__.py (2)
- Traceloop (37-275), init (49-206)
packages/sample-app/sample_app/anthropic_structured_outputs_demo.py (1)
- main (15-52)
packages/sample-app/sample_app/openai_structured_outputs_demo.py (1)
- main (22-35)
packages/sample-app/sample_app/anthropic_structured_outputs_demo.py (3)
packages/traceloop-sdk/traceloop/sdk/__init__.py (2)
- Traceloop (37-275), init (49-206)
packages/sample-app/sample_app/gemini_structured_outputs_demo.py (1)
- main (15-45)
packages/sample-app/sample_app/openai_structured_outputs_demo.py (1)
- main (22-35)
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py (2)
packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/span_utils.py (1)
- _set_span_attribute (18-22)
packages/opentelemetry-semantic-conventions-ai/opentelemetry/semconv_ai/__init__.py (1)
- SpanAttributes (64-245)
packages/sample-app/sample_app/openai_structured_outputs_demo.py (3)
packages/traceloop-sdk/traceloop/sdk/__init__.py (2)
- Traceloop (37-275), init (49-206)
packages/sample-app/sample_app/anthropic_structured_outputs_demo.py (1)
- main (15-52)
packages/sample-app/sample_app/gemini_structured_outputs_demo.py (1)
- main (15-45)
packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/span_utils.py (1)
packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/utils.py (1)
set_span_attribute(21-25)
🪛 Flake8 (7.3.0)
packages/sample-app/sample_app/anthropic_structured_outputs_demo.py
[error] 1-1: 'os' imported but unused
(F401)
packages/sample-app/sample_app/openai_structured_outputs_demo.py
[error] 4-4: 'opentelemetry.sdk.trace.export.ConsoleSpanExporter' imported but unused
(F401)
🪛 Ruff (0.14.5)
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py
403-404: try-except-pass detected, consider logging the exception
(S110)
403-403: Do not catch blind exception: Exception
(BLE001)
413-414: try-except-pass detected, consider logging the exception
(S110)
413-413: Do not catch blind exception: Exception
(BLE001)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Build Packages (3.11)
- GitHub Check: Test Packages (3.12)
- GitHub Check: Test Packages (3.11)
- GitHub Check: Test Packages (3.10)
- GitHub Check: Lint
🔇 Additional comments (1)
packages/sample-app/sample_app/gemini_structured_outputs_demo.py (1)
1-49: Gemini structured outputs demo looks good

The demo cleanly configures the client from environment, defines a simple JSON schema, and uses `GenerationConfig.response_schema` consistently with the other providers. No changes needed from my side.
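The demo file itself is not quoted in this thread. As a rough sketch of what a Gemini structured-output demo of this shape looks like (the model name, prompt text, and schema contents below are assumptions, not the PR's actual code):

```python
# Hypothetical reconstruction of a gemini_structured_outputs_demo.py; the real
# file is not shown in this thread, so names and values here are assumptions.
JOKE_SCHEMA = {
    "type": "OBJECT",
    "properties": {
        "setup": {"type": "STRING"},
        "punchline": {"type": "STRING"},
        "rating": {"type": "INTEGER"},
    },
    "required": ["setup", "punchline", "rating"],
}


def main():
    # Imports kept local so the schema above can be inspected without the SDK.
    import os

    import google.generativeai as genai
    from traceloop.sdk import Traceloop

    Traceloop.init(app_name="gemini_structured_outputs_demo")
    # API key comes from the environment, never hardcoded (per repo guidelines).
    genai.configure(api_key=os.environ["GEMINI_API_KEY"])
    model = genai.GenerativeModel("gemini-1.5-flash")
    response = model.generate_content(
        "Tell me a joke about OpenTelemetry and rate it from 1 to 10",
        generation_config=genai.GenerationConfig(
            response_mime_type="application/json",
            response_schema=JOKE_SCHEMA,
        ),
    )
    print(response.text)
```

With instrumentation active, the `response_schema` passed here is what ends up serialized into `gen_ai.request.structured_output_schema` on the span.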
packages/sample-app/sample_app/anthropic_structured_outputs_demo.py (outdated, resolved)
Remove unused imports to fix flake8 lint errors:
- Remove unused 'os' import from anthropic_structured_outputs_demo.py
- Remove unused 'ConsoleSpanExporter' import from openai_structured_outputs_demo.py

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Actionable comments posted: 0
🧹 Nitpick comments (1)
packages/sample-app/sample_app/openai_structured_outputs_demo.py (1)
25-25: Consider aligning the prompt with other demos.

The prompt in this demo doesn't explicitly request a rating, while the Anthropic and Gemini demos both ask to "rate it." Although structured outputs will enforce the schema regardless, explicitly requesting the rating improves output quality and consistency across demos.

```diff
-    messages=[{"role": "user", "content": "Tell me a joke about OpenTelemetry"}],
+    messages=[{"role": "user", "content": "Tell me a joke about OpenTelemetry and rate it from 1 to 10"}],
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
- packages/sample-app/sample_app/anthropic_structured_outputs_demo.py (1 hunks)
- packages/sample-app/sample_app/openai_structured_outputs_demo.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- packages/sample-app/sample_app/anthropic_structured_outputs_demo.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Store API keys only in environment variables/secure vaults; never hardcode secrets in code
Use Flake8 for code linting and adhere to its rules
Files:
packages/sample-app/sample_app/openai_structured_outputs_demo.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR (Repo: traceloop/openllmetry, File: CLAUDE.md, Timestamp: 2025-08-17T15:06:48.109Z)
Learning: Instrumentation packages must leverage the semantic conventions package and emit OTel-compliant spans

📚 Learning: 2025-08-17T15:06:48.109Z
Learnt from: CR (Repo: traceloop/openllmetry, File: CLAUDE.md, Timestamp: 2025-08-17T15:06:48.109Z)
Learning: For debugging OpenTelemetry spans, use ConsoleSpanExporter with Traceloop to print spans to console

Applied to files:
packages/sample-app/sample_app/openai_structured_outputs_demo.py
🧬 Code graph analysis (1)
packages/sample-app/sample_app/openai_structured_outputs_demo.py (3)
packages/traceloop-sdk/traceloop/sdk/__init__.py (2)
- Traceloop (37-275), init (49-206)
packages/sample-app/sample_app/anthropic_structured_outputs_demo.py (1)
- main (14-51)
packages/sample-app/sample_app/gemini_structured_outputs_demo.py (1)
- main (15-45)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Test Packages (3.10)
- GitHub Check: Test Packages (3.12)
- GitHub Check: Test Packages (3.11)
- GitHub Check: Build Packages (3.11)
- GitHub Check: Lint
🔇 Additional comments (6)
packages/sample-app/sample_app/openai_structured_outputs_demo.py (6)
1-5: LGTM! Unused import issue resolved.

The imports are clean and all used in the code. The previously flagged `ConsoleSpanExporter` import has been removed.
7-9: LGTM! Proper API key handling.

Environment variables are loaded correctly, and the API key is retrieved from the environment as per coding guidelines.

11-13: LGTM!

Traceloop initialization is correct with an appropriate app name for this demo.

16-18: LGTM!

The Pydantic model is well-defined for structured output validation.

37-38: LGTM!

Standard entry point implementation is correct.
23-27: Model and beta API endpoint verified as available; note known SDK parsing issues.

Verification confirms that `gpt-4o-2024-08-06` is still available and actively supported by OpenAI (including for fine-tuning), and the `client.beta.chat.completions.parse` beta endpoint is available. However, the openai-python SDK has known integration bugs with parse() related to JSON validation and edge cases in parsed responses. Test your structured output handling thoroughly and monitor the openai-python repository for bug fixes.
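For context, a minimal sketch of the structured-output request shape this demo exercises. The helper below builds the equivalent `response_format` payload as plain data; the actual demo uses `client.beta.chat.completions.parse` with a Pydantic model, and the field names here are assumptions mirroring the demo's joke model, not its actual code.

```python
def build_response_format(name: str, schema: dict) -> dict:
    """Build the json_schema response_format payload that OpenAI structured
    outputs expect; strict mode requires additionalProperties: false."""
    return {
        "type": "json_schema",
        "json_schema": {"name": name, "strict": True, "schema": schema},
    }


# Assumed to mirror the demo's Pydantic model (field names are a guess).
joke_schema = {
    "type": "object",
    "properties": {
        "setup": {"type": "string"},
        "punchline": {"type": "string"},
    },
    "required": ["setup", "punchline"],
    "additionalProperties": False,
}

payload = build_response_format("joke", joke_schema)
print(payload["type"])  # → json_schema
```

When using `parse()` with a Pydantic model instead, the SDK derives an equivalent strict schema automatically; either way, the schema is what the instrumentation serializes into `gen_ai.request.structured_output_schema`.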
Important
Looks good to me! 👍
Reviewed d6360b2 in 13 minutes and 44 seconds. Click for details.
- Reviewed 21 lines of code in 2 files
- Skipped 0 files when reviewing.
- Skipped posting 2 draft comments. View those below.
- Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/sample-app/sample_app/anthropic_structured_outputs_demo.py:1
- Draft comment: Good removal of unused 'os' import to keep the code clean.
- Reason this comment was not posted: Confidence changes required: 0% <= threshold 50%
2. packages/sample-app/sample_app/openai_structured_outputs_demo.py:4
- Draft comment: Removed unused 'ConsoleSpanExporter' import; this is a good cleanup.
- Reason this comment was not posted: Confidence changes required: 0% <= threshold 50%
Workflow ID: wflow_IqIYoUKp7bNNE3SH
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
Add comprehensive test coverage for Anthropic structured outputs feature:
- Three test scenarios: legacy attributes, with content events, without content
- Tests verify gen_ai.request.structured_output_schema attribute is logged
- Enhanced span_utils.py to handle both json_schema and json output formats

Note: Tests are currently skipped as they require anthropic SDK >= 0.50.0 which supports the output_format parameter. The feature was announced in November 2025 but the pinned SDK version (0.49.0) doesn't yet support it. Tests will be enabled once the SDK is updated.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Important
Looks good to me! 👍
Reviewed everything up to 1de9ffa in 89 minutes and 38 seconds. Click for details.
- Reviewed 214 lines of code in 5 files
- Skipped 0 files when reviewing.
- Skipped posting 3 draft comments. View those below.
- Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/span_utils.py:170
- Draft comment: Consider handling cases where the provided schema might not be JSON serializable. Logging or error handling would help diagnose issues.
- Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50% The comment is suggesting defensive programming for json.dumps(), but this appears to be speculative rather than identifying a real issue. The function is already wrapped with @dont_throw decorator which should handle exceptions. Additionally, the same pattern of calling json.dumps() without explicit try-catch is used throughout the file (lines 154, 167, 243, 311), so this would be an inconsistent suggestion unless applied everywhere. The comment doesn't point to a specific bug introduced by this change - it's more of a general code quality suggestion that could apply to many places in the codebase. According to the rules, speculative comments should be removed, and comments should only be kept if there's strong evidence of an issue. Could the schema contain non-serializable objects that would cause json.dumps() to fail? Perhaps the @dont_throw decorator doesn't provide adequate error visibility, and explicit logging would be better for debugging. Maybe this specific case is more prone to serialization issues than the other json.dumps() calls in the file. While it's theoretically possible for the schema to be non-serializable, the comment is speculative and doesn't provide evidence that this is a real issue. The @dont_throw decorator already provides error handling at the function level, and the same pattern is used consistently throughout the file. If this were a real concern, it would apply to all json.dumps() calls, not just this one. The comment doesn't identify a specific problem with the change. This comment should be deleted. It's a speculative suggestion about potential error handling that doesn't identify a specific issue with the code change. The function is already protected by the @dont_throw decorator, and the same json.dumps() pattern is used consistently throughout the file without additional error handling.
2. packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py:395
- Draft comment: If both generation_config.response_schema and kwargs['response_schema'] are provided, the latter overwrites the former. Verify if this override behavior is intended.
- Reason this comment was not posted: Comment did not seem useful. Confidence is useful = 0% <= threshold 50%. The comment asks the author to verify whether the override behavior is intended, which is against the rules; it does not provide a specific suggestion or ask for a test to be written, so it was removed.
3. packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py:396
- Draft comment: Consider logging exceptions in the try/except blocks when setting the structured output schema to aid future debugging.
- Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 30% vs. threshold = 50% This comment is about code that was added in the diff (lines 395-414). It's suggesting a code quality improvement - adding logging to exception handlers. The file already has logging infrastructure in place and uses it elsewhere (lines 70, 118 show similar patterns with logger.warning). The suggestion is actionable and clear. However, I need to consider the rules: "Comments that suggest code quality refactors are good! But only if they are actionable and clear." This seems to fit that criteria. But I also need to consider if this is "obvious or unimportant" - the code currently silently swallows exceptions, which could make debugging harder. The comment is making a reasonable suggestion for better observability. However, the pattern of silent exception handling might be intentional here (similar to the @dont_throw decorator used elsewhere). Looking more carefully, there are other places in the code with similar bare exception handlers that don't log (lines 43-44 in _is_image_part). This suggests the silent exception handling might be a deliberate pattern in this codebase. The silent exception handling might be intentional and consistent with other patterns in the codebase (e.g., _is_image_part function). The comment is suggesting a style preference rather than fixing a clear bug. Without knowing if there's a specific debugging issue or if this pattern is deliberately chosen for performance or other reasons, this might be speculative or stylistic rather than clearly necessary. While there are some instances of silent exception handling in the codebase, there are also examples where exceptions ARE logged (lines 70, 118). The new code is handling JSON serialization which could fail in various ways, and logging would help understand why. However, this is still a "nice to have" suggestion rather than a clear bug fix. 
The rule states comments should be about clear code changes required, not suggestions for improvements unless they're clearly actionable quality refactors. This is a reasonable code quality suggestion, but it's not a required change - it's more of a "nice to have" improvement. The codebase shows mixed patterns (some exceptions logged, some not), and the silent exception handling might be intentional. Given the rule to only keep comments with STRONG EVIDENCE they are correct and necessary, and that by default we should assume comments are not useful, this should be deleted.
Workflow ID: wflow_QU6hLC1XyLCmcwJo
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
Important
Looks good to me! 👍
Reviewed ca5f423 in 34 minutes and 3 seconds. Click for details.
- Reviewed 222 lines of code in 2 files
- Skipped 0 files when reviewing.
- Skipped posting 3 draft comments. View those below.
- Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/span_utils.py:178
- Draft comment: Consider caching the result of output_format.get('json_schema') in a variable for clarity before accessing the 'schema' key.
- Reason this comment was not posted: Confidence changes required: 33% <= threshold 50%
2. packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/span_utils.py:172
- Draft comment: Add an inline comment explaining the difference between 'json_schema' and 'json' types in output_format to aid future maintenance.
- Reason this comment was not posted: Confidence changes required: 33% <= threshold 50%
3. packages/opentelemetry-instrumentation-anthropic/tests/test_structured_outputs.py:43
- Draft comment: Remove the duplicate pytest.mark.skip decorator to avoid redundancy.
- Reason this comment was not posted: Confidence changes required: 33% <= threshold 50%
Workflow ID: wflow_ZrOmwwx7Az5swzKf
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
galkleinman left a comment
LGTM
neat: consider moving magic strings to consts
Actionable comments posted: 0
🧹 Nitpick comments (3)
packages/opentelemetry-instrumentation-anthropic/tests/test_structured_outputs.py (3)
17-40: Structured output schema and OUTPUT_FORMAT definition

This setup looks correct for a JSON-schema-backed structured output; if you want the example to fully reflect the "1 to 10" description, you could optionally add `"minimum": 1` / `"maximum": 10` to the `rating` property, but it isn't required for validating the instrumentation behavior.
43-60: Duplicate skip decorator and ARG001 on fixtures

You have two identical `@pytest.mark.skip` decorators on this test; one is sufficient. Also, Ruff's ARG001 on `instrument_legacy` is expected here because it's a pytest fixture injected by name, so it doesn't need to be referenced in the body.

```diff
-@pytest.mark.skip(reason="Requires anthropic SDK >= 0.50.0 with structured outputs support")
-@pytest.mark.skip(reason="Requires anthropic SDK >= 0.50.0 with structured outputs support")
+@pytest.mark.skip(reason="Requires anthropic SDK >= 0.50.0 with structured outputs support")
```
106-151: Reduce duplication and make log-count assertions less brittle
`test_anthropic_structured_outputs_with_events_with_content` and `..._with_no_content` are almost identical apart from the instrumentation fixture and expected logging, so you could factor shared setup/assertions into a helper or parametrize over `(fixture, expected_log_count)` to cut duplication. Also, hard-coding `len(logs) == 2` may be fragile if the instrumentation later adds extra events; consider asserting a minimum count or filtering logs by an identifying attribute instead. ARG001 on the `instrument_with_content`/`instrument_with_no_content` parameters is similarly expected pytest-fixture behavior.

Also applies to: 153-197
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
- packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/span_utils.py (1 hunks)
- packages/opentelemetry-instrumentation-anthropic/tests/test_structured_outputs.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/span_utils.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Store API keys only in environment variables/secure vaults; never hardcode secrets in code
Use Flake8 for code linting and adhere to its rules
Files:
packages/opentelemetry-instrumentation-anthropic/tests/test_structured_outputs.py
🧬 Code graph analysis (1)
packages/opentelemetry-instrumentation-anthropic/tests/test_structured_outputs.py (4)
packages/opentelemetry-semantic-conventions-ai/opentelemetry/semconv_ai/__init__.py (1)
- SpanAttributes (64-245)
packages/opentelemetry-instrumentation-anthropic/tests/utils.py (1)
- verify_metrics (7-71)
packages/opentelemetry-instrumentation-milvus/tests/conftest.py (1)
- reader (37-41)
packages/traceloop-sdk/traceloop/sdk/utils/in_memory_span_exporter.py (1)
- get_finished_spans (40-43)
🪛 Ruff (0.14.5)
packages/opentelemetry-instrumentation-anthropic/tests/test_structured_outputs.py
47-47: Unused function argument: instrument_legacy
(ARG001)
109-109: Unused function argument: instrument_with_content
(ARG001)
156-156: Unused function argument: instrument_with_no_content
(ARG001)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Test Packages (3.11)
- GitHub Check: Build Packages (3.11)
- GitHub Check: Test Packages (3.12)
- GitHub Check: Test Packages (3.10)
- GitHub Check: Lint
🔇 Additional comments (1)
packages/opentelemetry-instrumentation-anthropic/tests/test_structured_outputs.py (1)
62-104: Span, schema, metrics, and logs assertions for legacy path

The assertions on gen-ai prompt/completion attributes, `LLM_REQUEST_STRUCTURED_OUTPUT_SCHEMA`, request/response models, metrics, and legacy log behavior give solid end-to-end coverage for the Anthropic structured-output path once the SDK version is bumped.
- Update anthropic SDK from >=0.36.0 to >=0.50.0 to support structured outputs
- Updated to version 0.74.1 which includes the output_format parameter
- Remove skip decorators from structured outputs tests
- Tests are ready to run once VCR cassettes are recorded with a valid API key

To record cassettes:

```shell
export ANTHROPIC_API_KEY=your_key_here
cd packages/opentelemetry-instrumentation-anthropic
poetry run pytest tests/test_structured_outputs.py --record-mode=once
```

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Actionable comments posted: 2
🧹 Nitpick comments (1)
packages/opentelemetry-instrumentation-anthropic/tests/test_structured_outputs.py (1)
44-44: Consider using the stable model identifier `"claude-sonnet-4-5"` instead of the dated variant.

The model identifier `"claude-sonnet-4-5-20250929"` is valid and officially supported by Anthropic. However, using the stable identifier `"claude-sonnet-4-5"` (without the date suffix) would be more maintainable and future-proof, as it automatically uses the latest available version of Sonnet 4.5 rather than pinning to a specific release date. If you need to pin to a specific version, the current approach is fine; otherwise, consider updating lines 44, 105, and 151 to use `"claude-sonnet-4-5"` for consistency with standard Anthropic API practices.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
⛔ Files ignored due to path filters (1)
- packages/opentelemetry-instrumentation-anthropic/poetry.lock is excluded by `!**/*.lock`
📒 Files selected for processing (2)
- packages/opentelemetry-instrumentation-anthropic/pyproject.toml (1 hunks)
- packages/opentelemetry-instrumentation-anthropic/tests/test_structured_outputs.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Store API keys only in environment variables/secure vaults; never hardcode secrets in code
Use Flake8 for code linting and adhere to its rules
Files:
packages/opentelemetry-instrumentation-anthropic/tests/test_structured_outputs.py
🧬 Code graph analysis (1)
packages/opentelemetry-instrumentation-anthropic/tests/test_structured_outputs.py (5)
packages/opentelemetry-semantic-conventions-ai/opentelemetry/semconv_ai/__init__.py (1)
- SpanAttributes (64-245)
packages/opentelemetry-instrumentation-anthropic/tests/utils.py (1)
- verify_metrics (7-71)
packages/opentelemetry-instrumentation-anthropic/tests/conftest.py (1)
- anthropic_client (70-71)
packages/opentelemetry-instrumentation-milvus/tests/conftest.py (1)
- reader (37-41)
packages/traceloop-sdk/traceloop/sdk/utils/in_memory_span_exporter.py (1)
- get_finished_spans (40-43)
🪛 Ruff (0.14.5)
packages/opentelemetry-instrumentation-anthropic/tests/test_structured_outputs.py
41-41: Unused function argument: instrument_legacy
(ARG001)
102-102: Unused function argument: instrument_with_content
(ARG001)
148-148: Unused function argument: instrument_with_no_content
(ARG001)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Lint
- GitHub Check: Test Packages (3.12)
- GitHub Check: Test Packages (3.10)
- GitHub Check: Test Packages (3.11)
- GitHub Check: Build Packages (3.11)
🔇 Additional comments (2)
packages/opentelemetry-instrumentation-anthropic/tests/test_structured_outputs.py (2)
12-35: LGTM! Well-structured schema definitions. The JOKE_SCHEMA and OUTPUT_FORMAT structures are clearly defined and align with Anthropic's structured outputs API format. The schema properly constrains the response with required fields and additionalProperties set to False for strict validation.
40-42: Static analysis warnings are false positives. The Ruff warnings about unused function arguments (instrument_legacy, instrument_with_content, instrument_with_no_content) are false positives. These are pytest fixtures used for their side effects: they configure the instrumentation before each test runs. This is a standard pytest pattern where fixtures don't need to be explicitly referenced in the test body.
Also applies to: 101-103, 147-149
packages/opentelemetry-instrumentation-anthropic/pyproject.toml (outdated comment; resolved)
```python
}


@pytest.mark.skip(reason="Requires anthropic SDK >= 0.50.0 with structured outputs support")
```
Apply skip decorator consistently across all structured output tests.
Only the first test has the skip decorator for SDK version >= 0.50.0, but all three tests use the same beta.messages.create API with output_format and betas=["structured-outputs-2025-11-13"]. If the SDK version requirement applies to the first test, it should apply to all three tests that exercise the same structured outputs feature.
Apply this diff to add the skip decorator to the remaining tests:
```diff
+@pytest.mark.skip(reason="Requires anthropic SDK >= 0.50.0 with structured outputs support")
 @pytest.mark.vcr
 def test_anthropic_structured_outputs_with_events_with_content(
     instrument_with_content, anthropic_client, span_exporter, log_exporter, reader
 ):

+@pytest.mark.skip(reason="Requires anthropic SDK >= 0.50.0 with structured outputs support")
 @pytest.mark.vcr
 def test_anthropic_structured_outputs_with_events_with_no_content(
     instrument_with_no_content, anthropic_client, span_exporter, log_exporter, reader
 ):
```
Also applies to: 100-100, 146-146
🤖 Prompt for AI Agents
In packages/opentelemetry-instrumentation-anthropic/tests/test_structured_outputs.py around lines 38, 100, and 146, the pytest.mark.skip decorator for "Requires anthropic SDK >= 0.50.0 with structured outputs support" is only applied to the first test; add the same @pytest.mark.skip(reason="Requires anthropic SDK >= 0.50.0 with structured outputs support") decorator immediately above the other two test functions (lines ~100 and ~146) so all tests using beta.messages.create with output_format and betas=["structured-outputs-2025-11-13"] are consistently skipped when the SDK requirement is not met.

Replace LogRecord with Event from opentelemetry._events to fix compatibility with OpenTelemetry SDK 1.38.0. The LogRecord API no longer supports the event_name parameter; events are now emitted using the Event class with a name parameter. Changes:
- Replace LogRecord import with Event from opentelemetry._events
- Update event_emitter.py to create Event instances instead of LogRecord
- Update test assertions to use event.name instead of event.event_name
All 40 tests now passing. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
Remove @pytest.mark.skip decorator from test_anthropic_structured_outputs_legacy now that anthropic SDK 0.74.1 is installed. Tests require VCR cassettes to be recorded with valid ANTHROPIC_API_KEY: poetry run pytest tests/test_structured_outputs.py --record-mode=once 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
…tputs Update the minimum version constraint from >=0.50.0 to >=0.74.0 to ensure structured outputs support is available. Version 0.74.1 includes the necessary .parse() and transform_schema() methods for structured outputs. Verified: - Anthropic SDK 0.74.1 installed - beta.messages.parse() method available - beta.messages.create() with output_format supported - All 15 legacy tests passing 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
Actionable comments posted: 0
🧹 Nitpick comments (4)
packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/event_emitter.py (1)
6-7: Event-based emission wiring looks correct; only minor naming nit. Switching to Event with name, body, and attributes=EVENT_ATTRIBUTES keeps semantics aligned with the tests that assert on log.log_record.name and gen_ai.system. The only minor nit is shadowing the event parameter with a local event variable in both helpers; consider renaming the local (e.g. otel_event) for clarity, but it's not functionally problematic.
Also applies to: 212-218, 236-241
packages/opentelemetry-instrumentation-anthropic/tests/test_structured_outputs.py (3)
12-35: Structured-output schema and OUTPUT_FORMAT are clear and reusable. The JOKE_SCHEMA and OUTPUT_FORMAT constants are well-factored and make the tests readable. If you ever want stronger validation, you could also assert additionalProperties/required in the tests when inspecting the schema attribute, but that's optional.
38-97: Legacy structured-outputs test covers key attributes and metrics. The legacy path test exercises span prompt/completion, validates LLM_REQUEST_STRUCTURED_OUTPUT_SCHEMA contents, checks request/response model attribution, parses the JSON response, and verifies metrics and absence of events when using legacy attributes. This is a solid end-to-end check of the new schema attribute on spans.

Static analysis warning about instrument_legacy being unused is expected for a pytest fixture used only for side effects; if ARG001 is enforced in CI you can silence it with a # noqa: ARG001 on that parameter, but functionally this is fine.
99-188: Event-mode structured-outputs tests look good; consider asserting log contents if needed. Both event-mode tests (with and without content) correctly:
- exercise the same beta.messages.create structured-outputs path,
- assert the presence and basic structure of LLM_REQUEST_STRUCTURED_OUTPUT_SCHEMA,
- confirm request/response model attribution,
- validate the JSON response shape, and
- verify metrics plus the expected event count (2 logs).

That gives good coverage of the new behavior. If you later want stronger regression protection, you might reuse assert_message_in_logs from test_messages.py here to check event bodies as well, but it's not strictly necessary.

Similar to the first test, instrument_with_content and instrument_with_no_content being unused in the body is normal for fixtures; add # noqa: ARG001 only if your linter treats this as an error.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/event_emitter.py (3 hunks)
packages/opentelemetry-instrumentation-anthropic/tests/test_messages.py (1 hunks)
packages/opentelemetry-instrumentation-anthropic/tests/test_structured_outputs.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Store API keys only in environment variables/secure vaults; never hardcode secrets in code
Use Flake8 for code linting and adhere to its rules
Files:
packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/event_emitter.py
packages/opentelemetry-instrumentation-anthropic/tests/test_messages.py
packages/opentelemetry-instrumentation-anthropic/tests/test_structured_outputs.py
🧬 Code graph analysis (1)
packages/opentelemetry-instrumentation-anthropic/tests/test_structured_outputs.py (4)
packages/opentelemetry-semantic-conventions-ai/opentelemetry/semconv_ai/__init__.py (1)
SpanAttributes (64-245)
packages/opentelemetry-instrumentation-anthropic/tests/utils.py (1)
verify_metrics (7-71)
packages/opentelemetry-instrumentation-anthropic/tests/conftest.py (1)
anthropic_client (70-71)
packages/traceloop-sdk/traceloop/sdk/utils/in_memory_span_exporter.py (1)
get_finished_spans (40-43)
🪛 Ruff (0.14.5)
packages/opentelemetry-instrumentation-anthropic/tests/test_structured_outputs.py
40-40: Unused function argument: instrument_legacy
(ARG001)
101-101: Unused function argument: instrument_with_content
(ARG001)
147-147: Unused function argument: instrument_with_no_content
(ARG001)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Test Packages (3.12)
- GitHub Check: Test Packages (3.10)
- GitHub Check: Build Packages (3.11)
- GitHub Check: Test Packages (3.11)
- GitHub Check: Lint
🔇 Additional comments (1)
packages/opentelemetry-instrumentation-anthropic/tests/test_messages.py (1)
2442-2453: Helper update matches new Event naming semantics. Asserting on log.log_record.name is consistent with using Event(name=...) in the emitter. The remaining checks on the system attribute and body still validate the important parts of the payload.
Reverting changes from commit 0f2d11a that incorrectly changed from LogRecord to Event API. Main branch is working correctly with LogRecord and event_name.
- Fix output_format structure to use direct 'schema' field instead of nested 'json_schema.schema'
- Update span_utils.py to extract schema from correct location
- Simplify tests for events mode (no request attribute checks)
- Add VCR cassettes for structured outputs tests
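A minimal sketch of that corrected extraction (the helper name is illustrative, not the package's actual code; it assumes the direct-schema shape `{"type": "json_schema", "schema": {...}}` this commit adopts):

```python
import json


def extract_structured_output_schema(kwargs):
    """Pull the JSON schema out of Anthropic's output_format kwarg.

    Returns the schema serialized as JSON so it can be set as a span
    attribute, or None when no structured output was requested.
    """
    output_format = kwargs.get("output_format")
    if not isinstance(output_format, dict):
        return None
    if output_format.get("type") != "json_schema":
        return None
    schema = output_format.get("schema")
    if schema is None:
        return None
    return json.dumps(schema)


# Request kwargs roughly as the instrumentation's wrapper would see them:
request_kwargs = {
    "model": "claude-sonnet-4-5",
    "output_format": {
        "type": "json_schema",
        "schema": {"type": "object", "properties": {"joke": {"type": "string"}}},
    },
}
print(extract_structured_output_schema(request_kwargs))
```

A span wrapper would then set gen_ai.request.structured_output_schema only when the helper returns a value, which is what lets the legacy test assert on the attribute's presence.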
Actionable comments posted: 1
🧹 Nitpick comments (2)
packages/opentelemetry-instrumentation-anthropic/tests/test_structured_outputs.py (2)
32-87: Good coverage of span attributes; consider tightening schema assertion. This test nicely validates:
- Prompt and completion content/roles
- Presence of LLM_REQUEST_STRUCTURED_OUTPUT_SCHEMA
- Request/response model attributes
- Legacy mode avoiding logs

To strengthen regression protection around structured outputs, you could optionally assert full equality between schema_attr and JOKE_SCHEMA (not just presence of keys) so schema drift is caught immediately:

```diff
-    assert "properties" in schema_attr
-    assert "joke" in schema_attr["properties"]
-    assert "rating" in schema_attr["properties"]
+    assert schema_attr == JOKE_SCHEMA
```
34-35: Handle Ruff ARG001 warnings for fixture-only parametersRuff flags
instrument_legacy, instrument_with_content, and instrument_with_no_content as unused arguments, even though they are pytest fixtures used for side effects only. If Ruff is enforced on tests, consider one of:
- Explicitly “use” them in the body:
def test_...(..., instrument_legacy, ...): _ = instrument_legacy # noqa: ARG001 ...
- Or add a per-line/per-file
# noqa: ARG001 as appropriate.
ARG001in test files.This keeps the fixture pattern while satisfying the linter.
Also applies to: 91-92, 120-121
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/span_utils.py (1 hunks)
packages/opentelemetry-instrumentation-anthropic/tests/cassettes/test_structured_outputs/test_anthropic_structured_outputs_legacy.yaml (1 hunks)
packages/opentelemetry-instrumentation-anthropic/tests/cassettes/test_structured_outputs/test_anthropic_structured_outputs_with_events_with_content.yaml (1 hunks)
packages/opentelemetry-instrumentation-anthropic/tests/cassettes/test_structured_outputs/test_anthropic_structured_outputs_with_events_with_no_content.yaml (1 hunks)
packages/opentelemetry-instrumentation-anthropic/tests/test_structured_outputs.py (1 hunks)
✅ Files skipped from review due to trivial changes (2)
- packages/opentelemetry-instrumentation-anthropic/tests/cassettes/test_structured_outputs/test_anthropic_structured_outputs_with_events_with_no_content.yaml
- packages/opentelemetry-instrumentation-anthropic/tests/cassettes/test_structured_outputs/test_anthropic_structured_outputs_legacy.yaml
🚧 Files skipped from review as they are similar to previous changes (1)
- packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/span_utils.py
🧰 Additional context used
📓 Path-based instructions (2)
**/cassettes/**/*.{yaml,yml,json}
📄 CodeRabbit inference engine (CLAUDE.md)
Never commit secrets or PII in VCR cassettes; scrub sensitive data
Files:
packages/opentelemetry-instrumentation-anthropic/tests/cassettes/test_structured_outputs/test_anthropic_structured_outputs_with_events_with_content.yaml
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Store API keys only in environment variables/secure vaults; never hardcode secrets in code
Use Flake8 for code linting and adhere to its rules
Files:
packages/opentelemetry-instrumentation-anthropic/tests/test_structured_outputs.py
🧠 Learnings (1)
📓 Common learnings
Learnt from: CR Repo: traceloop/openllmetry PR: 0 File: CLAUDE.md:0-0 Timestamp: 2025-08-17T15:06:48.109Z Learning: Instrumentation packages must leverage the semantic conventions package and emit OTel-compliant spans
🪛 Ruff (0.14.5)
packages/opentelemetry-instrumentation-anthropic/tests/test_structured_outputs.py
34-34: Unused function argument: instrument_legacy
(ARG001)
91-91: Unused function argument: instrument_with_content
(ARG001)
120-120: Unused function argument: instrument_with_no_content
(ARG001)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Test Packages (3.10)
- GitHub Check: Test Packages (3.12)
- GitHub Check: Test Packages (3.11)
- GitHub Check: Build Packages (3.11)
- GitHub Check: Lint
🔇 Additional comments (3)
packages/opentelemetry-instrumentation-anthropic/tests/test_structured_outputs.py (3)
10-29: Schema and OUTPUT_FORMAT setup looks solid. The JOKE_SCHEMA and OUTPUT_FORMAT definitions are clear, reusable across tests, and accurately capture the structured output shape you're exercising. This aligns well with the intent to validate structured-output schema handling.
89-116: Events-with-content test focuses correctly on logs and payload. This test exercises the events-with-content path: it reuses the same structured output call, asserts the JSON payload has the expected keys, and verifies that two logs are emitted. That's a good, focused check on the logging behavior without duplicating all the span assertions from the legacy test.
118-144: Events-with-no-content test mirrors content-path behavior appropriately. This test mirrors the previous one for the "no content" instrumentation variant while revalidating span count, operation name, and the structured JSON payload plus log count. The symmetry between the two variants makes the behavior easy to compare and maintain.
```yaml
headers:
  CF-RAY:
  - 9a30f7bc0cccf169-TLV
  Connection:
  - keep-alive
  Content-Encoding:
  - gzip
  Content-Type:
  - application/json
  Date:
  - Sun, 23 Nov 2025 13:21:09 GMT
  Server:
  - cloudflare
  Transfer-Encoding:
  - chunked
  X-Robots-Tag:
  - none
  anthropic-organization-id:
  - 617d109c-a187-4902-889d-689223d134aa
  anthropic-ratelimit-input-tokens-limit:
  - '2000000'
  anthropic-ratelimit-input-tokens-remaining:
  - '2000000'
  anthropic-ratelimit-input-tokens-reset:
  - '2025-11-23T13:21:07Z'
  anthropic-ratelimit-output-tokens-limit:
  - '400000'
  anthropic-ratelimit-output-tokens-remaining:
  - '400000'
  anthropic-ratelimit-output-tokens-reset:
  - '2025-11-23T13:21:09Z'
  anthropic-ratelimit-tokens-limit:
  - '2400000'
  anthropic-ratelimit-tokens-remaining:
  - '2400000'
  anthropic-ratelimit-tokens-reset:
  - '2025-11-23T13:21:07Z'
  cf-cache-status:
  - DYNAMIC
  request-id:
  - req_011CVQtbb1HQigBLDM6oAQT3
  retry-after:
  - '53'
  strict-transport-security:
  - max-age=31536000; includeSubDomains; preload
  x-envoy-upstream-service-time:
  - '3261'
```
🛠️ Refactor suggestion | 🟠 Major
Scrub Anthropic identifiers from cassette response headers
The response headers currently include anthropic-organization-id and request-id with what appear to be real identifiers. Per the repo guidelines for cassettes (“never commit secrets or PII; scrub sensitive data”), these should be replaced with stable placeholders (e.g., org_XXXXXXXX / req_XXXXXXXX) before committing.
🤖 Prompt for AI Agents
In packages/opentelemetry-instrumentation-anthropic/tests/cassettes/test_structured_outputs/test_anthropic_structured_outputs_with_events_with_content.yaml around lines 58 to 104, scrub the sensitive Anthropic identifiers in the response headers by replacing the real anthropic-organization-id and request-id values with stable placeholders (for example use org_XXXXXXXX and req_XXXXXXXX respectively), preserving the YAML structure and quoting style so the cassette remains valid.
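One way to automate that scrubbing is a before_record_response hook in the VCR configuration (a sketch; the header list and placeholder values are assumptions, not this repo's actual conftest):

```python
# Placeholders for the headers the review flagged as sensitive.
SENSITIVE_HEADERS = {
    "anthropic-organization-id": "org_XXXXXXXX",
    "request-id": "req_XXXXXXXX",
}


def scrub_response_headers(response):
    """Replace sensitive response-header values before a cassette is written."""
    headers = response.get("headers", {})
    for key in list(headers):
        placeholder = SENSITIVE_HEADERS.get(key.lower())
        if placeholder is not None:
            headers[key] = [placeholder]
    return response


# With pytest-recording, this can be wired in via the vcr_config fixture:
# @pytest.fixture(scope="module")
# def vcr_config():
#     return {"before_record_response": scrub_response_headers}
```

Running the hook at record time keeps cassettes valid YAML while guaranteeing no real identifiers are ever committed.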
Summary
Adds support for logging the
gen_ai.request.structured_output_schema attribute for Anthropic Claude and Google Gemini APIs, completing coverage across all major LLM providers.
Changes
Anthropic Claude
- Logs the output_format parameter with json_schema type (beta header anthropic-beta: structured-outputs-2025-11-13)
- Modified: packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/span_utils.py
Google Gemini
- Logs response_schema from the generation_config parameter
- Supports both generation_config.response_schema and direct response_schema kwargs
- Modified: packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py
OpenAI
- Already supported; no changes needed
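The Gemini lookup described in the Changes above can be sketched as a small helper (the name is illustrative; real generation_config objects expose response_schema as an attribute, while dict-style configs use a key):

```python
def extract_response_schema(kwargs):
    """Find Gemini's response schema in either supported location.

    Checks generation_config.response_schema first (object or dict form),
    then falls back to a direct response_schema kwarg.
    """
    generation_config = kwargs.get("generation_config")
    if generation_config is not None:
        if isinstance(generation_config, dict):
            schema = generation_config.get("response_schema")
        else:
            schema = getattr(generation_config, "response_schema", None)
        if schema is not None:
            return schema
    return kwargs.get("response_schema")


# Dict-style generation_config:
print(extract_response_schema(
    {"generation_config": {"response_schema": {"type": "OBJECT"}}}
))
# Direct kwarg fallback:
print(extract_response_schema({"response_schema": {"type": "STRING"}}))
```

Checking both shapes keeps the instrumentation working whether callers pass a GenerationConfig object or a plain dict.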
Sample Apps
Added demonstration apps for all three providers:
- packages/sample-app/sample_app/openai_structured_outputs_demo.py (tested ✅)
- packages/sample-app/sample_app/anthropic_structured_outputs_demo.py
- packages/sample-app/sample_app/gemini_structured_outputs_demo.py
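For reference, the core of such an OpenAI demo is roughly this shape (the model name and json_schema payload here are illustrative, not copied from the sample app):

```python
import json
import os

# Illustrative schema mirroring the joke example used in the tests.
JOKE_SCHEMA = {
    "type": "object",
    "properties": {
        "joke": {"type": "string"},
        "rating": {"type": "integer"},
    },
    "required": ["joke", "rating"],
    "additionalProperties": False,
}

# response_format payload in the shape OpenAI's structured outputs expect.
RESPONSE_FORMAT = {
    "type": "json_schema",
    "json_schema": {"name": "joke", "strict": True, "schema": JOKE_SCHEMA},
}


def main():
    # API key comes from the environment; never hardcode secrets.
    from openai import OpenAI

    client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Tell me a short joke."}],
        response_format=RESPONSE_FORMAT,
    )
    print(json.loads(response.choices[0].message.content))


if __name__ == "__main__":
    main()
```

With the instrumentation active, the serialized schema should then appear on the resulting span as gen_ai.request.structured_output_schema.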
OpenAI sample app tested successfully and shows the
gen_ai.request.structured_output_schemaattribute being logged correctly.Related Documentation
🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Tests
Chores
✏️ Tip: You can customize this high-level summary in your review settings.
Important
Adds structured output schema logging for Anthropic and Google Gemini APIs, with sample apps and tests.
- Logs gen_ai.request.structured_output_schema for Anthropic and Google Gemini APIs.
- Anthropic: logs output_format with json_schema type in span_utils.py.
- Gemini: logs response_schema from generation_config or kwargs in span_utils.py.
- Tests: adds test_structured_outputs.py for Anthropic, currently skipped due to SDK version.
- Sample apps: adds anthropic_structured_outputs_demo.py, gemini_structured_outputs_demo.py, and openai_structured_outputs_demo.py for demonstration.

This description was created by
for ca5f423. You can customize this summary. It will automatically update as commits are pushed.