feat(refactoring): Support Structured Logging (JSON)#30170
crazywoola merged 17 commits into langgenius:main
Summary of Changes

Hello @41tair, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly refactors the backend's logging infrastructure to support structured JSON logging. The primary motivation is to enhance observability and simplify log analysis by providing machine-readable logs that are easier to parse, query, and aggregate in modern logging systems. It introduces a new logging module with context variables for request and trace IDs, configurable output formats, and improved integration with OpenTelemetry for distributed tracing. The result is a unified and consistent logging experience across the application, including Flask requests and Celery tasks.
Code Review
This pull request introduces a robust structured logging implementation, which is a fantastic improvement for observability. The new core.logging module is well-structured, using contextvars for request context and providing a flexible JSON formatter. The changes are thoughtfully applied across the application, including Flask request hooks and Celery tasks, ensuring consistent trace and request IDs. My review includes several suggestions, primarily focused on moving local imports to the top level of modules for better code style and maintainability. I also noted a small opportunity for code simplification in the new JSON formatter and a suggestion to enhance test coverage for one of the identity filters. Overall, this is a high-quality contribution that significantly enhances the application's logging capabilities.
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
crazywoola left a comment
The `LOG_OUTPUT_FORMAT` configuration option should be added to
https://github.com/langgenius/dify/blob/2c919efa69686ce4d6f6d677da067a597597bfc6/api/.env.example
and
https://github.com/langgenius/dify/blob/2c919efa69686ce4d6f6d677da067a597597bfc6/docker/.env.example
Pull request overview
This PR introduces structured logging support with JSON format to improve integration with modern log aggregation and observability platforms (ELK, Datadog, Loki, etc.). The implementation standardizes log outputs with consistent field naming and provides configurable format switching between human-readable text and machine-readable JSON.
Key Changes
- Introduced new logging infrastructure with context-aware filters, JSON formatter, and request-scoped context variables using Python's `contextvars`
- Integrated OpenTelemetry trace propagation through W3C `traceparent` headers for distributed tracing across services
- Added configuration option `LOG_OUTPUT_FORMAT` to toggle between `text` and `json` formats (defaults to text for backward compatibility)
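The request-scoped context mentioned in the first bullet can be pictured with a minimal sketch. It assumes a single `request_id` variable and uses hypothetical names (`request_id_var`, `init_request_context`, `RequestContextFilter`); it is not the code in `api/core/logging/context.py`, only an illustration of how a `contextvars` value can be attached to log records through a standard `logging.Filter`.

```python
import contextvars
import io
import logging
import uuid

# Hypothetical names for illustration; not the PR's actual API.
request_id_var = contextvars.ContextVar("request_id", default="")


def init_request_context() -> str:
    """Generate a fresh request ID and bind it to the current context."""
    request_id = uuid.uuid4().hex
    request_id_var.set(request_id)
    return request_id


class RequestContextFilter(logging.Filter):
    """Copy the context-bound request ID onto every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.request_id = request_id_var.get()
        return True  # never drop the record, only annotate it


stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.addFilter(RequestContextFilter())
handler.setFormatter(logging.Formatter("%(request_id)s %(message)s"))

logger = logging.getLogger("context_demo")
logger.handlers.clear()
logger.propagate = False
logger.setLevel(logging.INFO)
logger.addHandler(handler)


def handle_request() -> str:
    """Simulate one request: bind a new ID, then log."""
    request_id = init_request_context()
    logger.info("handling request")
    return request_id


# Each simulated request runs in its own copy of the context,
# so the IDs do not leak between requests or into the outer context.
first_id = contextvars.copy_context().run(handle_request)
second_id = contextvars.copy_context().run(handle_request)
```

In a web framework the `init_request_context()` call would presumably sit in a per-request hook, and in a task queue in a per-task hook; the sketch uses `copy_context().run(...)` only to simulate that isolation.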
Reviewed changes
Copilot reviewed 19 out of 20 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| `api/core/logging/context.py` | New module providing framework-agnostic request context using contextvars for thread-safe logging |
| `api/core/logging/filters.py` | New logging filters that extract trace/span IDs from OpenTelemetry and user identity from Flask-Login |
| `api/core/logging/structured_formatter.py` | New JSON formatter that outputs structured logs with standardized fields (ts, severity, trace_id, identity, etc.) |
| `api/core/logging/__init__.py` | Module initialization exposing logging components |
| `api/extensions/ext_logging.py` | Refactored to support both text and JSON formats, added filter integration, maintained backward compatibility |
| `api/extensions/ext_celery.py` | Initialize logging context for Celery tasks similar to Flask request lifecycle |
| `api/extensions/otel/instrumentation.py` | Simplified exception logging to record on current span instead of creating new spans |
| `api/libs/external_api.py` | Removed explicit log_exception call to avoid duplicate logging (framework handles this) |
| `api/app_factory.py` | Initialize request context on each request, inject X-Span-Id header alongside X-Trace-Id |
| `api/configs/feature/__init__.py` | Added LOG_OUTPUT_FORMAT configuration option |
| `api/core/helper/trace_id_helper.py` | Added helpers for span ID extraction and W3C traceparent header generation |
| `api/core/helper/ssrf_proxy.py` | Inject traceparent headers for distributed tracing when OpenTelemetry is disabled |
| `api/core/plugin/impl/base.py` | Inject traceparent headers for plugin daemon requests |
| `api/tests/unit_tests/core/logging/test_context.py` | Comprehensive tests for logging context module |
| `api/tests/unit_tests/core/logging/test_filters.py` | Tests for trace and identity context filters |
| `api/tests/unit_tests/core/logging/test_structured_formatter.py` | Tests for JSON formatter with various scenarios |
| `api/tests/unit_tests/core/logging/test_trace_helpers.py` | Tests for trace helper functions |
| `api/tests/unit_tests/core/helper/test_ssrf_proxy.py` | Updated tests for SSRF proxy with new tracing behavior |
| `api/tests/unit_tests/libs/test_external_api.py` | Simplified test by removing sys.exc_info mocking (no longer needed) |
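Several rows above refer to generating and injecting W3C `traceparent` headers. As a rough illustration of that header format (version `00`, a 32-hex-character trace ID, a 16-hex-character parent/span ID, and two hex characters of flags), here is a small sketch. The function name `build_traceparent` and its parameters are hypothetical and are not taken from `trace_id_helper.py`.

```python
import re
import secrets
from typing import Optional

_TRACE_ID_RE = re.compile(r"[0-9a-f]{32}")
_SPAN_ID_RE = re.compile(r"[0-9a-f]{16}")


def build_traceparent(trace_id: str, span_id: Optional[str] = None, sampled: bool = True) -> str:
    """Build a W3C Trace Context `traceparent` value (hypothetical helper)."""
    if not _TRACE_ID_RE.fullmatch(trace_id) or trace_id == "0" * 32:
        raise ValueError("trace_id must be 32 lowercase hex characters and not all zeros")
    if span_id is None:
        # No active span: generate a random 8-byte parent ID for the outgoing call.
        span_id = secrets.token_hex(8)
    if not _SPAN_ID_RE.fullmatch(span_id) or span_id == "0" * 16:
        raise ValueError("span_id must be 16 lowercase hex characters and not all zeros")
    flags = "01" if sampled else "00"  # bit 0 of the flags byte is the sampled flag
    return f"00-{trace_id}-{span_id}-{flags}"


# Attach the header to an outgoing request's headers dict.
header = build_traceparent("4bf92f3577b34da6a3ce929d0e0e4736", "00f067aa0ba902b7")
outgoing_headers = {"traceparent": header}
generated = build_traceparent("4bf92f3577b34da6a3ce929d0e0e4736")
```

A downstream service that understands Trace Context can then continue the same trace, which is what lets log lines from different services share a `trace_id`.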
0e81b2d to ab5df57
Fixes #30169
Summary
What
This PR introduces structured logging support (JSON format) to the backend service. It standardizes log outputs, making them machine-readable and consistent.
Why
To better integrate with modern log aggregation and observability systems (e.g., ELK, Datadog, Loki). Structured logging eliminates the need for complex regex parsing and improves query capabilities on log fields.
Key Changes
- Standardized log fields (`level`, `msg`, `ts`, `trace_id`).
- Switchable `console` (text) and `json` output formats (defaulting to text for dev environments).
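To make the two output formats concrete, the following is a minimal sketch of a JSON formatter emitting the fields named above (`level`, `msg`, `ts`, `trace_id`), plus a selector that falls back to plain text. The class and function names are hypothetical, the text format string is an assumption, and the real formatter in `structured_formatter.py` may use different field names and carry more fields.

```python
import io
import json
import logging
from datetime import datetime, timezone


class JsonLogFormatter(logging.Formatter):
    """Hypothetical formatter; field names follow the PR description."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "ts": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "msg": record.getMessage(),
            # trace_id is assumed to be attached by a filter or via `extra`.
            "trace_id": getattr(record, "trace_id", ""),
        }
        return json.dumps(payload, ensure_ascii=False)


def make_formatter(output_format: str) -> logging.Formatter:
    """Pick a formatter from a LOG_OUTPUT_FORMAT-style value; anything but 'json' means text."""
    if output_format.strip().lower() == "json":
        return JsonLogFormatter()
    return logging.Formatter("%(asctime)s %(levelname)s %(message)s")


def emit_one(output_format: str) -> str:
    """Log a single record through the chosen formatter and return the raw line."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(make_formatter(output_format))
    logger = logging.getLogger(f"format_demo.{output_format}")
    logger.handlers.clear()
    logger.propagate = False
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    logger.info("user %s logged in", "alice", extra={"trace_id": "4bf92f3577b34da6a3ce929d0e0e4736"})
    return stream.getvalue().strip()


json_line = emit_one("json")
text_line = emit_one("text")
parsed = json.loads(json_line)  # one json.loads call replaces any regex parsing
```

With the JSON line, a log pipeline can filter on `parsed["level"]` or `parsed["trace_id"]` directly, whereas the text line would need a pattern tailored to its layout.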
Screenshots
Checklist
`dev/reformat` (backend) and `cd web && npx lint-staged` (frontend) to appease the lint gods