ActiveSupport Notifications
RubyLLM::Agents emits ActiveSupport::Notifications events throughout the middleware pipeline, giving you real-time observability into every execution, cache interaction, budget check, and reliability event.
Notifications fire independently of database tracking: even if `track_executions` is disabled, subscribers still receive events.
All events use the `ruby_llm_agents.` prefix and are organized by domain:
| Event | Domain | Description |
|---|---|---|
| `ruby_llm_agents.execution.start` | Execution | Agent execution begins |
| `ruby_llm_agents.execution.complete` | Execution | Agent execution succeeded |
| `ruby_llm_agents.execution.error` | Execution | Agent execution failed |
| `ruby_llm_agents.cache.hit` | Cache | Response served from cache |
| `ruby_llm_agents.cache.miss` | Cache | Cache lookup found no match |
| `ruby_llm_agents.cache.write` | Cache | Response written to cache |
| `ruby_llm_agents.budget.check` | Budget | Budget check performed before execution |
| `ruby_llm_agents.budget.exceeded` | Budget | Execution blocked by budget limit |
| `ruby_llm_agents.budget.record` | Budget | Spend recorded after execution |
| `ruby_llm_agents.reliability.fallback_used` | Reliability | Fallback model succeeded after primary failed |
| `ruby_llm_agents.reliability.all_models_exhausted` | Reliability | All models (primary + fallbacks) failed |
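
Every event name in the table has the shape `ruby_llm_agents.<domain>.<action>`, so a catch-all subscriber can route on the parsed name. The sketch below is illustrative only: `EVENT_PREFIX` and `parse_event_name` are hypothetical names, not part of RubyLLM::Agents.

```ruby
# Hypothetical helper (not part of the library): split an event name into
# its domain and action, based on the naming scheme in the table above.
EVENT_PREFIX = "ruby_llm_agents."

def parse_event_name(name)
  # Ignore events that do not belong to this namespace.
  return nil unless name.start_with?(EVENT_PREFIX)

  # Split only once so actions such as "all_models_exhausted" stay intact.
  domain, action = name.delete_prefix(EVENT_PREFIX).split(".", 2)
  { domain: domain, action: action }
end
```

Inside a regex subscriber, the result of `parse_event_name(name)` could select a log level or a metric namespace per domain.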
`ruby_llm_agents.execution.start` fires when an agent execution begins, before the LLM call.
Payload:
| Field | Type | Description |
|---|---|---|
| `:agent_type` | String | Agent class name |
| `:model` | String | Configured model |
| `:tenant_id` | String/nil | Tenant identifier |
| `:execution_id` | Integer/nil | Database execution ID |

```ruby
ActiveSupport::Notifications.subscribe("ruby_llm_agents.execution.start") do |*, payload|
  Rails.logger.info "[LLM] Starting #{payload[:agent_type]} with #{payload[:model]}"
end
```

`ruby_llm_agents.execution.complete` fires when an agent execution succeeds.
Payload:
| Field | Type | Description |
|---|---|---|
| `:agent_type` | String | Agent class name |
| `:agent_type_symbol` | Symbol | Agent type (`:chat`, `:embed`, etc.) |
| `:execution_id` | Integer/nil | Database execution ID |
| `:model` | String | Configured model |
| `:model_used` | String | Model that actually ran |
| `:tenant_id` | String/nil | Tenant identifier |
| `:status` | String | `"success"` |
| `:duration_ms` | Integer | Execution time in milliseconds |
| `:input_tokens` | Integer | Prompt tokens |
| `:output_tokens` | Integer | Response tokens |
| `:total_tokens` | Integer | Sum of input + output |
| `:input_cost` | Float | Cost of input tokens |
| `:output_cost` | Float | Cost of output tokens |
| `:total_cost` | Float | Total execution cost |
| `:cached` | Boolean | Whether response came from cache |
| `:attempts_made` | Integer | Number of attempts (retries + fallbacks) |
| `:finish_reason` | String | `"stop"`, `"length"`, `"tool_calls"`, etc. |
| `:time_to_first_token_ms` | Integer/nil | Time to first token (streaming only) |
| `:error_class` | nil | Always nil on success |
| `:error_message` | nil | Always nil on success |
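
The payload fields combine into derived metrics that are often more useful on a dashboard than the raw values. A minimal sketch, assuming the payload keys documented in the table above; `derived_metrics` is a hypothetical helper name, and the rounding precision is an arbitrary choice.

```ruby
# Hypothetical helper: compute derived metrics from an execution.complete
# payload. Returns nil for a metric when its denominator is missing or zero.
def derived_metrics(payload)
  tokens   = payload[:total_tokens].to_i
  duration = payload[:duration_ms].to_i
  cost     = payload[:total_cost].to_f
  output   = payload[:output_tokens].to_i

  {
    # Cost normalised to 1,000 tokens, for comparing models.
    cost_per_1k_tokens: tokens.positive? ? (cost / tokens * 1000).round(6) : nil,
    # Generation throughput over the whole execution.
    output_tokens_per_second: duration.positive? ? (output * 1000.0 / duration).round(2) : nil
  }
end
```

A subscriber could call `derived_metrics(payload)` and forward the non-nil values to whatever metrics client the application already uses.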

```ruby
ActiveSupport::Notifications.subscribe("ruby_llm_agents.execution.complete") do |*, payload|
  StatsD.timing("llm.duration", payload[:duration_ms])
  StatsD.increment("llm.executions", tags: [
    "agent:#{payload[:agent_type]}",
    "model:#{payload[:model_used]}"
  ])
  StatsD.gauge("llm.cost", payload[:total_cost])
  StatsD.histogram("llm.tokens", payload[:total_tokens])
end
```

`ruby_llm_agents.execution.error` fires when an agent execution fails (after all retries and fallbacks are exhausted).
Payload: Same fields as execution.complete, but with:
| Field | Type | Description |
|---|---|---|
| `:status` | String | `"error"` or `"timeout"` |
| `:error_class` | String | Exception class name |
| `:error_message` | String | Exception message |

```ruby
ActiveSupport::Notifications.subscribe("ruby_llm_agents.execution.error") do |*, payload|
  StatsD.increment("llm.errors", tags: [
    "agent:#{payload[:agent_type]}",
    "error:#{payload[:error_class]}"
  ])

  # :error_class is a String, so compare against the class name.
  if payload[:error_class] == "Timeout::Error"
    Slack::Notifier.new(ENV['SLACK_WEBHOOK']).ping(
      "Timeout in #{payload[:agent_type]} after #{payload[:duration_ms]}ms"
    )
  end
end
```

`ruby_llm_agents.cache.hit` fires when a cached response is returned instead of calling the LLM.
Payload:
| Field | Type | Description |
|---|---|---|
| `:agent_type` | String | Agent class name |
| `:cache_key` | String | The cache key that matched |
`ruby_llm_agents.cache.miss` fires when a cache lookup finds no match and the LLM will be called.
Payload: Same as cache.hit.
`ruby_llm_agents.cache.write` fires after a successful LLM response is written to cache.
Payload: Same as cache.hit.

```ruby
# Track cache hit rate
hits = 0
total = 0

ActiveSupport::Notifications.subscribe(/ruby_llm_agents\.cache\.(hit|miss)/) do |name, *, payload|
  total += 1
  hits += 1 if name.end_with?(".hit")
  StatsD.gauge("llm.cache.hit_rate", hits.to_f / total) if total > 0
end
```

`ruby_llm_agents.budget.check` fires before execution, when budget limits are checked.
Payload:
| Field | Type | Description |
|---|---|---|
| `:agent_type` | String | Agent class name |
| `:tenant_id` | String/nil | Tenant identifier |
`ruby_llm_agents.budget.exceeded` fires when an execution is blocked because budget limits have been reached.
Payload: Same as budget.check.

```ruby
ActiveSupport::Notifications.subscribe("ruby_llm_agents.budget.exceeded") do |*, payload|
  PagerDuty.trigger(
    summary: "Budget exceeded for #{payload[:agent_type]}",
    details: { tenant_id: payload[:tenant_id] }
  )
end
```

`ruby_llm_agents.budget.record` fires after spend is recorded following a successful execution.
Payload:
| Field | Type | Description |
|---|---|---|
| `:agent_type` | String | Agent class name |
| `:tenant_id` | String/nil | Tenant identifier |
| `:total_cost` | Float | Cost recorded |
| `:total_tokens` | Integer | Tokens used |
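
Since each `budget.record` payload carries a tenant, a cost, and a token count, a subscriber can keep a running per-tenant total. The sketch below assumes only the payload keys in the table above. `SpendLedger` and the `:no_tenant` bucket are hypothetical names chosen for illustration, and the totals are held in process memory only.

```ruby
# Hypothetical helper: accumulate spend per tenant from budget.record payloads.
class SpendLedger
  EMPTY = { cost: 0.0, tokens: 0 }.freeze

  def initialize
    @mutex  = Mutex.new
    @totals = Hash.new { |hash, key| hash[key] = { cost: 0.0, tokens: 0 } }
  end

  # Add one payload to the ledger and return that tenant's updated totals.
  def record(payload)
    key = payload[:tenant_id] || :no_tenant
    @mutex.synchronize do
      entry = @totals[key]
      entry[:cost]   += payload[:total_cost].to_f
      entry[:tokens] += payload[:total_tokens].to_i
      entry.dup
    end
  end

  # Read totals without creating an entry for unknown tenants.
  def totals_for(tenant_id)
    @mutex.synchronize { @totals.fetch(tenant_id || :no_tenant, EMPTY).dup }
  end
end
```

A subscriber block would call `ledger.record(payload)` once per event. Float addition is adequate for dashboards, but exact accounting would call for `BigDecimal` or integer cents.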

```ruby
ActiveSupport::Notifications.subscribe("ruby_llm_agents.budget.record") do |*, payload|
  StatsD.increment("llm.spend", payload[:total_cost], tags: [
    "agent:#{payload[:agent_type]}",
    "tenant:#{payload[:tenant_id]}"
  ])
end
```

`ruby_llm_agents.reliability.fallback_used` fires when the primary model fails and a fallback model succeeds.
Payload:
| Field | Type | Description |
|---|---|---|
| `:agent_type` | String | Agent class name |
| `:primary_model` | String | The model that failed |
| `:used_model` | String | The fallback model that succeeded |
| `:attempts_made` | Integer | Total attempts across all models |

```ruby
ActiveSupport::Notifications.subscribe("ruby_llm_agents.reliability.fallback_used") do |*, payload|
  StatsD.increment("llm.fallback", tags: [
    "agent:#{payload[:agent_type]}",
    "primary:#{payload[:primary_model]}",
    "used:#{payload[:used_model]}"
  ])
end
```

`ruby_llm_agents.reliability.all_models_exhausted` fires when all models (primary + fallbacks) fail, just before `AllModelsExhaustedError` is raised.
Payload:
| Field | Type | Description |
|---|---|---|
| `:agent_type` | String | Agent class name |
| `:models_tried` | Array | List of all models attempted |

```ruby
ActiveSupport::Notifications.subscribe("ruby_llm_agents.reliability.all_models_exhausted") do |*, payload|
  Slack::Notifier.new(ENV['SLACK_WEBHOOK']).ping(
    ":rotating_light: All models exhausted for #{payload[:agent_type]}: #{payload[:models_tried].join(', ')}"
  )
end
```

Subscribe to a single event:

```ruby
ActiveSupport::Notifications.subscribe("ruby_llm_agents.execution.complete") do |*, payload|
  # Handle event
end
```

Subscribe to all events in a domain:

```ruby
# All execution events (start, complete, error)
ActiveSupport::Notifications.subscribe(/ruby_llm_agents\.execution\./) do |name, *, payload|
  Rails.logger.info "[LLM] #{name}: #{payload[:agent_type]}"
end

# All reliability events
ActiveSupport::Notifications.subscribe(/ruby_llm_agents\.reliability\./) do |name, *, payload|
  Rails.logger.warn "[LLM] #{name}: #{payload[:agent_type]}"
end

# All events
ActiveSupport::Notifications.subscribe(/ruby_llm_agents\./) do |name, *, payload|
  Rails.logger.debug "[LLM] #{name}: #{payload.inspect}"
end
```

Unsubscribe by passing the subscriber object returned by `subscribe`:

```ruby
subscriber = ActiveSupport::Notifications.subscribe("ruby_llm_agents.execution.complete") do |*, payload|
  # Handle event
end

# Later
ActiveSupport::Notifications.unsubscribe(subscriber)
```

Report metrics to StatsD:

```ruby
# config/initializers/ruby_llm_agents.rb
ActiveSupport::Notifications.subscribe("ruby_llm_agents.execution.complete") do |*, payload|
  tags = ["agent:#{payload[:agent_type]}", "model:#{payload[:model_used]}"]
  StatsD.timing("llm.duration_ms", payload[:duration_ms], tags: tags)
  StatsD.increment("llm.executions.success", tags: tags)
  StatsD.histogram("llm.cost", payload[:total_cost], tags: tags)
  StatsD.histogram("llm.tokens", payload[:total_tokens], tags: tags)
end

ActiveSupport::Notifications.subscribe("ruby_llm_agents.execution.error") do |*, payload|
  tags = ["agent:#{payload[:agent_type]}", "error:#{payload[:error_class]}"]
  StatsD.increment("llm.executions.error", tags: tags)
end

ActiveSupport::Notifications.subscribe("ruby_llm_agents.reliability.fallback_used") do |*, payload|
  StatsD.increment("llm.fallback", tags: [
    "agent:#{payload[:agent_type]}",
    "from:#{payload[:primary_model]}",
    "to:#{payload[:used_model]}"
  ])
end
```

Send Slack alerts for budget and reliability events:

```ruby
# config/initializers/ruby_llm_agents.rb
slack = Slack::Notifier.new(ENV['SLACK_WEBHOOK'])

ActiveSupport::Notifications.subscribe("ruby_llm_agents.budget.exceeded") do |*, payload|
  slack.ping(":money_with_wings: Budget exceeded for #{payload[:agent_type]} (tenant: #{payload[:tenant_id]})")
end

ActiveSupport::Notifications.subscribe("ruby_llm_agents.reliability.all_models_exhausted") do |*, payload|
  slack.ping(":rotating_light: All models exhausted for #{payload[:agent_type]}: #{payload[:models_tried].join(', ')}")
end
```

Log every event with its duration:

```ruby
# config/initializers/ruby_llm_agents.rb
ActiveSupport::Notifications.subscribe(/ruby_llm_agents\./) do |name, started, finished, id, payload|
  duration = ((finished - started) * 1000).round(2)
  Rails.logger.tagged("LLM") do
    Rails.logger.info "#{name} (#{duration}ms) #{payload.except(:error_message).inspect}"
  end
end
```

All notification calls are wrapped in a rescue block so that subscriber errors never break agent execution. If a subscriber raises an exception, the notification is silently dropped and execution continues normally.
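
The rescue-wrapping pattern described above can be sketched in a few lines. This is not the library's actual implementation: `safe_instrument` and `run_agent` are hypothetical names, and the `notifier:` argument stands in for whatever publishes the event (in a Rails app, something like `ActiveSupport::Notifications.method(:instrument)`).

```ruby
# Hypothetical sketch of the pattern, not the library's real code.
# Publishes an event and reports whether delivery succeeded.
def safe_instrument(name, payload, notifier:)
  notifier.call(name, payload)
  true
rescue StandardError
  # A failing subscriber must not interrupt the caller, so the error
  # is swallowed and the notification is treated as dropped.
  false
end

# Stand-in for an agent run: the result is returned even if notifying fails.
def run_agent(notifier:)
  result = "agent result"
  safe_instrument("ruby_llm_agents.execution.complete",
                  { agent_type: "DemoAgent" },
                  notifier: notifier)
  result
end

# A notifier that always raises, simulating a buggy subscriber.
broken_notifier = ->(_name, _payload) { raise "oops" }
run_agent(notifier: broken_notifier) # still returns the agent's result
```

Swallowing errors keeps agents running but also hides subscriber bugs, so subscribers that matter are worth testing on their own.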

```ruby
# This is safe: a buggy subscriber won't crash your agent
ActiveSupport::Notifications.subscribe("ruby_llm_agents.execution.complete") do |*, payload|
  raise "oops" # Will not affect agent execution
end
```

- Execution Tracking - Database-backed execution logging
- Reliability - Retries, fallbacks, circuit breakers
- Budget Controls - Spending limits and alerts
- Caching - Response caching
- Production Deployment - Monitoring setup
- Model Fallbacks - Alerting on fallback usage