Stats
179 reputation
1.4m reached
6 answers
4 questions
About
Senior Backend Engineer | Infra & System Design @ Scale
Kafka · Redis · Python · Postgres · GenAI · Distributed Systems · Observability
I design backend systems that stay reliable at scale, adapt fast to product needs, and fail predictably.
8+ years across infra-heavy teams building telemetry pipelines, orchestrators, and LLM-backed systems under concurrency, latency, and audit constraints.
🔩 What I Build
- Distributed Cloud Applications → Microservices with predictable scale & recoverability
- Stream Processing Pipelines → Kafka + Postgres + Redis handling 10M+ events/month
- Telemetry + Observability Systems → Tracing, metrics, SLA diagnostics (Prometheus, OTel)
- LLM Agent Infrastructure → Memory-backed, tool-using multi-agent execution engines
- Control Plane & Coordination → Consensus-safe orchestration, retries, failover resilience
🧠 Core Expertise
- Distributed Systems: queues, state machines, eventual consistency
- Infra Design: ingestion, orchestration, API contracts, failure budgets
- Stream Processing: Kafka, Redis, Celery, Prefect
- Observability: OpenTelemetry, Prometheus, Grafana, Sentry
- GenAI Integration: agent memory, structured planning, tool use
- Cloud & Ops: Docker, Kubernetes, AWS (ECS, CloudWatch), Terraform (basic)
🚀 Key Outcomes
- Built streaming ingestion pipelines handling 10M+ events/month
- Cut P95 latency by 45% and ETL time by 30% in clinical telemetry
- Reduced cross-region failures by 35% through retry-safe orchestration
- Logged full agent memory + tool usage telemetry for enterprise GenAI workflows
- Redis-based observability platform acquired by Redis Inc (folded into RedisInsight)
🛠️ Featured Projects
- 🌐 redis-observability: Redis monitoring UI with SCAN, editor, command sandbox — acquired by Redis Inc
- 🛸 agent-memory-system: Graph + embedding memory system for LLMs (Neo4j + Redis + FAISS)
🌱 Side Projects & Explorations
- ⚙️ Building an async-safe retry orchestration layer in Python
- 🛰️ Exploring SLURM + Nomad for agent and data job scheduling
- 📈 Planning a trace graph UI for OpenTelemetry-based debugging
🌍 Connect with Me
- 🔗 GitHub
- 💬 Twitter / X
- 🧠 Stack Overflow
Currently exploring Senior/Staff roles in distributed systems, observability, or cloud-native infra teams (e.g. telemetry, ingestion, real-time processing).
DMs open — let’s build resilient systems.
Badges
This user doesn’t have any gold badges yet.
2 silver badges
- Necromancer × 2 (Dec 23, 2024)
13 bronze badges
- Citizen Patrol (Oct 4, 2018)
- Caucus (Jun 5)
- Commentator (Jan 16, 2017)

