Shivam Bansal

I build AI gateways, inference tooling, and backend systems.

Most of my public work since February 2021 sits at the intersection of LLM infra, developer tooling, and upstream debugging when production systems get weird.

What I do

  • AI infrastructure for large model workloads, especially B200/H200-era deployment questions
  • RAG systems with tenant isolation and operational guardrails that hold up in production
  • LLM deployment workflows that are easier to run and debug in production
  • Upstream debugging and OSS fixes when tools break under real usage

Selected projects

LLMizer

Private LLM deployment orchestration platform with a FastAPI backend, a Next.js dashboard, and LiteLLM as the provider layer. I use it to manage the messy parts of model rollout, provider switching, and day-to-day operations.

Async-first text-to-speech library built around Gemini TTS workflows. Includes batch processing, multi-speaker dialogue, and support for 24 languages.

Video processing backend that converts uploads to HLS, pushes jobs through RabbitMQ, and is set up for containerized deployment.

An experiment in policy and compliance tooling that turns policy PDFs into structured rules and testable validation flows.

Open source work since 2021

How I work

  • I like product work, but I am usually most useful in the messy middle: infrastructure, integration problems, reliability fixes, and ugly edge cases.
  • When I hit a bug in a tool I depend on, I try to reproduce it properly, file something maintainers can act on, and send a fix upstream if I can.
  • I have filed public issue reports against LiteLLM, OpenCode, Serena, Modal, AWS Neuron samples, Flexprice, and Komf.

I like this part of the job more than I probably should. A lot of my better work starts with reproducing a bug nobody else wants to touch, then either fixing it upstream or building around it.

Tech I use most

  • Languages: TypeScript, Python, Go, Rust, C++
  • AI/ML: LiteLLM, OpenAI, Hugging Face, vector search, LoRA workflows, RAG pipelines
  • Backend: FastAPI, Node.js, Express, service-to-service APIs
  • Frontend: Next.js, React, Tailwind
  • Infra: Docker, Kubernetes, Helm, GCP, AWS, Prometheus
  • Data: PostgreSQL, MongoDB, Redis, Scylla, ClickHouse

Elsewhere
