Keshav Madhav.
I'm an AI engineer building the agentic stack at VerbaFlo: orchestrators, retrieval, tracing, and the tooling that makes all of it debuggable. Previously founding front-end at PrudentBit.
What I actually build.
Read it in 5 seconds. Or 5 minutes. Both work.
I build production AI systems and the tools teams debug them with.
Multi-agent orchestrators
Runtime: 12-agent parallel execution across MongoDB, Postgres, and Milvus. SSE streaming, semantic cache, query expansion. Custom runtime, not a framework wrapper.
Retrieval and RAG
Pipelines: Production vector stores with hybrid search, query expansion, reranking, and text-to-SQL on top.
Internal tooling
Tooling: A unified MCP debugger for the whole team. Hand it a ticket, it walks the codebase and traces, then RCAs the bug.
Full-stack surfaces
Full-stack: Electron apps, VS Code extensions, tracing UIs, dashboards. I own the runtime and the interfaces.
I'm Keshav Madhav, an AI engineer who ships full systems, not just prompts. Custom orchestrators, retrieval infra, and the tooling that makes LLM pipelines debuggable.
At VerbaFlo, I own core pieces of our AI stack: a multi-agent orchestrator with parallel tool-calling, retrieval pipelines with hybrid search, text-to-SQL across MongoDB and Postgres, and a company-wide MCP debugger the team uses daily.
Before that, I was the founding front-end engineer at PrudentBit, where I built the Immune product suite from scratch. End-to-end encrypted storage and sharing, plus a Next.js 14 migration that cut load times in half.
The best AI systems are the ones you can actually see thinking. That's why I spend as much time on tracing and debuggable UIs as I do on the pipelines.
Want a longer read? Keep scrolling.
Where I've shipped, and what I built there.
A short, honest log of two roles that shaped how I build: one front-of-house, one brain-of-house.
Own core pieces of the AI stack: orchestration, retrieval, evaluation, and the internal tooling the team debugs against.
Agentic orchestrator
Built a production multi-agent system with parallel tool-calling. Text-to-SQL pulls live MongoDB analytics for business questions.
Vector + RAG infra
Manage Milvus vector stores backing our RAG pipeline: chunking, embedding, hybrid search, and reranking for FAQ retrieval.
Automated campaigns
Own the AI-driven campaign systems (call, WhatsApp, email) that let customers target thousands of users with personalized flows.
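For readers curious what the "hybrid search" step under Vector + RAG infra can look like, here is a minimal sketch of reciprocal rank fusion, one common way to merge a keyword ranking with a vector ranking. The text above does not say which fusion method is actually used, so treat the technique, the function name, the `k=60` constant, and the sample ids as assumptions for illustration only.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document earns 1 / (k + rank) from every list it appears in;
    k dampens the advantage of the very top ranks (60 is a common default).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id so output is stable.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical results from a keyword index and a vector index.
keyword_hits = ["faq-7", "faq-2", "faq-9"]
vector_hits = ["faq-2", "faq-4", "faq-7"]
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Documents that show up in both lists rise to the top, which is the point of fusing before handing candidates to a reranker.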
Agentic Copilot
I built the agent harness from scratch (no LangGraph, no CrewAI). It drops a reasoning model into a safe, read-only sandbox and gives it real freedom: it thinks, writes its own task list, spawns parallel sub-agents, researches its own data paths, and streams a cited answer drawn from MongoDB, Postgres, and Milvus.
- Custom harness: extended thinking, a self-managed task plan, and a delegate tool that spawns up to 8 parallel sub-agents
- Read-only, tenant-isolated sandbox — ~25 tools, a scratch-workspace, and its own technical wiki it can grep
- Cut warm-turn LLM cost ~73% via prompt-cache placement (83–95% cache reads)
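The "up to 8 parallel sub-agents" cap can be pictured as a bounded fan-out. The sketch below is not the harness described above; it only shows the general pattern with `asyncio`, and `run_sub_agent`, `delegate`, and the task names are hypothetical stand-ins.

```python
import asyncio

MAX_SUB_AGENTS = 8  # the cap mentioned in the list above

async def run_sub_agent(task: str) -> str:
    """Hypothetical stand-in; a real sub-agent would call a model and tools."""
    await asyncio.sleep(0)  # yield control, as an I/O-bound call would
    return f"done: {task}"

async def delegate(tasks: list[str], limit: int = MAX_SUB_AGENTS) -> list[str]:
    """Run all tasks concurrently, with at most `limit` in flight at once."""
    gate = asyncio.Semaphore(limit)

    async def guarded(task: str) -> str:
        async with gate:  # waits here while `limit` sub-agents are busy
            return await run_sub_agent(task)

    # gather returns results in the same order as the input tasks.
    return await asyncio.gather(*(guarded(t) for t in tasks))

results = asyncio.run(delegate([f"task-{i}" for i in range(12)]))
print(len(results), results[0])
```

A semaphore keeps the concurrency limit in one place, so the planner can hand over any number of tasks without knowing about the cap.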
Conversation Simulation
An Electron + Puppeteer replay engine that drives live widgets with LLM-powered user personas, runs conversations through our full pipeline, and catches regressions before prod. Also the team's investigation tool for customer issues.
- Single-pool Chrome: 8 concurrent replays from ~7.9 GB → 2.5 GB RAM
- Built-in MCP server exposes transcripts + traces to Claude Code / Cursor for AI-assisted RCA
- ClickUp QA mode, S3 auto-update, and Google-OAuth admin in one shell
Unified Debugging MCP
An MCP server that turns any agentic IDE into an autonomous debugger. Hand it a ticket and it navigates the codebase, MongoDB, Elasticsearch traces, and metrics, then RCAs the bug. Resolves 95%+ of issues without human intervention.
- Ticket to full RCA in one shot: code, conversation, traces, DB
- Works in Claude Code, Cursor, Codex, Windsurf, any MCP client
- Custom tool routing and guardrails for reliable agent loops
LLM Engineering Platform
An end-to-end platform for the AI team (~170K LOC, 236 API endpoints): a trace explorer over Elasticsearch, a playground that compares models side-by-side across 14 providers, a replay engine that re-fires production traffic against live deploys, Git-Sync that commits prompt edits straight to PRs, and a tool-using AI assistant that can investigate the whole system.
- Side-by-side compare across 14 providers; bulk-replay 5–500 traces scored by an LLM judge
- Service Replay: up to 20,000 production traces against a live deploy in one sweep
- Git-Sync prompts → PR, a saved-trace Vault, and a tool-using AI assistant with its own toolset
Employee #1 on front-end. Built the full Immune product suite from scratch.
Immune product suite
Led front-end for Immunefiles, Immunevault, and ImmuneShare: end-to-end encrypted storage, vaulting, and sharing.
SSR migration
Migrated from a React 18 SPA to Next.js 14 with SSR and edge caching. Cut load times in half.
Live admin dashboard
Built a real-time admin console with 15+ interactive graphs and streaming data.
Cross-platform integrations
Shipped apps for Microsoft Teams, Outlook, and Gmail.
Projects I'm proud of.
A mix of things I open-sourced, simulations I couldn't stop building, and work I contributed to other people's projects. (VerbaFlo internal tools live up there ↑ with the rest of the job.)
A VS Code extension that runs real Python Jinja2 templates via Pyodide (WASM Python). JSON/YAML/TOML variable binding, IntelliSense for 50+ filters, hover docs. 10K+ installs.
- TypeScript
- VS Code API
- Pyodide
- Jinja2

A from-scratch remake that got out of hand. Physics-driven prestige tree, per-baker story modals, 30+ track music library, 15+ minigames, animated tutorial, and way too many easter eggs. Vanilla JS + Canvas, no engine.

An N-body gravity playground that scales from a few planets to 21,000 bodies at 30fps. Space-probe mode, heatmaps, grid warping, vector fields. Pure Canvas + WebGL, no engine.
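As a point of reference for the N-body playground, here is a minimal direct-sum gravity sketch in Python. The description above does not say how the project computes forces at 21,000 bodies, so this O(n²) loop, the constant `G`, and the `SOFTENING` term are assumptions chosen for clarity, not the project's method.

```python
import math

G = 1.0           # gravitational constant in simulation units (assumed)
SOFTENING = 1e-3  # keeps the force finite at tiny separations (assumed)

def accelerations(positions, masses):
    """Return the [ax, ay] acceleration on each body from all the others."""
    acc = [[0.0, 0.0] for _ in positions]
    for i, (xi, yi) in enumerate(positions):
        for j, (xj, yj) in enumerate(positions):
            if i == j:
                continue  # a body does not attract itself
            dx, dy = xj - xi, yj - yi
            dist_sq = dx * dx + dy * dy + SOFTENING ** 2
            # a = G * m_j * r_vec / |r|^3, with the softened distance
            inv = G * masses[j] / (dist_sq * math.sqrt(dist_sq))
            acc[i][0] += dx * inv
            acc[i][1] += dy * inv
    return acc

# Two equal masses one unit apart pull toward each other equally.
acc = accelerations([(0.0, 0.0), (1.0, 0.0)], [1.0, 1.0])
print(acc)
```

The pairwise loop is the simplest correct baseline; it is the thing any faster scheme has to agree with on small inputs.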

An infinite Minecraft built in the browser with Three.js — 8.7M triangles and 4,000+ chunks in view. A pool of web workers generates AND greedily meshes each chunk off the main thread (deterministic ghost-cell borders, so every chunk meshes exactly once), a second worker drives a slippy-map minimap, plus fixed-timestep physics, mining/building, biomes (desert, forest, cherry-blossom, snow), and uncapped FPS via a MessageChannel scheduler.
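The "fixed-timestep physics" part is the classic accumulator loop: rendering runs as fast as it likes while physics advances in constant slices. The sketch below shows that pattern in Python rather than the project's JavaScript; `FIXED_DT`, `advance`, and the toy state are hypothetical names and values.

```python
FIXED_DT = 1.0 / 60.0  # physics step in seconds (assumed rate)

def advance(accumulator, frame_time, step):
    """Consume frame_time in fixed slices; return (leftover, steps_run)."""
    accumulator += frame_time
    steps = 0
    while accumulator >= FIXED_DT:
        step(FIXED_DT)           # physics always sees the same dt
        accumulator -= FIXED_DT
        steps += 1
    # The leftover (< FIXED_DT) carries into the next rendered frame.
    return accumulator, steps

# Toy state: a point moving upward at 1 unit per second.
state = {"y": 0.0, "vy": 1.0}

def step(dt):
    state["y"] += state["vy"] * dt

# A 40 ms frame contains two full 1/60 s steps plus some leftover time.
leftover, steps = advance(0.0, 0.040, step)
print(steps, state["y"], leftover)
```

Because physics never sees a variable `dt`, the simulation behaves the same whether the renderer runs at 60 fps or uncapped.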
Harsh Kedia's knowledge-graph engine for codebases. Indexes a repo into a structural graph exposed via MCP + Sigma.js. I contributed physics and render improvements so large graphs stay readable.
- TypeScript
- Sigma.js
- WebGL
- Python
- MCP
More builds

A 3D product landing page with Three.js, React Three Fiber, and GSAP scroll animations. Prismic CMS for content.

Hand-tuned Brainfuck interpreter in vanilla JS. Runs 1B ops in 6 seconds. JIT-style translation, tape prefetching, loop fusion.
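To give a feel for the "translate first, then run" idea, here is a small Python sketch that precompiles Brainfuck into folded operations with pre-linked brackets. It is a simpler cousin of the optimizations named above, not the interpreter itself (which is vanilla JS); `compile_bf` and `run_bf` are hypothetical names, input via `,` is ignored, and brackets are assumed to be balanced.

```python
def compile_bf(source):
    """Translate source into [op, arg] pairs, folding runs and linking brackets."""
    ops, stack = [], []
    for ch in source:
        if ch in "+-<>":
            if ops and ops[-1][0] == ch:
                ops[-1][1] += 1          # fold "+++" into a single ["+", 3]
            else:
                ops.append([ch, 1])
        elif ch == "[":
            stack.append(len(ops))
            ops.append([ch, None])       # target filled in at the matching "]"
        elif ch == "]":
            start = stack.pop()
            ops[start][1] = len(ops)     # "[" jumps forward to this "]"
            ops.append([ch, start])      # "]" jumps back to its "["
        elif ch == ".":
            ops.append([ch, 1])
        # every other character, including ",", is skipped in this sketch
    return ops

def run_bf(source, tape_len=30000):
    """Execute the compiled program and return its output as a string."""
    ops = compile_bf(source)
    tape, ptr, pc, out = [0] * tape_len, 0, 0, []
    while pc < len(ops):
        op, arg = ops[pc]
        if op == "+":
            tape[ptr] = (tape[ptr] + arg) % 256
        elif op == "-":
            tape[ptr] = (tape[ptr] - arg) % 256
        elif op == ">":
            ptr += arg
        elif op == "<":
            ptr -= arg
        elif op == ".":
            out.append(chr(tape[ptr]))
        elif op == "[" and tape[ptr] == 0:
            pc = arg                     # skip the loop body
        elif op == "]" and tape[ptr] != 0:
            pc = arg                     # repeat the loop body
        pc += 1
    return "".join(out)

# 8 * 8 + 1 = 65, the character code for "A".
print(run_bf("++++++++[>++++++++<-]>+."))
```

Folding runs and resolving jumps once up front means the hot loop never rescans the source, which is where most naive interpreters lose their time.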

Wolfenstein-style raycast shooter in vanilla JS. Textured walls, floor/ceiling casting, a live minimap, and dynamic lighting at 120+ fps. Pure Canvas, no engine.
3D flocking + N-body gravity sim with playback, presets, color grading, and video export.

Web version of the VS Code extension. Edit Jinja2 templates and variables side-by-side with instant rendering via Pyodide. Variable extraction, form mode, and whitespace control.

GPT-4o-powered merge game. Drag any two items together and the model invents what they become.

Video-calling app with instant, scheduled, and recorded meetings. Built on stream.io.
Tools I reach for, and the systems I know cold.
I'm not precious about the stack. I'll use whatever's right. But these are the ones I've actually shipped with.
AI / Agents
- Multi-agent orchestration
- RAG pipelines
- Text-to-SQL
- Evals
- MCP
- LangGraph
- LangChain
Data
- MongoDB
- PostgreSQL
- Milvus
- Redis
- Elasticsearch
- Snowflake
- Kafka
- Supabase
Observability
- Prometheus
- Grafana
- Jaeger
- OpenTelemetry
Frontend
- Next.js
- React
- TypeScript
- Framer Motion
- GSAP
- Three.js
- WebGL
Backend
- Python
- FastAPI
- Django
- Node.js
Desktop & Native
- Electron
- Puppeteer
- WASM
- VS Code extensions
- Swift
Infra
- Docker
- Kubernetes
- AWS
- GCP
Sushant University
B.Tech Computer Science, AI / ML specialization
2021 to 2025 · Gurgaon, IN
New Delhi / Gurgaon, IN
Working remote / hybrid across IST. Happy to jump on something ambitious.
Available for collaborations · +91 78272 29447
Have a problem that smells interesting?
I'm always up for a conversation about agentic systems, retrieval infra, or building real products on top of LLMs. The fastest path is email.
Based in New Delhi / Gurgaon, IN. Typical response within 24h. I take on a small number of outside projects each year, so ping me and let's see.



