Sasi Sundar
Founder at Giant.
AI agents trust MCP responses they shouldn't.
Schema violations. Null responses. Timeouts.
Vouqis is a reliability gateway that sits between your AI agent and MCP server, catching failures before they reach users.
AI systems do not fail because models are weak.
They fail because infrastructure silently lies.
Most production failures happen when every layer reports success while the outcome is wrong. HTTP 200 returns. Logs show no errors. Dashboards are green. The agent proceeds on a broken result.
I'm studying these failures and building infrastructure to prevent them.
Runtime MCP reliability gateway.
Vouqis sits between your AI agent and MCP server. Every request is intercepted. Every response is validated. Failures are caught before they reach the agent.
At runtime, Vouqis validates protocol behavior, detects schema violations, catches null responses, and classifies failure modes. The reliability audit helps you discover problems. The gateway stops them from happening.
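A minimal sketch of what this kind of response validation could look like, not Vouqis's actual implementation. The function name `classify_response`, the failure labels, and the `required_fields` parameter are all hypothetical. The only outside fact relied on is the JSON-RPC 2.0 rule that a response carries `"jsonrpc": "2.0"` and exactly one of `result` or `error`. Timeouts are left out because they belong in the transport call, not the response check.

```python
from typing import Any


def classify_response(http_status: int, body: Any,
                      required_fields: dict[str, type]) -> str:
    """Return a failure label, or "ok" if the response passes every check.

    Labels and the required_fields format are illustrative assumptions.
    """
    if http_status != 200:
        return "transport_error"

    # JSON-RPC 2.0: a response is an object with jsonrpc == "2.0"
    # and exactly one of "result" or "error".
    if not isinstance(body, dict) or body.get("jsonrpc") != "2.0":
        return "malformed_envelope"
    has_result, has_error = "result" in body, "error" in body
    if has_result == has_error:
        return "malformed_envelope"
    if has_error:
        return "rpc_error"

    # HTTP 200 plus a well-formed envelope can still carry nothing useful.
    result = body["result"]
    if result is None:
        return "null_response"

    # Hypothetical schema check: each required field must exist with the
    # expected type. A missing field reads as None and fails isinstance.
    if not isinstance(result, dict):
        return "schema_violation"
    for name, expected in required_fields.items():
        if not isinstance(result.get(name), expected):
            return "schema_violation"
    return "ok"
```

Every failing case in this sketch other than `transport_error` arrives with HTTP 200, which is why a status-code check alone would let it through.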
Architecture
A crash is visible. A silent failure — where the system proceeds on a wrong result — compounds undetected. By the time it surfaces, the damage is deep in the state.
Status codes report transport success. Schema validation, null checks, and timeout boundaries require a separate probe layer. Most teams don't have one.
Application-layer monitoring misses the gap between what the MCP server says and what the agent does with it. Trust scoring needs to operate closer to the wire.
Dashboards can be green while the system is broken. Observability tells you what happened. Validation tells you whether it should have.
A taxonomy of how MCP servers fail in production and why most are invisible.
What transport-layer success codes actually tell you — and what they don't.
How to turn protocol probe results into a number your CI/CD can gate on.
The probe types that almost got cut, and why the Trust Score almost became a percentage.
Why the tooling gap for AI agents mirrors what happened to software deployment in 2010.
MCP servers fail silently in production. Schema mismatches, null responses, and malformed envelopes all return HTTP 200. Standard monitoring catches none of it.
Ten deterministic probe types target schema validation, null detection, timeout boundaries, and malformed envelope handling. Probes run before the agent touches production traffic.
Trust Score: 92/100 on Exa's MCP endpoint. One probe failed: mjr-02 (malformed JSON-RPC envelope). The score is numeric and deterministic, so CI/CD can gate on it with --fail-below.
Schema validation catches more failures than runtime monitoring. Timeout boundaries are rarely tested but frequently the point of failure. CI/CD gating requires deterministic exit codes, not dashboards.
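As a rough illustration of gating on a deterministic score, the sketch below turns pass/fail probe results into a 0-100 number and an exit code. The weighted-sum formula, the function names, and the probe ids in the usage note are assumptions for illustration. The page does not state how the Trust Score is actually computed, only that a threshold can be supplied via --fail-below.

```python
def trust_score(results: dict[str, bool], weights: dict[str, int]) -> int:
    """Weighted share of passing probes, scaled to 0-100.

    Assumed formula: the real Trust Score calculation is not documented here.
    A probe missing from `results` counts as failed.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    passed = sum(w for probe, w in weights.items() if results.get(probe, False))
    return round(100 * passed / total)


def gate(score: int, fail_below: int) -> int:
    """Return a process exit code: 0 when the score meets the threshold, 1 otherwise."""
    return 0 if score >= fail_below else 1
```

With hypothetical weights `{"probe-a": 50, "probe-b": 30, "probe-c": 20}` and only `probe-c` failing, `trust_score` gives 80, and `gate(80, fail_below=90)` returns 1. Passing that value to `sys.exit` is what lets a pipeline step fail without anyone reading a dashboard.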
Truth over comfort.
Reliability over features.
Evidence over opinions.
Simplicity over complexity.
Customer pain before founder excitement.
No silent failures.
I'm building AI infrastructure from inside the problem. Not consulting on it. Not writing about it. Building it daily, shipping it, and studying what breaks.
Final-year BTech AIML — CGPA 8.2, PSCMR College
Founder of Vouqis — MCP Reliability Gateway for production AI agents
Building AI infrastructure daily — from IDE to deployment
Researching reliability failures in agentic systems
15+ AI systems shipped with explicit evaluation loops
Focused on one problem: making AI agents reliable in production.