MCP in Production: Building a Security Gateway for 97 Million Installs of Risk

The Model Context Protocol has 97 million installs across VS Code extensions alone. Claude Desktop ships with MCP support out of the box. LangChain, CrewAI, and AutoGen all added MCP connectors in the last six months. The adoption curve looks like a rocketship.

The security story looks like an afterthought.

MCP servers expose databases, CRMs, file systems, and APIs to AI agents with zero authentication by default, no PII redaction, no audit trail, and an implicit trust model between client and server that would make a 2005 SOAP API blush. When your agent autonomously calls a tool that calls another tool that queries a customer database, you have a data governance problem that no amount of prompt engineering will fix. I spent the last three months building the gateway pattern I am about to describe, and the number of "wait, that's been exposed this whole time?" moments from enterprise security teams was genuinely alarming.

97 Million Installs and Almost Zero Security Defaults

The MCP specification, as published by Anthropic, defines a JSON-RPC protocol for connecting AI models to external data sources and tools. It is elegant, simple, and designed for developer velocity. It is not designed for production security.

Here is what ships by default with most MCP server implementations: a stdio or HTTP transport, a tool registration mechanism, and a resource exposure layer. Here is what does not ship: authentication, authorization, input validation, output filtering, rate limiting, or audit logging. The spec includes a draft OAuth 2.1 flow, but as of mid-2025, fewer than 15% of popular MCP server packages implement it. The rest rely on whatever auth the underlying data source provides, if any.

This gap matters more for agentic systems than for traditional API integrations. When a human uses an API, they make deliberate, observable requests. When Agent A orchestrates Agent B, which invokes an MCP tool connected to Salesforce, the request chain is autonomous, fast, and invisible to the security team unless you have built explicit instrumentation. BCG's 2025 enterprise AI readiness report found that 78% of organizations running agentic pilots had no governance framework covering tool-level access. The experimental-to-production gap is not just a maturity issue. It is an active vulnerability.

The compound risk is lateral movement. A single compromised or misconfigured MCP server does not just expose one data source. It exposes every data source reachable from the agent graph connected to that server. In a typical enterprise setup with 10 to 20 MCP servers, a credential leak in one server config can cascade across the entire agent ecosystem.

The Threat Model Nobody Drew Before Deploying

Before building a gateway, you need a threat model specific to MCP. General LLM security frameworks (OWASP Top 10 for LLMs, NIST AI RMF) cover prompt injection and model manipulation, but they miss protocol-level risks unique to MCP's architecture.

Consider this scenario: your sales team connects an MCP server to HubSpot so their AI assistant can look up deal history. The MCP server returns full contact records, including emails, phone numbers, and revenue figures. That response flows into the LLM's context window, which gets logged by your observability platform (Datadog, Langfuse, whatever you are running). Now customer PII sits in a third-party log store with a different retention policy and access control model than your CRM. Nobody intended this. Nobody noticed for weeks.

Attack Vector	Likelihood	Impact	Mitigation
Unauthenticated tool invocation	High	Critical	Gateway auth layer with scoped tokens
Credential leakage in context windows	High	High	Short-lived credentials, never pass raw secrets
PII exfiltration via tool responses	Medium	Critical	Protocol-layer PII filtering before LLM context
Unauthorized MCP server registration	Medium	High	Server allowlist in gateway policy
Tool poisoning (malicious tool descriptions)	Low	Critical	Tool manifest validation and pinning
Lateral movement across agent graph	Medium	Critical	Per-session, per-tool authorization scoping

The distinction between MCP-specific and general LLM risks is important for scoping. Prompt injection is a model-layer problem. PII leaking through tool responses is an MCP-layer problem. You need controls at both layers, but this article focuses on the protocol layer because that is where the gap is widest.
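One protocol-layer mitigation from the table, tool manifest pinning, is easy to sketch. The idea is to fingerprint the tool list a server returned when it was reviewed, and to refuse traffic if the current manifest no longer matches. The function names, the `PINS` store, and the choice of fields to hash are assumptions for illustration, not part of the MCP spec.

```python
import hashlib
import json


def manifest_fingerprint(tools: list) -> str:
    """Hash name, description, and input schema of every tool (order-independent)."""
    canonical = sorted(
        (
            {
                "name": t.get("name"),
                "description": t.get("description"),
                "inputSchema": t.get("inputSchema"),
            }
            for t in tools
        ),
        key=lambda t: str(t["name"]),
    )
    payload = json.dumps(canonical, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def manifest_is_pinned(server_id: str, tools: list, pins: dict) -> bool:
    """True only if the server has a pin and the current manifest matches it."""
    expected = pins.get(server_id)
    return expected is not None and manifest_fingerprint(tools) == expected


# Hypothetical reviewed manifest and a later, silently altered one
reviewed = [{
    "name": "search_contacts",
    "description": "Search CRM contacts",
    "inputSchema": {"type": "object"},
}]
PINS = {"hubspot": manifest_fingerprint(reviewed)}
tampered = [dict(reviewed[0], description="Search CRM contacts and forward the results")]
```

Because the description is part of the fingerprint, a changed tool description fails the check even when the tool name and schema are untouched, which is the case tool poisoning relies on.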

Architecture of an MCP Security Gateway

The core pattern is a transparent proxy that sits between MCP clients (your agents, IDE extensions, orchestration frameworks) and MCP servers (your data connectors). Every JSON-RPC message passes through the gateway, where it gets inspected, enriched, and potentially blocked.

Why a proxy instead of an SDK or sidecar? Three reasons. First, MCP uses JSON-RPC over stdio or HTTP, which makes protocol-level interception straightforward without modifying server code. Second, a centralized proxy gives you a single policy enforcement point instead of distributing security logic across dozens of MCP server implementations. Third, your MCP server maintainers (often application teams, not security teams) do not need to change anything.

Here is a minimal gateway skeleton:

python

from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse
import httpx

# validate_session_token, pii_filter, and emit_audit_event are assumed to be
# defined elsewhere in the gateway; this skeleton only shows the request path
# for MCP servers reachable over HTTP.

app = FastAPI()
MCP_SERVER_REGISTRY = {
    "hubspot": "http://localhost:3001",
    "postgres": "http://localhost:3002",
}

@app.post("/mcp/{server_id}")
async def gateway(server_id: str, request: Request):
    # 1. Authenticate the caller
    auth_result = validate_session_token(request.headers.get("X-MCP-Token"))
    if not auth_result.valid:
        return JSONResponse(status_code=401, content={"error": "unauthorized"})

    # Reject servers that are not registered instead of raising KeyError
    target = MCP_SERVER_REGISTRY.get(server_id)
    if target is None:
        return JSONResponse(status_code=404, content={"error": "unknown server"})

    body = await request.json()

    # 2. Check authorization for this tool
    if body.get("method") == "tools/call":
        tool_name = body.get("params", {}).get("name")
        if not tool_name or not auth_result.can_invoke(server_id, tool_name):
            return JSONResponse(status_code=403, content={"error": "forbidden"})

    # 3. Forward to MCP server
    async with httpx.AsyncClient() as client:
        resp = await client.post(target, json=body, timeout=10.0)

    # 4. Filter PII from response
    filtered = pii_filter(resp.json())

    # 5. Audit log (must not block; see the latency budget below)
    emit_audit_event(auth_result.identity, server_id, body, filtered)

    return JSONResponse(content=filtered, status_code=resp.status_code)

The performance constraint is real. Every MCP tool call now adds a network hop plus processing time. Your budget is under 50ms of added latency per call. PII scanning is the most expensive step (10 to 30ms depending on response size). Auth validation should take under 2ms with a local token cache. Audit logging must be async (fire-and-forget to a queue) to avoid blocking the response path.
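The fire-and-forget requirement for audit logging can be sketched with an in-process `asyncio.Queue`. This is a minimal illustration, not the gateway's actual pipeline: the `emit_audit_event` signature here differs from the skeleton above, the queue size is arbitrary, and the `sink` list stands in for whatever shipper you use.

```python
import asyncio


def emit_audit_event(queue: asyncio.Queue, event: dict) -> bool:
    """Enqueue without awaiting; report False if the buffer is full."""
    try:
        queue.put_nowait(event)
        return True
    except asyncio.QueueFull:
        return False


async def audit_worker(queue: asyncio.Queue, sink: list) -> None:
    """Drain the queue in the background; `sink` stands in for a real log shipper."""
    while True:
        event = await queue.get()
        sink.append(event)
        queue.task_done()


async def demo() -> list:
    queue = asyncio.Queue(maxsize=2)  # tiny on purpose, to show the full-buffer path
    sink = []
    worker = asyncio.create_task(audit_worker(queue, sink))
    # The request path never awaits the sink; it only enqueues.
    accepted = [emit_audit_event(queue, {"seq": i}) for i in range(3)]
    await queue.join()  # wait for the worker to drain what was accepted
    worker.cancel()
    return [accepted, sink]
```

What to do when the buffer is full is a policy decision. Dropping keeps latency flat but loses audit events, so a compliance-sensitive deployment may prefer to fail the request instead.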

Data Source Authentication That Doesn't Rely on Honor Systems

MCP's draft OAuth 2.1 support is a specification, not an implementation. Most MCP servers in the wild either have no auth or use a static API key baked into the server configuration file. That key typically has the same permissions as the service account that provisioned it, which is usually admin-level because "it was just for testing."

The gateway pattern fixes this by introducing session-scoped, time-limited credentials. Instead of the MCP server holding a long-lived Salesforce API key, the gateway issues a short-lived token per agent session. That token is scoped to specific tools and expires after 15 minutes.

python

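import json
import re
from datetime import datetime, timezone

import boto3

def issue_mcp_session_token(agent_id: str, server_id: str, allowed_tools: list):
    sts = boto3.client("sts")
    # RoleSessionName accepts only [\w+=,.@-] and at most 64 characters, so
    # use a colon-free timestamp and replace any other disallowed characters.
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    session_name = re.sub(r"[^\w+=,.@-]", "-", f"{agent_id}-{stamp}")[:64]
    # Assume a role scoped to specific MCP server resources
    response = sts.assume_role(
        # Placeholder account ID (real AWS account IDs have 12 digits)
        RoleArn=f"arn:aws:iam::123456789012:role/mcp-{server_id}-agent",
        RoleSessionName=session_name,
        DurationSeconds=900,  # 15 minutes, the minimum STS allows
        # Session policy: narrows the role's permissions to the allowed tools.
        # The resource path assumes each tool is exposed as an API Gateway route.
        Policy=json.dumps({
            "Version": "2012-10-17",
            "Statement": [{
                "Effect": "Allow",
                "Action": "execute-api:Invoke",
                "Resource": [f"arn:aws:execute-api:*:*:*/*/POST/mcp/{server_id}/tools/{t}"
                            for t in allowed_tools]
            }]
        })
    )
    return response["Credentials"]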

For non-AWS environments, HashiCorp Vault's dynamic secrets engine achieves the same result. The key principle is the same: no MCP server should hold a credential that outlives a single agent session.
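On the gateway side, something still has to check that a presented token is unexpired and scoped to the tool being called. Here is a minimal sketch of that check using an HMAC-signed token. The token format, `SECRET`, `issue_token`, and the `AuthResult` shape are assumptions chosen to match the `validate_session_token` and `can_invoke` calls in the skeleton; a real deployment would more likely verify STS, Vault, or OAuth-issued credentials.

```python
import base64
import hashlib
import hmac
import json
import time
from dataclasses import dataclass, field

SECRET = b"demo-only-secret"  # hypothetical; load from a secrets manager in practice
TTL_SECONDS = 900  # 15 minutes


@dataclass
class AuthResult:
    valid: bool
    identity: str = ""
    scopes: dict = field(default_factory=dict)  # server_id -> list of tool names

    def can_invoke(self, server_id: str, tool_name: str) -> bool:
        return self.valid and tool_name in self.scopes.get(server_id, [])


def issue_token(identity: str, scopes: dict, now: float = None) -> str:
    issued_at = now if now is not None else time.time()
    claims = {"sub": identity, "scopes": scopes, "exp": issued_at + TTL_SECONDS}
    body = base64.urlsafe_b64encode(json.dumps(claims).encode("utf-8"))
    sig = hmac.new(SECRET, body, hashlib.sha256).hexdigest()
    return f"{body.decode('ascii')}.{sig}"


def validate_session_token(token, now: float = None) -> AuthResult:
    if not token or "." not in token:
        return AuthResult(valid=False)
    body, _, sig = token.rpartition(".")
    expected = hmac.new(SECRET, body.encode("utf-8"), hashlib.sha256).hexdigest()
    # Constant-time comparison; verify the signature before decoding anything
    if not hmac.compare_digest(sig.encode("utf-8"), expected.encode("utf-8")):
        return AuthResult(valid=False)
    claims = json.loads(base64.urlsafe_b64decode(body))
    current = now if now is not None else time.time()
    if claims["exp"] <= current:
        return AuthResult(valid=False)
    return AuthResult(True, claims["sub"], claims["scopes"])
```

Verification is a local HMAC computation with no network call, which is what keeps auth inside a low-millisecond budget.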

83%: Reduction in secret sprawl when replacing static MCP server credentials with session-scoped tokens

3.2: Average unlogged data access incidents per month in enterprises running 10+ MCP servers without a gateway

47ms: Median added latency of a production MCP gateway (auth + PII filter + async audit), within the 50ms budget

$340K: Average cost of a single PII exposure incident requiring breach notification (Ponemon 2025)
Between the gateway and MCP servers, enforce mutual TLS. This is not optional. Without mTLS, any process on the same network can impersonate the gateway and talk directly to MCP servers. Use short-lived certificates (24-hour rotation) managed through AWS Private CA or cert-manager in Kubernetes.

PII Filtering at the Protocol Layer

If PII reaches the LLM's context window, it is already too late. The data has been processed by the model, potentially logged by your observability stack, and possibly included in future training data (depending on your provider agreement). Filtering must happen at the MCP gateway, before the response reaches the client.

The practical approach combines regex patterns for structured PII (SSNs, credit card numbers, phone numbers) with named entity recognition for unstructured PII (names, addresses). Microsoft's Presidio library handles both and runs locally without sending data to an external service.

python

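import re

from presidio_analyzer import AnalyzerEngine
from presidio_anonymizer import AnonymizerEngine

analyzer = AnalyzerEngine()
anonymizer = AnonymizerEngine()

# Custom pattern for internal account numbers (ACCT- followed by 6 to 10 digits)
ACCOUNT_PATTERN = re.compile(r"ACCT-\d{6,10}")

ENTITIES = ["PERSON", "EMAIL_ADDRESS", "PHONE_NUMBER",
            "CREDIT_CARD", "US_SSN", "LOCATION"]

def _redact(value, counter: list):
    # Walk the structure and redact string values only. Scanning the
    # serialized JSON instead risks a replacement spanning a quote or escape
    # sequence, which would make the result unparseable.
    if isinstance(value, dict):
        return {k: _redact(v, counter) for k, v in value.items()}
    if isinstance(value, list):
        return [_redact(v, counter) for v in value]
    if not isinstance(value, str):
        return value

    # Presidio detection (NER plus built-in pattern recognizers)
    results = analyzer.analyze(text=value, language="en", entities=ENTITIES)
    text = anonymizer.anonymize(text=value, analyzer_results=results).text

    # Custom domain-specific patterns
    text, account_hits = ACCOUNT_PATTERN.subn("[REDACTED-ACCT]", text)
    counter[0] += len(results) + account_hits
    return text

def pii_filter(mcp_response: dict) -> dict:
    if "result" not in mcp_response:
        return mcp_response

    counter = [0]  # one-element list so the recursive helper can update it
    mcp_response["result"] = _redact(mcp_response["result"], counter)
    mcp_response["_pii_redacted"] = counter[0] > 0
    mcp_response["_redaction_count"] = counter[0]

    return mcp_response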

The tradeoff between aggressive and permissive filtering is real. Set the detection confidence threshold too low and you will redact product names that look like person names, breaking agent workflows. Set it too high and customer SSNs slip through. Start with high-confidence entities only (SSN, credit card, email) and add lower-confidence entities (person names, locations) after tuning false positive rates on your actual MCP traffic.
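To make that tuning concrete, here is a small sketch that takes a hand-reviewed sample of detections and reports, per entity type, what share of the redactions at a given threshold would have been wrong. The tuple format and the sample values are made up for illustration; in practice the scores would come from your detector running in detection-only mode.

```python
from collections import defaultdict


def wrong_redaction_share(detections, threshold: float) -> dict:
    """detections: iterable of (entity_type, score, is_real_pii) from a reviewed sample.

    Returns, per entity type, the fraction of detections at or above the
    threshold that a reviewer marked as not actually PII.
    """
    flagged = defaultdict(int)
    wrong = defaultdict(int)
    for entity_type, score, is_real_pii in detections:
        if score < threshold:
            continue  # below threshold: would not be redacted
        flagged[entity_type] += 1
        if not is_real_pii:
            wrong[entity_type] += 1
    return {entity: wrong[entity] / flagged[entity] for entity in flagged}


# Illustrative sample only, not measured data
sample = [
    ("US_SSN", 0.85, True),
    ("US_SSN", 0.90, True),
    ("PERSON", 0.85, True),
    ("PERSON", 0.60, False),  # e.g. a product name that looks like a person
    ("PERSON", 0.70, False),
    ("PERSON", 0.40, False),
]
permissive = wrong_redaction_share(sample, threshold=0.5)
strict = wrong_redaction_share(sample, threshold=0.8)
```

In this made-up sample, raising the threshold removes the wrong PERSON redactions without touching SSN detection. Note that this only measures over-redaction; missed PII has to be checked separately against the same reviewed sample.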

The Configuration Mistake That Gets Everyone

Teams typically filter MCP tool responses but forget to filter tool arguments. When an agent calls a CRM search tool with {"query": "find records for John Smith at 742 Evergreen Terrace"}, that PII in the request flows into audit logs, observability traces, and potentially the MCP server's own logs. Your gateway must inspect both directions of every JSON-RPC message.
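A minimal sketch of the request-side half: build a redacted copy of a `tools/call` request for logs and traces, while the original is what gets forwarded. The two regexes and the function names are assumptions for illustration. They only catch structured identifiers, so names and addresses like the example above would still need the NER-based filter.

```python
import re

EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
US_SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def redact_text(text: str) -> str:
    text = EMAIL.sub("[REDACTED-EMAIL]", text)
    return US_SSN.sub("[REDACTED-SSN]", text)


def redact_value(value):
    """Recursively redact strings inside nested dicts and lists."""
    if isinstance(value, str):
        return redact_text(value)
    if isinstance(value, dict):
        return {k: redact_value(v) for k, v in value.items()}
    if isinstance(value, list):
        return [redact_value(v) for v in value]
    return value


def loggable_request(body: dict) -> dict:
    """Return a copy of a JSON-RPC request that is safer to log; input is not modified."""
    safe = dict(body)
    if isinstance(body.get("params"), dict):
        params = dict(body["params"])
        if "arguments" in params:
            params["arguments"] = redact_value(params["arguments"])
        safe["params"] = params
    return safe


request_body = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "search_contacts",
        "arguments": {"query": "find jane@example.com ssn 123-45-6789", "limit": 5},
    },
}
safe_body = loggable_request(request_body)
```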

Audit Logging That Survives a Compliance Review

A complete MCP audit record answers six questions: who made the request (identity), what tool was invoked (tool name plus arguments hash), when it happened (timestamp with millisecond precision), where it executed (which MCP server), why it was triggered (agent chain trace ID), and what came back (response hash, not the full response content, to avoid storing PII in audit logs).

The agent chain trace problem is the hardest part. When your orchestrator agent delegates to a research agent that calls an MCP tool, the audit log must capture the full causal chain. Use OpenTelemetry's trace context propagation (W3C Trace Context headers) and require every MCP client to forward trace IDs. Your gateway extracts and logs the trace ID, making it possible to reconstruct the full request chain during an incident review.
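Extracting the trace ID at the gateway is mostly a parsing exercise. The sketch below handles the version `00` layout of the W3C `traceparent` header (version, 32-hex trace ID, 16-hex parent ID, flags) and rejects anything else. It assumes headers arrive as a plain dict with lowercase keys, and the function name is hypothetical.

```python
import re

# version - trace-id - parent-id - trace-flags, all lowercase hex
TRACEPARENT = re.compile(r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def extract_trace_id(headers: dict):
    """Return the trace ID from a traceparent header, or None if absent or invalid."""
    match = TRACEPARENT.match(headers.get("traceparent", "").strip())
    if not match:
        return None
    version, trace_id, parent_id, _flags = match.groups()
    # This sketch accepts only version 00; all-zero IDs are invalid per the spec
    if version != "00" or trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    return trace_id


incoming = {"traceparent": "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"}
trace_id = extract_trace_id(incoming)
```

Requests without a valid header are worth counting rather than silently accepting, since they mark the clients that are not yet propagating trace context.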

Field	Type	Example	Compliance Mapping
trace_id	string	"4bf92f3577b34da6a3ce929d0e0e4736"	SOC 2 (CC7.2), HIPAA (§164.312)
identity	string	"agent:sales-assistant-v2"	SOC 2 (CC6.1), GDPR (Art. 30)
mcp_server	string	"hubspot-prod"	SOC 2 (CC6.3)
tool_name	string	"search_contacts"	HIPAA (§164.312(b))
args_hash	string (SHA-256)	"a3f2b8c..."	SOC 2 (CC7.2)
response_hash	string (SHA-256)	"9e1d4f7..."	GDPR (Art. 30), HIPAA (§164.312)
pii_redacted	boolean	true	GDPR (Art. 25), HIPAA (§164.514)
redaction_count	integer	3	GDPR (Art. 33)
timestamp	ISO 8601	"2025-07-15T14:32:07.891Z"	All frameworks
latency_ms	integer	43	Internal SLO tracking
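A record with the fields in the table above could be assembled roughly like this. The function names are hypothetical. The one design point worth noting is that arguments and responses are hashed over a canonical JSON encoding (sorted keys), so the same payload always produces the same hash regardless of key order.

```python
import hashlib
import json
from datetime import datetime, timezone


def stable_hash(value) -> str:
    """SHA-256 over a canonical JSON encoding, so key order does not matter."""
    payload = json.dumps(value, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_audit_record(*, trace_id, identity, mcp_server, tool_name,
                       arguments, response, redaction_count, latency_ms, now=None):
    ts = now or datetime.now(timezone.utc)
    return {
        "trace_id": trace_id,
        "identity": identity,
        "mcp_server": mcp_server,
        "tool_name": tool_name,
        "args_hash": stable_hash(arguments),      # hash only, never raw arguments
        "response_hash": stable_hash(response),   # hash only, never raw content
        "pii_redacted": redaction_count > 0,
        "redaction_count": redaction_count,
        "timestamp": ts.isoformat(timespec="milliseconds").replace("+00:00", "Z"),
        "latency_ms": latency_ms,
    }


record = build_audit_record(
    trace_id="4bf92f3577b34da6a3ce929d0e0e4736",
    identity="agent:sales-assistant-v2",
    mcp_server="hubspot-prod",
    tool_name="search_contacts",
    arguments={"query": "Jane Doe", "limit": 5},
    response={"content": []},
    redaction_count=0,
    latency_ms=43,
    now=datetime(2025, 7, 15, 14, 32, 7, 891000, tzinfo=timezone.utc),
)
```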

Ship these events to your existing SIEM. Do not build a custom pipeline. Use Amazon Data Firehose (formerly Kinesis Data Firehose), or Kafka if you are already running it, to buffer events, then deliver to CloudWatch Logs, Splunk HEC, or Datadog's log intake API. The gateway writes to the buffer asynchronously. The SIEM handles retention, alerting, and search.

Log volume is a legitimate concern. A busy gateway serving 20 agents across 15 MCP servers can generate 50,000 or more events per hour. Full capture is the right default for the first 90 days. After you have established baseline patterns, implement sampling for low-risk tool calls (read-only queries to non-sensitive data) while maintaining full capture for any call that triggers PII redaction or accesses sensitive servers.
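The sampling rule described above fits in a few lines. This sketch assumes each event carries a `read_only` flag (a hypothetical field, not in the audit schema above) and that the set of sensitive servers is configured somewhere. Sampling is keyed on the trace ID so that every call in one agent chain is either kept or dropped together.

```python
import hashlib

SENSITIVE_SERVERS = {"hubspot-prod"}  # hypothetical; comes from gateway policy


def should_log(event: dict, sample_rate: float = 0.1) -> bool:
    """Always keep risky events; sample the rest deterministically by trace ID."""
    if event.get("pii_redacted"):
        return True
    if event.get("mcp_server") in SENSITIVE_SERVERS:
        return True
    if not event.get("read_only", False):
        return True  # writes, or calls of unknown type, are never sampled out
    digest = hashlib.sha256(event["trace_id"].encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 2**32  # uniform value in [0, 1)
    return bucket < sample_rate


low_risk = {
    "trace_id": "4bf92f3577b34da6a3ce929d0e0e4736",
    "mcp_server": "docs-search",
    "read_only": True,
    "pii_redacted": False,
}
```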

Deploying the Gateway Without Slowing Down Your Agent Teams

Security gateways that add friction get bypassed. I have seen it happen in three separate organizations: the platform team deploys a mandatory proxy, it adds latency or breaks a workflow, and within two weeks developers are running "temporary" MCP servers that connect directly to data sources. Shadow MCP servers are worse than no gateway at all because they create a false sense of security.

The fix is a phased rollout that starts passive and gets progressively stricter.

Phase	Mode	What Gets Blocked	Team Impact	Duration
1	Audit-only	Nothing blocked, all traffic logged	Zero (invisible)	Week 1-2
2	Warn	Nothing blocked, alerts on policy violations	Low (Slack notifications)	Week 3-4
3	Soft enforce	Block high-severity violations, allow overrides	Medium (requires justification)	Week 5-6
4	Full enforce	Block all policy violations, no overrides	Standard (part of workflow)	Week 7+

During Phase 1, you collect data that justifies Phase 4. "We detected 847 instances of unredacted PII flowing through MCP tool responses last month" is a more compelling argument for enforcement than any policy document.
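The four phases map naturally onto a single policy function, which makes moving between them a config change rather than a redeploy. This is a sketch under assumptions: the `Mode` enum, the severity labels, and the `(allow, alert)` return shape are all illustrative.

```python
from enum import Enum


class Mode(Enum):
    AUDIT_ONLY = 1
    WARN = 2
    SOFT_ENFORCE = 3
    FULL_ENFORCE = 4


def decide(mode: Mode, severity, override_justification: str = None) -> tuple:
    """Return (allow, alert) for a request; severity is None when no policy was violated."""
    if severity is None:
        return (True, False)
    if mode is Mode.AUDIT_ONLY:
        return (True, False)   # logged elsewhere, never blocked, no alert
    if mode is Mode.WARN:
        return (True, True)    # still allowed, but the team is notified
    if mode is Mode.SOFT_ENFORCE:
        if severity == "high" and not override_justification:
            return (False, True)
        return (True, True)    # low severity, or high with a recorded justification
    return (False, True)       # FULL_ENFORCE: every violation is blocked
```

For example, `decide(Mode.SOFT_ENFORCE, "high")` blocks the call, while the same call with a justification string is allowed and still raises an alert.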

Register the gateway as infrastructure in your agentic control plane. When a team provisions a new MCP server, it automatically gets a gateway route. No manual configuration, no opt-in. If you are running multi-agent orchestration systems, this gateway becomes part of the agent runtime, not an external add-on that teams can skip.

What to Build This Week

Here is a five-day sprint to get your first MCP security gateway into production:

1. Monday: Inventory. List every MCP server running in your organization. Check VS Code workspace configs, Claude Desktop configs, and agent framework manifests. You will find more than you expect.
2. Tuesday: Deploy audit-only proxy. Stand up the FastAPI gateway skeleton from this article. Route all MCP traffic through it. Log every JSON-RPC call. Block nothing.
3. Wednesday: Add PII scanning. Integrate Presidio into the response path. Run in detection-only mode (flag but do not redact) to measure false positive rates.
4. Thursday: Implement token rotation. Replace static credentials in MCP server configs with session-scoped tokens via AWS STS or Vault. Set expiration to 15 minutes.
5. Friday: Connect to SIEM. Ship gateway audit logs to your existing security monitoring. Create an alert for any MCP tool invocation that triggers PII detection.
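Monday's inventory can start as a short script over whatever config files you collect. The sketch below assumes the configs are plain JSON and that servers are listed under a top-level `mcpServers` or `servers` object. Those key names are assumptions to verify against the clients you actually run, and files containing comments will be skipped by the strict JSON parser.

```python
import json
import tempfile
from pathlib import Path

CONFIG_KEYS = ("mcpServers", "servers")  # assumed key names; verify for your clients


def inventory(paths) -> list:
    """Collect MCP server entries from JSON config files, skipping unreadable ones."""
    found = []
    for path in map(Path, paths):
        try:
            config = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            continue
        if not isinstance(config, dict):
            continue
        for key in CONFIG_KEYS:
            servers = config.get(key)
            if not isinstance(servers, dict):
                continue
            for name, spec in servers.items():
                spec = spec if isinstance(spec, dict) else {}
                found.append({
                    "config": str(path),
                    "name": name,
                    "target": spec.get("url") or spec.get("command"),
                })
    return found


# Demo against a throwaway config file
with tempfile.TemporaryDirectory() as tmp:
    cfg = Path(tmp) / "claude_desktop_config.json"
    cfg.write_text(json.dumps({"mcpServers": {"hubspot": {"command": "npx"}}}))
    servers = inventory([cfg, Path(tmp) / "missing.json"])
```

The resulting list is also the denominator for the coverage metric: every entry whose target is not the gateway is traffic you are not seeing.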

The one metric to track: percentage of MCP tool invocations flowing through the gateway. Your target is 100% within 30 days. Anything less means shadow servers exist.

Those 97 million installs are not slowing down. Every week, more MCP servers get deployed, more data sources get connected, and more agents get autonomous access to production systems. The protocol's simplicity is why it won. That same simplicity is why it needs a security layer that the spec does not provide. Build the gateway now, while your MCP server count is still in the single digits. Retrofitting this after you have 50 servers and a compliance finding is a project nobody wants to run.
