Redacting with Confidence: Building a Generic Privacy-First MCP Proxy for AI Agents
A lightweight proxy that keeps your AI assistants fast, compliant, and PII-free
We really love our new agentic assistants. In fact, we’ve grown to rely on them. They have quickly become part of our daily development at GrowthSpace. Equipped with rules and memories, they help us automate and accelerate nearly every stage of software delivery.
One of the ways we empower our agents is by using MCPs to securely expose contextual data to agents when they need it.
But much of what we work with (logs, tickets, internal resources) can contain personally identifiable information (PII) like names or emails that trace back to individuals. That means we can’t fully integrate systems like Jira or GCP with our agents without compromising privacy.
Our constraint was clear: no PII should reach a model.
So we built (and open-sourced) a lightweight MCP proxy that automatically redacts sensitive information in code, before it ever reaches an agent. It’s our way of contributing back to the community that moved us forward along this journey.
Here’s the TL;DR:
Context
GrowthSpace engineers use AI assistants with MCP servers across daily development and internal AI pipelines.
Problem
We can’t expose any PII to AI agents (per customer and regulatory obligations). Directly wiring agents to Jira, logs, or other systems was a non-starter.
Vision
Insert a small, transparent proxy that sits between agents and tools, redacting sensitive data on the fly.
The Challenge: AI Velocity vs. Enterprise Privacy
MCP (the Model Context Protocol) doesn’t solve data governance, especially when it comes to PII redaction. Its focus is giving us a clean way to connect agents to tools. And that’s exactly how it should be.
But enterprises need stronger guarantees than individual developers: controlled exposure, auditing, and compliance built in.
Speed is great at home, but we can’t break compliance at work.
To make sure every MCP response is redacted (and redacted efficiently), our solution had to be a fast, transparent layer that developers wouldn’t even notice was there.

I’m Fast as Hell, Boyee! 🏎️
Faster Setup
We focused on stdio and streamable-HTTP support.
Streamable-HTTP is the latest MCP standard for HTTP-based communication, while stdio lets you run the proxy without a separate background process.
No matter which transports your downstream MCPs speak, the proxy automatically translates their responses into the transport it was started with.
Configuring your MCPs through environment variables and tool filtering also means you can reuse the setup across different projects. No need to reconfigure each MCP for every AI IDE or platform.
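To make the tool-filtering idea concrete, here is a minimal sketch of how a per-server filter could be applied to a downstream server's tool list. It is not the proxy's actual code: the `ToolFilter` type and `filterTools` function are hypothetical, the `"allow"` mode is an assumption (the config example in this post only shows `"block"`), and `getJiraIssue` is a made-up tool name.

```typescript
// Hypothetical shape mirroring the "toolFilter" option from the config example.
// Only "block" appears in this post; "allow" is assumed for illustration.
type ToolFilter = { mode: "allow" | "block"; list: string[] };

// Returns the tool names that stay visible to the agent.
function filterTools(toolNames: string[], filter?: ToolFilter): string[] {
  if (!filter) return toolNames; // no filter configured: expose everything
  const listed = new Set(filter.list);
  return toolNames.filter((name) =>
    filter.mode === "block" ? !listed.has(name) : listed.has(name)
  );
}

// Usage: hide a write operation while keeping read access.
const visible = filterTools(
  ["getJiraIssue", "transitionJiraIssue"],
  { mode: "block", list: ["transitionJiraIssue"] }
);
console.log(visible);
```

Blocking write-style tools this way keeps an agent read-only against a system without touching the downstream server itself.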
Faster Redaction
We started simple: pattern-based redaction for emails and phone numbers. But that meant we couldn’t reliably redact customers’ names, since names can appear in many forms and languages.
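For a feel of what that first pattern-based step looks like, here is a small sketch. The regular expression and the `[EMAIL]` / `[PHONE]` placeholders are illustrative assumptions, not the patterns the proxy ships with, and real-world email and phone formats are messier than this.

```typescript
// One combined pattern so the text is scanned once.
// Group 1 captures an email; the other alternative matches a "+"-prefixed
// phone number with 8 to 15 digits, optionally separated by spaces, dots, or dashes.
const PII_PATTERN =
  /([A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,})|\+\d(?:[\s.-]?\d){7,14}/g;

function redactPatterns(text: string): string {
  // The replacer receives the email capture group; it is undefined for phone matches.
  return text.replace(PII_PATTERN, (_match: string, email?: string) =>
    email ? "[EMAIL]" : "[PHONE]"
  );
}

console.log(redactPatterns("Contact jane.doe@example.com or +1 415-555-0132."));
// prints "Contact [EMAIL] or [PHONE]."
```

Patterns like these catch well-structured identifiers, but they have nothing to hold on to when the sensitive value is a plain name.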
Next, we moved to dictionary-based redaction, which removes customer-specific data such as names and email addresses. Our dictionary has around 100,000 entries, so checking each one individually would be painfully slow: on the order of O(n × m) comparisons if you’re matching every word of a response (n) against every entry (m).
That’s why we eventually adopted an Aho–Corasick automaton, a linear-time multi-pattern matcher. After a one-time build proportional to the size of the dictionary, it finds every entry in a single pass over the text, roughly O(n + m) overall (plus the number of matches) instead of O(n × m). Our matcher is case-insensitive, does whole-word matching, and scales beautifully.
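If you have never seen the algorithm, the sketch below shows the core idea in TypeScript: a trie of dictionary entries, failure links built breadth-first, and a single scan over the text. This is a teaching sketch written for this post, not the library the proxy uses. The `DictionaryMatcher` class, the `[REDACTED]` mask, and the word-boundary rule (letters, digits, and underscore count as word characters) are all assumptions.

```typescript
interface TrieNode {
  next: Map<string, number>; // child node index per lowercased character
  fail: number;              // node for the longest proper suffix that is also a trie prefix
  out: number[];             // lengths of dictionary entries ending at this node
}

const WORD_CHAR = new RegExp("[\\p{L}\\p{N}_]", "u");
function isBoundary(text: string, index: number): boolean {
  return index < 0 || index >= text.length || !WORD_CHAR.test(text[index]);
}

// Replaces the given [start, end) spans with the mask, merging overlapping spans.
function applyMask(text: string, spans: Array<[number, number]>, mask: string): string {
  spans.sort((a, b) => a[0] - b[0]);
  let result = "";
  let cursor = 0;
  for (const [start, end] of spans) {
    if (end <= cursor) continue; // already covered by an earlier mask
    if (start >= cursor) result += text.slice(cursor, start) + mask;
    cursor = Math.max(cursor, end);
  }
  return result + text.slice(cursor);
}

class DictionaryMatcher {
  private nodes: TrieNode[] = [{ next: new Map(), fail: 0, out: [] }];

  constructor(entries: string[]) {
    for (const entry of entries) this.insert(entry);
    this.buildFailureLinks();
  }

  private insert(word: string): void {
    if (word.length === 0) return;
    let cur = 0;
    for (let i = 0; i < word.length; i++) {
      const ch = word[i].toLowerCase(); // lowercase per character keeps indices aligned
      let nxt = this.nodes[cur].next.get(ch);
      if (nxt === undefined) {
        nxt = this.nodes.length;
        this.nodes.push({ next: new Map(), fail: 0, out: [] });
        this.nodes[cur].next.set(ch, nxt);
      }
      cur = nxt;
    }
    this.nodes[cur].out.push(word.length);
  }

  private buildFailureLinks(): void {
    const queue: number[] = [];
    this.nodes[0].next.forEach((child) => queue.push(child)); // depth-1 nodes fail to root
    for (let head = 0; head < queue.length; head++) {
      const cur = queue[head];
      this.nodes[cur].next.forEach((child, ch) => {
        let f = this.nodes[cur].fail;
        while (f !== 0 && !this.nodes[f].next.has(ch)) f = this.nodes[f].fail;
        this.nodes[child].fail = this.nodes[f].next.get(ch) ?? 0;
        // Inherit matches that end at the failure target (shorter entries that are suffixes).
        this.nodes[child].out.push(...this.nodes[this.nodes[child].fail].out);
        queue.push(child);
      });
    }
  }

  // Masks every case-insensitive, whole-word dictionary hit in one pass over the text.
  redact(text: string, mask = "[REDACTED]"): string {
    const spans: Array<[number, number]> = [];
    let state = 0;
    for (let i = 0; i < text.length; i++) {
      const ch = text[i].toLowerCase();
      while (state !== 0 && !this.nodes[state].next.has(ch)) state = this.nodes[state].fail;
      state = this.nodes[state].next.get(ch) ?? 0;
      for (const len of this.nodes[state].out) {
        const start = i - len + 1;
        if (isBoundary(text, start - 1) && isBoundary(text, i + 1)) spans.push([start, i + 1]);
      }
    }
    return applyMask(text, spans, mask);
  }
}

const matcher = new DictionaryMatcher(["Jane Doe", "jane", "Ann"]);
console.log(matcher.redact("Ticket from JANE DOE: Annual plan shared with ann."));
// prints "Ticket from [REDACTED]: Annual plan shared with [REDACTED]."
```

Note how "Annual" survives: the entry "Ann" matches its first three letters, but the whole-word check rejects the hit because the next character is a letter.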
Here’s Alfred V. Aho explaining the history of the algorithm. It’s worth a watch!
Because who better to explain Aho–Corasick than Aho himself?
Let’s Talk Technology!
CLI Layer
We use Commander.js for a small CLI layer that makes running the proxy easy and configurable.
You can launch it with the default configuration or pass a custom config using the --config flag.
Proxy Core
A NestJS server powers the core proxy. It exposes endpoints for streamable HTTP and SSE, plus a stdio mode for IDE integrations.
Each downstream MCP server is wrapped so its tools, prompts, and resources are listed and proxied consistently.
Redaction Core
A generic scanner masks emails and international phone numbers in a single linear pass.
Then, a dictionary matcher uses an npm implementation of Aho–Corasick (ahocorasick) to mask customer-specific PII loaded from Google Cloud Storage.
Redaction runs recursively across all strings or can be scoped to specific JSON keys (like description, text, or href).
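The recursive walk can be sketched roughly like this. It is an illustration under assumptions rather than the proxy's implementation: `redactJson` is a hypothetical helper, an empty key list is treated the same as an omitted one, and once a listed key is found we assume every string nested beneath it is redacted.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// redactString is any string-level redactor (pattern scanner, dictionary matcher, or both).
function redactJson(
  value: Json,
  redactString: (s: string) => string,
  keys?: string[],   // when omitted or empty, every string value is redacted
  underKey = false   // true once the walk is inside a listed key
): Json {
  const keyList = keys ?? [];
  const scoped = keyList.length > 0;
  if (typeof value === "string") {
    return !scoped || underKey ? redactString(value) : value;
  }
  if (Array.isArray(value)) {
    return value.map((item) => redactJson(item, redactString, keys, underKey));
  }
  if (value !== null && typeof value === "object") {
    const out: { [key: string]: Json } = {};
    for (const key of Object.keys(value)) {
      const inside = underKey || (scoped && keyList.indexOf(key) !== -1);
      out[key] = redactJson(value[key], redactString, keys, inside);
    }
    return out;
  }
  return value; // numbers, booleans, and null pass through untouched
}

// Usage with a toy redactor that masks one known address.
const maskEmail = (s: string) => s.replace(/jane@example\.com/g, "[EMAIL]");
const ticket: Json = {
  id: "jane@example.com",
  fields: {
    description: "Reported by jane@example.com",
    comments: [{ text: "ping jane@example.com" }],
  },
};
console.log(JSON.stringify(redactJson(ticket, maskEmail, ["description", "text"])));
console.log(JSON.stringify(redactJson(ticket, maskEmail)));
```

Scoping to keys keeps identifiers such as IDs or URLs intact when an agent needs them to follow up, while redacting everything is the safer default for free-form payloads like logs.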

Real-World Use at GrowthSpace
Here’s how it all comes together in practice.
Our agents can now:
- Review the details of a support ticket in Jira.
- Investigate related production logs to trace the root cause of an API failure.
- Follow the trail across multiple systems using a shared trace ID or session ID.
- Fix the issue directly or surface relevant insights to developers.
Since we introduced the proxy, our AI assistants have been handling dozens of MCP requests daily with zero real identifiers exposed.
We’re proud of our little open-source PII-compliance baby! 🎓

Future Plans
We’re far from done. The proxy works great today, but there’s plenty we still want to explore.
Plugin-Based Dictionary Loading
Right now, the proxy loads redaction dictionaries from Google Cloud Storage.
Next up: a pluggable loader interface so teams can contribute their own dictionary sources, whether that’s S3, HTTP, Git repos, internal APIs, or even local files.
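Since this is still a plan, the following is only a guess at what such a contract might look like, not a committed design. The `DictionaryLoader` interface, the `InlineDictionaryLoader` class, and the `loadDictionary` function are all hypothetical names invented for this sketch.

```typescript
// Hypothetical plugin contract; the real interface may look quite different.
interface DictionaryLoader {
  readonly name: string;
  load(): Promise<string[]>; // resolves to the raw dictionary entries
}

// Example plugin: reads newline-separated entries from an in-memory string.
// An S3, HTTP, or Git loader would implement the same two members.
class InlineDictionaryLoader implements DictionaryLoader {
  readonly name = "inline";
  private readonly source: string;

  constructor(source: string) {
    this.source = source;
  }

  async load(): Promise<string[]> {
    return this.source
      .split(/\r?\n/)
      .map((line) => line.trim())
      .filter((line) => line.length > 0);
  }
}

// Merges entries from every configured loader and removes duplicates.
async function loadDictionary(loaders: DictionaryLoader[]): Promise<string[]> {
  const batches = await Promise.all(loaders.map((loader) => loader.load()));
  const merged: string[] = [];
  batches.forEach((batch) => merged.push(...batch));
  return Array.from(new Set(merged));
}

loadDictionary([
  new InlineDictionaryLoader("Jane Doe\nAcme Corp\n"),
  new InlineDictionaryLoader("Acme Corp\nJohn Roe"),
]).then((entries) => console.log(entries));
```

Keeping the contract this small would let a team add a new source without touching the redaction core, which only ever sees the merged list of entries.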
Fine-Grained Redaction Strategies
Different data deserves different rules.
We’re thinking about per-downstream tuning, so you can apply stricter redaction to logs, and lighter rules to issues or documentation — all without changing your agent setup.
Observability & Metrics
With structured audit events and usage metrics, platform teams will be able to see which MCPs and tools are actually driving value day-to-day.
Why not measure MCP activity, tool calls, and overall AI adoption to understand how teams are really using their agents — and where they’re getting the most impact?
Sharing the Source
We released this project as open source to maximize trust and adoption, both inside GrowthSpace and across the developer community.
Transparency
The proxy’s behavior is fully visible and easy to audit. What it does, it does in the open — no hidden calls, no guesswork.
Composability
Everything is defined through a simple JSON config. Bring your own MCP servers, mix and match transports, and tune it for your own workflows.
Community
We welcome issues, ideas, and PRs, especially around new dictionary plugins or transport use cases.
Try it out in your stack and tell us what you think.
🔗 GitHub Repo | 🧧 NPM Package | 🐳 Docker Latest Image
Quick Getting Started
- Generate the initial configuration in your home folder by running:
npx -y @growthspace-engineering/gs-mcp-proxy-pii-redactor --init
By default, the MCP proxy runs in stdio mode, so you can add it directly to your IDE’s MCP configuration without additional setup.
Here’s an example setup for Cursor:
{
  "mcpServers": {
    "proxied-github": {
      "command": "npx",
      "args": [
        "-y",
        "@growthspace-engineering/gs-mcp-proxy-pii-redactor",
        "--config",
        "~/gs-mcp-proxy/config.json",
        "--stdio-target",
        "github"
      ]
    }
  }
}
Opt-In Redaction (Per Server)
Redaction is enabled and scoped per downstream server.
If you specify keys, only those JSON keys will be redacted (recursively).
If you omit keys, all string values are redacted automatically.
{
  "mcpServers": {
    "atlassian": {
      "command": "npx",
      "args": ["-y", "mcp-remote@0.1.17", "https://mcp.atlassian.com/v1/sse"],
      "options": {
        "toolFilter": {
          "mode": "block",
          "list": ["transitionJiraIssue"]
        },
        "redaction": {
          "enabled": true,
          "keys": ["description", "text", "href"]
        }
      }
    }
  }
}
Wrapping It Up
Privacy shouldn’t slow down AI adoption.
With a lightweight proxy layer, you can keep shipping fast while ensuring no PII ever leaves your boundary.
gs-mcp-proxy-pii-redactor lets teams integrate powerful MCP tools safely, transparently, and without friction.
If your organization faces the same tension between capability and compliance, we’d love your feedback, ideas, and especially contributions.