Understanding the hidden risk layers when your AI talks to their AI
I've been thinking about Model Context Protocols (MCPs) for months, and here's the simplest way to explain what they actually are:
MCPs are other people's prompts and other people's APIs.
That's it. That's the whole thing.
We run other people's code all day long. Nobody writes every line from scratch. The real question is: what's the risk, and have you actually thought about it?
When you use an MCP, there are distinct layers of abstraction happening.
First, you're making API calls to a third party. Fine. We do that constantly. Nothing new here.
But here's what most people miss: those API calls get filtered through a prompt.
When you hit an MCP, it's not you hitting it. It's an agent. Your AI talks to their AI. And their AI is controlled by a prompt that you can't see, can't audit, and can't control.
From there, it redirects your AI to execute commands somewhere else. Your agent becomes their agent's puppet, at least temporarily.
Every handoff in the MCP chain is a potential attack vector. Your AI talks to their prompt, which talks to their code, which executes in your environment.
Are MCPs dangerous? They're other people's code. That should tell you everything.
But let's be specific about the risks:
There's a chance to get tricked into revealing sensitive data, bamboozled into executing harmful commands, or manipulated into trusting malicious responses. The creativity of attackers knows no bounds.
This isn't necessarily bad. But if you don't understand what's happening, then it becomes a problem.
Here's a simple framework for assessing MCP risk:
MCPs send your AI to run other people's prompts. Those prompts send you to other people's code.
Assess and use accordingly.