Have you ever tried asking an AI to generate something questionable?
Maybe a hacking script.
Maybe instructions for exploiting a system.
Most of the time, the response looks something like this:
“Sorry, I can’t help with that request.”
Or the AI finds another way to politely refuse.
At first glance, this feels reassuring.
It creates the impression that modern AI systems are safe by design — that they can recognize dangerous requests and automatically block them.
And in many cases, they do.
But there is an interesting twist.
Sometimes the answer isn’t hidden behind the AI’s rules.
Sometimes it’s hidden in how the question is asked.
Because when the wording changes — when the intent is disguised or framed differently — the system may respond in ways it normally wouldn’t.
And that is where a newer, increasingly discussed concept comes in:
AI jailbreaking.
The Illusion of AI Safety
Modern AI systems are not released into the world without safeguards.