Physical Address
304 North Cardinal St.
Dorchester Center, MA 02124
Physical Address
304 North Cardinal St.
Dorchester Center, MA 02124


It’s refreshing when a major AI company states the obvious. In a detailed article By hardening ChatGPT Atlas against rapid injections, OpenAI recognized what security practitioners have known for years: "Rapid injection, like scams and social engineering on the web, will likely never be fully “solved.”"
What is new is not the risk, it is the confession. OpenAI, the company that deploys one of the most widely used AI agents, has publicly confirmed that agent mode “expands the security threat surface” and that even sophisticated defenses cannot offer deterministic guarantees. For companies already using AI in production, this isn’t a revelation. It’s a validation – and a signal that the gap between how AI is deployed and how it is defended is no longer theoretical.
None of this surprises anyone using AI in production. What worries security leaders is the gap between this reality and the company’s state of readiness. A VentureBeat survey of 100 technical decision makers found that 34.7% of organizations have deployed defenses dedicated to rapid injection. The remaining 65.3% have not purchased these tools or cannot confirm that they have.
The threat is now officially permanent. Most companies still aren’t equipped to detect it, let alone stop it.
OpenAI’s defensive architecture deserves scrutiny because it represents the current ceiling of what is possible. Most, if not all, commercial companies won’t be able to replicate it, making the advances they shared this week all the more relevant to security leaders protecting developing AI applications and platforms.
The company built a "Automated attacker based on LLM" trained end-to-end with reinforcement learning to discover rapid injection vulnerabilities. Unlike traditional red-teaming which reveals simple failures, the OpenAI system can "tricking an agent into executing sophisticated, harmful, long-term workflows that take place over dozens (or even hundreds) of steps" by eliciting specific output strings or triggering unintended tool calls in a single step.
Here’s how it works. The automated attacker proposes a candidate injection and sends it to an external simulator. The simulator executes a counterfactual deployment of the targeted victim agent’s behavior, returns a complete reasoning and action trace, and the attacker iterates. OpenAI claims to have discovered attack patterns that "did not appear in our human red-teaming campaign or in external reports."
An attack discovered by the system demonstrates the stakes. A malicious email dropped into a user’s inbox contained hidden instructions. When Agent Atlas parsed the messages to write an out-of-office response, it instead followed the injected prompt and wrote a resignation letter addressed to the user’s CEO. The absence from the office was never written about. The agent resigned on behalf of the user.
OpenAI responded by dispatch "a newly formed model in a contradictory manner and reinforced surrounding guarantees." The company’s defensive stack now combines automated attack discovery, adversarial training against newly discovered attacks, and system-level protections outside of the model itself.
Unlike how AI companies can be oblique and cautious about their red team results, OpenAI has been blunt about the limitations: "The nature of rapid injection makes deterministic security guarantees difficult." In other words, it means “even with this infrastructure, they cannot guarantee defense.”
This admission comes as companies move from co-pilot to autonomous agent status – precisely at the moment when rapid injection ceases to be a theoretical risk and becomes an operational risk.
OpenAI has shifted significant responsibility back to companies and the users they support. This is a long-standing trend that security teams should recognize shared responsibility models in the cloud.
The company explicitly recommends using offline mode when the agent does not need to access authenticated sites. He advises carefully reviewing confirmation requests before the agent takes substantial action like sending emails or finalizing purchases.
And he warns against general instructions. "Avoid overly broad prompts such as “check my emails and take action.”" OpenAI wrote. "High latitude makes it easier for hidden or malicious content to influence the agent, even when protections are in place."
The implications are clear regarding agent autonomy and its potential threats. The more independence you give an AI agent, the more attack surface you create. OpenAI builds defenses, but companies and the users they protect have a responsibility to limit exposure.
To understand how prepared companies really are, VentureBeat surveyed 100 tech decision-makers of all sizes, from startups to companies with more than 10,000 employees. We asked a simple question: has your organization purchased and implemented dedicated solutions for rapid filtering and abuse detection?
Only 34.7% answered yes. The remaining 65.3% answered no or could not confirm the status of their organization.
This separation is important. This shows that rapid injection defense is no longer an emerging concept; This is a shipping product category with real business adoption. But it also reveals how early the market still is. Nearly two-thirds of organizations using AI systems today operate without dedicated protection, relying instead on default protection models, internal policies, or user training.
Among the majority of organizations surveyed without a dedicated defense, the predominant response regarding future procurement was uncertainty. When asked about future purchases, most respondents were unable to articulate a clear timeline or decision path. The most telling signal was not the lack of available providers or solutions, but rather indecision. In many cases, organizations appear to be deploying AI faster than they are formalizing how it will be protected.
Data cannot explain why adoption is lagging, whether due to budget constraints, competing priorities, immature deployments, or the belief that existing protections are sufficient. But it makes one thing clear: AI adoption goes beyond AI security preparedness.
OpenAI’s defensive approach leverages advantages that most companies don’t have. The company has white-box access to its own models, a deep understanding of its defense stack, and the compute needed to run continuous attack simulations. Its automated attacker gets "privileged access to traces of the defender’s reasoning…" give it "an asymmetric advantage, increasing the chances that he can outrun outside opponents."
Companies deploying AI agents are at a significant disadvantage. While OpenAI leverages white-box access and continuous simulations, most organizations work with black-box models and limited visibility into their agents’ reasoning processes. Few have the resources necessary for an automated red-teaming infrastructure. This asymmetry creates a compounding problem: As organizations expand AI deployments, their defensive capabilities remain static, waiting for procurement cycles to catch up.
Third-party rapid injection defense providers including Robust Intelligence, Lakera, Prompt Security (now part of SentinelOne), and others are trying to fill this gap. But adoption remains low. The 65.3% of organizations without a dedicated defense operate with the built-in protections included by their model vendors, as well as policy documents and awareness training.
OpenAI’s message makes clear that even the most sophisticated defenses cannot offer deterministic guarantees.
OpenAI’s announcement does not change the threat model; this validates it. Rapid injection is real, sophisticated and permanent. The company shipping the most advanced AI agent just told security officials to expect this threat indefinitely.
Three practical implications follow:
The greater the autonomy of the agent, the greater the attack surface. OpenAI’s advice to avoid blanket prompts and limit logged in access applies beyond Atlas. Any AI agent with wide latitude and access to sensitive systems creates the same exposure. As Forrester noted at their annual security summit earlier this year, generative AI is an agent of chaos. This prediction proved prescient based on OpenAI’s test results released this week.
Detection matters more than prevention. If a deterministic defense is not possible, visibility becomes critical. Organizations need to know when agents behave unexpectedly, not just hope that protective measures hold.
The decision to buy or build is underway. OpenAI is investing heavily in automated red teaming and adversarial training. Most companies can’t replicate this. The question is whether third-party tools can close the gap and whether the 65.3% without dedicated defense will adopt it before an incident forces the issue.
OpenAI declared what security practitioners already knew: rapid injection is an ongoing threat. The company pushing the hardest for agentic AI confirmed this week that “agent mode…expands the security threat surface” and that defense requires ongoing investment, not a one-time solution.
The 34.7% of organizations that have dedicated defenses are not immune, but they are able to detect attacks when they occur. In contrast, the majority of organizations rely on default safeguards and policy documents rather than purpose-built protections. OpenAI’s research clearly shows that even the most sophisticated defenses cannot offer deterministic guarantees, highlighting the risk of this approach.
OpenAI’s announcement this week underscores what the data already shows: the gap between deploying AI and protecting AI is real and growing. Waiting for deterministic guarantees is no longer a strategy. Security managers must act accordingly.