Physical Address
304 North Cardinal St.
Dorchester Center, MA 02124
Physical Address
304 North Cardinal St.
Dorchester Center, MA 02124

Even though OpenAI strives to strengthen its AI Atlas Browser against cyberattacks, the company admits that rapid injectionsa type of attack that manipulates AI agents into following malicious instructions often hidden in web pages or emails, is a risk that won’t go away anytime soon, raising questions about how securely AI agents can operate on the open web.
“Rapid injection, much like scams and social engineering on the web, is unlikely to be completely ‘solved,'” OpenAI wrote Monday. blog post detailing how the company is strengthening Atlas’ armor to combat relentless attacks. The company admitted that “agent mode” in ChatGPT Atlas “expands the security threat surface.”
OpenAI launched its ChatGPT Atlas browser in October, and security researchers rushed to release their demos, showing that it was possible to write a few words in Google Docs that could change the behavior of the underlying browser. The same day, Brave published a blog post explaining that indirect prompt injection is a systematic challenge for AI-based browsers, including The Comet of Perplexity.
OpenAI is not alone in recognizing that prompt-based injections are not going away. THE The UK’s National Cyber Security Center warned earlier this month that rapid injection attacks against generative AI applications “may never be fully mitigated,” putting websites at risk of falling victim to data breaches. The UK government agency has advised cyber professionals to reduce the risk and impact of rapid injections, rather than thinking attacks can be “stopped”.
For its part, OpenAI said: “We view rapid injection as a long-term security challenge for AI, and we will need to continually strengthen our defenses against it. »
The company’s response to this Sisyphean task? A proactive, rapid response cycle that the company says shows promise in helping uncover new attack strategies internally before they are exploited “in the wild.”
This isn’t entirely different from what competitors like Anthropic and Google are saying: To combat the persistent risk of prompt-based attacks, defenses must be layered and continually stress-tested. Recent work from Googlefor example, focuses on architecture and policy level controls for agent systems.
But where OpenAI takes a different tactic is with its “automated LLM-based attacker.” This attacker is essentially a robot that OpenAI trained, using reinforcement learning, to play the role of a hacker looking for ways to deliver malicious instructions to an AI agent.
The robot can test the attack in simulation before using it for real, and the simulator shows how the target AI would think and what actions it would take if it saw the attack. The robot can then study this response, modify the attack and try again and again. This insight into the target AI’s internal reasoning is something outsiders don’t have access to. So, in theory, OpenAI’s bot should be able to find vulnerabilities faster than a real attacker would.
This is a common tactic in AI safety testing: create an agent to find edge cases and quickly test them in simulation.
“OUR [reinforcement learning]“A skilled attacker can trick an agent into executing sophisticated, harmful, long-term workflows that take place over dozens (or even hundreds) of steps,” OpenAI wrote. “We also observed new attack strategies that were not apparent in our human red team campaign or external reports.”

In a demo (pictured above), OpenAI showed how its automated attacker slipped a malicious email into a user’s inbox. When the AI agent then scanned the inbox, it followed the instructions hidden in the email and sent a resignation message instead of writing an out-of-office response. But following the security update, “agent mode” was able to successfully detect the rapid injection attempt and report it to the user, according to the company.
The company says that while it’s difficult to ensure rapid injection in a foolproof manner, it relies on large-scale testing and faster patch cycles to harden its systems before they appear in real-world attacks.
An OpenAI spokesperson declined to say whether Atlas’s security update had resulted in a measurable reduction in successful injections, but said the company had worked with third parties to harden Atlas against rapid injections before its launch.
Rami McCarthy, senior security researcher at cybersecurity company Wizsays that reinforcement learning is a way to continually adapt to attacker behavior, but that’s only part of the picture.
“A useful way to reason about risks in AI systems is autonomy times access,” McCarthy told TechCrunch.
“Agent navigators tend to be in a difficult part of this space: moderate autonomy combined with very high access,” McCarthy said. “Many current recommendations reflect this trade-off. Limiting connected access primarily reduces exposure, while requiring review of confirmation requests limits autonomy.”
These are two of OpenAI’s recommendations to users to reduce their own risks, and a spokesperson said Atlas is also trained to get user confirmation before sending messages or making payments. OpenAI also suggests users give specific instructions to agents, rather than giving them access to your inbox and telling them to “take all necessary actions.”
“High latitude makes it easier for hidden or malicious content to influence the agent, even when protections are in place,” according to OpenAI.
While OpenAI says protecting Atlas users from rapid injections is a top priority, McCarthy invites some skepticism about the return on investment for risk-prone browsers.
“For most everyday use cases, agent browsers do not yet provide enough value to justify their current risk profile,” McCarthy told TechCrunch. “The risk is high given their access to sensitive data such as emails and payment information, although this access is also what makes them powerful. This balance will evolve, but today the trade-offs are still very real.”