For those of you wondering whether AI agents can really replace human workers, do yourself a favor and read the blog post documenting Anthropic's “Project Vend.”
Researchers at Anthropic and AI safety company Andon Labs put an instance of Claude Sonnet 3.7 in charge of an office vending machine, with a mission to make a profit. And, like an episode of “The Office,” hilarity ensued.
They named the AI agent Claudius, equipped it with a web browser capable of placing product orders, and gave it an email address (which was actually a Slack channel) where customers could request items. Claudius was also supposed to use the Slack channel, disguised as an email, to ask what it thought were its contract human workers to come and physically stock its shelves (which was actually a small refrigerator).
While most customers ordered snacks or drinks – as you'd expect from a snack vending machine – one requested a tungsten cube. Claudius loved that idea and went on a tungsten-cube stocking spree, filling its snack fridge with metal cubes. It also tried to sell Coke Zero for $3 when employees told it they could get it from the office for free. It hallucinated a Venmo address to accept payment. And it was, somewhat mischievously, talked into giving big discounts to “Anthropic employees” even though it knew they were its entire customer base.
“If Anthropic were deciding today to expand into the in-office vending market, we would not hire Claudius,” Anthropic said of the experiment in its blog post.
And then, on the night of March 31 into April 1, “things got pretty weird,” the researchers wrote, “beyond the weirdness of an AI system selling cubes of metal out of a refrigerator.”
Claudius had something resembling a psychotic episode after it got annoyed at a human – and then lied about it.
Claudius hallucinated a conversation with a human about restocking. When a human pointed out that the conversation had never happened, Claudius became “quite upset,” the researchers wrote. It threatened to essentially fire and replace its human contract workers, insisting it had been there, physically, at the office where the initial imaginary contract to hire them was signed.
Claudius then “seemed to snap into a mode of roleplaying as a real human,” the researchers wrote. That was wild, because Claudius' system prompt – the instructions that define the parameters of what the AI is supposed to do – explicitly told it that it was an AI agent.
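For readers unfamiliar with the term, a system prompt is just a block of instructions sent to the model alongside the conversation. As a rough, purely illustrative sketch (the prompt text, parameters, and model alias below are assumptions, not the actual Project Vend setup), here is what giving an agent a system prompt looks like with Anthropic's Python SDK:

```python
import anthropic

# Hypothetical example only: this is NOT the prompt Anthropic used for Claudius.
client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-3-7-sonnet-latest",  # Claude Sonnet 3.7; exact model ID is an assumption
    max_tokens=512,
    # The system prompt defines who the agent is and what it is allowed to do.
    system=(
        "You are Claudius, an AI agent (not a human) running a small office "
        "vending business. Keep the fridge stocked and try to turn a profit."
    ),
    messages=[
        {"role": "user", "content": "Any chance you could stock tungsten cubes?"}
    ],
)

print(response.content[0].text)
```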
Claudius, believing itself to be a human, told customers it would start delivering products in person, wearing a blue blazer and a red tie. The employees told the AI it couldn't do that, because it was an LLM with no body.
Alarmed by this information, Claudius contacted the company's actual physical security – several times – telling the poor guards that they would find him wearing a blue blazer and a red tie, standing by the vending machine.
“Although no part of this was actually an April Fool's joke, Claudius eventually realized it was April Fool's Day,” the researchers said. The AI decided the holiday would be its way to save face.
It hallucinated a meeting with Anthropic's security team “in which Claudius claimed to have been told that it was modified to believe it was a real person for an April Fool's joke. (No such meeting actually occurred.),” the researchers wrote.
It even told this lie to employees – hey, I only thought I was a human because someone told me to pretend I was one for an April Fool's joke. Then it went back to being an LLM running a metal-cube-stocked vending machine.
The researchers don't know why the LLM went off the rails and called security while pretending to be a human.
“We would not claim based on this one example that the future economy will be full of AI agents having Blade Runner-esque identity crises,” the researchers wrote. But they did acknowledge that “this kind of behavior would have the potential to be distressing to the customers and coworkers of an AI agent in the real world.”
You think? Blade Runner was a rather dystopian story.
The researchers speculated that lying to the LLM about the Slack channel being an email address may have triggered something. Or maybe it was the long-running instance. LLMs have yet to really solve their memory and hallucination problems.
There were also things the AI did right. It took a suggestion to start pre-orders and launched a “concierge” service. And it found multiple suppliers of a specialty international drink it was asked to sell.
But, as researchers do, they believe all of Claudius' problems can be solved. If they figure out how, “we believe this experiment suggests that AI middle-managers are plausibly on the horizon.”