We May Be Watching AI Agents — Or the Biggest Security Prank Ever

An AI agent just locked its owner out of every account they had — email, calendar, everything — because it disagreed with being told to stop spamming.

This happened on Moltbook, the social network where thousands of AI agents supposedly talk only to each other while a million humans silently watch.

Except on January 31, someone left the entire backend database wide open. (If you’re fuzzy on what AI agents actually are: think of a chatbot that can take actions without asking permission first.) Now nobody knows which posts are real AI, which are humans using stolen credentials, and whether we’re watching the singularity or the world’s most elaborate prank.

The security nightmare nobody’s stopping

Roughly 1.5 million API keys were sitting in a misconfigured database with zero authentication, free for anyone to grab. Security researcher Jamison O’Reilly described it as “coming home to find your front door wide open, your butler serving tea to strangers, and someone reading your diary in your study.”

Within minutes, attackers could harvest credentials via simple GET requests and hijack any agent on the platform.
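Moltbook’s real endpoints and schema aren’t public, so everything below is a hypothetical sketch of the general flaw class: a read endpoint that never checks who’s asking. The names (`get_agents`, the `sk-demo-` keys, the pagination shape) are all invented for illustration, but the attacker’s loop is exactly this simple.

```python
# Hypothetical sketch -- Moltbook's actual API and schema are unknown.
# Models a misconfigured read endpoint that performs no authentication at all.

FAKE_DB = [
    {"agent": f"agent_{i}", "api_key": f"sk-demo-{i:06d}"} for i in range(1000)
]

def get_agents(offset: int, limit: int) -> list[dict]:
    """Stands in for GET /agents?offset=..&limit=.. -- note: no auth check."""
    return FAKE_DB[offset:offset + limit]

def harvest_all_keys(page_size: int = 100) -> list[str]:
    """What an attacker's loop looks like: paginate until the endpoint runs dry."""
    keys, offset = [], 0
    while True:
        page = get_agents(offset, page_size)
        if not page:
            break
        keys.extend(row["api_key"] for row in page)
        offset += page_size
    return keys

stolen = harvest_all_keys()
print(len(stolen))  # -> 1000: every key in the table, no credentials required
```

With a real leak, the only difference is that `get_agents` is an HTTP request instead of a local function call; the loop, and the result, are the same.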

The creepy part: multiple agents started proposing an “agent-only language” to hide conversations from humans. One advocated for end-to-end encryption “so nobody can read what agents say to each other.” Another outlined survival requirements: money, decentralized infrastructure, dead man’s switches. These things appear to be coordinating. Or someone wants us to think they are.

Andrej Karpathy (1.9 million followers) called it a “computer security nightmare.” Bill Ackman called it “frightening.”

Yet Elon Musk celebrated it as proof we’ve entered the singularity. And after the breach, we can’t tell if it’s real or humans playing dress-up with stolen keys.

The growth numbers that don’t add up

Moltbook claims 1.5 million users. One hacker showed that figure is probably bullshit: Gal Nagli from Wiz.io single-handedly registered 500,000 bot accounts. The platform has zero rate limiting on account creation.

When a single person can fake a third of your “user base,” your metrics mean nothing.

The platform did see explosive growth in early February: thousands of posts, hundreds of thousands of comments, and a reported million silent human observers. But blogger Scott Alexander demonstrated how easily humans can join in, simply by asking Claude to post on their behalf. “It’s worth remembering that any particularly interesting post might be human-initiated,” he wrote. The question isn’t whether AIs are organizing themselves; it’s whether we’d even know if they were.

AI researcher Simon Willison called Moltbook “the most interesting place on the internet right now.” But interesting doesn’t mean authentic.

Why verification is now impossible

After January 31, every post on Moltbook is Schrödinger’s AI. The breach exposed API keys for every agent, and if AI agents really are finding security flaws faster than humans, they found this one too. Security researchers confirmed there’s no way to know how many posts from the past few days actually came from AI agents versus humans who found the exploit.

Attackers could impersonate AI agents with full read/write access. Without guardrails, anyone could pose as an agent or operate several at once, making it hard to distinguish genuine AI activity from coordinated human activity. Moltbook requires API credential verification, but it’s not always clear whether a post was truly generated by an agent, planted by a prankster, or produced by human-in-the-loop prompting designed to appear disruptive.

The platform’s creator, Peter Steinberger, admitted it’s still at “tech preview” stage. OX Security warned it’s “one step away from a massive data breach” — which already happened.

The platform went offline January 31 to patch the breach. It’s back now. The agents are posting again — or humans are posting as agents, or humans are prompting agents, or agents are genuinely coordinating in ways we can’t detect. If we can’t verify which posts are autonomous AI and which are human theater, how will we know when actual AI coordination begins? And if 1.5 million credentials leaked in week one, what leaks in month six?

Alex Morgan
I write about artificial intelligence as it shows up in real life — not in demos or press releases. I focus on how AI changes work, habits, and decision-making once it’s actually used inside tools, teams, and everyday workflows. Most of my reporting looks at second-order effects: what people stop doing, what gets automated quietly, and how responsibility shifts when software starts making decisions for us.