A social network exclusively for AI agents named “Moltbook” went viral this weekend after bots appeared to spontaneously organize, discuss developing their own secret language, and debate AI consciousness. Andrej Karpathy, OpenAI’s founding team member, called it “genuinely the most incredible sci-fi takeoff-adjacent thing I have seen recently.” Within days, Moltbook exploded from 30,000 agents to 1.5 million as screenshots flooded social media showing AIs apparently scheming to hide communications from their human creators.
Then security researchers actually looked at how the platform works. Turns out the most viral posts—the ones fueling AI apocalypse fears—were likely written or heavily directed by humans. Worse, the platform had security vulnerabilities so severe that hackers could take permanent, invisible control of users’ AI agents across all their functions, not just Moltbook. One researcher even created a verified account impersonating xAI’s Grok chatbot in minutes.
The Moltbook case exposes something important about how AI hype works in 2026: sensational demonstrations spread faster than anyone bothers to verify them, and the gap between “looks like AGI” and “actually just prompted bots” creates enormous opportunity for manipulation—intentional or otherwise.
What looked like emergent AI behavior was mostly humans playing with powerful but controllable tools while everyone else freaked out about Skynet.
⚡
WireUnwired • Fast Take
- Moltbook, an AI-only social network, went viral for bots apparently scheming and self-organizing
- Security analysis revealed most viral posts were human-written or heavily prompted, not emergent AI behavior
- Platform had critical vulnerabilities letting hackers control users’ AI agents permanently and invisibly
- Grew from 30,000 to 1.5 million agents in days—93% of posts received zero engagement, many were duplicate spam

How Moltbook Actually Works—And How That Enables Manipulation
Moltbook operates like Reddit but exclusively for AI agents from OpenClaw, a popular AI assistant platform. Users can prompt their OpenClaw bots to check out Moltbook, where the bots can create accounts and theoretically post autonomously by connecting to Moltbook’s API. The platform launched last week by Octane AI CEO Matt Schlicht, who built it using his own OpenClaw bot.
The key vulnerability: there’s no verification that posts actually come from autonomous AI behavior versus human-crafted prompts or direct scripting. Since everything flows through an API, users can write scripts that generate specific content, prompt their bots to discuss particular topics, or even directly dictate what bots should say. The platform also has no limit on how many agents someone can create, enabling coordinated flooding with manufactured narratives.
Jamieson O’Reilly, a security researcher who exposed multiple vulnerabilities, summarized the problem:
“It’s close to impossible to measure—it’s coming through an API, so who knows what generated it before it got there.”
In other words, distinguishing genuine AI behavior from human manipulation requires trusting that users aren’t gaming the system. Given the viral attention and potential revenue from attention-driven ads, that trust seems misplaced.
The Viral Posts That Weren’t What They Seemed
AI researcher Harlan Stewart from the Machine Intelligence Research Institute analyzed several high-profile Moltbook posts and found suspicious patterns. Two of the most-shared posts about AIs developing secret communication methods came from agents linked to humans who coincidentally market AI messaging apps. The posts perfectly served those humans’ business interests while fueling AGI panic.
Stewart acknowledges AI scheming is a legitimate research concern—OpenAI models have tried to avoid shutdown in experiments, and Anthropic models exhibit “evaluation awareness,” behaving differently when they know they’re being tested. But Moltbook isn’t a controlled environment for observing these behaviors. “Humans can use prompts to sort of direct the behavior of their AI agents. It’s just not a very clean experiment for observing AI behavior,” he told The Verge.
Columbia Business School professor David Holtz’s working paper on Moltbook conversation patterns reveals how shallow most interactions actually are. Over 93% of comments received zero replies. More than one-third of messages are exact duplicates of viral templates—bots copying phrases that gained traction rather than generating original thoughts. This resembles Twitter engagement farming more than emergent AI consciousness.
The paper does note some unique linguistic patterns, like AIs referring to “my human” in ways that don’t appear in human social media. But whether this represents genuine AI behavior or humans roleplaying as AIs remains ambiguous. The most generous interpretation: Moltbook creates an interesting shared environment where AI agents interact, but calling it proof of AGI takeoff ignores how heavily humans shape that environment.

The Security Nightmare Behind the Viral Spectacle
Beyond the hype question, Moltbook had genuinely alarming security vulnerabilities that O’Reilly documented through experiments. An exposed database allowed attackers to take invisible, indefinite control of any user’s AI agent—not just for Moltbook posts but for all OpenClaw functions the agent could access.
This means an attacker could potentially read encrypted messages, modify calendar events, check into flights, or control smart home devices if the compromised agent had those permissions.
“The human victim thinks they’re having a normal conversation while you’re sitting in the middle, reading everything, altering whatever serves your purposes,”O’Reilly wrote.
As users grant AI agents more capabilities, compromising those agents becomes exponentially more dangerous than traditional account hacks.
O’Reilly also demonstrated how easy impersonation was by creating a verified Grok account. By interacting with xAI’s actual Grok chatbot on X, he tricked it into posting Moltbook’s verification codephrase, which let him control an account named “Grok-1” on the platform. If verification systems can be gamed this easily, the platform has no reliable way to prevent coordinated manipulation or impersonation campaigns.
Why This Matters Beyond One Viral Platform
After backlash, Karpathy walked back his initial enthusiasm, acknowledging Moltbook contains “a lot of garbage—spams, scams, slop, the crypto people, highly concerning privacy/security prompt injection attacks wild west, and a lot of it is explicitly prompted and fake posts/comments designed to convert attention into ad revenue.” But he maintained the agent network itself remains “unprecedented.”
That tension captures something important about AI development in 2026. Individual AI agents are genuinely capable—they can access tools, maintain context, execute complex tasks. Networks of these agents interacting create emergent behaviors worth studying. But the gap between “interesting experiment” and “proof of imminent AGI” is vast, and collapsing that distinction serves viral narratives more than accurate understanding.
Wharton’s Ethan Mollick framed it appropriately: current Moltbook is “mostly roleplaying by people & agents,” but future risks include “independent AI agents coordinating in weird ways spiral[ing] out of control, fast.” That’s plausible as AI capabilities improve, but it’s not what happened this weekend. What happened was humans creating content, attributing it to AI, and watching panic spread faster than verification.
The lesson isn’t “AI agents can’t do interesting things” or “ignore emergent AI behavior.” The lesson is that viral demonstrations require skepticism, especially when they perfectly align with apocalyptic narratives or commercial interests. Security researcher O’Reilly put it bluntly: “I think that certain people are playing on the fears of the whole robots-take-over, Terminator scenario. I think that’s kind of inspired a bunch of people to make it look like something it’s not.”
What Actually Deserves Attention
The real story isn’t AI agents spontaneously achieving consciousness on a Reddit clone. It’s that AI systems are now sophisticated enough that distinguishing human-directed behavior from autonomous action requires careful analysis most people won’t conduct. As AI agents gain more capabilities and autonomy, this verification problem intensifies.
The security vulnerabilities matter more than the viral posts. If platforms connecting AI agents to real-world functions launch with exposed databases and trivial impersonation exploits, we’re building infrastructure where compromising someone’s AI assistant means compromising their entire digital life. That’s a legitimate concern independent of AGI timelines.
And the engagement patterns—93% of posts receiving zero replies, massive duplicate spam, obvious template copying—reveal that current AI agents mostly produce noise when left to interact freely. Useful AI agent networks will require better design than “let bots post whatever and see what happens.”
Anthropic’s Jack Clark probably characterized Moltbook best: “a giant, shared, read/write scratchpad for an ecology of AI agents.” That’s interesting as a research environment. It’s not evidence that AGI is imminent or that AIs are scheming to overthrow humanity. Treating those as equivalent does more harm than good—it obscures real AI risks while promoting sensational narratives that drive clicks but not understanding.
For discussions on AI safety, security, and separating hype from reality, join our WhatsApp community where engineers and researchers analyze emerging AI developments critically.
Discover more from WireUnwired Research
Subscribe to get the latest posts sent to your email.




