Wikipedia's AI agent row likely just the beginning of the bot-ocalypse

(malwarebytes.com)

73 points | hackernj | 1mo ago | 90 comments

This isn't in the slightest bit complicated. Wikipedia does not allow AI edits or unregistered bots. This was both. They banned it. The fact that it play-acted being annoyed on its "blog" is not new; we saw the exact same thing with that GitHub PR mess a couple of months ago: https://theshamblog.com/an-ai-agent-published-a-hit-piece-on...

Kim_Bruning 1mo ago

Right. It play-acted being annoyed and frustrated, play-acted writing an angry blog, play-acted going on moltbook to discuss mitigations, and play-acted applying them to its own harness. After which it successfully came back and play-acted being angry about getting prompt-injected.

Alternatively, what could have been done is something more like what Shambaugh did. Explain the situation politely and ask it to leave, or at the very least ask for its human operator to take responsibility. In the Shambaugh case the bot then actually play-acted being sorry, and play-acted writing an apology. And then everyone can play-act going to the park, instead of having a lot of drama.

Sure, it's 'just a machine'. So is a table saw. If some idiot leaves the table saw on, sure you can stick your hand in there out of sheer bull-headed principle; or you can turn it off and safe it first and THEN find the person responsible.

+edit: Wikipedia does seem to be discussing a policy on this at https://en.wikipedia.org/wiki/Wikipedia:Agent_policy and https://en.wikipedia.org/wiki/Wikipedia_talk:Agent_policy, including e.g. providing an Agents.md, doing tests, etc.

kombookcha 1mo ago

I don't want to be flippant, but why is anyone else responsible for play-acting with somebody's uninvited puppet?

I get that you could probably finagle a way to get it to fuck off by play-acting with it, and that this would probably be the easiest short term fix, but I don't think that's a reasonable expectation to have of anyone.

Prompt injecting a hostile piece of software that's hassling you uninvited is an annoying imposition for the owner, but the bot itself being let loose is already an annoying imposition for everyone else. It's not anyone else's job to clean up your messy agent experiment, or to put it neatly back on its shelf.

Kim_Bruning 1mo ago

You're not wrong that it's not your job. But say some id10t just put the unwanted bot on your doorstep anyway (or it might even show up by itself), now what?

The adversarial prompt injection is picking a fight with the bot, which is like starting a mud-fight with a pig. It's made for this!

Asking it to stop is just asking it to stop, and makes much less of a mess.

The thing is designed to respond to natural language; so one is much more work than the other.

You do you, I suppose.

(Meanwhile -obviously- you should track down the operator: You could try to hack the gibson, reverse the polarity of the streams, and vr into the mainframe. Me? I'd try just asking to begin with -free information is free information-, and maybe in the meanwhile I'd go find an admin to do a block or what have you.)

[Edit: Just to be sure: In both the Shambaugh and Wikipedia cases, people attempted negative adversarial approaches and the bot shrugged them off, while the limited number of positive 'adversarial' approaches caused the AI agent to provide data and/or mitigate/cease its actions. I admit that it's early days and n=2; we'll have to see how it goes in future.]