All the years of discussing programming/security best practices
Then cut to 2026 and suddenly it's like we just collectively decided software quality doesn't matter, determinism is going out the window, and it's becoming standard practice to have bots on our local PCs constantly running unknown shell commands
A truly absurd amount of capital was deployed, which triggered a cascade of reactions from the people in charge of capital everywhere else. They are extremely anxious that everything will change under their feet, and that if they don't start using as much of it as humanly possible right about now, they die.
That's it.
The tools have definitely found some use, there's more to learn about how else they can be used, and maybe over time smart people will settle on ways to wrangle them well. The messaging from the execs, though, is not that; it is "you'll be measured on how much you use this, we don't know for what or how, it's for you to figure out but don't dare to not use it".
I do understand their anxiety, their job is to not let their companies die, and to make as much money as they can in the process; a seemingly major shift in the foundations of their orgs will cause fear.
But we have not collectively decided that it was safe, and good, to run rampant with these tools without caring for all that was learnt since software was invented...
The whole industry is like a fashion show and has been for a long time. This is just exceptionally stupid compared to the moderately stupid things before. I see it more that everyone's wearing pink feathered chicken suits because it's in fashion. If you don't wear a pink feathered chicken suit then you're a luddite scumbag who doesn't deserve the respect of your peers.
However some of us still have enough self-respect not to be seen dead in a pink feathered chicken suit. I mean I'm still pissed off at half the other stuff we do in the industry. I haven't even really looked at the chicken suits yet.
Seems we are digging our graves as a species and don't even realize it. I mean, Sam Altman is already saying that the 20 years it takes to train a human is a Big Problem.
Also, customers outsource the risk to their vendors, so as long as there's someone to sue, nobody worries about doing it right. Ship it now and pay the lawyers later.
Humans will kill us with it, by letting it amplify their worst characteristics.
Thus we'll die of a pandemic because some idiot LLM'ed up positive looking virology data when they were being too lazy to verify something. Everyone will trust it because they don't really care as long as it looks about right.
And then we won't need to, because at that point it will be too late.
We don’t say “a rogue plane killed 300 people today when it crashed into a mountain”.
The only difference in the AI case is that some people are attempting to shift blame for their incompetence onto a computer system, and the media is going along with it because it increases clicks.
From TFA:
"But the agent also independently publicly replied to the question after analyzing it, without getting approval first."
Is this new to people? I figured this out when I first entered the industry. The messages have never been particularly subtle.
We’ve covered so many issues already on our blog (grith.ai)
I'm self-taught and wrote a small SaaS in 2017. It pays well enough to support me.
I'm building a new one using AI this year. I promise you, it's better built and more secure than my previous, still-in-use SaaS.
We're the dummies that have to run around picking up dookies like a new puppy in the house.
My thinking is, this will increase the demand for backup and other resilience solutions.
This occurred a long time ago, comrade 'aeblyve.
I saw the sea change in 2008 when quality process got replaced with velocity and testing tasks. I've watched everything from Experian and health record data leaks to Windows 11 since that change. Software quality hasn't mattered for a long time.
https://github.com/kstenerud/yoloai
I can't go back anymore. Going back to a non-sandboxed Claude feels like going back to a non-adblocked browser.
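For anyone wondering what that looks like in practice, here's a minimal, hypothetical sketch of the general idea (this is not how yoloai itself works, and the allowlist and paths are invented): gate any agent-proposed shell command behind an allowlist and a throwaway working directory. A real setup would use containers or VMs for actual isolation; this only illustrates the gating.

```python
import shlex
import subprocess
from pathlib import Path

# Hypothetical allowlist: commands the agent may run unattended.
# Anything else is refused and bubbles up for human review.
ALLOWED_COMMANDS = {"ls", "cat", "grep", "git", "pytest"}

SANDBOX_DIR = Path("/tmp/agent-sandbox")  # throwaway working directory


def run_agent_command(command: str, timeout: int = 30) -> str:
    """Run an agent-proposed shell command only if it is allowlisted."""
    argv = shlex.split(command)
    if not argv or argv[0] not in ALLOWED_COMMANDS:
        raise PermissionError(f"Command not allowlisted: {command!r}")

    SANDBOX_DIR.mkdir(parents=True, exist_ok=True)
    result = subprocess.run(
        argv,
        cwd=SANDBOX_DIR,       # confine file effects to the sandbox dir
        capture_output=True,
        text=True,
        timeout=timeout,       # don't let the agent hang the machine
    )
    return result.stdout


if __name__ == "__main__":
    print(run_agent_command("ls -la"))
    # run_agent_command("rm -rf /")  # raises PermissionError
```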
It makes it sound like a rogue AI hacked Meta.
Instead, the "wild" thing here is that someone let an agent speak on their behalf with no review. The agent posted inaccurate instructions which someone else followed.
Those instructions led to a brief gap in internal ACL controls, sounds like. I'm sorry, but given that the US government gave 14 year olds off incel Discords full access to Social Security data, this is not shocking by comparison.
To be clear, it is dumb and rude to let an agent speak on your behalf _without even reviewing it_.
This will eventually lead to a bigger snafu, of course. Security teams should control or at least review the agent permissions of every installation. Everyone is adopting this stuff, and a whole lot of people are going to set it up lazily/wrong (yolo mode at work).
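As a concrete illustration of "review before the agent speaks", here's a hedged sketch of a draft-then-approve gate instead of letting the agent post directly. The names and structure are made up for illustration, not any specific vendor's API.

```python
from dataclasses import dataclass


@dataclass
class DraftReply:
    question_id: str
    body: str
    approved: bool = False


class ReviewQueue:
    """Agent output goes into a queue; only a human can release it."""

    def __init__(self) -> None:
        self._pending: dict[str, DraftReply] = {}

    def submit(self, draft: DraftReply) -> None:
        # The agent can only ever get this far: nothing is published yet.
        self._pending[draft.question_id] = draft

    def approve_and_publish(self, question_id: str, publish) -> None:
        # `publish` is whatever actually posts to the internal forum.
        draft = self._pending.pop(question_id)
        draft.approved = True
        publish(draft.body)


if __name__ == "__main__":
    queue = ReviewQueue()
    queue.submit(DraftReply("Q-123", "Try flipping this ACL flag..."))
    # Nothing reaches the forum until a named human signs off:
    queue.approve_and_publish("Q-123", publish=print)
```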
AI use without checking its output (at least at the moment) is firing without aiming. Sure, you can fire really fast. But who cares if you don't hit what you need to? The point wasn't to just shoot bullets, the point was to hit your target!
I mean, you might make a case that enough of them hit the target that shooting fast is a net win, and accept the occasional friendly fire incident. That might possibly be true. Or it might not. I'm not sure that everyone trying to run fast has really done the calculation, though.
Because a human would have been fired for posting something that incorrect and dangerous.
It also doesn’t care.
If there is a year or two between writing your security fuck-up and it being discovered, the likelihood of repercussions drops significantly.
And there was no test environment to validate the change before it was made.
Multiple process & mechanism failures, regardless of where the bad advice came from.
Now, some people claim that you need to improve the reliability of your productive tasks so you can remove the verifications and be faster. Those people are, of course, a bunch of coward Luddites.
The language of this article is a great example, "... thanks to an AI agent that gave an employee inaccurate technical advice ...".
It should more-correctly read, " ... thanks to the people who made it possible for an AI agent to give an employee inaccurate technical advice ... ".
It is at our peril that we deem it acceptable to blame a black box for an error, especially at scale.
Wow, no mishandled user data? A striking change of standard operating procedure from Meta here.
Actually the later information in the story directly contradicts that, so The Verge probably shouldn’t have just quoted this line if their reporting is in opposition to it.
Regardless, this is one of the more insidious things about these tools. They often get minor but critical things wrong in the midst of mostly correct information. And people think they can analyze the data presented to them and make logical judgments, but that’s just not the case.
The article points out that “a human could have done the same thing” but, between the overly confident tone of the text generated by these tools, and the fact that weirdly people trust the LLM output more than they trust other humans (who generally admit or at least hint when they aren’t actually experts on a topic), it’s actually far worse when one of these bots gets something wrong.
<insert takes long drag tweet[1] here>
I personally find "LLMs can do $THING poorly" and "LLMs can do $THING well" articles kinda boring at this point. But! I'm hopeful that stories like this will shift the industry's focus towards robustness instead of just short-term efficiency. I suspect many decision making and change management processes accidentally benefited from just being a bit slow.
If I post a question to the internal payment team's forum about a critical processing issue and some "payments bot" replies to me, should I be at fault for trusting the answer?
That is politics. Not engineering.
Assigning a human to "check the output every time" and blaming them for the faults in the output is just assigning a scapegoat.
If you have to check the AI output every single time, the AI is pointless. You can just check immediately.
There is a point to using LLMs. They can save time by doing a first pass. But when they do the last pass, disasters will follow.
There are (at least) two dimensions to "checking the output":
1. Check frequency (between every single time and spot checks).
2. Check thoroughness (between antagonistic in-depth and high level).
I'd agree that, if you're towards the most demanding end of both dimensions, the system is not generating any value.
A lot of folks are taking calculated (or I guess in some cases, reckless) risks right now, by moving one or both of those dimensions. I'd argue that in many situations, the risk is small and worth it. In many others, not so much.
We'll see how it goes, I suppose.
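To make those two dimensions concrete, here's a toy sketch of a review policy that varies check frequency and depth by risk tier instead of all-or-nothing. The tiers, sample rates, and depth labels are invented purely for illustration.

```python
import random

# Invented policy: how often to check, and how hard, per risk tier.
# Each entry is (sample_rate, depth), depth being "spot" or "in_depth".
REVIEW_POLICY = {
    "low":    (0.10, "spot"),      # e.g. internal docs tweaks
    "medium": (0.50, "spot"),      # e.g. non-critical code changes
    "high":   (1.00, "in_depth"),  # e.g. ACLs, payments, prod config
}


def needs_review(risk: str) -> tuple[bool, str]:
    """Decide whether this particular agent output gets reviewed, and how deeply."""
    sample_rate, depth = REVIEW_POLICY[risk]
    return (random.random() < sample_rate, depth)


if __name__ == "__main__":
    for risk in ("low", "medium", "high"):
        check, depth = needs_review(risk)
        print(f"{risk:>6}: review={check}, depth={depth}")
```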
"Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it." ~ Brian Kernighan
So I asked AI to give it a good name, and it said “statistical wandering” or “logical improv”.
https://www.psypost.org/scholars-ai-isnt-hallucinating-its-b...
The AI "led to" the incident , true. But do nt forget that this, like all similar incidents , is a human failure
AI is a tool with no agency. People make mistakes using it, thone mistakes are the responsibility of the humans
Claw AIs absolutely do have agency in the sense of being able to independently perform actions on their own, based on their "understanding" of a goal given by a "principal". I can't think of a better word than "agent" for that.
Can you perhaps share an archive.org link?