If you want nuanced dialog, start with nuance and make room for other people's opinions.
Things I did not say:
- It's reasonable for a program fault to reboot a computer.
- It's reasonable not to check error conditions.
- It's reasonable for a program to halt and catch fire.
I wish I were making up that last one, but I'm not. Here are some unreasonable quotes:
> Think of your torrent software. If you crank your firewall to block it while it's running it will not crash. If your disk fills up it won't crash.
No one was saying it was okay for the program to crash.
> 'Halt and catch fire' is not generally considered a proportionate response
No one was advocating this.
> I've no idea where you're getting this 'unplugged the disk' thing from, AV software does not work this way.
The AV software denied all disk I/O. If you have nowhere to put data, and no access to data, then you don't have a disk. You have a paperweight.
> As for displaying random data, why would the programmer want to do this?
Obviously, the programmer did not "want" to do this. This is what happens when you try to do GPU programming and the I/O is suddenly cut. I've seen it, which is why I said it.
> But your engines shutting down can be permanent or transient. Just like disk I/O failing.
The disk I/O didn't fail. It was completely cut off by the AV program. There was no chance of it resuming until the scan completed, which could take far longer than the 5 minutes required to reboot the computer.
Speaking of which, here's where that stupid "the program rebooted the computer" myth stemmed from:
> According to one such report filed by Merge Healthcare in February, Merge Hemo suffered a mysterious crash right in the middle of a heart procedure when the screen went black and doctors had to reboot their computer.
To me, it sounds like they restarted the computer to get the AV program to stop. The program did not "crash so hard it rebooted the computer."
Here's how the program works:
> Merge Hemo consists of two main modules. The main component is the actual medical device, connected to the catheters, through which data acquisition takes place. This component is connected to a local PC or tablets via a serial port.
> The second component is a software package that runs on the doctor's computer or tablet and takes recorded data and logs it or displays it on the screen via simple-to-read charts.
So we see that the company does not have control over their environment. They have no say over what the doctors' computers are like. They have to live with the fact that the doctors' computers are running Windows, and that they run AV scans. It's not up to them.
This is important, because if the company had independently decided it was reasonable to deploy their software with an AV package, then the fault would lie with the company. But they didn't. Now, what can the company do?
Your point was that the software should behave gracefully in this environment. I agree; that was my point too.
The various people in this thread took what I said and morphed it into something so far from reality that I'm frankly a little worried that people are believing it. If I try to get a job, people might read this and conclude that I'm somehow advocating for 300-second crashes. Seriously?
My sole, singular point was this: Small programs are reliable programs. You can't have bugs in what you don't write.
That means a lot of things. But it does not mean "do not handle error conditions." I didn't even say that this program should exit. I said that the spectacular crash led to pinpointing the AV scan as the source of the issue.
I was called incompetent (indirectly), and told that my position was "extreme" and that I "denied all possibility of nuance." Ok. Sure.
I've re-read the entire article and this entire thread to double check myself and make sure that my assumptions are correct here, so if you see a mistake, please call it out with a quote.
I agree that I'm now being a little shall-we-say heated, and it's annoying that I'm now doing that because of how much I was provoked here. Actually, this is more amusing than annoying. If the whole world is claiming you came across poorly, then you came across poorly, regardless of what you think. I'm wondering where it all went wrong. So please, tell me: What aggression do you feel I started with? I'm genuinely hoping to learn here.
Isn't this all a little tedious? Why are we even doing this? Aren't there more interesting thoughts to think than litigating what someone did or didn't say? I don't know why this happened, and I don't know specifically what you want. But I'm open to suggestions.
Thank you for the advice. I appreciate it. Sorry for the sour grapes.
As I wrote in my other two posts tonight (more like morning, sigh), I tend to tune out emotional and aggressive writing styles. That's probably why my writing style tends to look aggressive. It's just the type of debates I tend to end up in (sigh, again).
So I apologize again if that got you upset at me.
AV software does not deny all disk I/O. It just denies write access to a file very briefly and then moves on to the next file; as the article stated, it was a scheduled scan.
So you do have a disk but a few files temporarily can't be written to (still can be read). The program will get an error code from the write function and can just try to write again.
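To make "just try to write again" concrete, here is a minimal Python sketch of a retrying append. This is only an illustration, not how Merge Hemo works: the name `append_with_retry`, the retry count, the delay, and the `opener` parameter (there so the behavior can be exercised without a real lock) are all my own assumptions.

```python
import time

def append_with_retry(path, data, attempts=5, delay=0.05, opener=open):
    """Append bytes to path, retrying if the write is temporarily refused.

    Returns the number of attempts it took. Re-raises the last error if
    every attempt fails, so the caller can still decide what to do.
    """
    for attempt in range(1, attempts + 1):
        try:
            with opener(path, "ab") as f:
                f.write(data)
            return attempt
        except OSError:
            # A file held by another process typically surfaces as
            # PermissionError on Windows, which is a subclass of OSError.
            if attempt == attempts:
                raise
            time.sleep(delay * attempt)  # wait a little longer each time
```

If the file is only locked briefly, the second or third attempt should go through and the caller never notices.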
> The disk I/O didn't fail. It was completely cut off by the AV program. There was no chance of it resuming until the scan completed, which could take far longer than the 5 minutes required to reboot the computer.
This is where you are incorrect. An AV scan does not lock the entire disk for the duration of the scan. It locks and releases each file as it scans them. Fire up Process Monitor from Sysinternals and look for yourself.
Tried it with my AV scanner with a manual scan. Looks like mine doesn't even do a lock on most of the files when doing a manual scan. So at most the file just couldn't be deleted.
> My sole, singular point was this: Small programs are reliable programs. You can't have bugs in what you don't write.
I pretty much agree with you on that one. Smaller programs are more reliable than larger programs.
> What aggression do you feel I started with? I'm genuinely hoping to learn here.
I found it humorous that you saw aggression in my words and that wpietri saw aggression in your words. Me, I just learned to tune out that sort of thing in other people's posts.
> I said that the spectacular crash led to pinpointing the AV scan as the source of the issue.
And that's what I was arguing about in my original post. You know what Root Cause Analysis is? If not, read up about it here: https://en.wikipedia.org/wiki/Root_cause_analysis
My argument was that while the crash identified the AV scan as a causal factor, it wasn't the root cause. From the Wikipedia article: "Though removing a causal factor can benefit an outcome, it does not prevent its recurrence within certainty."
The root cause was that the programmer didn't handle the error code indicating that the file was locked. There are many more causal factors that can trigger the exact same outcome: indexing service, backup program, shadow copy, etc.
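Handling the error at the write site covers every one of those causal factors at once, because the program no longer cares who is holding the file. A rough Python sketch of one way to do that: keep records in memory while the log file is unwritable and flush them on a later call. `BufferedLog` and its `max_pending` limit are hypothetical names for illustration, not anything from the vendor's software.

```python
from collections import deque

class BufferedLog:
    """Hypothetical logger that holds records in memory while the file is unwritable."""

    def __init__(self, path, max_pending=10000):
        self.path = path
        # Bounded buffer: if it fills up, the oldest records are dropped.
        self.pending = deque(maxlen=max_pending)

    def log(self, record):
        """Queue a record and try to write everything queued so far."""
        self.pending.append(record)
        return self.flush()

    def flush(self):
        """Return True if nothing is left pending, False if the write failed."""
        if not self.pending:
            return True
        try:
            with open(self.path, "a", encoding="utf-8") as f:
                f.write("".join(r + "\n" for r in self.pending))
        except OSError:
            # Locked, missing, or otherwise unwritable: keep the records
            # and let the caller carry on; a later call will retry.
            return False
        self.pending.clear()
        return True
```

The caller keeps acquiring and displaying data whether `log` returns True or False; a failed write only delays the logging.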
Unrelated to our debate, the medical company only blamed the AV software and the IT staff. Not one mention that their program had a bug.
The fact that their release notes warned against AV software means that they knew their program was deficient. That's what really pisses me off.