Secondly, the punishment meted out should be:
1. Proportional to the degree of carelessness (not much in this case: he accidentally hit a key adjacent to the right one; he didn't mow anybody down while driving drunk)
2. Inversely proportional to the likelihood of the error (in this case the likelihood was very high, since the reset-all key was (a) uncovered and single-press, and (b) right next to the single-workstation reset key).
3. Proportional to intention (this was a completely unintentional error)
If you say that the punishment should also depend on the degree of damage, I would say that the responsibility for managing the risk of such damage wasn't his but lay with the person responsible for implementing such a high-risk design. If such a person is not around, find the person who approved the design. Government departments are usually very good with paper trails.
So what did you do after you got fired from the embassy?
What I didn’t say is that it was the last day of my summer internship. The next summer they invited me back again. Everyone understood it was a mistake, but by officially firing me, someone had been punished … :)
So I guess it wasn't a big deal after all.

Your number 2 is particularly wrong. For punishment to work in affecting behavior, you can't punish for a very unlikely event, especially an accidental one.
Punishment changes behavior by making people anxious and afraid of the punishment. If you punish something that's very unlikely, it does no good. Suppose pushing ctrl-F restarted the workstations, and the guy had never pushed ctrl-F before and never been punished for it. It's very unlikely that he would ever push ctrl-F. But he happens to trip getting up and accidentally hits ctrl-F while catching his balance. Does that warrant a heavier punishment? It's more unlikely, certainly, but what would the punishment change about his behavior?
Punishment works because you are afraid of it. It works because you want to avoid it. But accidents don't happen because of defiance or a rational decision making process.
If you were punished moderately and frequently for a common mistake like hitting F7, it could correct behavior, because you would be more vigilant when reaching for F6. In terms of correcting behavior, making the punishment proportional to the degree of carelessness is not important: if someone is more careless, they will simply be punished more often. If the punishment is too strong, it will just make people fearful instead of correcting the behavior.
Firing someone who consistently makes mistakes is a corrective action, not punitive.
Punishment is generally more of a cultural thing and less a means of correcting an issue. Punishment is expected, so it's delivered. In Western culture we have a particular need to find someone responsible and punish them. Rarely, though, do you think "I don't want to get punished, so I am going to do this right," but it's not uncommon to think "I don't want to get punished, so I'll avoid this altogether."
Corrective action is better when it's not punitive. Look at the design of the software and correct that problem. Look at the systems that allowed this to happen and correct them. Work with the staff, find out why this could happen, and help them correct it. If people are punished for writing the software poorly, they're just going to cover up the flaws they find instead of bringing them to light to be corrected. If staff are punished for making mistakes, they're going to hide them instead of seeing if they can fix them.
Punishment is often just a game to abdicate responsibility. "Oh, it wasn't my fault. It was his fault. The proof that it is his fault is that he got punished for it. I've done my part to solve this problem."
Especially in complex environments like corporations and government, I think the last thing you should do is look for a person to blame. Instead of looking for the person responsible for implementing the design, or the person who approved it, look at why it was implemented and how it was approved. Instead of pinning it on an individual, pin it on a system.
I think you should only look at an individual if they are committing malfeasance to benefit themselves outside the system. If the person approved the design because they weren't aware of the potential risk, then find out why. If they approved it because there was supposed to be another safeguard to stop it from accidentally happening, find out why that wasn't there. If they approved it because they gave the contract to a friend who wasn't the best choice, and overlooked issues for a cut, then go ahead and blame them.
If there's a problem with the person, say the designer was just irredeemably bad, then remove him. If it's a problem with training, then train him. And if it was something he did as a greenhorn in the past, and he's much better now, then for God's sake don't punish him for a mistake he made years ago, when he was put on a project more important than the skills he was hired with, unless he grossly lied about those skills.
I understand that we don't want a culture that fires people for the sort of mistake anyone might make. But to be so careless on a day that is clearly an exception, when something more important than standard business procedure is going on? Can't you at least see why firing the intern for such a lack of mindfulness might make sense, even if you disagree with it?
I interned for a government organization that maintains hydroelectric dams and the software that controls them throughout the Southeastern US. A careless mistake could -- in the worst case -- cause blackouts, cost the company millions of dollars, or even cost lives (if the data-control feedback loop caused a turbine to spin up at the wrong time or to fail to shut off in an emergency). And, as is quite common in organizations with non-software-engineers running the show, the development processes were entirely haphazard. The environment was such that it would be really easy for me to push unreviewed code, or to make a stupid deployment mistake, or to be careless in a number of ways that the system didn't protect me against.
But it was OK, because they hired smart, competent people who understand the need to triple-check, if necessary, before committing. People who understood the gravity of the situation, and who didn't phone it in if they weren't feeling it that day. If I demonstrated that I wasn't one of those people, I would fully expect to be fired.
This equivocates on "consequences" of actions, though. It's obvious that the consequences of hitting F7 before the incident were understood by all responsible to be low enough that any intern could be expected to make the right decision. After the incident, the consequences of hitting F7 were sharply increased such that no future intern would ever be allowed to make that decision. But then you can't make an argument that assumes "consequences" were the same at both points in time.
We make this fallacy all the time probably because we're designed by evolution to reassess the morality of an action based on consequences. It works as a social heuristic for shaming or rewarding people but it makes no rational sense that the morality of an action should retroactively change based on future consequences. You can see similar behavior in our rewarding athletes for profound genetic advantages, or punishing criminals for profound genetic deficits. The consequences somehow redeem or condemn, and they should do neither.
In my experience that is not nearly sufficient for implementing any process that can't tolerate errors. It is necessary to have conscientious people of course, but they still are humans. Given the opportunity for 2,000 hours a year, year after year, they will screw up.
Humans are very bad at following procedures. For recent examples, consider the people operating our nuclear missiles and those protecting our bomb-grade nuclear materials. If even they don't have enough motivation to follow procedure ...
In this situation, the secretary playing the game is just as culpable as the intern, which is to say, not really responsible.
In one case accidentally pressing the wrong key deleted incredibly important data, while in your case you have plenty of time to review and ensure quality at your leisure.
Count to ten article:
> I, naturally, felt terrible and was, appropriately, fired.
Honesty Wins article:
> But, naturally, that day was my last day of work at the American Embassy. But, not because I was fired; although, I might have been fired if that day didn’t just happen to be the last scheduled day of my summer internship.
It is simply that, given the simplicity and the consequences of the error, it is the type of thing I generally see people beating themselves up over, unable to forget it.
(In case you are saying to yourself, "but he did remember the key": look at the two versions. In one he says F6 reboots the machine and F7 reboots everything; in the other, F7 reboots the machine and F8 reboots everything. So while I hope he knew the keys' functions at the time, he has since forgotten the exact key, or is substituting F keys for storytelling purposes.)
Think about it: Unix is just as "insane". If you're the guy on the console who meant to clean out some junk directory and accidentally typoed "rm -rf /", causing an international crisis, you're going to get fired too.
Then years later HN will call for Dennis Ritchie to get fired instead.
I imagine that someone wanted someone's head, so whose head should it have been? The guy who wrote the system couldn't be fired; he was in a different company. And maybe a macro had been assigned to that key, so it wasn't his fault anyway.
lrwxrwxrwx 1 root root 20 Apr 27 17:02 cc -> /etc/alternatives/cc
Guess what happens when you paste a whole list of those into a console as root?
Far from perfect - it depended on the order of deletion - but a more general solution than --preserve-root. Of course it still requires the user to mark the things they consider "important".
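For illustration, here's a minimal sketch of that "mark important things" idea: a wrapper that refuses to remove anything listed in a protection file. The wrapper name and the ~/.rm_protected path are made up, not what was actually used:

    #!/usr/bin/env bash
    # safe-rm: refuse to delete any path the user has listed as protected.
    PROTECTED_LIST="$HOME/.rm_protected"
    for target in "$@"; do
        case "$target" in -*) continue ;; esac        # skip rm's option flags
        abs=$(readlink -f -- "$target" 2>/dev/null) || continue
        if grep -qxF "$abs" "$PROTECTED_LIST" 2>/dev/null; then
            echo "safe-rm: '$abs' is marked protected; refusing." >&2
            exit 1
        fi
    done
    exec /bin/rm "$@"

Like the scheme described above, it's far from perfect (it doesn't protect anything reached indirectly), but it shows the shape of the idea.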
It is definitely a problem with Unix also.
You cannot fat-finger "rm -rf /" any more, though; modern GNU rm refuses to operate recursively on / unless you pass --no-preserve-root.
Responsibility flows upwards, not downwards. It's just unfortunate that the people at the bottom are often carrying the people above far more than they should...
An equally sufficient solution would have been to install a safety switch on any button with that much importance. Something like this, but probably smaller, or just a plastic cover that fits over the F7 key: http://www.thinkgeek.com/product/15a5/
A fireable offense would be lying about the action or trying to cover it up.
Should the lady who asked for the reset be fired for playing a game and asking him to reset the computer?
Should the technician that didn't install some sort of safety be fired for not foreseeing this issue?
He would be much less likely to make the same mistake in the future than the person who would replace him.
If there are terminals that could erase a presidential report and there is no backup available, you send a non-critical staff member to guard every one of those terminals, or at least put a sticky-note in the middle of the monitor.
I'd say several other people deserved to be fired for this, but the intern was not one of them.
Also, whether or not it was appropriate is completely irrelevant to the story being told.
I have to routinely create and drop databases on my local system. Our production databases, which I also have to connect to, contain hundreds, maybe thousands, of person-months of work. I realized that it would be a good idea, before issuing DROP DATABASE commands, to deliberately stop and double-check what server I'm connected to. Luckily, I haven't screwed that one up yet.
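One way to make that double-check mechanical instead of a matter of discipline is a wrapper that refuses known production hosts and makes you retype the database name. A minimal sketch, assuming psql; the wrapper name and host list are hypothetical:

    #!/usr/bin/env bash
    # dropdb-safe: guard rails in front of DROP DATABASE.
    set -euo pipefail
    DB="${1:?usage: dropdb-safe <database> [host]}"
    HOST="${2:-localhost}"
    PROD_HOSTS="prod-db-1 prod-db-2"    # hypothetical production servers

    # Refuse production hosts outright.
    for h in $PROD_HOSTS; do
        if [ "$HOST" = "$h" ]; then
            echo "refusing to drop '$DB' on production host '$HOST'" >&2
            exit 1
        fi
    done

    # Make the operator retype the name, not just hit Y.
    read -r -p "Drop database '$DB' on '$HOST'? Retype its name to confirm: " answer
    [ "$answer" = "$DB" ] || { echo "aborted"; exit 1; }

    psql -h "$HOST" -c "DROP DATABASE \"$DB\";"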
Based solely on the shortened account, it was not appropriate at all to fire him. Convincing him that it was is just doubly inappropriate. There may be more to the story, but as it is, it looks like angry scapegoating against a hapless, lowest-level employee.
Japan, I guess? I've never been there but the story was consistent with my impression of their work culture.
Also, in Japanese companies, it's basically impossible to fire people. They can, however, be assigned to a desk in a windowless room and be given nothing to do for several years, until they take the hint and "voluntarily" quit.
Or the US navy crew who received medals after shooting down the Iranian airline.
Once there is loss of life, it is 100% politics afterwards, with little to no practicality; just look at all the mass shootings after which there were zero changes. We simply do not value life; it is politics first.
You make it seem like they received the medals for having shot down the plane. In reality, those who were awarded medals received tour-of-duty medals for their time spent in a combat zone. I believe the distinction is important, particularly since that class of medal is routinely awarded to individuals during their time in the military.
My answer would be no, you failed at your job regardless.
Same thing with the military.
See John Allspaw's Swiss Cheese Theory : http://www.kitchensoap.com/2012/02/10/each-necessary-but-onl... .
[ Edit: I guess it's not Allspaw's model, but he applies it to systems engineering rather well - http://en.wikipedia.org/wiki/Swiss_cheese_model ]
"Accidents emerge from a confluence of conditions and occurrences that are usually associated with the pursuit of success, but in this combination—each necessary but only jointly sufficient—able to trigger failure instead."
The person who pushed the button is not at fault, the manager is not at fault, the guy who designed the button is not at fault - all are jointly responsible.
Blaming the intern does, however, reflect extremely poorly on Itoh and everyone else in the chain of command. A superior who demands retribution for a simple mistake that happened to cause him or her pain is basically worthless.
But, I forget, we're talking about Ronald Reagan.
No. This is something you would read about in The Design of Everyday Things, where Don Norman would thoroughly shame the engineers who made that system. Software shouldn't be designed on the assumption that no one makes errors.
Why didn't the backups work? System wasn't "robust" enough. (Did I just use the word "robust"?)
I appreciate that the OP was a part of the situation, but the conspiracy theories were not caused by his mistake.
It was a time of very high tension between the US and the Soviet Union. So when a plane veers off course into not just Soviet airspace, but into an explicitly cordoned-off top-secret area, ignores all communication attempts, ignores the presence of fighter jets, and just keeps on flying, the situation itself is fertile soil for conspiracy theories.
Through the glass of a yellow newspaper box, I saw the Miami News headline: the Soviets had shot down a plane carrying a Congressman. My first thought was "This is the war." Not 'a', but 'the'. The primary stance of the US military had been squared off against the USSR for more than 30 years.
Incidentally, "features" like this are why I don't trust systems that have some centralised control - IMHO giving any one individual (or organisation, in many cases these days) such power over others is not a good thing.
"Not long after I arrived in my office, I received a call from a secretary in the Agriculture Department who liked to play a computer game before her workday started. Her favorite game had a bug that regularly froze her workstation. [...] I realized that I had mistakenly hit F7 and reset all the workstations in the embassy. This realization didn’t bother me much, because no one except the Agriculture section secretary was usually on the computer system this early in the morning."
I'm sure I'd have thought something like: "Phew! Glad I made that mistake now, rather than at 11am when everyone was half-way through their morning's work. Likely no harm done at all, and I'm going to be really careful with that command in the future. Yup, definitely dodged a bullet there..."
Why didn't Reagan respond immediately? Well, he was waiting to hear from Chancellor Gorkon that the KAL flight had been successfully beamed aboard and was en route to Pluto, of course... Clearly they'd have their shit together better so it couldn't have been a 23-year old rebooting all the computers accidentally and wiping out hours of critical work -- that would just be ridiculous...
It's comforting to think that people can control the direction of every choice in the world, and that someone is at the helm.
It's uncomfortable to think about the daily series of random, unconnected decisions that drive the direction of our species.
If I had made the same mistake twice without any attempts to fix the situation long term, then, yes, I think that would have been a fire-able offense.
If you're working with people who care primarily about their own positions and egos without regard to the team as a whole, well, be prepared to be thrown under the bus when it comes time for those people to protect themselves.
Ugh.
Those with automation capabilities: keep this lesson in mind, because it will happen to you in production one day. 'dsh -a reboot' is incredibly easy to type and can have disastrous effects. Creating abstraction layers around common admin tasks can help catch simple mistakes and give prompts before dangerous behavior.
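As a sketch of what such an abstraction layer might look like (the wrapper and its confirmation string are made up, and dsh option details vary between implementations), you can require an explicit host group and make the all-hosts case deliberately hard to trigger:

    #!/usr/bin/env bash
    # reboot-hosts: wrap 'dsh ... reboot' so the all-hosts case
    # cannot be typed by accident.
    set -euo pipefail
    GROUP="${1:?usage: reboot-hosts <group|ALL>}"

    if [ "$GROUP" = "ALL" ]; then
        read -r -p "This reboots EVERY host. Type 'reboot-all-hosts' to proceed: " answer
        [ "$answer" = "reboot-all-hosts" ] || { echo "aborted"; exit 1; }
        dsh -a -- sudo reboot
    else
        dsh -g "$GROUP" -- sudo reboot
    fi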
... incompetence like that comes from having F6 next to F7, with no checks or authorisation needed for a potentially dangerous action, etc. Processes should be designed expecting people to make the common mistakes... it's what they do.
hmmm, I am pretty sure Mr. Itoh was not a Japanese national working in the American embassy. I am pretty sure he was American.
Fire the idiot who wrote that function.
"I know what I'm doing when I hit F7, but the damn system makes me sit there for 30 seconds before it does what I told it to do! Piece of junk."
The result was that software in that era tended to come with a lot more sharp edges. The age of the Recycle Bin that would save you from yourself didn't arrive until administering systems became something the general public was expected to do.
I recall that time I wrote a batch manager for the VAX 11/780 at Caltech High Energy Physics. It consisted of a program to monitor the batch queue and start jobs as scheduled ("BATch MANager", or "BATMAN"), and a program for users to submit jobs ("Run Overnight Batch INput", or "ROBIN").
The configuration file for BATMAN was stored in /etc/batman.
During development, I occasionally had to "rm /etc/batman". Of course, out of habit, as soon as I typed "/etc/" my fingers would automatically type "passwd", and once I did not catch this in time. Oops. It happened to be a Sunday morning at around 7AM, and I had to call the other admin, who handled backups, to come in and restore that file. He was annoyed.
The second time I did this, he was pretty pissed.
The third time, I fortunately had been working at the terminal we had in the machine room, and managed to shut down power to the machine before the write buffers were flushed, and the file was OK after fsck. I didn't have to deal with an angry co-admininstrator that time. Just angry physicists.
The other admin (Norman Wilson, in case anyone knows him or he reads HN) then made a link named /etc/safe_from_tzs to /etc/passwd to stop my nonsense once and for all.
That worked until the first time I wanted to overwrite /etc/batman instead of rm it.
That led to a cron job that maintained a copy of /etc/passwd in a separate file, and periodically checked to see if it were missing or misformatted, and restored it if so.
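A minimal sketch of that kind of watchdog (the backup path and the sanity check are illustrative; the original surely differed):

    #!/bin/sh
    # passwd-watchdog: keep a snapshot of /etc/passwd and restore it
    # if the live file goes missing or fails a basic sanity check.
    BACKUP=/var/backups/passwd.snapshot

    # Treat the file as healthy if root's entry is present.
    if grep -q '^root:' /etc/passwd 2>/dev/null; then
        cp /etc/passwd "$BACKUP"        # refresh the known-good copy
    elif [ -s "$BACKUP" ]; then
        cp "$BACKUP" /etc/passwd        # restore the last snapshot
        logger "passwd-watchdog: restored /etc/passwd from $BACKUP"
    fi

Run it from root's crontab every few minutes, e.g. */5 * * * * /usr/local/sbin/passwd-watchdog.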
No amount of training can prevent something like this. It's like today's browsers, where the tab can be closed with ctrl+w and the whole window with ctrl+q. It doesn't matter how many times you've done it or how used you are to the position of the 'w'. One day you will close the whole window by accident.
The power switch on the IBM PC was way at the back so that people couldn't unintentionally reset the computer. The same thinking went into Ctrl-Alt-Del, a combination that people wouldn't accidentally hit.
So having a system where F7 would reboot the entire system was pretty dumb, even in the early 80s.
As in: I hit F6, a "Do you want to reboot this?" dialog pops up, I hit 'Y'; a "Do you really, really want to reboot this?" dialog follows, and I hit 'Y' again.
Instead of actually reading what it says, you just press F6-Y-Y in quick succession.
Modern interfaces sometimes make you type some kind of string to confirm, but most either use a password (like sudo) or some hardcoded string that everyone eventually memorizes.
But even today, Windows 7 only makes you click that one button in UAC, and most people probably do it without even thinking about it.
I wonder if he had a supervisor sysadmin that he was working under, but given how he described his boss, that seems unlikely as well.
If something has changed over these years, it's the overall understanding of design principles and their popularization. (Thanks, Don Norman and other people in the field!)
Training has nothing to do with it. Even if you can train a person to work with a badly designed system without making mistakes (often), designing the system well in the first place is almost always significantly easier and cheaper.
For example, accidental key presses can be easily prevented by requiring the user to type a command of reasonable length. Typing "reboot-all-workstations" is not that difficult, but it would definitely prevent the incident described in the article.
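A sketch of what that looks like in practice; reset_all_workstations is a hypothetical stand-in for whatever F7 actually invoked:

    # Require the full phrase; a single stray keystroke can't trigger it.
    read -r -p "Type 'reboot-all-workstations' to confirm: " answer
    if [ "$answer" = "reboot-all-workstations" ]; then
        reset_all_workstations      # hypothetical hook for the real action
    else
        echo "Confirmation did not match; nothing was rebooted."
    fi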
"take a deep breath, count to ten"
Doesn't answer "who" put the button in there, I'm just thinking out-loud! :)
Or just the old rm -rf /
Older stuff asks for far fewer confirmations.
1. we know a helluva lot more about human factors design
2. we have a helluva lot more excess computer power that can be devoted to human factors
Remember that warning functions are easy to ignore. Never use a warning when you mean undo.