Let me see if I get this straight: "While on vacation, I spent a significant amount of my time with a node in our production cluster attempting to understand this problem"
"Eventually, I discovered a race condition that allowed an SO_WRITE interest to be set on a socket but never cleared."
"Upon returning to the office and offering up one heck of a mea culpa"
---
He used his own vacation time to hunt down an extremely hard-to-find bug and then fixed it. Yeah, mea culpa all around. This is the kind of ridiculous attitude that has people working themselves to death.
I agree with the rest of the article; I think you should never be afraid to change and refactor your own code. This particular part just struck a nerve with me.
Hopefully I can clarify the situation. The server application he worked on was fairly new territory for him, but even more so for us. He had experience in the language (Java) that we didn't have, so it was really his project. The application has had numerous subtle bugs that never caused huge problems, just weird behaviors here and there. As time went on, it seemed for a while that the bugs would never cease. Keep in mind that this is a server heavy with NIO internals and several worker threads, so maintaining state is a very delicate process. Race conditions that seldom occur are perhaps the worst kind of bug to tackle.
Additionally, he's the new guy. Being the new guy is hard when you're handed such an important task, especially when the task itself seems to haunt you every day for several weeks. We encouraged him and never once blamed him for anything (as if we would have known better?). He simply took his responsibility personally. Perhaps too personally.
This sort of humanity in software is uncommon and humbling. He's a good kid who isn't out to prove anything to anyone, just to be the best programmer (person?) he can be. I think he can be forgiven for being too self-deprecating.
I can forgive him for being self-deprecating, but he should strive for a good work-life balance, or all those smarts and ambitions will burn out and he will be left trying to pick himself back up.
Of course, from one blog post it's hard to know what his work-life balance is, so I could just be coming out of left field. I think it resonated with me because I saw a bit of myself in what he was saying. I know that, taken too far, behavior that seems like a great idea can lead you to a place you don't want to be, one that I'm only now starting to get out of.
I think this article's approach is a good one. Take precautions, and use things like continuous builds and static analysis tools to boost your confidence, but DO make the necessary changes. Better sooner than later.
Edit: Spelling
At 6pm I started on what should have been 1 hour of work... 7 hours later, I was almost done... It threw off my entire schedule and destroyed my plans for today (I'm usually up at 6am, got up today at 11).
So yeah ... fear of breaking the software isn't unfounded.
Be granular in how you check in your changes, and backing things out (especially loosely coupled code like debug logging and senseless exception handling) carries little to no risk.
I became good at this when emacs vc-mode made it easy.
I can be 100% certain I found out what caused my bug, or how my code should really work. But I am iffy about taking out the superstitious and useless changes I made when I probably should have walked away for a bit... Changes that didn't help but didn't break anything either.
"insidious", "technical debt", "lack of understanding", "fear to modify": agreed!
"rooted in superstition rather than reason": disagree...
My primary reason for not breaking it has always been lack of time. There were many times I encountered something I would have loved to have rewritten, but it would have thrown the rest of my schedule off. If I have 27 open tickets, 3 bosses, and 9 customers breathing down my neck, the last thing I need is some clusterfuck I broke and can't put back together fast enough. That's a mistake you only make once.