philjohn 7mo ago

This.

One of the biggest things I see in junior engineers that I mentor (working in backend high throughput, low latency, distributed systems) is that they don't work through all of the failure modes their system is likely to encounter.

Network partitions, primary database outage, caching layer outage, increased latency ... all of these things can throw a spanner in the works, but until you've experienced them (or had a strong mentor guide you) it's all abstract and difficult to see when the happy path is right there.

I've recently entirely re-architected a critical component, and part of this was defense in depth. Stuff is going to go wrong, so having a second or even third line of defense is important.
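To make "second or even third line of defense" concrete, here is a minimal sketch of a layered read path. It is not the component described above; the function names (`first_successful`, `from_cache`, `from_primary_db`, `from_replica`) and the simulated failures are all hypothetical, chosen only to illustrate falling through each layer in order.

```python
from typing import Callable, Sequence


def first_successful(sources: Sequence[Callable[[], str]], default: str) -> str:
    """Try each source in order; return the first result, else the default."""
    for source in sources:
        try:
            return source()
        except Exception:
            # This line of defense failed; fall through to the next one.
            continue
    # Every layer failed, so degrade to a safe static answer.
    return default


# Hypothetical layers, with the first two simulating outages.
def from_cache() -> str:
    raise ConnectionError("caching layer outage")


def from_primary_db() -> str:
    raise TimeoutError("primary database outage")


def from_replica() -> str:
    return "value-from-replica"


result = first_successful(
    [from_cache, from_primary_db, from_replica],
    default="static-default",
)
print(result)
```

In a real system each layer would also want its own timeout and some metric or log when it fails, otherwise the fallback can hide an outage rather than absorb it.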

technofiend 7mo ago

I recently had to argue a junior into leaving the health check frequency alone on an ECS container. The regular log entries annoyed her and she didn't know how to filter logs, so her solution was to cut health checks down to once every five minutes. That's just one example of trying to talk to people about the unhappy path.
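For what it's worth, filtering the noise at the logging layer is usually only a few lines. A rough sketch, assuming the service logs through Python's standard `logging` module and that health check requests show up with `/health` in the message (both are assumptions, as are the `DropHealthChecks` class and the `access` logger name):

```python
import logging


class DropHealthChecks(logging.Filter):
    """Suppress records whose message mentions the health check path."""

    def __init__(self, path: str = "/health") -> None:
        super().__init__()
        self.path = path  # assumed health check path, adjust to your service

    def filter(self, record: logging.LogRecord) -> bool:
        # Returning False drops the record; True lets it through.
        return self.path not in record.getMessage()


# Attach the filter to a handler so the check keeps running at full
# frequency while its log lines stay out of the output.
logger = logging.getLogger("access")  # hypothetical logger name
handler = logging.StreamHandler()
handler.addFilter(DropHealthChecks())
logger.addHandler(handler)
```

The health check interval stays untouched; only what gets written to the log changes.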

dwedge 7mo ago

That sounds more like a disaster waiting to happen than a junior. I find it difficult to believe that she didn't know the purpose of the healthcheck, so it sounds like breaking something (and making it someone else's problem) instead of addressing gaps in ability.

Cthulhu_ 7mo ago

The junior part there is that this person still believes they can / should read and comprehend all logs themselves. This just isn't viable at scale.

But it's the same with code itself: a junior will have code that is "theirs", while a medior/senior will (likely) work at scales where they can't keep it all in their heads. And that's when all the software development best practices come into play.
