Quantifying the improvements you get from addressing technical debt is a losing game. Most likely you will have to bullshit some numbers together.
The PM wasn't hired to make these decisions, they will not be required [for this]. Error budgets, delivery, operations, and what I see when I open my editor are all core to my job description. I'll refactor it/have my team do so... and argue with others about why we shouldn't at the same time.
I know this isn't ideal, I don't care - that's why I'm here. I've worked with too many people who internalized the field sabotage manual. I've been hired, promoted, and stolen for my ability to cut through nonsense.
My role (SRE) was made as a consequence of institutionalized technical debt. It provides nothing really new, hopefully picking up slack [that others created] while avoiding unproductive feedback loops.
This scene from "Breaking Bad" comes to mind. Apologies for the language, the character is known for it:
Chemist: Who do you think you are?
Jesse: I'm the guy your boss brought here to show you how it's done. And if this is how you run your lab, no wonder. You are lucky he hasn't fired your ass. Now, if you don't want that to happen, I suggest you stop whining like a little bitch and do what I say.
I have, and will, write RCAs for the roles we [and our technical debt] played in outages. I've published more of my product than any given development team, let's try.
In the time spent gathering believable information to present, you could probably address a reasonable number of small tech debt items.
Large tech debt items probably can't be resolved without some major business-impacting issues. At which point, you might rather not be around.
Regarding bullshitting numbers, I think it's especially a losing game when you are presenting numbers to more experienced professional bullshitters.
I think pro bullshitters know it's all BS. The real problem is the people who haven't realized that yet :-)
None of this is helpful when your PM has no business strategy to align with, prioritizes based on what's going to make them look good, and sheds responsibility to the developers when things aren't going well.
It makes me wonder, if you're working with a PM who would even consider the information in the article, would you have to gather it for them?
The PMs I've worked with who are that conscientious already have some grasp of the problem. They see slow delivery of features and bug reports, and they listen to the engineering team moaning. They get out of the way on stuff they aren't qualified to decide on. They just make sure that tech debt work stays within a reasonable scope and time budget.
This is really not the engineer's doing or problem. The exception is an engineer who has been producing wasteful solutions that rack up cloud bills 10x higher than they need to be. For the most part, corporations waste extreme amounts of money in numerous ways, and the engineer has nothing to do with any of it and no say in the general business waste.
> in the process
This was unjustified. A good engineer will spend what is necessary, and no more. Generally, the business expenses that typically make or break the business are much larger than the engineer's residual salary.
> What is it that you think engineering entails
It does not entail saving the management from itself. I would in fact be happy to see the management fail if they don't listen to the engineers, telling them "I told you so". This is assuming I had previously documented my concerns in writing. I don't feel the need to have to convincingly "sell my concerns" in the face of resistance when no one really wants to listen anyway. My job is to eliminate my liability, which I did when I documented the concerns exactly once in writing.
In summary, if management is going to act st00pidly by ignoring the concerns noted by engineers, it is really not the job of the engineers to bend over backwards to convince them. Often it is failure that teaches the most important lessons.
To use an analogy, some engineers may think gothic architecture is ugly and should be replaced. That doesn't mean it is structurally unsound or that there is anything wrong with it.
Debt is garbage that fights you every day in delivering project goals. Everything else doesn’t need to be addressed now, or maybe ever. Write down what limitation it might introduce and move on. Just hope you have the history and experience to properly solve that problem later - it’s either on the project maintainer or business, and they are only good at their job if they know which will matter.
I have an internal rating scheme I use to classify debt based on risk / lost opportunity:
1. Efficiency
2. Feature Quality
3. Growth
4. Continuity
Efficiency projects lost productivity to the development team. It should eventually be addressed or you will lose momentum in hard-to-measure ways. Examples: Developer tooling issues, lacking documentation, failure to reach consensus, stable code with suboptimal API design.
Quality projects continuous customer impact due to bugs within certain vertical areas. Examples: Poorly coded features at the “leaves” of your code that don’t affect other areas but block their own improvement.
Growth projects failure to meet a subset or all future deliverables. A failure is defined as anything that doesn’t meet required timelines to capitalize, or becomes downright impossible. Examples: Bad system architecture, data modelling, a shared component that cannot scale to planned future requirements.
Continuity means what we currently have is going to fuck us at any moment. It’s a time bomb that gets worse passively. Examples: Security, major vendor deprecation, declining system stability.
Many developers I’ve worked with have never bothered to categorize debt, which is their fault. Many gravitate toward 1 and 2, but 3 and 4 are true debt today. The others may or may not evolve into 3 and 4 over time. You keep an eye on them but don’t argue about them until they will definitely become 3.
This is easy because a 2 flips to a 3 as soon as the code is even talked about being shared, or if it’s DX related and you’ve been asked to stretch it to the point productivity can be predicted to measurably decline. Come with numbers to prove your case. It will become easier, and the evidence-based approach plays much better than feelings and complaining.
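For anyone who wants this in a tracker rather than in their head, here is a minimal Python sketch of the four categories and the "flip to 3" rule as I read it from the description above. The names `DebtCategory`, `is_true_debt_today`, `reclassify`, and both keyword flags are made up for illustration; they are not from any real tool.

```python
from enum import IntEnum


class DebtCategory(IntEnum):
    """The four risk categories from the rating scheme, in severity order."""
    EFFICIENCY = 1
    FEATURE_QUALITY = 2
    GROWTH = 3
    CONTINUITY = 4


def is_true_debt_today(category: DebtCategory) -> bool:
    """Categories 3 and 4 are the ones treated as true debt right now."""
    return category >= DebtCategory.GROWTH


def reclassify(
    category: DebtCategory,
    *,
    sharing_proposed: bool = False,                # hypothetical flag: code may become shared
    productivity_decline_predicted: bool = False,  # hypothetical flag: DX stretched too far
) -> DebtCategory:
    """Apply the flip rule: a 2 becomes a 3 once sharing is on the table,
    and a DX-related 1 becomes a 3 once a measurable decline is predicted."""
    if category == DebtCategory.FEATURE_QUALITY and sharing_proposed:
        return DebtCategory.GROWTH
    if category == DebtCategory.EFFICIENCY and productivity_decline_predicted:
        return DebtCategory.GROWTH
    return category
```

Anything that still comes back as a 1 or 2 after `reclassify` stays on the watch list and off the agenda.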
Case in point, the situation with our flaky/unstable tests got so bad that a quarter of them are now set to ignore any failures. Pretty much zero interest in fixing.
Meanwhile, the rewrite of a service that had no outstanding bugs (but was originally written by another team and handed to us) is still not done after 4 months...
However, the very unfortunate truth is that _most_ engineers end up in product because they weren't good engineers. So they don't have a good internalised engineering culture as product managers either. My current one is by no means a horrendous one, but is definitely closer towards the bad end of the spectrum, and they moved to product out of management. I'm giving them the benefit of the doubt for now, because they've just started the new job recently and maybe it's just friction from learning on the job.
You never started. Full page ad right out of the gate. Pass.
Jokingly: my problems compound, now procurement/finance needs convinced?
Somewhat Seriously: Please tell me the revelation is deeper than "have data, buy it from my friend"
It's like a surgeon convincing the hospital administrator to get some time to clean the scalpels.
Seriously, be an adult
Honestly, this isn't something that non-leadership engineers should be concerned with at all. They should be working with engineering leadership to ensure that technical debt is identified and classified, so that engineering leadership can work with product leadership to prioritize the payback of technical debt at an appropriate time, in the appropriate order. Trying to solve this problem at the level of individual teams and individual PMs means that the larger scope of the business's needs isn't available to be considered. Is there a funding round coming up? A big partnership announcement? A hiring round? A hiring freeze? You don't know, and you won't necessarily be told, even if you ask.
On top of that, the standard of measurement for improvement is flawed. The article is suggesting that "technical debt" as a whole is doubling the amount of time it takes to do features, but that's just not the way it works. Technical debt isn't just "crappiness of your codebase" -- you'll never solve that, no codebase is perfect. Technical debt should be tracked as individual, specific issues that will have specific negative effects (the database doesn't have appropriate indices and will get slower over time, causing increasing performance issues; the version of the third-party API we're using is being deprecated and if we don't upgrade, we'll have a production outage; our User class has grown out of control and any feature that touches it effectively touches most of the app, causing bugs to appear at a much higher rate and slowing releases of these features; etc). I'm summarizing because this is already a long comment, but these issues should be even more specific than that: you need to describe the issue, the cost to fix it, the pain it's causing now, the pain it will cause over time and what that time frame is, and whether there are any events that will require you to have the debt solved Or Else. Talking about technical debt with this information available gives concrete information about why it's valuable to solve. Talking about it with "we will move 2x faster" is unlikely to convince anyone, because software development isn't that reductive. You cannot guarantee that flat, project-wide improvement, and people will remember the time they spent three months not building features for no benefit they can understand.
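To make "specific" concrete, here is a rough Python sketch of what one tracked debt item could look like, with a check for which of the required details are still missing. The `DebtItem` class, its field names, and the `missing` method are all hypothetical; adapt them to whatever tracker you actually use.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DebtItem:
    """One specific piece of technical debt, not 'the codebase is crappy'."""
    issue: str                    # what is actually wrong
    cost_to_fix: str = ""         # effort estimate, in whatever unit the team uses
    pain_now: str = ""            # what it costs today
    pain_over_time: str = ""      # how it gets worse
    time_frame: str = ""          # over what period it gets worse
    forcing_events: List[str] = field(default_factory=list)  # "Or Else" dates; may be empty

    def missing(self) -> List[str]:
        """Names of the required details that have not been filled in yet."""
        required = ("issue", "cost_to_fix", "pain_now", "pain_over_time", "time_frame")
        return [name for name in required if not getattr(self, name).strip()]


# Example using one of the issues mentioned above; only the issue is known so far.
item = DebtItem(issue="Database lacks appropriate indices")
print(item.missing())
```

An item that still reports missing details probably isn't ready to take to leadership for prioritization.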
I feel like the approach specified in the article will work -- at best -- once. A lone engineer doesn't have access to sufficient information to truly make a case for why this is a good time for the repayment of a particular piece of technical debt, can't see the full spread of potential technical debt that _could_ be solved and should be considered, can't adequately convince people as to the benefits of having resolved the technical debt, and honestly shouldn't be spending their limited political capital trying to get their personal pet peeve fixed. If technical debt isn't getting fixed appropriately, that's an engineering leadership issue, and you should be working with engineering leadership to try to help them understand any debt they're not aware of so it can get prioritized appropriately, and then afterwards, work with them to make sure that the specific, visible improvements and "catastrophes avoided" are recognized and respected.