This led us to an interesting question: what if we just split all bugs into "will fix" and "won't fix", and then prioritise every "will fix" above all new features....always. In other words: we commit to only ever adding new features when we're bug free.
Has anybody tried this? Can it work?
- Prioritizing is hard, so avoid wasting brain cycles deciding how important your bugs are
- It encourages all of your team to get things right the first time, because if they don't they know they will be going back to fix it immediately.
- If you want to create a culture of quality then it's an obvious first step.
- It saves time in the long term by addressing problems when you have the most context and by avoiding building hack upon hack upon hack
In extreme cases you can relax the policy, but be aware that if you don't quickly correct course, things will be permanently worse. Also accept that hard external deadlines are not suited to this approach, though triage can mitigate some of that.
Article from back in 2000, based on info from ~1990.
> 5. Do you fix bugs before writing new code?
> To correct the problem, Microsoft universally adopted something called a “zero defects methodology”. Many of the programmers in the company giggled, since it sounded like management thought they could reduce the bug count by executive fiat. Actually, “zero defects” meant that at any given time, the highest priority is to eliminate bugs before writing any new code.
Please read the entire article, it's worth it ;-)
Joel worked on Office. Have you heard the hair-on-fire customer stories about Office from the 90s? It was a draconian response to what was probably seen as a serious company-wide problem at the time.
That made me rethink my usual "prioritize urgent bugfixes over features" stance. It works better when you have a smaller number of (known) users who communicate with you than with a mass audience, though.
Also you’d be surprised at what gets classified as a bug when you apply this.
Ultimately though, everybody does this. Critical bugs get fixed first. Anything too far down the backlog de-facto doesn’t get fixed. I like the intellectual honesty that this approach brings, in that it forces you to set a bar for bugs not worth fixing and consequently marking them as won’t fix.
Final note: whether a bug is worth fixing changes over time. Maybe your best engineer can’t find it after two weeks. Maybe your biggest customer just ran into it. Maybe you can’t reproduce it. Maybe the platform causing it got acquired by Google.
Can it work? Yes, but it doesn’t look very different to what you’re probably already doing in practice.
How does it really work in practice? Let's say you have 2-week sprints (quite standard). Then you drop the agile/scrum "story points" and velocity: they aren't useful for your team (though they might be for management). After quick estimates, each member takes what they estimate to be a week, or a week and a day, of work on critical bugs, then critical features. Because let's be honest, you always have more "critical" stuff to do each week, and will never get your backlog small enough to reach the non-critical stuff.
Once you're done with your "critical" tasks, depending on the time you have left, you pick up non-critical work: in backlog order if you're a senior, long-time IC (lots of bugfixing); with preference for "I'm sure I know how to do that", then "unknown project but it seems easy enough", for juniors and for seniors who just arrived.
Having a gigantic backlog isn't an issue as long as each task is assigned to a product and a major version: that allows you to discard those tasks easily if the product isn't sold anymore or if the version has changed.
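That pruning step can be sketched in a few lines. This is a hypothetical helper, not anyone's actual tooling; the task shape and names are invented for illustration:

```python
# Hypothetical backlog pruning: keep only tasks whose product is still
# sold and whose target major version is still the current one.
def prune_backlog(tasks, active_products, current_major):
    """tasks: list of dicts with 'product' and 'major' keys (illustrative)."""
    return [
        t for t in tasks
        if t["product"] in active_products and t["major"] == current_major
    ]
```

Run against the backlog at the start of each planning cycle, everything tied to a retired product or a superseded major version drops out automatically.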
I personally like the idea that for every bug you either: a) fix it with highest priority, or b) mark it as "won't fix".
I think this would really force you to make a decision on a bug, rather than adding it to some never ending list of lists.
If a bug is worth fixing, it will come up again
- New features also add tech debt, complexity, cost, bugs, makes you slower, and will lose you some sales that way.
- At 0 features you really need to implement features instead of messing about with the bugs in the devops scripts.
- At e.g. Windows scale, maybe stop messing about with new features nobody asked for and fix some bugs, yes.
- In a Laffer-curve-like effect, there must be a point where it peaks and it's better to fix bugs than to implement a new feature.
- It's very difficult to identify where you are on the curve.
- One of the measures is simple "do X, get money from this guy I have on the phone"
- The other measure is fuzzy, lags, is subjective, can't be traced to a particular feature.
Good luck!
When customers aren't signing up because of lacking feature -> build features. When customers are churning because of bugs -> fix bugs. Else -> somewhere in the middle
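That heuristic can be written down as a toy decision rule. This is purely illustrative; the function name, inputs, and the 2x dominance threshold are all made up:

```python
# Illustrative sketch of the signup-vs-churn heuristic above.
# Inputs are rough estimates of revenue lost per month to each cause.
def next_focus(churn_from_bugs: float, lost_signups_from_gaps: float) -> str:
    """Pick a focus based on which loss clearly dominates (2x is arbitrary)."""
    if lost_signups_from_gaps > 2 * churn_from_bugs:
        return "features"
    if churn_from_bugs > 2 * lost_signups_from_gaps:
        return "bugs"
    return "mixed"
```

The point isn't the exact threshold; it's that both losses are estimated in the same units before deciding where the team's time goes.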
You really think so?
Surely it's just a matter of picking a sufficiently high bar for "will fix" and then focusing some time on it.
Do the thing that will produce the most customer happiness in the shortest amount of time next.
Naturally, the wishlist of new features never shrinks ... but the stability of the existing platform is what people really appreciate.
You are bug-free, start working on new feature, new bug reports come in, and you have to pause and work on them.
Fixing bugs is not fun work, because there is usually a quick fix done in an ugly way, and a perfect fix via a large refactor and re-architecture. This results in that "soul-destroying" feeling: if I had enough time I could fix this properly, the right way, with clean code, and avoid huge numbers of bugs; but alas, I am just piling on tech debt.
Effectively, "I think this is important so I will argue you should work on this before that bug" is much better than "I think this is important so I will argue this is a bug."
YMMV, you have to adapt it to your usecase (B2B / B2C? contractual SLA? ...). But it can be something around:
- P0: a bug prevents a significant share of customers (= paying users) from performing one of the core functions of the product. At least one dev drops what they are doing right now and investigates, fixes it themselves, or sends it to whoever is responsible for the bug, who should drop what they are doing and fix it. Target: fixed within a few hours.
- P1: a bug prevents a few customers from performing an important yet non-core function of the product. The next available dev has a look at it. Target: fixed within a few days.
- P2: a bug customers can live with (it concerns few customers, there is a workaround, ...). Fixed on a best-effort basis; in practice, we fixed them when doing other features near that code. It can take a long time to fix them, if we ever do, and that's OK. The good thing about tech debt is that it's a debt you don't always have to pay, e.g. when you remove or replace a feature.

- P0 means drop what you're doing and fix it now
- P1 means fix after you've finished what you're doing
- P2 means "won't fix" (but keep a note of it in case we ever get to that perfect situation where we have more time than features to build ;))
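A minimal sketch of that triage scheme as code, assuming a severity rule along the lines described above. The thresholds and inputs are invented for illustration, not part of anyone's actual policy:

```python
# Illustrative P0/P1/P2 triage rule; the 5% threshold is made up.
def triage(core_broken: bool, share_affected: float, has_workaround: bool) -> int:
    """Return a severity level for a reported bug."""
    if core_broken and share_affected >= 0.05:
        return 0  # P0: drop what you're doing and fix it now
    if not has_workaround and share_affected > 0:
        return 1  # P1: fix after you've finished what you're doing
    return 2  # P2: won't fix / best effort
```

The value of writing it down like this is that the "won't fix" bar stops being implicit: anything falling through to P2 has been decided, not merely forgotten.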
Is it worth not making progress on any new features, all because of a smaller bug or issue? Can just one person work on the small bug while the rest of the team starts a new feature?
Sometimes the best bug fix is a new feature that deprecates the bug. So consider the estimated lifetime of the bug if you keep progressing your platform, and maybe don't spend too much time fixing things that will soon be phased out anyway, unless they're really major issues that are rapidly hurting the business or its reputation and need immediate fixing.
If you really want to halt all new features, I might try putting it on a calendar. Maybe you can afford 1 month, 1 quarter, or half a year on just bug fixing but eventually you have to keep moving forward in some way. Unless your platform is already pretty feature complete (which it doesn't sound like it is) you might do more harm than good when delaying your next core feature releases.
Two problem cases likely to pop up.
1) Lots of fixing when a bigger refactor is required. A poorly written area of code, or a poor design, may be causing high churn and wasted effort. The solution I’ve found to this problem is to track defects by code area and review the area once the metric exceeds a heuristic threshold.
2) A team choked on defects only, for a long period of time. This obviously has many negative side effects. It tends to happen in really important components and to require the most experienced developers. New starters run for the hills when a team gets into this state, compounding the issue. The solution to this (though this is just my opinion) is to never allow 100% of time over [a fixed interval] to be spent on defects; cap it at 50%. The bugs still get fixed, just more slowly, and new development still happens.
Overall though, I think a “defects first” approach is the right one, just have a plan for these negative cases.
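The defect-by-area tracking from case 1 can be sketched simply. This is a hypothetical helper, assuming each fixed defect is tagged with a code area; the threshold of 5 is an arbitrary stand-in for the heuristic:

```python
from collections import Counter

# Hypothetical refactor-candidate detector: flag code areas whose
# defect count exceeds a heuristic threshold.
def refactor_candidates(defect_areas, threshold=5):
    """defect_areas: iterable of area names, one entry per logged defect."""
    counts = Counter(defect_areas)
    return sorted(area for area, n in counts.items() if n >= threshold)
```

In practice the tagging matters more than the code: if defects aren't consistently attributed to an area, no threshold will surface the hotspot.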
One (arguably positive) side-effect I wonder about: if bugs are always prioritised first, and engineers are often very creative at solving problems, will they perhaps come up with creative ways to prevent bugs in the first place?
Or, it might go all wrong -> and we create a dangerous culture of "swallow that exception" :D
Another way to look at it is the delayed effect of doing nothing in either area. Bugs creeping in over the months and years may only become a problem when a competitor starts to be noticeably more stable.
Features that are delayed may have a delayed effect of a competitor getting ahead of you in the market and launching months before you'd be ready.
So I would say, "it depends". If you're in a growth market and are trying to capture market share, features might be best before non-critical bugs.
If you're in a stable market serving a huge amount of people, then fixing bugs has a much larger impact on your users.
You also have to consider the team and their morale over time. Too much churning through low value bugs can be demoralising where individuals might need some type of higher level thinking and creativity.
If it were me, I'd look at bugfix only sprints and adjust the frequency based on the above factors.
Also, what kind of bugs are coming up is important... I think bugs do tell stories; they might help you identify issues with feature assumptions about workflow or use-cases, and that needs to inform your product development as a feedback loop, to avoid technical debt and having to rewrite stuff later.
Depending on the size of your team, I would have 2-4 people focusing on just bugs / QA (so you can catch most bugs going forward) while the rest of the team focuses on new features.
If the product has actual customers, it ALWAYS makes sense to prioritize fixes along the hot paths. And it should be easy even for non-technical people to understand: you lose customers.
A feature is not done until all major bugs or regressions are fixed.
Therefore go for release cycles. Lock features, fix all bugs, then release the version. Repeat for every cycle.
You might be forced to release with a few bugs. The conventional procedure is to publicly document the known bugs for each version.
If you are firefighting bugs, then you don't have a stable product yet. Cut down to a reasonable feature set and fix all bugs before you release.
This policy's results have been excellent: users are focused on learning to better exploit the applications and suggesting new functionality rather than complaining about long-deferred defect repairs; even minor, easily worked-around defects create a negative user community mindset that can snowball. The absence of defects increases user confidence, and reduces user-perceived complexity.
Does your system have a lot of (intentional or unintentional) emergent behavior, like a sandbox-heavy video game? You could end up never making a feature again.
Do your customers expect the frequent shipping of new features? Unless you can sell them on the idea of unexpected or infrequent addition of features, you could quickly lose your core audience.
However, if it is a product where no one expects tight deadlines on the release of new features, or the product's purpose is straightforward enough that there is no intentional emergent behavior in the system, then I could see someone running it this way. I don't mean this in a cheeky way: there are definitely products that fit these criteria. It just won't be something every product can reasonably do while also expecting to retain its user base.
I’m not disagreeing that customer complaints are important for prioritisation, but the problem isn’t quite as first-order as that.
In an extreme case, imagine you found a bug that corrupts data if a customer name begins with z. Would you not fix this bug just because you don’t have any customers whose name begins with z?
It's like building a wall; if one layer of bricks is laid unevenly, at least a number of layers built on top of it will have to compensate. Usually, this compensation takes form in increased development times or increased complexity / convolution of new features.
Furthermore, it lowers end users' overall trust in the platform.
Both of these two effects will have at least some negative impact on profitability, though it may be lower than the increased profitability gained by adding new features.
I'm not saying all bugs should be fixed immediately at the expense of new features, but I've rarely been in a situation, where it felt "right" to ignore a bug indefinitely.
I have a simple rule that has worked to keep bugs at zero. If you find a bug, drop what you are doing and fix it. If it takes more than 4 hours, only then, put it in your backlog and prioritize it like everything else. Don't keep a separate bug list! Bugs should be in your backlog if they are serious enough (more than 4 hours to fix).
99% of bugs take less than 4 hours to fix.
Bugs slow you down. Keep them at zero! You will be able to finish everything faster!
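The rule above is simple enough to write as a one-branch sketch. Purely illustrative; the function and return strings are made up:

```python
# Illustrative sketch of the "4-hour rule" described above.
def handle_bug(estimated_hours: float) -> str:
    """Decide what to do with a newly found bug, given a rough fix estimate."""
    if estimated_hours <= 4:
        return "drop everything and fix it now"
    return "add to backlog and prioritize like any other work"
```

The key property is that there is no separate bug list: either the bug is fixed on the spot, or it competes with everything else in the one backlog.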
I've worked places that have tried it, too. It does work; in my experience, it literally always improves software quality. But I think it is like how pretty much all diets work, at first, but not over the long term.
It becomes unsustainable, as the pressure to work on features grows.
Continuing with my diet analogy, I now think of it like bulking and cutting phases. When you are bulking up your product with new muscles (features) you are also adding fat (bugs). At some point the percentage starts growing unhealthy, slowing you down and causing all sorts of ancillary problems.
Might be time for a cut phase then.
My analogy breaks down (because it's not really a good analogy) with a bigger team. Then you can have people assigned to bug cleanup. But this leads to its own set of problems — do you pay bug cleanup engineers a bunch extra? Because they will tend to be reading Who's Hiring top to bottom after a few weeks of that.
I think this does work great, though, when it is for a pre-determined fixed duration of time. (e.g. "3 iterations" or "one quarter").
[1]: I think where I read about this is now behind a paywall, but this seems to be the same initiative: https://sriramk.com/memos/zerodef.pdf
Using libraries and frameworks gets you speed in delivery because you don't have to write that stuff yourself, the downside is there are likely bugs in that software or in how your system interacts with it. No software will be "bug free", striving for that is a fool's errand.
The best option is to accept software is soft and malleable.
If it prints money but it’s some legacy code that smells like shit and has bugs, just write a new clean interface on top and build to it. Over time you’ll end up replacing the underlying system while shipping new features and fixing bugs as they pop up. Oh and the money keeps printing.
If it doesn’t print money and the code is shit, the project should just be scrapped. Who cares.
If people constantly keep building shit code and never improve, just quit because that company probably sucks.
Be extreme by being practical.
Places with zero bugs policies seem to have much branchier defect taxonomies than everyone else.
I've also seen shops just decide to ignore entire classes of problems so that they don't put too many rows in their bug database. So, willful ignorance is another option. I don't recommend it, though.
I feel like this already happens in some places, just implicitly rather than explicitly. There's a reason why people sometimes joke about the backlog being the place where lots of low priority things go to be forgotten about, and never be done. The problem with this approach is that even if you could get around to fixing low priority bugs in X months/years, with this approach you would prematurely toss them aside and decide to never fix them.
> ...and then prioritise every "will fix" above all new features....always. In other words: we commit to only ever adding new features when we're bug free.
This will decrease your ability to ship new features in a timely manner and will put you at a disadvantage if you're up against any sort of competition, that ships "good enough" things soon, versus you shipping "really good" software way later. The benefit of this approach is that it should lead to a more maintainable codebase in the long term, though how much this matters depends on your circumstances.
It will mostly depend on how you choose what goes under "will fix" and how much it matches what's necessary to keep the lights on (KTLO).
For example:
- Users can't view a page that's needed for legal compliance, in some popular browsers? Needed for KTLO, fix it, no brainer.
- There are errors when trying to use some niche functionality, which affects around 1-5% of your userbase? Probably should fix it, it depends.
- Some button's logo is offset by a few pixels in a settings page or an info message flashes too quickly after some redirect? Probably nobody cares that much.
Of course, all of that might also coincide with how you choose to test your software.

I once wrote some JWT code to allow multiple systems to integrate and communicate securely. I decided that I wouldn't ship it until I got something like 95% test coverage, enforced by CI and everything. It was doable and helped me discover a few bugs to fix during development, as well as refactor with confidence - but it took something like 4x more time than regular development would, which I can't see working well for the majority of projects out there, outside of core functionality and financial transaction related code.
And in the extreme case one customer (or a competitor on the sly posing as one) decides that one minor (mis)feature that they don't like is a bug and doesn't accept any fixes to it as adequate, what do you do?
Overserialisation of life (and other extreme position taking) is unproductive. Some new things will happen while others get fixed or not. Sometimes you won't even know how to fix 'bugs' without trying new things out.
In fact, I'd love something like this so much, I'd even be willing to sponsor efforts like that with a bit of cash.
There is only so much dry powder. Every hour spent doing something that has lower bang:buck ratio in terms of "moving the needle" is an hour of powder burned. Make those hours count.
That said, correctly divining what moves the needle most is startup alchemy.
A more refined approach is to triage the bugs into 4 or 5 levels of severity and then you might reasonably agree that at least all sev1 bugs must be fixed before new functionality is added.
Just using a bug triage categorisation of "will fix" or "won't fix" is not granular enough.
Of course, in practice this just means 3 or 4 levels of bugs that will never get fixed and languish in a backlog until the team / project gets re-orged and the entire backlog is wiped clean.
Okay, I'm joking. Sometimes those bugs become old enough they refer to features that no longer exist or can't be reproduced and then get marked as "won't fix"
I've never had much success getting a "won't fix" decision out of a PM or designer, and my time is too limited and valuable to spend debating it. If developers all know "low priority" means "never do", and "stakeholders" believe their pet quibble will be fixed one day (even though they will never allocate time for that), then work can continue in a sensible fashion.
So, if it's important enough to fix, why not fix it now.
I worked for a company a full year that always prioritized new features over bug fixes, or even QA. This was a sales-driven company. For various reasons, it was a toxic environment and I left.
The next company I worked for prioritized good engineering over shipping a product. This was an engineering-oriented company. Its funding was running low ('cause - well, no product!) and I got caught up in a layoff.
Company #1 - for 5 months of that year, I was the only QA person (with 13 devs). Literally no time to automate regression - there was always the "next feature we need to close a sale to customer 'X'". I told my boss that inevitably I (or my occasional comrades) would miss something and the company would be liable. Sure enough, within 18 months of my exit, they were hit with 3 different lawsuits.
Company #2 - absolutely delightful environment, except for the non-technical CEO. Understandably, he wanted to return value to the investors, but he kept making short-term decisions about direction that, while intending to make money, ended up accelerating the cash burn. I was promoted to be a technical lead of about 8 people on a side-project that was supposed to leverage technology from a company whose name is a homophone of 1 × 10¹⁰⁰. No surprise, mega-company changed the rules and our potential product was DOA. The entire team (including me) was let go.
TL;DR - quality is engineered in, and everyone owns quality. If a genuine defect exists, determine its severity and likelihood of emergence with customers to determine priority. Don't ship/push to production any code that will cause customers to lose data or their ability to run their businesses. If you have to make a choice, choose bug fixes over new features.
Of course we try to get there before the feature request turns into a bug...