We can even just look at the title here: Do the simplest thing POSSIBLE.
You can't escape complexity when a problem is complex. You could certainly still complicate it even more than necessary, though. Nowhere does this article say you can avoid complexity altogether; it says that many of us tend to over-complicate problems for no good reason.
I think the nuance here is that “the simplest thing possible” is not always the “best solution”. As an example, it is possible to solve very many business or operational problems with a simple service sitting in front of a database. At scale, you can continue to operate, but the amount of man-hours going into keeping the lights on can grow exponentially. Is the simplest thing possible still the DB?
Complexity is more than just the code or the infrastructure; it needs to run the entire gamut of the solution. That includes looking at the incidental complexity that goes into scaling, operating, maintaining, and migrating (if a temporary ‘too simple but fast to get going’ stack was chosen).
Measure twice, cut once. Understand what you are trying to build, and work out a way to get there in stages that provide business value at each step. Easier said than done.
Edit: Replies seem to be getting hung up over the “DB” reference. This is meant to be a hypothetical where the reader infers a scenario of a technology that “can solve all problems, but is not necessarily the best solution”. Substitute for “writing files to the file system” if you prefer.
We had a project that was supposed to convert live objects back into code with autogenerated methods. The initial design used a single pass over the object graph, creating HDL abstractions and combining method blocks in the same pass.
That is big, hairy code with a lot of issues. It would be simpler to handle one problem at a time: method generation in one pass, then converting the methods to HDL. But getting approval for changes to a deployed app is so hard, particularly when it is a complete rewrite.
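Roughly, the simpler two-pass shape I have in mind would look something like this (a made-up sketch, not our actual code; the names are hypothetical):

```python
# Hypothetical sketch of the two-pass approach; names and structure are
# illustrative, not from the actual project.

def generate_methods(obj_graph):
    """Pass 1: walk the live object graph and emit method descriptions."""
    methods = []
    for node in obj_graph:
        methods.append({
            "name": f"set_{node['field']}",
            "args": [node["field"]],
        })
    return methods

def methods_to_hdl(methods):
    """Pass 2: convert the generated method descriptions into HDL text."""
    lines = []
    for m in methods:
        ports = ", ".join(m["args"])
        lines.append(f"-- HDL block for {m['name']}({ports})")
    return "\n".join(lines)

# Each pass solves exactly one problem and can be tested in isolation.
sample_graph = [{"field": "clk"}, {"field": "reset"}]
print(methods_to_hdl(generate_methods(sample_graph)))
```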
"could possibly work" is clearly hyperbole as it would only exclude solutions that are guaranteed to fail.
But even under a more plausible interpretation, this slogan ignores the cost of failure as an independent justification for adding complexity.
It's bad advice.
Hell, I just spent a week doing something that should've taken 5 minutes because, rather than a settings database, someone has just been maintaining a giant ball of copy+pasted Terraform code instead.
Adding the runtime complexity and maintenance work for a new database server is not a small decision.
Do you handle one "everything is perfect" happy path, and use a manual exception process for odd things?
Do you handle "most" cases, which is more tech work but shrinks the number of people you need handling one-off things?
Or do you try to computerize everything no matter how rare?
Bring the problem back to our primary contact and they've got no clue what to do. They're on like year 2 of a 7 year contract and they've just discovered that their payroll department has been interpreting the ambiguous rules somewhat randomly. No one wants to commit to an interpretation without a memorandum of understanding from the union, and no one wants to start the process of negotiating that MoU because it's going to mean backdating 2 years of payroll for an unknown number of employees, who may have been affected by it one month but not the next, depending on who processed their paystub that month.
That was fun :D
The programmer's mind is the faithful ally of the perfect in its war waged against the good enough.
The "best" solution for most people that have a problem is the one they can use right now.
The one you can use right now in order to get feedback from real world use, which will be much better at guiding you in improving the solution than what you thought was "best" before you had that feedback.
Real world feedback is the key. Get there as quickly as feasible, then iterate with that.
> As an example, it is possible to solve very many business or operational problems with a simple service sitting in front of a database.
If this is the simplest approach within the problem space or business's constraints, and meets the understood needs, it may indeed be the right choice.
> At scale, you can continue to operate, but the amount of man-hours going into keeping the lights on can grow exponentially. Is the simplest thing possible still the DB?
No problem in a dynamic human system can be solved statically and left alone. If the demands on a solution grow, or the problem space or the business's needs change, then the solution should be reassessed and the new conditions solved for.
Think of it alternatively as resource-constrained work allocation, or agile problem solving. If we don't have enough labor available (and we rarely do) to solve everything "best," then we need to draw a line. Decades of practice now have shown that it's a crap shoot to guess at the shape of levels of complexity down the road.
Best case, you spend time that could go into something else valuable today to solve a problem for a year from now; worst case, you get the assumptions wrong, fail to solve that second "today" problem, and still need to spend future time on refactoring.
Don't worry, the second half of the title has this covered:
> ... that could possibly work
In the scenario you've described, the technology is not working, in the complete sense that includes business requirements like reasonable operating costs.
Perhaps it really did work at first, in the complete sense, when the number of users was quite small. That's where the actual content of the article kicks in: it suggests you really do use that simple solution, because maybe you'll never need to scale after all, or you'll need to rewrite everything by then anyway, or you'll have access to more engineering talent by then, etc. I'd tend to agree, but with the caveat that you should feel free to break the rule so long as you're doing it consciously. But none of that implies that you should end up in the situation you described.
This is where I am arguing for nuance. These decisions are contextual, and the superficially more complicated solution may be addressing complexity inherent in the problem space, in a way that only pays off over a longer time horizon.
As an example, some team might decide to forgo a database and read/write directly to the file system. This may enable a release in less time and that might be the right decision in certain contexts. Or it could be a terrible decision as the externalised costs begin to manifest and the business fails because of loss of customer trust.
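To make that concrete, the "just write files" version might be nothing more than this (a hypothetical sketch, not any real system), which is exactly why it is so tempting:

```python
import json
from pathlib import Path

# Hypothetical "simplest thing": one JSON file per record, no database.
DATA_DIR = Path("data")
DATA_DIR.mkdir(exist_ok=True)

def save(record_id: str, record: dict) -> None:
    (DATA_DIR / f"{record_id}.json").write_text(json.dumps(record))

def load(record_id: str) -> dict:
    return json.loads((DATA_DIR / f"{record_id}.json").read_text())

# Ships in an afternoon; concurrent writes, querying, backups, and customer
# trust are the externalised costs that show up later.
save("42", {"name": "example"})
print(load("42"))
```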
My point is that you cannot only look at what is right in front of you, you also need to tactically plan ahead. In the big org context, you also need to strategically plan ahead.
When this is accounted for, “the simplest thing” approaches “the best solution”.
It's making an ambitious, risky claim (make things simpler than you think they need to be) and then retreating, on pushback, to a much safer claim (the all-encompassing "simplest thing possible").
The statement ultimately becomes meaningless because any interrogation can get waved away with "well I didn't mean as simple as that."
But nobody ever thinks their solution is more complex than necessary. The hard part is deciding what is necessary, not whether we should be complex.
Your definition rubs up against what a UX designer taught me years ago, which is that simple and complex are one spectrum, similar to but different from easy and hard.
Often, simple is confused for easy, and complex for hard. However, simple interfaces can hide a lot of information in unintuitive ways, while complex interfaces can present more information and options up front.
The main argument I've seen against this design strategy is concern over potentially needing to make breaking changes. In my experience, though, it tends to be a lot easier to come up with a simple design that solves most of the common cases while leaving design space for future work on the more niche ones, without breaking existing functionality, than to anticipate every possible case up front. After a certain point, our confidence in our predictions dips low enough that I think it's smarter to bet on your ability to avoid locking yourself into a choice that would be breaking to change later than on making the correct choice based on those predictions.
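As a hypothetical sketch of what "leaving design space" can look like (the names are made up): handle the common case now and route future niche options through one extension point, so adding them later isn't a breaking change.

```python
from dataclasses import dataclass, field

# Hypothetical API sketch; ExportOptions and export() are made-up names.
@dataclass
class ExportOptions:
    fmt: str = "csv"                             # the common case today
    extras: dict = field(default_factory=dict)   # design space for later

def export(rows, options=None):
    options = options or ExportOptions()
    # Only the simple CSV path exists for now; niche cases can be added by
    # extending ExportOptions without breaking existing callers.
    header = ",".join(rows[0].keys())
    body = "\n".join(",".join(str(v) for v in row.values()) for row in rows)
    return f"{header}\n{body}"

print(export([{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]))
```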
You can keep on doing the simplest thing possible and arrive at something very complex, but the key is that each step should be simple. Then you are solving a real problem that you are currently experiencing, not introducing unnecessary complexity to solve a hypothetical problem you imagine you might experience.
In other words, every time you optimize only locally and in a single dimension, you potentially walk very far away from a global optimum. I have worked on such systems before. Every single step, in and of itself, was simpler (and also faster, less work) than doing a refactoring (to keep the overall resulting system simple), so we never dared to do the latter. Unfortunately, over time this meant that every new step incurred additional costs due to all the accidental complexity we had accumulated. Time to finally refactor and do things the right way, right? No. Because the cost of refactoring had also kept increasing with every additional step we took and every feature we patched on. At some point no one really understood the whole system anymore. So we just kept on piling things on top of each other and prayed they would never come crashing down on us.
Then one day, business decided the database layer needed to be replaced for licensing reasons. Guess which component had permeated our entire code base, because we never got around to doing that refactoring and never implemented proper boundaries and interfaces between the database, business, and view layers. So what could have been a couple of months of migration work ended up being more than four years of work (rewriting the entire application from scratch).
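For what it's worth, the boundary we never built could have been as small as a single interface; a hypothetical sketch (not our actual code):

```python
from abc import ABC, abstractmethod

# Hypothetical sketch of the boundary that never got built: business and
# view code depend only on this interface, never on the vendor driver.
class OrderStore(ABC):
    @abstractmethod
    def get_order(self, order_id: str) -> dict: ...

    @abstractmethod
    def save_order(self, order: dict) -> None: ...

class LegacyDbOrderStore(OrderStore):
    """Adapter around the original (licensed) database."""
    def get_order(self, order_id: str) -> dict:
        raise NotImplementedError("would call the old driver here")

    def save_order(self, order: dict) -> None:
        raise NotImplementedError("would call the old driver here")

# A licensing-driven migration then means writing one new OrderStore
# implementation instead of rewriting the whole application.
```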
And to address something the GP said:
> I am still shocked by the required complexity
Some of this complexity becomes required through earlier bad decisions, where the simplest thing that could possibly work wasn't chosen. Simplicity up front can reduce complexity down the line.
I think you're focusing on weasel words to avoid addressing the actual problem raised by OP, which is the elephant in the room.
Your limited understanding of the problem domain doesn't mean the problem has a simple or even simpler solution. It just means you failed to understand the needs and tradeoffs that led to the complexity. Unwittingly, this misunderstanding creates even more complexity.
Listen, there are many types of complexity. Among them is complexity intrinsic to the problem domain, but there is also accidental complexity that's needlessly created by tradeoffs and failures in analysis and even execution.
If you replace an existing solution with a solution which you believe is simpler, odds are you will have to scramble to address the impacts of all tradeoffs and oversights in your analysis. Addressing those represents complexity as well, complexity created by your solution.
Imagine a web service that has autoscaling rules based on request rates and computational limits. You might look at the request patterns and say that this is far too complex: you can just manually scale the system with enough room to handle your average load, and when required you can just click a button and rescale it to meet demand. Awesome work, you simplified your system. Except your system, like all web services, experiences seasonal request patterns. Now you have schedules and meetings and even incidents that wake your team up in the middle of the night. Your pager fires because a feature was released and you didn't quite scale the service to accommodate the new peak load. So now your simple system requires a fair degree of hand-holding to work with any semblance of reliability. Is this not a form of complexity as well? Yes, yes it is. You didn't eliminate complexity; you only shifted it to another place. You saw complexity in autoscaling rules and believed you eliminated it by replacing it with manual scaling, but you only ended up shifting that complexity somewhere else. Why? Because it's intrinsic to the problem domain, and tackling it with more manual work introduces more accidental complexity than what is needed to address the issue.
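A toy sketch of the kind of rule being deleted in that example (thresholds and names are made up): even this little bit of logic encodes knowledge about load that, once removed, has to live in someone's head and on-call rotation instead.

```python
# Illustrative autoscaling rule; thresholds and names are made up.
def desired_replicas(current: int, req_per_sec: float, cpu_pct: float) -> int:
    per_replica = req_per_sec / max(current, 1)
    if per_replica > 100 or cpu_pct > 75:   # scale out under load
        return current + 1
    if per_replica < 30 and cpu_pct < 25:   # scale in when quiet
        return max(current - 1, 1)
    return current

# Seasonal peak: the rule reacts on its own instead of paging a human.
print(desired_replicas(current=3, req_per_sec=450, cpu_pct=80))  # -> 4
```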
An example I encountered was someone taking the "KISS" approach to enterprise reporting and ETL requirements. No need to make a layer between their data model and what data is given to the customers, and no need to make a separate replica of the server or db to serve these requests, as those would be complex.
This failed in so many ways I can't count. The system instantly became deeply ingrained in all customer workflows, but customers connected via PowerBI: hundreds of non-technical users with bespoke reports. If an internal column name changed, or the structure of the data model changed so that devs could evolve the platform, users just got a generic "Query Failed" error and lit up the support team. Technical explanations about needing to modify their query were totally not understood by the end users; they just wanted the dev team to fix it. Also, no consideration was given to pagination, request complexity limiting, indexes, request rate limiting, etc., because those were not considered simple. But those cannot be added without breaking changes, because a non-technical user will not understand what to do when their report in Excel gets rate-limited on 29 of the 70 queries it launches per second. There was also no concern about taking prod OLTP databases down with OLAP workloads overloading them.
All in all, that system was simple and took about 2 weeks to build, was rapidly adopted into critical processes, and the team responsible left. It took the remaining team members a bit over 2 years to fix it by redesigning it and hand-holding non-technical users, all the way down to fixing their own Excel sheets. It was a total nightmare caused by wanting to keep things simple, when really this needed: heavy abstraction models, database replicas, infrastructure scaling, caching, rewriting lots of application logic to make data presentable where needed, index tuning, automated generation of large datasets for testing, automated load testing, release process management, versioning strategies, documentation and communication processes, and deprecation policies. They thought that we could avoid months of work and keep it simple and instead caused years of mess because making breaking changes is extremely difficult once you have wide adoption.
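To make the missing abstraction concrete: even a thin, versioned mapping between internal columns and what customers see would have let the schema evolve without breaking hundreds of reports. A hypothetical sketch, not the actual system:

```python
# Hypothetical reporting contract: customers query the stable names on the
# left; the internal columns on the right can be renamed without breaking
# hundreds of PowerBI reports.
REPORT_V1_COLUMNS = {
    "order_id": "ord_id_v2",
    "customer_name": "cust_display_name",
    "total": "gross_total",
}

def build_report_query(table: str = "orders") -> str:
    select_list = ", ".join(
        f"{internal} AS {public}"
        for public, internal in REPORT_V1_COLUMNS.items()
    )
    return f"SELECT {select_list} FROM {table}"

print(build_report_query())
# SELECT ord_id_v2 AS order_id, cust_display_name AS customer_name,
#        gross_total AS total FROM orders
```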
>They thought that we could avoid months of work and keep it simple and instead caused years of mess because making breaking changes is extremely difficult once you have wide adoption.
Right. Do you think a middle ground was possible? Say, a system that took 1 month to build instead of two weeks, but with a few more abstractions to help with breaking changes in the future.
Thanks for sharing your experience btw, always good to read about real world cases like this from other people.
There’s a HUGE difference between the simplest thing possible, and the simplest thing that could possibly work.
The simplest thing that could possibly work conveniently lets you forget about the scale. The simplest thing possible does not.