All the horrid Java libraries that we hated working with 10 years ago were created because of slavish devotion to the single responsibility principle, short methods, DRY, and testability. The absolute worst codebases I've ever seen sprung from an overaggressive concern for extensibility.
I do agree with the guiding principle, that when you're writing code, it should be written for other people to read and work with. But sometimes that actually does mean that a class should represent an object that does more than one thing, and you shouldn't fight it. Sometimes a gritty algorithm really does make the most sense if it's laid out as a 40 line function in one place rather than spread across 5 different classes so each part is independently swappable and testable. Sometimes you really don't need extensibility, you know that up-front, and building it in just pisses off everyone that ever has to touch your code because now they have to go through five or six "Find implementing classes" dances to find the actual code that runs when your interface methods (of which, of course, there's only a single implementation) are called. Don't even get me started on abuses of dependency injection when it's not necessary...
Part of me is glad that these concepts are so commonly discussed, because they really are good things to consider, and can make code much tidier and easier to work with in the best case. But it takes a lot of experience to know when and where you should and shouldn't follow these "rules", and it tends to be much easier to unwind a novice's spaghetti code and tidy it up than it is to pick apart a poorly conceived tangle of abstraction and put it back together in a way that makes sense.
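As a hypothetical sketch of that "Find implementing classes" dance (Python, all names invented), compare an interface with exactly one implementation against the direct version:

```python
from abc import ABC, abstractmethod

# The "extensible" version: an interface with exactly one implementation.
# A reader landing on notifier.send() has to go hunt for the class that
# actually runs.
class Notifier(ABC):
    @abstractmethod
    def send(self, message: str) -> str: ...

class EmailNotifier(Notifier):  # the only implementation there will ever be
    def send(self, message: str) -> str:
        return "emailed: " + message

def run_job(notifier: Notifier) -> str:
    return notifier.send("job done")

# The direct version: the code that runs is the code you are looking at.
def run_job_direct() -> str:
    return "emailed: job done"

assert run_job(EmailNotifier()) == run_job_direct() == "emailed: job done"
```

Both do the same thing; the first just adds a layer of indirection to read through, which only pays off if a second implementation ever actually appears.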
There are 3 levels of programmers.
Level 1 is the beginner. A level 1 programmer is barely able to make a complete app or lib. His work rarely works. The code makes no sense, it's full of bugs, it's a complete mess, and the user has a very poor experience (if any experience at all).
Then comes the level 2 programmer. The novice. The novice has learnt about OOP, design patterns, DRY, the single responsibility principle, the importance of testing, etc. The only problem with the level 2 programmer is that he overdoes everything. He is a victim of overengineering. He is able to make a complete app with a decent UX, and the quality of the code is dramatically better, but his productivity is low. Really, really low. Everything takes months. Every single change is a project. 10 actual lines of code need to be changed? That's 10 classes, 250 lines of unit tests, 3 classes that need refactoring, and on and on.
Finally, there is the level 3 programmer. The level 3 programmer is a programmer that does level 2 programming in a smarter way. His first principle is less code. That doesn't mean the code must be complete shit, but the level 3 engineer understands the real enemy, the thing all problems grow exponentially with: code. The level 3 programmer knows, understands, and applies the principles of the level 2 programmer, but he applies just the right amount of them. Not too much.
If he has to, the level 3 programmer will err on the side of level 1 code and avoid level 2 code as much as he can. That's because level 1 code costs less to fix than level 2 code.
Now here comes my point: a level 2 programmer reads and writes articles about how not to be a level 1 programmer.
I've been doing this recently and had a hard time justifying it even to myself. It just felt right. And you've got it - it's easier to fix dumb code, than refactor an overengineered mess.
> Now here comes my point: a level 2 programmer reads and writes articles about how not to be a level 1 programmer.
You're making me understand the current programming "scene" much better, especially the plethora of self-aggrandizing blog posts I tend to see today.
I wish I could upvote you twice.
A good, easy-to-understand codebase is one that has a minimal number of components that interact with each other in a well-organized, easy-to-understand manner. Those components, in turn, should each be made of a minimal number of sub-components that interact with each other in a clear, well-organized manner. And so on down until you hit bedrock. Doing this is essential because it sets up boundaries that limit the amount of code and interactions you need to juggle in your head at any one moment.
A programmer who doesn't understand this will produce horrible, tangled, tightly-coupled code regardless of whether they're doing it with a boiling soup of global variables or a boiling soup composed together from tiny "well-factored" classes that all implement interfaces containing only a single CQRS-compliant method.
I'd submit that there's another level of programmer, let's call it the SubGenius programmer (because the term tickles me) who thinks that levels 1, 2 and 3 programmers all share a sloppy work ethos and need to spend more time stepping back to keep an eye on how the forest looks instead of obsessing about trees all the time.
The level 2 programmer knows that the tomato is a fruit. The level 3 programmer knows that it is not a good idea to put it in a fruit salad.
I agree with you, but it feels a little like a false dichotomy. You can pretty easily decompose a long function into multiple smaller ones, within the same file. ~40 lines is close to the breaking point IMO. If it grows much beyond that, you can probably see groups of 10+ lines of code that have some sane subset of inputs/outputs. Toss those lines in another function, give it a name, make sure it's not exported outside of this file. If the 40+ line function gets reduced to a 10-20 line one with a few more stack frames, it is probably worth it.
If it's 100,000 lines of code and you break it up every 40 lines, you have now introduced 2,500 procedures, many of which don't really need to exist. But because they do exist, anyone who comes along now has to understand the complex but invisible webbing that ties the procedures together -- who calls whom, when and under what conditions a given procedure makes sense, etc.
It introduces a HUGE amount of extra complexity into the job of understanding the program.
(Also you'll find the program takes much longer to compile, link, etc, harming workflow).
I regularly have procedures that are many hundreds of lines long, sometimes thousands of lines (The Witness has a procedure in it that is about 8,000 lines). And I get a lot done, relatively speaking. So I would encourage folks out there to question this 40-line idea.
See also what John Carmack has to say about this:
http://number-none.com/blow/blog/programming/2014/09/26/carm...
/**
* get the leaf node as a string
* @param obj the json object to operate on
* @param path the dot path to use
* @return the leaf node if present <b>and textual</b>, otherwise null
*/
public static String leafString(ObjectNode obj, String path) {
JsonNode leaf = leaf(obj, path);
return leaf != null && leaf.isTextual()
? leaf.asText()
: null;
}

> The KISS principle states that most systems work best if they are kept simple rather than made complex. Therefore, simplicity should be a key goal in design, and unnecessary complexity should be avoided. YAGNI is a practice encouraging you to focus purely on the simplest things that make your software work.

made me think that it clearly warns against overengineering. (edit: formatting)
"The wrong abstraction is worse than no abstraction" comes to mind here. Very simple, repeated code that is unlikely to change (or unlikely to change everywhere at once because things are unrelated and just happen to do the same thing... -for now-) is much easier to read and maintain than short, clever, DRYed up code.
I'm big on linting code though... For me the more everyone's code looks the same the easier it is to read without doing mental gymnastics.
One of the reasons I enjoyed learning and writing Go was that it sort of encouraged keeping things "tight"
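The "wrong abstraction" point above can be sketched like this (Python, an invented pricing example): two rules that happen to share a formula today, DRYed into one parameterized helper versus left as plain repetition.

```python
# The "clever" DRY version: one shared helper. The moment the two rules
# diverge, flags and special cases start accreting here.
def discounted(cents: int, member: bool = False, holiday: bool = False) -> int:
    rate = 10
    if member and holiday:  # special cases accumulate over time
        rate = 15
    return cents * (100 - rate) // 100

# The plain, repeated version: two near-identical functions that are
# trivially readable and free to evolve apart when the rules change.
def member_price(cents: int) -> int:
    return cents * 90 // 100

def holiday_price(cents: int) -> int:
    return cents * 90 // 100

assert discounted(1000, member=True) == member_price(1000) == 900
```

The duplication costs a few lines now; the premature abstraction costs every future reader a trip through its flag combinations.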
Oftentimes, programmers look at lines of code or number of classes written as a source of pride, when we really should be looking at these things as expenditures.
Slavish adherence to these principles (with the exception of DRY) violates one of my core principles: write the least code possible to solve a problem.
The single responsibility principle in particular isn't what I'd consider a fundamental principle of good code since it often can lead to writing an awful lot of unnecessary boilerplate. Short methods are similar (if a method is called only once I often consider that a code smell).
I think the culture of Java is also partly responsible for this: grand architectural designs are valued over terseness and simplicity. A system with stack traces as high as your arm is seen as something to be proud of.
For an instance of how it depends, see the classic Forth example for implementing a washing machine driver:
...
: RINSE FAUCETS OPEN TILL-FULL FAUCETS CLOSE ;
...
: WASHER WASH SPIN RINSE SPIN ;
Then a single call to WASHER is all that happens when the user presses the start button.

Function calls should make things clearer, not less clear because of some other principle (like DRY or single responsibility or testability etc.). That's ultimately where I think the JavaLand culture has gone wrong: all these abstractions (many of them caused by being forced to live in a Kingdom of Nouns) that are sometimes quite useful in the large are always a pain in the small, and many times the small is all that is needed. Big projects are slowly learning they don't need to be so big; they can instead be a set of independent smaller projects, but that will take more time.
Well I certainly disagree; it often makes sense to separate out a singly-called function if for no other reason than to be able to unit test that functionality individually.
The problem, of course, is that your critical thinking needs experience before you truly become capable of deciding if a particular principle should be followed or broken in a particular case.
In short, I think guidelines help, but you have to walk the path to know the path.
Anyway, it's all about UX. Certain types of programmers will want certain types of interfaces. A well-designed library will have been crafted to meet those programmers' specific use cases.
I'd agree that finding the right level of abstraction is hard, easy to misjudge even for an experienced programmer with YAGNI in mind, and that you can do more damage by over-abstracting than not abstracting. Most programmers experience this the hard way sooner or later.
I'd disagree that having objects/functions do more than one thing or not be able to be pulled apart into individually testable units can be a good thing - except in very rare cases, or to compromise with existing legacy code.
Would you mind giving an example of where an object should do more than one thing?
Over-concern for extensibility is just a really big problem.
I've had one discussion that stood out. We were talking about functional programming, and I said that the idea of functional programming was to segregate state into the smallest unit of computation that must operate on it. My "friend" just blindly told me to "stop saying that". I asked for a counter-argument to that statement, but they just said that I should stop saying that.
We as a community are horrible at speaking about code quality, evaluating each other's work, and even just sifting opinions out from facts. It's crazy, and it's something that needs to change if we want to take our field to a future where we can all be happy with the code we are writing.
I've got some suggestions I'm happy to talk to others about. My email is in my about page on this website. Please comment here or email me and I'd like to talk about this!
In reality it is probably a lot more like learning how to apply any other concept: it takes practice, yet we don't spend enough time practicing our craft. This is particularly a problem because, in the course of developing a product, you will only get to implement a new solution to a problem a few times at best or, more likely, once. This means we don't get enough experience with the different possible approaches to really internalize what a 'good' solution looks like.
We spend much more time learning the software development approach du jour, XP, Scrum, Kanban, Lean, etc. It doesn't matter what software development approach you use if the output is an unmaintainable mess of code.
That's insane! I think that a goal of any programming course around the world should be to instill the idea that getting your peers to modify and read your code is a must in our industry. People need to be able to come along, see what you've done, and understand it.
If anyone wants to comment on the code I'm talking about in this example, it can be found here:
Implementation: https://git.gravypod.com/gravypod/school/tree/master/cs280/h...
Assignment: https://web.njit.edu/~gwryan/CS280/CS280-Program-1-Fall-2016...
I've attempted to live up to what I think is "good code", but no one wants to tell me if I'm right or wrong, or even to discuss this, for fear of hurting my feelings, I presume. I always get "run valgrind or a linter on it", and I've done that and come up with no ways to improve. Everything is all opinion and no fact in this business of code cleanliness, although it should be a cornerstone topic for the software development industry.
I also learn from others' mistakes as well. I had a habit of inlining conditionals in Python --

    if False: continue

-- until I noticed that the same style made reading someone else's code harder. I didn't notice it in my own code because I was more familiar with it. When I did notice, I stopped doing that.

Keeping the logic as close to the data as possible helps as well (ideally in the data schema if possible). I think Linus's quote -- bad programmers worry about the code, good programmers worry about the data and their relationships -- is true for most if not all levels of the stack.
I'd say one of the few people in the world who has given grade-A examples is Rob Pike. That said, I don't agree with him on much, but I do very much respect his talent for simplicity. I'm referring directly to his regex parser. It's amazing work, although I'd much rather see him implement it live from scratch.
Something like the SICP lectures on youtube. It's amazing what they go over and do in that class and how they teach you about abstracting your problem domain.
You can discuss it with me.
The single responsibility principle should really be "exactly one" rather than "not more than one". If a function is responsible for less than a whole thing, it shouldn't be a function on its own.
This is something I was learning when I was working in Swift; in a WWDC video they called it local reasoning: being able to reason about the code in one particular function without having to worry about state changes elsewhere. Now, working in C#, it seems to be a common thing to pass an object by reference to a bunch of different functions to fill in its values and map properties. If you do too much of that, it can become a maintenance and debugging nightmare when a property didn't map the way you wanted it to.
I also agree with what was said in a different comment about not breaking functions up too much if it's not necessary and that would help eliminate this need to modify state in so many places.
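A hypothetical sketch of the two styles (Python rather than C#, all names invented). In the fill-in style, several functions mutate one object passed by reference, so you must trace every call to know its final state; in the local-reasoning style, each step returns a value that can be understood at its call site.

```python
from dataclasses import dataclass, replace

@dataclass
class Order:
    subtotal: int = 0
    tax: int = 0

# Fill-in style: callees mutate the object they are handed.
def fill_subtotal(order: Order) -> None:
    order.subtotal = 1000

def fill_tax(order: Order) -> None:
    order.tax = order.subtotal // 10  # silently depends on call order

# Local-reasoning style: each step returns a new value.
def with_subtotal(order: Order) -> Order:
    return replace(order, subtotal=1000)

def with_tax(order: Order) -> Order:
    return replace(order, tax=order.subtotal // 10)

mutated = Order()
fill_subtotal(mutated)
fill_tax(mutated)

derived = with_tax(with_subtotal(Order()))
assert mutated == derived == Order(subtotal=1000, tax=100)
```

Both produce the same result here, but swap the two `fill_*` calls and the mutating version quietly computes tax on a zero subtotal, which is exactly the debugging nightmare described above.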
This is a tenet that will lead inexperienced developers astray. This "rule" is just too ambiguous. Extensibility is a fascination for object-oriented programmers but in my experience doesn't have a lot of successful examples. Typically I have seen this manifest itself in either a Command+Composite style where every meaningful class is a composite of other SRP classes, or in a proliferation of interfaces that seldom have method definitions and are instead used to enforce coding standards or dependencies.
KISS is incompatible with this rule, and you should kill this rule with fire, because simple is not extensible. Perhaps only when the goal is extensibility should you consider other developers extending your code; if you are developing a "beige" application, then you should not consider extensibility. Instead, just assume that release management will handle changes, i.e. another developer will update rather than extend your class, and that will be released in version 1.1.
Of course, to do this also means admitting that version 1.0 of your class was pretty much garbage and that it needed to be "extended". Tough pill to swallow for some.
The idea in Unix is that data is the interface; in the case of Unix the data is unstructured text, but I think it can be generalized to systems with structured data. So, contrary to OOP, the most extensible systems seem to be the ones that (self-)document their data structures as the interface and leave it at that.
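A small sketch of that "data as the interface" idea (Python, invented record shape): the contract is a documented data structure, not a class hierarchy, so any producer or consumer that agrees on the shape interoperates, Unix-pipe style.

```python
import json

# The documented contract: a list of job records, each
# {"id": int, "status": "ok" | "error"}. Any tool that reads or
# writes this shape can be composed with any other.

def produce() -> str:
    # one hypothetical producer, serializing records to the agreed shape
    return json.dumps([{"id": 1, "status": "ok"},
                       {"id": 2, "status": "error"}])

def count_errors(serialized: str) -> int:
    # one hypothetical consumer; it knows only the data shape,
    # nothing about the producer's internals
    return sum(1 for rec in json.loads(serialized)
               if rec["status"] == "error")

assert count_errors(produce()) == 1
```

New consumers can be added without touching the producer, which is the extensibility the comment is pointing at.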
Maybe there should be a 'click-bait' button on HN with which we can report things as such, along with posts such as 'Why I won't be using popular-technology-x ever again' and '10 things I hate about SQL'
Not sure I agree with this one. While abstractions are a great way to reduce the length of code, sometimes they break readability. When you read code, sometimes you feel like you're not reading a solution to your problem, but a way of solving your problem masked behind abstractions far removed from the domain concepts.
That's why, sometimes, redundancy is better than the wrong abstraction.
Consider a biography: you could simply collect facts about a person and write them in an arbitrary order and call it a biography. It could be a complete and accurate account, and still be impossible to read or follow.
Well written code is not only complete, but it also guides the reader through the logic.
Consider the difference between:
    statuses = []
    reporter = Reporter.new
    jobs.each do |job|
      statuses << (job.complete && !job.error)
    end

and:

    job_statuses = jobs.map do |job|
      job.complete && !job.error
    end
    job_status_reporter = Reporter.new

In the first case, we see statuses declared. Statuses of what? Not yet clear. And the code that updates it is separated by unrelated code. Also, what will reporter be reporting?

In the second case, map and better naming are used, making it clear that we are getting a status for every job. Aha! I don't even need to look at the implementation of the do block to understand what's happening.

I always find that map() methods tend to obscure the purpose of the values, whereas a simple loop appears more explicit.
Not sure how giving less information is preferred over being explicit, but hey, whatever works for you.
I remember when looking into neural nets, the basic python code to get one running was super easy for me to understand. However, I also realize that the optimal, most efficient methods of neural network libraries are way more complicated (and for good reason).
This one sentence should be the guiding principle for any set of recommendations. Programmers should tape it to the top of their monitors. Programs that are difficult to reason about are hard to get working, hard to debug, and hard to maintain.
In my opinion, this is the key requirement of "good" or clean code. Likewise, programming languages that facilitate this are "better" programming languages.
Depending upon the goals of a program there can be other important requirements (for example, efficiency), but however the program is constructed it should be as easy to reason about as possible given the constraints imposed by these other requirements.
Recognizing this, I tend to pick programming languages that make my programs easier to reason about and tend to program in a style that makes informal reasoning easier. For kernel development, programming in C was appropriate (back when I was doing it) because I needed to know exactly what was going on in the machine in response to the code. At the other end of the scale, straightforward Python code often results in such short programs that they fit entirely on one screen and are consequently easy to understand and reason about.
This style of programming has led me to be very impatient with the "keep debugging until you think it works" style of development. I tend to think that each bug found is evidence of the presence of more bugs.
>> Programs difficult to reason about are hard to get
>> working and hard to debug and hard to maintain.
Actually, reasoning and maintaining are often at opposing ends. Code which is easy to reason about tends to be low on abstractions and tightly coupled, so changes become much more difficult. More modular code is harder to reason about, but changes can be introduced easily.

Well said (currently programming by dead reckoning).
Therefore write everything in Prolog.
In a SaaS environment the code needn't be extensible as there is only 1 copy of it. It is much more important for the code to be changeable, rather than extensible and in many cases the things you do to make code extensible make it harder to alter fundamentally.
It's important to understand that how you deliver your software is one of the biggest guiding factors in how you design your software, and to take that into account.
Good code, you can search it... even if the majority of it is unfamiliar. Find a piece and say "oh, this is probably the spot" without ever executing it.
An excerpt: We are practically the only industry where completion and success are synonymous. If the foundation of a one-year-old home is crumbling and its roof is plagued with leaks, would anybody actually call that a success? Despite being filmed and produced on budget, is there anyone who would not be ashamed to have Gigli on their filmography? Of course not! So why should the products we create – complex information systems that should last at least fifteen years – be held to a different standard?
Now think back to those projects of yours. How many would you say are maintainable by someone with less business knowledge and a weaker grasp of the system's design? How many will not snowball into an unmaintainable mess? How many do you truly believe could last fifteen years? I'll bet that number is quite a bit lower than all of them.
Coincidentally, the number of projects where the client offers to pay for tens of man years is also quite low. This is something I'd recommend to be taken into consideration by armchair philosophers portraying the state of the software industry.
I write code that I come back to and don't understand, and realise it's terribly complex because I understood everything at the time but could not imagine what it would be like coming back after a month to make a change, or for someone new to try to understand it.

Code is for people to understand.

Some people think it's for being able to write good tests, eliminating all side effects and shared state, for the computer to be able to run quickly and optimise, or to be easy to read.

But it's really just about being able to be understood. Most time will be spent in maintenance.
When you modify and debug, you are diving into the middle. You are not walking through the code base from the beginning and reading all the comments.
It needs to be understandable from all locations.
I'm a strong believer in micro-modules. Left-pad et al. I try to create small modules which do one thing well, that I can trust and not have to think about.
This quote always sticks in my mind :)
I can remember an instance from when I started my career. I was a JS developer. I wrote a module with functors, compose, partial application, etc.

In code review my team told me they didn't understand anything, and that reading such code was not pleasant for them. I was upset, wondering why my team was not happy with it.

Today it makes a lot of sense: stick to the pattern/design of your team. If your team follows / loves functional programming in your project, stick to it. If not, try to explain to them why functional programming would be better than the usual approach.

At the end of the day, all that matters is writing simple, elegant code which others can understand.
The easier it is to remove code from someone else's program, the "cleaner" the code.
That's my definition of "clean code".
For example, I just had to edit the code of a text-only browser to remove a few "features": the author recently decided it would be a good idea to non-interactively access no-content refs in a page, such as "prefetch" and other garbage.
But then, contrary to intuition, a program becomes less clean by actually removing code (since now there is less code to remove, hence it is more difficult to remove code). A minimal (in the sense that no more code can be removed) program would be maximally unclean, whereas, intuitively, should it not be considered clean?
Also, adding code always makes a program cleaner, since the newly added code can always be removed easily.
Or else what happens?
"... should it not be considered clean?"
It should be considered finished.
"... since the newly added code can always be removed easily."
Not true in my experience. Unfortunately.
Sometimes it's necessary to add some code, e.g., a new driver for a new item of hardware. I have nothing against adding, per se.
Yes, most things in the article are nice to have, but you can't teach everything in a month, and people need to be productive to be worth their money.
When this rule is not respected:
1) You end up with a lot of coupling
2) Unit tests end up with a lot of overhead
Also the article is from 2013, did these exist then?
I'm passing on this article.
Just because you know it doesn't mean everyone else does.