A related principle is what I call code locality. Instruction locality is the grouping of related instructions so they can sit in the CPU's cache together (can an inner loop fit entirely in cache?). Data locality is similar. Code locality is for humans, so they can discover and remember related code. An example is those times you make an internal function because you have to, but it has a terrible abstraction (say, a one-off comparator for a sort). For comprehending the caller, it's best for that function to be near where it's needed within the same file rather than in a separate file or in a dependency in another repo.
Applying code locality to SPOT, when you do need multiple sources of truth, keep them as close together as possible in the code.
In this case, I sometimes prefer a slightly different approach; let's call it CRAP (Cross Reference Against Protocol). Instead of the definitions physically referring to the same point of truth, they are designed to cross-reference against a protocol and warn if there has been a deviation from it. That gives the developer the opportunity to go back and correct things as necessary, or otherwise to allow the deviation and manually sever the link to the protocol if it is no longer desired.
This guards against SPOT's weakness (at the cost of some extra manual work / due diligence by the developer), which is accidentally enforcing future convergence in situations where previous convergence was incidental/accidental and divergence should have been allowed to take place instead.
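A minimal sketch of that cross-referencing idea, assuming Python. Every name here (`PROTOCOL`, `check_against_protocol`, the two settings tables) is hypothetical and only for illustration. Each definition keeps its own copy of the values, a check warns when a copy drifts from the protocol, and a copy can be explicitly unlinked once the divergence is intended.

```python
import warnings

# Hypothetical "protocol": the values independent definitions are expected to match.
PROTOCOL = {"timeout_seconds": 30, "max_retries": 3}

# Two independent definitions; they are not references to PROTOCOL.
billing_settings = {"timeout_seconds": 30, "max_retries": 3}
search_settings = {"timeout_seconds": 30, "max_retries": 5}  # has drifted


def check_against_protocol(name, settings, linked=True):
    """Return the keys that deviate from PROTOCOL, warning about each one.

    Passing linked=False severs the link: the definition is no longer checked.
    """
    if not linked:
        return []
    deviations = [key for key, value in PROTOCOL.items() if settings.get(key) != value]
    for key in deviations:
        warnings.warn(f"{name}: {key!r} deviates from protocol")
    return deviations
```

A check like this could run in a test suite or at import time; the warning is a prompt to decide, not an enforcement.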
I'd also say there is a lot of nuance about what "truth" is (i.e. is a pizza crust/sauce/cheese an essential truth that should have a single source).
Some DRY definitions I read actually tie in SST but I think many devs don't bring that nuance to it.
It is very hard to find out if the definition already exists or not in the codebase. This can lead to multiple definitions of the same thing or the truth.
Does anyone have a good way to deal with this?
That's why I'm pretty dogmatic in applying DRY to infrastructure as code, for example, as opposed to generically for every code base, because finding or depending on repeated identifiers that have to be the same is such a source of error here.
</inevitable_joke>
This is not a reason to avoid SPOT altogether, but one should think through that situation as part of the mental calculus on pros & cons.
Nit: the article’s problems also have an obvious solution—named parameters with default values when not provided.
I can think of using a Rule Engine but not sure if there are any performant ones and they don't seem to be used much.
People not able to factor out functions or structure their code in a readable way. Variables are called v1, v2, v3. Unit testing seen as a waste of time. CI seen as a fun toy. They lack the experience to even notice the difference.
Maybe I'm becoming a curmudgeon, but I think many people would be well served by just googling "<my programming language> best practices", learning the acronyms like DRY, and just following them. And when you have gained some experience, sure, then you should question the wisdom and not follow it blindly.
Had a colleague work under a 'team lead'. Needed to take a form with variable amount of rows of input data - max 50 - and take data, parse it, and store it. Took 20-30 lines of code. Next day "I don't trust loops, these need to be unlooped". Really? This was all in writing and stated out loud in a meeting with witnesses, and everyone agreed. "Loops can be tricky - they don't always work like you think" (something like that). So a 30 line block of code with a loop around it became 1200+ lines with 'v1, v2, v3.... v50', with 'ifs' around each one to check if that row number was also submitted.
The code to generate the form was, of course, a loop that spat out holders for 50 rows. THAT was OK, because someone else's team wrote that a while back (really??) and ... it was already done and in production. The lead could not put their stamp on it.
Very very very weird. Having half a dozen other people all nod their heads, suggesting that a 30 line loop is fraught with danger and the correct answer is copy/paste 50 times. Felt like gaslighting, to my recollection. Worked in the same dept, just not on the same project together, but enough of this was heard/picked up across the dept.
And... my colleague and I aren't there any more, and to my knowledge, that team lead is still there.
DRY deserves to be listed with other code smells that may indicate a missing abstraction (and there are many), but there is never a reason to blindly extract chunks of code based purely on repetition. That's not a "first step toward good programming", it's a step in the wrong direction.
I have seen this working in finance. I worked at a startup with a quant, and my job became waiting for him to go home at 6 so I could clean up his code and make it maintainable the next day. Now I don't blame him- his background was operations research and he was a professor prior, without any real software background. But, he would get legit mad at me for messing with his code. We had a really tense relationship for awhile. I would occasionally break things, but he didn't really have tests until much later, so it was hard to detect and there were often subtle side effects.
But anyway, I think for awhile he thought I was just a pain in the ass, until one day about 9 months after we started the project, and he wanted to run some experiments using a specific universe of securities, and just apply a few constraints to them, and I set this up for him in about 5 lines of code, and all of a sudden, I could just see the light bulb finally turn on for him as to why I was doing all of these things. After that day we became a lot more friendly.
You can’t adopt DRY or SPOT or anything else if you aren’t free to refactor and you aren’t free to refactor without some tests.
> learning the acronyms like DRY, and just following them
Following DRY is the hard part. As the author points out, to DRY something up you need to pick the right level of abstraction. This requires experience to do well.
While there are certainly some DRY proponents that treat it like that, most discussions I've seen advocating for DRY treat it as a rule of thumb. There are always exceptions, but DRY will steer you in the right direction more often than not.
I'm curious what languages and frameworks you work with. I suspect this is something that varies from one programming subculture to another. In my experience, the only codebases where I've consistently seen underuse of abstraction have been PHP and C, and in those cases it has been because the code was written by entrepreneurs or engineers who were not software professionals. In Java and Scala I've seen plenty of code that used too much abstraction but almost no code that used too little. In C++ and Python I've seen it go both ways.
In my book, what programmers need first of all is the willingness to rewrite their code. I've been trying to hit the right balance for twenty years, but even now I frequently have to backtrack because the decision to add or not add a layer of abstraction turns out to be wrong. Biasing somebody towards or away from abstraction just changes the kinds of mistakes they make without making them better at cleaning them up.
This is why best practices are needed, and why they are over-rated. Where they are used, they are often over-used.
For what it's worth, 90% of the code bases I've worked on overused DRY. Two pieces of code that do the same thing but that shouldn't be coupled should be repeated, otherwise you end up with a million "if" statements for all the separate cases this code will need to handle as it grows, as well as uncertainty about whether changes will have unintended consequences.
Or, put differently, my HN motto: don’t give uni-directional advice when optimizing a U-shaped error function.
Mistake 1: Switch from DRY to premature optimization.
> You might think that legit reasonable developers but would not actually do something like this and would instead go back to the existing invocations and modify them to get a nice solution, but I've seen this happen all over the place.
Mistake 2: Assumption of incompetence to support your argument.
> . As soon as we start the thought process of thinking how to avoid a copy paste and refactor instead, we are losing the complexity battle.
Mistake 3: Strawman argument. DRY does NOT lead to over-complicating things. Overcomplicating things leads to overcomplicating things.
Now, I wasted 5 minutes, so you can waste some more to reply to this comment, instead of completely ignoring this dumb random blog post.
> DRY does NOT lead to over-complicating things.
That is not true. I dive around foreign code bases a lot and dry-ness is actually a significant complicating factor in understanding code, because you're jumping around a lot (as in physically to different files or just a few screens away in the same file). As in, inherently every time it's used, not just in situations where it's used in a complicated way.
This sounds dumb but it just simply is much harder to keep context about what's going on around if you can't refer back to it because it's on the same screen or one short mouse scroll above or below your current screen.
That obviously doesn't mean you should leave copy pasted versions of the same code over your code base. But it's important to consider that refactorization of that code into something common that gets called from multiple places as something that you don't get for free, but that is an active trade off which you usually have to apply to prevent bugs (changing one code location and not the other) or simple code bloat. In practice this is very relevant when you suspect something might be repeated in the future, but you're not sure. Imo: Just don't factor it out into anything, leave it there, in place, in the code.
Mistake 1a: Conflating the term "premature optimization" - it doesn't apply here. Premature optimization is about runtime performance, DRY is about optimising maintenance overhead.
Mistake 1b: (good) DRY can't be done early (it's a continuous process throughout project development).
> Mistake 2: Assumption of incompetence to support your argument.
Mistake 2: Assuming you're never working in teams with peers of varying experience and technical focus.
The presumption of re-usability is absolutely the most common red flag I've seen with DRY: I've seen it with a lot of very senior / experienced devs. You can call them incompetent, but there's plenty of them and we have to work with them. Articles like this help.
> Mistake 3: Strawman argument. DRY does NOT lead to over-complicating things. Overcomplicating things leads to overcomplicating things.
This statement concerns me. DRY very obviously and demonstrably leads to over-complicating things (excessive / ballooning parametrisation is just one of many very simple examples of this). If you can't see this I would have my own concerns about competence...
"Premature optimization" is largely a bogus concept, because the meaning of "optimization" has shifted a lot since the concept was first created.
People now use optimization to mean "sensible design that does not needlessly waste resources".
In this meaning of optimization, "premature optimization" is a bogus concept.
You should absolutely ALWAYS write non-pessimized code by default.
What the original concept referred to is what people now call "micro optimizations". Sure, premature micro optimizations is often a waste of time. But this is irrelevant to the context of this discussion.
Okay, but it kind of is about incompetence. And by “it” I mean everything. Look, we all remember that first time we all realized that adults are just winging it most of the time. Almost nobody knows what they are doing. Half the people who “know” actually know the least.
> DRY does NOT lead to over-complicating things.
Don’t Repeat Yourself is a terrible acronym because what it stands for is exactly the opposite of what people do. Not doing something is avoidance, opting out, like “don’t push your sister” vs “be nice to your sister”.
What most people do is they realize they have already repeated themselves, or someone else, and they rip it out. They deduplicate their code. Avoidance definitely can “lead” somewhere, but deduplication is active, and that can often be headed the wrong way, either directly or obliquely.
The Rule of Three is much clearer on this. You get one. There’s nothing to do when you see you’ve duplicated code - except to check if you’re the first or not.
Sure, I agree, except DRY is probably the second greatest gateway drug to overcomplicating things, next to OOP. Actually, they really go hand-in-hand, since OOP features are often used to DRY things.
DRY can easily go too far because fundamentally it's about centralizing ideas with the premise that different operations can and should share units, even though a "writeSomeFileToDisk" function doesn't necessarily have to do the exact same thing between different higher-level operations. Because so many engineers emphasize "elegance", if a set of functions seem similar enough, they pressure themselves to write code that is shareable, hence more abstract. Abstractions are inherently more complicated and hard to understand, not the other way around. Rather than having very simple "molecules" of code that can be understood on their own, there is instead a much larger molecule of nodes that are connected by abstract dependencies, and those nodes may only have dependencies in common.
DRY should be done sensibly, but teaching DRY is a problem in our industry because we don't teach engineering discipline. We teach principles like DRY and OOP, and even YAGNI as if they are tenets of a religion.
Fallacy: False Dichotomy and No True Scotsman.
"Things are either DRY or premature optimization and can't be both"
"No TRUE application of DRY would ever be a premature optimization"
This would be the most efficient title, subtitle, and entire contents of most posts about programming principles.
However, each reader has to have a similar enough perspective, background, and experience to understand and apply it. In that sense, the trend line measuring the value of commenting about comments about random blog posts indeed indicates wasted time, but hopefully it's a local minima.
My pithy corollary to your helpful tautology is a quote from Tommy Angelo that's stuck with me since my poker days: "The decisions that trouble us most are the ones that matter least."
Decisions are necessarily difficult to make when the expected values of the outcomes are similar. We waste an awful lot of time on choices that could have been made just as well with a coin flip.
So there you go world: two quotes that are generally useful about generalities that are locked, loaded, and ready to shoot you in the foot when misapplied.
Edit: formatting improvement.
Though note that DRY can itself be premature optimisation of the codebase.
For what it's worth, I agree with your points and disagree with the various counterpoints that were posted; "optimization" can mean a lot of things, and I for one understand what you mean.
No really, you're absolutely on point. The post is not worth the time, the case against DRY is too weak.
Sounds like a kid complaining about pushing DRY in a direction that overcomplicated things for him, through his own doing, and instead of improving himself he chose to attack "an uncomfortable principle".
If your use-case is that the user can select the crust, sauce, cheese and toppings for a pizza, just pass that shit to the make_pizza function with the help of enums and arrays. If you want to have predefined pizzas, you'd simply make a dictionary of pizza templates with all the options that the make_pizza function needs and/or, if you wanna be fancy, you'd make a separate make_pizza_from_template function, but definitely not a make_pepperoni_pizza function, because that's just encoding data as code in a silly way that's arguably not even a factory pattern.
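A rough sketch of what that could look like in Python. Only make_pizza and make_pizza_from_template come from the comment; the enum members, the `PIZZA_TEMPLATES` name, and the dict-shaped return value are assumptions made for illustration. Sauce and cheese are left as plain strings to keep it short.

```python
from enum import Enum


class Crust(Enum):
    THIN = "thin"
    THICK = "thick"


class Topping(Enum):
    HAM = "ham"
    PINEAPPLE = "pineapple"
    PEPPERONI = "pepperoni"


def make_pizza(crust, sauce, cheese, toppings):
    """Build a pizza from explicit options (represented here as a plain dict)."""
    return {"crust": crust, "sauce": sauce, "cheese": cheese, "toppings": list(toppings)}


# Predefined pizzas are data keyed by name, not one function per pizza.
PIZZA_TEMPLATES = {
    "hawaiian": {"crust": Crust.THIN, "sauce": "tomato", "cheese": "regular",
                 "toppings": [Topping.HAM, Topping.PINEAPPLE]},
    "pepperoni": {"crust": Crust.THIN, "sauce": "tomato", "cheese": "regular",
                  "toppings": [Topping.PEPPERONI]},
}


def make_pizza_from_template(name):
    """Look up a template and pass its options straight to make_pizza."""
    return make_pizza(**PIZZA_TEMPLATES[name])
```

Adding a third pizza under this layout means adding one entry to the dictionary, with no new function.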
No solution will be able to cater to requirements that don't exist at the time of developing this pizza-application. You build it according to the requirements that exist and that is enough. It's not your fault if nobody cared to mention that the user should be able to arbitrarily subdivide the pizza and select options separately for each subdivision - that's a feature update and it's OK if the original program hadn't thought of that. Just like you wouldn't scaffold ecommerce capabilities into a webpage "just in case", if there had been zero mention of such a need.
Mixing pizza database identifiers into generic pizza processing (e.g. make_ham_pizza) is wrong even without repetitions.
I would argue it’s part of your job most of the time to challenge whatever needs are presented and ask questions about the long-term vision to find a good middle ground of future proofing vs over-engineering. That is of course one of the hardest things to get right.
I think I know the title of Gordon's next blog post: "Why YAGNI is the second most over-rated programming principle."
EVERYTHING HAS TRADEOFFS. Every single thing has tradeoffs. Obviously you should not write terrible, brittle code. DRY is important because when you start duplicating code, with 30 different serialization methods littered throughout your code, 5 different ways of calculating the same value, etc. etc., you see why it really matters. It's a GUIDELINE, used as a general rule. And as guidelines and general rules go -- it's useful for juniors and people who don't have the experience to see the best way to write the code.
It's a good default, and like YAGNI and 100 other programmer acronyms it has its ups and downs. Your pizza example is not "coincidental repetition" -- it is actual repetition -- you just abstracted it in a really poor way to make a strawman.
The problem is that he made his case poorly and I definitely don't agree.
The article gets this wrong by considering DRY as some kind of dogma and then discovering some situations where it doesn't work well. And then of course some commenters here get it wrong by only looking at situations where it does work well. It's the same religious discussion again as FP vs OOP, static vs dynamic typing, no code vs full code etc. etc. The real answer to each of these is always 'it depends'.
This should be the lede, IMHO.
Not enough people are flagging the post.
Bullshit. What's the tradeoff on using `gets` vs any other function?
Nothing. Absolutely nothing. `gets` is wrong 100% of the time period.
If you're wrong, it's not a tradeoff, and a lot of things in this article and about DRY are wrong.
crust: "thyn",
DRY is about avoiding this class of cut-and-paste bugs too. Or with changing a string to a token, as it should have been:
crust: THIN
The code isn't even correct. It's mixing JavaScript and Python. I'm also not sure why you'd declare functions for each type of pizza; that's data. I'm not sure about the context, but the right way is:
def make_pizza(crust=THIN, toppings=[], cheese=REGULAR, sauce=TOMATO)
and then in each call, to override.
make_pepperoni_pizza()
is bad code compared to
make_pizza(toppings=[PEPPERONI])
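For anyone who wants to run it, here is one way the signature above could be filled in. This is a sketch, not the commenter's code: the string values behind THIN, REGULAR, TOMATO and PEPPERONI and the dict return value are assumptions. It also swaps the `[]` default for `None`, because a list default in Python is created once and shared between calls.

```python
# Assumed token values; the comment only names the constants.
THIN, REGULAR, TOMATO, PEPPERONI = "thin", "regular", "tomato", "pepperoni"


def make_pizza(crust=THIN, toppings=None, cheese=REGULAR, sauce=TOMATO):
    """Return a pizza description; callers override only what differs."""
    return {
        "crust": crust,
        # None stands in for "no toppings" so each call gets a fresh list.
        "toppings": list(toppings) if toppings is not None else [],
        "cheese": cheese,
        "sauce": sauce,
    }


plain = make_pizza()
pepperoni = make_pizza(toppings=[PEPPERONI])
```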
All of the code in this post is horrible, and has easy solutions.
I feel dumber for having read this post, and even dumber for having responded.
In case it gets fixed: https://i.imgur.com/ZR2XKA7.png
Yep, had the same thoughts reading the code. What you suggest even seems a purer implementation of the DRY principle, rather than what is proposed in the article which would result in copy and pasting the make_pepperoni_pizza() function as soon as you decide to sell a third type of pizza.
Of course, the DRY principle used without considering other factors could produce bad results, but all the code in the article is bad for reasons unrelated to the principle it attempts to criticize.
make_pepperoni_pizza()
is bad code compared to
make_pizza(toppings=[PEPPERONI])
How would you make Hawaiian pizza? I forget, does it include ham, or just pineapple? You're forced to remember that nuance in your suggested implementation, but not with "make_hawaiian_pizza()".
import requests
hawaiian_pizza = {
    "crust": "thin",
    "sauce": "tomato",
    "cheese": "regular",
    "toppings": ["ham", "pineapple"],
}
pepperoni_pizza = {
    "crust": "thin",
    "sauce": "tomato",
    "cheese": "regular",
    "toppings": ["pepperoni"],
}
def make_pizza(pizza):
    # PIZZA_URL is assumed to be defined elsewhere
    requests.post(PIZZA_URL, pizza)
This isn't better just because it's DRY; it also keeps the data separate from code, which makes it usable elsewhere. Defining fifty different types of pizza inline inside functions is a strange choice, because it tightly couples your pizza definitions to your pizza-making. What if you want to answer a question like "how many thin crust pizzas do we sell?"
To speak to the wider point about DRY, it's a guiding principle for abstraction. If you have two kinds of abstractions for your method, and one leads to code repetition while the other does not, generally favour the one that does not.
The fact that you should have additional rules, like you shouldn't need rabbit hole debugging (jumping through a million files/objects) to understand core behaviour is not a failure of a useful guiding principle
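To make the "how many thin crust pizzas do we sell?" question concrete, here is a small sketch, assuming pizzas are kept as plain dictionaries. The `MENU` table, the `orders` list, and `count_sold_by_crust` are all hypothetical names with made-up data.

```python
from collections import Counter

# Hypothetical menu: pizza definitions kept as data, separate from any pizza-making code.
MENU = {
    "hawaiian": {"crust": "thin", "toppings": ["ham", "pineapple"]},
    "pepperoni": {"crust": "thin", "toppings": ["pepperoni"]},
    "deep_dish": {"crust": "thick", "toppings": ["sausage"]},
}

# Hypothetical order log: one pizza name per pizza sold.
orders = ["hawaiian", "pepperoni", "deep_dish", "hawaiian"]


def count_sold_by_crust(menu, order_names):
    """Tally sold pizzas by crust type by looking each order up in the menu data."""
    return Counter(menu[name]["crust"] for name in order_names)


sold = count_sold_by_crust(MENU, orders)
```

With one function per pizza type, the crust would be buried inside function bodies and a query like this would have nothing to read from.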
pizza_base = {
    "crust": "thin",
    "sauce": "tomato",
    "cheese": "regular",
    "toppings": [],
}
# each pizza copies the base and overrides only the fields that differ
hawaiian_pizza = {
    **pizza_base,
    "toppings": ["ham", "pineapple"],
}
pepperoni_pizza = {
    **pizza_base,
    "toppings": ["pepperoni"],
}
def make_pizza(pizza):
    requests.post(PIZZA_URL, pizza)
mixed_pizza = {
    "crust": [["thin"], ["thick"]],
    "sauce": [["tomato"]],
    "cheese": [["regular"]],
    "toppings": [["beef"], ["pineapple", "pepperoni"]],
}
Where you offload the logic of making multi-topping pizzas into the data.
One of my pet peeves is clones in unit tests. People tend to care less about code quality when it comes to unit tests, and code gets copy-pasted all over the place. The result is usually an unmaintainable ball of mess, where the most subtle variation in the unit being tested requires you to apply the same change in 15 different places. In this situation, DRY is a very useful indicator that something is going wrong.
Now the opposite of DRY is YSHRY - You Should Have Repeated Yourself. When you start adding 5 boolean parameters to a function to adapt it to all its calls, it's a smell that you thought you should have DRY, whereas YSHRY.
So I agree with you: as a developer you should know when to duplicate code and when to DRY, but overall try to keep your code as simple as possible, which also makes it easier to maintain.
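A small made-up illustration of the boolean-parameter smell, in Python. The function names and the name-formatting task are invented for the example; the point is only the shape of the code.

```python
# DRY taken too far: one shared function with a flag for every caller's special case.
def format_name(first, last, upper=False, last_first=False, initial_only=False):
    if initial_only:
        first = first[0] + "."
    name = f"{last}, {first}" if last_first else f"{first} {last}"
    return name.upper() if upper else name


# YSHRY: repeat a little, and each caller gets a small function it fully owns.
def badge_name(first, last):
    return f"{first} {last}".upper()


def index_name(first, last):
    return f"{last}, {first[0]}."
```

The flag version has eight possible combinations to reason about for three flags, while each of the small functions can change without anyone checking the other callers.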
The initial API here is actually quite nice - there’s a good separation of abstraction and specification, and I can see all the information about an individual pizza in one place. The idea of making a pizza and the recipe for each pizza exist once in their respective places.
It’s true that a common pitfall is to prematurely create abstractions before having concrete examples of how they’d be used. But DRY is a _refactoring_. It’s something you do to an existing codebase to better clarify its design, not necessarily something to strive for ahead of time. Much better to extract abstractions from existing examples.
I always remember the tale Ron Jeffries tells of Kent Beck actually _introducing_ duplication to allow both pieces of code to be refactored. Duplication can be an opportunity to refactor towards a clearer design, but it’s not a mechanistic thing to do without thinking.
The developer who wrote this thought himself the programmer genius and wanted to make a pattern out of everything. He did not accept criticism because "DRY is a holy principle".
And that is why a post like this is important. Because next time I have someone like him in my team I can point him to this post. Argument by Authority may be a fallacy, but it is significantly more persuasive than other arguments.
And yes, you can respond to this with "why did you hire this guy in the first place, or why did you not fire him?". Well I do not make all the decisions. Not every teammate is perfect. Such is reality. Particularly in such a young industry as software development which (compared to, say, electrical engineering) is still searching a common understanding of ubiquitous best practises.
I looked at https://gordonc.bearblog.dev/ - I don't know why I would think this guy was any more of an authority on what was important than I am. So I'm not sure if anyone who thinks they're a programming genius would even care.
Because many companies are actually looking for these guys following the principles to a T initially. It's only midway through they complain about them lacking flexibility, if ever. Then later everyone complains about the incomprehensible mess while a few go "that's the way things are, we just need smarter people to understand our solutions".
And of course, it takes them 10 years to do something with a huge team which only took a small team a few years.
Well, no.
What you actually need to know is the next layer out: DRY stands for Don't Repeat Yourself, sure, but Don’t Repeat Yourself isn't the rule, it's a short phrase that is supposed to be a memory cue for the principle “Every piece of knowledge must have a single, unambiguous, authoritative representation within a system”.
If all you know is “Don’t Repeat Yourself”, you don't know the principle, and you can neither apply nor critique it.
#1 is simply applying the memory cue as if it were the principle. Yeah, don't do that.
#2 is, well, no, you only refactor to extract a bit of knowledge to a common place where it is immediately reused: there is no presumption of reusability, it is demonstrated.
The specific example they use of how this might be done wrong is...so bad.
Starting with: the example code that they suggest is bad design but works does not work.
make_pizza([left_topping, right_topping])
gives args a length of 1, not 2, but their function definition relies on it having length 2, and using that to distinguish from the simple case.
When you go read proper definitions of DRY they have lots of nuance that speak to many of my criticisms. But the reality is most developers are not encoding that nuance and using it as a fairly blunt instrument. I can't really prove it but at least some people in the comments seem to agree. So I guess I could say "DRY is misunderstood" - but if it's so easily misunderstood then maybe that's a shortcoming in and of itself?
Therefore for my personal projects I'm now firm believer of quick iterative building. Just get the first iteration done, get it working and save improvements for later. It may create a bit more work for the future me but it decreases the mental load quite significantly. I'll take less mental load with clear objective (refactor this because this) over more mental load with unclear objectives (make universal solutions taking into account things that may or may not happen in the future) any day.
(Important note: stupid is not incompetent - it's a proxy for clarity, composability and rational structure without becoming formal, rigid, overly orthodox or academic about it)
left_toppings = ["beef"]
right_toppings = []
make_pizza([left_toppings,
right_toppings]) # this will be a very funny pizza
Holy cow I was not expecting a none pizza with left beef reference in code form
Be prepared for laughter
Anticipating maintenance is tricky; I had a scenario where a developer on my team created a utility function to abstract away some code that was being repeated multiple times in the same file. As an author that made sense because he was writing the same code over and over again, but down the line when we wanted a specific instance of this copied code to work in a slightly different way we ended up making the utility function handle the edge case.
Over time this utility became extremely hard to work with, because you weren't always sure if you made a change it wouldn't create a regression in other places it was used.
When we sat down and asked ourselves "Is this utility assisting in authorship at the expense of maintenance", the answer was clear. We removed it and put back the repetitive code. We felt good about it because in reality, 90% of the time we were interacting with this code we were doing it in maintenance mode, tweaks and small updates. When in maintenance mode I don't feel the strain of a specific part of my code being repeated, I'm only looking at a small subset of the code. Sure, if I need to author a new case in this code it might be a bit more wordy, but I think the tradeoff is worth it.
I am sure there are perhaps better ways to abstract things, or that we were doing DRY wrong, and our utility function could have been smarter, but I've seen this same thing play out over and over again and usually trying to make my abstraction better hasn't helped.
I think in the case you described, instead of handling edge cases within that function, it might have been better to create an entirely new function to be called in those cases. You could then go a step further and identify shared logic, extract those and call them separately. At least that's what I tend to do when I find myself having to branch logic, especially established logic. Obviously I'm assuming a lot of the details here, and most likely what y'all ended up doing was the best right thing for your project/team.
WET code (opposite of DRY code, often starts as copy pasted) has more code, more typing, but the diffs are simple, bugs are simple (often you simply forgot to copy piece of code into 7 different places which is easy to solve). After many iterations, what started as very similar "classes" is now completely different.
One look at WET class and you know what it does, you change one line and you're done, maybe you need to copy it to 2-3 other files, maybe you don't. In comparison, you'll stare at DRY class for 2 hours and realize you need to refactor absolutely everything, it will break half of the codebase and diffs are insanely complicated.
I recently wrote 2 similar projects, one WET and one DRY, and the WET one is simpler, easier to maintain, and more enjoyable to work on. DRY is the root of all evil.
I am sorry but that sounds like an absolute nightmare. 7 different places means 7 different times that bugs might crop up because you forgot, especially if it isn't documented that you need to copy the code in other places. It also seems a nightmare to maintain documentation about that, as comments might get lost or not updated in all the copy paste. Of course, unit tests are out of the question: are you writing 7 slightly different unit tests and keeping them updated? And I'm supposing it's simple bugs, not pervasive, hard-to-reproduce, indirect bugs that take days just to find the root cause.
> In comparison, you'll stare at DRY class for 2 hours and realize you need to refactor absolutely everything, it will break half of the codebase and diffs are insanely complicated.
Sounds like a problem of overcomplicated, bad coding and bad documentation. It's not a problem with DRY.
But I tend to agree with the author. Sometimes verbosity is the lesser evil. No suggestion should become a dogma and whoever played some games of code golf knows that short code doesn't mean code that is easy to read. Extreme example of course. But I believe many start to optimize in this way just as a way to reduce the line count.
Still, there is still room for some kind of factories or function templates (not in the c++ sense). I think a user is allowed to repeat himself but then again a user is just another arbitrary layer again. But if such helpers are to be implemented, I tend to like it in a place where the user is invoking said helpers and not on the level below that if that makes sense.
Exactly! The DRY'ed example in the first section should read rather:
def make_hawaiian_pizza():
    make_pizza(["ham","pineapple"])
This demonstrates the omnipresent dangers of untested copy-paste.
"Copy-paste, copy-paste, Will Robinson!!"
The backend itself held no data, but whoever built the backend had gone full service layer, with models and adapters to the upstream services that hold the data.
The result was a backend that was a pain in the ass to change, necessitating whole trees of file changes to build features.
So we started inlining everything. We just took it back to the request handlers. We started fetching, mutating and returning the data in the request handlers. Suddenly a change became modifying one function. Every endpoint was unique and didn't depend on anything else. Things became easy.
Halfway through the migration, someone got our effort reviewed by a principal engineer who told me "it wasn't SOLID", and my contract wasn't renewed. It didn't dishearten me.
Software design is meant to make change easier and proudly adding abstractions can be a bad thing.
What is the big benefit you gained from doing that compared to calling, say, a service method call in the controller?
costService.generateCost(newPrice);
Is it really that difficult to go to the service method definition? With inlining at the controller level, in order to unit test generateCost, you'll now have to deal with authentication/authorization/request handling related infrastructure which has nothing to do with cost calculation.
Simplicity is hard.
Excellence in programming is trading principles off each other based on the design constraints and expected changes.
DRY trades off everything else in different ways depending on the language and problem. DRY in Python should often stop when you ask "should this be a metaclass" but before "should this be a decorator"; that's different than in C.
Maybe.
But maybe you're falling prey to the other programmer trap - catering for conceivable situations that are just never going to happen, and making your codebase unnecessarily accommodating as a result. This is another great source of complexity, and quite often the source of unnecessary abstractions (which add to cognitive load) too.
In my experience it's better to cope with half-and-half pizza toppings when they arise, rather than coding as if they're already needed. Because when they are needed, you'll probably find the requirement is actually to put them on a 3-tier wedding cake, or a car.
1: https://www.wingolog.org/archives/2015/11/09/embracing-conwa...
2: https://sandimetz.com/blog/2016/1/20/the-wrong-abstraction
1) Repeat yourself everywhere because you're a noob and don't know how to DRY
2) DRY everything because it's easier to maintain, learn all sorts of clever tricks to make DRY work
3) Realise that sometimes it's easier to DRY and sometimes it's easier to write repetitive code.
The problem is that if someone at level 3 (like the author) talks to someone at level 2 then the level 2 developer thinks that they're talking to a level 1.
With the third thing, you should have enough information to work out where generalizations should be. Even then, only generalize what you need to at the time. Going overboard can add unnecessary complexity, so it is generally a good idea to be conservative in what you generalize.
As you add more things, you can refine and evolve the system as needed. At this time you should have a better understanding of the system and what parts can be shared and generalized.
If code is coincidentally the same, then you should leave it alone - the two pieces of code are likely to evolve independently and trying to make a common function/class handle two separate usecases is likely to lead to complex, ugly code.
Conversely, if the two pieces of code are intrinsically the same then you SHOULD pull them out into something common. If you don't, you risk the implementations drifting and getting inconsistent behaviour over time.
Determining which is which is a matter of interrogating your domain and business logic, which is the essential function of our job as developers/engineers.
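A minimal sketch of that distinction, with made-up names and numbers (`FREE_SHIPPING_THRESHOLD`, `LOYALTY_THRESHOLD` and `VAT_RATE` are all hypothetical). The two thresholds are equal today only by coincidence, so they stay separate; the tax rate is a single piece of business knowledge, so it gets a single definition:

```python
# Hypothetical values, for illustration only.

# Coincidentally the same: two unrelated business rules that happen to share
# a number today. Kept separate so each one can change on its own.
FREE_SHIPPING_THRESHOLD = 50.0
LOYALTY_THRESHOLD = 50.0

# Intrinsically the same: one rule, one definition, used by every caller.
VAT_RATE = 0.20


def with_vat(net: float) -> float:
    """Gross price; every caller shares the single VAT rule."""
    return round(net * (1 + VAT_RATE), 2)


def ships_free(order_total: float) -> bool:
    return order_total >= FREE_SHIPPING_THRESHOLD


def earns_loyalty_points(order_total: float) -> bool:
    return order_total >= LOYALTY_THRESHOLD
```

If marketing later raises the loyalty threshold, only `LOYALTY_THRESHOLD` changes and shipping is unaffected, whereas a VAT change is made once and reaches every caller.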
This. A million times this.
IMO, the single most important principle is still "Keep It simple when you can, make it complex when you have to."
A system that is simple can be grokked quickly, meaning it can be debugged quickly, modified quickly, new developers can be onboarded quickly,...
Yes complex systems have to exist. Some tasks are complex, and require complex solutions. BUT: Complexity should come into play when it is necessary.
It is perfectly okay to design simple solutions for simple tasks. Yes, sometimes this means ignoring things like DRY.
The example discusses a code boundary that is internal to a single atomic "module" - the preparation of a data structure that describes a pizza. Then the author says that bad things will happen if said code boundary is used from other modules.
However, why would an external module developer do that? It is common wisdom to recognize and avoid module-internal utility functions.
Conversely, as long as the presented shortcut is internal to a module (=used only for a specific set of use cases well understood by anyone touching the code), and saves toil, it might actually be justified.
External users will go look at the implementation of MSVC's standard library and reverse engineer the Windows API to make things faster lol. No internal module function is ever safe.
Then write: "Why i prefer tabs over spaces!" let the flamewars begin
"Just because two things are the same right now doesn't mean they _should_ be the same."
So that criticism is totally valid - DRY has to be applied only when things _should_ be the same, and that can actually be hard to identify.
That being said, of course the title of the article is bad and not accurate. DRY is essential. I don't think there's many people that actually argue against it. If you have a piece of business logic that's essential to the business and it influences other pieces of logic, they all have to refer to the same definition. Repeating it is bad for everyone - users will see inconsistent behavior, and devs will have to "remember" (read: never actually remember) to update important logic in multiple places. Important things should have a single source of truth. That seems inarguable to me.
It can be hard to find a design that actually achieves that. That's not DRY's fault.
My usual approach would be something like:
def make_pizza(left_toppings, right_toppings=None):
    if right_toppings is None:
        right_toppings = left_toppings
    ...
though really it should probably be
def make_pizza(toppings=None, **kwargs):
    if toppings is not None:
        kwargs['toppings_left'] = toppings
        kwargs['toppings_right'] = toppings
    return requests.post(PIZZA_URL, kwargs)
That is, if this logic should even be handled in the application itself at all (not sure why you'd choose to make a breaking change to the API rather than extending the API, though changing the make_pizza function to keep the code working after an API change is the correct response). I'm also not sure why the author chose to make the function capable of handling an arbitrary number of arguments, or why after doing so he chose to incorrectly invoke it on a list.
For example, let's say there is an application where a user can purchase items and also give reviews for such purchased items, the user reviewing and the user buying have in common just the ID, while in the review boundary the relevant information is probably the user nickname, while in the purchase boundary the relevant information is the payment system, or the state of the checkout ("purchase" is probably not a single boundary).
In that case, data could be duplicated to ensure the boundaries are decoupled.
This is of course at the data level, but usually it translates to "there is a user model that has many orders and many reviews", because of DRY, no two user models could exist, there you have the boundary violation though.
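As a rough sketch of what two decoupled user models could look like (the class and field names here are invented for illustration, not taken from any real system), each boundary carries only what it needs and the ID is the sole shared piece:

```python
from dataclasses import dataclass


# Review boundary: only what a review needs to display.
@dataclass(frozen=True)
class Reviewer:
    user_id: int
    nickname: str


# Purchase boundary: only what checkout needs.
@dataclass(frozen=True)
class Buyer:
    user_id: int
    payment_method: str
    checkout_state: str


# The same person appears in both boundaries; only the ID is shared.
reviewer = Reviewer(user_id=42, nickname="pizza_fan")
buyer = Buyer(user_id=42, payment_method="card", checkout_state="paid")
```

The `user_id` field is repeated on purpose: that small duplication is the price of letting the review and purchase code change independently.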
Sorry, this is a bit of a ramble, it's a long discussion.
Oooh yeaah ... I had that exact argument in a recent code review where a patch literally modified _hundreds_ of LoC across many different files just to avoid duplicating a simple 10-liner in a single place. Yeah, you read that right.
A developer basically rewrote half of the existing code architecture and applied "best" OOP practices. The intention wasn't a bad one, but it goes without saying how incomprehensible the code would have become if that patch had gone in in its original form. It was hard to argue against it, and what would have been 10 minutes of work became a 2 or 3 week long discussion. And that is just ... bad.
Obviously YMMV.
Also, I liked the comment from ihateolives somewhere in the thread a lot.
Ironically the author just demonstrated why DRY is a great principle.
I actually stand outside my body and watch myself make it more complex, and think, "Ugh, this is more complex than it needs to be, I should make this simpler. But I don't want to." I continue and hope that a refactor will make it less embarrassingly complicated.
An idea I've seen a lot here on HN is that DRY is good with a baseline number of reuse. If we see the same pattern twice, maybe it's not a good abstraction since we haven't seen it grow yet. If we see that same pattern 15 times, I think we know an abstraction is handy here.
We abstract to functions to reduce cognitive load and to allow scope rules and information hiding within the language to prevent local variables becoming pseudo globals.
The whole premise of the 'goto consider harmful' structured programming movement was to allow us to replace control structures with a single black box consisting of input-process-output, which aided in reasoning.
The premise was to construct the program from cohesive functions that are lightly coupled.
When did we move away from that?
A less dry one would look like
def make_hawaiian_pizza():
    payload = {
        "hwaiianCrust": "thin",
        "redSauce": "tomato",
        "cheese": "regular",
        "ham": True,
        "pineapple": True,
        "toppings": ["ham"],
    }
    requests.post(PIZZA_URL, payload)
def make_pepperoni_pizza():
    payload = {
        "pepperoniCrust": "thin",
        "crust": "thick",
        "sauce": "tomato",
        "cheese": "regular",
        "toppings": ["pepperoni"],
    }
    requests.post(PIZZA_URL, payload)
Now I know what I'm getting myself into here. Most people hate making their own choices and love to blindly follow simple prescriptive rules which are known by Experts to produce Good Results. But when the religious approach isn't working for you, maybe it's time to stop making your occupation a religion.
Disagree. Discovering hard-to-find issues sooner is actually a good thing.
> Its more difficult to build context of the implementation, if one needs to jump from file to file.
Agree.
In general, I personally look at DRY as "It takes time to implement and/or understand it, but when it's done then it works and it will last." "It takes time" is something your boss won't like, but that's (mostly) not an issue if working on open source SW. No boss -> no pressure -> higher quality.
// Option 1
const Pizzas = {
  Hawaiian: {
    type: 'hawaiian',
    crust: 'thin',
    sauce: 'tomato',
    cheese: 'regular',
    toppings: ['ham', 'pineapple'],
  },
  Pepperoni: {
    type: 'pepperoni',
    crust: 'thin',
    sauce: 'tomato',
    cheese: 'regular',
    toppings: ['pepperoni'],
  },
};
// Option 2
// Pizzas could be returned from an API, so that the pizza
// types are configurable outside of the code
const response = [
  {
    type: 'hawaiian',
    crust: 'thin',
    sauce: 'tomato',
    cheese: 'regular',
    toppings: ['ham', 'pineapple'],
  },
  {
    type: 'pepperoni',
    crust: 'thin',
    sauce: 'tomato',
    cheese: 'regular',
    toppings: ['pepperoni'],
  },
];
const makePizza = (pizza) => requests.post(PIZZA_URL, pizza);
// Then, for Option 1
makePizza(Pizzas.Hawaiian);
makePizza(Pizzas.Pepperoni);
// Or, for Option 2 (e.g. user selected via a menu)
makePizza(selectedPizza);
When I read the initial example I can understand it in the time taken to read the code.
With your example I had to think for about 1-2 min before it made sense. If the codebase is full of clever stuff then I have to spend hours understanding all of the clever things before I can make changes. If everything is simple then it's easy to change.
If you want to see where overengineering leads you then take a look at this project. https://github.com/EnterpriseQualityCoding/FizzBuzzEnterpris...
It is satire but I have absolutely worked in places that write code like that.
Good programmers know that it's 10 times harder to read code than to write it, so they deliberately keep it simple so that they can read it later.
Yes, that is a very common beginner mistake. Sometimes a little copy-pasta is needed to avoid over-complicating what needs to be simple.
Where DRY is really important are things like:
- Hey, you seem to be using "foo" and "bar" all over the place. Put those strings in constants.
- Hey, you're using magic numbers all over the place. Use an enum (or constants depending on your language / situation.)
- Wow, you copied and pasted that logic all over the place. Now when we need to make a change we have to make it in 20 spots. That should be encapsulated in a function / method / object
- (And to get closer to home) Even though you just want to "send a POST request with a single JSON object," we have a common session management pattern and error handling pattern in our application to deal with this API. That particular pattern should be encapsulated so you aren't repeating it for every #%$#@ API request.
The reason for this is maintainability. Say that logic needs to be changed. If you didn't break it out into a callable method, you'd have to find all the places you use that logic and change it. If it is a callable method, you only have to change the logic in one place, thus DRY.
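The first two bullets might look roughly like this in Python. This is only a sketch: `DEFAULT_SAUCE`, `OrderStatus` and `is_finished` are hypothetical names chosen for illustration.

```python
from enum import Enum

# One definition for a string that would otherwise be retyped everywhere.
DEFAULT_SAUCE = "tomato"


class OrderStatus(Enum):
    """Named states instead of magic numbers scattered through the code."""
    RECEIVED = 1
    BAKING = 2
    DELIVERED = 3


def is_finished(status: OrderStatus) -> bool:
    # Reads as intent; a bare `status == 3` would not.
    return status is OrderStatus.DELIVERED
```

Renaming a state or changing the default sauce then touches one line instead of every call site.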
https://en.wikipedia.org/wiki/Don%27t_repeat_yourself
Here is an overly simplistic example:
If you see this scattered around your code:
var formattedName = firstName + " " + lastName;
Create this method and call it when you need it:
string FormatName(string firstName, string lastName)
{
    return firstName + " " + lastName;
}
var formattedName = FormatName(firstName, lastName);
When business decides it wants to change name formatting from "firstName lastName" to "lastName, firstName", you only have to change the logic in the FormatName method because you "didn't repeat yourself."
No, it's about representation of information in a system, as your own Wikipedia link states right up at the top of the second paragraph. Code and data are both forms in which information may be represented in a system.
It does not invalidate DRY concerns in general! For example, an important reason to avoid repetition is that it adds maintenance inertia. Where a setup defined in a single location is easy to experiment with and improve, one that's repeated in several places becomes tougher to change. I.e., friction. I would argue this is also adding complexity, which is what the author seeks to avoid.
You could imagine contexts for the pizza example where the repetition doesn't make sense, and a refactor could make things easier. You can't tell alone from the snippet.
From the headline and concluding paragraph, it feels like a straw-man. eg overrated, and:
> "Well, obviously I'm not saying we should throw DRY completely out the window. I'm not sure it would actually be possible to write code that "never doesn't repeat itself". But I do think we should tone down knee jerk reactions to PRs that contain several repetitions of a block of code. There are at least a few cases where that might be the exact right thing to do."
You make a make_pizza function that supports split toppings and you pull the guts out of the original make_pizza function that just calls the first make_pizza function with left_toppings=[toppings], right_toppings=[toppings]. You don't need to ruin your function signature with *args.
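A minimal sketch of that refactor, assuming payload keys named `left_toppings` and `right_toppings` (the real API's field names are not given) and returning the payload instead of posting it so the snippet stays self-contained:

```python
def make_split_pizza(left_toppings, right_toppings):
    """Build the payload for a pizza with separate toppings per half."""
    return {
        "crust": "thin",
        "sauce": "tomato",
        "cheese": "regular",
        "left_toppings": list(left_toppings),
        "right_toppings": list(right_toppings),
    }


def make_pizza(toppings):
    """Original signature, unchanged: same toppings on both halves."""
    return make_split_pizza(toppings, toppings)
```

Existing callers of `make_pizza` keep working, and only code that actually needs half-and-half has to know about the new function.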
The fundamental assertion is that you should be structuring your code such that it is reflective of reality, but reality is really bloody messy. The immediate response to this example is that Pizza is in fact Toast[1] and so you should actually have a make_toast function that handles all forms of toast. This is clearly ridiculous, and if you're building a system to make pizzas and you build your function in a way that extends as far as building nigiri sushi, you're an idiot. You have to take a reasonable judgement of what is the underlying structure that you want to reflect. It's not a coincidence Hawaiian and Pepperoni Pizzas are structured the same.
Well, yes, they do. They also make your database slower. This doesn't mean they are overrated - this means they are tradeoffs. Like everything engineering.
With DB normal forms you buy integrity (i.e. keeping the data consistent as it changes) and you pay with performance and schema complexity. Usually this is a sensible tradeoff because integrity is more important. But, for example, if your data never changes - you will be paying for nothing. Or, perhaps, you can't afford the performance price and you have some other way to ensure integrity. Then you de-normalize.
DRY is similar. As mentioned in the sibling thread, it's not about mechanically avoiding repeated code - just like normal forms are not about never having the same value in two different rows. It's about maintaining logical integrity of your code as it changes. PI=3.14159265359? Probably safe to copy around. An implementation of some use case? Probably not.
I'd say following DRY/STEP principle is a sensible default. If a reviewer asks you why your code is not DRY - you should be able to articulate a reason.
I think DRY is a good thing in some cases, but you should carefully consider when something is worth DRYing and when WET gives you the best tradeoff for isolation.
My metric for deciding is to favour the Single Responsibility Principle. If DRY means compromising it, it is most likely not worth it.
Depending on the given problem, this is sometimes more time consuming, but still gets easier and faster with practice, and the benefit down the road can be tremendous.
Associative arrays are bad even for such simple things imo, because they break autocompletion / code inspections, and your functions then become black boxes that are hard to understand without looking at the implementation (code). Sometimes this is also evident when it's just you working on the code; try leaving the code for a few months only to return to it and waste time relearning how to use your own code, because it is not self-documenting, and you can also forget or misspell an array key. Etc. This is not the case if you define data types as objects instead of using arrays.
I learned this from trial and error myself, and I used to use associative arrays a lot for things – now I find myself using/creating objects more often, and I just love returning to this code later, and have it work without too much crapping around.
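The same point can be sketched in Python, using a dict in place of an associative array and a dataclass in place of an object. The `Pizza` type and the misspelled `sause` key are invented for the example:

```python
from dataclasses import dataclass, field


@dataclass
class Pizza:
    crust: str
    sauce: str
    toppings: list = field(default_factory=list)


# With a plain dict, a misspelled key is silently accepted.
as_dict = {"crust": "thin", "sause": "tomato"}

# With a data type, the same typo fails at construction time.
try:
    Pizza(crust="thin", sause="tomato")
    typo_caught = False
except TypeError:
    typo_caught = True
```

Editors and type checkers can also autocomplete and inspect `Pizza.sauce`, which they generally cannot do for arbitrary dict keys.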
"Copying and pasting a few lines of code takes almost zero thought and no time"...and you're fucked. Pardon my French but it's warranted. Been there, done that, now know better.
Firstly software development is a thinky sport and the moment you're cut-n-pasting code while not thinking you're exhibiting risky behavior. Here come the bugs.
Guess what happens next: senior devs are busy, simple bugfix is assigned to junior or farmed out to contractor. They fix just one of the cut-n-pasted routines and call it a day. Then you play a few iterations of the PR to test failure game or you ship a bug. I see this all the time. I saw it yesterday. After it fails the tests enough and I get the PR I have them DRY that code up.
This is especially important if you've inherited a crappy code-base with lots of duplicate code. We have a rule that if you touch it you DRY it. Never had that rule not serve us well. Getting people outside the core team to stick to it is work, but that's a different problem.
For example, a project using a framework may only require a developer look at 4 small files to understand the functionality of a resource, and it acts as inline documentation to others on how to quickly contribute new features. In a way, through explicit separation of resources the “similar” code tends to differentiate rather quickly as use-cases rarely share the exact same context throughout the entire life-cycle of a program.
The worst maintenance teams of popular projects permute an API definition every 6 months, and break existing production code in downstream works. You know, ironically still building that bug infested Ivory Tower everyone assumed they could avoid with grossly oversimplified acronyms ( https://en.wikipedia.org/wiki/Ivory_tower ). ;-)
The examples are also pretty contrived, there’s hardly any duplication there and the duplication is very simple and unlikely to change much. DRY is beneficial when the repeated code is complex and will likely need to be changed in the future (eg to fix bugs or to be extended), where repeating the code will be a source of error since every change would then need to be applied to each instance and forgetting one is a problem. The example is trivial enough that I wouldn’t bother refactoring it until it became complex enough to be a problem.
A good principle is to not apply principles too quickly/soon but only when not doing so would introduce complexity or cognitive overhead. YAGNI, basically.
As some have pointed out - make_pizza shouldn't be anywhere in the code. There should be a mongo collection or rel table that has a bunch of typical pizzas and a way to make custom ordered pizzas, typically through a UI.
The more complicated thing is the data structure that represents any pizza (like 50/50, 10/90, 10/80/10, toppings per section, etc...)
And a mongo collection for "typical" pizzas would fit that pretty well. And beyond that custom ordered pizzas.
All that being said even the above is over engineered. Typically this over-engineering is a result of allowing end-users to create a pizza. I'm super old school and still call in my orders on the phone. The difference being that end-user UIs that let you make a pizza need to be over-engineered while call in orders is just a bunch of notes and a total "additional toppings" count.
The principle concerns the duplication of knowledge, not code.
The author's "accidental duplication" arises when two similar pieces of code represent different knowledge; joining them produces code that is ambiguous in meaning.
Basically I like the refactoring approach. You have a set of refactorings. Like extract method / inline method. The point is that every refactoring is two-way. And both ways are useful in different situations.
To support this approach, a sane IDE is a must and a strictly typed language is preferable. You should be able to refactor your code without fear of breaking unrelated code.
What I definitely think is overrated is "if it works - don't touch it" principle. It's lazy and in the end it creates much more work than if one would gradually improve something that works.
As I build a thing, I just sling duplicate code like I'm getting paid by the line. Once I've created all or most instances of the duplicate code, and they're working, only then do I circle back and refactor it to extract common concepts and abstractions.
I've learned over the years that I don't really understand what I'm building until it's built. I need to step back and look at the patterns that have formed to spot the difference between firm concepts and trivial duplication.
Doing this well takes experience. It involves predicting how your code is likely to change. One lesson from experience is to favor repetition over bad abstractions. I formed this opinion from living through the pain of both types of anti-patterns. Duplication causes the need to find, change and test all instances. That sucks. It leaves you open to bugs. But bad abstractions can require ripping the whole thing apart.
- Have heard of it and it sounds like a good idea
- Let's DRY everywhere!
- OK, maybe don't DRY everywhere...
- There are multiple ways to implement DRY, and it all depends on circumstance
Taking the article example, a better approach would be to DRY the data first (after discovering that in your organization the most common pizza is thin crust with tomato sauce and regular cheese):
STANDARD_PIZZA = {
    "crust": "thin",
    "sauce": "tomato",
    "cheese": "regular",
}
TOPPINGS_PEPPERONI = ["pepperoni"]
TOPPINGS_HAWAIIAN = ["ham", "pineapple"]
def make_pizza(design):
    requests.post(PIZZA_URL, design)
def make_standard_pizza(toppings):
    # Copy the standard design and add the toppings to the copy
    make_pizza({**STANDARD_PIZZA, "toppings": toppings})
Now it's easy to use with no repetition:
make_standard_pizza(TOPPINGS_PEPPERONI)
make_standard_pizza(TOPPINGS_HAWAIIAN)
make_standard_pizza(["pepperoni", "ground beef", "olives", "feta cheese"])
You can easily add to it:
TOPPINGS_VEGETARIAN = ["green peppers", "tomato", "spinach"]
Then when you need to expand for half-and-half:
def make_standard_half_and_half(left_toppings, right_toppings):
    make_pizza({**STANDARD_PIZZA, "left_toppings": left_toppings, "right_toppings": right_toppings})
make_standard_half_and_half(TOPPINGS_HAWAIIAN, TOPPINGS_VEGETARIAN)
This gives you both low level and high level (convenience) interfaces to pizza generation, with none of the silly class complexity or function explosion.
I architected a React.js framework that needed to exist in our existing portal environment and play well with all of the other frameworks and scripts. My solution was tightly coupling bundles of code for widgets deployed on my platform. I have conventions all devs need to follow and it does result in not very dry code.
The benefit is that everyone can work independently and not affect each other. Testing is easier to do as there are less logic paths. Performance is still optimized with code-splitting, so the extra code really doesn't affect performance.
Whenever people try to create a DRY one-size-fits-all solution, I find them very inflexible and prone to breaking. Add to that, they are generally poorly documented, so making changes can be very stressful.
In many cases, thinking about possible future features/extensions was one of the most limiting, complicating and, in the end, irrelevant approaches out there, because it usually turned out that the future was different than it had been imagined. Conversely, taking into account just what you know at the time of writing led to easier adjustments, because the code was much clearer to understand.
DRY is a great technique to organize your code; you just need to think about the structure first. In this article, the problem was that he wanted to bend a method for 2 pizzas into many other different types of pizzas, so his pizza architecture changed so vastly that it was no longer possible to describe it with just the initial idea of pizza.
If there's something getting in the way of DRY it's probably the biggest problem you have. However, that doesn't mean you have the ability or control to fix it completely in the short term, but the arc of history bends towards DRY I think.
In this specific article, the first example seems better the DRY way to me. The author seems to suggest that rewriting it later is worse than repeating the logic everywhere, which sounds very fragile. If you can't confidently rewrite an API later then you're doomed to repeat yourself until it all collapses. You could make a reasonable argument that the developer who fulfils the short term requirements and has found a new job by the time it all collapses would have a more lucrative career, but I doubt you could argue it's better software.
> Copying and pasting a few lines of code takes almost zero thought and no time. Find and replace are very good at finding repeating things later if we start to care
Find-and-replace is unfortunately not really adequate for finding repeated code, in my experience. There must be much better tools out there.
Yes, I actually think OOP is not bad :)
But I disagree with their conclusion:
That the goal is to send a post with a single JSON object and marking this task as:
> That is a very, very simple thing to do
That is a very simple thing to do if you think that you will write this code once and never have to change it to fit some new requirements.
But probably there will be changes either from your own business or because the API will change thus the task becomes:
<< How can I implement sending a post request that will follow a body request format and create a code that is simple to understand and _easy to change_ >>
And as right now when you write this code you cannot know what kind of change will come in the future the best way to move forward is to write small functions with very few conditions and open to extensions.
Thus repeating that code there is not a good solution. What if the API will request to add any new key in the payload? And those two methods (def make_hawaiian_pizza and make_pepperoni_pizza) are not in the same file and the one doing the implementation is not the current author to remember "ahh the code is duplicated so I have to change it in multiple places"?
Anyhow there are cases when duplication is good, but when composing the payload for a request is not one of them :) IMHO.
Let me add one more thought: the structure of code tends to be duplicated in the future. So if you choose not to DRY, keep in mind that people who write code after you will tend to make the same choice. They will look at what you wrote and then follow a similar structure.
So don't DRY, but make sure you do this in a place where you will be OK with other people increasing the amount of duplicate code.
DRY is a really solid principle (pun intended). If you have to define the payload schema for an API yourself (i.e. the API provider doesn't supply a library for you) then you really should define that in one and only one place. It doesn't matter that today you only want a "handful" of hard coded JSON dicts. That path quickly leads to so many headaches and run time errors.
Implementing the API schema in one and only one place means you limit the sources of bugs for all code that deals with it. You only have to test that functionality in the one place it is defined, and if it changes you don't have to chase down hard coded JSON all over your codebase, etc.
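One way to picture a single-place schema, as a sketch only: `REQUIRED_KEYS` and `build_pizza_payload` are hypothetical, and the key list is guessed from the article's example payload rather than from any documented API.

```python
# The only place in the codebase that knows the payload's shape.
# Key names are assumed from the article's example, not from a real API spec.
REQUIRED_KEYS = ("crust", "sauce", "cheese", "toppings")


def build_pizza_payload(**fields):
    """Return a payload dict, rejecting missing or unknown keys."""
    missing = [key for key in REQUIRED_KEYS if key not in fields]
    unknown = [key for key in fields if key not in REQUIRED_KEYS]
    if missing or unknown:
        raise ValueError(f"missing={missing}, unknown={unknown}")
    return dict(fields)


payload = build_pizza_payload(
    crust="thin", sauce="tomato", cheese="regular", toppings=["ham", "pineapple"]
)
```

If the API adds or renames a key, the change is made in `REQUIRED_KEYS`, and every caller that builds a stale payload fails loudly instead of sending bad JSON.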
This post basically amounts to "I want to write lazy bad code and DRY tells me not to."
As you gather experience you can recognise these patterns more easily and develop a stronger intuition for when you should pursue a DRY approach or just leave things as they are. You might even choose to make two things even more similar so that they feel more familiar (i.e. choosing to be even less DRY!)
The point of DRY is that when you need to change something, you should only have to change it once (a common code smell that comes about when not employing DRY is "Shotgun Surgery"). If the rest of your architecture is broken, DRY is not going to magically save it. That should be obvious.
But this concern here with writing a make_pizza() function:
> The problem is that these two pizzas just happen to have the same crust, sauce and cheese. Had we started out with two pizza types that have different crust/sauce/cheese, we never would have made this refactor.
You can solve for this by adding parameters with default values for all of those things. Use the defaults for the most common use cases, but of course for pizzas with different crusts and such, you can override the default.
Not all languages support default parameter values, it's true. And of course, there is a level of complexity at which this breaks down.
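In Python, where default parameter values are available, the idea could look something like this. It is a sketch that returns the payload rather than posting it, with the defaults taken from the article's thin/tomato/regular example:

```python
def make_pizza(toppings, crust="thin", sauce="tomato", cheese="regular"):
    """Build a pizza payload; the common case needs only the toppings."""
    return {
        "crust": crust,
        "sauce": sauce,
        "cheese": cheese,
        "toppings": list(toppings),
    }


# Common case: rely on the defaults.
hawaiian = make_pizza(["ham", "pineapple"])

# Less common case: override only what differs.
deep_dish = make_pizza(["sausage"], crust="thick")
```

This holds up while the variations are a handful of independent options; once the parameters start interacting, the function signature becomes the complexity the comment warns about.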
It’s really easy to make assumptions about what you are going to need later that turn out to be completely unfounded (or even - years later there is no “many”, just the one or two usages you already have).
And I think folks shouldn’t freak out over a little bit of duplication, as long as it doesn’t get out of hand in the codebase, and you make sure to come back to refactor later when you do have many common usecases.
In all fairness, the best programming principle is: just like with software licenses, know what to use and when.
I also think the title of "don't obsess" covers the intent better. In other words, it's perfectly OK to write DRY code, but don't obsess over making all code DRY all the time at the expense of readability.
Introducing unnecessary complexity is, by definition, unnecessary. But we shouldn't be introducing complexity because DRY tells us. We should introduce complexity - some, but not more than needed - because some day a developer will know to update 2 of these 6 line snippets, but won't know about the third.
> "Keep it simple, silly", "keep it short and simple", "keep it short and sweet", "keep it simple and straightforward", "keep it small and simple", "keep it simple, soldier", "keep it simple, sailor", or "keep it sweet and simple".
It's true that often, complexity is praised...
What I really like about rule of 3, vs rule of 2, is that it allows more time to go by that may lead to the two pieces of code no longer being identical as requirements change. Which would either remove the need for the abstraction or allow for a more accurate abstraction.
or indicate a bug or nothing, because the differences aren't in logic, but e.g. in variable names.
^ perhaps voluntarily, or because they happen not to find the duplication, or don't search for it in the first place.
The more time passes by the higher the risk.
Yes, sometimes you may discover that you prematurely DRY'ed the code, and it was just accidentally similar. Easy, you just un-DRY the code. This is a trivial operation. In an IDE it might be a single keyboard shortcut. Going the other way is a difficult and error prone process.
In this case, unit tests are a very reasonable place to "repeat yourself", so you don't have to figure out what it's actually doing. Seeing the code all in place makes life easier.
This article is the result of someone exploring a new idea and cussing it out because he has to change his ways. I've been there countless times.
DRY is not overrated. DRY is a time saver in the long run. Why is this even on top of the 1st page, how new are you to development anyway?
Sorry but I'm really getting pissed off by people wasting my time with useless articles lately. It's getting out of hand.
For one thing, pizza recipes are either dynamically built by the customer or they are just fixed recipes.
Having individual functions for different predefined pizzas is not really dry enough for me.
I would have a pizzas.yaml file, and a get_pizza_recipe("id"). Maybe I'd even read pizza data from an excel spreadsheet directly for ease of editing and sharing by management if needed.
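A rough sketch of that data-driven approach, assuming Python. To stay within the standard library it uses an inline JSON string as a stand-in for the contents of pizzas.yaml; the recipe fields and values are made up for illustration.

```python
import json

# Stand-in for the contents of a pizzas.yaml file (JSON keeps this stdlib-only).
PIZZA_DATA = """
{
  "pepperoni": {"crust": "thin", "sauce": "tomato", "cheese": "mozzarella",
                "toppings": ["pepperoni"]},
  "hawaiian":  {"crust": "thin", "sauce": "tomato", "cheese": "mozzarella",
                "toppings": ["ham", "pineapple"]}
}
"""

_RECIPES = json.loads(PIZZA_DATA)


def get_pizza_recipe(pizza_id):
    """Look up a predefined recipe by id; raise KeyError for unknown ids."""
    try:
        return _RECIPES[pizza_id]
    except KeyError:
        raise KeyError(f"unknown pizza id: {pizza_id!r}") from None
```

Adding a pizza then means editing data rather than writing another function.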
That being said, I always allow myself to repeat code until a good enough pattern emerges from that repetition, and then I refactor. Having the same code twice is not always sufficient to reveal what is the right refactor to do, if any.
Not every problem is the same and not every pattern could be used to solve every problem.
If you have two features that have N parts in common, and you are certain that they will never diverge for feature-specific customization or special cases, then DRY is probably a good idea
If they may diverge at some point, then structuring the code in anticipation of that divergence is a good idea, else you end up with a messy DRY implementation that inevitably has to fork
Say you have two templates (web pages). They are conceptually independent and serve two different business purposes. Yet in terms of their structure/content/whichever, they have about 20% in common.
Somebody obsessed with DRY would now elevate that 20% into some reusable module, after which both templates use it and the repetition is gone. Feels clean.
In reality, you didn't solve a real problem whilst you created a new one. Now individuals/teams cannot independently edit these templates as they need to understand and check the dependency tree. It no longer is simple, there's no peace of mind.
Next, inevitably somebody is going to request changes impacting that 20% and before you know it and after cutting lots of corners, you end up with this freak component that changes output based on some flag.
It's taken me 20 years to come to this conclusion: the negative effects of (too much) DRY (it increases complexity) show up in every single project and make code harder to understand and change. Meanwhile, the negative effects of allowing (some) repetition are mostly theoretical and more often than not a benefit, not a negative.
I mean it. This is coming from an ex-DRY fan boy. The DRY principle makes us eager to connect dots that really aren't connected and shouldn't be connected.
I believe it has a potential to be a great alternative to pytorch.
I love watching GeoHot's Twitch streams as he goes to the extreme to simplify the codebase, and the end result is amazing.
Being dry about state is actually really useful (essential!), where state can be derived it should be, rather than stored as a new variable. Being dry about code is often less useful in my day to day coding, but I’m not a library designer I just make apps /2c
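A small sketch of "derive, don't store", assuming Python. The Order class and its fields are hypothetical; prices are in integer cents to keep the arithmetic exact.

```python
from dataclasses import dataclass, field


@dataclass
class Order:
    # Only the line-item prices (in cents) are stored as state.
    item_prices: list = field(default_factory=list)

    @property
    def total(self):
        # Derived on demand, so it cannot drift out of sync with the items.
        return sum(self.item_prices)


order = Order([1200, 300])
order.item_prices.append(250)  # no separate "total" variable to remember to update
```

Had total been a stored attribute, every code path that touches item_prices would also need to update it.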
default_pizza()
    .with_crust(Crust::Cheesy)
    .with_sauce(Sauce::Garlic)
    .add_topping(Topping::ExtraCheese)
    .cook()
I'm not going to weigh in on the DRY stuff because it's being discussed to death. I just liked thinking about how I would approach this problem.
Turns out - no, it isn't!
I think DRY and KISS are probably the most important and most misunderstood principles by far. Why? Because they seem trivial at first sight, but really are not.
Not every repetition should be DRYed (dry those which pose a risk to integrity) and „simple“ is not the same as „easy“ or „familiar“!
Here's my advice: don't refactor when you repeat yourself twice, refactor when you repeat yourself _three_ times.
Having one chance to copy-paste before DRY'ing your code has been one of my most treasured coding tricks, it'll save you so much time and premature refactoring.
If all your API does is return two pizza description jsons then keep it that way. If your client is a pizza delivery company and your api is supposed to allow definition and customization of pizza recipes, then you better take your ass to DRY town. Don’t blame the principle when you can’t understand what it is that you’re abstracting.
I have time and again gained enormous benefit by pursuing DRY principle to its absolute core. 20x speed and code complexity optimizations, making entire teams obsolete, etc. The most important point is to make sure that your abstractions absolutely match the fundamental principles of the concept you’re trying to represent. No matter how verbose you think it’s getting it’s totally worth it if this is your bread and butter.
ref: https://twitter.com/id_aa_carmack/status/753745532619665408
Is it, though?
And yes, I would totally just make it so the function detects whether I am passing in an object or an array of objects and respond accordingly.
I feel like I got this pattern from jQuery or something. Seems very normal for a good library.
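A sketch of that "one object or a list of objects" pattern, assuming Python and a hypothetical order shape with size and style keys. The function normalises its input up front and always returns a list.

```python
def make_pizzas(orders):
    """Accept one order dict or a list of order dicts; always return a list."""
    if isinstance(orders, dict):
        # A single order was passed: wrap it so the rest handles one shape only.
        orders = [orders]
    return [f"{order['size']} {order['style']} pizza" for order in orders]
```

Returning a list in both cases is a design choice made here to keep callers simple; mirroring the input shape in the return value is another option.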
Repetition creates symmetrical cases. As they say in German, 'einmal ist keinmal' -- once is never. Overly DRY code is incredibly non-educational.
Why?
Because you learn by comparing and contrasting -- if you can't compare, you can't contrast, and therefore, you cannot learn.
The problem with analogies is that they are often bad. You don't specify that you want tomato sauce and regular cheese when you order your pizza because that's the default.
The problem with DRY is that the cost of it being wrong in the future often isn’t accounted for.
Copy, paste, search, replace is underrated.
That said, it’s a balancing act. The right generalisations are great.
def make_pepperoni_pizza():
    make_pizza(["pepperoni"])
def make_hawaiian_pizza():
    make_pizza(["pepperoni"])
Separating it out into a function can obfuscate things making it harder to do that.
If a piece of code is duplicated thrice, it's ok. If a piece of code is duplicated four times, then you must extract it.
Partial application might make the first example a little less ridiculous, for instance.
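One way partial application could look, sketched in Python with functools.partial. The make_pizza body here is an assumed stand-in that returns a description, and the hawaiian toppings are illustrative.

```python
from functools import partial


def make_pizza(toppings):
    # Assumed stand-in for the article's make_pizza: returns a description dict.
    return {
        "crust": "thin",
        "sauce": "tomato",
        "cheese": "mozzarella",
        "toppings": list(toppings),
    }


# Each named pizza is make_pizza with its toppings argument already bound.
make_pepperoni_pizza = partial(make_pizza, ["pepperoni"])
make_hawaiian_pizza = partial(make_pizza, ["ham", "pineapple"])
```

The wrappers become one-line bindings instead of full function definitions, which makes it more obvious that they differ only in data.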
DRY code is a good value, but it is not an all important value. It's one of many values, and must be kept in balance.
But yeah, 100% dry code that is also practical to maintain is also a fucking myth and not grounded in reality.
I'd rather read the gunzipped code.
What a waste of a click.
If code is duplicated twice, I'm ok with that. If it's duplicated three times, then it's time to refactor.
Tldr if you're making code dry and you insist that you make code dry across feature boundaries then for the love of god make unit tests for those functions. Or keep your functions dry and your features wet.
I stopped reading there. DRY creates maintainability, not necessarily reusability.
They were awesome, but all the code examples were DRY to the max, it was quite funny.
Proper DRY at scale requires types that make sense and are easy to think about (you have to invent and document them even in dynamic langs ...that's why Typescript's a thing and so successful). You can't have DRY that doesn't slow you down and cause bugs without proper f types!
Eg. a sane solution to the authors' problem when the requirement for split pizza came would be:
- rename make_pizza(toppings: dict) to make_pizza_part(toppings: dict)
- implement a new make_pizza(toppings: list[dict]) calling make_pizza_part(toppings: dict) - and here you've changed the type (important!), so you won't miss any unchanged calls to it: your tools will yell at you (ideally at build/compile/commit), or at worst at runtime, but with an error that is easy to interpret even from logs
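The two steps above might look roughly like this in Python (3.9+ for the list[dict] annotation). The return shapes and error messages are assumptions for illustration; a static type checker would flag stale call sites earlier, and the isinstance checks are the "at worst at runtime" fallback.

```python
def make_pizza_part(toppings: dict) -> dict:
    """Build one section of a pizza from a {topping: amount} mapping."""
    if not isinstance(toppings, dict):
        raise TypeError(
            f"make_pizza_part expects a dict, got {type(toppings).__name__}"
        )
    return {"toppings": dict(toppings)}


def make_pizza(toppings: list[dict]) -> dict:
    """Build a whole pizza from a list of per-part topping mappings."""
    if not isinstance(toppings, list):
        # Old call sites that still pass a single dict fail loudly here.
        raise TypeError(
            f"make_pizza expects a list of dicts, got {type(toppings).__name__}"
        )
    return {"parts": [make_pizza_part(part) for part in toppings]}
```

A call such as make_pizza({"ham": 1}) left over from before the change raises a TypeError naming the expected type, rather than silently producing a wrong pizza.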
DRY is fine if done in the context of proper software engineering practices and tools.
Now being non-sloppy and following solid practices has a cost, and you might want to avoid it sometimes - in those cases do less DRY rather than crappy DRY!
(Whole languages are built around the OP's philosophy, eg. Go, but they are explicitly engineered to lower cost and defects in large corporate orgs! Randomly choosing to follow this in a project with 1-3 people of adequate skills and limited scope will just unnecessarily make that project have 4x the code, 4x the bugs, and 4x the cost for zero benefit.)
No, DRY doesn't mean that you should create classes just to prove your (invalid) point.
And if it’s “just going to be used once” who really cares how it’s written other than “quickly and correctly”? And sadly, too many things aren’t just used once.
Here is the pizza cost optimizer from Pizzatool, written in object oriented NeWS PostScript, which checks all of the pre-defined base pizza styles and selects the "best" combination of style + extra toppings, ostensibly to save the user some money.
It's actually a dark pattern, because it's biased towards selecting higher level pizzas instead of the least expensive pizza. But at least the dark pattern is documented:
"Figure out the cost of the pizza, were we to order it as this style, and remember the style as the best match if it pleases us. The definition of pleasing us is biased towards matching higher level complex pizza styles, rather than economical lower level pizzas with extra toppings. This is the kick-back to Tony&Alba's for all that free beer."
The Story of Sun Microsystems PizzaTool How I accidentally ordered my first pizza over the internet:
https://medium.com/@donhopkins/the-story-of-sun-microsystems...
Tony and Alba's Pizza and Pasta, Mountain View:
https://www.yelp.com/biz/tony-and-albas-pizza-and-pasta-moun...
PizzaTool Source Code:
https://www.donhopkins.com/home/archive/NeWS/pizzatool.txt
% Calculate the cost of this pizza.
%
/updatecost { % - => -
  10 dict begin % localdict
    /TheBest /defaultstyle ClassStyle send def
    /TheStyle null def
    /TheTopping null def
    /TheBestCost 99 def
    /TheBestExtras 0 def
    % For each and every pizza style in the universe:
    /styles ClassStyle send { % forall: % style
      /TheStyle exch def %
      % Ask this style for its list of standard toppings.
      /TheToppings /toppings TheStyle send def
      % Is every topping from this style on our pizza?
      true % true
      TheToppings { % forall: % true topping
        Toppings exch arraycontains? not { % if: % true
          % Oops, this topping's not on the pizza. No dice.
          pop false exit % false
        } if % true
      } forall % true|false
      { % if: all the toppings of the style were on our pizza:
        %
        % Make an array of our pizza toppings that aren't in the style.
        /ExtraToppings [
          Toppings { % ... topping
            % Is this topping included in the style? Then toss it.
            TheToppings 1 index arraycontains? { % ... topping
              pop % ...
            } if
          } forall
        ] store %
        % Figure out the cost of the pizza,
        % were we to order it as this style,
        % and remember the style as the best match if it pleases us.
        % The definition of pleasing us is biased towards matching
        % higher level complex pizza styles, rather than economical
        % lower level pizzas with extra toppings.
        % This is the kick-back to Tony&Alba's for all that free beer.
        PizzaSize /pizzasizeindex self send % sizeindex
        ExtraToppings length % sizeindex extras
        /extraprice TheStyle send % $
        dup % $ $
        ExtraToppings length % $ extras
        /extras TheStyle send sub % $ $ extras'
        1 le { .9 mul } if % $ biased$
        TheBestCost le { % ifelse: % $
          % Hey this is the best match so far, let's not forget it!
          /TheBestCost exch store %
          /TheBest TheStyle store
          /TheBestExtras
            ExtraToppings length /extras TheBest send sub
          store
        } { pop } ifelse %
      } if %
    } forall %
    % Set the window footers of the pizza topping panel.
    % The left footer displays the name of the pizza style,
    % and the right footer displays a message
    % telling the user to choose more toppings,
    % or the number of extra toppings,
    % or nothing at all.
    TheBestExtras dup 0 lt { % ifelse: % extras
      neg dup 1 eq { () } { (s) } ifelse % extras (plural?)
      exch (Choose % more topping%!) sprintf % (message)
    } { % else: % extras
      dup 0 ne { % ifelse:
        dup 1 eq { () } { (s) } ifelse % extras (plural?)
        exch (With % extra topping%.) sprintf % (message)
      } { % else: % extras
        pop nullstring % ()
      } ifelse
    } ifelse % (left footer)
    /name TheBest send exch % (left) (right)
    /setfooter ToppingWindow send %
    % Remember the price of this pizza in dollars rounded to cents,
    % and calculate its string value.
    TheBestCost % $
    Fraction mul
    100 mul round 100 div
    /Price 1 index store
    dup 100 mul round cvi 100 mod % $ cents
    exch floor cvi % cents dollars
    1 index 10 lt { (%.0%) } { (%.%) } ifelse % cents dollars fmt
    sprintf % (price)
    % Set the value of the costfield and totalfield labels to
    % the price string.
    dup /setvalue costfield send
    /setvalue totalfield send %
    % Set the value of the stylevalue label to the name of the best style,
    % and set the stylemenu value to the index of that name in the list of
    % pizza styles. (The stylemenu is an exclusive settings menu.)
    /name TheBest send % name
    dup /setvalue stylevalue send
    PizzaStyleNames exch arrayindex { % index
      [exch] /setvalue stylemenu send %
    } if %
    % Remember the best match pizza style.
    /Style TheBest store
  end % localdict
} def
def generate_payload(crust, sauce, cheese, toppings):