> DRY does NOT lead to over-complicating things.
That is not true. I dive into foreign code bases a lot, and DRYness is actually a significant complicating factor in understanding code, because you're constantly jumping around (physically, to different files, or just a few screens away in the same file). This is inherent to every use of it, not just to situations where it's applied in a complicated way.
This sounds dumb, but it simply is much harder to keep context about what's going on when you can't refer back to the code because it isn't on the same screen, or one short scroll above or below where you currently are.
That obviously doesn't mean you should leave copy-pasted versions of the same code all over your code base. But it's important to recognize that refactoring that code into something common that gets called from multiple places is not something you get for free: it's an active trade-off, one you usually make to prevent bugs (changing one code location and not the other) or plain code bloat. In practice this matters most when you suspect something might be repeated in the future but you're not sure. IMO: just don't factor it out; leave it there, in place, in the code.
`make_pizza(["pepperoni"])`
What does `make_pizza()` do? It could be a lot or it could be a little. It could have side-effects or not. Now I have to read another function to understand it, rather than easily skimming the ~four lines of code that I would have to repeat.
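To make the complaint concrete, here is one hypothetical body that `make_pizza()` could be hiding; everything in it is invented for illustration, and nothing at the call site distinguishes it from a pure four-line helper:

```python
oven_log = []  # module-level state: a side effect you can't see from the name

def make_pizza(toppings):
    pizza = {"base": "thin", "toppings": list(toppings), "baked": False}
    oven_log.append(pizza)  # hidden side effect behind an innocent-looking call
    pizza["baked"] = True
    return pizza

pizza = make_pizza(["pepperoni"])
```

Whether this mutates global state or not, the call `make_pizza(["pepperoni"])` reads exactly the same, which is the point being made.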
I think the article fails to show particularly problematic examples of DRY. E.g. merging two roughly similar functions and adding a conditional for the non-shared code paths. *shudders*
This is not a problem of DRY. This is a problem of wrong abstraction and naming. If the function is just four lines, it could easily be named `make_and_cook_pizza`. In the alternative scenario where those four lines are copy pasted all over the place, one is never sure if they are exactly the same or have little tweaks in one instance or the other. Therefore, one has to be careful of the details, which is much harder than navigating to function definition, because in this case you cannot navigate to other instances of the code.
The code had test coverage, but the test confirmed that it produced the wrong result. I had to fix the test too.
however... real software doesn't work like this. the abstractions that work that way exist for a select few very well understood problems where a consensus has developed long before you're looking at any code.
math libraries would be a typical example. you really don't need to know how two matrices are multiplied if you know the sort of black box properties of a matrix.
but the functions, classes, and other DRY abstractions that you encounter constantly in everyday code, even when they are functionally well abstracted (meaning they do an isolated job and their inputs and outputs are well defined), and even for simple problems, are typically complex enough that learning their abstract properties can take the same level of difficulty and time investment as learning the implementation itself, on top of practical factors like a lack of documentation.
this is also why DRYness as a complicating factor really doesn't factor in once the abstracted code does something so complex that there is no way you could even attempt to understand it in a reasonable amount of time. like implementing a complex algorithm, or simply just doing something that touches too many lines of code. in this case you are left to study the abstract properties of that function or module anyways.
```
def make_string_filename(s):
    # four lines of regex and replace magic
```
so that we have code like
```
file_src = make_string_filename(object_name)
file_dst = make_string_filename(object_name_2)
```
which is much more understandable than eight lines of regex magic where you don't even know what the regex is doing. The problem of not knowing what it does, or whether it has side effects, is more a problem of naming and documentation than of DRY. Even then, it's still better than repeating the code all over, simply because once you've read and understood the function, you don't need to go back. If the code is repeated all over, you need to read it again each time just to recognize that it's the same piece of code.
additionally, the function should be stateless and have no side effects ;)
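For concreteness, a minimal sketch of what such a helper could look like; the actual sanitization rules here are my assumption, not something from the thread:

```python
import re

def make_string_filename(s):
    """Stateless, side-effect-free filename sanitizer (rules assumed)."""
    s = s.strip().lower()
    s = re.sub(r"[^\w.-]+", "_", s)  # replace any run of unsafe characters with "_"
    s = re.sub(r"_+", "_", s)        # collapse repeated underscores
    return s.strip("_")

file_src = make_string_filename("My Report.txt")  # "my_report.txt"
```

The point stands regardless of the exact rules: the name tells the reader what the regex magic is for, without the reader parsing it.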
```
def make_string_filename(s, style="new"):
    # 2 lines of shared magic
    if style == "old":
        # 2 lines of original magic
    elif style == "new":
        # different 2 lines of magic
```
When you get here, two totally separate `make_string_filenames()`, each private to the area of code they're relevant to, would be better.
I’ve seen this again and again in the field and I wholeheartedly agree with the sentiment in the OP. IMHO different code paths should only share code if there is good reason to believe that the code will be identical forever.
So now you've dumped it down to an interface with a default implementation which calls the create_dough, add_toppings, bake_pizza interfaces in order, each of which are either passed in callbacks or discovered through reflection.
We can even sprinkle in some custom DSL to "abstract away" common step like putting the product into the oven correctly!
Juniors will never understand when, why, and what is effectively executed at runtime. Honestly, at this point I enjoy working with this kind of code. It has such high entertainment value, and I get paid by the hour, so whatever.
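A caricature of that style in code; every name here is invented for illustration:

```python
class PizzaMaker:
    """Over-abstracted on purpose: every step is a pluggable callback."""

    def __init__(self, create_dough=None, add_toppings=None, bake_pizza=None):
        # "Reflection": fall back to looking each step up on the instance by name.
        self.steps = [
            create_dough or getattr(self, "create_dough"),
            add_toppings or getattr(self, "add_toppings"),
            bake_pizza or getattr(self, "bake_pizza"),
        ]

    def create_dough(self, state):
        state["dough"] = "thin"
        return state

    def add_toppings(self, state):
        state.setdefault("toppings", []).append("cheese")
        return state

    def bake_pizza(self, state):
        state["baked"] = True
        return state

    def make(self):
        state = {}
        for step in self.steps:
            state = step(state)  # what actually runs depends on wiring, not on this file
        return state

result = PizzaMaker().make()
```

To know what `make()` does at runtime you have to trace how the instance was wired up, which is exactly the complaint.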
https://sandimetz.com/blog/2016/1/20/the-wrong-abstraction
Quote follows:
----
The strength of the reaction made me realize just how widespread and intractable the "wrong abstraction" problem is. I started asking questions and came to see the following pattern:
1. Programmer A sees duplication.
2. Programmer A extracts duplication and gives it a name. This creates a new abstraction. It could be a new method, or perhaps even a new class.
3. Programmer A replaces the duplication with the new abstraction. Ah, the code is perfect. Programmer A trots happily away.
4. Time passes.
5. A new requirement appears for which the current abstraction is almost perfect.
6. Programmer B gets tasked to implement this requirement. Programmer B feels honor-bound to retain the existing abstraction, but since it isn't exactly the same for every case, they alter the code to take a parameter, and then add logic to conditionally do the right thing based on the value of that parameter. What was once a universal abstraction now behaves differently for different cases.
7. Another new requirement arrives. Programmer X. Another additional parameter. Another new conditional. Loop until code becomes incomprehensible.
8. You appear in the story about here, and your life takes a dramatic turn for the worse.
Existing code exerts a powerful influence. Its very presence argues that it is both correct and necessary. We know that code represents effort expended, and we are very motivated to preserve the value of this effort. And, unfortunately, the sad truth is that the more complicated and incomprehensible the code, i.e. the deeper the investment in creating it, the more we feel pressure to retain it (the "sunk cost fallacy"). It's as if our unconscious tells us "Goodness, that's so confusing, it must have taken ages to get right. Surely it's really, really important. It would be a sin to let all that effort go to waste."
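Steps 5 through 7 tend to look like this in code (a hypothetical example, not from the post):

```python
# Step 2: Programmer A extracts the duplication into one clean function.
def format_greeting_v1(name):
    return f"Hello, {name}!"

# Steps 5-7: each "almost perfect" requirement adds a parameter and a branch.
def format_greeting(name, formal=False, locale="en", shout=False):
    if locale == "fr":
        text = f"Bonjour, {name}!"
    elif formal:
        text = f"Dear {name},"
    else:
        text = f"Hello, {name}!"
    if shout:
        text = text.upper()
    return text  # the "universal" abstraction now behaves differently per caller
```

Each individual parameter looked cheaper than breaking the abstraction; the sum is a function whose behavior you can only know by reading every branch.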
A functional style certainly helps. I get the pizza in my hand and don’t have to worry that anyone left the oven on.
You can't, unless it's in a standard library or a core dependency used by millions of people.
That's one of the reasons why functional code is generally easier to read. A lambda defined a few lines above whatever you're reading gives you the implementation details right there while still abstracting away duplicate code. It's the best of both worlds. People whose idea of "functional programming" is to import 30 external functions into a file and compose them into an abstract algorithm somewhere other than where they're defined write code that's just as shitty and unreadable as most Java code.
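A small Python illustration of the point; `taxed` and the numbers are made up, but the shape is the same: the deduplicated logic is visible right where it's used:

```python
orders = [("pizza", 12.0), ("salad", 7.5), ("pizza", 12.0)]

# The shared logic is a lambda defined a few lines above its use;
# no jump to another file to learn what "taxed" means.
taxed = lambda price: round(price * 1.08, 2)

total = sum(taxed(price) for _, price in orders)
```

The repetition is still factored out, but the reader never loses the context of the screen they're on.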
`makePizza :: PizzaType -> [Topping] -> IO Pizza`
Seems to carry all that information by just accepting a PizzaType symbol and a list of toppings, `IO` communicating the side effect.
Not a problem of DRY, but bad code structure.
Just keep the two functions and pull the shared code-path out
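Sticking with the earlier `make_string_filename` example, that structure might look like this; the split between shared and variant steps is my assumption:

```python
import re

def _sanitize(s):
    # the genuinely shared code path, pulled out once
    return re.sub(r"[^\w.-]+", "_", s.strip())

def make_filename_old(s):
    # legacy variant: keeps the original case
    return _sanitize(s)

def make_filename_new(s):
    # new variant: also lowercases
    return _sanitize(s).lower()
```

Each public function stays trivially readable, and the shared helper carries no `style` flag that every caller has to reason about.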
In these cases, factoring the code out may or may not be a good idea.
What makes it easier to understand a system is simplicity. I'd argue that DRY, deployed with a right strategic plan, usually does more to simplify things than does copy-paste.
But DRY is just a tool; to be useful, it requires some skill.
DRY only hits when you indeed repeat something.
If you factor out code for potential reuse that you don't know for certain will happen, it's premature optimization.
Abstractions have non-zero complexity costs.
And
Repeated code has non-zero complexity costs.
Why is this a hard concept?
It doesn't make DRY any less valid.
Generally you can invoke both reasons to do something but the underlying reasoning is always complexity.
You can't just slam the DRYness knob to 11 and expect it to always be better, any more than you can turn a reflow oven up to 900°C and expect it to be better, just because 380°C is better, for the specific PCB in question, than 250°C.
It also doesn't mean you can turn it off entirely, just as if you look at your charred results at 900°C you don't conclude that "heaters considered harmful".
Also, the problem is strongly multivariate and the many variables are not independent so the "right" setting for the DRYness knob is not necessarily the same depending on all sorts of things, technical and not, up to and including "what are we even trying to achieve?"
I couldn't agree more. Also, "code reuse" makes debugging significantly harder when you're trying to reverse engineer a code base. The breakpoints or printfs get triggered by other code paths, and you need to traverse stack frames to get a clue about what is going on.
Extra bonus points for fancy reflection so that you have no clue what is going on.
If you make everything as generic and reusable as possible from the beginning, you'll end up with messy code that has way too many options to set for every simple operation.
Increasing the distance between inputs and outputs increases complexity.
Reusable code isn't all that reusable when nobody understands it, or when things are so fragmented that people can't figure out how to operate the code base.
This isn't a rule. It's a moderation thing.
To note, a common effect of not DRYing functions is an increase in local code length.
In many code bases that have lived long enough, that means screens and screens of functions inside the module/class files. It is still easier to navigate than jumping between many files, but not by that much in practice (back/forward keyboard shortcuts go a long way toward alleviating this kind of pain).
Are you still using a VT100?