Ha! That's not a senior engineer. Senior engineers write the most simple-looking code that just works. In every rainy day scenario imaginable. The clever code writers aren't there yet.
https://www.willamette.edu/~fruehr/haskell/evolution.html
Make sure you don't miss the punchline, "Tenured professor".
It's the same with the progression of engineering seniority: increasing levels of cleverness and unnecessary sophistication, until you reach a point where you don't have anything to prove anymore, and you can feel comfortable writing the simplest and most elegant solution.
Here's the original:
http://www.ariel.com.au/jokes/The_Evolution_of_a_Programmer....
The punchline got me pretty good.
You nailed it really, senior engineers code is the most simple looking as generally they've picked the right abstraction for the problem.
Everyone is capable of making poor decisions and straight up logic errors. Senior devs sometimes more-so because we tend to get entrenched in a particular issue solo for longer periods.
A common fallacy is to assume authors of incomprehensible code will somehow be able to express themselves lucidly and clearly in comments.
Often, the "right abstraction" for a problem is not call/return based. So you get to choose between having the right abstraction, and code that is simple when viewed with that abstraction in mind, but "weird". Or alternatively choose the wrong but better-supported abstraction and have code that is needlessly complex but "straightforward".
I hate to agree with you, but "picking the right abstraction for the problem" elegantly expresses what I've been trying to tell people for the last few years now, but far less succinctly (I usually ramble on about "underlying data models" and "what this actually is in the real world, not just how we view it in our application")
The reason I hate to agree with you is that I just became very disappointed that this is a difficult skill to master. "Just use the right model and the code is easy, duh. Why aren't you doing this?" Well. Now I just feel like a jerk. I thought they just gave me the "senior" title because I was old.
It's unfortunate that one of the most important skills in the industry is so intangible and difficult to quantify. Even more difficult to teach.
Sooner or later someone will have to debug this code late at night. Don’t make it require brain cells.
Without comments, sometimes it's really really difficult to navigate the code. I have been adding more comments than ever: don't assume every line is obvious, write a comment to explain what the next few lines really do.
# base case: stop dividing when we find the largest square.
if width == height:
return width, height
else:
# otherwise, we know that we can break the land into several
# pieces, some are already square-sized, but there must be
# left over, which is the difference of the width and height,
# and can be divided again.
remain = abs(width - height)
return largest_square_plot(remain, min([width, height]))
^ This code is only for me to read so I didn't really care much about grammar... but a year from now I shouldn't have trouble understand the code in a minute or two.Some of my function/method has a pretty long docstring which may include explaining the rationale, and perhaps even some ascii diagrams. If you have trouble understand a piece of code after a few passes, that's a bad code. Also, use more newlines...
> I check every Github email I get; I make sure that I don’t get notified for everything that happens in the repo
Not sure about others, but I am tired of reading PR notifications in my mailbox. I don't know how kernel developers can live with this.
I have been thinking about just build a bot.
* receives PUSH from GitHub
* adds events to a queue
* notifies based on priority
* pings me once in a while to remind me that I have outstanding PRs to review (as reviewer and as author of the patch).
If someone needs me to review right away, he/she can reach out to me directly in chat.
If I'd been the one to write this function, I would have done it like:
def largest_square_plot(width, height):
"""Computes the largest square to tile a plot of the given width and height."""
# because we want the grid of squares to fit exactly
# the size of the squares needs to divide both width and height
# to get the largest square, we use the greatest common divisor
from math import gcd
square_size = gcd(width, height)
return (square_size, square_size)
... but now I realize that probably not everyone is intuitively familiar with the kind of math that involves the GCD, so your code is actually much more intuitive without that background.It suggests taking this code:
// square root of n with Newton-Raphson approximation
r = n / 2;
while ( abs( r - (n/r) ) > t ) {
r = 0.5 * ( r + (n/r) );
}
System.out.println( "r = " + r );
And refactoring it to this function: private double SquareRootApproximation(n) {
r = n / 2;
while ( abs( r - (n/r) ) > t ) {
r = 0.5 * ( r + (n/r) );
}
return r;
}
System.out.println( "r = " + SquareRootApproximation(r) );
I'm all for this refactoring, but something was lost in the process. What kind of square root approximation is being used? Does the algorithm have a name? What would I search for if I wanted to read more about it? That information was in the original comment.Joe's arguments are weak
* code should be readable <--- I agree
* good comments require good writers <---- not really, comments are for fellow programmers to read. If you want to document your APIs, yes, spend good time on it, and get feedback from your colleagues. Inline code comments are internal, so write in the most natural way possible. I've read "shit" and "fuck" before (see Firefox codebase, very funny).
* refactoring <--- I agree, but refactoring and documentation cannot be mutual exclusive. That's wrong.
* His example is way too simple. Try a more difficult approximation function.
Joe needs to realize that code produced at work is not meant for one single individual. If I leave, I want my co-workers and their future co-workers to have a good time navigate through codebase.
Use comments wisely, but don't avoid them! Adding 10 extra lines of comments to the file is better than a one-liner no one can understand. Let's not run a 100-line competition when we are writing code professionally. I shouldn't need to frustrate my reviewers and beg me to explain. Use newlines to make your code more readable as well.
I don't understand how the rest of your post relates to that, although I think it's an interesting point and following discussion.
On the topic of reviewing code with the author in mind, I'm not sure I agree at all with the linked article. Does it matter who wrote the code in any way? Good code is good, and bad code is bad. It may be a helpful hint to remember the author was a senior engineer (who may "know more" than you do), but is it really something to keep in mind the entire time you review?
(in case it needs to be stated, to date, I have reviewed about as much code as I have written, upwards of 100k lines, as have most of the other people on my team. We aren't amateurs, but it often seems like we're babies.)
Isn't this the point of code review? When I click accept on a code review, I am saying that I have looked over the change and believe that it is correct and okay. If I just arrive at the conclusion by saying "Joe wrote this, and I trust Joe" the there is no point in me reviewing it.
When it is unreadable and hard to see whether it works it is different thing, but then the complain in review should be some specific variant of "hard to read".
I review mostly for architectural compliance and "bad idioms" or code smells. There is difference between not like I would write it and badly written mess and many programmers confuse them.
A code review ensures that the non-functional quality of the code is high. I.e. that the code is understandable/maintainable by someone other than the programmer himself, that there are no anti-patterns or dangerous-but-correct usages of language features, that the implementation fits to the intent of the specification etc.
Pedantic is the wrong approach to technical debt though. Almost every aspect of technical debt is about trade-offs.
If what you're doing appears pedantic it should come accompanied by an explanation of why the trade off the developer took is not ideal.
>I also ask a lot of pretty dumb questions during a review because I want to make sure that my understanding of the requirements matches the understanding the other engineer had.
Your code should really come accompanied by tests which you can read and use to understand what the precise interpretation of those requirements was.
That is not objective fact across projects. That means either the process or culture is bad.
As a corollary, when I ask a question in code review, I usually don't want it to be answered there - I'd prefer it to be answered in the code, if need be using a comment.
This is why I hate implicit assumptions. If I need some special knowledge or some special assumption that is no where in the immediate vicinity of the code, then maybe your approach to the problem is flawed. Sure it'll work but good luck to future coders working with it
This discussion makes me wonder about more tightly connecting the code with past reviews. What if when reading some code, the user had not only access to the revision history, but also the review history around the code in question? Something along the lines of storing reviews in revision control metadata.
when posting the review, you try to eliminate as much of the cruft as possible.
when reviewing code, sit down, you're actually going to read it for understanding.
your mutual goal is to improve code and product quality. if after a few passes something seems wrong, or unclear you raise it. now the two of you get to have a pleasant discussion about what the right thing to do is. maybe some rework. maybe just a comment about unhanded error conditions. maybe nothing.
if someone raises an issue that not important either way, just change it to save time. and don't raise something unless you think its important. so, if you both think its important, you can have a discussion and reach consensus.
i think thats the real problem, that somewhere along the way the consensus culture got lost. that was a really important thing.
I would take the position that written code standards enforced by humans should not be allowed. Either a given standard is important, in which case you enforce it automatically, or it isn't, in which case you drop it. That frees up code review to focus on more important issues.
Write a document which describes how you want code to be reviewed, share it with the team and then lobby for everybody to agree upon it / sign it during retrospectives?
I think it's better to develop a strong rapport with members on your team and only join teams with strong shared values about what constitutes good code, but in lieu of that you could do worse than to draw up a shared written agreement about what constitutes good code (and iterate upon it).
or may be we should find a job where constructive criticism is part of the culture and you then leave the current job! Relatively easier to find where the change exists than investing effort to bring about a change.
Personally I find that non-idiomatic code can be a bit distracting when I'm trying to evaluate the code for its overall approach and architecture.
This has nothing to do with trust or any other personal matter. Everybody makes mistakes, everybody might misinterpret code they use and not every programmer has a complete understanding of the scope of the changes they are making. This happens for both senior and junior engineers. (Obviously one would expect this happening much less for seniors)
The only way minimising the amount of future issues is to have thorough, rigorous and pedantic code reviews. Another major benefit for this approach to code reviews is having more people on the team understand deeply the component you coded.
That’s funny; I thought you were going the exact opposite direction with that. I wish my code reviewers were pedantic and obsessed with correctness!
The few time’s I’ve had the pleasure of working with them (usually open source) were the times I’ve learned the most about programming in the shortest time, by far. Especially about readability and quality of comments.
If it's not immediately clear to a reviewer that your change is correct, then either your code isn't clear, you're not writing tests, your colleagues don't have sufficient understanding of the code base, or some (non-trivial) linear combination of the above.
The first two cases are something you can work on. If people don't understand your code, then try to write code that is more clear. If your colleagues are as talented as you say, then they should be able to understand your code without explanation. If you're not writing tests, then well, maybe there is just no hope.
If your team does not have sufficient understanding of the code base, then do communal code-review sessions where you pick parts of the system and read through it together. Or organize knowledge share sessions.
Ultimately, as Lenin said, 'trust is good, but control is better'. The whole point of code reviews is that you want to get extra eyes on your code. Trying to bypass that by asking your colleagues to blindly trust you is undermining that process and ultimately undermining the quality of your system.
That said, in the end, correctness is important and justifying and communicating your change to others is the purpose of a change request. If you aren't following define standards and idioms then you should get on board. If they aren't defined then they should be defined and reviewed to get everyone on the same page. If everyone is using code reviews to grandstand or show off their knowledge that's a cultural problem that needs to be addressed.
Why do organizations allow this? I realize that some platforms require their own languages (iOS, Android), but outside of that, just pick one or two and hold the line.
Thoughts based on experience:
Asymmetric information situation: newly hired smart engineer says that new subsystem should be written in <language-du-jour>. He says it is so much better than <dinosaur-languages>. Everyone on HN says it is so much better. You however, haven't had the time to try it out on a medium sized project to determine if this is true or the usual new language hype. New Engineer seems to know plenty about it and is insistent. Do you tell them no, or do you let them run? Well now you have N+1 languages. Iterate.
Some engineer in your organization develops something that turns out to be useful, as a back-burner project without official approval. They do that in whatever language they think will look good on their resume. Do you pay to have that thing re-written in one of the house languages or do you let it ride and add it to the mix. N+1 languages. Iterate.
And...in general good luck with not "allowing this" in the context of software developers. Cat herding and all that.
We are building a product for customers, not a playground or post-graduate program. There are legitimate reasons to add another language but they must be evaluated with the needs of the business in mind, these include long-term maintenance costs and hiring/training costs regarding new/esoteric skill sets, among others.
In a large system like Tumblr, they are likely highly distributed with many different subsystems running in different parts of their infrastructure, each with their own SLA and resilience requirements.
Their team is probably large enough that most engineers focus on only some of the subsystems, and communication between them is via defined interfaces.
Though having a diverse ecosystem fosters an openess of mind, instead of enforcing "One Metaphore To Rule Them All". You find yourself borrowing efficient patterns from other environments, as they eventually start cross-pollinating.
Not quite true... not all languages have a sweet spot in a production environment. Node.js isn't particularly excellent at anything, the attraction is mainly "I know JS, and I don't want to learn anything else right now", which is a terrible attitude for someone who wants to have a career in tech. Knowing more than one tool in the toolbox is a key skill, since there is no one-size-fits-all.
Jury is still out on Scala. It's a big language, but unclear if there's a production sweet spot or not compared to other JVM hosted languages.
The rest of the ones you listed have their definite sweet spots. There are tons of languages though, and most don't have one.
I've been in organizations where you have the Language/Platform X people over here and the Language/Platform Y people over there, and they duplicate work, don't collaborate, and their libraries don't interoperate. They have different hiring needs, and developers can't easily move between them. Ultimately, everybody with their pet languages moves on, and new people come into a very fractured environment and ask themselves WTF is this? Over the years, developers (especially new ones) are only a fraction as effective as they would be if the company had just paid the relatively small up-front price of nudging everyone into the same ecosystem.
And BTW, I have yet to encounter a problem that is hard to solve in my preferred language, but is way easier to solve in another. I think the key to that is to choose carefully what you're going to commit to.
I've had the opposite experience. Having a lot of different technology stacks increases the barrier to entry for anyone who might be interested in working on another part of the ecosystem. This results in increased siloing of people and information.
While I generally agree with you on principle, I also feel you need to be pragmatic too. It’s important to also use the right tool for the job as long as you have sufficient engineers who can support that alternative language/framework/platform. Sometimes a language can also be favored at one point, but lose favor later, and in some cases it doesn’t make sense priority wise to rewrite it because it still works.
You should use the right tool for the job. Python is good for AI stuff. Java is good for performance and maintainability. Go is interesting any might give Java a run for its money.
You should limit adding tools because they're cool. Do that for bow ties not code.
I refuse to remove 'extrenious' parenthesis that make the code more readable to junior devs who may not know the language specific order of evaluation in a logical expression.
You can effectively do this by checking that every nontrivial change has sufficient automated test coverage. This saves you from having to test changes yourself and saves future devs from having to go through your thought-process when they touch that code next.
Any tips on how to improve this?
Would a checklist help? Have a clear process on what to review first?
What I look for is:
1) What's the problem being solved? Does this look like a reasonable approach? Is the code pythonic (Obv: for python)?
2) What edge cases are there? Does this handle the important ones? Does it punt properly on the less important ones?
3) Look for a short list of bug classes that have come up in the project before that have lead to emergency patches. E.g. Decrefing, Checking mallocs, any exec sorts of things. (This is a clear application for a checklist)
4) Are there tests/documentation/other required fixtures and stuff?
5) Does the code generally match the style of the project?
1000) Code formating and whitespace and line wrapping and all that bikeshedding stuff.
Feel free to short circuit anywhere once it becomes clear that there's more work required.
Also, read a lot of code reviews. Just like reading a lot of code is helpful for becoming a better developer, reading a lot of reviews is helpful for becoming a better reviewer.
"Clear" and "clever" aren't in opposition, and likewise "verbose" and "understandable" aren't correlated.
I think this characterisation, and especially the example, shows a lowest-common-denominator straw man of "clever one-liners" which seems to miss the reason that some people like them. In particular, it seems to be bikeshedding about how to write branches. The author doesn't say what those "ten lines of verbose-but-understandable code" would be, but given the context I took it to mean "exactly the same solution, but written with intermediate variables or if/else blocks instead".
This seems like an analogous situation to https://wiki.haskell.org/Wadler's_Law where little thought is given to what the code means, more thought is given to how that meaning is encoded (e.g. ternaries vs branches) and religious crusades are dedicated to how those encodings are written down (tabs vs spaces, braces on same/new lines, etc.).
Note that even in this simple example there lurks a slightly more important issue which the author could have mentioned instead: nested ternaries involve boolean expressions; every boolean expression can be rewritten in a number of ways; some of those expressions are more clear and meaningful to a human than others. For example, `loggedIn && !isAdmin` seems pretty clear to me; playing around with truth tables, I found that `!(loggedIn -> isAdmin)` is apparently equivalent, but it seems rather cryptic to me. This is more obvious if intermediate variables are used, since they're easier to name if they're meaningful.
In any case, compressing code by encoding the same thing with different symbols doesn't make something "clever". It's a purely mechanical process which doesn't involve any insights into the domain.
To me, code is "clever" if it works by exploiting some non-obvious structure/pattern in the domain or system. For example, code which calculates a particular index/offset in a non-obvious way, based on knowledge about invariants in the data model. Another example would be using a language construct in a way which is unusual to a human, but has the perfect semantics for the desired behaviour (e.g. duff's device, exceptions for control flow, etc.).
Such "clever" code is often more terse than a "straightforward" alternative, but that's a side-effect of the "cleverness" (finding an existing thing which behaves like the thing we want) rather than the goal.
If the alternative to some "clever" code is "10 lines of verbose but understandable code" then it's probably not that clever; so it's probably a safe bet to go with the latter. The real issues with clever code are:
- Whether the pattern it relies on is robust or subject to change. Would it end up coupling components together, or complicate implementation changes in unrelated modules?
- How hard it is to understand. Even if it's non-obvious, can it be understood after a moment's pondering; or does it require working through a textbook and several research papers?
- Whether the insights it relies on are enlightening or incidental, i.e. the payoff gained from figuring it out. This is more important if it's harder to understand. Enlightening insights can change the way we understand the system/domain, which may have many benefits going forward. Incidental insights are one-off tricks that won't help us in the future.
- How difficult it would be to replace; or whether it's possible to replace at all.
This last point is what annoys me in naive "clever vs verbose" debates, and prompted this rant, since it's often assumed that the only difference is line count. To me, the best "clever" code isn't that which reduces its own line count; it's the code which removes problems entirely; i.e. where the alternative has caveats like "before calling, make sure to...", "you must manually free the resulting...", "watch out for race conditions with...", etc.
One example which comes to mind is some Javascript I wrote to estimated prices based on user-provided sliders and tick-boxes, and some formulas and constants which sales could edit in our CMS (basically, I had to implement a spreadsheet engine).
Recalculating after user input was pretty gnarly, since formulas could depend on each other in arbitrary ways, resulting in infinite loops and undefined variables when I tried to do it in a "straightforward" way. The "clever" solution I came up with was to evaluate formulas and values lazily: wrapping everything in thunks and using a memo table to turn exponential calculations into linear ones. It was small, simple and heavily-commented; but the team's unfamiliarity with concepts like lazy evaluation and memoising made it hard to get through code review.
Also, regarding "straightforward" or "verbose" code being "readable": it's certainly the case that any particular part of such code can be read and understood locally, but it can make the overall behaviour harder to understand. Just look at machine code: it's very verbose and straightforward: 'load address X into register A then add the value of register B', simple! Yet it's very hard to understand the "big picture" of what's going on. Making code more concise, either by simplifying it or at least abstracting away low-level, nitty-gritty details into well-named functions, can help with this.
When used well, "clever" code can reframe problems into a form which have very concise solutions; not because they've been code-golfed, but because there's so little left to say. This can mean the difference between a comprehensible system and a sprawling tangle of interfering patches. This may harm local reasoning in the short term, since it requires the reader to view things from that new perspective, when they may be expecting something else.
When used poorly, it results in things like nested ternaries, chasing conciseness without offering any deeper understanding of anything.
I have seen this many times and they are actually usually talented developers that are just not used to working in groups....
but what I have seen more often (back when I did code reviews)... is lazy copy and pasting or something analogous.
So i think it's a wrong way to review the code. You don't need the WHO but the WHY.
One of my pet peeves is the inability to solve a K-Map for 6 variables and do stuff based on the resulting boolean expression. Just because it is utterly unreadable. I've tried this with many reviewers but nope.