This was finally merged to the main branch today [2].
I suspect there are lots of other both open source and private projects that are also making the change now. This is a show of confidence in Black as the standard code formatter for Python.
0: https://github.com/django/deps/blob/main/accepted/0008-black...
SSort is currently used for several hundred kilobytes of python so I'm wary, but if I'm going to make a breaking change before 1.0 then I think this is likely to be it.
I tried it on one of my Django `admin.py` files and it created NameErrors.
class TestAdmin(admin.ModelAdmin):
list_filter = ("foo_method",)
def foo_method(self, obj):
return "something"
foo_method.short_description = "Foo method"
# It turned it into this:
class TestAdmin(admin.ModelAdmin):
list_filter = ("foo_method",)
# NameError
foo_method.short_description = "Foo method"
def foo_method(self, obj):
return "something"Which is preferable, and why?
def myfunc(): global globalvar str(globalvar)
globalvar='abc'
myfunc()
will be transfered to
globalvar='abc'
def myfunc(): global globalvar str(globalvar)
myfunc()
I understand why is it done but i dont want to have function definition block filled with this declaration of variables (which i do later) since it has no impact to my code and it makes is just a bit "cleaner". Dont tell me to not use global variables :D
What I don't much care for is reorder-python-imports, which I think is related to black (but don't quote me). For the sake of reducing merge conflicts it turns the innocuous
from typing import overload, List, Dict, Tuple, Option, Any
into
from typing import overload
from typing import List
from typing import Tuple
from typing import Option
from typing import Any
Ugh. Gross. Maybe I'm just lucky but I've never had a merge conflict due to an import line so the cure seems worse than the disease.
Edit: Just to be 100% clear: this is python-reorder-imports, not black. I thought they were related projects, though maybe I'm wrong. Regardless, black on its own won't reorder imports.
https://pycqa.github.io/isort/docs/configuration/black_compa...
$ cat foo.py
from x import a, b, d, e
from x import c as C
$ isort foo.py
Fixing /tmp/foo.py
$ cat foo.py
from x import a, b
from x import c as C
from x import d, e from typing import (
overload,
List,
etc...,
)
would seem more sensible to me. I know you can make isort do that, I guess maybe not black. from typing import (
overload,
)
is silly, but I don't want: -from typing import overload
+from typing import (
+ overload,
+ List,
+)
when all I actually did (semantically) was: + List,In any case, you can wrap those in parentheses, in which case black will just enforce its usual tuple formatting: single line if it fits; one line per item if not, with a trailing comma.
edit: I tried it on a long line with a backslash break, and black wrapped the imports in parentheses like I suggested above. I wonder what causes the behaviour you see on your end.
Alt-enter over the would-be imported term to add an import, ctrl-alt-O to autoformat / autoremove (aka optimise) redundant ones.
You can then turn on folding to not see those imports much via Prefs -> Editor -> General -> Code Folding
from typing import *
because I get annoyed at having to change my imports each time I need a new type in my type annotations.
But even now I have to adapt on screen shares when my coworkers are using dark themes before sunset. Can't imagine what that would be like with different code formatting. So the next next step is to display their desktop with my theme.
IOW collaboration tools still have a long way to go.
Ostensibly you could craft a future where what is on disk is not what the user is actually editing - a-la the virtual DOM. And on read/save the developer's preference is used to transform the syntax into their ideal shape. This is trivial in a Lisp, but not so easy in other languages.
If this system results in syntactically identical code [1] it should not matter if it’s displaying for you differently [2] if it means you can read or write around or in it more comfortably it’s just a hairstyle.
I was asked to familiarize myself with Replit the other day and it seemed the editor defaulted to two spaces for Python. Two spaces?! I changed it to four.
A friend joined my session and began to code with me, their editor was in the default two space indentation. It was madness.
[1] This seems like is a decent sized presumption across many languages and versions.
[2] This seems like an interesting AI problem, showing code structures you’ve never used in your style you’ve never defined.
I think that making editors do this is within the realms of feasibility. Most support auto-formatting to your preferred style so it doesn't feel like a leap for it to format to your preferred style but keep the file on disk the project owner's preferred style. I haven't looked extensively to see if this already exists though but we chatted about this at work as I was advocating for use of prettier on a front-end project!
Example here: https://bignerdranch.com/blog/git-smudge-and-clean-filters-m...
IMO the next next step is, as others have discussed on HN, getting your version control to store and abstract syntax tree. tree-sitter could make this easier nowadays, but I think it'd need more invasive changes in Git than just using the filters.
See this HN thread https://news.ycombinator.com/item?id=28670372
As a couple of examples, PHP has had a unified formatting standard since 2013 and Elixir has a formatter built into the language. Both languages need the formatter to be enabled by your IDE/CI and that's also the case for Black.
Python was always meant to look concise / beautiful... (MyPy has also made this trickier too)
An autoformatter removes 99% effort from formatting code, and that includes code actively being worked on. Autoformatters are incredibly useful.
A standardized format removes effort spent learning to read a new format. That's an hour per format at most.
I don't see any good reasons for an autoformatter to enforce a standard. A standard would work just as well if defined as a specific configuration.
Blacks value isn't autoformatter, it's preemptive discussion ender.
One hour my ass.
Are the mentioned issues resolved by now? E.g. the quadratic algorithm?
foo = (
spark
.read
.parquet(...)
.filter(...)
.withColumn(...)
)and turns it into
foo = spark.read.parquet(
...
).filter( ...
).withColumn( ...
)which feels harder to parse for me. I also never quite got on board with the trailing commas.
assert outputs.get("foo.bar.baz", "default") == pytest.approx(
time_recorder.time_taken, abs=0.0001
)
I get why it's done that, but I just don't think it helps humans read.
Part of the twisted beauty of PEP-008's narrow lines is that you're forced to extract (named) variables, or avoid overly indented code by extracting methods or applying higher level abstractions.In the last few years I find devs are happier to format and push to "sort that problem out", leaving the readability benefit of that thought process lost.
TL;DR writing readable code isn't just about getting the spaces and brackets right...
https://github.com/ipython/ipython/pull/12091/files
Unfortunately there isn’t support for it in GitHub or GitLab yet, but there’s at least a GitLab issue here requesting it:
To do this just highlight the block, right click, and choose Git > Show History for Selection.
The only real downside is you nuke your issue tracker at the same time.
> It would be nice if there was a kind of diffing algorithm that can diff code units syntactically across history.
There have been quite a few attempts at that though I've only seen them applied to resolving merge conflicts. It would be interesting to try them for blame too.
Probably best to just make a one time git user to do it.
- doesn't respect vertical space - sure, making the code fit on screen might be valuable (though the default width should be at least 120 characters, I mean we're in 2022 after all), but Black does it by blowing up the vertical space used by the code
- spurious changes in commits - if you happen to indent a block, Black will cause lines to break
- Black fails at its most basic premise - "avoiding manual code formatting" - because a trailing comma causes a list/function call to be split over lines regardless of width
This is fine with me--I think it makes sense to optimize for readability, and I can read a long vertical list of arguments a lot more readily than a long comma-delineated list.
> spurious changes in commits - if you happen to indent a block, Black will cause lines to break
Is this a generic argument against wrapping lines, or am I misunderstanding something?
> Black fails at its most basic premise - "avoiding manual code formatting" - because a trailing comma causes a list/function call to be split over lines regardless of width
I'm not following this either. If black automatically reformats your code over multiple lines, that doesn't suggest manual formatting. Maybe you're arguing that all code which produces a given AST should be formatted in the same way--this would be cool and I would agree, but black gets us 95% of the way there so to argue that it "fails" is to imply that "0%" and "<100%" are equivalent.
Even in 2022, some people don't have wide external monitors, sometimes like to view two files (or a diff) side-by-side, or need to use GitHub/BitBucket/etc. code viewer pages. Also, it's still difficult for humans to read long lines.
You cannot read things you can't see. If half a function is scrolled off the bottom of the screen because every function arg is on its own line .... its pretty annoying.
IMO, < 80 is ideal where possible with an absolute maximum of 99. I think Black's choice of 88 (plus maybe a little more in special cases) is quite good.
Yeah, this one drives me nuts too.
But I also despise long lines with a passion, I hate having to go to the right, and would much much rather scroll up and down with a consistent width, so that I can put multiple views next to each other.
[0] https://black.readthedocs.io/en/stable/the_black_code_style/...
Ah that's why `manage.py shell` now split json pasted on several lines, very annoying
The built in "facebook style" formating felt by far the most natural to me with the out of the box settings and no extra config.
Black gives generally nicer output, and also more predictable output because its folding algorithm is simpler. YAPF uses a global optimisation which makes it make very strange decisions sometimes. Black does too, but much less often.
There are also non-style problems with YAPF. It occasionally fails to produce stable output, i.e. yapf(yapf(x)) != yapf(x). In some cases it never stabilises - flip flopping between alternatives forever!
Finally it seems to have very bad worst case performance. On some long files it takes so long that we have to exclude them from formatting. Black has no issue.
In conclusion, don't use YAPF! Black is better in almost every way!
I figured yapf was not "new" which is why black won.
Starting about 5-6 years ago there was a push in the Python community to replace solved problems with new ones in what appears to me as chasing the JavaScript community.
Instead of consolidating on existing tools that worked well but had some rough edges to smooth out, numerous projects came about to reinvent the wheel.
...which is why I wish Black allowed more configuration. A team can often agree on a set of styles. Every team on the Python planet agreeing... now that's much harder
Why did your team not implement a style guide? Not following style is not working as a team and this needs to be addressed.
The advantage of a tool like Black is that it avoids that constant bikeshedding and the fact that it actually does the work for you puts the conversation in a different light because the option which is the least work is just letting Black format the code. Whatever you pick for style, you really want automatic formatting to avoid it seeming like a chore.
print(repr('some string'))
to print "some string"
instead of 'some string'
then that would remove the only hangup about Black that I have.For clarity, I'm hoping to open us discussion about how we're dealing with massive changesets like this that are difficult to review due chiefly to the breadth of it.
Update: Found it:
> How stable is Black’s style?
> Starting in 2022, the formatting output will be stable for the releases made in the same year
- The changeset is authored by a trusted committer but the committer's tools have been locally compromised.
- The public tool itself (e.g. black) has been compromised to automatically create vulnerabilities in difficult-to-review bits of code (a Ken Thompson hack).
My personal experience is that 1) in many cases you do benefit from taking a moment, going through your code and thinking about presentation. And 2) I find it not at all difficult to settle. A change either doesn't matter, then you just don't discuss it at all, or it is important, then you quickly agree on the best solution. (In the worst case, "best" means what the project lead finds prettier.) If you don't have a social mechanism to agree on something as basic as coding style, then your team probably has bigger problems.
I actually find robo-formated code annoying to read: Go code from a bloody beginner who doesn't know what they are doing looks exactly like carefully tended for, highly thought-out code. And in autoformatted Python, you for example cannot make formulas clearer by removing spaces around operators with higher precidence. Parentheses placement is dicated by how long words are and not by what logically belongs together, etc..
Simply give up your mind and you too can be free.
[1] https://github.com/django/django/pull/15387#issuecomment-103...
https://github.com/ipython/ipython/pull/12091/files
And here’s a GitLab issue requesting support for blame-ignore:
https://gitlab.com/gitlab-org/gitlab/-/issues/31423
I don’t think there’s a corresponding GitHub request, but maybe if GitLab adds this feature GitHub will have some incentive to follow suit.
But, it's a "rip the plaster off" kinda thing, because it should ensure a lot less churn, inconsistent code style, or arguments and reviews about formatting after this is merged. It frees up a lot of headspace and distractions in code reviews. I don't know about you, but when I did code reviews I'd always end up zooming in on code style issues - ' vs ", things on newlines or no, JS objects with stringed keys, etc.
Beyond `.git-blame-ignore-revs` (which is neat and TIL), in GitHub's web viewer, if you find the line you're interested in and see that the most recent PR is a reformat, you click the "view blame prior to this change" button. I think most blame viewers do (or at least should) have a feature like this.
A year later and it seems to be the default on all projects I'm working on and I'm loving it.
It's this one
> Or is the point to autoformat
This one is done with pre-commit (which should probably be named pre-push?) hooks
> and autocommit the formatted code?
I don't think this one is done, and I think it's undesirable
It's this. Ensures that anything merged to master keeps the formatting conventions established in the project.
It is way better to deal with ugly formatting as long as it is consistent than with discussions where to put a closing brace/bracket/paren.
Some people make the case that it's easier to write single quotes (well, depending on the keyboard format anyway). For keyboards in the US standard you have to hold the Shift key to write a double quote. But the good thing about Black is that you can still write your code using single quote and when you run the command line utility it will fix/normalize the code to use double quotes.
Nowadays I got so used to it that I even write my Python code using double quotes. And looking at Python code using single quotes looks weird/unpleasant for me.
syslog('debug',"Just opened %s for output",filename)
While there's no semantic difference between single and double quote, in my code base, there is. And if black becomes very popular, why even support single quotes anymore?Technically single or double quotes have the exact same meaning in Python. What makes people use single quotes is probably other languages like PHP, Perl and Bash.
I know I've made it a habit to default to single quotes unless I know I need double quotes. So that might be where the habit comes from in the Django project. But it's not actually necessary in python so might as well use the most commonly used type of quote.