This is just not true, because C extension modules (i.e. libraries written to be used from Python but whose implementations are written in C) can release the global interpreter lock while inside a function call. Examples include numpy, scipy, pandas and tensorflow, and there are many others. Most Python processes doing CPU-intensive computation spend relatively little time actually executing Python; they are really just coordinating the C libraries (e.g. "multiply these two matrices together").
The GIL is also released during IO operations like writing to a file or waiting for a subprocess to finish or send data down its pipe. So in most practical situations where you have a performance-critical application written in Python (or more precisely, the top layer is written in Python), multithreading works fine.
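As a rough illustration of the IO case, here's a sketch using the stdlib's concurrent.futures, where time.sleep stands in for a blocking IO call (sleep, like real blocking reads/writes, releases the GIL):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_io(task_id):
    # time.sleep releases the GIL, just like a real blocking read/write
    time.sleep(0.2)
    return task_id

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(fake_io, range(4)))
elapsed = time.perf_counter() - start

# The four 0.2 s "IO waits" overlap, so total wall time is ~0.2 s, not ~0.8 s
print(results, f"{elapsed:.2f}s")
```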
If you are doing CPU intensive work in pure Python and you find things are unacceptably slow, then the simplest way to boost performance (and probably simplify your code) is to rewrite chunks of your code in terms of these C extension modules. If you can't do this for some reason then you will have to throw in the Python towel and re-write some or all of your code in a natively compiled language (if it's just a small fraction of your code then Cython is a good option). But this is the best course of action regardless of the threads situation, because pure Python code runs orders of magnitude slower than native code.
I think some people's opinion is that if you're writing in C then you're not really writing a Python program, so they conclude it is impossible in Python. Which seems a reasonable point to me.
Your argument is that Python is fine for multithreading... as long as you actually write C instead of Python.
def add_and_mult(a, b, c):
    return a + b @ c
If a, b and c are numpy arrays then this function releases the GIL, and so will run in multiple threads with no further work and with little overhead (if a, b and c are large). I would describe this as a function "written in Python", even though numpy uses C under the hood. It seems you would describe this snippet as being "written in C instead of Python"; I find that odd, but OK.

But, if I understand you right, you are also suggesting that the other commenters here who talk about the GIL would also describe this as "written in C". That is, they realise that this releases the GIL and will run on multiple threads, but the point of their comments is that a proper pure Python function wouldn't. I disagree. I think most of them would describe this function as "written in Python", and when they say that functions written in Python can't be parallelised, they do so because they don't realise that functions like this one can be.
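A minimal sketch of the point, assuming numpy is installed: the work in add_and_mult happens inside numpy's C code with the GIL released, so two threads calling it can genuinely overlap on two cores.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def add_and_mult(a, b, c):
    return a + b @ c

rng = np.random.default_rng(0)
a = rng.random((500, 500))
b = rng.random((500, 500))
c = rng.random((500, 500))

# Both calls spend almost all their time in numpy's C code with the
# GIL released, so they can run simultaneously on separate cores.
with ThreadPoolExecutor(max_workers=2) as pool:
    r1, r2 = pool.map(lambda _: add_and_mult(a, b, c), range(2))

assert np.allclose(r1, r2)
```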
The GIL means that a single Python interpreter process can execute at most one Python thread at a time, regardless of the number of CPUs or CPU cores available on the host machine. The GIL also introduces overhead which affects the performance of code using Python threads; how much you're affected by it will vary depending on what your code is doing. I/O-bound code tends to be much less affected, while CPU-bound code is much more affected.
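A sketch of what that means in practice: CPU-bound pure-Python work split across threads still computes the right answers, but the threads take turns holding the GIL, so wall time is no better (often slightly worse) than running the calls sequentially.

```python
import threading

def count_primes(limit):
    # Deliberately naive pure-Python CPU work; holds the GIL throughout
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count

results = []
threads = [
    threading.Thread(target=lambda: results.append(count_primes(20_000)))
    for _ in range(2)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Both threads finish with correct results, but because only one of them
# can execute Python bytecode at a time, this is no faster than two
# sequential calls to count_primes.
print(results)
```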
All of this dates back to design decisions made in the 1990s which presumably seemed reasonable for the time: most people using Python were running it on machines with one CPU which had one core, so being able to take advantage of multiple CPUs/cores to schedule multiple threads to execute simultaneously was not necessarily a high priority. And most people who wanted threading wanted it to use in things like network daemons, which are primarily I/O-bound. Hence, the GIL and the set of tradeoffs it makes. Now, of course, we carry multi-core computers in our pockets and people routinely use Python for CPU-bound data science tasks. Hindsight is great at spotting that, but hindsight doesn't give us a time machine to go back and change the decisions.
Anyway. This is not the same thing as "multithreading is impossible". This is the same thing as "multithreading has some limitations, and for some cases the easiest way to work around them will be to use Python's C extension API". Which is what the parent comment seemed to be saying.
I've mainly been looking at these resources:
https://github.com/rochacbruno/rust-python-example
Though I have not done rust <-> python in real practice
If you care about speed, Rust is supposedly as fast as C. The Rust ecosystem also has a lot of supposedly safe(!) tools for parallelism.
The type system and expressive macros seem like a big win over C to me.
That was interesting, thanks!
I really wish he had shown his numpy code. He said at 13:46 "Numpy actually doesn't help you at all because the calculation is still getting done at the Python level". But his function could be vectorised with numpy using functions like numpy.maximum or numpy.where, in which case the main loop will be in C not Python. I can't figure out from what he said whether his numpy code did that or not.
But either way, it's interesting that in this case the numpy version is arguably harder to write than the Cython version: rather than just adding a few bits of metadata (the types), you have to restructure the whole control flow. If there's only a small amount of code you want to convert, I would still say it's better to use numpy (if it actually is fast enough), because getting the build tools onto your computer for Cython can be a pain. And for some matrix computations there are speed improvements beyond the fact that the code is implemented in C, e.g. numpy's matrix multiplication is faster than the naive O(n^3) algorithm.
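For concreteness, a hypothetical example of the kind of rewrite involved (the talk's actual function isn't shown, so this is purely illustrative): turning an element-wise Python loop into a vectorised numpy expression moves the main loop into C, but you do have to restructure the control flow rather than just annotate it.

```python
import numpy as np

def relu_loop(xs):
    # Pure-Python loop: the interpreter executes one element at a time
    out = []
    for x in xs:
        out.append(x if x > 0 else 0.0)
    return out

def relu_vectorised(xs):
    # Same computation, but the loop now happens inside numpy's C code
    return np.maximum(xs, 0.0)

data = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
print(relu_loop(data.tolist()))   # [0.0, 0.0, 0.0, 1.5, 3.0]
print(relu_vectorised(data))
```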
Why using legacy Python for this?
I get that EOL/deprecation is coming, but let's not jump the gun to "legacy" just yet. I just see more 2 than 3 @ Day Job.
asyncio is a good library for asynchronous I/O, but concurrent.futures gives us some nifty tooling which makes concurrent programming (with ThreadPoolExecutor) and parallel programming (with ProcessPoolExecutor) pretty easy to get right. The Future class is an elegant solution for continuing execution while a background task runs.
[0] https://docs.python.org/3/library/concurrent.futures.html
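A small sketch of the Future pattern described above: submit a background task, keep doing other work on the main thread, then collect the result.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def slow_square(n):
    # Stand-in for a blocking background task (IO, a subprocess, ...)
    time.sleep(0.1)
    return n * n

with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(slow_square, 7)  # starts in a background thread
    other_work = sum(range(10))           # main thread keeps executing
    answer = future.result()              # blocks only if not finished yet

print(other_work, answer)  # 45 49
```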
Step 1: stop using Python.
"You can have a second core when you know how to use one"
Now don't get me wrong, Python is a perfectly fine language for lots of things, but not for taking optimal advantage of the CPU.
https://benchmarksgame-team.pages.debian.net/benchmarksgame/...
Relative performance compared to C is somewhere between an order of magnitude or two slower. Considering how much harder and more error-prone multi-core is, maybe first try a fast sequential solution.
The ratio between the most-performant parallel framework and the least on Python will be a factor of (guessing) 1.5.
The ratio between a CPU-bound algorithm written in C and one in Python will be of the order of 10000 (again guessing as it's application-dependent).
Where is your time most profitably spent?
Most of the Python programs referenced on that benchmarks game webpage are in fact using multiple cores?
For parallel execution, there's the GIL, but in practice it rarely matters, because once you want to do parallel execution, you have most likely a computationally intensive task to do, at which point you call down to C or something, and then GIL doesn't matter.
Eh, let me stop you there. Everything isn't about performance.
Hardware and UI based things really benefit from parallelism.
trio:
https://github.com/python-trio/trio
trio compared to asyncio, goroutines, etc.:
https://stackoverflow.com/a/49485603/1612318
"Notes on structured concurrency, or: Go statement considered harmful":
https://vorpus.org/blog/notes-on-structured-concurrency-or-g...
Was a damn good read, Thanks!
There are hundreds of libraries to deal with concurrency and/or parallelism in Python, asyncio, Celery and PySpark being the common ones.
All of them provide different approaches to concurrency because the language itself is not tied to one in particular.
And all of that is really just I/O parallelization; there's also CPU parallelization, and I don't believe Python has anything that's quite as easy as "Do these two things in parallel". Pretty much everything requires a lot of marshalling and process management which can easily slow a program down instead of improving it.
Python is great for a lot of things, and the community has found many creative workarounds for its shortcomings, but Go beats Python in I/O and CPU parallelism handily.
Also includes best currently available hyperparameter tuning framework!
However, where I think having this stuff available inside of python is useful is that it's cross platform and consumable from "higher levels" of python. A library can do some mucky stuff internally to speed computation but still present a simple sync interface, all without external dependencies.
My library lets you do parallelism in a unique way, where you do message passing parallelism without being explicit about it.
Also, by having the introductory chapter be about "functional programming" (which incidentally Python does not do well), he completely bypasses the serious issue of shared state.
Which goes to show that parallelism in Python is more like a gimmick than a real-world solution since it doesn't let you do in-process shared-memory processing via threads in parallel which is so important for many applications. In my case, the vast majority of the time I do not want to farm workers out to different operating system processes and deal with serialization and communication, but this is the only way for Python code to take advantage of multiple cores [1].
[1] Another way is to write a module in C and have Python code call into it on a new thread and release the GIL while doing so, but of course this is even worse pain-wise than doing it with multiprocessing and you end up writing/compiling C.
I thought a lot about this problem, for over 2 years, and came up with zproc
https://github.com/pycampers/zproc
Basically,
> It lets you do message passing parallelism without the effort of tedious wiring.
You'll be doing message passing without ever dealing with sockets!
Also, shared-memory parallelism is hard to get right regardless of which language you use. I would strongly recommend against it, unless you're writing some really, really niche thing where message passing is the bottleneck (it isn't, most of the time).
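zproc's own API aside, the general message-passing shape is easy to sketch with the stdlib: workers communicate only through queues, so there is no shared mutable state to guard with locks.

```python
import queue
import threading

tasks = queue.Queue()
results = queue.Queue()

def worker():
    while True:
        item = tasks.get()
        if item is None:       # sentinel: no more work
            break
        results.put(item * 2)  # communicate by message, not shared state

threads = [threading.Thread(target=worker) for _ in range(3)]
for t in threads:
    t.start()
for n in range(5):
    tasks.put(n)
for _ in threads:
    tasks.put(None)            # one sentinel per worker
for t in threads:
    t.join()

received = sorted(results.get() for _ in range(5))
print(received)   # [0, 2, 4, 6, 8]
```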
It means thread-based parallelism of pure-Python code is unavailable; concurrency is just fine in Python.
Nowadays you can also use serverless to parallelize coarse-grained workloads in the cloud.
I could not agree more
It's definitely cheating to use C code. With the exception, of course, of most Python libraries, which already are to a large extent nothing more than thin wrappers over existing C libraries. Or the tiny fact that the most popular implementation of Python by far, CPython, is almost 50% implemented in the C language, including the standard library. The author even dared include "C" in the name of the implementation.
Those cheaters, becoming bolder and bolder every day.
Damn them !!!
Hold on, the GIL doesn't make Python automatically thread-safe!
You can still have classic data races, because the interpreter can switch between two threads in the middle of a read-modify-write sequence on a shared variable.
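A sketch of the classic lost-update race: each thread reads the shared counter, then writes back its read-value + 1. The Barrier is only there to make the bad interleaving deterministic for illustration; real races are intermittent.

```python
import threading

counter = 0
barrier = threading.Barrier(4)

def racy_increment():
    global counter
    local = counter       # read the shared value
    barrier.wait()        # force every thread to read before any writes
    counter = local + 1   # write: clobbers the other threads' updates

threads = [threading.Thread(target=racy_increment) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)   # 1, not 4: three increments were lost, GIL and all
```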
It also simplifies a lot of CPython code, making it a lot easier to maintain.
What about no?
Don't get me wrong, I don't like Python as a language, but it's a fine tool and many useful programs have been written with it.
But parallel programming? No, thanks.