This is just not true, because C extension modules (i.e. libraries written to be used from Python but whose implementations are written in C) can release the global interpreter lock while inside a function call. Examples include numpy, scipy, pandas and tensorflow, and there are many others. Most Python processes doing CPU-intensive computation spend relatively little time actually executing Python; they are really just coordinating the C libraries (e.g. "multiply these two matrices together").
The GIL is also released during IO operations like writing to a file or waiting for a subprocess to finish or send data down its pipe. So in most practical situations where you have a performance-critical application written in Python (or more precisely, the top layer is written in Python), multithreading works fine.
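As a rough illustration of the IO case, here's a sketch using the stdlib's concurrent.futures, where time.sleep stands in for a blocking IO call (sleep, like real blocking reads/writes, releases the GIL):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_io(task_id):
    # time.sleep releases the GIL, just like a real blocking read/write
    time.sleep(0.2)
    return task_id

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(fake_io, range(4)))
elapsed = time.perf_counter() - start

# The four 0.2 s "IO waits" overlap, so total wall time is ~0.2 s, not ~0.8 s
print(results, f"{elapsed:.2f}s")
```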
If you are doing CPU intensive work in pure Python and you find things are unacceptably slow, then the simplest way to boost performance (and probably simplify your code) is to rewrite chunks of your code in terms of these C extension modules. If you can't do this for some reason then you will have to throw in the Python towel and re-write some or all of your code in a natively compiled language (if it's just a small fraction of your code then Cython is a good option). But this is the best course of action regardless of the threads situation, because pure Python code runs orders of magnitude slower than native code.
I think some people's opinion is that if you're writing in C then you're not really writing a Python program, so they conclude it is impossible in Python. Which seems a reasonable point to me.
Your argument is that Python is fine for multithreading... as long as you actually write C instead of Python.
def add_and_mult(a, b, c):
    return a + b @ c
If a, b and c are numpy arrays then this function releases the GIL, and so will run in multiple threads with no further work and with little overhead (if a, b and c are large). I would describe this as a function "written in Python", even though numpy uses C under the hood. It seems you would describe this snippet as being "written in C instead of Python"; I find that odd, but OK.

But, if I understand you right, you are also suggesting that the other commenters here who talk about the GIL would also describe this as "written in C". That is, they realise that this releases the GIL and will run on multiple threads, but the point of their comments is that a proper pure Python function wouldn't. I disagree. I think most of them would describe this function as "written in Python", and when they say that functions written in Python can't be parallelised, they do so because they don't realise that functions like this one can be.
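A minimal sketch of the point, assuming numpy is installed: the work in add_and_mult happens inside numpy's C code with the GIL released, so two threads calling it can genuinely overlap on two cores.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def add_and_mult(a, b, c):
    return a + b @ c

rng = np.random.default_rng(0)
a = rng.random((500, 500))
b = rng.random((500, 500))
c = rng.random((500, 500))

# Both calls spend almost all their time in numpy's C code with the
# GIL released, so they can run simultaneously on separate cores.
with ThreadPoolExecutor(max_workers=2) as pool:
    r1, r2 = pool.map(lambda _: add_and_mult(a, b, c), range(2))

assert np.allclose(r1, r2)
```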
The GIL means that a single Python interpreter process can execute at most one Python thread at a time, regardless of the number of CPUs or CPU cores available on the host machine. The GIL also introduces overhead which affects the performance of code using Python threads; how much you're affected by it will vary depending on what your code is doing. I/O-bound code tends to be much less affected, while CPU-bound code is much more affected.
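A sketch of what that means in practice: CPU-bound pure-Python work split across threads still computes the right answers, but the threads take turns holding the GIL, so wall time is no better (often slightly worse) than running the calls sequentially.

```python
import threading

def count_primes(limit):
    # Deliberately naive pure-Python CPU work; holds the GIL throughout
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count

results = []
threads = [
    threading.Thread(target=lambda: results.append(count_primes(20_000)))
    for _ in range(2)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Both threads finish with correct results, but because only one of them
# can execute Python bytecode at a time, this is no faster than two
# sequential calls to count_primes.
print(results)
```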
All of this dates back to design decisions made in the 1990s which presumably seemed reasonable for the time: most people using Python were running it on machines with one CPU which had one core, so being able to take advantage of multiple CPUs/cores to schedule multiple threads to execute simultaneously was not necessarily a high priority. And most people who wanted threading wanted it to use in things like network daemons, which are primarily I/O-bound. Hence, the GIL and the set of tradeoffs it makes. Now, of course, we carry multi-core computers in our pockets and people routinely use Python for CPU-bound data science tasks. Hindsight is great at spotting that, but hindsight doesn't give us a time machine to go back and change the decisions.
Anyway. This is not the same thing as "multithreading is impossible". This is the same thing as "multithreading has some limitations, and for some cases the easiest way to work around them will be to use Python's C extension API". Which is what the parent comment seemed to be saying.
I've mainly been looking at these resources:
https://github.com/rochacbruno/rust-python-example
Though I have not done rust <-> python in real practice
If you care about speed, Rust is supposedly as fast as C. The Rust ecosystem also has a lot of supposedly safe(!) tools for parallelism.
The type system and expressive macros seem like a big win over C to me.
That was interesting, thanks!
I really wish he had shown his numpy code. He said at 13:46 "Numpy actually doesn't help you at all because the calculation is still getting done at the Python level". But his function could be vectorised with numpy using functions like numpy.maximum or numpy.where, in which case the main loop will be in C not Python. I can't figure out from what he said whether his numpy code did that or not.
But either way, it's interesting that in this case the numpy version is arguably harder to write than the Cython version: rather than just adding a few bits of metadata (the types), you have to restructure the whole control flow. If there's only a small amount of code you want to convert, I would still say it's better to use numpy (if it actually is fast enough), because getting the build tools onto your computer for Cython can be a pain. And for some matrix computations there are speed improvements beyond the fact that the code is implemented in C, e.g. numpy's matrix multiplication is faster than the naive O(n^3) algorithm.
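For concreteness, a hypothetical example of the kind of rewrite involved (the talk's actual function isn't shown, so this is purely illustrative): turning an element-wise Python loop into a vectorised numpy expression moves the main loop into C, but you do have to restructure the control flow rather than just annotate it.

```python
import numpy as np

def relu_loop(xs):
    # Pure-Python loop: the interpreter executes one element at a time
    out = []
    for x in xs:
        out.append(x if x > 0 else 0.0)
    return out

def relu_vectorised(xs):
    # Same computation, but the loop now happens inside numpy's C code
    return np.maximum(xs, 0.0)

data = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
print(relu_loop(data.tolist()))   # [0.0, 0.0, 0.0, 1.5, 3.0]
print(relu_vectorised(data))
```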
Why using legacy Python for this?
I get that EOL/deprecation is coming, but let's not jump the gun to "legacy" just yet. I just see more 2 than 3 @ Day Job.
asyncio is a good library for asynchronous I/O, but concurrent.futures gives us some nifty tooling which makes concurrent programming (with ThreadPoolExecutor) and parallel programming (with ProcessPoolExecutor) pretty easy to get right. The Future class is an elegant solution for continuing execution while a background task runs.
[0] https://docs.python.org/3/library/concurrent.futures.html
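A small sketch of the Future pattern described above: submit a background task, keep doing other work on the main thread, then collect the result.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def slow_square(n):
    # Stand-in for a blocking background task (IO, a subprocess, ...)
    time.sleep(0.1)
    return n * n

with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(slow_square, 7)  # starts in a background thread
    other_work = sum(range(10))           # main thread keeps executing
    answer = future.result()              # blocks only if not finished yet

print(other_work, answer)  # 45 49
```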
Step 1: stop using Python.
"You can have a second core when you know how to use one"
Now don't get me wrong, Python is a perfectly fine language for lots of things, but not for taking optimal advantage of the CPU.
https://benchmarksgame-team.pages.debian.net/benchmarksgame/...
Relative performance compared to C is somewhere between an order of magnitude or two slower. Considering how much harder and more error-prone multi-core is, maybe first try a fast sequential solution.
The ratio between the most-performant parallel framework and the least on Python will be a factor of (guessing) 1.5.
The ratio between a CPU-bound algorithm written in C and one in Python will be of the order of 10000 (again guessing as it's application-dependent).
Where is your time most profitably spent?
Most of the Python programs referenced on that benchmarks game webpage are in fact using multiple cores?
For parallel execution, there's the GIL, but in practice it rarely matters, because once you want to do parallel execution, you have most likely a computationally intensive task to do, at which point you call down to C or something, and then GIL doesn't matter.
Eh, let me stop you there. Everything isn't about performance.
Hardware and UI based things really benefit from parallelism.
trio:
https://github.com/python-trio/trio
trio compared to asyncio, goroutines, etc.:
https://stackoverflow.com/a/49485603/1612318
"Notes on structured concurrency, or: Go statement considered harmful":
https://vorpus.org/blog/notes-on-structured-concurrency-or-g...
Was a damn good read, Thanks!
There are hundreds of libraries to deal with concurrency and/or parallelism in Python, asyncio, Celery and PySpark being the common ones.
All of them provide different approaches to concurrency because the language itself is not tied to one in particular.
And all of that is really just I/O parallelization; there's also CPU parallelization, and I don't believe Python has anything that's quite as easy as "Do these two things in parallel". Pretty much everything requires a lot of marshalling and process management which can easily slow a program down instead of improving it.
Python is great for a lot of things, and the community has found many creative workarounds for its shortcomings, but Go beats Python in I/O and CPU parallelism handily.
Also includes best currently available hyperparameter tuning framework!
However, where I think having this stuff available inside of python is useful is that it's cross platform and consumable from "higher levels" of python. A library can do some mucky stuff internally to speed computation but still present a simple sync interface, all without external dependencies.
My library lets you do parallelism in a unique way, where you do message passing parallelism without being explicit about it.
Also, by having the introductory chapter be about "functional programming" (which incidentally Python does not do well), he completely bypasses the serious issue of shared state.
Which goes to show that parallelism in Python is more like a gimmick than a real-world solution since it doesn't let you do in-process shared-memory processing via threads in parallel which is so important for many applications. In my case, the vast majority of the time I do not want to farm workers out to different operating system processes and deal with serialization and communication, but this is the only way for Python code to take advantage of multiple cores [1].
[1] Another way is to write a module in C and have Python code call into it on a new thread and release the GIL while doing so, but of course this is even worse pain-wise than doing it with multiprocessing and you end up writing/compiling C.
I thought a lot about this problem, for over 2 years, and came up with zproc
https://github.com/pycampers/zproc
Basically,
> It lets you do message passing parallelism without the effort of tedious wiring.
You'll be doing message passing without ever dealing with sockets!
Also, shared-memory parallelism is hard to get right regardless of which language you use. I would strongly recommend against it, unless you're writing some really, really niche thing where message passing is the bottleneck (it isn't, most of the time).
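zproc's own API aside, the general message-passing shape is easy to sketch with the stdlib: workers communicate only through queues, so there is no shared mutable state to guard with locks.

```python
import queue
import threading

tasks = queue.Queue()
results = queue.Queue()

def worker():
    while True:
        item = tasks.get()
        if item is None:       # sentinel: no more work
            break
        results.put(item * 2)  # communicate by message, not shared state

threads = [threading.Thread(target=worker) for _ in range(3)]
for t in threads:
    t.start()
for n in range(5):
    tasks.put(n)
for _ in threads:
    tasks.put(None)            # one sentinel per worker
for t in threads:
    t.join()

received = sorted(results.get() for _ in range(5))
print(received)   # [0, 2, 4, 6, 8]
```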
It means thread-based parallelism of pure-Python code is unavailable; concurrency is just fine in Python.
Nowadays you can also use serverless to parallelize coarse-grained workloads in the cloud.
I could not agree more
It's definitely cheating to use C code. With the exception, of course, of most Python libraries, which already are to a large extent nothing more than thin wrappers over existing C libraries. Or the tiny fact that the most popular implementation of Python by far, CPython, is almost 50% implemented in the C language, including the standard library. The author even dared include "C" in the name of the implementation.
Those cheaters, becoming bolder and bolder every day.
Damn them !!!
Hold on, the GIL doesn't make Python automatically thread-safe!
You can still have classic data races, because the interpreter can switch between two threads in the middle of a read-modify-write sequence on a shared variable.
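A sketch of the classic lost-update race: each thread reads the shared counter, then writes back its read-value + 1. The Barrier is only there to make the bad interleaving deterministic for illustration; real races are intermittent.

```python
import threading

counter = 0
barrier = threading.Barrier(4)

def racy_increment():
    global counter
    local = counter       # read the shared value
    barrier.wait()        # force every thread to read before any writes
    counter = local + 1   # write: clobbers the other threads' updates

threads = [threading.Thread(target=racy_increment) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)   # 1, not 4: three increments were lost, GIL and all
```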
It also simplifies a lot of CPython code, making it a lot easier to maintain.
What about no?
Don't get me wrong, I don't like Python as a language, but it's a fine tool and many useful programs have been written with it.
But parallel programming? No, thanks.