What Python’s asyncio primitives get wrong about shared state

(inngest.com)

79 points · goodoldneon · 21d ago · 49 comments

evil-olive 21d ago

the title seems unnecessarily clickbaity.

rather than "What Python's asyncio primitives get wrong" this seems more like "why we chose one asyncio primitive (queue) instead of others (event and condition)"

also, halfway through the post, the problem grows a new requirement:

> Instead of waking consumers and asking "is the current state what you want?", buffer every transition into a per-consumer queue. Each consumer drains its own queue and checks each transition individually. The consumer never misses a state.

if buffering every state change is a requirement, then...yeah, you're gonna need a buffer of some kind. the previously proposed solutions (polling, event, condition) would never have worked.

given the full requirements up-front, you can jump straight to "just use a queue" - with the downside that it would make for a less interesting blog post.

also, this is using queues without any size limit, which seems like a memory leak waiting to happen if events ever get enqueued more quickly than they can be consumed. notably, this could not happen with the simpler use cases that could be satisfied by events and conditions.
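A minimal sketch of the bounded variant this comment is hinting at, not the article's implementation. `StateBroadcaster`, its `maxsize` default, and the state names are hypothetical; the only library calls used are `asyncio.Queue(maxsize=...)`, `put()` and `get()`. With a bound in place, a slow consumer suspends the producer instead of letting the queue grow without limit.

```python
import asyncio


class StateBroadcaster:
    """Hypothetical helper: copies every state transition into per-consumer queues."""

    def __init__(self, maxsize: int = 100):
        self._maxsize = maxsize
        self._queues = []

    def subscribe(self) -> asyncio.Queue:
        # Each consumer gets its own bounded queue.
        q = asyncio.Queue(maxsize=self._maxsize)
        self._queues.append(q)
        return q

    async def set_state(self, new_state: str) -> None:
        # put() suspends while a consumer's queue is full, so a slow
        # consumer applies backpressure instead of growing memory.
        for q in self._queues:
            await q.put(new_state)


async def main() -> list:
    broadcaster = StateBroadcaster(maxsize=2)
    q = broadcaster.subscribe()
    seen = []

    async def consumer():
        while True:
            state = await q.get()
            seen.append(state)
            if state == "closed":
                return

    task = asyncio.create_task(consumer())
    for state in ["connected", "closing", "closed"]:
        await broadcaster.set_state(state)
    await task
    return seen


seen = asyncio.run(main())
```

The trade-off is that one stalled consumer now stalls the producer too; whether that is preferable to dropping transitions or to unbounded growth depends on the application.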

> A threading.Lock protects the value and queue list.

unless I'm missing something obvious, this seems like it should be an asyncio.Lock?

ggm 21d ago

yes. I felt something very similar. I do think there is value in pointing out the pitfalls naive users (me!) can fall into by assuming things which aren't true about the ordering of events and states. Queues with lock regions are also really nice because they are (as I understand it) very cheap: so making a thread or other concurrency primitive which writes into a queue under lock, and gets out of the way, aligns nicely with having some mothership process which reads queues under lock in a deterministic way. Actual event order can vary, but you should be able to know you had an event putting you into state A, as well as the terminal event state B you jumped into without doing work needed for state A.

samarthr1 21d ago

BTW: what is a lock region?

itemize123 21d ago

it does seem the user wants a condition variable.

For locking I am guessing they want multithreading, each with an event loop.

ydj 21d ago

I think it’s not so much that the asyncio primitives get shared state wrong as that the authors got the usage of those primitives wrong. They are classic concurrency primitives that have been around for almost half a century. They work as designed, but require some care to use correctly.

jsanders9 21d ago

Agreed. This isn't an asyncio problem, it's just not how those primitives work.

lukaslalinsky 21d ago

They are trying to use a condition variable without a mutex and see missed wake-ups. That's a textbook error, no? I'm surprised asyncio.Condition even allows that mode of operation.
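For reference, a minimal sketch of the textbook usage, with the predicate checked while the condition's lock is held. The `state` dict and the function names are made up for illustration; `asyncio.Condition`, `wait_for()` and `notify_all()` are the real API.

```python
import asyncio


async def main() -> list:
    state = {"value": "connected"}  # hypothetical shared state
    cond = asyncio.Condition()
    log = []

    async def drain_requests():
        # wait()/wait_for() raise RuntimeError if the lock is not held.
        async with cond:
            # The predicate is re-checked under the lock after every wake-up.
            await cond.wait_for(lambda: state["value"] == "closing")
            log.append("draining at " + state["value"])

    async def set_state(new_state):
        # Mutate and notify while holding the same lock.
        async with cond:
            state["value"] = new_state
            cond.notify_all()

    waiter = asyncio.create_task(drain_requests())
    await asyncio.sleep(0)  # let the waiter start waiting
    await set_state("closing")
    await waiter
    return log


log = asyncio.run(main())
```

This only guarantees that the waiter sees the current state once it holds the lock again. If the state moves on before the waiter is rescheduled, the waiter still misses the intermediate value, which is the case the article turns to queues for.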

seanhunter 21d ago

A better title would be: “Person who doesn’t know how to write state machines struggles to write a state machine”.

In attempt 2 the old school C way of writing the state machine would work just fine in python, avoid a bunch of the boilerplate and avoid the “state setter needs to know a bunch of stuff” problem. Basically you make the states as a table and put the methods you need in the table so in python a dictionary is convenient. Then you have

   > def set_state(new_state):
   >   global state  # without this, the assignment only binds a local name
   >   state = new_state
   >   events[new_state].set()

Aaand you’re done. When you add a new state, you add an event corresponding to that state into the events table. If the stuff you would put into a conditional in set_state is more complicated, you could make a state transition method and link to it in the table. Or you could make a nested dict or whatever. It’s not hard, and the fact that the author doesn’t know an idiomatic way to write an FSM definitely isn’t something that’s wrong with python’s asyncio and shared state.

In general if you’re writing a state machine and you have a lot of “if curr_state == SOME_STATE” logic, chances are it would be better if you used tables.
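A runnable sketch of the table idea, assuming one `asyncio.Event` per state. `StateMachine`, `STATES` and the state names are hypothetical and not taken from the article.

```python
import asyncio

STATES = ("connected", "closing", "closed")  # hypothetical state names


class StateMachine:
    def __init__(self):
        self.state = None
        # The "table": one event per state, looked up by name.
        self.events = {s: asyncio.Event() for s in STATES}

    def set_state(self, new_state):
        self.state = new_state
        self.events[new_state].set()


async def main():
    sm = StateMachine()
    log = []

    async def drain_requests():
        await sm.events["closing"].wait()
        log.append("draining")

    waiter = asyncio.create_task(drain_requests())
    await asyncio.sleep(0)      # let the waiter start waiting
    sm.set_state("closing")
    sm.set_state("closed")      # the state moves on before the waiter runs
    await waiter
    return sm.state, log


final_state, log = asyncio.run(main())
```

Because an event stays set, the waiter still learns that "closing" was entered even though the current state is already "closed" by the time it runs. A machine that can re-enter a state would also need to `clear()` events at some point, which this sketch leaves out.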

cl3misch 21d ago

Is this being downvoted because of the tone, or because state machines are unpopular/inappropriate in this case?

Genuine question, because this feels like a sensible solution to the problem as stated in the article.

mrkeen 21d ago

It made no reference to the 'shared' in 'shared state'.

No mention of asynchrony, multithreading, or the race condition that TFA encountered.

phs2501 21d ago

The one thing I wish stock python queues had an option for (async or otherwise) was some kind of explicit termination. e.g. be split into producers and consumers, and have consumers indicate iteration complete when all producers have finished (and vice versa - signal producers that all consumers have gone away). You can kind of kludge around it in one direction with stop sentinels but it's a lot more awkward to deal with - especially if your queues are bounded as then you can get into the situation where you block trying to push the stop sentinel onto the queue as it's full.

matheist 21d ago

Does task_done not do what you want?

https://docs.python.org/3/library/queue.html#queue.Queue.tas...

phs2501 20d ago

Not really. It's certainly intended for the basic "fan out m tasks to n workers, and the fanout producer wants to know when they're all done" and can be abused for some more, but I don't think it does anything to help with the "consumer died, I want the producers to be able to know this rather than just continuing to push messages into a queue forever" case.

I've written wrappers to handle things the way I want, but it always feels like a bit of a hack. (Usually I use a stop sentinel internally and reach inside to unbound the queue before I send it to avoid blocking). Just wish it were built in.
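One possible shape for such a wrapper, as a rough sketch only. `ClosableQueue` and `QueueClosed` are invented names and this is not the commenter's code. The idea is to race a blocked `put()` against a "consumer gone" event, so a producer stuck on a full bounded queue is released. Python 3.13 also added a built-in `shutdown()` method to `queue.Queue` and `asyncio.Queue` that covers similar ground.

```python
import asyncio


class QueueClosed(Exception):
    """Raised by put() once the consumer side has gone away (hypothetical name)."""


class ClosableQueue:
    """Hypothetical wrapper around a bounded asyncio.Queue."""

    def __init__(self, maxsize: int = 0):
        self._queue = asyncio.Queue(maxsize)
        self._closed = asyncio.Event()

    def close(self) -> None:
        # Called by the consumer side when it stops reading.
        self._closed.set()

    async def get(self):
        return await self._queue.get()

    async def put(self, item) -> None:
        if self._closed.is_set():
            raise QueueClosed
        # Race the (possibly blocking) put against the close signal.
        put_task = asyncio.ensure_future(self._queue.put(item))
        closed_task = asyncio.ensure_future(self._closed.wait())
        await asyncio.wait({put_task, closed_task},
                           return_when=asyncio.FIRST_COMPLETED)
        closed_task.cancel()
        if not put_task.done():
            put_task.cancel()
            raise QueueClosed


async def main():
    q = ClosableQueue(maxsize=1)
    sent = []

    async def producer():
        n = 0
        try:
            while True:
                await q.put(n)
                sent.append(n)
                n += 1
        except QueueClosed:
            return "consumer gone"

    task = asyncio.create_task(producer())
    await asyncio.sleep(0.01)  # producer is now blocked on the full queue
    q.close()                  # nobody will ever read: release the producer
    return await task, sent


outcome, sent = asyncio.run(main())
```

The sketch does not handle `put()` itself being cancelled from outside, and it only covers the consumer-to-producer direction. The reverse direction (all producers finished) would need a producer count or similar.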

rpz 21d ago

Reminds me of https://geocar.sdf1.org/fast-servers.html

dbt00 21d ago

What about a more general message-passing mailbox approach? This works really well in the Erlang/gen_server/gen_fsm world. (and in plenty of other contexts, but Erlang's OTP is still some of the best, simplest incarnation of these things)

scuff3d 21d ago

“The problem with most programming languages is that they implement concurrency as libraries on top of sequential languages. Erlang is a concurrent language at the core; everything else is just a poor imitation implemented in libraries.” -Joe Armstrong

rciorba 21d ago

I mean, the "one-queue per consumer" they eventually ended up with, is basically an inbox that the sequential process reads from.

darkhorse222 21d ago

The event one seems perfectly fine with a dictionary or even a class that wraps your events and pairs with an event object subscribers can attach to with a singleton attribute.

three14 21d ago

The thing that burned me with asyncio primitives is that calling an async function doesn't even schedule it (by default). The pattern for fire-and-forget is impossible to guess when coming from any other language - although it's called out in the docs. You must also call create_task AND you must maintain a collection of tasks because otherwise the task might not run, because it can be garbage collected before running, AND you must clean up completed tasks from your collection.
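The pattern being described looks roughly like this. It follows the shape given in the `asyncio.create_task` documentation; `fire_and_forget` and `work` are names made up for this sketch.

```python
import asyncio

# Strong references: the event loop itself only keeps weak references to tasks.
background_tasks = set()


def fire_and_forget(coro):
    task = asyncio.create_task(coro)
    background_tasks.add(task)
    # Drop the reference once the task finishes, so the set does not grow forever.
    task.add_done_callback(background_tasks.discard)
    return task


async def main():
    results = []

    async def work(n):
        await asyncio.sleep(0)
        results.append(n)

    for n in range(3):
        fire_and_forget(work(n))

    # Only for this demo: wait until every background task has been discarded.
    while background_tasks:
        await asyncio.sleep(0)
    return results


results = asyncio.run(main())
```

Calling `work(n)` on its own only creates a coroutine object; nothing runs until it is awaited or wrapped in a task.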

whilenot-dev 21d ago

> The pattern for fire-and-forget is impossible

Good, that's an antipattern in the coroutines concurrency model.

three14 21d ago

Then someone should really update the official python docs that explain the fire-and-forget pattern (https://docs.python.org/3/library/asyncio-task.html#asyncio....)! I had a FastAPI server, and calling a particular endpoint is supposed to kick off some work in the background. The background work does very little CPU work, but does often need to await more work for several minutes, so it's a good fit for asyncio. How do you want it to be structured? (In other words, on the level of human requirements, it IS fire and forget.)

mrkeen 21d ago

This is one of those cases where software transactional memory really shines.

You can often take the naive solution and it will be the correct one. Your code will look like your intent.

TFA's first attempt:

  async def drain_requests():
      while state != "closing":
          await asyncio.sleep(0.1)
      print("draining pending requests")

Got it. Let's port it to STM:

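  import Control.Concurrent.STM (TVar, atomically, readTVar, retry)
  import Control.Monad (when)

  -- Blocks until the shared state is "closing"; `retry` re-runs the
  -- transaction only after `state` has been written to.
  drainRequests :: TVar String -> IO ()
  drainRequests state = do
      atomically $ do
          s <- readTVar state
          when (s /= "closing") retry
      putStrLn "draining pending requests"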

Thread-safe and no busy-waiting. No mention of 'notify' or 'sleep'. No attempt to evade the concurrency issues, as in the article's "The fix: per-consumer queues - Each consumer drains its own queue and checks each transition individually."

pocksuppet 21d ago

In most STM models this is a busy-wait implemented with STM? Only Haskell blocks on `retry`

TZubiri 21d ago

I'm sorry but how do you jump from 1. Polling to 2. Asyncio?

There are so many solutions in the middle, I have this theory that most people that get into async don't really know what threading is. Maybe they have a world vision where before 2023 python just could not do more than one thing at once, that's what the GIL was right? But now after 3.12 Guido really pulled himself by the bootstraps and removed the GIL and implemented async and now python can do more than one thing at a time so they start learning about async to be able to do more than one thing at a time.

This is a huge disconnect between what python devs are actually building, a different api towards concurrency. And some junior devs that think they are learning bleeding edge stuff when they are actually learning fundamentals through a very contrived lens.

It 100% comes from ex-node devs, I will save the node criticism, but node has a very specific concurrency model, and node devs that try out python sometimes run to asyncio as a way to soften the learning curve of the new language. And that's how they get into this mess.

The python devs are working on these features because they have to work on something, and updates to foundational tech are supposed to have effects in decades, it's very rare that you need to use bleeding edge features. In 95% of the cases, you should be restricting yourself to using features from versions that are 5-10 years old, especially if you come from other languages! You should start old to new, not new to old.

Sorry, for the rant, or if I misjudged, making a broader claim based on multiple perspectives.

scuff3d 21d ago

As of 3.14 running without the GIL is optional, but the default still has the GIL in place. 3.13 had it as experimental, but not officially supported. 3.12 and back are all GIL all day.

Python's asyncio library is single threaded, so I'm not sure why you are talking about threads and asyncio like they have anything to do with each other.

Python has been able to do more than one thing at a time for a long time. That's what the multiprocess library is for. It's not an ideal solution, but it does exist.

TZubiri 21d ago

> Python's asyncio library is single threaded, so I'm not sure why you are talking about threads and asyncio like they have anything to do with each other.

Ok, not OS threads, but it de facto creates application/green threads.

> That's what the multiprocess library is for. It's not an ideal solution, but it does exist.

Philosophical argument but, I'd say multiprocess is not python doing many things, there would be many python runtimes (each doing A Thing), and the OS would be the one doing multiple things / scheduling.

hrmtst93837 21d ago

I think the core mistake is conflating async with threading and with the GIL, which causes people to treat asyncio like a magic GIL remover. For reference asyncio landed in CPython with Python 3.4 in March 2014, while threads and the GIL predate that, so Python could do concurrent IO long before 2023. In my experience asyncio is cooperative IO concurrency, so you must yield with await or explicitly offload blocking CPU work to concurrent.futures.ProcessPoolExecutor or run_in_executor, and you protect shared state with asyncio.Queue or an asyncio.Lock rather than assuming tasks are isolated. I've found the most practical pattern is message passing for state and keeping CPU heavy work in a process pool, because chasing a blocked event loop in production is a miserable way to learn concurrency.
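A small sketch of the offloading half of that advice, using `loop.run_in_executor` with the default thread pool so the event loop keeps running while a blocking call is in progress. `blocking_work` and `heartbeat` are hypothetical stand-ins. For CPU-heavy pure-Python work, a `concurrent.futures.ProcessPoolExecutor` would be passed in place of `None`, as the comment suggests.

```python
import asyncio
import time


def blocking_work(n: int) -> int:
    # Stand-in for a blocking call; it would stall the loop if run inline.
    time.sleep(0.1)
    return n * n


async def main():
    loop = asyncio.get_running_loop()
    ticks = 0

    async def heartbeat():
        # Keeps counting only if the event loop is not blocked.
        nonlocal ticks
        while True:
            await asyncio.sleep(0.01)
            ticks += 1

    hb = asyncio.create_task(heartbeat())
    # None selects the loop's default ThreadPoolExecutor.
    result = await loop.run_in_executor(None, blocking_work, 7)
    hb.cancel()
    return result, ticks


result, ticks = asyncio.run(main())
```

If `blocking_work(7)` were called directly inside `main()`, the heartbeat would not tick at all during the call.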

dbt00 21d ago

I think they were already in the async world and needed message passing -- the polling code was also in python async.
