A couple of services looked like they had a memory leak. Memory was continuously increasing over time. Thanks to Python 3.14, we were able to use memray to understand what was going on. Those services were recreating HTTP clients (aiohttp) for every inbound request, and memory allocated by the downstream SSL lib was growing faster than it was being released.
We ended up rolling back to 3.13, which fixed the issue. I'll try again with 3.14.5.
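For anyone hitting the same pattern, here is a minimal sketch of the "build once, reuse" idea using only the standard library. It is not our actual service code: `shared_ssl_context` and `handle_inbound_request` are hypothetical names, and it assumes the expensive part is the TLS context rather than the client wrapper around it.

```python
import functools
import ssl


@functools.lru_cache(maxsize=1)
def shared_ssl_context() -> ssl.SSLContext:
    """Build the memory-heavy TLS context once per process and reuse it."""
    return ssl.create_default_context()


def handle_inbound_request() -> ssl.SSLContext:
    # Hypothetical request handler: instead of building a new client (and a
    # new SSL context) for every inbound request, it picks up the shared one.
    return shared_ssl_context()
```

The same idea applies one level up: keep a single long-lived client or session object per process and hand the shared context to it, rather than constructing either per request.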
The reference cycle httpx creates is kind of a worst-case scenario for the incremental GC issue. Both the generational (3.13 and older) and incremental GC are triggered by the number of net new "container" objects (objects that can hold references to others, such as lists, as opposed to ints and floats). The short summary is that you need to create more container objects before the incremental GC triggers. In the case of the httpx reference cycle, you have a relatively small number of container objects hanging on to a lot of memory, due to the SSL context data (which is a big memory hog).
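To make the "few containers, lots of memory" point concrete, here is a rough sketch. `Holder` is a hypothetical stand-in for a client object that owns a large buffer; it is not how httpx is structured internally, and the 10 MB size is arbitrary.

```python
import gc
import weakref

# Only "container" objects are tracked by the cyclic collector.
list_is_tracked = gc.is_tracked([])     # a list can reference other objects
int_is_tracked = gc.is_tracked(42)      # an int cannot
float_is_tracked = gc.is_tracked(1.5)   # a float cannot


class Holder:
    """Hypothetical stand-in for a client object that owns a big buffer."""

    def __init__(self, size: int) -> None:
        self.payload = bytearray(size)  # large, but counts as very few objects
        self.peer = None


def make_cycle(size: int) -> weakref.ref:
    a, b = Holder(size), Holder(size)
    a.peer, b.peer = b, a       # reference cycle: refcounts never reach zero
    return weakref.ref(a)       # lets us observe when `a` is really freed


gc.disable()                    # keep the demo deterministic
try:
    probe = make_cycle(10 * 1024 * 1024)
    alive_before = probe() is not None  # the cycle keeps both buffers alive
    gc.collect()
    alive_after = probe() is not None   # freed only once the collector ran
finally:
    gc.enable()
```

From the collector's point of view this cycle is only a handful of new container objects, so it contributes almost nothing toward the trigger, while the memory it pins is roughly 20 MB.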
Reverting to the generational GC was the wise thing to do, even though it's a bit scary to do in a bugfix release. The incremental GC works for most people, but in the minority of cases where it doesn't, it uses quite a lot more memory. I'm pretty sure that with some additional tuning the incremental GC would be fine too, but it just didn't get that tuning. The generational GC has literal decades of real-world use (Guido merged my patch in June 2000, and Tim Peters did a bunch of tuning after that to optimize it).
Unfortunately, you may be the wrong gender to contribute to Encode repositories like httpx:
> I've closed off access to issues and discussions.
> I don't want to continue allowing an online environment with such an absurdly skewed gender representation. I find it intensely unwelcoming, and it's not reflective of the type of working environments I value.
— https://github.com/encode/httpx/discussions/3784
Discussed on Hacker News here: https://news.ycombinator.com/item?id=47193563
A fork discussed here: https://news.ycombinator.com/item?id=47514603
It's annoying that somehow talking to S3 etc requires so much churn. We have been trying to cache session objects and the like but clearly are still missing something.
Chasing this down has also made me realize how rarely Python libs use `weakref`; they just build up lots of circular references. The other day I figured out that Django's request session infrastructure creates a circular reference, meaning that requests have to be collected by the GC to get cleaned up in CPython.
I have a suspicion that the 3.14 problems are heavily linked to "real" workloads being almost entirely filled with cyclical objects.
Is there a way to make all this much easier to debug and to prevent memory issues in the first place? Is the abstraction level not quite right?
So for example
class Request
class Session
request.session exists, and the session is "part" of the request. But session.request often exists as a convenience. That's a reference cycle, which prevents the request (and anything it points at!) from being deallocated by reference counting alone at the end of a request.
But in this case, you could easily do something like:
session._request = weakref.ref(request) # on session creation
and then have session.request call session._request() (and maybe assert session._request() is not None if you want to be certain). If you're confident that the session is a "child" of the request, and that you would _never_ have a hold of the session after the request is done, this is a cheap trick that makes session.request cost a little bit more but not much.
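A fuller sketch of that trick, in case it helps. The `Request` and `Session` classes here are toy versions written for illustration, not Django's; the collector is switched off only to show that reference counting alone is enough once the back-reference is weak.

```python
import gc
import weakref


class Session:
    def __init__(self, request: "Request") -> None:
        # Store only a weak reference to the parent, so no cycle is formed.
        self._request = weakref.ref(request)

    @property
    def request(self) -> "Request":
        request = self._request()
        assert request is not None, "session outlived its request"
        return request


class Request:
    def __init__(self) -> None:
        self.session = Session(self)  # strong reference: request owns session


gc.disable()  # show that plain reference counting is enough here
try:
    request = Request()
    session_probe = weakref.ref(request.session)
    same_object = request.session.request is request
    del request                       # last strong reference goes away
    freed_without_gc = session_probe() is None
finally:
    gc.enable()
```

The cost is one extra call and a `None` check per `session.request` access, and the assumption that nobody holds on to a session after its request is gone.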
I think most Python libraries just don't do memory perf analyses here, and also "believe" in the garbage collector. When GC runs, both request and session will get deallocated, after all! But the long-term effects of everyone relying on the GC are that GC is expensive when it doesn't need to be, and when looking through memory you just have more stuff to dig through.
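One way to see how much a piece of code leans on the collector is to ask `gc` to keep whatever it finds instead of freeing it. This is only a sketch: `Node` and `count_cyclic_garbage` are made-up names, and in a real service you would call the code under suspicion where `Node()` is.

```python
import gc
from collections import Counter


class Node:
    """Hypothetical object that ends up in a reference cycle."""

    def __init__(self) -> None:
        self.me = self  # simplest possible cycle


def count_cyclic_garbage() -> Counter:
    gc.collect()                      # start from a clean slate
    old_flags = gc.get_debug()
    gc.set_debug(gc.DEBUG_SAVEALL)    # keep unreachable objects in gc.garbage
    try:
        Node()                        # dropped immediately, but the cycle survives
        gc.collect()
        # Tally what only the cyclic collector was able to reclaim.
        return Counter(type(obj).__name__ for obj in gc.garbage)
    finally:
        gc.set_debug(old_flags)
        gc.garbage.clear()
```

Anything that shows up in the tally is memory that reference counting alone could not release, which makes it a candidate for a `weakref` or an explicit teardown.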
"We’ve decided to revert it in both 3.14 and 3.15, and go back to the generational GC from 3.13."
Sounds like the right move to me.
Lately, they seem to be working with CRIU, various heuristics, multi-stage in-process bytecode compilation...
Java is a mess; they are working hard to avoid fixing their issue (one that nobody else has, so fixes are available).
PyPy doesn't have the support it needs and is stuck on 3.11.
Obviously it's not easy to move the whole language of a big codebase, but I feel a lot of this stuff (fiddling with GC, JITing, type hints, and I'm dubious about the free-threading stuff) tries to take Python somewhere it isn't really good at, and if that's what you want, you really want a different language.
Not to mention that there are differences in ecosystem, familiarity, and ergonomics that may make a team want to stick with Python.
“Just use Go” is not really actionable advice in most cases.
Jython went EOL with Python 2 going EOL.
It's predictable vs Rust, C#, F#, Elixir, Go, etc.?
Java? Nope, you're getting a fundamental change in Valhalla.
C++? Nope, new language edition every few years with fundamental changes.
C? C23 has a number of fairly fundamental changes, expect more in the next language revision.
I think your sense of causality is backwards here. These languages are getting fundamental changes because they're being widely used. That is what motivates and drives the change. Languages with no users don't need to change.
But most such languages handle compatibility with legacy applications much better.
Python is the main culprit in most cases when I see conflicts between various software packages that insist on using only a specific version of their dependencies. This is why I have to keep many versions of Python installed, and the Linux distribution that I use must take care to prevent interference between those Python versions.
That's fine, but that's clearly not what I'm talking about.
Languages like F#, Elixir, etc. don't undergo fundamental changes. Yes, every language evolves. But for Python, we're talking about grafting literally fundamental stuff on top of a language not designed for any of these things.
For example, if someone went and redesigned Python to solve its warts, you'd basically end up with F#.
I thought that by now dynamic garbage collection was a known quantity, so that making changes, outside of outright bugs, is fairly safe and predictable?
So any change to the GC starts with the massive .NET code base at MSFT, which means they get extremely good telemetry back about any downsides and might be able to fix them in time.
There has been almost no dogfooding in Windows development since version 8, the TypeScript team would rather rewrite the compiler in Go, and Azure has plenty of Go, Rust and Java projects alongside .NET.
The issue with Windows development is not "we are not dogfooding"; it's that incentives are misaligned with what customers want.
The .NET team's incentives are aligned with what customers want: provide a language that is highly performant and easy enough to write.
Go is, essentially, nearly perfect at what it does - even if the language itself leaves much to be desired and would ideally be much safer.
Microsoft should up their game. They have a few research languages in development.
They've always been great with languages. Hopefully, they rise to the occasion.
I’ll confess the reason it hit us so hard is because the code quality was so low and wasteful on allocations that it didn’t hide the problem as well as previous versions.
So I think it was not a big problem for .NET because it gave you enough control over the GC, and because people tested their code before putting it in production.
I hope Meta switches Instagram to PHP/Hack so they leave Python alone.
Free-threading actually uses its own, separate GC: https://labs.quansight.org/blog/free-threaded-gc-3-14
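If you are not sure which kind of build (and therefore which collector) you are running, something like this should tell you. It is a small sketch: the two helper names are made up, and it relies on the `Py_GIL_DISABLED` build variable and `sys._is_gil_enabled()`, which only exists on 3.13 and later.

```python
import sys
import sysconfig


def is_free_threaded_build() -> bool:
    # Py_GIL_DISABLED is 1 on free-threaded builds; on other builds
    # (and on older Pythons) it is 0 or None.
    return bool(sysconfig.get_config_var("Py_GIL_DISABLED"))


def gil_enabled_now() -> bool:
    # sys._is_gil_enabled() was added in 3.13; older interpreters always
    # run with the GIL, so fall back to True when it is missing.
    check = getattr(sys, "_is_gil_enabled", None)
    return True if check is None else bool(check())
```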
You are free to switch language but you still need to understand it.
We are just different. That's not something to be mad about.
Python has a different problem: it is slow as f---. I did a micro benchmark comparison against 5 other languages in preparation for my Python replacement language. Outside of dictionary lookups, it is 50-600 times slower than C depending on the workload.
Go, Rust etc. are fine. They land at 1.25-3x slower than C. But I prefer the readability of Python minus its dynamic nature.
Also, even if it looks like that to you, there are still people that write code with their own hands.