Excuse me? I maintain a production system that cares about low latency for single events. Declaring that it doesn’t have “mechanistic sympathy” entirely misses the point. Of course I’m not squeezing the most throughput out of every cycle of my CPU. I have a set of design requirements, I understand what the kernel and CPU and IO system do under the hood, and I designed the system to make the most of the resources at hand to achieve the design requirements. Which, in this case, are minimal latency for single events or small groups of events, and io_uring would have no benefit.
(I can steam in events at a very nice rate as measured in events/sec, but I never tried to optimize that, and I should not try to optimize that because it would make the overall system perform worse.)