1 Cutting inference cold starts by 40x with LP, FUSE, C/R, and CUDA-checkpoint (modal.com) 91 charles_irl 7d ago 18
3 Three types of LLM workloads and how to serve them (modal.com) 75 charles_irl 4mo ago 5
4 Host overhead is killing your inference efficiency (modal.com) 3 charles_irl 6mo ago 0
8 The future of Python web services looks GIL-free (blog.baro.dev) 3 charles_irl 7mo ago 0
9 Lexical differential highlighting instead of syntax highlighting (wordsandbuttons.online) 2 charles_irl 7mo ago 0
12 In C++ modules globally unique module names seem to be unavoidable (nibblestew.blogspot.com) 2 charles_irl 7mo ago 0
15 A Tour of eBPF in the Linux Kernel: Observability, Security and Networking (lucavall.in) 2 charles_irl 8mo ago 0