1. Low-Latency Inference with Speculative Decoding on D-Matrix Corsair and GPU (gimletlabs.ai) | 1 point | nserrino | 14d ago | 0 comments
2. The emerging role of SRAM-centric chips in AI inference (gimletlabs.ai) | 3 points | nserrino | 20d ago | 0 comments
3. Speeding up PyTorch inference on Apple devices with AI-generated Metal kernels (gimletlabs.ai) | 187 points | nserrino | 6mo ago | 30 comments
4. Show HN: Pixie, open source observability for Kubernetes using eBPF (github.com) | 6 points | nserrino | 3y ago | 3 comments
6. Observing HTTP/2 Traffic Is Hard, but eBPF Can Help (blog.px.dev) | 91 points | nserrino | 4y ago | 4 comments
9. Horizontal Pod Autoscaling with Custom Metrics in Kubernetes (blog.px.dev) | 4 points | nserrino | 4y ago | 0 comments