7. Systematically generating tests that would have caught Anthropic's top‑K bug (theorem.dev) | 2 points by ag8 | 7mo ago | 0 comments
9. Training Qwen to answer briefly yet intelligently using feedback control (runrl.com) | 4 points by ag8 | 8mo ago | 0 comments
10. Launch HN: RunRL (YC X25) – Reinforcement learning as a service (runrl.com) | 71 points by ag8 | 8mo ago | 22 comments