7. Boosting multimodal inference performance by >10% with a single Python dict (modal.com) | 16 points by jxmorris12 19d ago | 0 comments
8. Why are neural networks and cryptographic ciphers so similar? (2025) (reiner.org) | 143 points by jxmorris12 24d ago | 48 comments
9. Using group theory to explore the space of positional encodings for attention (blog.janestreet.com) | 48 points by jxmorris12 25d ago | 0 comments
10. Pyptx – A Python DSL to Write Nvidia PTX for Hopper and Blackwell (github.com) | 3 points by jxmorris12 29d ago | 0 comments
11. Which one is more important: more parameters or more computation? (2021) (parl.ai) | 59 points by jxmorris12 1mo ago | 13 comments
13. Work with the garage door up (2024) (notes.andymatuschak.org) | 202 points by jxmorris12 1mo ago | 127 comments
14. Six (and a half) intuitions for KL divergence (perfectlynormal.co.uk) | 114 points by jxmorris12 1mo ago | 33 comments