1. 10Gb/s Ethernet: using mini-heatsinks with a 10GBASE-T SFP+ module (gilesthomas.com) | 3 points | gpjt | 7d ago | 0 comments
2. 10Gb/s Ethernet: what I did to get it working in my home (gilesthomas.com) | 232 points | gpjt | 26d ago | 177 comments
4. LLM from scratch, part 33 – what I learned from the appendices (gilesthomas.com) | 5 points | gpjt | 1mo ago | 0 comments
5. LLM from scratch (32l) – Interventions: updated instruction fine-tuning results (gilesthomas.com) | 1 point | gpjt | 1mo ago | 0 comments
7. LLM from scratch, part 32k – Interventions: gradient accumulation (gilesthomas.com) | 2 points | gpjt | 1mo ago | 0 comments
9. LLM from scratch, part 32j – trying to train a better model in the cloud (gilesthomas.com) | 2 points | gpjt | 1mo ago | 0 comments
10. Writing an LLM from scratch, part 32i – Interventions: what is in the noise? (gilesthomas.com) | 1 point | gpjt | 1mo ago | 0 comments
11. Writing an LLM from scratch, part 32h – Interventions: full fat float32 (gilesthomas.com) | 7 points | gpjt | 1mo ago | 0 comments
12. Writing an LLM from scratch, part 32g – Interventions: weight tying (gilesthomas.com) | 2 points | gpjt | 2mo ago | 0 comments
13. Writing an LLM from scratch, part 32f – Interventions: weight decay (gilesthomas.com) | 6 points | gpjt | 2mo ago | 0 comments
14. Writing an LLM from scratch, part 32e – Interventions: the learning rate (gilesthomas.com) | 3 points | gpjt | 2mo ago | 0 comments
15. Writing an LLM from scratch, part 32d – Interventions: adding attention bias (gilesthomas.com) | 6 points | gpjt | 3mo ago | 0 comments