5 LLMs 'benchmark' where they write code controlling units in a 1v1 RTS (yare.io) 2 levmiseri 1mo ago 0
6 LLM 'benchmark' – writing code controlling units in a 1v1 RTS (yare.io) 6 levmiseri 1mo ago 0
11 Run an autonomous company without human intervention (paperclip.ing) 1 levmiseri 2mo ago 0