2 My benchmark for large language models (nicholas.carlini.com) 4 cheviethai123 2y ago 2