Kaggle Launches LLM Evals

(kaggle.com)

9 points | antgoldbloom | 10mo ago | 4 comments

Here’s the announcement: https://www.kaggle.com/blog/announcing-kaggle-benchmarks

I was the founder and CEO of Kaggle. I’ve been out of Kaggle for 2.5 years. Super excited to see this announcement. It could solve the biggest problem in the LLM ecosystem.

art82135 | 10mo ago

Curious: how does it compare to Chatbot Arena?

meganrisdal | 10mo ago

We love what Chatbot Arena is doing to innovate on evaluation paradigms. The challenge of evaluating GenAI warrants diverse approaches. What we're excited to do is: 1) give anyone access to infra to make evaluation more accessible to more developers and researchers; 2) drive more novel, diverse evals. https://arxiv.org/abs/2505.00612v2

benhamner | 10mo ago

Can we add our own models or benchmarks?
