DatBench fixes VLM evals: 70% blindly solvable, 42% mislabeled, 35% prod gap

(datologyai.com)

5 points by hurrycane 4mo ago | 0 comments

No comments yet.