
FlyingLawnmower 8mo ago

We benchmarked a range of structured decoding providers fairly thoroughly in one of our papers: https://arxiv.org/abs/2501.10868v3 . It measures structured output providers on performance, constraint flexibility, downstream task accuracy, etc.

Happy to chat more about the benchmark. Note that the results are a bit out of date, though: I'm sure many of the providers we tested have made improvements (and some have switched wholesale to using llguidance as a backend).


0x4FFC8F 8mo ago

I think @dcreater was asking how these structured decoding providers compare with how pydantic ai handles structured output, i.e. via tool calling: the LLM is forced to call a tool whose arguments are defined by a JSON schema, so you read the tool call arguments and get a structured output.
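To make the tool-calling route concrete, here is a rough sketch using only the Python standard library. It is not pydantic ai's actual API: the tool definition, the `FAKE_RESPONSE` dict standing in for a model reply, and the helper `extract_structured_output` are all hypothetical names made up for illustration, and the validation covers only a tiny subset of JSON schema.

```python
import json

# Hypothetical tool definition: the tool's parameters are described by a JSON schema.
WEATHER_TOOL = {
    "name": "report_weather",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string"},
            "temp_c": {"type": "number"},
        },
        "required": ["city", "temp_c"],
    },
}

# Stand-in for a model reply in which the LLM was forced to call the tool.
# Real providers use their own response shapes; this one is an assumption.
FAKE_RESPONSE = {
    "tool_call": {
        "name": "report_weather",
        "arguments": '{"city": "Oslo", "temp_c": 4.5}',
    }
}

# Maps the few JSON schema types handled here to Python types.
_TYPES = {"string": str, "number": (int, float)}


def extract_structured_output(response, tool):
    """Read the tool call arguments and check them against the tool's schema."""
    call = response["tool_call"]
    if call["name"] != tool["name"]:
        raise ValueError(f"unexpected tool: {call['name']}")

    args = json.loads(call["arguments"])
    schema = tool["parameters"]

    # Every required field must be present.
    for key in schema["required"]:
        if key not in args:
            raise ValueError(f"missing required field: {key}")

    # Every field must be declared in the schema and have the declared type.
    for key, value in args.items():
        if key not in schema["properties"]:
            raise ValueError(f"unknown field: {key}")
        expected = _TYPES[schema["properties"][key]["type"]]
        if not isinstance(value, expected):
            raise ValueError(f"wrong type for field: {key}")

    return args


result = extract_structured_output(FAKE_RESPONSE, WEATHER_TOOL)
print(result)
```

The point of the sketch is that the structure is checked after generation, from the tool call arguments, whereas the structured decoding providers in the benchmark constrain the tokens while they are being sampled.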

dcreater 8mo ago

Thanks for the paper link! I'm surprised there is such a minimal improvement in structured outputs when using any of these tools over the bare LLM!
