We did a fairly thorough benchmark of structured decoding providers in one of our papers, measuring them on performance, constraint flexibility, downstream task accuracy, etc.:
https://arxiv.org/abs/2501.10868v3
Happy to chat more about the benchmark. Note that the results are a bit out of date, though: I'm sure many of the providers we tested have made improvements since (and some have switched wholesale to using llguidance as a backend).