No one has claimed that getting structured data out of PDFs is sane. What you seem to be missing is that there is no sane way to get decent output. The reasonable choice would be to not even try, but business needs rule that choice out. So what remains are the absurd ways of solving the problem.