It is hard to trust "you" when ChatGPT wrote that text. You never know which part of the answer is genuine and which part was fabricated.
To actually answer that question: Pricing varies quite a bit depending on what exactly you want to do with a document.
Text detection generally costs $1.50 per 1,000 pages:
https://cloud.google.com/vision/pricing
https://aws.amazon.com/textract/pricing/
https://azure.microsoft.com/en-us/pricing/details/ai-documen...
You wouldn't get a Markdown document generated automatically (or at least you couldn't when I last used it a few years ago), but you did get an XML document.
That XML document was actually better for our purposes because it gives you a confidence score and is properly structured, so floating frames, tables, and columns come out properly structured in the output document. This reduces the risk of hallucinations.
It’s less of an out-of-the-box solution but that’s to be expected with AWS APIs.
And it’s cheaper too.
https://aws-samples.github.io/amazon-textract-textractor/not...
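The confidence scores are the useful part here. (Note: the current Textract API actually returns JSON blocks rather than XML, but the idea is the same: every detected line carries a confidence score you can filter on.) A minimal sketch; the sample response below is fabricated for illustration, since a real one would come from boto3's Textract client:

```python
# Sketch: routing Textract-style output by confidence score.
# The sample response is made up; a real one comes from
# boto3 textract detect_document_text / analyze_document.

CONFIDENCE_THRESHOLD = 90.0  # assumption: flag anything below 90% for review

sample_response = {
    "Blocks": [
        {"BlockType": "LINE", "Text": "Invoice #1042", "Confidence": 99.2},
        {"BlockType": "LINE", "Text": "Total: $1,337.00", "Confidence": 84.5},
        {"BlockType": "LINE", "Text": "Due 2024-01-31", "Confidence": 97.8},
    ]
}

def split_by_confidence(response, threshold=CONFIDENCE_THRESHOLD):
    """Return (accepted, needs_review) lists of text lines."""
    accepted, needs_review = [], []
    for block in response["Blocks"]:
        if block["BlockType"] != "LINE":
            continue
        bucket = accepted if block["Confidence"] >= threshold else needs_review
        bucket.append(block["Text"])
    return accepted, needs_review

accepted, needs_review = split_by_confidence(sample_response)
print(needs_review)  # low-confidence lines go to a human
```

This is exactly the property an LLM-based pipeline lacks: a hallucinated line arrives with no score telling you to look at it twice.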
It's very consistent, though pricey.
Unless it is. We have a few hundred PDFs per month (mostly tables) where we need 100% accuracy. Currently we feed them into an OCR system and have humans check the result. I gain nothing if I have to check the LLM output, too.
It is off by 2 orders of magnitude.
My guess is you're using the token counting algorithm for pre-4o with the costs for 4o and later.
That aside, I strongly suggest taking a week off from code-outside-work and using that time to reflect-as-work. The post and ensuing comments are a horror show. Don't take it too hard; it probably won't matter in the long run, and no one's going to remember.
But you'd get a lot out of taking it harder than you did in the comments I've seen, including one this morning where you replied to me. It worries me that you don't seem to understand how sloppy this work is.
When I was 14, my math teacher gave me a 0 on a test because I just wrote the answers instead of showing work. That gave me a powerful appreciation for being precise, clear, and accurate.
The only positive outcome is that even though there were enough upvotes for a simple, sloppy, mispurposed GPT wrapper to end up on the front page for ~16 hours, the comments near-universally seem to understand that there are a lot of problems with how this was shared.
Would you contrast your accuracy with Textract's? Textract is 10x cheaper than this at approximately 1 cent per page (and 20x cheaper than CloudConvert). Which documents make more sense to use with your tool? Is it worth waiting until gpt-4o costs drop 10x at the same quality level (i.e., not gpt-4o-mini) before using this? In my use case it's better to drop a document than to hallucinate.
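To make those ratios concrete: only the Textract figure comes from the comment itself; the other two per-page prices are implied by the "10x" and "20x" claims, not published quotes:

```python
# Per-page prices implied by the comparison above. Only the
# Textract figure is stated in the comment; the others are
# derived from the claimed multiples, not price lists.
TEXTRACT = 0.01               # ~1 cent per page (from the comment)
THIS_TOOL = 10 * TEXTRACT     # implied by "10x cheaper"
CLOUDCONVERT = 20 * TEXTRACT  # implied by "20x cheaper"

pages_per_dollar = {
    name: round(1 / price)
    for name, price in [("textract", TEXTRACT),
                        ("tool", THIS_TOOL),
                        ("cloudconvert", CLOUDCONVERT)]
}
print(pages_per_dollar)
```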
What do you think makes sense in relation to Textract?
I think in general it's very hard to say whether any approach is "good enough" until you have seen a serious degree of variability in the input domain.