1) No support for JSON output
2) No support for PDFs
llm-document-ocr is a simple Node library that handles these pre- and post-processing steps for you. It converts PDFs into PNGs, crops the whitespace around the images, and parses the JSON output.
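To give a feel for the post-processing step, here is a minimal sketch of pulling JSON out of a model's raw text reply. It is not llm-document-ocr's actual code or API: the helper name parseModelJson is made up for illustration, and it assumes the model returns a single JSON object, possibly wrapped in a Markdown code fence or surrounded by extra prose.

```javascript
// Hypothetical helper for illustration; NOT part of llm-document-ocr's API.
// Assumes the reply contains one JSON object, optionally inside a Markdown
// code fence or surrounded by extra prose.
const FENCE = "`".repeat(3); // three backticks, built at runtime

function parseModelJson(raw) {
  // Prefer the contents of a fenced block if the model emitted one.
  const fencedPattern = new RegExp(
    FENCE + "(?:json)?\\s*([\\s\\S]*?)" + FENCE,
    "i"
  );
  const fenced = raw.match(fencedPattern);
  const candidate = fenced ? fenced[1] : raw;

  // Keep only the outermost {...} span to skip leading or trailing prose.
  const start = candidate.indexOf("{");
  const end = candidate.lastIndexOf("}");
  if (start === -1 || end <= start) {
    throw new Error("No JSON object found in model output");
  }

  // JSON.parse throws a SyntaxError if the span is not valid JSON.
  return JSON.parse(candidate.slice(start, end + 1));
}

// Usage: a fenced reply and a reply with surrounding prose.
const fencedReply = FENCE + 'json\n{"total": 42}\n' + FENCE;
console.log(parseModelJson(fencedReply));
console.log(parseModelJson('Here is the result: {"items": [1, 2]} Thanks!'));
```

A sketch like this only covers the simple cases. Replies containing several objects, or braces inside the surrounding prose, would need stricter handling.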
Hope this saves you some time if you are building your own OCR stack on top of GPT and other LLMs!