I'd imagine other areas around StackOverflow (SQL, R?) are fighting similar issues. I've just tried it with a question (sure enough, the second-newest Pandas-tagged question had a table as an image), and your tool produced a nice .csv.
It would be a godsend to have a button on StackOverflow that would replace a user-uploaded image of a table with some Pandas code that constructs the same DataFrame. Currently I would have to download the image, upload it to extract-table.com, download the .csv, load it into Python, and run some code to create the code-based DataFrame.
I'd consider sending people on StackOverflow to your tool if you cut down some of the steps: (1) allowing users to paste in the URL of an image, and (2) producing Pandas code output that can be directly copy/pasted from the site (without having to download a csv).
For illustration: here's what the Pandas code would look like for the first example of extract-table.com:
df = pd.DataFrame({'Name': {0: 'David', 1: 'Jessica', 2: 'Warren'}, 'Gender': {0: 'Male', 1: 'Female', 2: 'Male'}, 'Age': {0: 23, 1: 47, 2: 12}})
[1] https://www.johnsnowlabs.com/spark-ocr/
[2] https://www.adobe.io/apis/documentcloud/dcsdk/pdf-extract.ht...
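For what it's worth, the "turn the .csv into copy/pasteable Pandas code" step described above could be sketched roughly like this. This is only a minimal sketch using the standard library: `csv_to_pandas_code` and `_parse` are hypothetical names, not part of extract-table.com or Pandas, and it assumes a clean CSV with a header row and no ragged rows. The sample data mirrors the Name/Gender/Age example above.

```python
import csv
import io


def _parse(value):
    """Turn a CSV cell into an int or float when possible, else keep the string."""
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            pass
    return value


def csv_to_pandas_code(csv_text, var_name="df"):
    """Return one line of Python source that rebuilds the CSV as a DataFrame.

    Hypothetical helper: assumes the first row is the header and every
    following row has the same number of cells.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, body = rows[0], rows[1:]
    # One list per column, in header order.
    columns = {
        name: [_parse(row[i]) for row in body]
        for i, name in enumerate(header)
    }
    return f"{var_name} = pd.DataFrame({columns!r})"


sample = "Name,Gender,Age\nDavid,Male,23\nJessica,Female,47\nWarren,Male,12\n"
print(csv_to_pandas_code(sample))
```

The printed line can be pasted straight into an answer (after `import pandas as pd`), so nobody has to download anything.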
I used to work for a bank on their innovation team and pitched basically this, but as an intern I had neither the skill nor the time to do it. But it was certainly something a bunch of people internally wanted.
Do you happen to know how to paste regular UTF-8 text into Excel/Google sheets as multiple cells? If I copy two cells in Sheets, I get a tab character (\t) between the cells. But if I try to paste "hello \t world" into Sheets then it's just dumped into one cell.
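One thing that might be worth checking (an assumption on my part, I can't see what was actually on the clipboard): spreadsheets generally split pasted text on real tab characters and newlines, so the text has to contain an actual tab (U+0009), not a backslash followed by the letter t. A minimal sketch for producing such text, where `rows_to_tsv` is a hypothetical helper name:

```python
import csv
import io


def rows_to_tsv(rows):
    """Serialize rows as tab-separated text with real tab characters."""
    buffer = io.StringIO()
    # delimiter="\t" is an actual tab character, not the two characters "\" and "t".
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


tsv = rows_to_tsv([["hello", "world"], ["foo", "bar"]])
print(repr(tsv))
```

Copying the resulting text (not its `repr`) to the clipboard and pasting it should then give one cell per field, assuming the spreadsheet treats it like its own copied cells.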
At least my bank was comfortable with cloud everything and people using APIs from approved partners. If you can write the report in Google Docs, as long as they were the ones plugging in their API key for the OCR, I imagine it would be fine.
Update: I just checked a bit more carefully, and this example[4] is also missing the last row.
Also, Danish ø seems problematic on your web page whereas the CSV has the right UTF-8 encoded bytes.
[1]: https://i.stack.imgur.com/y7Zrt.png
[2]: https://stackoverflow.com/q/69363708/100754
[3]: https://results.extract-table.com/8d4818867ad604792819e98808...
[4]: https://results.extract-table.com/254d95722a2c2b1df72fc26b59...
I'm not sure what application you are thinking of. But the reason I'm following this problem is UX. Years ago, I worked on a project where anyone could add product prices to a DB. They did that by typing their receipt (line items) into the DB. The major issue was that the UX was horrible.
With an API like yours, this is super simple. One photo. That's all.
Maybe I'll revisit it as a side project.
Can AWS Textract be used directly with curl to return text strings of an uploaded image?
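Not a curl answer, but as a rough sketch of the same idea in Python: Textract requests have to be signed with AWS credentials (Signature Version 4), which is the awkward part to do by hand, so the usual route is an SDK such as boto3. The sketch below assumes boto3 is installed and credentials are already configured; `detect_text` and `lines_from_textract` are hypothetical helper names, and the `fake_response` dict is made up purely to show the response shape.

```python
def lines_from_textract(response):
    """Collect the text of every LINE block in a Textract response dict."""
    return [
        block["Text"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE"
    ]


def detect_text(image_path):
    """Send a local image to Textract and return its lines of text.

    Assumes boto3 is installed and AWS credentials/region are configured.
    """
    import boto3  # third-party; imported here so the rest runs without it

    client = boto3.client("textract")
    with open(image_path, "rb") as f:
        response = client.detect_document_text(Document={"Bytes": f.read()})
    return lines_from_textract(response)


# Made-up response fragment, only to illustrate the structure being parsed.
fake_response = {
    "Blocks": [
        {"BlockType": "PAGE"},
        {"BlockType": "LINE", "Text": "Name Gender Age"},
        {"BlockType": "WORD", "Text": "Name"},
        {"BlockType": "LINE", "Text": "David Male 23"},
    ]
}
print(lines_from_textract(fake_response))
```

Calling `detect_text("receipt.png")` (hypothetical file name) would return the detected lines as plain strings.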