
brad0 7y ago

Evernote is an interesting case.

They store every word that MAY be in the scanned document.

So their OCR engine will find a lot of legitimate words, but it will also find a lot of words that don't make sense.

When you enter a search term, it looks at the entire index (both the legit words and the garbage) and returns the documents that match.
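A rough sketch of that idea, not Evernote's actual implementation: an inverted index that keeps every candidate word, garbage included. The file names, candidate words, and the `build_index` / `search` helpers are all made up for illustration.

```python
from collections import defaultdict

def build_index(ocr_output):
    """Map every candidate word (legit or garbage) to the documents it may appear in."""
    index = defaultdict(set)
    for doc_id, candidate_words in ocr_output.items():
        for word in candidate_words:
            index[word.lower()].add(doc_id)
    return index

def search(index, term):
    """Return the documents whose candidate words include the search term."""
    return sorted(index.get(term.lower(), set()))

# Hypothetical OCR output: each document lists every word the engine
# thinks MAY be present, including misreads.
ocr_output = {
    "receipt.png": ["total", "tota1", "t0tal", "coffee", "c0ffee"],
    "whiteboard.jpg": ["meeting", "rneeting", "total", "notes"],
}

index = build_index(ocr_output)
print(search(index, "total"))   # ['receipt.png', 'whiteboard.jpg']
print(search(index, "coffee"))  # ['receipt.png']
```

The garbage entries cost index space but rarely hurt, since nobody types "tota1" into the search box.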

I think it's quite clever.

Bear in mind that this is how the feature worked many years ago; I have no idea if this is still the case.


ocrcustomserver 7y ago

Yeah, Evernote's OCR engine will generate possible candidates for every given word and will sort them internally by confidence score.

Screenshot: https://s24953.pcdn.co/blog/wp-content/uploads/2018/02/longh...

Since it's aimed not at transcription (user doesn't know what he's looking for) but at retrieval (user knows what he's looking for), it can get away with mistakes.
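To make the difference concrete, here is a toy sketch. The data layout, the scores, and the `transcribe` / `matches` helpers are assumptions for illustration, not Evernote's real format. Transcription has to commit to one candidate per word, while retrieval only needs the query to show up somewhere in the candidate list.

```python
# Hypothetical recognition result: for each word position in the image,
# a list of (candidate, confidence) pairs.
recognized = [
    [("invoice", 0.62), ("lnvoice", 0.30)],
    [("nurnber", 0.48), ("number", 0.45)],
]

def transcribe(positions):
    """Transcription: commit to the single most confident candidate per position."""
    return [max(cands, key=lambda c: c[1])[0] for cands in positions]

def matches(positions, term):
    """Retrieval: best confidence of any candidate equal to the term, or None."""
    scores = [conf for cands in positions for word, conf in cands if word == term]
    return max(scores) if scores else None

print(transcribe(recognized))         # ['invoice', 'nurnber']
print(matches(recognized, "number"))  # 0.45
```

The transcription gets the second word wrong, but a search for "number" still hits, because the correct reading survives as a lower-ranked candidate.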

References:

https://evernote.com/blog/how-evernotes-image-recognition-wo...

https://help.evernote.com/hc/en-us/articles/208314518-How-Ev...

https://evernote.com/blog/evernote-indexing-system/

julianz 7y ago

Yep, it's quite clever for searching for things, but much less useful for doing something based on the recognized text.

ocrcustomserver 7y ago

OneNote can do transcription (copy text from image).
