sushid
1y ago
Is that not just traditional OCR layered on top of an LLM?
energy123
1y ago
It's possible they have a software layer that does that. But I was assuming they don't, because the open-source multimodal models don't.
maxlamb
1y ago
No, it’s not. It’s a multimodal transformer model.