One thing I'm curious about: how did you build your corpus of meme images and videos?
If it works, it works. But it also speaks volumes about Apple's disregard for (or inexperience with) exposing their stuff via the web - https://www.icloud.com/ being the prime example: half the stuff the phone apps can do is not available (you cannot create a reminder with a due date...), and the things that are there are slow and buggy.
I’m around age 30, not 13, so similar to the article, my first instinct was also to create a database and OCR the image. But by total coincidence, yesterday I had a conversation with my 14 year old cousin on the topic of saving memes. Her response was along the lines of “yeah, everyone nowadays just saves the image to your iPhone photos, and then just search for it later from the photos app”.
Yeah. This whole article is literally already built into iOS UI, not just a hidden API. And kids all seem to know about this, apparently.
This article uses an example meme with the text “Sorry young man But the armband (red) stays on during solo raids”. I saved it in my iPhone photos app… and found it again through the search function in the photos app.
This is a solved problem already, by teenager standards.
I felt extremely old yesterday when I was talking to my cousin, and I felt extremely old today reading this article. Looking back, the past few decades of CS cultural intuition have established that text is text and images are images. Strings and bitmaps don’t mix.
This seems sort of obvious to anyone in tech, but I realized that from a clueless grandma perspective, not being able to search up text in photos wasn’t really obvious. Well, the roles are reversed now. Ordinary people now have access to software that treats text as a first class citizen in photos by default.
> Initial testing with the Postgres Full Text Search indexing functionality proved unusably slow at the scale of anything over a million images, even when allocated the appropriate hardware resources.
I can guarantee you that a correctly set up PostgreSQL text search will be faster than ES, with much, much fewer hardware resources needed. It's just a matter of creating a tsvector column and a GIN index on it (and of course writing the right queries so the index is actually used). I can help you set up the Postgres schema and debug the queries if you're interested, for testing purposes at least.
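To illustrate the setup described above, here's a minimal sketch of that schema. The table and column names (`memes`, `ocr_text`) are made up for the example, and the generated-column syntax assumes Postgres 12+:

```sql
-- Hypothetical "memes" table with OCR output in an "ocr_text" column.
ALTER TABLE memes ADD COLUMN search_vec tsvector
    GENERATED ALWAYS AS (to_tsvector('english', coalesce(ocr_text, ''))) STORED;

CREATE INDEX memes_search_idx ON memes USING GIN (search_vec);

-- The query must match against the same tsvector column,
-- otherwise the GIN index won't be used:
SELECT id, ocr_text
FROM memes
WHERE search_vec @@ plainto_tsquery('english', 'armband stays on')
ORDER BY ts_rank(search_vec, plainto_tsquery('english', 'armband stays on')) DESC
LIMIT 20;
```

On older Postgres versions you'd maintain `search_vec` with a trigger instead of a generated column, but the index and query shape stay the same.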
If you want to go beyond meme sites and possibly detect memes in the wild, common crawl might be something to start with.
People in the 21st century know a lot about the mistakes of the past century that led to much of the popular culture of the time being lost (especially terminally online people who've watched lots of YouTube documentaries about lost Dr. Who episodes and so on), so it surprises me how little we try to avoid those same mistakes with today's ephemeral pop culture in the form of memes. People like yourself who want to help make the internet's huge corpus of memes tractable are part of the solution in terms of meme archival and cultural memory.
(There's a good meme metadiscussion group on Discord, "The Philosopher's Meme," which you might be interested in joining. People there would be very keen to discuss what you've made.)
Memes are a core part of the Reddit experience, yet it's difficult to find something I know I saw before.
and specifically:
https://developer.apple.com/documentation/vision/vnrecognize...
He does mention running it on a MacBook
At one of my previous workplaces, we discussed running the Z3 theorem prover on an iPhone cluster, because it runs so much faster on A-series processors than on a desktop Intel machine.
Low entry cost, recycling, eco-friendly, . . . a deck that writes itself.
My question is about the image distribution costs. All the memes on the site seem to be coming straight off object storage, and all that bandwidth consumption has got to add up(?). Some sort of CDN might help, depending on the search patterns.
Ideally there would be a best of both worlds where you could search memes by "characters" or "formats" in addition to text.
As feedback, it would be nice to search all memes from the homepage. The search on https://www.memeatlas.com/meme-templates.html also seems to be broken.
I'm so, so glad to see that I'm not the only person in the world with the same "problem". Well done, mandatory.
edit: holy crap you even index videos, nice
The framework is supported on macOS (even tvOS apparently) https://developer.apple.com/documentation/vision
If so, is there any risk in getting your account suspended or ip range banned somehow because of this, for example?
Now, after reading the article, I gave your search engine a try. I was looking for that futurama its a trap meme (pretty much pops up on any image search here https://www.google.com/search?q=futurama+its+a+trap)
The problem is that the search engine you built is very text-heavy, and the text is usually quite disconnected from the actual meme. So searching for "its a trap" did not yield the results I was hoping for, though that made total sense once I looked at how the search was implemented.
Are you planning to implement actual tagging of the content of some sort? Maybe clustering of similar objects (like the iPhone clusters similar people's faces in the gallery) and then tagging those clusters with keywords somehow?
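Face clustering on the iPhone uses learned embeddings, but for grouping near-duplicate meme images (same template, slightly different crop or compression) even a simple perceptual hash can get surprisingly far. A toy sketch in pure Python, with hand-made "images" as 2D pixel lists (a real pipeline would decode and downscale images with something like Pillow first):

```python
def average_hash(pixels):
    """Perceptual hash of a grayscale image given as a 2D list of 0-255
    values (assumed already resized to something tiny, e.g. 8x8).
    Each bit records whether a pixel is brighter than the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if p > mean else 0 for p in flat)

def hamming(h1, h2):
    """Number of differing bits; a small distance means visually similar."""
    return sum(a != b for a, b in zip(h1, h2))

# A near-duplicate (tiny brightness change) hashes identically,
# while a differently patterned image lands far away.
img_a = [[10, 10, 200, 200]] * 4
img_b = [[12, 12, 205, 198]] * 4   # near-duplicate of img_a
img_c = [[200, 10, 200, 10]] * 4   # different pattern

ha, hb, hc = (average_hash(i) for i in (img_a, img_b, img_c))
print(hamming(ha, hb), hamming(ha, hc))  # 0 8
```

Hashing every image and grouping by small Hamming distance gives you crude "same template" clusters you could then tag by hand or by OCR'd text.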
I'd also like to figure out how to turn an image into a description of what's in it. My ML/TensorFlow knowledge is very weak though, so I still have a lot to learn here.
is there really no open-source text recognition software that's on-par with or close to Apple's (presumably proprietary) implementation? the article mentions Tesseract. is that the current best open-source option?
Sometimes, I don’t know the exact words when looking for a meme, but once I see it, I know that’s the one.
IME CLIP embedding search can work strangely on memes as well, because it gets confused when images have words in them. Basically the same problem reported in the original CLIP paper where it thinks an apple and a piece of paper with "apple" written on it are the same thing.
I suppose that’s an old Intel MacBook? I’d be very surprised if the Vision framework performs better on a 2nd gen iPhone SE than even the first M1 MacBook Air.
Works great for me.
It's kind of like trying to come up with a good Google search phrase, based on how other people must have phrased something, but relying on knowledge of how you phrase things instead.
What was your average price per iPhone, if you don’t mind disclosing?
Fences worldwide will be overjoyed to hear of this novel application.
Do you have list of sources where memes are ingested from?
Would be nice to have some option to explore memes by category.
I would love to replicate this setup for my own project....
I am thinking load-balanced, multi-location redundant "iOS machines", with 3-4 phones each, plus power backup and an internet dongle.
We could use something like ZeroTier/Tailscale to reach them from outside your local network.
I once tried installing it on a recent Ubuntu. After messing with dependency hell and pip downloading half of the internet, when I finally invoked the CLI, it complained _again_ about a missing runtime dependency.
I called it quits. DL people are simply not interested in bundled static binaries.
Bonus: Index from a lot of sources to help track a meme's origin.
This type of thing is on my long list of "can somebody else please do this already".
Wish we could have used postgres but the tools were dictated rather than letting the requirements drive the tooling.
I believe I will actually use it a lot if you keep the site up.
Minor feedback for the blog post: It deserves a better meta description (for link previews). The first paragraph doesn't advertise how good the article is going to be.
Short TL;DR: It runs off my home server running a large vector database (opendistro): https://opendistro.github.io/for-elasticsearch-docs/docs/knn...
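For anyone curious what that plugin's API looks like: the index mapping declares a `knn_vector` field and the search body wraps a query vector plus `k`. A sketch of both, where the field name `meme_embedding` and the 512-dim size are assumptions for illustration:

```python
# Sketch of an Open Distro k-NN index mapping and search body.
# "meme_embedding" and dimension 512 are illustrative assumptions.
index_mapping = {
    "settings": {"index": {"knn": True}},
    "mappings": {
        "properties": {
            "meme_embedding": {"type": "knn_vector", "dimension": 512}
        }
    },
}

def knn_query(vector, k=10):
    """Build the search body for the k nearest stored vectors."""
    return {
        "size": k,
        "query": {"knn": {"meme_embedding": {"vector": vector, "k": k}}},
    }

body = knn_query([0.1] * 512)
```

You'd PUT the mapping when creating the index and POST the query body to `_search`; the heavy lifting (building the embedding vector from the image) happens before this step.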
Maybe a dumb question, but could you use your data to train a new OCR model so you wouldn't have to rely on iOS?
I don't know much about ML/AI so maybe not feasible.
Quite a large bullet required, one with plenty of chewing left in it.
Many places keep adding cloud services to their stacks until one day someone in the C-suite notices the AWS bill.
Only makes sense for small scale.
Unfortunate how there's no decent OCR library to self-host; it would be cheaper than cloud costs.