One thing I'm curious about: how did you build your corpus of meme images and videos?
If it works, it works. But it also speaks volumes about Apple's disregard for (or inexperience with) exposing their stuff via the web - https://www.icloud.com/ being the prime example: half the stuff the phone apps can do is not available (you cannot create a reminder with a due date...), and the things that are there are slow and buggy.
I’m around age 30, not 13, so similar to the article, my first instinct was also to create a database and OCR the image. But by total coincidence, yesterday I had a conversation with my 14 year old cousin on the topic of saving memes. Her response was along the lines of “yeah, everyone nowadays just saves the image to your iPhone photos, and then just search for it later from the photos app”.
Yeah. This whole article is literally already built into iOS UI, not just a hidden API. And kids all seem to know about this, apparently.
This article uses an example meme with the text “Sorry young man But the armband (red) stays on during solo raids”. I saved it in my iPhone photos app… and found it again through the search function in the photos app.
This is a solved problem already, by teenager standards.
I felt extremely old yesterday when I was talking to my cousin, and I felt extremely old today reading this article. Looking back, the past few decades of CS cultural intuition have established that text is text and images are images. Strings and bitmaps don’t mix.
This seems sort of obvious to anyone in tech, but I realized that from a clueless grandma perspective, not being able to search up text in photos wasn’t really obvious. Well, the roles are reversed now. Ordinary people now have access to software that treats text as a first class citizen in photos by default.
> Initial testing with the Postgres Full Text Search indexing functionality proved unusably slow at the scale of anything over a million images, even when allocated the appropriate hardware resources.
I can guarantee you that a correctly set up PostgreSQL text search will be faster than ES, with much, much fewer hardware resources needed. It's just a matter of creating a tsvector column and a GIN index on it (and of course writing the right queries so the index is actually used). I can help you set up the Postgres schema and debug the queries if you're interested, for testing purposes at least.
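To illustrate the setup described above, here's a minimal sketch of that schema. The table and column names (`memes`, `ocr_text`) are made up for the example, and the generated-column syntax assumes Postgres 12+:

```sql
-- Hypothetical "memes" table with OCR output in an "ocr_text" column.
ALTER TABLE memes ADD COLUMN search_vec tsvector
    GENERATED ALWAYS AS (to_tsvector('english', coalesce(ocr_text, ''))) STORED;

CREATE INDEX memes_search_idx ON memes USING GIN (search_vec);

-- The query must match against the same tsvector column,
-- otherwise the GIN index won't be used:
SELECT id, ocr_text
FROM memes
WHERE search_vec @@ plainto_tsquery('english', 'armband stays on')
ORDER BY ts_rank(search_vec, plainto_tsquery('english', 'armband stays on')) DESC
LIMIT 20;
```

On older Postgres versions you'd maintain `search_vec` with a trigger instead of a generated column, but the index and query shape stay the same.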
If you want to go beyond meme sites and possibly detect memes in the wild, common crawl might be something to start with.
People in the 21st century know a lot about the mistakes of the past century that led to much of the popular culture of the time being lost (especially terminally online people who've watched lots of YouTube documentaries about lost Dr. Who episodes and so on), so it surprises me how little we try to avoid those same mistakes with today's ephemeral pop culture in the form of memes. People like yourself who want to help make the internet's huge corpus of memes tractable are part of the solution in terms of meme archival and cultural memory.
(There's a good meme metadiscussion group on Discord, "The Philosopher's Meme," which you might be interested in joining. People there would be very keen to discuss what you've made.)
Memes are a core part of the Reddit experience, yet it's difficult to find something I know I saw before.
and specifically:
https://developer.apple.com/documentation/vision/vnrecognize...
He does mention running it on a MacBook
At one of my previous workplaces, we discussed running the Z3 theorem prover on an iPhone cluster, because it runs so much faster on A-series processors than on a desktop Intel machine.
Low entry cost, recycling, eco-friendly, . . . a deck that writes itself.
My question is about the image distribution costs. All the memes on the site seem to be coming straight off object storage, and all that bandwidth consumption has got to add up(?). Some sort of CDN might help, depending on the search patterns.
Ideally there would be a best of both worlds where you could search memes by "characters" or "formats" in addition to text.
As feedback, it would be nice to search all memes from the homepage. The search on https://www.memeatlas.com/meme-templates.html also seems to be broken.
I'm so, so glad to see that I'm not the only person in the world with the same "problem". Well done, mandatory.
edit: holy crap you even index videos, nice
The framework is supported on macOS (even tvOS apparently) https://developer.apple.com/documentation/vision
If so, is there any risk in getting your account suspended or ip range banned somehow because of this, for example?
Now, after reading the article, I gave your search engine a try. I was looking for that futurama its a trap meme (pretty much pops up on any image search here https://www.google.com/search?q=futurama+its+a+trap)
The problem is that the search engine you built is very text-heavy, and the text is usually quite disconnected from the actual meme. So searching for "its a trap" did not yield the results I was hoping for, though that made total sense once I looked at how the search was implemented.
Are you planning to implement actual tagging of the content of some sort? Maybe clustering of similar objects (like the iPhone clusters similar people's faces in the gallery) and then tagging those clusters with keywords somehow?
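Face clustering on the iPhone uses learned embeddings, but for grouping near-duplicate meme images (same template, slightly different crop or compression) even a simple perceptual hash can get surprisingly far. A toy sketch in pure Python, with hand-made "images" as 2D pixel lists (a real pipeline would decode and downscale images with something like Pillow first):

```python
def average_hash(pixels):
    """Perceptual hash of a grayscale image given as a 2D list of 0-255
    values (assumed already resized to something tiny, e.g. 8x8).
    Each bit records whether a pixel is brighter than the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if p > mean else 0 for p in flat)

def hamming(h1, h2):
    """Number of differing bits; a small distance means visually similar."""
    return sum(a != b for a, b in zip(h1, h2))

# A near-duplicate (tiny brightness change) hashes identically,
# while a differently patterned image lands far away.
img_a = [[10, 10, 200, 200]] * 4
img_b = [[12, 12, 205, 198]] * 4   # near-duplicate of img_a
img_c = [[200, 10, 200, 10]] * 4   # different pattern

ha, hb, hc = (average_hash(i) for i in (img_a, img_b, img_c))
print(hamming(ha, hb), hamming(ha, hc))  # 0 8
```

Hashing every image and grouping by small Hamming distance gives you crude "same template" clusters you could then tag by hand or by OCR'd text.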
I'd also like to figure out how to turn an image into a description of what's in it. My ML/TensorFlow knowledge is very weak though, so I still have a lot to learn here.
is there really no open-source text recognition software that's on-par with or close to Apple's (presumably proprietary) implementation? the article mentions Tesseract. is that the current best open-source option?
Sometimes, I don’t know the exact words when looking for a meme, but once I see it, I know that’s the one.
IME CLIP embedding search can work strangely on memes as well, because it gets confused when images have words in them. Basically the same problem reported in the original CLIP paper where it thinks an apple and a piece of paper with "apple" written on it are the same thing.
I suppose that’s an old Intel MacBook? I’d be very surprised if the Vision framework performs better on a 2nd gen iPhone SE than even the first M1 MacBook Air.
Works great for me.
It's kind of like trying to come up with a good Google search phrase, based on how other people must have phrased something, but relying on knowledge of how you phrase things instead.
What was your average price per iPhone, if you don’t mind disclosing?
Fences worldwide will be overjoyed to hear of this novel application.
Do you have list of sources where memes are ingested from?
Would be nice to have some option to explore memes by category.
I would love to replicate this setup for my own project....
I am thinking load-balanced, multi-location redundant "iOS machines", with 3-4 phones each, plus power backup and an internet dongle.
We could use something like ZeroTier/Tailscale to reach them from outside your local network.
I once tried installing it on a recent Ubuntu. After messing with dependency hell and pip downloading half of the internet, when I finally invoked the CLI, it complained _again_ about a missing runtime dependency.
I called it quits. DL people are simply not interested in bundled static binaries.
Bonus: Index from a lot of sources to help track a meme's origin.
This type of thing is on my long list of "can somebody else please do this already".
Wish we could have used postgres but the tools were dictated rather than letting the requirements drive the tooling.
I believe I will actually use it a lot if you keep the site up.
Minor feedback for the blog post: It deserves a better meta description (for link previews). The first paragraph doesn't advertise how good the article is going to be.
Short TL;DR: It runs off my home server running a large vector database (opendistro): https://opendistro.github.io/for-elasticsearch-docs/docs/knn...
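For anyone curious what that plugin's API looks like: the index mapping declares a `knn_vector` field and the search body wraps a query vector plus `k`. A sketch of both, where the field name `meme_embedding` and the 512-dim size are assumptions for illustration:

```python
# Sketch of an Open Distro k-NN index mapping and search body.
# "meme_embedding" and dimension 512 are illustrative assumptions.
index_mapping = {
    "settings": {"index": {"knn": True}},
    "mappings": {
        "properties": {
            "meme_embedding": {"type": "knn_vector", "dimension": 512}
        }
    },
}

def knn_query(vector, k=10):
    """Build the search body for the k nearest stored vectors."""
    return {
        "size": k,
        "query": {"knn": {"meme_embedding": {"vector": vector, "k": k}}},
    }

body = knn_query([0.1] * 512)
```

You'd PUT the mapping when creating the index and POST the query body to `_search`; the heavy lifting (building the embedding vector from the image) happens before this step.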
Maybe a dumb question, but could you use your data to train a new OCR model so you wouldn't have to rely on iOS?
I don't know much about ML/AI so maybe not feasible.
Quite a large bullet required, one with plenty of chewing left in it.
Many places keep adding cloud services to their stacks until one day someone in the C-suite notices the AWS bill.
Only makes sense for small scale.
Unfortunate how there's no decent OCR library to self-host; it would be cheaper than cloud costs.