They would most likely use the browsers they offer users to scrape content and stream it back to an endpoint for ingest and processing as users browse Reddit. Think of the RECAP extension for PACER (which scrapes PACER while a user browses it and ships the data to the Internet Archive), or ArchiveTeam’s Warrior VM. You can’t defend against scraping when every user’s browser is a crawler node that looks like a human because it is a human.
At least, this is how I would engineer a public browser operating as an adversarial distributed crawler network.
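As a rough sketch of the collection hook described above, here is what a minimal WebExtension-style content script could look like. This is an illustration under stated assumptions, not anyone’s actual implementation: the ingest URL and payload shape are hypothetical, and a real deployment would add batching, dedup, and consent handling.

```ts
// content-script.ts — minimal sketch of passive, browse-along collection.
// The page was already loaded by the user; we only copy what they saw,
// so no extra requests hit the target site.

const INGEST_URL = "https://ingest.example.com/pages"; // hypothetical endpoint

function capture(): void {
  const payload = {
    url: location.href,
    capturedAt: new Date().toISOString(),
    // The rendered DOM as the user sees it, post-JavaScript.
    html: document.documentElement.outerHTML,
  };

  // Fire-and-forget upload to the ingest endpoint for processing.
  void fetch(INGEST_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  });
}

// Capture once the page has finished loading.
if (document.readyState === "complete") {
  capture();
} else {
  window.addEventListener("load", capture);
}
```

The point of the pattern is that the target site never sees the upload: from Reddit’s side there is only an ordinary human session, and the exfiltration happens out-of-band to a server the site operator can’t observe or rate-limit.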