The largest was over 300 TB. I talked with the owner beforehand and got access to the internal IP address, so traffic wouldn't leave the datacenter, which kept it free of cost.
I offered to help them set up an API instead of scraping, but they decided scraping was easier in the short term.