There's a blog analyzing Geocities, which comes to about 1 terabyte of 1 KB files.
http://contemporary-home-computing.org/1tb/ The analysis tracks changes in template design, follows modifications to logos and GIFs, and unearths things like collections of shrines to dead children.
But that's from when it was harder to make and upload data, so people only put stuff online that was meaningful to them. These days we'd have a hundred thousand copies of a few popular MP3s and everyone's crappy digital photos. The percentage of meaningful stuff would be a lot lower.