Here's a recent example:
http://www.chillingeffects.org/dmca512c/notice.cgi?NoticeID=...
Someone should create a searchable database of these URLs, so that you can search for pirated content that's already been validated as authentic by RIAA/MPAA lawyers. Perhaps that would put a "chilling effect" on Internet censorship.
Actually, it would most likely invite new theories of vicarious liability for infringement and make such transparency more legally risky.
That said, reading through the ChillingEffects list is a great way to gather news. For example, you find out about leaked materials rather quickly.
I was initially surprised that Microsoft dwarfs the RIAA in requests, but of course their software sells at much higher prices than albums. I wonder how popular an artist has to be for the RIAA to consider it worth paying someone to find links and submit requests...
It also sounds like an enormous burden (read: barrier to entry) for search engines. I'm sure Google is constantly optimizing how much of the process it can automate. (e.g., if a submitter has had X requests approved for domain Y, remove new submissions for that domain automatically?) I love that they notify webmasters, though, to ease fears of false positives.
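To make the parenthetical concrete, here's a toy sketch of that kind of triage heuristic. Nothing here reflects Google's actual pipeline; the threshold, rate, and data shape are all invented for illustration.

```python
# Hypothetical fast-track heuristic: once a submitter's past requests for
# a given domain have been approved often enough (and reliably enough),
# new requests for that domain skip manual review. Values are made up.

APPROVAL_THRESHOLD = 100   # "X": minimum approved requests (assumed)
APPROVAL_RATE = 0.99       # also require a near-perfect track record

def fast_track(history, submitter, domain):
    """history maps (submitter, domain) -> (approved, total) counts."""
    approved, total = history.get((submitter, domain), (0, 0))
    if total == 0:
        return False
    return approved >= APPROVAL_THRESHOLD and approved / total >= APPROVAL_RATE

history = {("riaa", "example-tracker.net"): (250, 252)}
print(fast_track(history, "riaa", "example-tracker.net"))      # True
print(fast_track(history, "newcomer", "example-tracker.net"))  # False
```

The counter-notice process would still need to feed back into the history, of course, or an approved-but-wrong submitter keeps their fast lane forever.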
Put differently: you'd have to be made of stupid to have a ratio of infringing URLs to overall URLs that looked unfavorable; all you have to do to minimize that metric is to spray crap all over some portion of your site that nobody but Googlebot cares about.
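The arithmetic behind that point is worth spelling out: the statistic only counts indexed URLs, so padding a site with junk pages shrinks the ratio without a single infringing link being removed. All numbers below are invented for illustration.

```python
# Toy illustration of diluting an "infringing URLs / indexed URLs" metric
# by inflating the denominator with auto-generated junk pages that only
# Googlebot ever visits. Figures are invented.

infringing = 50_000
real_pages = 10_000
junk_pages = 5_000_000

before = infringing / (infringing + real_pages)
after = infringing / (infringing + real_pages + junk_pages)
print(f"{before:.1%}")  # 83.3%
print(f"{after:.1%}")   # 1.0%
```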
You may know more than I do about the existence of pages of crap on torrentz.eu, because I have never been there, but it's rather difficult for me to believe that anyone is intentionally gaming a report they had no way of knowing existed until Google's announcement.
You have to dig not to find copyright-infringing links at the root of TORRENTZ.EU's index.
And yet the statistic on Google's summary page suggests that TORRENTZ.EU is primarily --- no, overwhelmingly --- a non-infringing site. And that is the reason the statistic is there: to put forward that argument.
We don't have to agree on the policy debate here, but let's at least call spades spades.
"We removed 97% of search results specified in requests that we received between July and December 2011."
It doesn't say anything about how the requests must be formatted or whether they are legally enforceable. E.g., can just anyone submit a request? Does it have to include any kind of evidence?
[1] http://www.google.com/transparencyreport/removals/copyright/...
http://www.law.cornell.edu/uscode/text/17/512#c_3
edit: I'm not exactly sure what you're asking with "legally enforceable", but again, if you mean in terms of the DMCA, my understanding is that if a notification satisfies those requirements, you must comply. That's why the counter-notification process is so important.
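To make "satisfies those requirements" less abstract, here's an illustrative checklist of the elements a takedown notice must substantially contain under 17 U.S.C. 512(c)(3)(A). The labels are my paraphrase of the statute, not legal language, and the checker is purely a teaching device.

```python
# Paraphrased checklist of the six elements required by 512(c)(3)(A).
# This is illustration, not legal advice -- read the statute itself.

REQUIRED_ELEMENTS = [
    "signature of the copyright owner or authorized agent",
    "identification of the copyrighted work claimed to be infringed",
    "identification of the allegedly infringing material and its location",
    "contact information for the complaining party",
    "good-faith-belief statement that the use is unauthorized",
    "statement under penalty of perjury that the notice is accurate "
    "and the sender is authorized to act for the owner",
]

def missing_elements(notice_elements):
    """Return which statutory elements a notice fails to include,
    given the set of element labels it does contain."""
    return [e for e in REQUIRED_ELEMENTS if e not in notice_elements]

print(missing_elements(set(REQUIRED_ELEMENTS)))  # []
```

A notice missing these elements doesn't trigger the compliance obligation in the same way, which is exactly why the "proper information" point below matters.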
> we try to catch erroneous or abusive removal requests.... [examples of bad requests]... We try to catch these ourselves, but we also notify webmasters in our Webmaster Tools when pages on their website have been targeted by a copyright removal request, so that they can submit a counter-notice if they believe the removal request was inaccurate.
Formatting isn't really an issue, the real issues are whether the notice contains the proper information and whether it's sent to the right place.
As always, get proper legal advice if any of this information is of more than academic interest to you.
http://www.google.com/transparencyreport/removals/copyright/
Should we only bother with it after it becomes a problem (hopefully by then we'll have enough revenue to afford it)? Or is it too risky to launch without such a system and get taken down ourselves?
Manual processing of requests is probably fine for most apps at launch (unless your users can generate content in an automated way). It also lets you be really careful about complying with applicable laws while (crucially) preserving your users' rights.
But, yes, if you're in the US and want to maintain safe harbor protections, you will have to handle any takedown requests you receive.
If you don't register an agent, you are not protected by the DMCA safe harbor provisions.
Once you've got a registered agent, you'll need to create an internal DMCA procedure to follow whenever a notification arrives. The system is very simple and extremely fair to ISPs/sites. Talk to a lawyer, but most lawyers don't know the DMCA well, and you can generally figure this stuff out on your own.
It's far more likely that no one will even notice your site than that spammers will start piling on. Implement basic logging & monitoring and you'll be fine. It may suck when/if abuse comes, but scaling & abuse defense is usually a good problem to have. Worst case, you've got a completely idle server!