For a long time, an out-of-the-box server installation would include anonymous FTP access. Of course, nothing is quite so attractive as a 'free' place to dump and retrieve stuff. It was kind of like setting up a warez/malware camera trap.
I think this is worth emphasizing more than the article does. The problem lies just as much in the after-the-fact direct access as in the upload itself. Given the wide variety of illegal things you will quickly end up hosting and the amount of traffic this will generate, cross-site scripting attacks may not be your top concern.
Uploading PHP files instead of images has been used to gain access to machines. Anything that gets stored as a file on the filesystem of the destination machine is a huge risk. All it takes is one little misconfiguration somewhere else and you're wide open.
So if someone uploads a file called `image.php.jpg`, Apache can end up executing it as PHP code, because mod_mime looks at every extension in the name, not just the last one. And obviously verifying the MIME type or even the content of the file won't help you here, since you can just write a JPEG header and then throw in `<?php system("..."); ?>` after it.
Even when you think you're safe based on what you'd consider to be obvious assumptions ("the file extension is whatever comes up after the last period"), there are weird things like this that might bite you.
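To make the "last period" trap concrete, here is a minimal sketch of a filename check that looks at every dot-separated part rather than only the final one. The function name and both extension sets are made up for illustration, and a check like this is only one layer; it does nothing about path traversal or server misconfiguration.

```python
# Hypothetical lists for illustration; adjust to whatever your server maps to handlers.
EXECUTABLE_EXTENSIONS = {"php", "phtml", "php5", "phar", "cgi", "pl", "py", "asp", "aspx", "jsp"}
ALLOWED_FINAL_EXTENSIONS = {"jpg", "jpeg", "png", "gif"}


def is_safe_upload_name(filename: str) -> bool:
    """Accept a name only if its last extension is allowed and no
    other dot-separated part looks like an executable extension."""
    parts = filename.lower().split(".")
    if len(parts) < 2 or not parts[0]:
        # No extension at all, or a dotfile such as ".htaccess".
        return False
    *middle, last = parts[1:]
    if last not in ALLOWED_FINAL_EXTENSIONS:
        return False
    return not any(part in EXECUTABLE_EXTENSIONS for part in middle)
```

With this sketch, `image.jpg` and `photo.2024.png` pass, while `image.php.jpg`, `image.php` and `.htaccess` are rejected.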
I have a site that allows uploads (students turning in Java files) but the files are just stored in a folder on the server that isn't in the web-served path. They can't see the file again once uploaded. I assume (and I think rightly) that there's no security risk in my case.
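A rough sketch of that kind of setup, assuming the upload directory lives outside the web root and the stored name is chosen by the server (`store_upload` and the directory layout are hypothetical, not taken from the site described above):

```python
import secrets
from pathlib import Path


def store_upload(data: bytes, upload_dir: Path) -> Path:
    """Write the upload under a random, server-chosen name.

    The client-supplied filename is never used to build the path,
    so it cannot influence where the file lands or what extension it has.
    """
    upload_dir.mkdir(parents=True, exist_ok=True)
    target = upload_dir / secrets.token_hex(16)
    # Mode "xb" refuses to overwrite an existing file.
    with open(target, "xb") as fh:
        fh.write(data)
    return target
```

The original filename can still be kept as metadata (in a database, say) if it is needed for grading; it just never touches the filesystem path.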
Hosting user-uploaded files on a separate domain would probably solve this problem.
Indeed. Nothing about this applies to sites that accept uploads for internal use (e.g. parsed as input data).
The blog post in the OP solely discusses XSS vulnerabilities that are introduced by unrestricted file uploads. There are numerous other issues that can arise from arbitrary file uploads (malware hosting, arbitrary code execution if it's PHP, phishing), but to prevent user content from ever reaching sensitive data via XSS, placing all user data on a separate domain is pretty much your best bet.
Doesn't the `Content-Disposition: attachment; filename="image.jpg"` header mean you can no longer display the image in your service? Won't browsers treat it as a file download? Most services that allow image uploads do so because the images will get displayed on a page (that's what I do).
Most services seem to be moving file uploads to S3 (or similar services) these days, so I'm not sure this advice is really helpful. To take that a step further, my preference now is to upload directly to S3 and bypass my app server altogether. At least in Rails, it's fairly easy to set up.
Some also restrict which file types on S3 are served as inline content, but that only saves you from XSS, not from the CSRF leakage. It's still surprisingly common to see a crossdomain.xml restricted to [wildcard].domain.com.
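One way to read "served as inline content" is a per-type choice of `Content-Disposition`. A minimal sketch, assuming a small allow-list of image types (the `INLINE_TYPES` set and `download_headers` helper are made up here, and this is not how S3 itself is configured):

```python
# Hypothetical allow-list of types that are considered safe to render in place.
INLINE_TYPES = {"image/jpeg", "image/png", "image/gif"}


def download_headers(content_type: str, filename: str) -> dict[str, str]:
    """Serve allow-listed image types inline and everything else as a download."""
    inline = content_type in INLINE_TYPES
    # Replace quotes, backslashes and control characters so the name
    # cannot break out of the quoted filename parameter.
    safe_name = "".join(
        c if c.isprintable() and c not in '"\\' else "_" for c in filename
    )
    disposition = "inline" if inline else "attachment"
    return {
        "Content-Type": content_type if inline else "application/octet-stream",
        "Content-Disposition": f'{disposition}; filename="{safe_name}"',
        # Ask the browser not to second-guess the declared type.
        "X-Content-Type-Options": "nosniff",
    }
```

As noted, this only addresses the XSS side; it does nothing about what a permissive crossdomain.xml exposes.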
By uploading straight to S3 you also get a faster upload (than, say, Heroku) and server separation.
The idea that you can "verify the contents" is pretty much just wrong. You actually have to parse the files and write out your own known-safe version. It's a real pain in the butt to do that correctly and securely across a wide variety of file types.
Even parsing arbitrary user uploads with something like ImageMagick is probably exploitable, simply because those libraries weren't designed to handle hostile input.
If a PHP page is allowing file uploads and only verifies the content of the data, but nothing else, then no protection is offered against arbitrary code execution. It's easy to craft a JPEG header and then place `<?php ... ?>` right after it; you could even append it to a valid JPEG body, too.
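A tiny illustration of why a header-only content check proves nothing, written in Python rather than PHP to keep it self-contained. The `looks_like_jpeg` check is a deliberately naive stand-in, and the payload is a harmless placeholder:

```python
JPEG_MAGIC = b"\xff\xd8\xff"


def looks_like_jpeg(data: bytes) -> bool:
    """Naive content check: inspects only the leading magic bytes."""
    return data.startswith(JPEG_MAGIC)


# JPEG start-of-image marker and the start of a JFIF segment,
# followed by a harmless placeholder where a payload would go.
crafted = b"\xff\xd8\xff\xe0\x00\x10JFIF\x00" + b"<?php echo 'hello'; ?>"

print(looks_like_jpeg(crafted))  # True: the check is satisfied
print(b"<?php" in crafted)       # True: the PHP tag is still in the file
```

If the server ever hands that file to the PHP interpreter, the bytes before `<?php` are simply passed through as output and the code after it runs.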