
Security In Layers: Preventing XSS Attacks with AWS S3 Direct Upload

Posted by alex on Oct. 19, 2017, 1:56 p.m.

For any web developer, allowing users to upload files to a service and then serving those files back to other users is a great way to open both those users and your service up to a whole host of security vulnerabilities. It is therefore imperative to understand and mitigate these issues before deploying to your users.

If your service allows the upload and sharing of arbitrary file types, then of course there's nothing you can do to completely protect your users. Preventing users from executing malicious binary files on their own machines is something web and browser developers can do very little about. However, web and browser developers can prevent the exploitation of their own services through user-uploaded files, and can help users understand when they're no longer interacting with your service and are instead viewing a user-uploaded file.

This article will attempt to give some guidelines on a few things to be aware of when hosting user files, and is specifically aimed at using Amazon S3 direct upload.

Serve files from a different domain

The first and most important thing an application can do is to serve user-uploaded files from a different domain than the domain the main application is on. This gives the user an indication that the file they're downloading is not directly endorsed by your application, but more importantly, any scripts the browser might execute won't be able to read information from your application because of the same-origin policy.

If you're using S3 and direct upload for your file hosting, you're likely already covered by this. The main application code should be on your own domain, and the S3 files will live on another domain if you've configured it, or simply on the domain associated with the bucket you've created. (Note, there are security issues around serving files on a subdomain of your domain. The strongest same-origin policy protections are provided when you serve files on a completely separate domain.) The only way this can bite you is if you're also hosting your website through S3. In this case, be sure to configure different buckets for your site and your user uploaded files.

Set Content Type and Content Disposition

The next thing to do is to make sure that when a user clicks a link to a user-uploaded file, the browser handles it in a way that doesn't put them at risk. Browsers have a couple of methods for determining what to do with the response to an HTTP request. For as many responses as it can, a browser will attempt to render the content for you: if you click on a link to a web page, the page is rendered in the browser. Images, videos, PDF files and other documents are all types of files that browsers know how to handle, and for some of these files, it's fine to let the browser render them for your users.

Problems come into play, however, for file types that can tell the browser to execute code. It is possible, for example, to embed JavaScript inside an SVG file, and this JavaScript can be executed when the file is rendered. Although the previous step prevents that JavaScript from reading any of our service's private data (it is not running under the same origin as that data), it's still a bad idea to allow users to get other users to execute arbitrary JavaScript in their browsers.
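To see how small the payload can be, here is a minimal SVG that would run JavaScript if a browser rendered it inline (shown as a Python string purely for illustration):

```python
# A minimal SVG document that executes script when rendered inline.
# If this were served from your application's origin, the alert could
# instead read cookies or make authenticated requests.
MALICIOUS_SVG = """<svg xmlns="http://www.w3.org/2000/svg">
  <script>alert(document.domain)</script>
</svg>"""
```

Any service that echoes uploads back with an `image/svg+xml` content type will have this script executed by the viewer's browser.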

For dangerous filetypes like SVG, it is possible to prevent the browser from rendering the SVG to the user. This can be done by setting the Content-Disposition and Content-Type HTTP headers when you send the file to the user. Setting Content-Disposition to attachment and setting Content-Type to application/octet-stream gives the browser a strong indication that it should show a download dialog box rather than attempt to render it inline.
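Concretely, the response headers we want such a file served with might be built like this (a sketch; the helper name and the `filename` hint are illustrative):

```python
def download_headers(filename):
    # Force the browser to show a download dialog instead of rendering
    # the file inline. `filename` is only a hint for the save dialog.
    return {
        'Content-Type': 'application/octet-stream',
        'Content-Disposition': 'attachment; filename="%s"' % filename,
    }
```

With these two headers set, even an SVG full of JavaScript arrives as an inert download rather than a rendered document.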

With Amazon S3, you don't have direct control over the HTTP server that serves up your files, and with direct upload, you don't even have direct control over the settings that the client sends to S3.

You can, however, give the client a signed URL that tells Amazon exactly what parameters to expect on the upload. If the client attempts to upload a file that does not match these parameters, the upload will fail with a 403 Forbidden error. The parameters that this URL should enforce in this case are Content-Type and Content-Disposition.

Here's an example of how one might generate such a signed URL with boto3 in Python:

import boto3

# `settings` is the application's configuration object
# (e.g. django.conf.settings).
s3 = boto3.client(
    's3',
    region_name=settings.AWS_S3_REGION_NAME,
    aws_access_key_id=settings.AWS_ACCESS_KEY_ID,
    aws_secret_access_key=settings.AWS_SECRET_ACCESS_KEY,
)

# The signature covers Bucket, Key, ContentType and ContentDisposition,
# so the client cannot change any of them without invalidating the URL.
signed_put_url = s3.generate_presigned_url(
    ClientMethod='put_object',
    Params={
        'Bucket': settings.AWS_STORAGE_BUCKET_NAME,
        'Key': key,
        'ContentType': file_data['content_type'],
        'ContentDisposition': file_data['content_disposition'],
    },
)

Earlier in this process we looked at the content type of the file and compared it against a list of known, safe content types. Any content type that was not known to be safe was changed to application/octet-stream and its content disposition set to attachment. Boto's generate_presigned_url function creates a URL that contains all of these parameters and is signed with our secret key. Any modification of this URL will be detectable by Amazon.
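That whitelist check might look something like the following sketch. The set of "safe" types here is illustrative, not exhaustive, and `sanitize_upload` is a hypothetical helper:

```python
# Illustrative, not exhaustive: content types we trust the browser to render.
SAFE_CONTENT_TYPES = {
    'image/png', 'image/jpeg', 'image/gif',
    'video/mp4', 'application/pdf',
}

def sanitize_upload(content_type):
    """Return the (Content-Type, Content-Disposition) pair to sign into the URL."""
    if content_type in SAFE_CONTENT_TYPES:
        return content_type, 'inline'
    # Unknown or dangerous types (e.g. image/svg+xml) are forced to download.
    return 'application/octet-stream', 'attachment'
```

The important design choice is that the list is an allowlist: anything not explicitly recognized as safe is treated as dangerous, rather than the other way around.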

When the client attempts to upload a file to this URL, Amazon will check that the supplied parameters match the policy that the URL specifies, and any deviation from the policy results in Amazon denying permission for the upload. Then, later, when the file is retrieved from S3, it will have the same headers set, which results in the browser showing a download dialog instead of displaying inline.
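From the client's side, the upload must send exactly the headers that were signed into the URL. A minimal server-to-server sketch using only the standard library (a browser client would do the same thing with XMLHttpRequest or fetch):

```python
import urllib.request

def build_put_request(signed_put_url, data, content_type, content_disposition):
    # The header values must match the parameters signed into the URL
    # byte for byte, or S3 rejects the PUT with 403 Forbidden.
    return urllib.request.Request(
        signed_put_url,
        data=data,
        method='PUT',
        headers={
            'Content-Type': content_type,
            'Content-Disposition': content_disposition,
        },
    )

def direct_upload(signed_put_url, data, content_type, content_disposition):
    req = build_put_request(signed_put_url, data, content_type, content_disposition)
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

Dropping either header, or sending a different value than the one that was signed, makes the request deviate from the policy and the upload fails.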

Either of these techniques on its own mitigates some attacks against your users: JavaScript served on the same origin but never executed can't exfiltrate data, and JavaScript executed on a different origin is significantly limited in the damage it can do. Together, though, the two techniques protect your users from many angles of attack.