User Uploads in platformOS
File upload is often an important part of a web service or application. Usually files are uploaded to the application server, and then the server sends them to S3 (or any other cloud storage). Nowadays, more tech-savvy companies are moving away from this method for a couple of reasons: speed, cost, and security.
Direct S3 upload is a way for developers to deliver a faster experience for their users while decreasing infrastructure cost. In this article we explain why we chose to take advantage of this technology.
Less load on the application server - Cost
Because files are sent directly to S3, the application server is not under load, no matter how many of those files are sent. This means platformOS can offer lower prices - we use less servers than traditional solutions.
Fewer middlemen - Speed, cost
When a user uploads files directly to S3 and skips the application server in the middle, the uploaded files are faster on the target location. In the traditional solution, files have to be uploaded twice, once from the user to the application server, then from the server to S3. Usually the second upload was placed in a background job (asynchronous), to queue uploads when there is a lot of them. No queues (that can be flooded) means, that there is one attack vector less if someone tried to DDoS our infrastructure.
Less bandwidth used - Cost
Because files are going straight to S3, platformOS is not using bandwidth to first receive a file, just to upload it again to S3. The larger the scale, the larger the savings.
No file processing - Security
Security holes in file uploads are often used to break into a web application. Eliminating the application server completely from this equation and separating those concerns means that there is, again, one attack vector less for the attacker to exploit.
Bigger file size limits
When files are processed and forwarded by the application server, you want to keep them as small as possible, again, to mitigate attack vector and queue size. Direct S3 upload is constrained only by AWS S3 limits, which is 5GB (gigabytes) for single file upload when using single part upload, and 5TB (terabytes) when uploading multi part. For download, there is no limit - as long as you can upload a file, you can download it. Read more.
Less moving parts
There is no better code than no code. We believe that removing things that can go wrong from a system improves the reliability of said system. It is easy to imagine, in an old paradigm, how a file could get uploaded to the application server, but for some reason it did not get uploaded to a storage. Software is very complicated and those things happen all the time. We try hard to give them as little chance to happen as possible, and the easiest way of achieving that is to have less of them.
How it works
Now let's look at how things are glued together in the case of platformOS.
There are two types of files in platformOS
- Attachments - Documents of any file type, they're private (soon it will be configurable).
- Images - Most popular graphical files on the web, i.e. jpeg, png, webp, gif, svg. Images are always public and
Content-Dispositionis always inline. This type of upload is meant to be displayed on the web.
Both types can have file size validation, so you can decide how small or big of a file users will be able to upload.
This is a high level description of what happens in a successful file upload flow:
Application: Presign URL requested via GraphQL
To be able to upload a file to S3, our server is creating a presigned URL. What you as a developer need to do is to get the URL using the
attachment_presign_urlmutation and point it to a model and property that it will be using. For example, model
Browser: POST file
This is the stage where
upload_url_payload- results from the
attachment_presign_urlmutation - will be consumed by the browser. The combination of URL, payload and file from the user is a complete package that has to be sent over to S3.
S3: Return URL to file
S3 will return status 204 and XML response with URL to file on S3, if everything went well. Otherwise it will return status 4xx and error reason in XML format.
Browser: POST URL to application
Until now, platformOS doesn't know anything about the file that the user has been uploading. This step is meant to change that. Using any method you wish, you need to send the URL to the uploaded file to the application server. Because
fileis just a property in a model, you update that property just like any other.
Application: Save URL to the database
After getting the URL to a file from the browser, usually you want this URL to be saved in a model for future reference.
When you need the URL to the file, as always, you need to use GraphQL and query for the model containing the property.
Images are mostly the same, with some additions for your convenience.
- You can define versions that will be generated by asynchronous serverless function after the original file has been uploaed to S3.
- When retrieving URLs, you need to specify to which version you want to get the URL to.
Read more on how to implement image upload using Direct S3 upload and Uppy.
- platformOS is multi-cloud, does this mean implementation will be different for every cloud provider?
No. It is the same for all cloud providers.