User Uploads in platformOS
File upload is often an important part of a web service or application. Traditionally, files are uploaded to the application server, which then sends them to S3 (or another cloud storage service). More and more tech-savvy companies are moving away from this method for three reasons: speed, cost, and security.
Direct S3 upload is a way for developers to deliver a faster experience for their users while decreasing infrastructure cost. In this article we explain why we chose to take advantage of this technology.
Advantages
- Less load on the application server - Cost
Because files are sent directly to S3, the application server is not under load, no matter how many files are sent. This means platformOS can offer lower prices, because we use fewer servers than traditional solutions.
- Fewer middlemen - Speed, cost
When a user uploads files directly to S3 and skips the application server in the middle, the files reach their target location faster. In the traditional solution, files have to be uploaded twice: once from the user to the application server, then from the server to S3. Usually the second upload is placed in an asynchronous background job, so that uploads can be queued when there are a lot of them. Having no queues that can be flooded means there is one less attack vector if someone tries to DDoS our infrastructure.
- Less bandwidth used - Cost
Because files go straight to S3, platformOS does not use bandwidth to first receive a file just to upload it again to S3. The larger the scale, the larger the savings.
- No file processing - Security
Security holes in file uploads are often used to break into web applications. Removing the application server from this equation and separating those concerns means that there is, again, one less attack vector for an attacker to exploit.
- Bigger file size limits
When files are processed and forwarded by the application server, you want to keep them as small as possible, again to limit the attack surface and the queue size. Direct S3 upload is constrained only by AWS S3 limits, which are 5 GB (gigabytes) for a single-part upload and 5 TB (terabytes) for a multipart upload. For download, there is no limit: as long as you can upload a file, you can download it. Read more.
- Fewer moving parts
There is no better code than no code. We believe that removing things that can go wrong from a system improves the reliability of that system. It is easy to imagine how, in the old paradigm, a file could be uploaded to the application server but, for some reason, never make it to storage. Software is very complicated and those things happen all the time. We try hard to give them as little chance to happen as possible, and the easiest way of achieving that is to have fewer of them.
How it works
Now let's look at how things are glued together in the case of platformOS. To start, define a property in either user.yml or your <table>.yml. The property type is upload. You can use options to configure whether the file should be private or public, the maximum file size, the generated versions, and so on.
Upload
This is a high-level description of what happens in a successful file upload flow:
- Application: Presign URL requested via GraphQL
To be able to upload a file to S3, our server creates a presigned URL. What you as a developer need to do is get the URL using the property_upload_presigned_url mutation and point it to the table and property that it will be used for. For example, you can define a documents table with a resume property of type upload in app/schema/documents.yml:
name: documents
properties:
  - name: resume
    type: upload
    options:
      acl: private # default is public
- Browser: POST file
This is the stage where upload_url and upload_url_payload (the results of the property_upload_presigned_url mutation) are consumed by the browser. The combination of the URL, the payload, and the user's file is the complete package that has to be sent to S3. Your view can look like this:
{% graphql data %}
  mutation presign {
    property_upload_presigned_url(table: "documents", property_name: "resume", include_content_type: true) {
      upload_url
      upload_url_payload
    }
  }
{% endgraphql %}
<form action="{{ data.property_upload_presigned_url.upload_url }}" enctype="multipart/form-data" method="post">
  {% for field in data.property_upload_presigned_url.upload_url_payload %}
    <input type="hidden" name="{{ field[0] }}" value='{{ field[1] }}'>
  {% endfor %}
  <input type="hidden" name="Content-Type" value="application/pdf">
  <!-- S3 ignores form fields that come after the file field, so keep it last -->
  <input type="file" name="file">
  <button type="submit">Upload</button>
</form>
One additional thing that you would probably want to send with the payload is the Content-Type of the uploaded file. In the example, it is shown with a hardcoded value of application/pdf.
Usually, you would want to update this value dynamically in the front end after the user chooses a file. This typically requires a small amount of custom JavaScript, depending on your implementation.
It is possible to skip the Content-Type, but this will result in a default value of application/octet-stream being set.
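The dynamic update described above could be sketched as follows. This is a minimal illustration, not code provided by platformOS: the function names are made up for this example, and it assumes that the form contains a file input named file next to the hidden Content-Type input.

```javascript
// Value that ends up on the file when no Content-Type is sent.
const DEFAULT_CONTENT_TYPE = 'application/octet-stream';

// Returns the MIME type the browser reports for the chosen file.
// File.type is an empty string when the browser cannot determine it.
function contentTypeFor(file) {
  return file && file.type ? file.type : DEFAULT_CONTENT_TYPE;
}

// Keeps the hidden Content-Type field in sync with the selected file.
// Assumption: the form has <input type="file" name="file"> and
// <input type="hidden" name="Content-Type">.
function syncContentType(form) {
  const fileInput = form.querySelector('input[type="file"][name="file"]');
  const typeInput = form.querySelector('input[name="Content-Type"]');
  if (!fileInput || !typeInput) return; // nothing to wire up

  fileInput.addEventListener('change', () => {
    typeInput.value = contentTypeFor(fileInput.files[0]);
  });
}

// Only touch the DOM when running in a browser.
if (typeof document !== 'undefined') {
  const form = document.querySelector('form[enctype="multipart/form-data"]');
  if (form) syncContentType(form);
}
```

Falling back to application/octet-stream mirrors what happens when the field is skipped, so a file of unknown type behaves the same either way.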
- S3: Return URL to file
After submitting the form, you will be redirected to an XML response (204 status code) with the URL to the file on the storage servers. If the upload fails, the response has a 4xx status and contains the reason for the error in XML format.
Please note that for an optimal user experience, you would likely want to do this using a fetch request from JavaScript, so that the whole upload process stays invisible to the end user.
Keep in mind that at this point, platformOS doesn't know anything about the file that the user has uploaded. It exists only on the storage servers.
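A fetch-based upload could look roughly like the sketch below. It rests on several assumptions that are not confirmed by this article: that upload_url_payload reaches the browser as a plain object of field names and values, that the file URL can be read either from the Location response header or from a Location element in the XML body, and that the storage CORS settings allow the request. All function names are hypothetical.

```javascript
// Builds the ordered list of form fields for the storage request.
// Assumption: `payload` is upload_url_payload as a plain object.
function buildFields(payload, contentType) {
  const fields = Object.entries(payload);
  if (contentType) fields.push(['Content-Type', contentType]);
  return fields;
}

// Pulls the file URL out of an XML body that contains
// <Location>...</Location>. XML entities are not decoded here.
function extractLocation(xml) {
  const match = /<Location>([^<]+)<\/Location>/.exec(xml);
  return match ? match[1] : null;
}

// Uploads `file` straight to storage and resolves with its URL,
// or with null when the URL cannot be read from the response.
async function uploadDirectly(uploadUrl, payload, file, fetchImpl = fetch) {
  const body = new FormData();
  for (const [name, value] of buildFields(payload, file.type)) {
    body.append(name, value);
  }
  // S3 ignores fields that come after the file, so append it last.
  body.append('file', file);

  const response = await fetchImpl(uploadUrl, { method: 'POST', body });
  if (!response.ok) {
    // On failure the body carries the reason for the error as XML.
    throw new Error(`Upload failed (${response.status}): ${await response.text()}`);
  }
  // Prefer the Location header, fall back to the XML body.
  return response.headers.get('Location') || extractLocation(await response.text());
}
```

In a page you might call uploadDirectly(uploadUrl, payload, fileInput.files[0]) from a submit handler after calling event.preventDefault(), so that the browser never navigates away from your page.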
- Application: Save URL to the database
After getting the URL of the uploaded file, you will likely want to store it in the database for future reference.
You store it the same way as any other string:
mutation record_create($direct_url: String!) {
  record_create(
    record: {
      table: "documents"
      properties: [{ name: "resume" value: $direct_url }]
    }
  ) {
    id
  }
}
If, in the schema, you've set the property type to upload, platformOS will then update the field automatically. It will replace the string you provided with a JSON object that contains additional information besides the file URL. Additionally, for images, it can generate versions, and the URLs to those will also be added to the record automatically.
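When the upload happens in JavaScript, the browser still has to hand the file URL over to your application so that the mutation can run. The sketch below shows one way this might be done. The /api/documents endpoint is hypothetical: it stands for a page that you would implement yourself to execute the record_create mutation shown above, and the function names are likewise made up for this example.

```javascript
// Builds the variables for the record_create mutation shown above.
function buildSaveVariables(directUrl) {
  // new URL() throws a TypeError when the string is not an absolute URL.
  const url = new URL(directUrl);
  return { direct_url: url.href };
}

// Sends the file URL to your own application.
// Assumption: `endpoint` is a page you implement that reads direct_url
// from the JSON body and runs the record_create mutation with it.
async function saveUploadedUrl(directUrl, endpoint = '/api/documents', fetchImpl = fetch) {
  const response = await fetchImpl(endpoint, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(buildSaveVariables(directUrl)),
  });
  if (!response.ok) {
    throw new Error(`Saving the file URL failed with status ${response.status}`);
  }
  return response;
}
```

Validating the URL before sending it is only a convenience for catching mistakes early in the browser. Your application should still decide on its own which URLs it accepts.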
When you need the URL of the file, use GraphQL as always: query for the record that contains the property and read the URL through property_upload. For example:
query get_documents {
  records(per_page: 20 filter: { table: { value: "documents" } }) {
    results {
      my_upload: property_upload(name: "resume") {
        url
      }
    }
  }
}
Images and processing
If you want your users to be able to upload images and have multiple versions of them generated automatically (for example, a thumbnail), you can achieve this with the versions option of the upload property.
Read more on how to implement image upload with version processing using Direct S3 upload and Uppy.
Additionally, when the versions option is set, it is possible to skip setting the Content-Type metadata mentioned above. In this case, the system assumes the file is an image, and the image processing software will set it automatically.
FAQ
- platformOS is multi-cloud, does this mean implementation will be different for every cloud provider?
No. If you follow our recommendations from step 2 and use a for loop to automatically generate the form for direct upload, we will be able to replace S3 with any other compatible service.