User Uploads in platformOS
File upload is often an important part of a web service or application. Usually files are uploaded to the application server, and then the server sends them to S3 (or any other cloud storage). Nowadays, more tech-savvy companies are moving away from this method for a couple of reasons: speed, cost, and security.
Direct S3 upload is a way for developers to deliver a faster experience for their users while decreasing infrastructure cost. In this article we explain why we chose to take advantage of this technology.
Advantages
-
Less load on the application server - Cost
Because files are sent directly to S3, the application server is not under load, no matter how many of those files are sent. This means platformOS can offer lower prices - we use less servers than traditional solutions.
-
Fewer middlemen - Speed, cost
When a user uploads files directly to S3 and skips the application server in the middle, the uploaded files are faster on the target location. In the traditional solution, files have to be uploaded twice, once from the user to the application server, then from the server to S3. Usually the second upload was placed in a background job (asynchronous), to queue uploads when there is a lot of them. No queues (that can be flooded) means, that there is one attack vector less if someone tried to DDoS our infrastructure. -
Less bandwidth used - Cost
Because files are going straight to S3, platformOS is not using bandwidth to first receive a file, just to upload it again to S3. The larger the scale, the larger the savings. -
No file processing - Security
Security holes in file uploads are often used to break into a web application. Eliminating the application server completely from this equation and separating those concerns means that there is, again, one attack vector less for the attacker to exploit. -
Bigger file size limits
When files are processed and forwarded by the application server, you want to keep them as small as possible, again, to mitigate attack vector and queue size. Direct S3 upload is constrained only by AWS S3 limits, which is 5GB (gigabytes) for single file upload when using single part upload, and 5TB (terabytes) when uploading multi part. For download, there is no limit - as long as you can upload a file, you can download it. Read more. -
Less moving parts
There is no better code than no code. We believe that removing things that can go wrong from a system improves the reliability of said system. It is easy to imagine, in an old paradigm, how a file could get uploaded to the application server, but for some reason it did not get uploaded to a storage. Software is very complicated and those things happen all the time. We try hard to give them as little chance to happen as possible, and the easiest way of achieving that is to have less of them.
How it works
Now let's look at how things are glued together in the case of platformOS. In order to start, create a property either in user.yml
OR in your <table>.yml
. The property type is upload
. You can use options
to configure whether the file should be private or public, max file size, generated versions, etc.
Upload
This is a high level description of what happens in a successful file upload flow:
- Application: Presign URL requested via GraphQL
To be able to upload a file to S3, our server is creating a presigned URL. What you as a developer need to do is to get the URL using theproperty_upload_presigned_url
mutation and point it to a record and property that it will be using. For example, you can define a recorddocuments
and propertyresume
with typeupload
inapp/schema/documents.yml
:
name: documents
properties:
- name: resume
type: upload
options:
acl: private # default is public
- Browser: POST file
This is the stage whereupload_url
andupload_url_payload
- results from theproperty_upload_presigned_url
mutation - will be consumed by the browser. The combination of URL, payload, and file from the user is a complete package that has to be sent over to S3. Your view can look like this:
{% graphql data %}
mutation presign {
property_upload_presigned_url(table: "documents", property_name: "resume") {
upload_url
upload_url_payload
}
}
{% endgraphql %}
<form action="{{ data.property_upload_presigned_url.upload_url }}">
{% for field in data.property_upload_presigned_url.upload_url_payload %}
<input type="hidden" name="{{ field[0] }}" value='{{ field[1] }}'>
{% endfor %}
</form>
-
S3: Return URL to file
S3 will return status 204 and an XML response with the URL to the file on S3, if everything went well. Otherwise it will return status 4xx and the reason for the error in XML format. -
Browser: POST URL to application
Until now, platformOS doesn't know anything about the file that the user has been uploading. This step is meant to change that. Using any method you wish, you need to send the URL to the uploaded file to the application server. Becausefile
is just an URL (string) property in a record, you update that property just like any other. -
Application: Save URL to the database
After getting the URL to a file from the browser, usually you want this URL to be saved in a record for future reference. You store it the same way as any other string - if the property type isupload
, we will do all the magic behind the scenes:
mutation record_create($direct_url: String!) {
record_create(
record: {
table: "documents"
properties: [{ name: "resume" value: $direct_url }]
}
) {
id
}
}
When you need the URL to the file, as always, you need to use GraphQL and query for the record containing the property to get it using property_upload
, for example:
query get_documents {
records(per_page: 20 filter: { table: { value: "documents" } }) {
results {
my_upload: property_upload(name: "resume") { url }
}
}
}
Images and processing
If you want your users to be able to upload images and automatically generate multiple versions of it (for example a thumbnail) you can achieve it by using the versions
option of property upload.
Read more on how to implement image upload with version processing using Direct S3 upload and Uppy.
FAQ
- platformOS is multi-cloud, does this mean implementation will be different for every cloud provider?
No. If you follow our recommendations from step 2 and use afor
loop to automatically generate the form for direct upload, we will be able to replace S3 with any other compatible service.