Data Import
This topic describes data import, including fast imports using CSV files.
Data to be imported needs to be in JSON format: an object holding an array of user data and an array of record data:

```json
{
  "users": [...],
  "records": [...]
}
```
You can generate data from an existing Instance with the `pos-cli data export` command (read more about this here). It will generate the `data.json` file. You can also import data from other sources (e.g. an old stack or other systems) once you generate the proper JSON file. It is fairly easy to convert, for example, a CSV file into a JSON file.
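As an illustration, such a conversion can be sketched in a few lines of Python. This is a minimal sketch, not part of pos-cli: the `users.csv` filename and its `id,email,first_name,slug` columns are assumptions chosen for the example.

```python
import csv
import json

def csv_to_import_json(csv_path, json_path):
    """Convert a hypothetical users CSV into the {"users": [...], "records": [...]} shape."""
    with open(csv_path, newline="") as f:
        users = [
            {
                "id": int(row["id"]),  # ids must be numeric in the import file
                "email": row["email"],
                "first_name": row["first_name"],
                "slug": row["slug"],
            }
            for row in csv.DictReader(f)
        ]
    data = {"users": users, "records": []}
    with open(json_path, "w") as out:
        json.dump(data, out, indent=2)
    return data
```

The resulting file can then be passed to `pos-cli data import --path=data.json`.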
General information
- Before import, your Tables need to be deployed.
- When importing data, the `id` fields will be changed. Relations will be kept. If you have your own relations defined in properties, you need to first describe them in Table files (read more about this here).
- If you test your imports and need to clean Instance data, use the `pos-cli data clean` command. Note that this command works only on staging Instances.
- When you run an import twice, it will try to match previously imported data without removing existing changes. Example:
  - You have a record `shirt` with properties `color` and `size`.
  - The first import adds a new `shirt` with JSON: `{ "id": 1, "properties": { "color": "red" }}`
  - A user sets attributes of `shirt`, so `shirt` now has the following properties: `{ "color": "green", "size": "large" }`
  - A second import with the same file changes the `shirt` object to: `{ "color": "red", "size": "large" }`
- Import will only do basic validations, such as the existence of properties and type validation. It doesn't check any of the validations defined on forms.
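The re-import behavior described in the example above can be illustrated with a small Python sketch. Plain dicts stand in for record properties here; this mirrors the documented merge semantics, not platformOS internals.

```python
# Properties after the user edited the record:
existing = {"color": "green", "size": "large"}

# Properties for the same record id in the re-imported file:
reimported = {"color": "red"}

# Re-import overwrites the keys present in the file and keeps the rest:
merged = {**existing, **reimported}
print(merged)  # {'color': 'red', 'size': 'large'}
```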
Data Import with the CLI
With `pos-cli data`, you can export/import Instance data (information entered by the users).
The platformOS CLI provides the `pos-cli data import` command for uploading and importing data from a JSON file or a ZIP archive with CSV files. It accepts one option:
- `--path` (short: `-p`): a path to the JSON or ZIP file to be imported
Under the hood, the CLI uses the GraphQL import API that's described in the next section. However, using the CLI provides some major benefits:
- Uploading a file from disk
- Leveraging the CLI's authentication mechanism (i.e. you don't need to manually send your authentication token)
Input format
The required data format is a JSON file with users and records keys that have an array of maps. Each of those collections needs to be in the same format as a corresponding GraphQL mutation. Under the hood import is just a set of mutations: import_users()
, import_records()
. When importing data using the CLI, the file should contain data in a format that matches the following structure:
{
"users": [<UserImport],
"records": [<CustomizationImport>],
}
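Before uploading a hand-built file, it can be worth sanity-checking that it matches this shape. Below is a minimal Python sketch; the `users` and `records` key names come from the structure above, everything else is illustrative.

```python
import json

def check_import_shape(path):
    """Raise ValueError unless the file has 'users' and 'records' arrays of objects."""
    with open(path) as f:
        data = json.load(f)
    for key in ("users", "records"):
        items = data.get(key)
        if not isinstance(items, list):
            raise ValueError(f"'{key}' must be an array")
        if not all(isinstance(item, dict) for item in items):
            raise ValueError(f"every entry in '{key}' must be an object")
    return True
```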
Example
Example data import file `data.json`:

```json
{
  "users": [
    {
      "id": 1000,
      "email": "[email protected]",
      "first_name": "Tom",
      "slug": "tom"
    },
    {
      "id": 1001,
      "email": "[email protected]",
      "first_name": "mike",
      "slug": "mike",
      "created_at": "2010-10-10 10:00:00",
      "profiles": [
        {
          "type_name": "student",
          "id": 2001,
          "properties": {
            "age": 21
          },
          "attachments": [
            {
              "name": "document",
              "url": "files/document.pdf"
            }
          ],
          "images": [
            {
              "name": "avatar",
              "url": "files/avatar.png"
            }
          ]
        }
      ]
    }
  ],
  "records": [
    {
      "id": 1,
      "user_id": 1000,
      "type_name": "house",
      "customizable_id": 1000,
      "customizable_type": "User",
      "properties": {
        "color": "red"
      }
    },
    {
      "id": 2,
      "user_id": 1001,
      "type_name": "house",
      "customizable_id": 1001,
      "customizable_type": "User",
      "properties": {
        "color": "green",
        "buyer_id": 1000,
        "beach_house_id": 1
      },
      "addresses": [
        {
          "name": "foo_address",
          "address": "Warsaw",
          "postcode": "01-001"
        }
      ]
    },
    {
      "id": 3,
      "user_id": 1001,
      "type_name": "house",
      "properties": {
        "buyer_id": 1000,
        "beach_house_id": 2
      },
      "attachments": [
        {
          "name": "driver_licence",
          "url": "files/licence.pdf"
        }
      ],
      "images": [
        {
          "name": "headshot",
          "url": "files/headshot.png"
        }
      ]
    }
  ]
}
```
To import the data from a file, run the following command in the terminal:

```shell
pos-cli data import --path=data.json staging-instance
```
Data Import with GraphQL mutations
It is possible to import data using GraphQL mutations. This gives you more flexibility, and even allows you to create your own import tool as a website feature. Please remember to handle authentication properly.
There are two main mutations for importing: `import_users()` and `import_records()`. You can create a page that will import the sent data.
`app/graphql/import_data.graphql`

```graphql
mutation import_data(
  $users: [UserImport!]!,
  $records: [CustomizationImport!]!
) {
  import_users(users: $users) { ids }
  import_records(records: $records) { ids }
}
```
`app/views/pages/import_data.json.liquid`

```liquid
---
method: post
---
{% graphql res = "import_data",
  users: context.params.users,
  records: context.params.records
%}
{{ res }}
```
You can then send the data to this page, for example with curl:

```shell
curl -d @transformed.json \
  -H "Content-Type: application/json" \
  -X POST https://example.com/import_data.json
```
Data Import with raw import API
Our CLI uses the HTTP API to import a data file. You can use it in your own tool.
Please note that you need to provide your authentication token in the HTTP Authorization header of the request.

```shell
curl -F "import[data][email protected]" \
  -H "Authorization: Token token=[YOUR API TOKEN]" \
  -X POST https://example.com/api/app/imports
```
Note: You can find your API Token in the Partner Portal under "Access Key".
Data import using CSV files
It is possible to import data using CSV files; this is the most straightforward and performant way to import data.

```shell
pos-cli data import --zip --path=archive.zip staging-instance
```
CSV data format
Inside the ZIP archive there should be CSV files with the data to be imported. For example:
- `records.csv`
- `users.csv`
- `user_profiles.csv`
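Assembling such an archive can be scripted; here is a sketch using Python's `zipfile` module. The file names follow the example above, while the CSV contents are placeholders standing in for your exported data.

```python
import zipfile

# Placeholder CSV contents; in practice these come from your export pipeline.
csv_files = {
    "records.csv": "id,properties,table\n",
    "users.csv": "id,email\n",
    "user_profiles.csv": "id,user_id\n",
}

# Write each CSV into archive.zip, ready for `pos-cli data import --zip`.
with zipfile.ZipFile("archive.zip", "w", compression=zipfile.ZIP_DEFLATED) as zf:
    for name, content in csv_files.items():
        zf.writestr(name, content)
```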
- Before import, your Tables need to be deployed.
- CSV files only support comma-separated values; if in doubt, it might be a good idea to use a linter like CSVlint to check the file.
- A well-formatted file has a header with the names of the columns available for the given type, and one or more rows of data.
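The doubled quotes seen inside the `properties` column of the example files are standard CSV escaping, which Python's `csv` module produces automatically. This sketch builds one row in that style; the `properties`/`table` column names and the todo payload mirror the Records examples in this topic.

```python
import csv
import io
import json

# Build a tiny Records CSV in memory; csv.writer quotes the properties cell
# and doubles the quotes inside the embedded JSON payload.
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["properties", "table"])
writer.writerow([json.dumps({"name": "Buy milk", "done": "false"}), "todo"])
print(buf.getvalue())
```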
Example file for Records
```csv
id,user_id,created_at,updated_at,properties,model_schema
1,,2020-03-13 11:22:54.825836,2020-03-17 05:32:41.477453,"{""name"": ""Buy milk"", ""done"": ""false""}",todo
2,,2020-03-13 11:27:26.236759,2020-03-17 05:39:34.765109,"{""name"": ""Do code reviews"", ""done"": ""true""}",todo
3,,2020-03-13 11:27:26.236759,2020-03-17 05:39:34.765109,"{""name"": ""Fix a tpyo"", ""done"": ""false""}",todo
```
This file will import Records matching the Table for "todo". Notice that there is a typo in the record with id=3; to fix it, you can edit the file like this:
```csv
id,user_id,created_at,updated_at,properties,model_schema
1,,2020-03-13 11:22:54.825836,2020-03-17 05:32:41.477453,"{""name"": ""Buy milk"", ""done"": ""false""}",todo
2,,2020-03-13 11:27:26.236759,2020-03-17 05:39:34.765109,"{""name"": ""Do code reviews"", ""done"": ""true""}",todo
3,,2020-03-13 11:27:26.236759,2020-03-17 05:39:34.765109,"{""name"": ""Fix a typo"", ""done"": ""true""}",todo
```
Importing the edited file will simply update the data of the matching rows. If you add new rows, only those will be added; existing rows will not be deleted.
Additionally, not all columns are required, so, for example, this will work as well:
```csv
properties,table
"{""name"": ""Buy milk"", ""done"": ""false""}",todo
"{""name"": ""Do code reviews"", ""done"": ""true""}",todo
```
Example file for Users
```csv
id,email,encrypted_password,created_at,updated_at,name,authentication_token,slug,time_zone,first_name,middle_name,last_name,external_id,properties
1,[email protected],abcdefghijaklmnoprst111,2020-03-11 05:44:13.302432,2020-03-11 05:44:13.302432,[email protected],xyz123fhfhfh,api-example-com,,John,A,Smith,af56f347-ecdd-4636-8cf6-76aa58f89b61,{}
```
It is important to note that `encrypted_password` will only work if you are importing data previously exported from our system. Setting passwords through other means will not work: they will be saved, but the system will not accept them when checking through application code. It would also be unsafe to set passwords this way, so it is better to use a random value here and require users to change their passwords after migration.
Example file for UserProfiles
```csv
id,user_id,profile_type,created_at,updated_at,properties,instance_profile_type_name
1,2,"buyer",2020-03-11 05:44:13.302432,2020-03-11 05:44:13.302432,"{""name"": ""Our Services""}","modules/menu/buyer"
```