Upload Agent
Overview
Roboto offers a first-party upload agent that coordinates dataset creation and file upload from devices to Roboto.
Initial Setup
The upload agent is available as a standalone binary executable, hosted on GitHub Releases.
Once downloaded, we recommend renaming it from its OS-specific name (roboto-agent-macos-aarch64 or similar) to roboto-agent.
The agent is configured using a JSON file placed at $HOME/.roboto/upload_agent.json, described in more detail below.
You can generate this file interactively by running roboto-agent configure, or you can create it manually.
In order to run, the upload agent also requires a Device Access Token or Personal Access Token to be available in $HOME/.roboto/config.json, as described on those pages.
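As a quick illustration of the two prerequisites above, the following Python sketch checks whether both files exist in a given directory. The helper name missing_agent_files is hypothetical and is not part of the agent; only the two file names come from this page.

```python
from pathlib import Path

# File names come from the text above; the helper itself is hypothetical.
REQUIRED_FILES = ("upload_agent.json", "config.json")


def missing_agent_files(roboto_dir: Path) -> list[str]:
    """Return the names of required files that are absent from roboto_dir."""
    return [name for name in REQUIRED_FILES if not (roboto_dir / name).is_file()]


if __name__ == "__main__":
    # Read-only check of the default location, $HOME/.roboto
    missing = missing_agent_files(Path.home() / ".roboto")
    if missing:
        print("Missing before the agent can run:", ", ".join(missing))
    else:
        print("Both configuration files are present.")
```

This only confirms that the files exist. It does not validate their contents or the access token.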
Agent Config File
The following is an example upload_agent.json file:
{
  "version": "v1",
  "delete_uploaded_files": true,
  "search_paths": [
    "/Users/roboto/datasets/autoupload"
  ],
  "upload_config_filename": ".roboto_upload.json"
}
The fields it exposes are:
version
- This should always be set to “v1”.
delete_uploaded_files
- If true, the agent will delete files after they’ve been uploaded to Roboto. This behavior can be overridden on an upload-by-upload basis as described below. Regardless of this setting, the agent will always delete the .roboto_upload.json upload config file from local disk after the upload is completed.
search_paths
- Directories which the agent will search recursively for upload config files. If you make these too broad (like /), the agent may take a long time to run, so you should use the most specific paths that you can.
upload_config_filename
- The name of the upload config file which signals that a directory should be uploaded as a new dataset. The default is .roboto_upload.json, but you can change this if you have a compelling reason.
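If you prefer to create the agent config file from a provisioning script rather than by hand, a minimal Python sketch might look like the following. The function names build_agent_config and write_agent_config are hypothetical; the field names and default values are the ones listed above.

```python
import json
from pathlib import Path


def build_agent_config(search_paths, delete_uploaded_files=True,
                       upload_config_filename=".roboto_upload.json"):
    """Assemble the agent config using the four fields documented above."""
    return {
        "version": "v1",
        "delete_uploaded_files": delete_uploaded_files,
        "search_paths": [str(p) for p in search_paths],
        "upload_config_filename": upload_config_filename,
    }


def write_agent_config(config, roboto_dir):
    """Write config as upload_agent.json inside roboto_dir and return its path."""
    roboto_dir = Path(roboto_dir)
    roboto_dir.mkdir(parents=True, exist_ok=True)
    target = roboto_dir / "upload_agent.json"
    target.write_text(json.dumps(config, indent=2) + "\n")
    return target
```

For example, write_agent_config(build_agent_config(["/Users/roboto/datasets/autoupload"]), Path.home() / ".roboto") would write the file to the location the agent reads from.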
Upload Config File
The upload agent works by recursively scanning the search_paths defined in its agent config file for directories containing an upload config file (named .roboto_upload.json by default).
The presence of this file signals to the agent that a new dataset should be created, and that files in the same directory or subdirectories as the upload config file should be uploaded to it.
In addition to acting as a signal file, the contents of the upload config file can be used to control properties of the dataset which will be created, and to configure other upload parameters, such as whether the agent should delete files locally once they’ve been uploaded.
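To make the scanning behavior concrete, here is a rough Python sketch of the discovery step as described above. It is an illustration only and not the agent's actual implementation; the function name find_upload_dirs is hypothetical, and how the real agent treats nested upload config files is not covered here.

```python
import os
from pathlib import Path


def find_upload_dirs(search_paths, upload_config_filename=".roboto_upload.json"):
    """Return every directory under search_paths that holds an upload config file."""
    found = []
    for root in search_paths:
        # os.walk visits root and all of its subdirectories recursively.
        for dirpath, _dirnames, filenames in os.walk(root):
            if upload_config_filename in filenames:
                found.append(Path(dirpath))
    return sorted(found)
```

Each returned directory corresponds to one dataset that would be created. The cost of the walk grows with the size of the tree, which is why narrow search paths are recommended.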
The following is an example upload config file:
{
  "version": "v1",
  "upload": {
    "delete_uploaded_files": false,
    "exclude_patterns": ["**/.DS_Store"]
  },
  "dataset": {
    "org_id": "og_123456789012",
    "description": "This dataset contains logs from the Roboto team driving an RC car in the park",
    "tags": ["upload_agent_example", "crash", "4wd"],
    "add_to_collections": ["cl_12345"],
    "metadata": {
      "operator": "benji@roboto.ai",
      "location": "Gas Works Park",
      "avg_temp_f": "77",
      "visibility": "clear"
    }
  }
}
The fields it exposes are:
version
- This should always be set to “v1”.
upload.delete_uploaded_files
- (Optional) This allows you to override the agent config file’s delete_uploaded_files setting for an individual upload.
upload.exclude_patterns
- (Optional) This allows you to specify a list of .gitignore format glob patterns which the agent will use to exclude files from the upload. If omitted, everything will be uploaded.
dataset.org_id
- (Optional) The ID of the org with which the dataset should be associated. You only need to set this if you’re a member of multiple orgs; otherwise it will be determined implicitly.
dataset.description
- (Optional) A freeform description of the dataset.
dataset.tags
- (Optional) Initial tags to apply to the dataset.
dataset.metadata
- (Optional) Initial key-value metadata pairs to apply to the dataset.
dataset.add_to_collections
- (Optional) A list of collection IDs to which the dataset should be added.
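A device-side script that finishes recording could mark its output directory for upload with something like the Python sketch below. Both function names are hypothetical. The sketch also assumes that the upload and dataset sections can be left out entirely when empty, which this page does not state, and that the agent config always carries delete_uploaded_files. The second function only illustrates the override rule described above; it is not the agent's own code.

```python
import json
from pathlib import Path


def write_upload_config(dataset_dir, dataset=None, upload=None,
                        filename=".roboto_upload.json"):
    """Write an upload config file into dataset_dir and return its path."""
    config = {"version": "v1"}
    # Assumption: empty sections may be omitted from the file.
    if upload:
        config["upload"] = upload
    if dataset:
        config["dataset"] = dataset
    target = Path(dataset_dir) / filename
    target.write_text(json.dumps(config, indent=2) + "\n")
    return target


def effective_delete_setting(agent_config, upload_config):
    """Resolve delete_uploaded_files, letting the upload config override the agent config."""
    override = upload_config.get("upload", {}).get("delete_uploaded_files")
    if override is None:
        return agent_config["delete_uploaded_files"]
    return override
```

Because the presence of this file is what signals the agent to start an upload, it seems prudent to write it only after all data files in the directory are complete.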
Running the Agent
There are two main ways to run the agent:
roboto-agent run
will run a single scan of the search paths, upload any datasets found, and then exit. This is the best choice for most upload scenarios.
roboto-agent run --forever
will run the agent in a loop, scanning the search paths every 30 seconds. This is intended to be run as a daemon process.
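If the agent is triggered from another program, for example at the end of a recording session, the two invocations above could be wrapped as in the following Python sketch. The function names are hypothetical, the binary is assumed to be on the PATH under the name roboto-agent, and the meaning of the agent's exit codes is not documented on this page, so the return value is passed through uninterpreted.

```python
import subprocess


def agent_command(forever=False, binary="roboto-agent"):
    """Build the argument list for a single scan or for the looping mode."""
    command = [binary, "run"]
    if forever:
        command.append("--forever")
    return command


def run_agent(forever=False, binary="roboto-agent"):
    """Invoke the agent and return its exit code."""
    return subprocess.run(agent_command(forever, binary), check=False).returncode
```

Calling run_agent() performs one scan and returns when the agent exits, while run_agent(forever=True) blocks for as long as the agent keeps looping.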