Collections
Overview
A collection is a named, versioned container that groups related resources for retrieval, sharing, and programmatic access. For example, you might build a file collection that pulls front-camera images from ten separate flight datasets to assemble a training set without duplicating any data.
Three properties are worth understanding before working with collections:
A collection holds one resource type — dataset, file, or event. You choose the type at creation time and cannot change it. To group both files and whole datasets, create two collections.
Collections are versioned: every mutation increments the version number and the full change history is retained.
Collection IDs begin with the cl_ prefix.
Resource types
Dataset — the collection references whole datasets. Use this when the unit of work is an entire session or mission, for example grouping all datasets captured on a particular firmware release for batch regression testing.
File — the collection references individual files drawn from any number of datasets. Use this when you need specific files, such as all front-camera images regardless of which dataset they live in.
Event — the collection references events. Use this when curating a labelled set of moments across many logs, such as all “hard braking” events that make up an ML training dataset.
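Because a collection holds exactly one resource type, a mixed list of IDs has to be split before any collections are created. The following is a minimal sketch of that split in plain Python. It assumes the ID prefixes seen in this page's examples (ds_, fi_, ev_) reliably identify the resource type, which is an inference from the examples rather than a documented contract; group_ids_by_type and PREFIX_TO_TYPE are hypothetical names and not part of the Roboto SDK.

```python
from collections import defaultdict

# Assumption: these prefixes mirror the example IDs on this page
# (ds_..., fi_..., ev_...); they are not a documented contract.
PREFIX_TO_TYPE = {"ds_": "dataset", "fi_": "file", "ev_": "event"}


def group_ids_by_type(resource_ids):
    """Bucket resource IDs by the resource type their prefix suggests."""
    groups = defaultdict(list)
    for resource_id in resource_ids:
        for prefix, resource_type in PREFIX_TO_TYPE.items():
            if resource_id.startswith(prefix):
                groups[resource_type].append(resource_id)
                break
        else:
            # No known prefix matched, so refuse to guess a type.
            raise ValueError(f"Unrecognized ID prefix: {resource_id!r}")
    return dict(groups)


mixed = ["fi_aaa111", "ds_aaa111", "fi_bbb222"]
groups = group_ids_by_type(mixed)
print(groups)
```

Each key in the result corresponds to one collection you would need to create: here, one file collection and one dataset collection.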
Creating a collection
The minimal CLI invocation creates an empty collection and defaults its resource type to
file. Pass --resource-type explicitly to suppress the default warning, or seed the
collection with initial resource IDs so the type can be inferred:
# Empty file collection (--resource-type suppresses the default warning)
roboto collections create \
    --name "Front camera images" \
    --resource-type file

# Dataset collection seeded with two datasets
roboto collections create \
    --name "Firmware 3.4.2 missions" \
    --description "All flights captured on fw 3.4.2 for regression testing" \
    --dataset-id ds_aaa111 --dataset-id ds_bbb222 \
    --tag regression --tag fw-3.4.2
The equivalent SDK calls:
from roboto import Collection

# File collection seeded with two files
collection = Collection.create(
    name="Front camera images",
    description="Images for training the front-camera model",
    file_ids=["fi_aaa111", "fi_bbb222"],
    tags=["training", "front-camera"],
)
print(collection.collection_id)  # cl_...

# Dataset collection
collection = Collection.create(
    name="Firmware 3.4.2 missions",
    dataset_ids=["ds_aaa111", "ds_bbb222"],
    tags=["regression", "fw-3.4.2"],
)
Adding and removing resources
Use roboto collections update to add or remove resources, rename the collection, or
update its description:
# Add files
roboto collections update cl_abc123 \
    --add-file-id fi_ccc333 --add-file-id fi_ddd444

# Remove a file
roboto collections update cl_abc123 \
    --remove-file-id fi_aaa111

# Remove a dataset
roboto collections update cl_abc123 \
    --remove-dataset-id ds_aaa111

# Rename and re-describe in the same call
roboto collections update cl_abc123 \
    --name "Updated collection name" \
    --description "Revised description"
The SDK exposes per-type convenience methods as well as a general update method for
batching multiple changes in one call:
from roboto import Collection
from roboto.domain.collections import CollectionResourceRef, CollectionResourceType

collection = Collection.from_id("cl_abc123")

# Per-type convenience methods
collection.add_file("fi_ccc333")
collection.remove_file("fi_aaa111")
collection.add_dataset("ds_ccc333")
collection.remove_dataset("ds_aaa111")
collection.add_event("ev_ccc333")
collection.remove_event("ev_aaa111")

# Batch multiple changes in a single call
collection.update(
    add_resources=[
        CollectionResourceRef(
            resource_type=CollectionResourceType.File,
            resource_id="fi_eee555",
        )
    ],
    add_tags=["validated"],
    remove_tags=["draft"],
)
Versioning
Every mutation — adding or removing resources, updating the name, description, or tags — creates a new version. The version number is a monotonically incrementing integer stored on the collection record.
Each version records the exact set of resource IDs that belonged to the collection at that point in time, making it possible to look up what the collection contained at any past version. Versioning tracks collection membership, not the underlying resources: if a resource is deleted, the version history still records it as a former member, but the resource itself is no longer accessible.
# Show the collection as it existed at version 3
roboto collections show cl_abc123 --collection-version 3

# Show all changes between versions 1 and 5
roboto collections changelog cl_abc123 --from-version 1 --to-version 5
The same lookups through the SDK:
from roboto import Collection

# Load the current version of the collection
collection = Collection.from_id("cl_abc123")

# Retrieve a specific historical version
historical = Collection.from_id("cl_abc123", version=3)

# Iterate the change log
for change in collection.changes(from_version=1, to_version=5):
    print(change.to_version, change.applied_by, change.change_set)
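To make the versioning semantics concrete, here is a toy in-memory model of versioned membership. It is only a sketch of the behavior described in this section, not how Roboto implements it: VersionedMembership and its methods are hypothetical, it assumes numbering starts at 1 on creation, and it tracks membership changes only (the real service also creates versions for name, description, and tag updates).

```python
class VersionedMembership:
    """Toy model: every mutation bumps the version and snapshots membership."""

    def __init__(self, resource_ids=()):
        # Assumption: the first version is 1.
        self.version = 1
        self._snapshots = {1: frozenset(resource_ids)}

    def _commit(self, members):
        # Record the full membership under a new, strictly larger version.
        self.version += 1
        self._snapshots[self.version] = frozenset(members)

    def add(self, resource_id):
        self._commit(self._snapshots[self.version] | {resource_id})

    def remove(self, resource_id):
        self._commit(self._snapshots[self.version] - {resource_id})

    def members_at(self, version):
        """Return the exact membership recorded for a past version."""
        return self._snapshots[version]

    def diff(self, from_version, to_version):
        """Summarize membership changes between two versions."""
        old = self._snapshots[from_version]
        new = self._snapshots[to_version]
        return {"added": sorted(new - old), "removed": sorted(old - new)}


c = VersionedMembership(["fi_aaa111", "fi_bbb222"])  # version 1
c.add("fi_ccc333")     # version 2
c.remove("fi_aaa111")  # version 3

print(c.version)
print(sorted(c.members_at(1)))
print(c.diff(1, 3))
```

Note that members_at(1) still lists fi_aaa111 after it was removed. That mirrors the point above: history records former members regardless of what later happens to them.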
Listing and retrieving collections
# List all collections in your org
roboto collections list
# Show a specific collection (resource IDs included by default)
roboto collections show cl_abc123
# Show with full resource details hydrated
roboto collections show cl_abc123 --content-mode full
The --content-mode flag controls how much detail is returned:
summary_only: metadata only, no resource IDs or details. Useful for listing.
references: metadata plus the ID of every resource in the collection. This is the default for show and list.
full: metadata plus the fully hydrated record of every resource. Use this when you need resource details without making separate API calls per resource.
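The difference between the three modes is easiest to see side by side. The sketch below models them over plain dictionaries. The payload shape and field names (resources, size_bytes, and so on) are invented for illustration and do not describe the actual API response; render_collection is a hypothetical helper, not an SDK function.

```python
def render_collection(record, resources_by_id, mode="references"):
    """Return an illustrative payload for one content mode."""
    payload = {"collection_id": record["collection_id"], "name": record["name"]}
    if mode == "summary_only":
        # Metadata only: no resource IDs, no resource details.
        return payload
    if mode == "references":
        # Metadata plus the ID of every member resource.
        payload["resources"] = list(record["resource_ids"])
        return payload
    if mode == "full":
        # Metadata plus the full record of every member resource.
        payload["resources"] = [resources_by_id[rid] for rid in record["resource_ids"]]
        return payload
    raise ValueError(f"Unknown content mode: {mode!r}")


# Hypothetical sample data, for illustration only.
record = {
    "collection_id": "cl_abc123",
    "name": "Front camera images",
    "resource_ids": ["fi_aaa111", "fi_bbb222"],
}
resources_by_id = {
    "fi_aaa111": {"file_id": "fi_aaa111", "size_bytes": 1024},
    "fi_bbb222": {"file_id": "fi_bbb222", "size_bytes": 2048},
}

for mode in ("summary_only", "references", "full"):
    print(mode, render_collection(record, resources_by_id, mode))
```

The trade-off is payload size against round trips: full returns the most data in one call, while summary_only keeps listings light.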
The SDK exposes the same modes through CollectionContentMode:
from roboto import Collection
from roboto.domain.collections import CollectionContentMode

# Fetch by ID with full content hydrated
collection = Collection.from_id(
    "cl_abc123",
    content_mode=CollectionContentMode.Full,
)

# List all collections (summary mode by default)
for c in Collection.list_all():
    print(c.collection_id, c.record.name)
Deleting a collection
Deleting a collection removes the collection container only. The underlying resources — datasets, files, or events — are not affected.
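roboto collections delete cl_abc123
The equivalent SDK call:
from roboto import Collection

collection = Collection.from_id("cl_abc123")
collection.delete()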
CLI reference
See the roboto collections CLI for the complete list of commands and flags.