
Background

Carmen Tawalika edited this page Sep 22, 2021 · 3 revisions

Concepts:

Item, Catalog and Collection

Item

  • An Item represents a single spatiotemporal asset as GeoJSON so it can be searched.
  • Items require a link back to the Collection they are part of. If a field has the same value for all Items in a Collection, it should be moved to the Collection
  • Items can only belong to one Collection
  • Items are always tree leaves
  • A STAC Item is always a GeoJSON Feature with geometry, bbox, properties and links. Within properties, the field datetime is required
  • Other fields may include a thumbnail, asset links (download or streaming access), relationship links, and the core set of Common Metadata plus STAC Content Extensions (see below)
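
To make the Item structure above more concrete, here is a minimal sketch in Python. The id, coordinates and hrefs are placeholders invented for illustration, and `missing_item_fields` is a hypothetical helper that only checks the fields named in the list above; it is not a full validation against the STAC Item specification.

```python
# Hypothetical minimal Item; all ids, coordinates and hrefs are placeholders.
item = {
    "type": "Feature",
    "stac_version": "1.0.0",
    "id": "example-item",
    "geometry": {
        "type": "Polygon",
        "coordinates": [
            [[7.0, 50.0], [8.0, 50.0], [8.0, 51.0], [7.0, 51.0], [7.0, 50.0]]
        ],
    },
    "bbox": [7.0, 50.0, 8.0, 51.0],
    "properties": {"datetime": "2021-09-22T00:00:00Z"},
    "collection": "example-collection",
    # Relative link back to the Collection the Item belongs to.
    "links": [{"rel": "collection", "href": "../collection.json"}],
    "assets": {"data": {"href": "https://example.com/data.tif"}},
}


def missing_item_fields(item):
    """Return the names of the fields discussed above that are absent."""
    missing = [
        key for key in ("geometry", "bbox", "properties", "links")
        if key not in item
    ]
    if "datetime" not in item.get("properties", {}):
        missing.append("properties.datetime")
    if not any(link.get("rel") == "collection" for link in item.get("links", [])):
        missing.append("links[rel=collection]")
    return missing
```

Calling `missing_item_fields(item)` on the example returns an empty list, while an Item without a link back to its Collection would be reported as incomplete.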

Catalog

  • The Catalog specification provides structural elements to group Items and Collections.
  • A catalog/collection is typically the "entry point" into a STAC object hierarchy (root endpoint).
  • Catalogs are used for two main things:
    • Split overly large collections into groups or
    • Group collections into a catalog of Collections (e.g. as entry point).
  • A catalog has an id, description, stac_version and links
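
As a rough illustration of the second use case (grouping Collections under one entry point), a Catalog could be assembled as below. `make_catalog` is a hypothetical helper, the hrefs are placeholders, and the `type` field is included on the assumption that STAC 1.0.0 is targeted.

```python
def make_catalog(catalog_id, description, child_hrefs, stac_version="1.0.0"):
    """Build a minimal Catalog dict that points at its children."""
    links = [{"rel": "root", "href": "./catalog.json"}]
    # One "child" link per Collection (or sub-Catalog) grouped by this Catalog.
    links += [{"rel": "child", "href": href} for href in child_hrefs]
    return {
        "type": "Catalog",
        "id": catalog_id,
        "description": description,
        "stac_version": stac_version,
        "links": links,
    }


catalog = make_catalog(
    "example-root",
    "Entry point grouping two example Collections",
    ["./collection-a/collection.json", "./collection-b/collection.json"],
)
```

The resulting dict carries exactly the fields named above (id, description, stac_version and links) plus `type`.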

Collection

  • Collections are Catalogs that add more required metadata and describe a group of related Items.
  • A Collection can have parent Catalog and Collection objects, as well as child Item, Catalog, and Collection objects. It does not need to have any of them
  • Ideally Collections also link to fuller metadata (ISO 19115, etc) when available
  • A collection has all fields of a catalog + license and extent
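
The relationship "Collection = Catalog fields + license and extent" can be sketched as follows. The field lists are taken from the bullets above, the example values are placeholders, and `missing_fields` is a hypothetical helper, not part of any STAC library.

```python
CATALOG_FIELDS = ("id", "description", "stac_version", "links")
COLLECTION_FIELDS = CATALOG_FIELDS + ("license", "extent")

# Hypothetical minimal Collection; all values are placeholders.
collection = {
    "type": "Collection",
    "stac_version": "1.0.0",
    "id": "example-collection",
    "description": "A group of related example Items",
    "license": "CC-BY-4.0",
    "extent": {
        "spatial": {"bbox": [[7.0, 50.0, 8.0, 51.0]]},
        # An open-ended interval: start known, end still open.
        "temporal": {"interval": [["2021-01-01T00:00:00Z", None]]},
    },
    "links": [],
}


def missing_fields(stac_object, required):
    """Return the required field names that the object lacks."""
    return [name for name in required if name not in stac_object]
```

A plain Catalog passed to `missing_fields(..., COLLECTION_FIELDS)` would be reported as lacking `license` and `extent`, which is the difference described above.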

Misc

How they interact

  • They are heavily hyperlinked (keywords: Hypermedia, HATEOAS)
  • The STAC core object type specifications (Item, Catalog, and Collection specifications) can be implemented in a static manner (static files on a filesystem)
  • The STAC API specification allows more complex queries. STAC objects are then usually held in a database instead of as static files on a filesystem

relative vs absolute hyperlinks

  • Relative: Self-contained Catalogs
    • Nearly all links are relative (except perhaps the license link); a self link is not possible
      • Variation 1: Self-contained Metadata Only (assets have absolute links)
      • Variation 2: Self-contained with Assets (assets are included)
  • Absolute: Published Catalogs
    • Absolute Published Catalog (in links + assets all links are absolute)
    • Relative Published Catalog: Self-contained Catalogs with absolute self link at the root
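
The distinction above can be approximated in code by inspecting the links of a single STAC object. This is only a sketch: `link_style` and its return labels are hypothetical, asset hrefs are ignored, and license links are excluded from the check because they may be absolute even in a self-contained catalog.

```python
from urllib.parse import urlparse


def is_absolute(href):
    """Treat an href as absolute if it carries a URL scheme such as https."""
    return bool(urlparse(href).scheme)


def link_style(stac_object):
    """Roughly classify one STAC object by the style of its links."""
    links = stac_object.get("links", [])
    structural = [l for l in links if l["rel"] not in ("self", "license")]
    has_abs_self = any(
        l["rel"] == "self" and is_absolute(l["href"]) for l in links
    )
    all_absolute = all(is_absolute(l["href"]) for l in structural)
    all_relative = not any(is_absolute(l["href"]) for l in structural)

    if all_absolute and has_abs_self:
        return "absolute published"
    if all_relative and has_abs_self:
        return "relative published"
    if all_relative:
        return "self-contained"
    return "mixed"
```

For example, a root catalog whose `child` links are relative but which carries an absolute `self` link would be classified as "relative published".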

Harvesting

It should be noted that a Catalog does not have to link back to all the other Catalogs that point to it. Thus a published root catalog might be a sub-catalog of someone else's structure. The goal is for data providers to publish all the information and links they want to, while also encouraging a natural web of information to arise as Catalogs and Items are linked to across the web.

This means that harvesting is encouraged, and it is fine for links to lead out of one's own STAC Catalog. With absolute links this works out of the box; relative links would first need to be converted into absolute links. Recursive harvesting would also be possible, but we want to store as little external information as possible.
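
Converting relative links into absolute ones while harvesting could look like the sketch below. It assumes the URL from which the STAC object was fetched is known; `absolutize_links` is a hypothetical helper, the example URLs are placeholders, and asset hrefs are deliberately left untouched.

```python
from urllib.parse import urljoin


def absolutize_links(stac_object, source_url):
    """Return a copy of the object whose link hrefs are all absolute.

    source_url is the URL the object was harvested from; relative hrefs
    are resolved against it, absolute hrefs are kept unchanged.
    """
    resolved = dict(stac_object)
    resolved["links"] = [
        {**link, "href": urljoin(source_url, link["href"])}
        for link in stac_object.get("links", [])
    ]
    return resolved


harvested = absolutize_links(
    {
        "id": "example-item",
        "links": [
            {"rel": "collection", "href": "../collection.json"},
            {"rel": "license", "href": "https://example.org/license"},
        ],
    },
    "https://example.com/stac/a/items/item.json",
)
```

Because only the links of the harvested object itself are rewritten, nothing beyond that object needs to be stored, which matches the goal of keeping external information to a minimum.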

Further Reading:

STAC in openEO

Important: The STAC specification and the STAC API are different specifications and have had different version numbers since version 0.9.0. The openEO API only implements STAC API version 0.9.0, which allows serving all STAC specification versions in the range of 0.9.x and 1.x.x (see the stac_version property).
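
The version range stated above can be expressed as a small check on the stac_version property. This is a sketch under the assumption that stac_version follows a major.minor.patch pattern; `is_servable_stac_version` is a hypothetical name and not part of openEO.

```python
def is_servable_stac_version(stac_version):
    """Return True for STAC specification versions 0.9.x and 1.x.x."""
    parts = stac_version.split(".")
    try:
        major, minor = int(parts[0]), int(parts[1])
    except (ValueError, IndexError):
        # Not a recognisable major.minor version string.
        return False
    return major == 1 or (major == 0 and minor == 9)
```

With this check, "0.9.0" and "1.0.0" are accepted, while "0.8.1" is rejected.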

The Collection specification is used standalone quite easily - it is used to describe an aggregation of data, and doesn't require links down to sub-catalogs and Items. [...] They often have an optimized internal format that doesn't make sense to expose as Items. OpenEO and Google Earth Engine are two examples that only use STAC collections, ..

STAC API recommendations

  • data discovery as in openEO (STAC + OGC features)
    • GET /collections
    • GET /collections/{collection_id}
  • STAC full implementation, not in openEO
    • GET /collections/{collectionId}/items
    • GET /collections/{collectionId}/items/{featureId}
    • POST /search
    • GET /search
  • examples:
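
As a rough sketch of how the routes listed above could be assembled on the client side, the helpers below build the request URLs from a base URL. The base URL and ids are placeholders, and `discovery_urls` and `item_urls` are hypothetical helpers rather than functions of any openEO or STAC client library.

```python
from urllib.parse import quote


def discovery_urls(base_url, collection_id):
    """URLs for the data discovery routes (as in openEO)."""
    base = base_url.rstrip("/")
    cid = quote(collection_id, safe="")
    return {
        "collections": f"{base}/collections",
        "collection": f"{base}/collections/{cid}",
    }


def item_urls(base_url, collection_id, feature_id):
    """URLs for the additional routes of a full STAC API implementation."""
    base = base_url.rstrip("/")
    cid = quote(collection_id, safe="")
    fid = quote(feature_id, safe="")
    return {
        "items": f"{base}/collections/{cid}/items",
        "item": f"{base}/collections/{cid}/items/{fid}",
        "search": f"{base}/search",  # used with both GET and POST
    }
```

The ids are percent-encoded so that characters such as spaces or slashes do not break the path.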