| Title: | |
|---|---|
| Description: | A collection of tools for automating geospatial catalogs |
| Authors: | Mike Johnson [aut, cre] (ORCID: <https://orcid.org/0000-0002-5288-8350>), Justin Singh-Mohudpur [aut] (ORCID: <https://orcid.org/0000-0002-5233-5799>), ESIP [fnd] |
| Maintainer: | Mike Johnson <[email protected]> |
| License: | MIT |
| Version: | 0.1.0 |
| Built: | 2026-05-28 14:41:04 UTC |
| Source: | https://github.com/mikejohnson51/climateR-catalogs |

data_source is a generic base class for describing catalog data sources. Each data source must provide a pull method and a tidy method.
Return values:
$to_ipc_stream(): (raw()) The data member of this data source in Arrow IPC Stream format.
$to_ipc_file(): (character(1)) The given path, invisibly.
.id (character(1))
.data (arrow::Table)
.pull (function)
.tidy (function)
.error (logical(1))
.error_steps (data.frame)
.finished (logical(1)): TRUE if $tidy() has been called successfully, otherwise FALSE.
id (character(1)): Identifier of this data source.
result (arrow::Table): Result of this data source after $tidy() is called. (Read-only)
new()
Create a new catalog data source.
data_source$new(id, pull, tidy)
id (character(1)): Identifier for this data source.
pull (function): Pull method for this class. The pull function may take any number of parameters and must return one of: arrow::Table, data.table::data.table, or a data.frame.
tidy (function): Tidy method for this class. See the tidy method for details. The tidy function must accept at least one argument; its first argument takes in, and the function returns, one of: arrow::Table, data.table::data.table, or a data.frame.
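The contract described for pull and tidy can be illustrated with a minimal sketch in plain R. The function names, parameters, and column names below are hypothetical and chosen only for illustration; they are not the package's catalog schema. A real pull function would normally fetch from a remote endpoint, whereas this one builds its data in memory so the sketch runs offline.

```r
# Hypothetical pull function: takes any number of parameters and returns a
# data.frame (one of the accepted return types).
example_pull <- function(n = 3) {
  data.frame(
    Variable = paste0("var_", seq_len(n)),
    Units    = rep("mm", n),
    stringsAsFactors = FALSE
  )
}

# Hypothetical tidy function: the first argument receives the pulled data,
# and the function returns a data.frame. Column names are illustrative only.
example_tidy <- function(.tbl, id = "example") {
  data.frame(
    id       = id,
    variable = tolower(.tbl$Variable),
    units    = .tbl$Units,
    stringsAsFactors = FALSE
  )
}

raw_tbl  <- example_pull(n = 2)
tidy_tbl <- example_tidy(raw_tbl, id = "example")
print(tidy_tbl)

# With climateR.catalogs loaded, the two functions would be supplied as:
# src <- data_source$new(id = "example", pull = example_pull, tidy = example_tidy)
```

Keeping pull and tidy as ordinary functions makes each one testable on its own before it is wrapped in a data source.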
print()
data_source$print(...)
pull()
Pull a catalog data source from its endpoint.
This method is user-defined at object creation.
data_source$pull(..., ..attempts)
... (any): User-defined parameters that may be used.
tidy()
Tidy a raw catalog data source into the catalog schema.
data_source$tidy(..., ..attempts)
... (any): User-defined parameters that may be used.
to_ipc_stream()
Marshal this data source to Arrow IPC Stream format
data_source$to_ipc_stream()
from_ipc_stream()
Unmarshal an Arrow IPC Stream into a data source.
data_source$from_ipc_stream(stream)
stream (raw()): The Arrow IPC Stream to read.
to_ipc_file()
Output this data source to Arrow IPC File format
data_source$to_ipc_file(path)
path (character(1)): Path of the file to write. Should have the extension '.arrow'.
from_ipc_file()
Read a data source from Arrow IPC File format
data_source$from_ipc_file(path)
path (character(1)): Path to the Arrow IPC file.
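The two IPC formats mentioned above can be tried directly with the arrow package, independent of this class. The sketch below is an assumption about how such a round trip might look, not the package's internal implementation: it serializes an arrow::Table to a raw vector in IPC Stream format and to a '.arrow' file in IPC File format (written here with arrow::write_feather, whose version 2 format is the Arrow IPC File format). It assumes arrow is installed and does nothing otherwise.

```r
# Assumes the 'arrow' package is installed; the block is skipped otherwise.
if (requireNamespace("arrow", quietly = TRUE)) {
  tbl <- arrow::arrow_table(
    data.frame(id = c("a", "b"), value = c(1.5, 2.5))
  )

  # In-memory round trip: IPC Stream format held in a raw vector.
  stream      <- arrow::write_to_raw(tbl, format = "stream")
  from_stream <- arrow::read_ipc_stream(stream, as_data_frame = FALSE)

  # On-disk round trip: IPC File format (Feather V2) with a '.arrow' extension.
  path <- tempfile(fileext = ".arrow")
  arrow::write_feather(tbl, path)
  from_file <- arrow::read_feather(path, as_data_frame = FALSE)

  # Both round trips should reproduce the original table.
  stopifnot(is.raw(stream), from_stream$Equals(tbl), from_file$Equals(tbl))
}
```

The stream form suits passing a data source between processes, while the file form suits caching results on disk.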
clone()
The objects of this class are cloneable with this method.
data_source$clone(deep = FALSE)
deep: Whether to make a deep clone.
If all parameters to $new() are missing, an empty data_source is created. This is only useful for reading from IPC.
Create a new climateR.catalogs data source plugin
new_data_source(name, dir)

| Argument | Description |
|---|---|
| name | Name of the data source |
| dir | Directory to output |

Parse ISO 8601 duration string to human-readable interval
parse_iso8601_duration(step)

| Argument | Description |
|---|---|
| step | ISO 8601 duration string (e.g. "P1DT0H0M0S", "P0Y1M0DT0H0M0S") |

Returns a character string such as "1 day", "1 month", or "1 year".
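To make the conversion concrete, here is a minimal base-R sketch of one way such a parser could work. The name duration_to_interval is hypothetical and this is not the package's implementation of parse_iso8601_duration; it handles only whole-number year, month, day, hour, minute, and second components, and ignores weeks and fractional values.

```r
# Hypothetical helper: pull the integer preceding a designator letter,
# or 0 when that component is absent.
get_part <- function(x, designator) {
  hit <- regmatches(x, regexpr(paste0("[0-9]+", designator), x))
  if (length(hit) == 0) 0L else as.integer(sub(designator, "", hit))
}

# Hypothetical sketch of an ISO 8601 duration to interval conversion.
duration_to_interval <- function(step) {
  stopifnot(is.character(step), length(step) == 1)
  if (nchar(step) < 2 ||
      !grepl("^P([0-9]+[YMD])*(T([0-9]+[HMS])+)?$", step)) {
    stop("unsupported ISO 8601 duration: ", step)
  }

  # Split the date part from the time part at "T", because "M" means
  # months before the "T" and minutes after it.
  pieces    <- strsplit(substring(step, 2), "T", fixed = TRUE)[[1]]
  date_part <- pieces[1]
  time_part <- if (length(pieces) > 1) pieces[2] else ""

  counts <- c(
    year   = get_part(date_part, "Y"),
    month  = get_part(date_part, "M"),
    day    = get_part(date_part, "D"),
    hour   = get_part(time_part, "H"),
    minute = get_part(time_part, "M"),
    second = get_part(time_part, "S")
  )

  # Keep only the non-zero components and pluralize where needed.
  nonzero <- counts[counts > 0]
  if (length(nonzero) == 0) return("0 seconds")
  labels <- ifelse(nonzero == 1, names(nonzero), paste0(names(nonzero), "s"))
  paste(nonzero, labels, collapse = " ")
}

print(duration_to_interval("P1DT0H0M0S"))
print(duration_to_interval("P0Y1M0DT0H0M0S"))
```

Splitting on "T" first is the key design choice, since it removes the ambiguity of the "M" designator without needing a single large regular expression.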
Read all child collections from a parent STAC collection
read_stac_children(parent_url, id)

| Argument | Description |
|---|---|
| parent_url | Full URL to the parent STAC collection |
| id | Catalog identifier for this data source |

Returns a data.frame with rows from all child collections combined.
Read a STAC collection and build catalog rows
read_stac_collection(url, id, asset_name = "zarr-s3-osn")

| Argument | Description |
|---|---|
| url | Full URL to the STAC collection endpoint |
| id | Catalog identifier for this data source |
| asset_name | Name of the asset key for the zarr store (default "zarr-s3-osn") |

Returns a data.frame with one row per variable and columns matching the catalog schema.