API Reference
Config
- class config_versioned.Config(config, versions=None)
Bases: object
Configuration object for versioned file I/O pipelines.
Loads settings from a YAML file (or a plain dict) and provides methods to construct directory and file paths, read files, and write files. Supports versioned directories where the full path is {base_path}/{version}.
The directories key in the config dict is special. Each entry must have:
- versioned (bool): whether the directory has version subdirectories
- path (str): base path to the directory
- files (dict): named file stubs within the directory
For versioned directories, a corresponding entry must exist in the versions dict (top-level key) whose value is the version string for the current run.
- Parameters:
Examples
>>> import importlib.resources as r
>>> p = str(r.files("versioning") / "data" / "example_config.yaml")
>>> cfg = Config(p)
>>> cfg.get("a")
'foo'
>>> cfg.get_dir_path("prepared_data")
PosixPath('/path/to/prepared_data/v1')
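The layout rules for the directories and versions keys can be illustrated with a plain dict. This is a minimal sketch, not the library's own validation: the directory names, paths, file names, and the check_layout helper are all hypothetical, and it assumes that each files value is a file name.

```python
# Illustrative config layout; every name and path below is hypothetical.
config = {
    "a": "foo",
    "directories": {
        "raw_data": {
            "versioned": False,
            "path": "/path/to/raw_data",
            "files": {"records": "records.csv"},
        },
        "prepared_data": {
            "versioned": True,
            "path": "/path/to/prepared_data",
            "files": {"summary": "summary.csv"},
        },
    },
    # One entry per versioned directory: the version for the current run.
    "versions": {"prepared_data": "v1"},
}

REQUIRED_KEYS = {"versioned", "path", "files"}


def check_layout(cfg):
    """Return a list of problems found in a config dict (hypothetical helper)."""
    problems = []
    versions = cfg.get("versions", {})
    for name, entry in cfg.get("directories", {}).items():
        missing = REQUIRED_KEYS - set(entry)
        if missing:
            problems.append(f"{name}: missing keys {sorted(missing)}")
        if entry.get("versioned") and name not in versions:
            problems.append(f"{name}: no entry in 'versions'")
    return problems


print(check_layout(config))
```

An empty list means every directory entry has the three required keys and every versioned directory has a matching versions entry.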
- get(*keys, **kwargs)
Retrieve a nested value from the config.
- Parameters:
- Returns:
The value at the specified path. Returns the full config dict if
no keys are provided.
- Raises:
KeyError – If a key does not exist at any level (unless
fail_if_none=False).
- get_dir_path(dir_name, custom_version=None, fail_if_does_not_exist=False)
Construct the full path for a named directory.
For non-versioned directories, returns the base path. For versioned directories, returns {base_path}/{version}.
- Parameters:
dir_name (str) – Name of the directory as defined in config['directories'].
custom_version (str, optional) – Override the version for this call instead of using config['versions'][dir_name].
fail_if_does_not_exist (bool, default False) – Raise FileNotFoundError if the directory does not exist on disk.
- Returns:
Full path to the directory.
- Return type:
pathlib.Path
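The resolution rule described for get_dir_path can be sketched with pathlib. The resolve_dir_path function below is a hypothetical stand-in written from the description above, not the library's implementation, and the config values are illustrative.

```python
from pathlib import Path


def resolve_dir_path(config, dir_name, custom_version=None,
                     fail_if_does_not_exist=False):
    """Hypothetical stand-in for the documented rule (not library code)."""
    entry = config["directories"][dir_name]
    path = Path(entry["path"])
    if entry["versioned"]:
        # custom_version overrides config['versions'][dir_name] for this call.
        if custom_version is not None:
            version = custom_version
        else:
            version = config["versions"][dir_name]
        path = path / version
    if fail_if_does_not_exist and not path.exists():
        raise FileNotFoundError(f"Directory does not exist: {path}")
    return path


# Illustrative config; names and paths are assumptions.
config = {
    "directories": {
        "raw_data": {"versioned": False, "path": "/path/to/raw_data", "files": {}},
        "prepared_data": {"versioned": True, "path": "/path/to/prepared_data", "files": {}},
    },
    "versions": {"prepared_data": "v1"},
}

print(resolve_dir_path(config, "raw_data"))
print(resolve_dir_path(config, "prepared_data"))
print(resolve_dir_path(config, "prepared_data", custom_version="v2"))
```

The non-versioned directory resolves to its base path, while the versioned one gains a version subdirectory taken from versions or from custom_version.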
- get_file_path(dir_name, file_name, custom_version=None, fail_if_does_not_exist=False)
Construct the full path for a named file within a directory.
- Parameters:
dir_name (str) – Name of the directory as defined in config['directories'].
file_name (str) – Name of the file as defined in config['directories'][dir_name]['files'].
custom_version (str, optional) – Override the directory version for this call.
fail_if_does_not_exist (bool, default False) – Raise FileNotFoundError if the file does not exist on disk.
- Returns:
Full path to the file.
- Return type:
- read(dir_name, file_name, custom_version=None, **kwargs)
Read a file using autoread, resolving the path from the config.
- Parameters:
- Return type:
The object loaded from the file.
- write(x, dir_name, file_name, custom_version=None, **kwargs)
Write an object to a file using autowrite, resolving the path from config.
- Parameters:
x (object) – The object to write.
dir_name (str) – Directory name from config['directories'].
file_name (str) – File name from config['directories'][dir_name]['files'].
custom_version (str, optional) – Override the directory version for this call.
**kwargs – Additional keyword arguments passed to the format-specific writer.
File I/O
- config_versioned.autoread(file, **kwargs)
Automatically read a file based on its extension.
- Parameters:
file (str or Path) – Full path to the file to read. Tilde expansion is applied.
**kwargs – Additional keyword arguments passed to the format-specific reader.
- Returns:
The object loaded from the file. The return type depends on the format:
- csv/tsv/xlsx/dta (pandas DataFrame)
- yaml/yml (dict)
- txt (list of str)
- shp/geojson/etc. (geopandas GeoDataFrame)
- tif/geotiff (dict with keys “data” (numpy ndarray) and “profile” (dict))
- nc (xarray Dataset)
- dbf (pandas DataFrame, or list of dicts if pandas is not installed)
- Raises:
FileNotFoundError – If the file does not exist.
IsADirectoryError – If the path points to a directory.
ValueError – If the file has no extension or the extension is not supported.
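Extension-based dispatch of this kind can be sketched with a small registry. The names autoread_sketch, read_txt, and READERS are hypothetical, and only the txt format is covered. The order of the checks and the stripping of newlines are assumptions, not documented behavior.

```python
import tempfile
from pathlib import Path


def read_txt(path, encoding="utf-8"):
    """Read a text file into a list of lines (newline stripping is an assumption)."""
    return Path(path).read_text(encoding=encoding).splitlines()


# Hypothetical registry mapping lowercase extensions to reader callables.
READERS = {".txt": read_txt}


def autoread_sketch(file, **kwargs):
    """Dispatch on the file extension; the check order here is an assumption."""
    path = Path(file).expanduser()  # tilde expansion
    if path.is_dir():
        raise IsADirectoryError(f"Path is a directory: {path}")
    if not path.exists():
        raise FileNotFoundError(f"File does not exist: {path}")
    ext = path.suffix.lower()
    if not ext:
        raise ValueError(f"File has no extension: {path}")
    if ext not in READERS:
        raise ValueError(f"Unsupported extension: {ext}")
    # Extra keyword arguments go to the format-specific reader.
    return READERS[ext](path, **kwargs)


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "notes.txt"
    target.write_text("alpha\nbeta\n", encoding="utf-8")
    lines = autoread_sketch(target)

print(lines)
```

Supporting another format in this sketch would mean adding one entry to READERS, which keeps the dispatch function itself unchanged.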
- config_versioned.autowrite(x, file, **kwargs)
Automatically write an object to a file based on its extension.
- Parameters:
x (object) – The object to write. Expected types per format:
- csv: pandas DataFrame
- shp/geojson/etc.: geopandas GeoDataFrame
- tif/geotiff: (ndarray, profile) tuple or dict with “data”/“profile”
- txt: str or list of str
- yaml/yml: dict (or any yaml-serializable object)
- nc: xarray Dataset
file (str or Path) – Full path where the file should be saved. Tilde expansion is applied. The parent directory must already exist.
**kwargs – Additional keyword arguments passed to the format-specific writer.
- Raises:
FileNotFoundError – If the parent directory does not exist.
ValueError – If the file has no extension or the extension is not supported.
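The parent-directory requirement and the str-or-list input accepted for txt can be illustrated as follows. The names autowrite_sketch, write_txt, and WRITERS are hypothetical, and writing one list item per line is an assumption about the txt format.

```python
import tempfile
from pathlib import Path


def write_txt(x, path, encoding="utf-8"):
    """Write a str as-is, or a list of str as one line per item (an assumption)."""
    text = x if isinstance(x, str) else "\n".join(x) + "\n"
    Path(path).write_text(text, encoding=encoding)


# Hypothetical registry mapping lowercase extensions to writer callables.
WRITERS = {".txt": write_txt}


def autowrite_sketch(x, file, **kwargs):
    """Write x to file, choosing the writer from the file extension."""
    path = Path(file).expanduser()  # tilde expansion
    if not path.parent.is_dir():
        # The parent directory is not created automatically.
        raise FileNotFoundError(f"Parent directory does not exist: {path.parent}")
    ext = path.suffix.lower()
    if ext not in WRITERS:  # also covers a missing extension ("")
        raise ValueError(f"Missing or unsupported extension: {path.name}")
    WRITERS[ext](x, path, **kwargs)


with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "notes.txt"
    autowrite_sketch(["alpha", "beta"], out)
    written = out.read_text(encoding="utf-8")

print(repr(written))
```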
Format Registries
Utilities
- config_versioned.pull_from_config(x, *keys, fail_if_none=True)
Safely retrieve a nested value from a dict or list by sequential keys.
- Parameters:
- Returns:
The value at the specified path in x. If no keys are provided,
returns x unchanged.
- Raises:
TypeError – If a key is not str or int.
KeyError – If a str key is absent from its dict.
IndexError – If an int index is out of range for its list.
ValueError – If the value is None and fail_if_none is True.
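The lookup and its documented exceptions can be approximated in a few lines. The pull_sketch function is a hypothetical re-implementation written from the description above, not the library's code. It assumes the None check applies only to the final value.

```python
def pull_sketch(x, *keys, fail_if_none=True):
    """Walk nested dicts and lists by sequential keys (hypothetical sketch)."""
    value = x
    for key in keys:
        if not isinstance(key, (str, int)):
            raise TypeError(f"Keys must be str or int, got {type(key).__name__}")
        # dict lookup raises KeyError; list indexing raises IndexError.
        value = value[key]
    if value is None and fail_if_none:
        raise ValueError(f"Value at {keys!r} is None")
    return value


# Illustrative data; the keys and values are assumptions.
cfg = {"a": "foo", "items": [{"name": "first"}], "empty": None}

print(pull_sketch(cfg, "a"))
print(pull_sketch(cfg, "items", 0, "name"))
print(pull_sketch(cfg, "empty", fail_if_none=False))
```

Calling pull_sketch(cfg) with no keys returns cfg itself, matching the documented no-keys behavior.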