API Reference

Config

class config_versioned.Config(config, versions=None)[source]

Bases: object

Configuration object for versioned file I/O pipelines.

Loads settings from a YAML file (or a plain dict) and provides methods to construct directory and file paths, read files, and write files. Supports versioned directories where the full path is {base_path}/{version}.

The directories key in the config dict is special. Each entry must have:

  • versioned (bool) – whether the directory has version subdirectories

  • path (str) – base path to the directory

  • files (dict) – named file stubs within the directory

For versioned directories, a corresponding entry must exist in the versions dict (top-level key) whose value is the version string for the current run.
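To make the layout concrete, here is a minimal sketch of a config dict that follows these rules. The directory names (raw_data, prepared_data), paths, and file stubs are hypothetical, and the versioned path is joined by hand with pathlib rather than through Config.

```python
from pathlib import Path

# Hypothetical config: one unversioned and one versioned directory.
config = {
    "a": "foo",
    "directories": {
        "raw_data": {
            "versioned": False,
            "path": "/path/to/raw_data",
            "files": {"survey": "survey.csv"},
        },
        "prepared_data": {
            "versioned": True,
            "path": "/path/to/prepared_data",
            "files": {"summary": "summary.csv"},
        },
    },
    # Every versioned directory needs a matching entry here.
    "versions": {"prepared_data": "v1"},
}

# Versioned directories resolve to {base_path}/{version}.
entry = config["directories"]["prepared_data"]
full_path = Path(entry["path"]) / config["versions"]["prepared_data"]
print(full_path.as_posix())  # /path/to/prepared_data/v1
```

The same structure can be written as a YAML file and passed to Config by path.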

Parameters:
  • config (dict, str, or Path) – Either a dict of settings or a filepath to a YAML file.

  • versions (dict, optional) – Key/value pairs to set or override in config['versions'].
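One plausible reading of the versions parameter is a shallow update of config['versions']. The helper below, apply_version_overrides, is a hypothetical stand-in written for illustration; it is not the library's code, and whether the library copies the input dict is an assumption.

```python
import copy


def apply_version_overrides(config, versions=None):
    """Return a copy of config with `versions` merged into config['versions']."""
    merged = copy.deepcopy(config)  # leave the caller's dict untouched (assumption)
    merged.setdefault("versions", {})
    merged["versions"].update(versions or {})
    return merged


base = {"versions": {"prepared_data": "v1"}}
updated = apply_version_overrides(base, {"prepared_data": "v2", "results": "v1"})
print(updated["versions"])  # {'prepared_data': 'v2', 'results': 'v1'}
print(base["versions"])     # {'prepared_data': 'v1'}
```

Existing keys are overridden and new keys are added, which matches the "set or override" wording above.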

config

The dictionary representation of the loaded configuration.

Type:

dict

Examples

>>> import importlib.resources as r
>>> from config_versioned import Config
>>> p = str(r.files("versioning") / "data" / "example_config.yaml")
>>> cfg = Config(p)
>>> cfg.get("a")
'foo'
>>> cfg.get_dir_path("prepared_data")
PosixPath('/path/to/prepared_data/v1')

__init__(config, versions=None)[source]

__repr__()[source]

Return repr(self).

get(*keys, **kwargs)[source]

Retrieve a nested value from the config.

Parameters:
  • *keys (str or int) – Sequential keys to traverse the config dict.

  • **kwargs – Keyword arguments passed to pull_from_config, e.g. fail_if_none=False to return None instead of raising when a key is missing or the value is None.

Returns:

The value at the specified path. If no keys are provided, returns the full config dict.

Raises:
  • KeyError – If a key does not exist at any level (unless fail_if_none=False).

  • ValueError – If the retrieved value is None (unless fail_if_none=False).

get_dir_path(dir_name, custom_version=None, fail_if_does_not_exist=False)[source]

Construct the full path for a named directory.

For non-versioned directories, returns the base path. For versioned directories, returns {base_path}/{version}.

Parameters:
  • dir_name (str) – Name of the directory as defined in config['directories'].

  • custom_version (str, optional) – Override the version for this call instead of using config['versions'][dir_name].

  • fail_if_does_not_exist (bool, default False) – Raise FileNotFoundError if the directory does not exist on disk.

Returns:

Full path to the directory.

Return type:

pathlib.Path
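The resolution rules above can be sketched as follows. resolve_dir_path is a hypothetical re-implementation for illustration only, not the library's source; in particular, silently ignoring custom_version for non-versioned directories is an assumption.

```python
from pathlib import Path


def resolve_dir_path(config, dir_name, custom_version=None,
                     fail_if_does_not_exist=False):
    """Illustrative re-implementation of the documented path rules."""
    entry = config["directories"][dir_name]
    path = Path(entry["path"])
    if entry["versioned"]:
        # custom_version takes precedence over config['versions'][dir_name].
        if custom_version is not None:
            version = custom_version
        else:
            version = config["versions"][dir_name]
        path = path / version
    if fail_if_does_not_exist and not path.is_dir():
        raise FileNotFoundError(f"Directory does not exist: {path}")
    return path


cfg = {
    "directories": {
        "raw_data": {"versioned": False, "path": "/path/to/raw_data", "files": {}},
        "prepared_data": {"versioned": True, "path": "/path/to/prepared_data", "files": {}},
    },
    "versions": {"prepared_data": "v1"},
}

print(resolve_dir_path(cfg, "raw_data").as_posix())       # /path/to/raw_data
print(resolve_dir_path(cfg, "prepared_data").as_posix())  # /path/to/prepared_data/v1
print(resolve_dir_path(cfg, "prepared_data", custom_version="v2").as_posix())
# /path/to/prepared_data/v2
```

Because fail_if_does_not_exist defaults to False, the path is only constructed, so it can point to a directory that has not been created yet.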

get_file_path(dir_name, file_name, custom_version=None, fail_if_does_not_exist=False)[source]

Construct the full path for a named file within a directory.

Parameters:
  • dir_name (str) – Name of the directory as defined in config['directories'].

  • file_name (str) – Name of the file as defined in config['directories'][dir_name]['files'].

  • custom_version (str, optional) – Override the directory version for this call.

  • fail_if_does_not_exist (bool, default False) – Raise FileNotFoundError if the file does not exist on disk.

Returns:

Full path to the file.

Return type:

pathlib.Path
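As a rough sketch, the file path is the resolved directory plus the stub stored under files. resolve_file_path and the sample config are hypothetical, and treating each stub as a file name relative to the directory is an assumption.

```python
from pathlib import Path


def resolve_file_path(config, dir_name, file_name, custom_version=None):
    """Illustrative only: resolved directory path plus the named file stub."""
    entry = config["directories"][dir_name]
    directory = Path(entry["path"])
    if entry["versioned"]:
        if custom_version is not None:
            version = custom_version
        else:
            version = config["versions"][dir_name]
        directory = directory / version
    # Look up the stub by its name in the directory's 'files' mapping.
    return directory / entry["files"][file_name]


cfg = {
    "directories": {
        "prepared_data": {
            "versioned": True,
            "path": "/path/to/prepared_data",
            "files": {"summary": "summary.csv"},
        }
    },
    "versions": {"prepared_data": "v1"},
}

current = resolve_file_path(cfg, "prepared_data", "summary")
previous = resolve_file_path(cfg, "prepared_data", "summary", custom_version="v0")
print(current.as_posix())   # /path/to/prepared_data/v1/summary.csv
print(previous.as_posix())  # /path/to/prepared_data/v0/summary.csv
```

Passing custom_version makes it possible to address the same named file in an earlier run without editing the config.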

read(dir_name, file_name, custom_version=None, **kwargs)[source]

Read a file using autoread, resolving the path from the config.

Parameters:
  • dir_name (str) – Directory name from config['directories'].

  • file_name (str) – File name from config['directories'][dir_name]['files'].

  • custom_version (str, optional) – Override the directory version for this call.

  • **kwargs – Additional keyword arguments passed to the format-specific reader.

Returns:

The object loaded from the file. Its type depends on the file format.

write(x, dir_name, file_name, custom_version=None, **kwargs)[source]

Write an object to a file using autowrite, resolving the path from config.

Parameters:
  • x (object) – The object to write.

  • dir_name (str) – Directory name from config['directories'].

  • file_name (str) – File name from config['directories'][dir_name]['files'].

  • custom_version (str, optional) – Override the directory version for this call.

  • **kwargs – Additional keyword arguments passed to the format-specific writer.

write_self(dir_name, custom_version=None, **kwargs)[source]

Write the config dict as config.yaml to a directory.

Parameters:
  • dir_name (str) – Directory name from config['directories']. The directory must already exist on disk.

  • custom_version (str, optional) – Override the directory version for this call.

  • **kwargs – Additional keyword arguments passed to the YAML writer.

File I/O

config_versioned.autoread(file, **kwargs)[source]

Automatically read a file based on its extension.

Parameters:
  • file (str or Path) – Full path to the file to read. Tilde expansion is applied.

  • **kwargs – Additional keyword arguments passed to the format-specific reader.

Returns:

The object loaded from the file. The return type depends on the format:

  • csv/tsv/xlsx/dta – pandas DataFrame

  • yaml/yml – dict

  • txt – list of str

  • shp/geojson/etc. – geopandas GeoDataFrame

  • tif/geotiff – dict with keys "data" (numpy ndarray) and "profile" (dict)

  • nc – xarray Dataset

  • dbf – pandas DataFrame (or list of dicts if pandas is not installed)

Raises:

config_versioned.autowrite(x, file, **kwargs)[source]

Automatically write an object to a file based on its extension.

Parameters:
  • x (object) – The object to write. Expected types per format:

      - csv: pandas DataFrame
      - shp/geojson/etc.: geopandas GeoDataFrame
      - tif/geotiff: (ndarray, profile) tuple or dict with "data"/"profile" keys
      - txt: str or list of str
      - yaml/yml: dict (or any YAML-serializable object)
      - nc: xarray Dataset

  • file (str or Path) – Full path where the file should be saved. Tilde expansion is applied. The parent directory must already exist.

  • **kwargs – Additional keyword arguments passed to the format-specific writer.

Raises:
  • FileNotFoundError – If the parent directory does not exist.

  • ValueError – If the file has no extension or the extension is not supported.
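The checks described above can be sketched with the standard library alone. write_by_extension and write_txt are hypothetical stand-ins, not the library's functions, and the order in which the checks run is an assumption.

```python
from pathlib import Path
import tempfile


def write_txt(x, file, **kwargs):
    """Write a str or a list of str, one item per line."""
    lines = [x] if isinstance(x, str) else list(x)
    Path(file).write_text("\n".join(lines) + "\n", encoding="utf-8")


def write_by_extension(x, file, writers, **kwargs):
    """Hypothetical stand-in for the documented autowrite checks."""
    path = Path(file).expanduser()  # tilde expansion
    if not path.parent.is_dir():
        raise FileNotFoundError(f"Parent directory does not exist: {path.parent}")
    ext = path.suffix.lstrip(".").lower()
    if not ext:
        raise ValueError(f"File has no extension: {path}")
    if ext not in writers:
        raise ValueError(f"Unsupported extension: {ext}")
    writers[ext](x, path, **kwargs)
    return path


with tempfile.TemporaryDirectory() as tmp:
    out = write_by_extension(["a", "b"], Path(tmp) / "notes.txt", {"txt": write_txt})
    written = out.read_text(encoding="utf-8")

print(repr(written))  # 'a\nb\n'
```

Validating the path before dispatching means a bad path or extension is reported with a clear error rather than failing inside a format-specific writer.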

Format Registries

config_versioned.get_file_reading_functions()[source]

Return a dict mapping file extensions to reading functions.

Returns:

Keys are lowercase file extensions (without the dot). Values are callables with signature f(file, **kwargs) that read the file and return the loaded object.

Return type:

dict

config_versioned.get_file_writing_functions()[source]

Return a dict mapping file extensions to writing functions.

Returns:

Keys are lowercase file extensions (without the dot). Values are callables with signature f(x, file, **kwargs) that write the object to the file.

Return type:

dict
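To illustrate the shape of these registries, the sketch below builds two toy dicts with the documented key and callable conventions and performs a round trip. The functions read_txt, write_txt, and extension_key are hypothetical, and lowercasing the extension before lookup is an assumption about how the keys are meant to be used.

```python
from pathlib import Path
import tempfile


def read_txt(file, **kwargs):
    """Reader with the documented signature f(file, **kwargs)."""
    return Path(file).read_text(encoding="utf-8").splitlines()


def write_txt(x, file, **kwargs):
    """Writer with the documented signature f(x, file, **kwargs)."""
    Path(file).write_text("\n".join(x) + "\n", encoding="utf-8")


# Toy registries keyed by lowercase extension without the dot.
readers = {"txt": read_txt}
writers = {"txt": write_txt}


def extension_key(file):
    """Normalize a file's suffix to the registry key format."""
    return Path(file).suffix.lstrip(".").lower()


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "NOTES.TXT"
    key = extension_key(target)
    writers[key](["first", "second"], target)
    round_trip = readers[key](target)

print(key, round_trip)  # txt ['first', 'second']
```

Inspecting the keys of the real registries is a quick way to check which extensions an installation supports.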

Utilities

config_versioned.pull_from_config(x, *keys, fail_if_none=True)[source]

Safely retrieve a nested value from a dict or list by sequential keys.

Parameters:
  • x (dict or list) – The top-level container to index into.

  • *keys (str or int) – Sequence of keys or indices to apply in order. Use str for dict keys and int for list indices.

  • fail_if_none (bool, default True) – If True, raise ValueError when the retrieved value is None.

Returns:

The value at the specified path in x. If no keys are provided, returns x unchanged.

Raises:
  • TypeError – If a key is not str or int.

  • KeyError – If a str key is absent from its dict.

  • IndexError – If an int index is out of range for its list.

  • ValueError – If the value is None and fail_if_none is True.
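The lookup and error rules above can be sketched in a few lines. pull_nested is a hypothetical re-implementation for illustration, not the library's source, and it only models the behaviors listed under Raises.

```python
def pull_nested(x, *keys, fail_if_none=True):
    """Hypothetical sketch of the documented lookup rules."""
    value = x
    for key in keys:
        if not isinstance(key, (str, int)):
            raise TypeError(f"Keys must be str or int, got {type(key).__name__}")
        # Indexing raises KeyError for dicts and IndexError for lists.
        value = value[key]
    if value is None and fail_if_none:
        raise ValueError(f"Value at {keys!r} is None")
    return value


cfg = {"a": "foo", "steps": [{"name": "clean"}, {"name": None}]}

print(pull_nested(cfg, "a"))                                      # foo
print(pull_nested(cfg, "steps", 0, "name"))                       # clean
print(pull_nested(cfg, "steps", 1, "name", fail_if_none=False))   # None
print(pull_nested(cfg) is cfg)                                    # True
```

Mixing str and int keys in one call lets a single lookup descend through dicts and lists alike.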