Installation

Core package (YAML support only):

pip install config-versioned

With optional file-format extras:

pip install "config-versioned[pandas]"    # CSV, TSV, Excel, Stata
pip install "config-versioned[geo]"       # Shapefiles, GeoJSON, GeoPackage, etc.
pip install "config-versioned[raster]"    # GeoTIFF and other raster formats
pip install "config-versioned[xarray]"    # NetCDF
pip install "config-versioned[dbfread]"   # DBF files
pip install "config-versioned[all]"       # All of the above

The quotes keep shells such as zsh from treating the square brackets as a glob pattern.

Config file structure

The config YAML has two special top-level keys — directories and versions — alongside any arbitrary settings your pipeline needs:

project_name: 'my_analysis'

directories:
  raw_data:
    versioned: false
    path: '~/data/raw'
    files:
      input_table: 'records.csv'

  results:
    versioned: true
    path: '~/data/results'
    files:
      output_table: 'processed.csv'
      summary:      'summary.txt'

versions:
  results: 'v1'

Each entry under directories requires three fields:

  • versioned (bool) — whether the directory uses version subdirectories.

  • path (str) — base path (tilde expansion is applied).

  • files (dict) — named file stubs within the directory.
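The three required fields can be checked before a config is used. The following is a minimal sketch of such a check, assuming the config has already been loaded into a plain dict; the helper name check_directory_entry is hypothetical and is not part of config-versioned.

```python
from typing import Any

# Field names and types are taken from the list above. The function name
# `check_directory_entry` is hypothetical, not part of config-versioned.
REQUIRED_FIELDS = {"versioned": bool, "path": str, "files": dict}


def check_directory_entry(name: str, entry: dict[str, Any]) -> list[str]:
    """Return a list of problems found in one `directories` entry."""
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in entry:
            problems.append(f"{name}: missing required field '{field}'")
        elif not isinstance(entry[field], expected_type):
            problems.append(
                f"{name}: '{field}' should be {expected_type.__name__}, "
                f"got {type(entry[field]).__name__}"
            )
    return problems


# One well-formed entry and one with a wrong type and a missing field.
good = {"versioned": True, "path": "~/data/results", "files": {"summary": "summary.txt"}}
bad = {"versioned": "yes", "path": "~/data/raw"}

print(check_directory_entry("results", good))
print(check_directory_entry("raw_data", bad))
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a config at once.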

For versioned directories the full path is {path}/{version}, where the version comes from the versions dict (or a custom_version argument).
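The path rule can be illustrated with a short sketch. It is not the library's implementation: resolve_dir and resolve_file are hypothetical helpers, the config is the example above written as a plain dict, and the sketch assumes that a custom_version argument, when given, takes precedence over the versions dict.

```python
import os
from pathlib import Path
from typing import Optional

# Same values as the example config above, written as a plain dict.
config = {
    "directories": {
        "raw_data": {
            "versioned": False,
            "path": "~/data/raw",
            "files": {"input_table": "records.csv"},
        },
        "results": {
            "versioned": True,
            "path": "~/data/results",
            "files": {"output_table": "processed.csv", "summary": "summary.txt"},
        },
    },
    "versions": {"results": "v1"},
}


def resolve_dir(config: dict, name: str, custom_version: Optional[str] = None) -> Path:
    """Hypothetical helper: apply the {path}/{version} rule to one directory."""
    entry = config["directories"][name]
    base = Path(os.path.expanduser(entry["path"]))  # tilde expansion
    if not entry["versioned"]:
        return base
    # Assumption: an explicit custom_version overrides the versions dict.
    version = custom_version or config["versions"][name]
    return base / version


def resolve_file(config: dict, name: str, file_key: str,
                 custom_version: Optional[str] = None) -> Path:
    """Hypothetical helper: full path of a named file stub."""
    directory = resolve_dir(config, name, custom_version)
    return directory / config["directories"][name]["files"][file_key]


print(resolve_dir(config, "raw_data"))                      # unversioned: base path only
print(resolve_file(config, "results", "output_table"))      # uses versions["results"]
print(resolve_dir(config, "results", custom_version="v2"))  # explicit override
```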

Supported file extensions

Format              Extensions                              Requires
CSV / TSV           csv, tsv, gz, bz2                       pandas
Excel               xls, xlsx                               pandas, openpyxl
Stata               dta                                     pandas
DBF                 dbf                                     dbfread
YAML                yaml, yml                               (core)
Plain text          txt                                     (core)
Vector geospatial   shp, geojson, gpkg, fgb, gml, kml, …    geopandas
Raster              tif, geotiff                            rasterio
NetCDF              nc                                      xarray
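The table above can also be read as a lookup from file extension to required package. The sketch below transcribes it into a dict. It only illustrates the mapping and does not describe how config-versioned dispatches internally; the name required_package is hypothetical. Because the vector list in the table is elided, only the extensions it names are included.

```python
from pathlib import Path

# Extension -> requirement, transcribed from the table above.
# "(core)" means no optional extra is needed.
REQUIRES = {
    "csv": "pandas", "tsv": "pandas", "gz": "pandas", "bz2": "pandas",
    "xls": "pandas, openpyxl", "xlsx": "pandas, openpyxl",
    "dta": "pandas",
    "dbf": "dbfread",
    "yaml": "(core)", "yml": "(core)",
    "txt": "(core)",
    # Only the vector extensions the table names explicitly.
    "shp": "geopandas", "geojson": "geopandas", "gpkg": "geopandas",
    "fgb": "geopandas", "gml": "geopandas", "kml": "geopandas",
    "tif": "rasterio", "geotiff": "rasterio",
    "nc": "xarray",
}


def required_package(filename: str) -> str:
    """Hypothetical helper: look up the requirement by the final extension."""
    ext = Path(filename).suffix.lstrip(".").lower()
    if ext not in REQUIRES:
        raise ValueError(f"unsupported extension: {ext!r}")
    return REQUIRES[ext]


print(required_package("records.csv"))     # pandas
print(required_package("records.csv.gz"))  # pandas (final extension is gz)
print(required_package("summary.txt"))     # (core)
```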

For raster files, autoread() returns {"data": np.ndarray, "profile": dict} and autowrite() accepts that same structure (or a (data, profile) tuple).
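The two accepted raster forms can be normalized with a small helper. The sketch below is illustrative only: split_raster_payload is a hypothetical name, a nested list stands in for the np.ndarray so the example needs no third-party packages, and the profile values are made up.

```python
from typing import Any, Tuple


def split_raster_payload(payload: Any) -> Tuple[Any, dict]:
    """Hypothetical helper: accept either documented form, return (data, profile)."""
    if isinstance(payload, dict):
        missing = {"data", "profile"} - set(payload)
        if missing:
            raise KeyError(f"raster dict is missing keys: {sorted(missing)}")
        return payload["data"], payload["profile"]
    if isinstance(payload, tuple) and len(payload) == 2:
        data, profile = payload
        return data, profile
    raise TypeError("expected a {'data', 'profile'} dict or a (data, profile) tuple")


# Stand-in values: in real use `data` would be an np.ndarray and
# `profile` the metadata dict that accompanies it.
data = [[0, 1], [2, 3]]
profile = {"driver": "GTiff", "width": 2, "height": 2, "count": 1, "dtype": "uint8"}

# Both forms yield the same (data, profile) pair.
from_dict = split_raster_payload({"data": data, "profile": profile})
from_tuple = split_raster_payload((data, profile))
print(from_dict == from_tuple)
```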