Installation
Core package (YAML support only):
pip install config-versioned
With optional file-format extras:
pip install "config-versioned[pandas]"   # CSV, TSV, Excel, Stata
pip install "config-versioned[geo]"      # Shapefiles, GeoJSON, GeoPackage, etc.
pip install "config-versioned[raster]"   # GeoTIFF and other raster formats
pip install "config-versioned[xarray]"   # NetCDF
pip install "config-versioned[dbfread]"  # DBF files
pip install "config-versioned[all]"      # All of the above
Config file structure
The config YAML has two special top-level keys — directories and
versions — alongside any arbitrary settings your pipeline needs:
project_name: 'my_analysis'
directories:
  raw_data:
    versioned: false
    path: '~/data/raw'
    files:
      input_table: 'records.csv'
  results:
    versioned: true
    path: '~/data/results'
    files:
      output_table: 'processed.csv'
      summary: 'summary.txt'
versions:
  results: 'v1'
Each entry under directories requires three fields:
- versioned (bool): whether the directory uses version subdirectories.
- path (str): base path (tilde expansion is applied).
- files (dict): named file stubs within the directory.
For versioned directories the full path is {path}/{version}, where the
version comes from the versions dict (or a custom_version argument).
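The path rule above can be sketched in a few lines of Python. This is a minimal illustration, not the package's implementation: `resolve_dir` and `resolve_file` are hypothetical helper names, the config is written inline as a dict mirroring the YAML example instead of being loaded through config-versioned, and giving `custom_version` precedence over the versions dict is an assumption.

```python
from pathlib import Path

# Inline stand-in for the parsed YAML example above (hypothetical; a real
# pipeline would load this from the config file).
config = {
    "directories": {
        "raw_data": {
            "versioned": False,
            "path": "~/data/raw",
            "files": {"input_table": "records.csv"},
        },
        "results": {
            "versioned": True,
            "path": "~/data/results",
            "files": {"output_table": "processed.csv", "summary": "summary.txt"},
        },
    },
    "versions": {"results": "v1"},
}


def resolve_dir(config, name, custom_version=None):
    """Return a directory path, appending the version if the directory is versioned."""
    entry = config["directories"][name]
    base = Path(entry["path"]).expanduser()  # tilde expansion
    if not entry["versioned"]:
        return base
    # Assumption: an explicit custom_version wins over the versions dict.
    version = custom_version if custom_version is not None else config["versions"][name]
    return base / version


def resolve_file(config, dir_name, file_key, custom_version=None):
    """Return the full path of a named file stub inside a directory."""
    filename = config["directories"][dir_name]["files"][file_key]
    return resolve_dir(config, dir_name, custom_version) / filename


print(resolve_file(config, "raw_data", "input_table"))
print(resolve_file(config, "results", "summary"))
print(resolve_file(config, "results", "summary", custom_version="v2"))
```

The unversioned raw_data directory resolves to its base path, while results gains a v1 (or v2) path component before the file name.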
Supported file extensions
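| Format | Extensions | Requires |
|---|---|---|
| CSV / TSV | csv, tsv, gz, bz2 | config-versioned[pandas] |
| Excel | xls, xlsx | config-versioned[pandas] |
| Stata | dta | config-versioned[pandas] |
| DBF | dbf | config-versioned[dbfread] |
| YAML | yaml, yml | (core) |
| Plain text | txt | (core) |
| Vector geospatial | shp, geojson, gpkg, fgb, gml, kml, … | config-versioned[geo] |
| Raster | tif, geotiff | config-versioned[raster] |
| NetCDF | nc | config-versioned[xarray] |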
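To check which extra a given file needs, the table can be read together with the comments on the install commands. The lookup below is a rough sketch of that reading: `EXTRA_FOR_EXTENSION` and `required_extra` are hypothetical names for illustration, not part of config-versioned. Mapping gz and bz2 to the pandas extra is an assumption based on their appearing in the CSV / TSV row, and the vector entries cover only the extensions spelled out in the table.

```python
from pathlib import Path

# Extension -> name of the optional extra; None means the core package suffices.
# Built from the table and the install comments (an illustrative reading, not
# the library's own registry).
EXTRA_FOR_EXTENSION = {
    "csv": "pandas", "tsv": "pandas", "gz": "pandas", "bz2": "pandas",
    "xls": "pandas", "xlsx": "pandas", "dta": "pandas",
    "dbf": "dbfread",
    "yaml": None, "yml": None, "txt": None,
    "shp": "geo", "geojson": "geo", "gpkg": "geo",
    "fgb": "geo", "gml": "geo", "kml": "geo",
    "tif": "raster", "geotiff": "raster",
    "nc": "xarray",
}


def required_extra(filename):
    """Return the extra needed for a file name, or None if core is enough."""
    ext = Path(filename).suffix.lstrip(".").lower()  # final suffix only
    if ext not in EXTRA_FOR_EXTENSION:
        raise ValueError(f"unrecognized extension: {ext!r}")
    return EXTRA_FOR_EXTENSION[ext]


for name in ("records.csv", "summary.txt", "elevation.tif"):
    extra = required_extra(name)
    print(name, "->", f"config-versioned[{extra}]" if extra else "core")
```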
For raster files, autoread() returns
{"data": np.ndarray, "profile": dict} and
autowrite() accepts that same structure (or a
(data, profile) tuple).
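The two accepted input shapes can be brought into one form with a small helper. This is a sketch only: `as_raster_dict` is a hypothetical name, not a config-versioned function, a nested list stands in for the np.ndarray so the example runs without NumPy, and the profile fields shown are illustrative.

```python
def as_raster_dict(raster):
    """Coerce a raster payload to the {"data": ..., "profile": ...} form."""
    if isinstance(raster, dict):
        missing = {"data", "profile"} - raster.keys()
        if missing:
            raise KeyError(f"raster dict is missing keys: {sorted(missing)}")
        return {"data": raster["data"], "profile": raster["profile"]}
    if isinstance(raster, tuple) and len(raster) == 2:
        data, profile = raster
        return {"data": data, "profile": profile}
    raise TypeError("expected a dict with 'data' and 'profile', or a (data, profile) tuple")


pixels = [[0, 1], [2, 3]]                        # stand-in for a 2-D np.ndarray
profile = {"width": 2, "height": 2, "count": 1}  # illustrative profile fields

# Both shapes normalize to the same dictionary.
from_tuple = as_raster_dict((pixels, profile))
from_dict = as_raster_dict({"data": pixels, "profile": profile})
print(from_tuple == from_dict)
```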