Problem
Pipeline concerns are often mixed, so reruns and ownership become hard to reason about.
Contract-first orchestration design for ingestion, transformation, and warehouse execution.
Quick overview
A contract-first engine with typed settings and adapter boundaries across runtime stages.
REST and file ingestion, Duck-compatible warehouses, and dbt-first transformation.
Pipeline flow stays centralised while backend-specific behaviour remains isolated.
Context
Data projects often blur orchestration, source logic, warehouse behaviour, and transformation execution. This increases maintenance cost and makes rerun behaviour harder to reason about. Axiomatic Engine addresses this by separating responsibilities with typed protocols and adapter boundaries.
Delivery architecture
The engine is organised into contracts, adapters, and core stages. Sources expose resources through protocol boundaries, ingestion is executed with dlt into a selected warehouse adapter, and transformation is delegated through a transformation adapter.
This keeps pipeline flow centralised while backend specifics remain isolated. Settings are loaded from AXIOMATIC_* variables with CLI override support.
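The environment-first loading with CLI overrides described above could look roughly like the sketch below. It is illustrative only: the AXIOMATIC_ prefix and the --run-transforms / --skip-transforms flags come from this page, while EngineSettings, load_settings, the --warehouse-kind flag, and the specific variable names AXIOMATIC_WAREHOUSE_KIND and AXIOMATIC_RUN_TRANSFORMS are hypothetical.

```python
import argparse
import os
from dataclasses import dataclass
from typing import Mapping, Optional, Sequence

ENV_PREFIX = "AXIOMATIC_"  # prefix from the text; variable names below are hypothetical


@dataclass(frozen=True)
class EngineSettings:
    """Hypothetical typed settings object."""
    warehouse_kind: str = "duckdb"
    run_transforms: bool = False


def _env_bool(raw: str) -> bool:
    """Interpret common truthy strings from the environment."""
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(
    argv: Sequence[str], env: Optional[Mapping[str, str]] = None
) -> EngineSettings:
    env = os.environ if env is None else env

    # 1. Environment values (or defaults) form the baseline.
    warehouse_kind = env.get(ENV_PREFIX + "WAREHOUSE_KIND", "duckdb")
    run_transforms = _env_bool(env.get(ENV_PREFIX + "RUN_TRANSFORMS", "false"))

    # 2. CLI flags default to None, meaning "not given, keep the env value".
    parser = argparse.ArgumentParser()
    parser.add_argument("--warehouse-kind", default=None)
    toggle = parser.add_mutually_exclusive_group()
    toggle.add_argument("--run-transforms", dest="run_transforms",
                        action="store_const", const=True, default=None)
    toggle.add_argument("--skip-transforms", dest="run_transforms",
                        action="store_const", const=False, default=None)
    args = parser.parse_args(list(argv))

    # 3. Any flag that was actually passed wins over the environment.
    if args.warehouse_kind is not None:
        warehouse_kind = args.warehouse_kind
    if args.run_transforms is not None:
        run_transforms = args.run_transforms

    return EngineSettings(warehouse_kind=warehouse_kind, run_transforms=run_transforms)
```

Using None as the CLI default is what lets the loader tell "flag omitted" apart from "flag explicitly set to the default", so an environment value is only replaced when the caller asks for it.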
Design decisions
Source, storage, warehouse, and transformation protocols define extension points. Literal kinds constrain available options, so unsupported backends fail explicitly instead of degrading silently. This helps keep project code domain-aware and engine code domain-agnostic.
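A minimal sketch of how a protocol plus a Literal kind can make unsupported backends fail loudly. The names WarehouseKind, WarehouseAdapter, destination_name, and build_warehouse are assumptions made for illustration and are not taken from the engine's code; bigquery stands in for a declared but unimplemented kind because it appears on the roadmap.

```python
from typing import Literal, Protocol, get_args

# Hypothetical set of declared kinds; "bigquery" is declared but not implemented.
WarehouseKind = Literal["duckdb", "motherduck", "bigquery"]


class WarehouseAdapter(Protocol):
    """Extension point: anything with this shape can act as a warehouse."""

    def destination_name(self) -> str: ...


class DuckDBAdapter:
    def destination_name(self) -> str:
        return "duckdb"


class MotherDuckAdapter:
    def destination_name(self) -> str:
        return "motherduck"


def build_warehouse(kind: str) -> WarehouseAdapter:
    """Route a kind string to an adapter, failing explicitly otherwise."""
    if kind not in get_args(WarehouseKind):
        # Not a declared kind at all: reject instead of guessing.
        raise ValueError(f"unknown warehouse kind: {kind!r}")
    if kind == "duckdb":
        return DuckDBAdapter()
    if kind == "motherduck":
        return MotherDuckAdapter()
    # Declared extension point without an implementation yet.
    raise NotImplementedError(f"warehouse kind {kind!r} is declared but not implemented")
```

Separating "unknown" from "declared but not implemented" keeps a typo distinguishable from a planned backend, which is one way to read the explicit not-implemented signalling the design calls for.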
Runtime flow is consistent: resources are wrapped and enriched, dlt performs ingestion into the configured warehouse destination, and the transformation stage runs only when enabled.
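The flow above can be sketched as a small driver function. This is an assumption-laden illustration rather than the engine's implementation: the ingest callable stands in for the dlt load step, the transform callable stands in for the transformation adapter, and the metadata column names _axiomatic_source and _axiomatic_extracted_at are hypothetical.

```python
from datetime import datetime, timezone
from typing import Callable, Dict, Iterable, Iterator, List

Record = Dict[str, object]


def enrich(records: Iterable[Record], source: str, extracted_at: str) -> Iterator[Record]:
    """Wrap a resource so every record carries extraction metadata."""
    for record in records:
        # Column names are hypothetical, chosen only for this sketch.
        yield {**record, "_axiomatic_source": source,
               "_axiomatic_extracted_at": extracted_at}


def run_pipeline(
    resources: Dict[str, Iterable[Record]],
    ingest: Callable[[str, Iterable[Record]], int],  # stand-in for the dlt load
    transform: Callable[[], None],                    # stand-in for the dbt stage
    run_transforms: bool,
) -> List[str]:
    """Run ingestion for every resource, then the gated transformation stage."""
    extracted_at = datetime.now(timezone.utc).isoformat()
    stages: List[str] = []

    for name, records in resources.items():
        ingest(name, enrich(records, name, extracted_at))
        stages.append(f"ingest:{name}")

    # The transformation stage is skipped entirely unless enabled.
    if run_transforms:
        transform()
        stages.append("transform")
    return stages
```

Passing the ingestion and transformation steps in as callables mirrors the adapter boundary: the driver owns ordering and gating, and knows nothing about the backend behind either step.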
The transformation stage is dbt-first by design. Dependency graphing, model ordering, and tests stay with dbt, while orchestration stays with the engine. This avoids reimplementing mature dbt capabilities in engine internals and keeps CI execution straightforward.
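One way to picture this delegation is an engine that only assembles a dbt command line and leaves graphing, ordering, and tests to dbt itself. The helper below is a hypothetical sketch: build_dbt_command is not an engine function, and whether the engine invokes dbt build, dbt run, or a programmatic entry point is not stated here. The --project-dir, --profiles-dir, and --select options are standard dbt CLI flags.

```python
from pathlib import Path
from typing import List, Optional, Sequence


def build_dbt_command(
    project_dir: Path,
    profiles_dir: Path,
    select: Optional[Sequence[str]] = None,
    command: str = "build",  # assumption: "build" runs models and tests together
) -> List[str]:
    """Assemble the argv for a dbt invocation without executing it."""
    argv = [
        "dbt", command,
        "--project-dir", str(project_dir),
        "--profiles-dir", str(profiles_dir),
    ]
    if select:
        # Node selection stays in dbt's own selector syntax.
        argv += ["--select", *select]
    return argv
```

The resulting list can be handed to subprocess.run(argv, check=True) in a runner or CI job, so a failing model or test surfaces as a non-zero exit code without the engine interpreting dbt's output.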
DuckDB and MotherDuck share a Duck-compatible base for common behaviour, with concrete adapters handling backend-specific validation. MotherDuck token handling remains environment-driven and is kept out of URI paths to reduce accidental credential exposure.
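A rough sketch of the shared-base arrangement, with the token kept in the environment. The class names, the validate and connection_uri methods, and the MOTHERDUCK_TOKEN variable name are assumptions for illustration; the md: prefix and the motherduck_token URI parameter follow MotherDuck's documented connection-string conventions.

```python
import os
from typing import Mapping, Optional

TOKEN_ENV_VAR = "MOTHERDUCK_TOKEN"  # assumed name of the token variable


class DuckCompatibleWarehouse:
    """Hypothetical shared base for Duck-compatible backends."""

    def __init__(self, database: str) -> None:
        self.database = database

    def validate(self, env: Optional[Mapping[str, str]] = None) -> None:
        # Common check shared by every Duck-compatible backend.
        if not self.database:
            raise ValueError("database must not be empty")

    def connection_uri(self) -> str:
        return self.database


class DuckDBWarehouse(DuckCompatibleWarehouse):
    """Local DuckDB file; the base behaviour is sufficient in this sketch."""


class MotherDuckWarehouse(DuckCompatibleWarehouse):
    def validate(self, env: Optional[Mapping[str, str]] = None) -> None:
        super().validate(env)
        env = os.environ if env is None else env
        # Backend-specific rule 1: never accept a token embedded in the URI.
        if "motherduck_token" in self.database.lower():
            raise ValueError("pass the token via the environment, not the URI")
        # Backend-specific rule 2: the token must be present in the environment.
        if not env.get(TOKEN_ENV_VAR):
            raise RuntimeError(f"{TOKEN_ENV_VAR} is not set")

    def connection_uri(self) -> str:
        # The URI names the database only; the client reads the token itself.
        return f"md:{self.database}"
```

Because connection_uri never interpolates the token, the string is safe to log or print in CI output.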
What I deliver
Data from APIs and files is loaded on a schedule, modelled into a shared reporting layer, and used across dashboards and AI-assisted workflows.
The goal is one trusted model, clear ownership, and straightforward handover.
Delivery scope
Implemented scope includes REST and file-based ingestion paths, DuckDB-compatible warehouse execution, typed settings, and dbt-first transformation orchestration. Declared extension points remain for additional storage backends and warehouses, with explicit not-implemented signalling.
Implemented: REST and file ingestion, Duck-compatible warehouses, and dbt-first transformation.
Phase 1 roadmap: S3, GCS, S3-compatible storage, and BigQuery.
Phase 2 roadmap: GraphQL API, SQL source, vendor REST connectors, PostgreSQL, Snowflake, and Redshift.
Phase 3 roadmap: SharePoint, OneDrive, Fabric, Databricks, and Azure Blob/ADLS.
Delivery notes
The current design favours clear boundaries and predictable execution over broad backend coverage. This reduces ambiguity for early adopters, but it also means some integrations remain planned. Semantic and MCP standards are tracked as planned capability work until concrete implementation is published.
Reproducibility
cd examples/fake_store
uv sync
uv run python run_pipeline.py --skip-transforms
uv run python run_pipeline.py --run-transforms
# pip alternative: python -m pip install -r requirements.txt
Evidence
- Read source ADR: https://github.com/axiomatic-bi/axiomatic-engine/blob/main/docs/adr/001-warehouse-adapter-hierarchy.md
- Read source ADR: https://github.com/axiomatic-bi/axiomatic-engine/blob/main/docs/adr/002-dbt-first-transformation-orchestration.md
- Read source ADR: https://github.com/axiomatic-bi/axiomatic-engine/blob/main/docs/adr/003-source-contract-boundary.md
- Read source ADR: https://github.com/axiomatic-bi/axiomatic-engine/blob/main/docs/adr/004-axiomatic-extraction-metadata-injection.md
- Read source ADR: https://github.com/axiomatic-bi/axiomatic-engine/blob/main/docs/adr/005-pipeline-stage-gating-and-force-reload.md
- Read source ADR: https://github.com/axiomatic-bi/axiomatic-engine/blob/main/docs/adr/006-environment-first-configuration-with-cli-overrides.md
- Read source ADR: https://github.com/axiomatic-bi/axiomatic-engine/blob/main/docs/adr/007-dbt-project-and-profiles-path-resolution.md
- Read source ADR: https://github.com/axiomatic-bi/axiomatic-engine/blob/main/docs/adr/008-observability-baseline-for-pipeline-execution.md
- Read source ADR: https://github.com/axiomatic-bi/axiomatic-engine/blob/main/docs/adr/009-resource-load-hints-contract-and-bridge-mapping.md
- Read source ADR: https://github.com/axiomatic-bi/axiomatic-engine/blob/main/docs/adr/010-hybrid-schema-evolution-policy.md
- Read source ADR: https://github.com/axiomatic-bi/axiomatic-engine/blob/main/docs/adr/011-dbt-runtime-invocation-and-motherduck-token-propagation.md
- Read source ADR: https://github.com/axiomatic-bi/axiomatic-engine/blob/main/docs/adr/012-package-readiness-and-uv-quality-gate.md
- Read source ADR: https://github.com/axiomatic-bi/axiomatic-engine/blob/main/docs/adr/013-specific-source-kinds-and-source-factory-routing.md
Narrative detail
This page anchors engine claims to architectural and ADR evidence, rather than framework-level marketing statements.
- dbt (dbt-first orchestration path).
- duckdb and motherduck with shared Duck-compatible base behaviour.
- AXIOMATIC_* variables with CLI-over-env override support in project runners.
- gcs/s3, warehouse bigquery, and additional transformation backends.
- write_disposition (append, replace, merge) and optional primary_key.
- auto, strict, and discard.
- merge for idempotent upsert behaviour, replace for deterministic snapshots, append for arrival history.
- dlt.common.* APIs to reduce compatibility risk across upstream changes.
ResourceLoadHints
- write_disposition: append | replace | merge
- primary_key: optional, used by merge semantics
- schema_evolution_mode: auto | strict | discard
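The three fields listed above map naturally onto a small validated value object. The sketch below is hypothetical: the field names and allowed values come from this page, but the defaults, the tuple type for primary_key, and the rule that merge requires a primary key are assumptions added for illustration.

```python
from dataclasses import dataclass
from typing import Literal, Optional, Tuple, get_args

WriteDisposition = Literal["append", "replace", "merge"]
SchemaEvolutionMode = Literal["auto", "strict", "discard"]


@dataclass(frozen=True)
class ResourceLoadHints:
    # Defaults are assumptions; the page does not state them.
    write_disposition: str = "append"
    primary_key: Optional[Tuple[str, ...]] = None
    schema_evolution_mode: str = "auto"

    def __post_init__(self) -> None:
        if self.write_disposition not in get_args(WriteDisposition):
            raise ValueError(f"unsupported write_disposition: {self.write_disposition!r}")
        if self.schema_evolution_mode not in get_args(SchemaEvolutionMode):
            raise ValueError(
                f"unsupported schema_evolution_mode: {self.schema_evolution_mode!r}"
            )
        # Assumption: an upsert needs a key to match on, so merge without a
        # primary key is rejected up front instead of at load time.
        if self.write_disposition == "merge" and not self.primary_key:
            raise ValueError("merge requires a primary_key")
```

For example, ResourceLoadHints(write_disposition="merge", primary_key=("order_id",)) describes an idempotent upsert keyed on a hypothetical order_id column, while the defaults describe a plain append with automatic schema evolution.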
Engine orchestration is exercised here through a project runner rather than a standalone engine CLI.
Future semantic and MCP orchestration standards remain explicitly planned until implementation artefacts are published.