At a glance

Problem

Public API data was easy to access but not ready for governed, repeatable reporting.

What was built

A reusable ingestion and modelling workflow with a reporting-ready retail star schema and Power BI outputs.

Why it matters

Reporting stays consistent because business logic lives in the model, not the dashboard.

Implementation focus

A config-driven reference build using Axiomatic Engine, dbt-led transformation, and client-ownable delivery assets.

Data model

The Fake Store source data is reshaped into a reporting-ready star schema built for reuse across dashboards and analysis. Business logic sits in the model layer so metrics stay consistent across outputs.

The model separates cart-item facts from shared dimensions for products, customers, and dates. That makes the reporting layer easier to extend, test, and maintain.

Model decisions

  • Star schema keeps joins predictable.
  • Cart-item grain supports revenue and quantity analysis at both product-item and order level.
  • Shared dimensions keep reporting definitions consistent.
flowchart LR
    fact["fct_cart_items"]
    dimProduct["dim_products"]
    dimCustomer["dim_users"]
    dimDate["dim_dates"]
    dimProduct --> fact
    dimCustomer --> fact
    dimDate --> fact
| Table | Role | Grain | Purpose |
| --- | --- | --- | --- |
| fct_cart_items | Core sales fact | One row per cart item | Supports revenue, quantity, and product analysis. |
| dim_products | Product dimension | One row per product | Holds product attributes and category information. |
| dim_users | Customer dimension | One row per user | Supports customer-level analysis and segmentation. |
| dim_dates | Date dimension | One row per date | Supports time-based filtering and reporting. |
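The cart-item grain above can be sketched in plain Python. This is an illustration only: the real reshaping happens in dbt SQL, and the field names (`id`, `userId`, `date`, `products`, `productId`, `quantity`) assume the public Fake Store cart payload shape.

```python
# Sketch: derive cart-item-grain fact rows from nested Fake Store carts.
# One output row per (cart, product) pair -- the grain of fct_cart_items.

def to_fact_cart_items(carts):
    """Flatten nested cart payloads to cart-item-grain rows."""
    rows = []
    for cart in carts:
        for item in cart["products"]:
            rows.append({
                "cart_id": cart["id"],           # degenerate order key
                "user_id": cart["userId"],       # FK -> dim_users
                "cart_date": cart["date"],       # FK -> dim_dates
                "product_id": item["productId"], # FK -> dim_products
                "quantity": item["quantity"],
            })
    return rows

carts = [
    {"id": 1, "userId": 7, "date": "2024-03-01",
     "products": [{"productId": 10, "quantity": 2},
                  {"productId": 11, "quantity": 1}]},
]
fact = to_fact_cart_items(carts)  # two rows: one per cart item
```

Keeping this flattening in the model layer, rather than in each dashboard, is what lets revenue and quantity measures agree across outputs.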

Data quality

  • dbt schema tests.
  • Deterministic reruns from incremental load strategies per resource.
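The dbt schema tests assert constraints like `unique` and `not_null` on model keys. The actual checks live in dbt YAML; the sketch below expresses the same two assertions in plain Python for readers unfamiliar with dbt, using illustrative helper names.

```python
# Plain-Python equivalents of dbt's `not_null` and `unique` schema tests.

def check_not_null(rows, column):
    """True when no row has a missing value in `column`."""
    return all(row.get(column) is not None for row in rows)

def check_unique(rows, column):
    """True when every value of `column` appears exactly once."""
    values = [row[column] for row in rows]
    return len(values) == len(set(values))

dim_products = [{"product_id": 1}, {"product_id": 2}]
assert check_not_null(dim_products, "product_id")
assert check_unique(dim_products, "product_id")
```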

Architecture

This diagram summarises the delivery flow from ingestion to warehouse modelling and then BI consumption.

For clients, this creates a reliable reporting foundation where pipeline changes and dashboard changes can be managed independently.

flowchart LR
    subgraph ingestGroup["Ingestion"]
        direction LR
        apiNode["<b>API extraction</b><br/>Products, users, carts via dlt resources"]
        bronzeNode["<b>Bronze load</b><br/>Raw outputs persisted to warehouse tables"]
        apiNode --> bronzeNode
    end
    subgraph transformGroup["Transformation"]
        direction LR
        silverNode["<b>Silver staging</b><br/>Typed and normalised dbt models"]
        goldNode["<b>Gold star schema</b><br/>Conformed dimensions and fact tables for analytics"]
        martsNode["<b>Mart layer (optional)</b><br/>Derived serving views for consumers"]
        silverNode --> goldNode
        goldNode -.-> martsNode
    end
    subgraph consumeGroup["Consumption"]
        direction LR
        consumeNode["<b>Consumption tools</b><br/>Power BI, Tableau, Evidence, Dash, MCP, or other analytics apps"]
    end
    bronzeNode --> silverNode
    goldNode --> consumeNode
    martsNode -.-> consumeNode
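The bronze-then-transform sequencing in the diagram can be sketched as a short Python outline. The function and step names here are illustrative, not the engine's actual API; the shape mirrors the `--skip-transforms` / `--run-transforms` behaviour of `run_pipeline.py` documented under "Commands" below.

```python
# Sketch: ingestion always runs; dbt transforms are an opt-in second stage.
# This separation is what lets pipeline and dashboard changes be managed
# independently -- BI reads the gold layer, not the raw bronze tables.

def run_pipeline(run_transforms: bool):
    steps = ["bronze: load raw API extracts"]   # dlt ingestion
    if run_transforms:
        steps.append("silver: dbt staging models")  # typed, normalised
        steps.append("gold: dbt star schema")       # facts + dimensions
    return steps

# Ingestion-only run leaves the reporting layer untouched.
ingest_only = run_pipeline(False)
full_run = run_pipeline(True)
```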

Reporting outputs

Reporting assets are versioned separately from engine runtime code, so BI can evolve independently while staying aligned to the same warehouse contract.

Power BI

The PBIX artefact, semantic model assets, and screenshots show how reporting consumes the governed model in practice.

Open PBIX artefact location

How to reproduce

Open engine project repository

Execution details

Assumptions

  • Environment values are loaded from local, non-committed files.
  • Warehouse credentials and dataset permissions are configured before transform runs.

Commands

cd projects/fake_store && uv sync
cd projects/fake_store && uv run python run_pipeline.py --skip-transforms
cd projects/fake_store && uv run python run_pipeline.py --run-transforms

Limitations

  • Demonstration-focused: production scheduling and alerting are not included.
  • Semantic MCP integration remains planned and is not yet published in this example.

Delivery intent, modelling, and transform decisions

This case study demonstrates a repeatable reporting workflow from API-first source data without embedding business logic in dashboards. It provides a clear handover between data engineering and BI delivery while keeping implementation evidence verifiable.

Engine layer

The engine layer defines ingestion, rerun behaviour, and transform contracts in versioned code.

Reporting layer

Reporting assets consume the same contract but are versioned separately from engine runtime code.

  • The PBIX artefact demonstrates report implementation against the governed model.
  • Reporting outputs are evidence of delivery, not a place for hidden transformation logic.

Model decisions

Model choices prioritise explicit grain, predictable reruns, and auditable quality checks.

  • Gold models follow a star schema centred on cart-item facts with conformed user, product, and date dimensions.
  • Resource-level load hints define rerun semantics (merge for updates, replace for snapshots).
  • dbt schema tests run in the same path as model execution to keep validation aligned with transforms.
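The merge-versus-replace rerun semantics named above can be made concrete with a small sketch. This mirrors dlt-style write dispositions in plain Python purely for illustration; the real behaviour is declared as resource-level load hints, not hand-written code.

```python
# "merge" upserts by primary key, so replaying the same input is a no-op;
# "replace" swaps in a full snapshot. Both make reruns deterministic.

def merge(table, incoming, key):
    """Upsert incoming rows into table, keyed on `key`."""
    merged = {row[key]: row for row in table}
    for row in incoming:          # later rows win on key collision
        merged[row[key]] = row
    return list(merged.values())

def replace(table, incoming):
    """Snapshot load: incoming fully overwrites the existing table."""
    return list(incoming)

existing = [{"id": 1, "qty": 2}]
update = [{"id": 1, "qty": 5}, {"id": 2, "qty": 1}]

# Re-applying the same update yields the same table: deterministic reruns.
once = merge(existing, update, "id")
twice = merge(once, update, "id")
```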

Outcomes and limits

This example is demonstration-focused but designed to show reproducible and verifiable delivery quality.

  • Demonstrates deterministic reruns and reproducible outputs from documented commands.
  • Provides a baseline for new source adapters or additional reporting consumers.
  • Production scheduling and alerting are not included in this reference build.