
# Fluid Forge

**Declarative Data Products — Write YAML, Deploy Anywhere.**
### 🎯 One Contract. Every Cloud.

Write a single YAML contract and deploy to GCP, AWS, Snowflake, or your laptop. Fluid Forge handles the cloud plumbing — datasets, tables, IAM, monitoring — so you don't have to.

### 🚀 Zero to Production in Minutes

`pip install` → `init` → `apply` → done. No cloud account needed to start. Pre-built blueprints, AI-powered scaffolding, and a local DuckDB provider for instant feedback.

### 🔄 Pipelines That Write Themselves

Auto-generate production-ready Airflow DAGs, Dagster graphs, and Prefect flows straight from your contracts. No hand-written orchestration code.

### 🛡️ Governance from Day One

Policy-as-code, sovereignty controls, column-level security, data masking, and full audit trails baked in — not bolted on.

### ☁️ True Multi-Cloud

Same CLI. Same Jenkinsfile. Same contract. Deploy to GCP (BigQuery), AWS (Athena, Glue), and Snowflake without rewriting a single line.

### 🧩 Extend Everything

Build a custom cloud provider in ~40 lines of Python. Plug in any LLM for AI generation. Export to open standards (ODPS, ODCS) for full interoperability.
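To make the "pipelines that write themselves" idea concrete, here is a minimal sketch of contract-to-DAG generation: a function that renders Airflow DAG source from a contract dictionary. The function name (`render_dag`), the contract layout, and the use of `EmptyOperator` placeholders are illustrative assumptions — this is not the actual output of `fluid generate-airflow`.

```python
def render_dag(contract: dict) -> str:
    """Render minimal Airflow DAG source for one data product contract.

    Hypothetical sketch: assumes a contract dict with an "id" and a list
    of "exposes" entries, mirroring the YAML contract shown below.
    """
    dag_id = contract["id"].replace(".", "_")
    tasks = "\n".join(
        f'    {e["exposeId"]} = EmptyOperator(task_id="{e["exposeId"]}")'
        for e in contract["exposes"]
    )
    return (
        "from airflow import DAG\n"
        "from airflow.operators.empty import EmptyOperator\n"
        "import pendulum\n\n"
        f'with DAG(dag_id="{dag_id}",\n'
        "         start_date=pendulum.datetime(2025, 1, 1),\n"
        "         schedule=None) as dag:\n"
        f"{tasks}\n"
    )


contract = {
    "id": "analytics.customers",
    "exposes": [{"exposeId": "customers_table"}],
}
dag_source = render_dag(contract)
```

The key point is the direction of the dependency: the DAG is derived from the contract, so orchestration code never drifts out of sync with the data product definition.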
## Why Fluid Forge?
Every cloud wants you locked in. Every SDK wants you to rewrite everything when you switch providers. Fluid Forge says no.
Write one declarative YAML contract. Deploy it to any cloud. Move between providers in seconds. This is Infrastructure-as-Code for data engineering — and it actually works.
::: code-group
```python [Without Fluid Forge]
from google.cloud import bigquery

client = bigquery.Client(project='my-project')

# Create the dataset
dataset = bigquery.Dataset('my-project.analytics')
dataset.location = 'US'
dataset.description = 'Customer analytics data'
dataset = client.create_dataset(dataset, exists_ok=True)

# Create the table
schema = [
    bigquery.SchemaField('id', 'INTEGER', mode='REQUIRED'),
    bigquery.SchemaField('name', 'STRING', mode='REQUIRED'),
    bigquery.SchemaField('email', 'STRING', mode='REQUIRED'),
]
table = bigquery.Table(dataset.table('customers'), schema=schema)
client.create_table(table, exists_ok=True)

# ... 80 more lines of IAM, monitoring, error handling
# ... then rewrite everything for AWS and Snowflake
```
```yaml [contract.fluid.yaml]
fluidVersion: "0.7.1"
kind: DataProduct
id: analytics.customers
name: Customer Analytics
metadata:
  owner: { team: data-engineering }
exposes:
  - exposeId: customers_table
    kind: table
    binding:
      platform: gcp # or aws, snowflake, local
      resource:
        type: bigquery_table
        dataset: analytics
        table: customers
    contract:
      schema:
        - name: id
          type: INTEGER
          required: true
        - name: name
          type: STRING
          required: true
        - name: email
          type: STRING
          required: true
          sensitivity: pii
```
:::
Then deploy with one command:
```sh
fluid apply contract.fluid.yaml --yes
```
That same contract deploys to GCP, AWS, Snowflake, or runs locally on DuckDB — zero code changes.
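The portability claim boils down to one idea: a logical binding in the contract resolves to a platform-native resource identifier at deploy time. The sketch below illustrates that mapping; the function name, URI schemes, and naming conventions are illustrative assumptions, not Fluid Forge internals.

```python
def resolve_resource(platform: str, dataset: str, table: str) -> str:
    """Map one logical table binding to a platform-native identifier.

    Hypothetical sketch: the URI formats below are invented for
    illustration and do not reflect the real provider implementations.
    """
    patterns = {
        "gcp": f"bigquery://{dataset}.{table}",           # BigQuery dataset.table
        "aws": f"athena://{dataset}/{table}",             # Glue database / Athena table
        "snowflake": f"snowflake://{dataset.upper()}.{table.upper()}",
        "local": f"duckdb:///{dataset}_{table}.duckdb",   # local DuckDB file
    }
    try:
        return patterns[platform]
    except KeyError:
        raise ValueError(f"unsupported platform: {platform}") from None
```

Because only the `platform:` key changes, everything above the binding — schema, governance, ownership — carries over unchanged between clouds.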
## Quick Start
```sh
# Install
pip install fluid-forge

# Create a project with sample data
fluid init my-project --quickstart
cd my-project

# Validate and run — no cloud account needed
fluid validate contract.fluid.yaml
fluid apply contract.fluid.yaml --yes
```
That's it. A working data product on your laptop in under 2 minutes — no cloud account, no credit card, no config hell. When you're ready for production, change `platform: local` to `platform: gcp` and run the exact same command.
Ready to dive deeper?
## Platform Support
| Platform | Deploy | IAM / RBAC | Airflow Gen | Key Services |
|---|---|---|---|---|
| GCP | ✅ Production | ✅ | ✅ | BigQuery, GCS, IAM |
| AWS | ✅ Production | ✅ | ✅ | S3, Glue, Athena, IAM |
| Snowflake | ✅ Production | ✅ | ✅ | Databases, Schemas, RBAC |
| Local | ✅ Production | — | — | DuckDB, CSV, Parquet |
| Azure | 🔜 Planned | 🔜 | 🔜 | Synapse, Data Lake |
All cloud providers use the same CLI commands and the same CI/CD pipeline — see Universal Pipeline.
## What's In the Box
| Feature | Description |
|---|---|
| 40+ CLI commands | `validate`, `plan`, `apply`, `verify`, `generate-airflow`, `export`, `policy-check`, and more |
| Blueprints | Pre-built templates: `customer-360`, `enterprise-snowflake`, analytics starters |
| AI Copilot | `fluid forge --mode copilot` — adaptive interview, discovery, validation/repair, then scaffolding |
| Governance Engine | Access policies, sovereignty controls, data classification, compliance checks |
| Orchestration Export | Generate Airflow DAGs, Dagster pipelines, and Prefect flows from contracts |
| Open Standards | Export to ODPS v4.1, ODCS v3.1, and data mesh catalogs |
| Custom Providers | Build your own provider with ~40 lines of Python using the Provider SDK |
| Universal CI/CD | One Jenkinsfile that works for every provider — zero branching logic |
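For a sense of what a "~40 lines of Python" custom provider might involve, here is a hypothetical sketch assuming a Provider SDK shaped roughly like a plan/apply interface. The base class, hook names, and contract layout are illustrative assumptions, not the real fluid-forge API.

```python
from abc import ABC, abstractmethod


class Provider(ABC):
    """Hypothetical provider interface: plan, then apply, one contract."""

    @abstractmethod
    def plan(self, contract: dict) -> list[str]:
        """Return a human-readable list of changes the apply would make."""

    @abstractmethod
    def apply(self, contract: dict) -> None:
        """Create the resources described by the contract."""


class InMemoryProvider(Provider):
    """Toy backend that 'deploys' tables into a Python dict."""

    def __init__(self) -> None:
        self.tables: dict[str, list[dict]] = {}

    def plan(self, contract: dict) -> list[str]:
        return [f"CREATE TABLE {e['exposeId']}" for e in contract["exposes"]]

    def apply(self, contract: dict) -> None:
        for exposure in contract["exposes"]:
            self.tables[exposure["exposeId"]] = exposure["contract"]["schema"]


contract = {
    "exposes": [
        {
            "exposeId": "customers_table",
            "contract": {"schema": [{"name": "id", "type": "INTEGER"}]},
        }
    ]
}
provider = InMemoryProvider()
plan = provider.plan(contract)
provider.apply(contract)
```

The plan/apply split is the design choice that makes a universal CI/CD pipeline possible: the pipeline can surface the plan for review on every provider without knowing anything provider-specific.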
## Who Uses Fluid Forge?
| Role | How Fluid Forge Helps |
|---|---|
| Data Engineers | Build production pipelines without wrestling with cloud SDKs |
| Analytics Teams | Create self-service data products with governance built-in |
| Platform Teams | Standardize data infrastructure across the entire org |
| Data Scientists | Deploy ML feature pipelines with proper contracts and testing |
## Next Steps
Developed with pride by DustLabs · Copyright 2025-2026 Agentics Transformation Pty Ltd · Open source under Apache 2.0