Snowflake Provider

Deploy data products to Snowflake Data Cloud — databases, schemas, tables, RBAC grants — using the same contract and CLI commands as every other provider.

Status: ✅ Production
Version: 0.7.8
Tested Services: Databases, Schemas, Tables, Warehouses, RBAC Grants


Overview

The Snowflake provider turns a FLUID contract into real Snowflake infrastructure:

  • ✅ Plan & Apply — Databases, schemas, tables, warehouses
  • ✅ RBAC Compilation — fluid policy-compile generates Snowflake GRANT statements from accessPolicy
  • ✅ Sovereignty Validation — Region constraints enforced before deployment
  • ✅ Airflow DAG Generation — fluid generate-airflow produces Snowflake-operator DAGs
  • ✅ Governance — Classification, column masking, row-level security, audit labels
  • ✅ Universal Pipeline — Same Jenkinsfile as GCP and AWS — zero provider logic

Choose Your Starting Path

Use Snowflake in one of these two modes:

  • Enterprise recommended path: dbt-snowflake plus explicit environment-specific warehouse, database, schema, and role settings. Start with the billing_history example and the Snowflake quickstart.
  • Minimal starter path: native SQL with the smoke example when you want the smallest contract that still proves auth, validate, plan, apply, and verify.

For production teams, make these explicit per environment:

  • SNOWFLAKE_ACCOUNT
  • SNOWFLAKE_USER
  • SNOWFLAKE_WAREHOUSE
  • SNOWFLAKE_DATABASE
  • SNOWFLAKE_SCHEMA
  • SNOWFLAKE_ROLE

For production and CI, use this authentication order:

  1. SNOWFLAKE_PRIVATE_KEY_PATH for key-pair auth
  2. SNOWFLAKE_OAUTH_TOKEN for federated automation
  3. SNOWFLAKE_AUTHENTICATOR for interactive SSO

Password auth is still supported, but it should be treated as a fallback rather than the default production path.

If no explicit credentials are present, browser SSO is only attempted in an interactive TTY session. Non-interactive runs should supply key-pair, OAuth, or another explicit authenticator instead of relying on browser prompts.
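The precedence above can be sketched as a small resolver. This is illustrative only; the function name, return values, and exact fallback behavior are assumptions, not the provider's actual implementation:

```python
def resolve_auth_method(env, interactive):
    """Illustrative sketch of the documented credential precedence.

    Hypothetical helper; not the provider's real code.
    """
    if env.get("SNOWFLAKE_PRIVATE_KEY_PATH"):
        return "key-pair"                    # 1. key-pair auth (CI/production)
    if env.get("SNOWFLAKE_OAUTH_TOKEN"):
        return "oauth"                       # 2. federated automation
    if env.get("SNOWFLAKE_AUTHENTICATOR"):
        return env["SNOWFLAKE_AUTHENTICATOR"]  # 3. e.g. externalbrowser for SSO
    if env.get("SNOWFLAKE_PASSWORD"):
        return "password"                    # supported fallback, not the default
    # Browser SSO is only attempted in an interactive TTY session
    return "externalbrowser" if interactive else None
```

The key point the sketch captures: a non-interactive run with no explicit credentials resolves to nothing, so CI jobs must supply one of the first three options.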

Working Example: Bitcoin Price Tracker

This is a production-tested example that runs end-to-end in Jenkins CI.

Contract

fluidVersion: "0.7.1"
kind: DataProduct
id: crypto.bitcoin_prices_snowflake_governed
name: Bitcoin Price Index (FLUID 0.7.1 + Snowflake + Governance)
description: >
  Real-time Bitcoin price data with comprehensive governance policies
  on Snowflake Data Cloud
domain: finance

tags:
  - cryptocurrency
  - real-time
  - governed
  - gdpr-compliant
  - snowflake

labels:
  cost_center: "CC-1234"
  business_criticality: "high"
  compliance_gdpr: "true"
  compliance_soc2: "true"
  platform: "snowflake"

metadata:
  layer: Gold
  owner:
    team: data-engineering
    email: data-engineering@company.com

# ── Data Sovereignty ──────────────────────────────────────────
sovereignty:
  jurisdiction: "EU"
  dataResidency: true
  allowedRegions:
    - eu-west-1          # Snowflake AWS Europe (Ireland)
    - eu-central-1       # Snowflake AWS Europe (Frankfurt)
    - europe-west4       # Snowflake GCP Europe (Netherlands)
  deniedRegions:
    - us-east-1
    - us-west-2
    - us-central1
  crossBorderTransfer: false
  transferMechanisms:
    - SCCs
  regulatoryFramework:
    - GDPR
    - SOC2
  enforcementMode: advisory
  validationRequired: true

# ── Access Policy: Snowflake RBAC ─────────────────────────────
accessPolicy:
  grants:
    - principal: "role:DATA_ANALYST"
      permissions: [read, select, query]

    - principal: "role:FINANCE_ANALYST"
      permissions: [read, select]

    - principal: "role:TRADER"
      permissions: [read, select, query]

    - principal: "role:DATA_ENGINEER"
      permissions: [write, insert, update, delete, create]

    - principal: "user:looker_service@company.com"
      permissions: [read, select]

# ── Expose: Snowflake Table ───────────────────────────────────
exposes:
  - exposeId: bitcoin_prices_table
    title: "Bitcoin Real-time Price Feed"
    version: "1.0.0"
    kind: table

    binding:
      platform: snowflake
      format: snowflake_table
      location:
        account: "{{ env.SNOWFLAKE_ACCOUNT }}"
        database: "CRYPTO_DATA"
        schema: "MARKET_DATA"
        table: "BITCOIN_PRICES"
      properties:
        cluster_by: ["price_timestamp"]
        table_type: "STANDARD"
        data_retention_time_in_days: 7
        change_tracking: true

    # Governance policies
    policy:
      classification: Internal
      authn: snowflake_rbac
      authz:
        readers:
          - role:DATA_ANALYST
          - role:FINANCE_ANALYST
          - role:TRADER
        writers:
          - role:DATA_ENGINEER
        columnRestrictions:
          - principal: "role:JUNIOR_ANALYST"
            columns: [market_cap_usd, volume_24h_usd]
            access: deny
      privacy:
        masking:
          - column: "ingestion_timestamp"
            strategy: "hash"
            params:
              algorithm: "SHA256"
        rowLevelPolicy:
          expression: >
            price_timestamp >= DATEADD(day, -30, CURRENT_TIMESTAMP())

    # Schema contract
    contract:
      schema:
        - name: price_timestamp
          type: TIMESTAMP_NTZ
          required: true
          description: UTC timestamp when price was recorded
          sensitivity: cleartext
          semanticType: "timestamp"

        - name: price_usd
          type: NUMBER(18,2)
          required: true
          description: Bitcoin price in USD
          sensitivity: cleartext
          semanticType: "currency"

        - name: price_eur
          type: NUMBER(18,2)
          required: false
          description: Bitcoin price in EUR

        - name: price_gbp
          type: NUMBER(18,2)
          required: false
          description: Bitcoin price in GBP

        - name: market_cap_usd
          type: NUMBER(20,2)
          required: false
          description: Total market capitalization in USD
          sensitivity: internal

        - name: volume_24h_usd
          type: NUMBER(20,2)
          required: false
          description: 24-hour trading volume in USD
          sensitivity: internal

        - name: price_change_24h_pct
          type: NUMBER(10,4)
          required: false
          description: 24-hour price change percentage

        - name: last_updated
          type: TIMESTAMP_NTZ
          required: false
          description: Timestamp from CoinGecko API

        - name: ingestion_timestamp
          type: TIMESTAMP_NTZ
          required: true
          description: When data was ingested into our system

# ── Build: API Ingestion ──────────────────────────────────────
builds:
  - id: bitcoin_price_ingestion
    description: Fetch Bitcoin prices from CoinGecko API
    pattern: hybrid-reference
    engine: python
    repository: ./runtime
    properties:
      model: ingest
    execution:
      trigger:
        type: manual
        iterations: 1
        delaySeconds: 3
      runtime:
        platform: snowflake
        resources:
          warehouse: "COMPUTE_WH"
          warehouse_size: "X-SMALL"
      retries:
        count: 3
        backoff: exponential
    outputs:
      - bitcoin_prices_table

Key Schema Patterns

The 0.7.1 binding schema uses three fields to identify platform resources:

Field             Purpose                Snowflake Values
binding.platform  Cloud provider         snowflake
binding.format    Storage format         snowflake_table
binding.location  Resource coordinates   account, database, schema, table

This is identical to GCP (platform: gcp, format: bigquery_table) and AWS (platform: aws, format: parquet).

CLI Commands

The Snowflake provider is auto-detected from binding.platform, so --provider snowflake is not required for plan, apply, verify, or test.

# Validate Snowflake connectivity with the same config the provider uses
fluid auth status snowflake

# Validate contract shape
fluid validate contract.fluid.yaml

# Generate execution plan
fluid plan contract.fluid.yaml --env dev --out plans/plan-dev.json

# Deploy database, schema, table, and build logic
fluid apply contract.fluid.yaml --env dev --yes

# Verify the deployed Snowflake object against the contract schema
fluid verify contract.fluid.yaml --strict

# Optional: run the live contract test flow
fluid test contract.fluid.yaml

# Validate governance declarations
fluid policy-check contract.fluid.yaml

# Compile RBAC / access bindings from accessPolicy grants
fluid policy-compile contract.fluid.yaml --env dev --out runtime/policy/bindings.json

# Apply RBAC bindings (dry-run or enforce)
fluid policy-apply runtime/policy/bindings.json --mode check
fluid policy-apply runtime/policy/bindings.json --mode enforce

# Generate Airflow DAG
fluid generate-airflow contract.fluid.yaml --out airflow-dags/bitcoin_snowflake.py

Recommended deployment gate for enterprise teams:

  1. fluid validate
  2. fluid plan
  3. fluid policy-check
  4. fluid policy-compile
  5. fluid apply
  6. fluid verify --strict
  7. optional fluid test

Every Snowflake session opened through the provider carries a QUERY_TAG so statements can be attributed in Snowflake QUERY_HISTORY. In practice this means plan/apply/verify traffic can be traced back to the contract and environment that issued it.
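For intuition, setting a session tag of that kind looks roughly like this. The tag layout and field names below are assumptions for illustration; the provider's actual tag format is not documented here:

```python
import json

def build_query_tag(contract_id, env, command):
    # Hypothetical tag layout: a JSON blob identifying contract + environment
    return json.dumps({
        "tool": "fluid",
        "contract": contract_id,
        "env": env,
        "command": command,
    })

def set_query_tag_sql(tag):
    # QUERY_TAG is a standard Snowflake session parameter;
    # single quotes must be doubled inside SQL string literals
    return "ALTER SESSION SET QUERY_TAG = '{}'".format(tag.replace("'", "''"))
```

Once set, the tag appears in the QUERY_TAG column of SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY, which is what makes per-contract attribution possible.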

RBAC Policy Compilation

fluid policy-compile reads accessPolicy.grants and generates Snowflake GRANT statements:

{
  "provider": "snowflake",
  "bindings": [
    {
      "role": "role:DATA_ANALYST",
      "resource": "bitcoin_prices_table",
      "permissions": [
        "SELECT on CRYPTO_DATA.MARKET_DATA.BITCOIN_PRICES",
        "USAGE on DATABASE CRYPTO_DATA",
        "USAGE on SCHEMA CRYPTO_DATA.MARKET_DATA"
      ]
    },
    {
      "role": "role:DATA_ENGINEER",
      "resource": "bitcoin_prices_table",
      "permissions": [
        "INSERT on CRYPTO_DATA.MARKET_DATA.BITCOIN_PRICES",
        "UPDATE on CRYPTO_DATA.MARKET_DATA.BITCOIN_PRICES",
        "DELETE on CRYPTO_DATA.MARKET_DATA.BITCOIN_PRICES"
      ]
    }
  ]
}

The permission mapping:

Contract Permission    Snowflake GRANT
read, select, query    SELECT on table, USAGE on database + schema
write, insert          INSERT on table
update                 UPDATE on table
delete                 DELETE on table
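The mapping above can be sketched as a small compiler step. This helper is hypothetical; the real policy-compile output is the bindings JSON shown earlier, not raw GRANT strings:

```python
# Hypothetical sketch of the contract-permission -> GRANT mapping table above.
PERMISSION_MAP = {
    "read": "SELECT", "select": "SELECT", "query": "SELECT",
    "write": "INSERT", "insert": "INSERT",
    "update": "UPDATE", "delete": "DELETE",
}

def compile_grants(role, permissions, database, schema, table):
    """Translate contract permissions into deduplicated GRANT statements."""
    fqtn = f"{database}.{schema}.{table}"
    stmts = set()
    for perm in permissions:
        grant = PERMISSION_MAP.get(perm)
        if grant:
            stmts.add(f"GRANT {grant} ON TABLE {fqtn} TO ROLE {role}")
    # Read access also requires USAGE on the enclosing database and schema
    if any(p in ("read", "select", "query") for p in permissions):
        stmts.add(f"GRANT USAGE ON DATABASE {database} TO ROLE {role}")
        stmts.add(f"GRANT USAGE ON SCHEMA {database}.{schema} TO ROLE {role}")
    return sorted(stmts)
```

Note how read, select, and query all collapse to a single SELECT grant: the set deduplicates, so a role with all three permissions still gets exactly three statements (SELECT plus the two USAGE grants).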

Governance Scope

Use the governance commands this way:

  • fluid policy-check validates governance declarations in the contract.
  • fluid policy-compile and fluid policy-apply manage Snowflake RBAC and access-policy bindings.
  • Snowflake governance during apply handles object-level controls such as tags, descriptions, and masking policies.
  • fluid verify checks deployed schema and drift. It does not perform a full RBAC or entitlement audit.

Credentials Setup

Jenkins CI (Recommended)

Create a Jenkins Secret File credential containing your Snowflake env vars:

# File contents (plain key=value, no 'export' prefix)
SNOWFLAKE_ACCOUNT=xy12345.eu-central-1
SNOWFLAKE_USER=FLUID_SERVICE
SNOWFLAKE_PASSWORD=xxxxxxxxxx
SNOWFLAKE_WAREHOUSE=COMPUTE_WH
SNOWFLAKE_ROLE=SYSADMIN
SNOWFLAKE_DATABASE=CRYPTO_DATA
SNOWFLAKE_SCHEMA=MARKET_DATA

The Universal Pipeline auto-detects this format and sources it into every stage. No provider-specific credential logic.
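Sourcing a plain key=value file amounts to something like the following. This is a sketch of the file format only, not the pipeline's actual loader:

```python
def parse_env_file(text):
    """Parse plain KEY=value lines; blank lines and '#' comments are skipped.

    Illustrative only -- mirrors the Secret File format described above.
    """
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env
```

Because the format has no shell syntax (no export, no quoting), the same file works verbatim as a Jenkins Secret File and as a local .env sourced with set -a.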

Local Development

# .env file (same format as Jenkins)
cat > .env << 'EOF'
SNOWFLAKE_ACCOUNT=xy12345.eu-central-1
SNOWFLAKE_USER=FLUID_SERVICE
SNOWFLAKE_PASSWORD=xxxxxxxxxx
SNOWFLAKE_WAREHOUSE=COMPUTE_WH
SNOWFLAKE_ROLE=SYSADMIN
SNOWFLAKE_DATABASE=CRYPTO_DATA
SNOWFLAKE_SCHEMA=MARKET_DATA
EOF

# Source and run
set -a; . .env; set +a
fluid auth status snowflake
fluid plan contract.fluid.yaml --env dev --out runtime/plan.json
fluid apply contract.fluid.yaml --env dev --yes
fluid verify contract.fluid.yaml --strict

Infrastructure Created

When you run fluid apply on a Snowflake contract, the provider creates:

Resource    Details
Database    CRYPTO_DATA
Schema      CRYPTO_DATA.MARKET_DATA
Table       CRYPTO_DATA.MARKET_DATA.BITCOIN_PRICES — clustered by price_timestamp
Warehouse   COMPUTE_WH (X-SMALL) — used for queries and ingestion

What the Pipeline Produces

After a successful run, the pipeline inserts real data:

SELECT price_timestamp, price_usd, price_eur, market_cap_usd
FROM CRYPTO_DATA.MARKET_DATA.BITCOIN_PRICES
ORDER BY price_timestamp DESC
LIMIT 5;
┌──────────────────────┬───────────┬───────────┬────────────────┐
│ PRICE_TIMESTAMP      │ PRICE_USD │ PRICE_EUR │ MARKET_CAP_USD │
├──────────────────────┼───────────┼───────────┼────────────────┤
│ 2025-01-30 14:30:52  │ 104809.00 │  96543.00 │ 2075000000.00  │
└──────────────────────┴───────────┴───────────┴────────────────┘

Governance Features

Data Sovereignty

The sovereignty block enforces region restrictions before any infrastructure is deployed:

sovereignty:
  jurisdiction: "EU"
  allowedRegions: [eu-west-1, eu-central-1, europe-west4]
  deniedRegions: [us-east-1, us-west-2, us-central1]
  crossBorderTransfer: false
  regulatoryFramework: [GDPR, SOC2]
  enforcementMode: advisory  # or strict (blocks deployment)
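The region check itself is conceptually simple. Here is a sketch of the logic implied by the block above; the function name, return shape, and precedence (denied list wins) are assumptions for illustration:

```python
def check_region(region, allowed, denied, mode):
    """Illustrative sovereignty check: denied list wins, then the allow list.

    Returns (may_proceed, message). Not the provider's real validator.
    """
    if region in denied:
        verdict = f"region {region} is explicitly denied"
    elif allowed and region not in allowed:
        verdict = f"region {region} is not in the allowed list"
    else:
        return True, "ok"
    # advisory mode warns but proceeds; strict mode blocks deployment
    return mode == "advisory", verdict
```

With enforcementMode: advisory (as in the example contract), a violation is reported but deployment continues; switching to strict turns the same violation into a hard stop before any infrastructure is touched.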

Column-Level Security

Restrict specific columns from specific roles:

authz:
  columnRestrictions:
    - principal: "role:JUNIOR_ANALYST"
      columns: [market_cap_usd, volume_24h_usd]
      access: deny

Privacy Masking

Hash sensitive fields and enforce retention policies:

privacy:
  masking:
    - column: "ingestion_timestamp"
      strategy: "hash"
      params:
        algorithm: "SHA256"
  rowLevelPolicy:
    expression: "price_timestamp >= DATEADD(day, -30, CURRENT_TIMESTAMP())"
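On Snowflake, a rowLevelPolicy expression like this one maps to a native Row Access Policy. Roughly, as a sketch (the policy name, signature column, and attachment step below are assumptions, not the provider's emitted DDL):

```python
def row_access_policy_ddl(policy_name, column, col_type, expression):
    """Build a Snowflake Row Access Policy from a row-level expression.

    Illustrative only; Snowflake row access policies are boolean
    functions evaluated per row against the named column(s).
    """
    return (
        f"CREATE OR REPLACE ROW ACCESS POLICY {policy_name} "
        f"AS ({column} {col_type}) RETURNS BOOLEAN -> {expression}"
    )
```

The policy would then be attached with something like ALTER TABLE ... ADD ROW ACCESS POLICY policy_name ON (price_timestamp), after which rows older than 30 days are invisible to queries governed by the policy.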

Snowflake-Native Security

The contract's governance maps naturally to Snowflake's built-in features:

Contract Feature             Snowflake Implementation
accessPolicy.grants          GRANT SELECT/INSERT/UPDATE ON TABLE ... TO ROLE ...
columnRestrictions           Dynamic Data Masking policies
rowLevelPolicy               Row Access Policies
sovereignty.allowedRegions   Account region validation
classification               Object tagging via TAG

CI/CD Pipeline

The Snowflake example uses the exact same Jenkinsfile as GCP and AWS — the Universal Pipeline. Key stages:

Stage          Command                                  What Happens
Validate       fluid validate                           Contract checked against 0.7.1 schema
Export         fluid odps export / fluid odcs export    Standards files generated
Compile RBAC   fluid policy-compile                     accessPolicy → Snowflake GRANT bindings
Plan           fluid plan                               Execution plan generated
Apply          fluid apply                              Database + schema + table created
Apply RBAC     fluid policy-apply                       RBAC grants enforced
Execute        fluid execute                            ingest.py runs, inserts rows to Snowflake
Airflow DAG    fluid generate-airflow                   Production DAG generated

Snowflake Table Properties

The binding.properties block supports Snowflake-specific table features:

binding:
  platform: snowflake
  format: snowflake_table
  location:
    database: "ANALYTICS"
    schema: "MARTS"
    table: "CUSTOMER_METRICS"
  properties:
    cluster_by: ["customer_id", "order_date"]
    table_type: "STANDARD"              # STANDARD or TRANSIENT
    data_retention_time_in_days: 7      # Time Travel retention
    change_tracking: true               # Enable CDC streams
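For intuition, each property corresponds to a standard Snowflake CREATE TABLE clause. Here is a sketch of that translation (column list elided; this is not the DDL the provider actually emits):

```python
def ddl_for_properties(name, props):
    """Sketch: map binding.properties onto Snowflake CREATE TABLE clauses.

    Illustrative only; column definitions are elided as '(...)'.
    """
    kind = "TRANSIENT TABLE" if props.get("table_type") == "TRANSIENT" else "TABLE"
    clauses = []
    if props.get("cluster_by"):
        clauses.append("CLUSTER BY (" + ", ".join(props["cluster_by"]) + ")")
    if "data_retention_time_in_days" in props:
        clauses.append(
            f"DATA_RETENTION_TIME_IN_DAYS = {props['data_retention_time_in_days']}"
        )
    if props.get("change_tracking"):
        clauses.append("CHANGE_TRACKING = TRUE")
    return f"CREATE {kind} {name} (...)" + "".join(" " + c for c in clauses)
```

So a TRANSIENT table trades Fail-safe protection for lower storage cost, data_retention_time_in_days controls the Time Travel window, and CHANGE_TRACKING = TRUE is what allows streams to consume the table as a CDC source.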

See Also

  • Snowflake Team Collaboration Walkthrough - Role-based PR review example for Snowflake teams
  • Universal Pipeline — Same Jenkinsfile for every provider
  • AWS Provider — Amazon Web Services integration
  • GCP Provider — Google Cloud Platform integration
  • CLI Reference — Full command documentation
Last Updated: 4/4/26, 10:47 PM
Contributors: khanya_ai, fas89