Architecting Zero-Downtime Event Ingestion: A Serverless Approach using AWS and Snowflake

Sritej Panchumarthi · Published: January 15, 2026 · Updated: July 7, 2026 · System Architecture · 55 min read

Abstract
Ingesting high-velocity telemetry into analytical warehouses presents significant challenges in scalability, cost, and reliability. This article details the complete architecture of a production-grade event lake capable of handling millions of events per minute. We present a serverless design pattern utilizing AWS Kinesis Data Firehose, Lambda, and Snowflake Snowpipe, achieving end-to-end latency of roughly 2 to 16 minutes at the default buffer settings (lower if the buffer is reduced) and 99.99% ingestion availability while minimizing operational overhead. Every tier is specified to L3 (component) depth: exact services, IAM trust chains, buffer parameters, failure paths, and the SQL and Terraform to reproduce it.

Key takeaway: The durable pattern is not "API Gateway to warehouse." It is an edge-protected ingestion path, a validation tier, a buffered delivery stream, an immutable S3 lake, and an asynchronous Snowflake load path. That separation keeps customer telemetry flowing even when analytics systems slow down — and it is the difference between a demo and a platform.

1. Introduction: Why This Problem Is Hard

Modern observability and product analytics require near-real-time ingestion. Traditional batch ETL introduces hours of latency; managing dedicated streaming clusters (Kafka, Flink) incurs continuous operational cost, capacity planning, and on-call burden that most platform teams cannot justify for what is fundamentally a delivery problem.

The failure modes that kill naive designs are always the same three:

Backpressure coupling. When the warehouse slows down, a synchronous pipeline propagates that slowness all the way to the client SDK, and you start dropping customer data at the worst possible moment.
Schema drift. Producers ship a new app version, a field changes type, and three weeks later an analyst discovers a silent hole in the data.
Cost creep. Millions of tiny files, always-on warehouses, and unbounded retention quietly turn a cheap pipeline into a five-figure monthly line item.

The architecture below addresses each failure mode structurally — with buffering, a validation boundary, and file-size economics — rather than with heroics.

1.1 Design Goals

Goal	Architectural decision	Why it matters
Absorb spikes	Firehose buffering and S3 landing zone	Traffic bursts do not directly pressure Snowflake warehouses.
Preserve raw events	Immutable S3 raw prefix with lifecycle tiering	Bad transformations can be replayed without asking clients to resend data.
Control cost	Parquet conversion and 128 MB file targets	Columnar storage reduces query scan size and Snowpipe per-file overhead.
Protect users	WAF, schema validation, and PII minimization at the boundary	Invalid, abusive, or over-privileged payloads are rejected before persistence.
Zero-downtime loading	Asynchronous Snowpipe with SQS notification	Warehouse maintenance or slowness never blocks ingestion.
Reproducibility	Everything in Terraform + versioned SQL	The entire pipeline can be rebuilt in a new account or region in under an hour.

2. L3 Reference Architecture

The Level-3 (component) diagram below shows every service, the protocol on each hop, the IAM principal that authorizes it, and the failure path at each tier. Read it left to right as the life of one event; read the bracketed annotations as the operational contract of each hop.

Fig 1. L3 Component Architecture — Serverless Event Lake, AWS → Snowflake

                          PUBLIC INTERNET
  ┌───────────────────────────────────────────────────────────────┐
  │  Client SDKs (mobile / web / IoT)                             │
  │  HTTPS POST /v1/events  · TLS 1.2+ · gzip JSON ≤ 256 KB       │
  └──────────────────────────────┬────────────────────────────────┘
                                 │ 443
                                 ▼
  ┌───────────────────────────────────────────────────────────────┐
  │  Route 53 (latency-based routing, health checks)              │
  │      └─► CloudFront distribution  (TLS termination, HTTP/2)   │
  │            ├─ AWS WAF WebACL                                  │
  │            │    · AWSManagedRulesCommonRuleSet                │
  │            │    · AmazonIpReputationList                      │
  │            │    · RateLimit: 2 000 req / 5 min / IP           │
  │            └─ Origin: API Gateway (custom header x-edge-key)  │
  └──────────────────────────────┬────────────────────────────────┘
                                 │ 443 (origin-locked)
════════════════════════════════ AWS ACCOUNT BOUNDARY ═════════════
                                 ▼
  ┌──────────────────── REGION us-east-1 ─────────────────────────┐
  │                                                               │
  │  API Gateway (REST, regional)                                 │
  │   · Usage plans: 5 000 rps burst / 2 000 rps steady           │
  │   · Request validator: basic JSON shape                       │
  │   · Auth: IAM / API key per producer                          │
  │        │ AWS_PROXY integration                                │
  │        ▼                                                      │
  │  Lambda "event-validator"  (Python 3.12, 512 MB, arm64)       │
  │   · Pydantic schema validation (versioned envelope)           │
  │   · PII scrub + field allow-list                              │
  │   · firehose:PutRecordBatch  (IAM: validator-role)            │
  │        │ success                    │ failure                 │
  │        ▼                            ▼                         │
  │  Kinesis Data Firehose         SQS DLQ "events-dlq"           │
  │   "telemetry-stream"            · 14-day retention            │
  │   · Buffer: 128 MB / 900 s      · CloudWatch alarm > 0        │
  │   · JSON → Parquet (Snappy)                                   │
  │   · Schema: Glue Data Catalog table telemetry.events          │
  │   · IAM: firehose-role (glue:GetTable, s3:PutObject)          │
  │        │                                                      │
  │        ▼                                                      │
  │  S3 bucket "my-telemetry-bucket"  (SSE-KMS, versioned)        │
  │   ├── events/year=YYYY/month=MM/day=DD/…  ◄─ curated Parquet  │
  │   ├── raw/…            ◄─ optional unmodified JSON copy       │
  │   └── errors/…         ◄─ Firehose format-conversion failures │
  │        │ s3:ObjectCreated:* (prefix events/)                  │
  │        ▼                                                      │
  │  SQS queue (Snowflake-managed ARN from DESC PIPE)             │
  └────────┼──────────────────────────────────────────────────────┘
           │ SQS notification
═══════════┼══════════ SNOWFLAKE ACCOUNT (SaaS) ═══════════════════
           ▼
  ┌───────────────────────────────────────────────────────────────┐
  │  Snowpipe "telemetry_pipe" (AUTO_INGEST = TRUE)               │
  │   · assumes IAM role snowflake-access-role                    │
  │     via STORAGE INTEGRATION (ExternalId condition)            │
  │   · serverless COPY INTO raw_events (micro-batch)             │
  │        │                                                      │
  │        ▼                                                      │
  │  RAW.raw_events  ──► STREAM evt_stream ──► TASK evt_task      │
  │   (VARIANT + typed cols)   (CDC offset)    (1-min schedule,   │
  │                                             MERGE → marts)    │
  │        │                                                      │
  │        ▼                                                      │
  │  ANALYTICS.fct_events / dim_* (BI, dashboards, ML features)   │
  └───────────────────────────────────────────────────────────────┘

  FAILURE PATHS
  · WAF block            → 403 at edge, never reaches account
  · Schema invalid       → 422 to client + sample logged
  · Firehose S3 failure  → automatic retry for up to 24 h, then data is dropped
  · Snowpipe failure     → files persist in S3; pipe resumes, drains
  · Task failure         → stream offset holds; no data loss

Two properties of this diagram deserve emphasis. First, every arrow crossing a trust boundary names its authentication mechanism: origin-locked custom headers at the edge, IAM roles inside the account, and an ExternalId-scoped role assumption into Snowflake. If you cannot annotate an arrow with its auth story, the architecture is not done. Second, every tier has an explicit failure path that terminates in durable storage rather than in a dropped event. The one exception is an S3 delivery outage that outlasts Firehose's 24-hour retry window, which is why delivery success is the first alarm to wire up. The errors/ prefix receives format-conversion failures, not failed S3 writes.

3. Edge Tier: Route 53, CloudFront, and WAF

Encapsulating API Gateway behind CloudFront is not just about caching — for a write-heavy telemetry endpoint there is little to cache. The edge tier earns its place in four ways:

Layer 7 protection. AWS WAF managed rule groups block known-bad IPs, SQLi/XSS probes, and oversized bodies before they consume API Gateway request quota you pay for.
Global TLS termination. Client SDKs on distant continents complete TLS handshakes at the nearest edge POP, cutting connection setup latency roughly in half for far-flung users.
Origin cloaking. The API Gateway endpoint only accepts requests carrying a secret header injected by CloudFront (x-edge-key), so attackers cannot bypass the WAF by hitting the origin directly.
CORS preflight caching. Browser SDKs generate a flood of OPTIONS requests; CloudFront answers them from cache at negligible cost.

# WAF rate-limit rule: the single highest-value control on a public ingest endpoint.
# CLOUDFRONT-scoped ACLs must be created through a us-east-1 provider.
resource "aws_wafv2_web_acl" "ingest" {
  name  = "telemetry-ingest-acl"
  scope = "CLOUDFRONT"

  # Nested blocks need multi-line syntax; HCL single-line blocks take one argument only
  default_action {
    allow {}
  }

  rule {
    name     = "rate-limit-per-ip"
    priority = 1
    action {
      block {}
    }
    statement {
      rate_based_statement {
        limit              = 2000        # requests per 5-minute window
        aggregate_key_type = "IP"
      }
    }
    visibility_config {
      cloudwatch_metrics_enabled = true
      metric_name                = "rate-limit-per-ip"
      sampled_requests_enabled   = true
    }
  }

  rule {
    name     = "aws-managed-common"
    priority = 2
    override_action {
      none {}
    }
    statement {
      managed_rule_group_statement {
        name        = "AWSManagedRulesCommonRuleSet"
        vendor_name = "AWS"
      }
    }
    visibility_config {
      cloudwatch_metrics_enabled = true
      metric_name                = "aws-managed-common"
      sampled_requests_enabled   = true
    }
  }

  # Required at the ACL level as well as on each rule
  visibility_config {
    cloudwatch_metrics_enabled = true
    metric_name                = "telemetry-ingest-acl"
    sampled_requests_enabled   = true
  }
}
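
Origin cloaking only works if something behind CloudFront actually checks the injected header. The following is a minimal sketch of that check, assuming it runs inside a Lambda (the validator or an authorizer) and that the shared secret is passed in by the caller. The function name `is_from_edge` is hypothetical and not part of the stack described in this article.

```python
import hmac


def is_from_edge(headers: dict, expected_key: str) -> bool:
    """Return True only if the request carries the CloudFront-injected secret.

    Header names may arrive in any case, so they are compared
    case-insensitively; values are compared in constant time.
    """
    if not expected_key:
        return False  # fail closed if the secret is not configured
    normalized = {str(k).lower(): v for k, v in (headers or {}).items()}
    supplied = normalized.get("x-edge-key", "")
    if not isinstance(supplied, str):
        return False
    return hmac.compare_digest(supplied.encode(), expected_key.encode())
```

A request that fails this check would be answered with a 403 before any parsing or validation happens, so direct hits on the origin cost almost nothing to reject.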

4. Validation Tier: API Gateway and Lambda

Direct integration between API Gateway and Firehose is possible (via service proxy) and tempting — one less component. It is the wrong call for a multi-tenant telemetry platform, because it surrenders the single most valuable control point: a programmable boundary where you decide what is allowed to become data.

The Lambda validation tier enforces three contracts before anything is persisted:

Shape contract — the payload parses and matches a versioned schema.
Privacy contract — only allow-listed fields survive; stable identifiers are hashed; free-form fields are size-capped.
Economic contract — batch writes to Firehose (PutRecordBatch, up to 500 records / 4 MB) instead of per-event calls, which cuts Firehose API cost and Lambda duration.

4.1 Production-Grade Validator (Python 3.12, Pydantic v2)

import base64
import hashlib
import json
import os
from datetime import datetime, timezone

import boto3
from pydantic import BaseModel, Field, ValidationError, field_validator

firehose = boto3.client("firehose")
sqs = boto3.client("sqs")

STREAM_NAME = os.environ["STREAM_NAME"]
DLQ_URL = os.environ["DLQ_URL"]
SALT = os.environ["HASH_SALT"]          # rotated via Secrets Manager

ALLOWED_SOURCES = {"mobile_app", "web", "iot", "backend"}

class TelemetryEvent(BaseModel):
    schema_version: int = Field(ge=1, le=3)       # explicit contract version
    event_id: str = Field(min_length=10, max_length=64)
    timestamp: int                                 # epoch millis
    user_id: str = Field(min_length=1, max_length=128)
    source: str
    payload: dict

    @field_validator("source")
    @classmethod
    def source_known(cls, v: str) -> str:
        if v not in ALLOWED_SOURCES:
            raise ValueError(f"unknown source '{v}'")
        return v

    @field_validator("timestamp")
    @classmethod
    def timestamp_sane(cls, v: int) -> int:
        now_ms = int(datetime.now(timezone.utc).timestamp() * 1000)
        if not (now_ms - 7 * 86_400_000) <= v <= (now_ms + 300_000):
            raise ValueError("timestamp outside acceptance window")
        return v

def scrub(evt: TelemetryEvent) -> dict:
    """Privacy contract: hash identifiers, drop non-allow-listed payload keys."""
    record = evt.model_dump()
    record["user_id"] = hashlib.sha256((SALT + record["user_id"]).encode()).hexdigest()
    record["payload"] = {k: v for k, v in record["payload"].items()
                         if k in {"screen", "action", "duration_ms", "app_version"}}
    record["ingested_at"] = datetime.now(timezone.utc).isoformat()
    return record

def handler(event, context):
    try:
        body = json.loads(event["body"])
    except (json.JSONDecodeError, KeyError, TypeError):
        return {"statusCode": 400, "body": '{"error":"invalid JSON"}'}

    # accept single event or batch
    items = body if isinstance(body, list) else [body]
    if len(items) > 500:
        return {"statusCode": 413, "body": '{"error":"batch too large"}'}

    valid, rejected = [], []
    for item in items:
        try:
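            # model_validate raises ValidationError for non-object items too,
            # whereas TelemetryEvent(**item) would raise an uncaught TypeError
            valid.append(scrub(TelemetryEvent.model_validate(item)))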
        except ValidationError as e:
            rejected.append({"item": item, "errors": e.errors()})

    if valid:
        records = [{"Data": (json.dumps(v) + "\n").encode()} for v in valid]
        resp = firehose.put_record_batch(
            DeliveryStreamName=STREAM_NAME, Records=records
        )
        if resp["FailedPutCount"] > 0:
            # push Firehose-level failures to DLQ for redrive, never drop
            failed = [records[i] for i, r in enumerate(resp["RequestResponses"])
                      if "ErrorCode" in r]
            sqs.send_message(
                QueueUrl=DLQ_URL,
                MessageBody=base64.b64encode(
                    json.dumps([f["Data"].decode() for f in failed]).encode()
                ).decode(),
            )

    status = 202 if valid and not rejected else (207 if valid else 422)
    return {
        "statusCode": status,
        "body": json.dumps({"accepted": len(valid), "rejected": len(rejected)}),
    }

Notice what this handler refuses to do: it never drops a record silently (Firehose partial failures go to a DLQ), it never accepts a timestamp outside a sanity window (protecting downstream partitioning from corrupt clocks), and it never forwards a payload field that is not explicitly allow-listed (making the schema a privacy boundary, not just a type check).
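
The handler caps a request at 500 items, which matches the PutRecordBatch record limit, but it does not check the 4 MB request-size limit. One way to cover both is to split the encoded records before calling Firehose. This is a sketch under that assumption; `chunk_records` is a hypothetical helper, and the caller would issue one `put_record_batch` call per yielded batch.

```python
from typing import Iterable, Iterator, List

MAX_RECORDS = 500                   # PutRecordBatch record-count limit
MAX_BATCH_BYTES = 4 * 1024 * 1024   # PutRecordBatch request-size limit


def chunk_records(blobs: Iterable[bytes],
                  max_records: int = MAX_RECORDS,
                  max_bytes: int = MAX_BATCH_BYTES) -> Iterator[List[bytes]]:
    """Yield lists of encoded records that each fit in one PutRecordBatch call."""
    batch: List[bytes] = []
    size = 0
    for blob in blobs:
        if len(blob) > max_bytes:
            raise ValueError("single record exceeds the batch size limit")
        # Close the current batch if adding this record would break either limit
        if batch and (len(batch) >= max_records or size + len(blob) > max_bytes):
            yield batch
            batch, size = [], 0
        batch.append(blob)
        size += len(blob)
    if batch:
        yield batch
```

Each yielded list can be turned into the `Records` argument with `[{"Data": b} for b in batch]`, leaving the DLQ handling in the handler unchanged.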

5. Delivery Tier: Firehose, Glue Schema, and S3 Layout

Amazon Kinesis Data Firehose is the workhorse: it buffers, batches, retries, converts JSON to Parquet against a Glue-registered schema, and writes Hive-partitioned prefixes to S3 — all as a managed service with no capacity to plan.

5.1 The Glue Table Is the Schema Contract

resource "aws_glue_catalog_database" "telemetry" {
  name = "telemetry"
}

resource "aws_glue_catalog_table" "events" {
  database_name = aws_glue_catalog_database.telemetry.name
  name          = "events"
  table_type    = "EXTERNAL_TABLE"

  storage_descriptor {
    location      = "s3://my-telemetry-bucket/events/"
    input_format  = "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat"
    output_format = "org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat"

    ser_de_info {
      serialization_library = "org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe"
    }

    # HCL single-line blocks accept one argument only, so generate one
    # multi-line columns block per entry instead
    dynamic "columns" {
      for_each = [
        { name = "schema_version", type = "int" },
        { name = "event_id",       type = "string" },
        { name = "timestamp",      type = "bigint" },
        { name = "user_id",        type = "string" },
        { name = "source",         type = "string" },
        { name = "payload",        type = "string" },   # JSON string; typed in warehouse
        { name = "ingested_at",    type = "string" },
      ]
      content {
        name = columns.value.name
        type = columns.value.type
      }
    }
  }
}
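
If the Glue table is the contract, a CI step can refuse changes that are not purely additive before they reach Terraform. The sketch below assumes the old and new column lists are available as plain name-to-type mappings; `breaking_changes` and the `session_id` column are hypothetical names used only for illustration.

```python
from typing import Dict, List


def breaking_changes(old: Dict[str, str], new: Dict[str, str]) -> List[str]:
    """Compare two {column_name: glue_type} mappings.

    Returns human-readable violations; an empty list means the change
    only adds columns and leaves existing ones untouched.
    """
    problems = []
    for name, old_type in old.items():
        if name not in new:
            problems.append(f"column removed or renamed: {name}")
        elif new[name] != old_type:
            problems.append(f"type changed: {name} {old_type} -> {new[name]}")
    return problems


current = {"schema_version": "int", "event_id": "string", "timestamp": "bigint"}
additive = {**current, "session_id": "string"}            # allowed: new column
renamed = {"schema_version": "int", "id": "string", "timestamp": "bigint"}

print(breaking_changes(current, additive))
print(breaking_changes(current, renamed))
```

The check cannot tell a rename from a removal plus an addition, which is intentional: both break existing readers and both should fail the build.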

5.2 Firehose Delivery Stream

resource "aws_kinesis_firehose_delivery_stream" "stream" {
  name        = "telemetry-stream"
  destination = "extended_s3"

  extended_s3_configuration {
    role_arn            = aws_iam_role.firehose.arn
    bucket_arn          = aws_s3_bucket.telemetry.arn
    prefix              = "events/year=!{timestamp:yyyy}/month=!{timestamp:MM}/day=!{timestamp:dd}/"
    error_output_prefix = "errors/!{firehose:error-output-type}/"

    # THE cost-defining parameters: 128 MB or 15 minutes, whichever first
    buffering_size     = 128
    buffering_interval = 900

    data_format_conversion_configuration {
      input_format_configuration {
        deserializer {
          open_x_json_ser_de {}
        }
      }
      output_format_configuration {
        serializer {
          parquet_ser_de {}
        }
      }
      schema_configuration {
        database_name = aws_glue_catalog_database.telemetry.name
        table_name    = aws_glue_catalog_table.events.name
        role_arn      = aws_iam_role.firehose.arn
      }
    }

    cloudwatch_logging_options {
      enabled         = true
      log_group_name  = "/aws/kinesisfirehose/telemetry-stream"
      log_stream_name = "S3Delivery"
    }
  }
}

5.3 S3 Bucket Layout and Lifecycle

A disciplined prefix layout is what makes replay, audit, and cost-tiering possible later:

s3://my-telemetry-bucket/
├── events/year=2026/month=07/day=07/     # curated Parquet (Snowpipe source)
├── raw/year=2026/month=07/day=07/        # optional raw JSON copy (tighter ACL)
└── errors/format-conversion/             # Firehose conversion failures

Lifecycle rules:
  events/   → Infrequent Access after 30 d → Glacier after 180 d → expire 730 d
  raw/      → Glacier after 30 d           → expire 365 d
  errors/   → expire 90 d (alarm if non-empty!)

5.4 S3 Event Notification to Snowpipe's Queue

resource "aws_s3_bucket_notification" "snowpipe_trigger" {
  bucket = aws_s3_bucket.telemetry.id

  queue {
    # ARN comes from Snowflake:  DESC PIPE telemetry_pipe → notification_channel
    queue_arn     = "arn:aws:sqs:us-east-1:123456789012:sf-snowpipe-AIDA..."
    events        = ["s3:ObjectCreated:*"]
    filter_prefix = "events/"
    filter_suffix = ".parquet"
  }
}

6. Snowflake Integration in Depth

6.1 Storage Integration — Keyless Trust

The Storage Integration object establishes an IAM-role-based trust relationship with S3, eliminating long-lived access keys entirely. Snowflake maintains a per-account IAM user; your role's trust policy admits that user only when it presents the ExternalId that Snowflake generated for your integration.

-- Run as ACCOUNTADMIN
CREATE OR REPLACE STORAGE INTEGRATION s3_telemetry_int
  TYPE = EXTERNAL_STAGE
  STORAGE_PROVIDER = 'S3'
  ENABLED = TRUE
  STORAGE_AWS_ROLE_ARN = 'arn:aws:iam::123456789012:role/snowflake-access-role'
  STORAGE_ALLOWED_LOCATIONS = ('s3://my-telemetry-bucket/events/');

-- Retrieve STORAGE_AWS_IAM_USER_ARN and STORAGE_AWS_EXTERNAL_ID:
DESC INTEGRATION s3_telemetry_int;

The corresponding IAM role on the AWS side — note the ExternalId condition, which is what prevents the confused deputy attack:

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": { "AWS": "arn:aws:iam::SNOWFLAKE_ACCT:user/abc1-s-xyz" },
    "Action": "sts:AssumeRole",
    "Condition": {
      "StringEquals": { "sts:ExternalId": "MYORG_SFCRole=2_geJ8..." }
    }
  }]
}

The permissions policy attached to the same role grants read-only access to the events/ prefix. Because the bucket is encrypted with SSE-KMS, the role also needs kms:Decrypt on the bucket's key, which is not shown here:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:GetObjectVersion"],
      "Resource": "arn:aws:s3:::my-telemetry-bucket/events/*"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": "arn:aws:s3:::my-telemetry-bucket",
      "Condition": { "StringLike": { "s3:prefix": ["events/*"] } }
    }
  ]
}
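
Because the ExternalId condition is what blocks the confused deputy attack, it is worth checking for mechanically whenever the trust policy changes. A minimal sketch, assuming the policy has already been loaded as a Python dict; the function name is hypothetical, and the check is deliberately narrow (it only looks at `StringEquals` with the exact key casing used above).

```python
from typing import Any, Dict, List


def statements_missing_external_id(policy: Dict[str, Any]) -> List[int]:
    """Return indexes of Allow/AssumeRole statements lacking an ExternalId condition."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):      # IAM also accepts a single statement object
        statements = [statements]
    missing = []
    for i, stmt in enumerate(statements):
        actions = stmt.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        if stmt.get("Effect") != "Allow" or "sts:AssumeRole" not in actions:
            continue
        equals = stmt.get("Condition", {}).get("StringEquals", {})
        if not equals.get("sts:ExternalId"):
            missing.append(i)
    return missing
```

An empty result means every assume-role grant in the policy is scoped by an ExternalId; any index returned points at a statement to fix before applying.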

6.2 File Format, Stage, and Landing Table

CREATE OR REPLACE FILE FORMAT parquet_format
  TYPE = 'PARQUET'
  COMPRESSION = 'SNAPPY';

CREATE OR REPLACE STAGE s3_stage
  STORAGE_INTEGRATION = s3_telemetry_int
  URL = 's3://my-telemetry-bucket/events/'
  FILE_FORMAT = parquet_format;

CREATE OR REPLACE TABLE raw.raw_events (
  schema_version INT,
  event_id       STRING,
  timestamp      BIGINT,
  user_id        STRING,
  source         STRING,
  payload        VARIANT,
  ingested_at    TIMESTAMP_NTZ,
  _loaded_at     TIMESTAMP_NTZ DEFAULT CURRENT_TIMESTAMP(),
  _file_name     STRING
);

6.3 Snowpipe — Event-Driven Micro-Batch Loading

CREATE OR REPLACE PIPE telemetry_pipe
  AUTO_INGEST = TRUE
  AS
  COPY INTO raw.raw_events (schema_version, event_id, timestamp, user_id,
                            source, payload, ingested_at, _file_name)
  FROM (
    SELECT $1:schema_version, $1:event_id, $1:timestamp, $1:user_id,
           $1:source, PARSE_JSON($1:payload), $1:ingested_at,
           METADATA$FILENAME
    FROM @s3_stage
  )
  MATCH_BY_COLUMN_NAME = NONE;   -- explicit projection beats implicit matching in prod

Capturing METADATA$FILENAME per row is a habit that pays for itself the first time you need to trace a bad record back to the exact Parquet file — and therefore to the exact Firehose buffer window and Lambda log stream that produced it.
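
Tracing a row back to its buffer window starts with recovering the partition from the stored file name. The sketch below assumes only the Hive-style `year=/month=/day=` layout used in this article; the object name in the usage line is made up, and `partition_date` is a hypothetical helper.

```python
import re
from datetime import date
from typing import Optional

_PARTITION = re.compile(r"year=(\d{4})/month=(\d{2})/day=(\d{2})/")


def partition_date(file_name: str) -> Optional[date]:
    """Extract the partition date from an S3 key or stage-relative path."""
    match = _PARTITION.search(file_name)
    if match is None:
        return None
    year, month, day = (int(g) for g in match.groups())
    try:
        return date(year, month, day)
    except ValueError:
        return None   # e.g. month=13 in a malformed key


print(partition_date("events/year=2026/month=07/day=07/example-object.parquet"))
```

Keep in mind that the Firehose `!{timestamp:...}` prefix reflects arrival time in UTC, so the partition date can differ from the event's own timestamp near midnight or for late-arriving events.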

7. In-Warehouse Transformation: Streams and Tasks

Raw landing tables are not analytics. Snowflake streams (change-data-capture offsets) plus tasks (scheduled SQL) complete the pipeline without any external orchestrator:

-- CDC offset over the landing table
CREATE OR REPLACE STREAM raw.evt_stream ON TABLE raw.raw_events;

-- Scheduled task: skipped whenever the stream is empty. It runs on the
-- user-managed warehouse below; omit WAREHOUSE to make it a serverless task.
CREATE OR REPLACE TASK raw.evt_task
  WAREHOUSE = transform_wh          -- XS, auto-suspend 60 s
  SCHEDULE  = '1 MINUTE'
  WHEN SYSTEM$STREAM_HAS_DATA('raw.evt_stream')
AS
MERGE INTO analytics.fct_events t
USING (
  SELECT event_id,
         TO_TIMESTAMP_NTZ(timestamp, 3)           AS event_ts,   -- scale 3 = epoch milliseconds
         user_id,
         source,
         payload:screen::STRING                   AS screen,
         payload:action::STRING                   AS action,
         payload:duration_ms::INT                 AS duration_ms,
         payload:app_version::STRING              AS app_version
  FROM raw.evt_stream
  QUALIFY ROW_NUMBER() OVER (PARTITION BY event_id ORDER BY _loaded_at DESC) = 1
) s
ON t.event_id = s.event_id
WHEN NOT MATCHED THEN INSERT (event_id, event_ts, user_id, source, screen,
                              action, duration_ms, app_version)
VALUES (s.event_id, s.event_ts, s.user_id, s.source, s.screen,
        s.action, s.duration_ms, s.app_version);

ALTER TASK raw.evt_task RESUME;

The QUALIFY ROW_NUMBER() line removes duplicates within a batch, and the MERGE's WHEN NOT MATCHED clause removes them across batches. Together they are the idempotency guarantee: replays and Firehose retries can deliver duplicate events, and the MERGE-plus-dedup pattern makes reprocessing safe by construction. Never build a pipeline whose correctness depends on exactly-once delivery; build one where duplicates are harmless.
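
The idempotency property is easy to demonstrate outside the warehouse. The following sketch imitates the task's logic with a plain dict standing in for `analytics.fct_events`; it is an illustration of the pattern, not a model of Snowflake's MERGE semantics in general, and the sample rows are invented.

```python
from typing import Dict, Iterable


def merge_events(target: Dict[str, dict], batch: Iterable[dict]) -> Dict[str, dict]:
    """Dedup the batch by event_id, then insert only ids not already present."""
    latest: Dict[str, dict] = {}
    for row in batch:
        seen = latest.get(row["event_id"])
        # Keep the most recently loaded copy, like ORDER BY _loaded_at DESC
        if seen is None or row["_loaded_at"] > seen["_loaded_at"]:
            latest[row["event_id"]] = row
    for event_id, row in latest.items():
        target.setdefault(event_id, row)   # WHEN NOT MATCHED THEN INSERT
    return target


batch = [
    {"event_id": "evt_a", "_loaded_at": 1, "action": "view"},
    {"event_id": "evt_a", "_loaded_at": 2, "action": "view"},   # duplicate delivery
    {"event_id": "evt_b", "_loaded_at": 1, "action": "click"},
]
table: Dict[str, dict] = {}
merge_events(table, batch)
first_pass = dict(table)
merge_events(table, batch)       # replay the same batch
print(table == first_pass)
```

Replaying the batch leaves the table unchanged, which is the behavior a FORCE = TRUE backfill relies on.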

8. Hands-On Practical: Build, Load-Test, and Replay

8.1 Practical A — Deploy and Smoke-Test (≈ 45 minutes)

Apply the Terraform from sections 3–5 (terraform init && terraform apply). Outputs: API endpoint, bucket name, stream name.
Run the Snowflake SQL from sections 6–7 as ACCOUNTADMIN, pasting the role ARN / ExternalId between the two systems.
Confirm the pipe is listening: SELECT SYSTEM$PIPE_STATUS('telemetry_pipe'); → "executionState":"RUNNING".

Send one event:

curl -X POST "https://d1234.cloudfront.net/v1/events" \
  -H "Content-Type: application/json" \
  -H "x-api-key: $PRODUCER_KEY" \
  -d '{"schema_version":1,"event_id":"evt_0000000001",
       "timestamp":'$(date +%s000)',"user_id":"u-123",
       "source":"web","payload":{"screen":"home","action":"view",
       "duration_ms":1200,"app_version":"3.14.0"}}'
# → {"accepted": 1, "rejected": 0}

Wait one buffer window (or lower buffering_interval to 60 s for the lab), then verify end to end:

SELECT COUNT(*), MAX(_loaded_at) FROM raw.raw_events;
SELECT * FROM analytics.fct_events ORDER BY event_ts DESC LIMIT 5;

8.2 Practical B — Load Test the Ingestion Path

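# vegeta: 2,000 rps for 5 minutes against the edge.
# The WAF rule from section 3 blocks any single IP above 2,000 requests per
# 5 minutes, so exempt the load generator's IP (or raise the limit) for this test.
# event.json holds one fixed event, so every request repeats the same event_id
# and the downstream MERGE keeps a single row in fct_events.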
echo "POST https://d1234.cloudfront.net/v1/events" | \
vegeta attack -rate=2000 -duration=300s \
  -header "Content-Type: application/json" \
  -header "x-api-key: $PRODUCER_KEY" \
  -body event.json | vegeta report

# Expected on a default-quota account:
# Latencies  p50=38ms  p95=87ms  p99=140ms
# Success    100.00%
# Watch during the run:
#  - Lambda ConcurrentExecutions (should plateau well under limit)
#  - Firehose IncomingBytes vs DeliveryToS3.Bytes (delivery keeps pace)
#  - API Gateway 4XX (should be ~0; if 429s appear, raise usage plan)

8.3 Practical C — Simulate a Warehouse Outage and Replay

Pause the pipe: ALTER PIPE telemetry_pipe SET PIPE_EXECUTION_PAUSED = TRUE;
Keep the load test running 10 minutes. Observe: clients still get 202; Parquet files accumulate in S3. This is the zero-downtime property, live.
Resume: ALTER PIPE telemetry_pipe SET PIPE_EXECUTION_PAUSED = FALSE; — Snowpipe drains the SQS backlog automatically. Verify with SELECT * FROM TABLE(INFORMATION_SCHEMA.PIPE_USAGE_HISTORY(...));

Backfill an arbitrary historical partition (replay drill):

COPY INTO raw.raw_events (schema_version, event_id, timestamp, user_id,
                          source, payload, ingested_at, _file_name)
FROM (SELECT $1:schema_version, $1:event_id, $1:timestamp, $1:user_id,
             $1:source, PARSE_JSON($1:payload), $1:ingested_at,
             METADATA$FILENAME
      FROM @s3_stage/year=2026/month=07/day=01/)
FORCE = TRUE;   -- deliberate; downstream MERGE dedups by event_id
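
Backfills usually span more than one day, and the stage path has to be repeated per partition. A small sketch that generates those paths for an inclusive date range, assuming the stage name and prefix layout shown above; `stage_paths` is a hypothetical helper, and running the COPY statements is left to whatever client you already use.

```python
from datetime import date, timedelta
from typing import Iterator


def stage_paths(start: date, end: date, stage: str = "@s3_stage") -> Iterator[str]:
    """Yield one stage path per day partition, inclusive of both endpoints."""
    if end < start:
        raise ValueError("end must not be before start")
    day = start
    while day <= end:
        yield f"{stage}/year={day:%Y}/month={day:%m}/day={day:%d}/"
        day += timedelta(days=1)


paths = list(stage_paths(date(2026, 6, 29), date(2026, 7, 1)))
for path in paths:
    print(path)
```

Loading one day per COPY statement keeps each replay small enough to validate before moving on to the next partition.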

9. Security, Privacy, and PII Controls

The ingestion layer must treat every payload as untrusted. The controls, in order of the request path:

Layer	Control	Threat mitigated
WAF	Managed rules + rate limit + size constraint	DDoS, injection probes, abuse
API Gateway	Per-producer API keys + usage plans	Noisy-neighbor producers, key leakage blast radius
Lambda	Versioned schema + field allow-list + identifier hashing	PII leakage, schema drift, poisoned analytics
S3	SSE-KMS, bucket policy denying non-TLS, raw/ prefix restricted	Data-at-rest exposure, insider over-access
Snowflake	ExternalId-scoped integration, RBAC on schemas, masking policies on user_id	Confused deputy, analyst over-privilege

A practical implementation keeps the three S3 prefixes separated by IAM: access to raw/ should be dramatically narrower than access to curated analytics tables, because raw telemetry often carries more compliance risk than anyone intended it to. The schema is a security boundary, not a developer convenience.

10. Operations, Monitoring, and Cost Engineering

10.1 The Alarms That Matter

Firehose DeliveryToS3.Success < 99%: IAM regression, KMS key policy change, or bucket policy drift.
DLQ ApproximateNumberOfMessagesVisible > 0: Firehose is rejecting or throttling some of the validator's writes; check delivery stream quotas and the validator role.
errors/ prefix object count > 0: format conversion failures mean the Glue schema and reality have diverged.
Snowpipe lag: alert when SYSTEM$PIPE_STATUS reports a pendingFileCount above zero for more than 30 min.
Task failures: TASK_HISTORY() state = 'FAILED' means the stream offset is holding and marts are stale.
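
A pipe health check can be built on the JSON string that SYSTEM$PIPE_STATUS returns. The sketch below reads only the `executionState` and `pendingFileCount` fields and assumes the string has already been fetched by a Snowflake client. The sample payloads are illustrative rather than captured from a real account, and `pipe_alerts` is a hypothetical helper.

```python
import json
from typing import List


def pipe_alerts(status_json: str, max_pending: int = 0) -> List[str]:
    """Turn a SYSTEM$PIPE_STATUS result into a list of alert messages."""
    status = json.loads(status_json)
    alerts = []
    state = status.get("executionState")
    if state != "RUNNING":
        alerts.append(f"pipe state is {state}")
    pending = int(status.get("pendingFileCount", 0))
    if pending > max_pending:
        alerts.append(f"{pending} files pending")
    return alerts


# Illustrative payloads only
healthy = '{"executionState": "RUNNING", "pendingFileCount": 0}'
paused = '{"executionState": "PAUSED", "pendingFileCount": 42}'

print(pipe_alerts(healthy))
print(pipe_alerts(paused))
```

A briefly non-zero pending count is normal while a batch is loading, so the "for more than 30 minutes" part belongs in the alerting system that polls this check, not in the check itself.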

10.2 The "Small File" Cost Trap — With the Math

Snowpipe charges ≈ 0.06 credits per 1,000 files loaded, on top of per-second serverless compute. File count, not data volume, dominates cost at high flush frequency:

Firehose buffer	Files/day (1 stream)	File-overhead credits/mo	Relative cost
60 s flush	1,440	≈ 2.6	15×
300 s flush	288	≈ 0.52	3×
900 s / 128 MB	96	≈ 0.17	1× (baseline)

Larger files also improve Snowflake scan efficiency and reduce S3 PUT/GET request charges. Unless the business genuinely needs sub-minute freshness, 900 s / 128 MB is the correct default — and if it does need sub-minute freshness, evaluate Snowpipe Streaming (row-based, file-less) rather than shrinking the buffer.
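
The table's arithmetic can be reproduced in a few lines, which makes it easy to plug in your own flush interval or stream count. This sketch assumes the 0.06 credits per 1,000 files quoted above, a 30-day month, and exactly one file per flush. That last assumption is a floor: at high volume the 128 MB size trigger fires before the timer, and a flush can produce more than one object.

```python
CREDITS_PER_1000_FILES = 0.06   # figure quoted in this section
SECONDS_PER_DAY = 86_400


def file_overhead_credits(flush_seconds: int, days: int = 30, streams: int = 1) -> float:
    """Monthly Snowpipe file-overhead credits, assuming one file per flush."""
    files = (SECONDS_PER_DAY // flush_seconds) * days * streams
    return files * CREDITS_PER_1000_FILES / 1000


baseline = file_overhead_credits(900)
for flush in (60, 300, 900):
    credits = file_overhead_credits(flush)
    print(f"{flush:>4} s  {credits:.2f} credits/month  {credits / baseline:.0f}x")
```

The ratio column depends only on the flush interval, so the 15x gap between 60 s and 900 s holds regardless of the per-file rate your contract specifies.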

10.3 Runbook for Common Failure Modes

When ingestion latency rises:

Check API Gateway 4xx/5xx trends to separate client payload issues from platform failures.
Inspect Lambda duration, throttles, and error logs for schema drift or dependency failures.
Validate Firehose delivery success, the errors/ prefix, and buffering pressure (IncomingBytes vs DeliveryToS3.Bytes).
Check SYSTEM$PIPE_STATUS, pipe load history, and warehouse credit usage before scaling compute.
Replay a small partition from S3 (Practical C) to determine whether the fault is ingestion, transformation, or loading.

11. FAQ

Should I use Kafka instead of Firehose?
Use Kafka (or MSK) when you need multi-consumer streaming, strict partition ordering, replayable consumer groups, or application-level stream processing. Use Firehose when the main job is reliable, buffered delivery into S3 and Snowflake with minimal operations. Firehose is a delivery service, not a streaming platform — for warehouse ingestion, that's exactly what you want.

Why keep S3 if Snowflake is the analytics target?
S3 is the replay buffer, audit trail, and cost-efficient long-term store. Snowflake is the query and serving layer. Keeping both means bad transformations are replayable, retention is cheap, and the warehouse never becomes the only copy of the truth.

What makes this zero-downtime?
Acceptance and loading are asynchronous. If Snowflake slows or a pipe pauses, Firehose and S3 absorb events at full rate; Snowpipe drains the backlog on recovery. Practical C demonstrates it live.

How do I replay or backfill?
COPY INTO from a specific partitioned prefix into the landing table with FORCE=TRUE, and let the downstream MERGE dedup by event_id. Snowpipe's own 14-day load history prevents accidental double loads of recent files.

How is schema evolution handled?
The Glue table is the contract: add nullable columns, never rename or repurpose, and carry an explicit schema_version in the envelope so the validator can enforce known versions and the warehouse can branch on it.

What end-to-end latency should I expect?
Buffer window plus Snowpipe load: roughly 2–16 minutes at the 900 s setting. For sub-minute freshness, use Snowpipe Streaming rather than shrinking Firehose buffers into the small-file trap.

How is the Snowflake↔S3 trust secured without keys?
Storage Integration role assumption with an ExternalId condition — no long-lived credentials, prefix-scoped permissions, every assumption in CloudTrail.

Can this run multi-region?
Yes: duplicate the ingest stack per region behind latency-based Route 53, land into per-region buckets, and either replicate S3 cross-region into one loading bucket or run per-region pipes into the same Snowflake account (which is itself region-pinned — choose the region closest to your primary consumers).

12. Conclusion

You now have a pipeline that validates data at the boundary (Lambda/Pydantic), buffers and optimizes storage (Firehose/Parquet at 128 MB targets), loads asynchronously with keyless trust (Storage Integration + Snowpipe), and transforms idempotently in-warehouse (streams, tasks, MERGE-dedup). It scales to millions of events per minute without a streaming cluster to operate, and every component is reproducible from the Terraform and SQL in this article. Its failure modes are designed to end in durable, replayable storage rather than in data loss.