
Svix Webhook Migration - ADR

Status: In Progress
Date: 2025-11-26
Updated: 2025-11-28
Author: System Architecture
Supersedes: Current webhook-service architecture
Related: 2025-11-07-websocket-removal-redis-events.md

Architecture Decision

Two separate webhook delivery channels serve different purposes:

| Channel | Use Case | Delivery Method |
|---|---|---|
| Direct REST | Internal consumers (miniapp, monitor) | Fire-and-forget HTTP |
| Svix | External/Public API consumers | Reliable with retries, signatures |

Rationale:

  • Internal communication doesn't need the overhead of Svix (extra network hop, cost per message)
  • External consumers need reliable delivery, signature verification, and self-service management
  • Phase 5 will migrate internal communication to Redis Streams for even lower latency

Implementation Progress

| Phase | Description | Status |
|---|---|---|
| Phase 0 | Remove webhook-service, add direct REST calls | ✅ Complete |
| Phase 1 | Add Svix SDK to emprops-api | ✅ Complete |
| Phase 2 | Integrate Svix alongside direct REST | ✅ Complete |
| Phase 3 | Make Svix required (not optional) | ✅ Complete |
| Phase 4 | Consumer App Portal + Testing | ✅ Complete |
| Phase 5 | Replace direct REST with Redis Streams (internal) | 📋 Planned |

Phase 4 Completion Notes (2025-11-28)

  • Added Consumer App Portal to Monitor app (/webhook-events page)
  • Created /api/svix-portal endpoint in Monitor for portal URL generation
  • Added svix SDK to @emp/monitor dependencies
  • Environment configuration:
    • SVIX_APP_ID (required) - maps to SVIX_MASTER_APP_ID
    • SVIX_AUTH_TOKEN (secret) - Svix API token
  • Portal embedded as iframe with tabs UI (Events | Manage Endpoints)
  • Unit tests passing: 24 Svix-related tests (17 in svix-webhook-service.test.ts, 7 in miniapp-webhook.test.ts)
  • Live tested: Svix connection working, test message sent successfully

Phase 0 Completion Notes (2025-11-27)

  • Removed apps/webhook-service/ directory
  • Implemented direct REST webhook calls in apps/emprops-api/src/lib/miniapp-webhook.ts
  • Webhook payload format matches legacy webhook-service format
  • Monitor app receives and displays webhook events at /webhook-events
  • Both workflow_completed and workflow_failed events working

Phase 1 & 2 Completion Notes (2025-11-27)

  • Added svix SDK to emprops-api dependencies
  • Created apps/emprops-api/src/lib/svix-webhook-service.ts:
    • SvixWebhookService class with sendWorkflowCompleted() and sendWorkflowFailed()
    • Application and endpoint management helpers
  • Updated miniapp-webhook.ts to deliver via both channels:
    • Direct REST for internal consumers (miniapp, monitor)
    • Svix for external API consumers
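
A minimal sketch of the dual-channel delivery described above. The `postInternal` and `sendExternal` callbacks are hypothetical stand-ins for the direct REST call and the Svix send; none of these names are taken from miniapp-webhook.ts. The point is only that a failure on one channel should not block the other.

```typescript
// Hypothetical event shape; the real payload is defined in miniapp-webhook.ts.
interface WorkflowEvent {
  workflow_id: string;
  status: "completed" | "failed";
}

// Each channel is injected, so the sketch needs neither an HTTP client nor the Svix SDK.
type Channel = (event: WorkflowEvent) => Promise<void>;

interface DeliveryReport {
  internalOk: boolean;
  externalOk: boolean;
}

// Run one channel and convert any thrown error into a boolean outcome.
async function attempt(channel: Channel, event: WorkflowEvent): Promise<boolean> {
  try {
    await channel(event);
    return true;
  } catch {
    return false;
  }
}

// Deliver to both channels concurrently; each outcome is reported separately.
async function deliverToBothChannels(
  event: WorkflowEvent,
  postInternal: Channel,
  sendExternal: Channel,
): Promise<DeliveryReport> {
  const [internalOk, externalOk] = await Promise.all([
    attempt(postInternal, event),
    attempt(sendExternal, event),
  ]);
  return { internalOk, externalOk };
}
```

A caller could log the report and alert only on `externalOk === false`, since the internal channel is fire-and-forget by design.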

Phase 3 Completion Notes (2025-11-27)

  • Made Svix required infrastructure (throws on startup if not configured)
  • Environment variables in config/environments/services/emprops-api.interface.ts:
    • SVIX_AUTH_TOKEN (secret, required) - Svix API token
    • SVIX_APP_ID (required) - Svix application identifier
    • SVIX_SERVER_URL (optional) - For self-hosted Svix
  • Removed isEnabled() checks - Svix is always enabled
  • Added comprehensive unit tests
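
The fail-fast startup check can be sketched as below. The function name `loadSvixConfig` and the `SvixConfig` shape are assumptions for illustration; only the three environment variable names come from this ADR.

```typescript
interface SvixConfig {
  authToken: string;
  appId: string;
  serverUrl?: string; // only set when pointing at a self-hosted Svix
}

// Read Svix settings from an env-like record and throw when a required one is missing.
function loadSvixConfig(env: Record<string, string | undefined>): SvixConfig {
  const required = ["SVIX_AUTH_TOKEN", "SVIX_APP_ID"];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required Svix configuration: ${missing.join(", ")}`);
  }
  return {
    authToken: env["SVIX_AUTH_TOKEN"] as string,
    appId: env["SVIX_APP_ID"] as string,
    serverUrl: env["SVIX_SERVER_URL"] || undefined,
  };
}
```

Calling `loadSvixConfig(process.env)` once during service startup would surface a misconfiguration before any request is served.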

Phase 5 Plan: Redis Streams for Internal Communication

Goal: Replace direct REST calls AND WebSocket connections with Redis Streams for internal consumers

Updated: 2025-11-28

Architecture Decision

The miniapp will directly consume Redis Streams published by the Job-Q API (worker). This eliminates:

  1. WebSocket connection between EmProps API ↔ Job-Q API
  2. EmProps API as a middleman for completion events
  3. The current pub/sub relay chain

New Architecture

┌─────────────┐     REST      ┌─────────────┐     REST      ┌─────────────┐
│   Miniapp   │ ───────────── │ EmProps API │ ───────────── │  Job-Q API  │
└─────────────┘               └─────────────┘               └─────────────┘
       │                                                           │
       │                                                           │
       │                         Redis                             │
       │                    ┌─────────────┐                        │
       │                    │  Streams    │                        │
       │                    └─────────────┘                        │
       │                           │                               │
       │    XREADGROUP             │         XADD                  │
       │    workflow:events:{id}   │         workflow:events:{id}  │
       └───────────────────────────┴───────────────────────────────┘

The flow:

  1. Miniapp → EmProps API (REST): Submit workflow request
  2. EmProps API → Job-Q API (REST): Forward job to queue
  3. Worker completes → writes to DB → XADD workflow:events:{workflow_id}
  4. Miniapp (XREADGROUP): Directly consumes the workflow:events:{workflow_id} stream

Key Design Decision: Post-DB Persistence Events

Events are published to Redis Streams after DB persistence:

Worker completes → Redis HMSET (persist) → XADD workflow:events:{workflow_id} (notify)

This ensures:

  • What user sees matches what's in DB (consistency)
  • No "ghost results" if DB write fails
  • Clear source of truth
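
The ordering guarantee can be sketched as follows. `ResultStore` and `EventStream` are hypothetical interfaces standing in for the worker's persistence call and its XADD call; they are not existing code.

```typescript
// Hypothetical interfaces; the worker's real Redis calls would sit behind them.
interface ResultStore {
  persist(workflowId: string, result: string): Promise<void>;
}
interface EventStream {
  append(streamKey: string, fields: Record<string, string>): Promise<void>;
}

// Persist first, then notify. If persist() rejects, the error propagates and
// append() is never reached, so no event exists without a stored result.
async function completeWorkflow(
  workflowId: string,
  result: string,
  store: ResultStore,
  stream: EventStream,
): Promise<void> {
  await store.persist(workflowId, result);
  await stream.append(`workflow:events:${workflowId}`, {
    event_type: "workflow.completed",
    workflow_id: workflowId,
    status: "completed",
    result,
    completed_at: new Date().toISOString(),
  });
}
```

The reverse failure (persist succeeds, append fails) leaves a stored result with no notification, so a consumer would still need a fallback such as reading the stored status on timeout.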

Stream Structure

# Per-workflow completion stream
workflow:events:{workflow_id}
  - entry_id: "*" (auto-generated)
  - fields:
    - event_type: "workflow.completed" | "workflow.failed"
    - workflow_id: string
    - status: "completed" | "failed"
    - result: JSON string (output URLs, etc.)
    - completed_at: ISO timestamp
    - error_message?: string (for failures)

# Optional: Progress stream (separate from completion)
workflow:progress:{workflow_id}
  - entry_id: "*"
  - fields:
    - progress: number (0-100)
    - message: string
    - step: number
    - total_steps: number
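
Redis stream entries are flat lists of field/value strings, so the completion event needs an encode and a decode step. A minimal sketch, assuming the field names listed above; the type and function names are illustrative only.

```typescript
interface WorkflowStreamEvent {
  event_type: "workflow.completed" | "workflow.failed";
  workflow_id: string;
  status: "completed" | "failed";
  result: unknown;        // stored as a JSON string in the stream
  completed_at: string;   // ISO timestamp
  error_message?: string; // only present for failures
}

// Flatten an event into the [field, value, field, value, ...] list that XADD takes.
function toStreamFields(event: WorkflowStreamEvent): string[] {
  const fields: string[] = [
    "event_type", event.event_type,
    "workflow_id", event.workflow_id,
    "status", event.status,
    "result", JSON.stringify(event.result ?? null),
    "completed_at", event.completed_at,
  ];
  if (event.error_message !== undefined) {
    fields.push("error_message", event.error_message);
  }
  return fields;
}

// Rebuild an event from the flat list that a stream read returns for one entry.
function fromStreamFields(flat: string[]): WorkflowStreamEvent {
  const map = new Map<string, string>();
  for (let i = 0; i + 1 < flat.length; i += 2) {
    map.set(flat[i] as string, flat[i + 1] as string);
  }
  const eventType = map.get("event_type");
  if (eventType !== "workflow.completed" && eventType !== "workflow.failed") {
    throw new Error(`Unexpected event_type: ${eventType}`);
  }
  const event: WorkflowStreamEvent = {
    event_type: eventType,
    workflow_id: map.get("workflow_id") ?? "",
    status: eventType === "workflow.completed" ? "completed" : "failed",
    result: JSON.parse(map.get("result") ?? "null"),
    completed_at: map.get("completed_at") ?? "",
  };
  const errorMessage = map.get("error_message");
  if (errorMessage !== undefined) {
    event.error_message = errorMessage;
  }
  return event;
}
```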

What Changes

EmProps API becomes stateless:

  • Pure request forwarder (no WS connections)
  • No longer relays completion events
  • Still handles Svix delivery for external API consumers

Miniapp gains:

  • Direct Redis stream subscription
  • Real-time completion events
  • Can optionally consume progress stream

Job-Q API/Worker:

  • Publishes to per-workflow stream after DB write
  • Same pattern as current redis.publish('complete_job', ...) but with streams

Benefits

  • Sub-millisecond latency (vs 50-200ms for HTTP)
  • Built-in persistence (events survive restarts)
  • Consumer groups for load balancing across miniapp instances
  • Already have Redis infrastructure
  • Simpler architecture (remove WS layer)

Implementation Steps

  1. Add XADD workflow:events:{workflow_id} after DB write in worker
  2. Create stream consumer in miniapp (XREADGROUP with block)
  3. Remove WebSocket subscription code from EmProps API
  4. Remove direct REST webhook calls to miniapp
  5. Keep Svix for external API consumers (unchanged)
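
Step 2 can be sketched as a single read-handle-acknowledge pass. `StreamClient` is a hypothetical wrapper: its `readGroup` would issue XREADGROUP with BLOCK and the `>` ID, and `ack` would issue XACK. The consumer group must already exist on the stream (for example via XGROUP CREATE with MKSTREAM) before the first read.

```typescript
// Raw Redis reply for XREADGROUP: [[streamKey, [[entryId, [field, value, ...]], ...]], ...],
// or null when the BLOCK timeout expires with no new entries.
type StreamEntry = [string, string[]];
type XReadGroupReply = Array<[string, StreamEntry[]]> | null;

// Hypothetical wrapper around the Redis client used by the miniapp.
interface StreamClient {
  readGroup(group: string, consumer: string, streamKey: string, blockMs: number): Promise<XReadGroupReply>;
  ack(streamKey: string, group: string, entryId: string): Promise<void>;
}

type EventHandler = (fields: Record<string, string>) => Promise<void>;

// Read one batch for a workflow, pass each entry to the handler, and
// acknowledge it only after the handler resolves. Returns the number handled.
async function consumeOnce(
  client: StreamClient,
  group: string,
  consumer: string,
  workflowId: string,
  onEvent: EventHandler,
): Promise<number> {
  const streamKey = `workflow:events:${workflowId}`;
  const reply = await client.readGroup(group, consumer, streamKey, 5000);
  if (reply === null) return 0; // timed out with nothing to read

  let handled = 0;
  for (const [, entries] of reply) {
    for (const [entryId, flat] of entries) {
      const fields: Record<string, string> = {};
      for (let i = 0; i + 1 < flat.length; i += 2) {
        fields[flat[i] as string] = flat[i + 1] as string;
      }
      await onEvent(fields);
      await client.ack(streamKey, group, entryId);
      handled += 1;
    }
  }
  return handled;
}
```

Because the acknowledgement follows the handler, an entry whose handler throws stays pending in the group and can be claimed again later.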

Context

The emp-job-queue system currently uses a custom Redis-based webhook service (apps/webhook-service/) for delivering webhook notifications to external consumers. This creates several architectural issues that this ADR addresses.

Current Architecture

┌────────────────────────────────────────────────────────────────────┐
│                       Mini-app (Farcaster Frame)                    │
│  - Submits workflow request to emprops-api                         │
│  - Waits for webhook callback with results                         │
└────────────────────────────────────────────────────────────────────┘

┌────────────────────────────────────────────────────────────────────┐
│                    emprops-api (Next.js)                            │
│  - Entry point for mini-app requests                               │
│  - Submits workflow to emp-job-queue                               │
│  - Stores output when workflow completes                           │
│  - Has full context of original request                            │
└────────────────────────────────────────────────────────────────────┘

┌────────────────────────────────────────────────────────────────────┐
│                  lightweight-api-server (emp-job-queue)             │
│  ┌──────────────────────────────────────────────────────────────┐ │
│  │  Workflow Completion Detection                                │ │
│  │  - Monitors job completions                                   │ │
│  │  - Queries EmProps API to verify workflow status              │ │
│  │  - Publishes workflow_completed to Redis                      │ │
│  └──────────────────────────────────────────────────────────────┘ │
└────────────────────────────────────────────────────────────────────┘

                    Redis Pub/Sub Events
                    (workflow_completed, job_failed, etc.)

┌────────────────────────────────────────────────────────────────────┐
│                    webhook-service                                  │
│  ┌──────────────────────────────────────────────────────────────┐ │
│  │  WebhookProcessor (Redis Subscriber)                          │ │
│  │  - Subscribes to: job_submitted, complete_job, job_failed    │ │
│  │  - Subscribes to: workflow_submitted, workflow_completed     │ │
│  └──────────────────────────────────────────────────────────────┘ │
│                                ↓                                    │
│  ┌──────────────────────────────────────────────────────────────┐ │
│  │  WebhookNotificationService                                   │ │
│  │  - Filters events by registered webhooks                     │ │
│  │  - Creates webhook payloads                                   │ │
│  │  - Manages in-memory delivery queue                          │ │
│  │  - Fire-and-forget HTTP delivery with retries                │ │
│  └──────────────────────────────────────────────────────────────┘ │
│                                ↓                                    │
│  ┌──────────────────────────────────────────────────────────────┐ │
│  │  HTTP Delivery                                                │ │
│  │  - 3 retry attempts with exponential backoff                 │ │
│  │  - Auto-disconnect after 10 consecutive failures             │ │
│  │  - In-memory queue (lost on restart)                         │ │
│  └──────────────────────────────────────────────────────────────┘ │
└────────────────────────────────────────────────────────────────────┘

                    External Webhook Endpoints
                    (Mini-app callback URL)

Current Problems

1. Wrong Place for Webhook Trigger

  • Webhooks are triggered from emp-job-queue (lightweight-api-server) via Redis pub/sub
  • But emprops-api is the entry point - it has the original request context
  • emprops-api already knows when outputs are saved and available
  • Current flow adds unnecessary indirection through Redis events

2. Redundant Workflow Completion Checking in emp-job-queue

  • emp-job-queue monitors job completions and queries EmProps API to verify workflow status
  • This "checking work" exists solely to trigger Redis events for webhook-service
  • emprops-api already has this information when it saves the output
  • Triggering webhooks from emprops-api would eliminate this unnecessary polling and the Redis pub/sub hop

3. In-Memory Queue = Data Loss

  • Webhook delivery queue is in-memory (not persisted)
  • Server restart loses all pending webhooks
  • No dead-letter queue for failed deliveries
  • No replay capability for missed events

4. Limited Retry and Monitoring

  • 3 retry attempts with basic exponential backoff
  • No webhook delivery dashboard
  • No event history or audit trail
  • Manual debugging required for delivery failures

5. No Built-in Features

  • No signature verification for security
  • No rate limiting per endpoint
  • No event type filtering UI
  • No customer-facing webhook portal

6. Operational Overhead

  • Separate webhook-service to deploy and monitor
  • Custom Redis storage patterns for webhook configs
  • Custom metrics and alerting required

Current Webhook Payload Format

typescript
interface WebhookPayload {
  event_type: WebhookEventType;           // 'job_submitted', 'workflow_completed', etc.
  event_id: string;                       // Unique ID: wh_[timestamp]_[random]
  timestamp: number;                      // Unix ms
  webhook_id: string;                     // Registered webhook ID
  data: {
    job_id?: string;
    job_type?: string;
    job_status?: JobStatus;
    worker_id?: string;
    machine_id?: string;
    progress?: number;
    result?: unknown;
    error?: string;
    workflow_id?: string;
    workflow_priority?: number;
    workflow_datetime?: number;
    total_steps?: number;
    current_step?: number;
  };
  metadata?: {
    retry_attempt?: number;
    original_timestamp?: number;
  };
  parent_trace_context?: {
    trace_id?: string;
    span_id?: string;
    traceparent?: string;
  };
}

Current Event Types

  • job_submitted - New job added to queue
  • update_job_progress - Job progress update (0-100%)
  • complete_job - Job completed successfully
  • job_failed - Job failed with error
  • cancel_job - Job was cancelled
  • workflow_submitted - New workflow started
  • workflow_completed - All workflow steps completed
  • workflow_failed - Workflow failed

Decision

Replace the custom webhook-service with Svix, triggered directly from emprops-api when outputs are saved and available. Remove workflow completion checking logic from emp-job-queue that exists solely for Redis-based webhook triggering.

Key Changes

  1. Remove webhook-service entirely - No more custom webhook infrastructure
  2. Call Svix directly from emprops-api - When outputs are saved, send webhook immediately
  3. Remove Redis pub/sub hop for webhooks - No more publishing events just for webhook-service to consume
  4. Remove workflow completion checking from emp-job-queue - emprops-api knows when outputs are ready

Why emprops-api is the Right Place

  1. Entry point for mini-app - emprops-api receives the original request with full context
  2. Knows when outputs are saved - emprops-api stores workflow outputs, so it knows exactly when they're available
  3. Has original request context - Can include miniapp_user_id, callback URL, and original parameters
  4. Simpler flow - No Redis pub/sub, no separate webhook service, no polling

Proposed Architecture

┌────────────────────────────────────────────────────────────────────┐
│                       Mini-app (Farcaster Frame)                    │
│  - Submits workflow request to emprops-api                         │
│  - Receives webhook callback with results                          │
└────────────────────────────────────────────────────────────────────┘
                          ↓                              ↑
                    Submit Request              Svix Webhook Delivery
                          ↓                              ↑
┌────────────────────────────────────────────────────────────────────┐
│                    emprops-api (Next.js)                            │
│  ┌──────────────────────────────────────────────────────────────┐ │
│  │  Workflow Request Handler                                     │ │
│  │  - Receives mini-app request                                 │ │
│  │  - Submits workflow to emp-job-queue                         │ │
│  │  - Stores original request context                           │ │
│  └──────────────────────────────────────────────────────────────┘ │
│                                ↓                                    │
│  ┌──────────────────────────────────────────────────────────────┐ │
│  │  Output Handler (when workflow completes)                     │ │
│  │  - Receives output from emp-job-queue                        │ │
│  │  - Saves output to storage                                   │ │
│  │  - Calls Svix SDK to send webhook ← NEW                      │ │
│  └──────────────────────────────────────────────────────────────┘ │
│                                ↓                                    │
│  ┌──────────────────────────────────────────────────────────────┐ │
│  │  SvixWebhookService (NEW)                                     │ │
│  │  - Svix SDK client initialization                            │ │
│  │  - sendWorkflowCompleted(workflowId, outputs)                │ │
│  │  - sendWorkflowFailed(workflowId, error)                     │ │
│  └──────────────────────────────────────────────────────────────┘ │
└────────────────────────────────────────────────────────────────────┘
            ↓                                        ↓
    Submit workflow                           Svix API Call
            ↓                                        ↓
┌───────────────────────────┐    ┌────────────────────────────────────┐
│   emp-job-queue           │    │         Svix Platform               │
│   (lightweight-api)       │    │  - Reliable delivery with retries  │
│   - Process workflow      │    │  - Signature verification (HMAC)   │
│   - Return outputs        │    │  - Event history and replay        │
│   - NO webhook logic      │    │  - Monitoring dashboard            │
└───────────────────────────┘    └────────────────────────────────────┘

What Gets Removed

  1. apps/webhook-service/ - Entire service directory
  2. Redis pub/sub for webhooks - No more workflow_completed, job_failed events
  3. Workflow completion checking in emp-job-queue - The logic that polls EmProps API to verify workflow status and publish Redis events
  4. Redis webhook storage - webhooks:registry, webhooks:active, etc.

What Gets Added

  1. Svix SDK integration in apps/emprops-api/ (or apps/emprops-studio/)
  2. SvixWebhookService class for sending webhooks
  3. Webhook call in output handler - Right after saving output, call Svix
  4. Event type definitions in Svix dashboard

Svix Integration Design

Event Types (Svix)

Map current webhook events to Svix event types:

| Current Event | Svix Event Type | Description |
|---|---|---|
| workflow_completed | workflow.completed | Workflow finished successfully |
| workflow_failed | workflow.failed | Workflow failed with error |
| job_submitted | job.submitted | Individual job queued |
| complete_job | job.completed | Individual job finished |
| job_failed | job.failed | Individual job failed |
| update_job_progress | job.progress | Job progress update |
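
The mapping can be expressed as a lookup table. This is a sketch with illustrative names. The table gives no Svix counterpart for cancel_job or workflow_submitted, so the lookup returns `undefined` for those rather than guessing one.

```typescript
// Legacy event name -> Svix event type, exactly as listed in the table.
const SVIX_EVENT_TYPES = {
  workflow_completed: "workflow.completed",
  workflow_failed: "workflow.failed",
  job_submitted: "job.submitted",
  complete_job: "job.completed",
  job_failed: "job.failed",
  update_job_progress: "job.progress",
} as const;

type LegacyEventType = keyof typeof SVIX_EVENT_TYPES;
type SvixEventType = (typeof SVIX_EVENT_TYPES)[LegacyEventType];

// Returns undefined for any legacy event that has no entry in the table.
function toSvixEventType(legacy: string): SvixEventType | undefined {
  return Object.prototype.hasOwnProperty.call(SVIX_EVENT_TYPES, legacy)
    ? SVIX_EVENT_TYPES[legacy as LegacyEventType]
    : undefined;
}
```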

Svix Payload Format

Maintain compatibility with existing payload structure while adding Svix metadata:

typescript
// workflow.completed event
{
  "workflow_id": "wf_abc123",
  "status": "completed",
  "completed_at": "2025-11-26T10:30:00Z",
  "duration_ms": 45000,
  "total_steps": 3,
  "completed_steps": 3,
  "outputs_available": true,
  "outputs_count": 2,
  "outputs": [
    {
      "type": "image",
      "url": "https://..."
    }
  ],
  "metadata": {
    "customer_id": "cust_xyz",
    "miniapp_user_id": "user_123",
    "original_request": { ... }
  }
}

Application Model

Svix uses "Applications" to represent webhook consumers:

typescript
// One Svix Application per customer/tenant
const app = await svix.application.create({
  name: "Customer ABC",
  uid: "customer_abc_123",  // Our internal customer ID
  metadata: {
    environment: "production",
    tier: "premium"
  }
});

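// Customers manage their own endpoints via App Portal.
// appPortalAccess() resolves to an object; the portal link is its `url` field.
const portalAccess = await svix.authentication.appPortalAccess(
  "customer_abc_123",
  { featureFlags: ["read-only"] }
);
const portalUrl = portalAccess.url;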

SvixWebhookService Implementation

typescript
// apps/emprops-api/src/lib/svix-webhook-service.ts
// (or apps/emprops-studio/src/lib/svix-webhook-service.ts)

import { Svix } from 'svix';

interface WorkflowCompletedPayload {
  workflow_id: string;
  status: 'completed';
  completed_at: string;
  duration_ms?: number;
  outputs: Array<{
    type: string;
    url: string;
    filename?: string;
  }>;
  metadata: {
    miniapp_user_id: string;
    original_request: unknown;
  };
}

interface WorkflowFailedPayload {
  workflow_id: string;
  status: 'failed';
  failed_at: string;
  error: string;
  error_code?: string;
  metadata: {
    miniapp_user_id: string;
    original_request?: unknown;
  };
}

export class SvixWebhookService {
  private svix: Svix;

  constructor() {
    const token = process.env.SVIX_AUTH_TOKEN;
    if (!token) {
      throw new Error('SVIX_AUTH_TOKEN environment variable is required');
    }

    this.svix = new Svix(token, {
      serverUrl: process.env.SVIX_SERVER_URL // Optional: for self-hosted Svix
    });
  }

  /**
   * Send workflow.completed webhook when outputs are saved
   * Called from emprops-api output handler
   */
  async sendWorkflowCompleted(
    applicationId: string,
    payload: WorkflowCompletedPayload
  ): Promise<void> {
    await this.svix.message.create(applicationId, {
      eventType: 'workflow.completed',
      payload: payload,
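      // Idempotency key to prevent duplicate webhooks on retry.
      // It must be identical on every retry, so it is derived from the
      // workflow ID alone (a timestamp suffix would change on each attempt).
      eventId: `wfc_${payload.workflow_id}`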
    });
  }

  /**
   * Send workflow.failed webhook when workflow fails
   */
  async sendWorkflowFailed(
    applicationId: string,
    payload: WorkflowFailedPayload
  ): Promise<void> {
    await this.svix.message.create(applicationId, {
      eventType: 'workflow.failed',
      payload: payload,
      // Stable across retries so it can act as an idempotency key
      eventId: `wff_${payload.workflow_id}`
    });
  }

  /**
   * Ensure application exists for mini-app
   * Call once during setup, not per-request
   */
  async ensureApplication(appId: string, name: string): Promise<void> {
    await this.svix.application.getOrCreate({
      name: name,
      uid: appId
    });
  }

  /**
   * Create endpoint for mini-app webhook callback
   * Call once during mini-app registration
   */
  async createEndpoint(
    appId: string,
    url: string,
    filterTypes: string[] = ['workflow.completed', 'workflow.failed']
  ): Promise<void> {
    await this.svix.endpoint.create(appId, {
      url: url,
      filterTypes: filterTypes,
      description: 'Mini-app webhook callback'
    });
  }
}

// Singleton instance
let svixService: SvixWebhookService | null = null;

export function getSvixWebhookService(): SvixWebhookService {
  if (!svixService) {
    svixService = new SvixWebhookService();
  }
  return svixService;
}
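
An event ID only deduplicates if it is the same on every retry. The sketch below derives it from the workflow ID and treats a duplicate-ID rejection as "already sent". `MessageSender` is a hypothetical slice of the message API, injected so the sketch runs without the SDK. It is an assumption here that the client signals a duplicate with an error carrying a numeric `code` of 409; check this against the SDK version in use.

```typescript
// Minimal slice of the message API needed for this sketch.
interface MessageSender {
  create(
    appId: string,
    message: { eventType: string; eventId: string; payload: unknown },
  ): Promise<unknown>;
}

// Assumption: duplicates are rejected with an error object whose `code` is 409.
function isDuplicateEventError(err: unknown): boolean {
  return typeof err === "object" && err !== null && (err as { code?: unknown }).code === 409;
}

// Returns "sent" on first delivery and "duplicate" when the same eventId was already accepted.
async function sendCompletedOnce(
  sender: MessageSender,
  appId: string,
  workflowId: string,
  payload: unknown,
): Promise<"sent" | "duplicate"> {
  try {
    await sender.create(appId, {
      eventType: "workflow.completed",
      eventId: `wfc_${workflowId}`, // no timestamp, so retries reuse the same ID
      payload,
    });
    return "sent";
  } catch (err) {
    if (isDuplicateEventError(err)) return "duplicate";
    throw err; // any other failure is a real error
  }
}
```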

Integration Point in emprops-api

The key integration point is in the output handler where workflow outputs are saved:

typescript
// Example: In emprops-api output handler (pseudo-code)
// This is where you save workflow outputs after completion

async function handleWorkflowOutput(workflowId: string, outputs: OutputData[]) {
  // 1. Save outputs to storage (existing logic)
  const savedOutputs = await saveOutputsToStorage(workflowId, outputs);

  // 2. Get original request context
  const workflowContext = await getWorkflowContext(workflowId);

  // 3. Send webhook via Svix (NEW)
  const svix = getSvixWebhookService();

  await svix.sendWorkflowCompleted('miniapp_emprops', {
    workflow_id: workflowId,
    status: 'completed',
    completed_at: new Date().toISOString(),
    outputs: savedOutputs.map(o => ({
      type: o.type,
      url: o.url,
      filename: o.filename
    })),
    metadata: {
      miniapp_user_id: workflowContext.miniapp_user_id,
      original_request: workflowContext.original_request
    }
  });
}

Implementation Plan

Phase 1: Svix Setup in emprops-api (1 day)

Objective: Add Svix SDK and create basic service in emprops-api.

1.1 Install Svix SDK

bash
cd apps/emprops-api  # or apps/emprops-studio
pnpm add svix

1.2 Create SvixWebhookService

File: apps/emprops-api/src/lib/svix-webhook-service.ts

  • Initialize Svix client with auth token
  • Implement sendWorkflowCompleted(), sendWorkflowFailed()
  • Simple, focused service (no OpenTelemetry initially)

1.3 Create Svix Event Types

Using Svix dashboard or CLI:

bash
svix event-type create '{ "name": "workflow.completed", "description": "Workflow finished with outputs available" }'
svix event-type create '{ "name": "workflow.failed", "description": "Workflow failed with error" }'

1.4 Environment Configuration

env
# .env.secrets.local
SVIX_AUTH_TOKEN=sk_...
SVIX_SERVER_URL=https://api.us.svix.com

1.5 Setup Mini-app Application in Svix

bash
# Create application for mini-app
svix application create '{ "name": "EmProps Mini-App", "uid": "miniapp_emprops" }'

# Create endpoint for mini-app webhook callback
svix endpoint create miniapp_emprops '{
  "url": "https://miniapp.emprops.ai/api/webhook",
  "filterTypes": ["workflow.completed", "workflow.failed"]
}'

Phase 2: Integrate Svix in Output Handler (1 day)

Objective: Call Svix when outputs are saved in emprops-api.

2.1 Identify Integration Point

Find where emprops-api saves workflow outputs. This is the exact point to call Svix:

typescript
// Pseudo-code - find actual location in emprops-api
async function saveWorkflowOutput(workflowId: string, outputs: any[]) {
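  // Existing: Save outputs to database/storage
  await db.outputs.create({ workflowId, outputs });

  // Load the original request context (getWorkflowContext is the lookup used in the integration example above)
  const workflow = await getWorkflowContext(workflowId);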

  // NEW: Send webhook via Svix
  const svix = getSvixWebhookService();
  await svix.sendWorkflowCompleted('miniapp_emprops', {
    workflow_id: workflowId,
    status: 'completed',
    completed_at: new Date().toISOString(),
    outputs: outputs.map(o => ({ type: o.type, url: o.url })),
    metadata: {
      miniapp_user_id: workflow.miniapp_user_id,
      original_request: workflow.original_request
    }
  });
}

2.2 Handle Failures

Add Svix call for workflow failures:

typescript
async function handleWorkflowFailure(workflowId: string, error: Error) {
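  // Existing: Update workflow status
  await db.workflows.update(workflowId, { status: 'failed', error: error.message });

  // Load the original request context (getWorkflowContext is the lookup used in the integration example above)
  const workflow = await getWorkflowContext(workflowId);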

  // NEW: Send failure webhook via Svix
  const svix = getSvixWebhookService();
  await svix.sendWorkflowFailed('miniapp_emprops', {
    workflow_id: workflowId,
    status: 'failed',
    failed_at: new Date().toISOString(),
    error: error.message,
    metadata: {
      miniapp_user_id: workflow.miniapp_user_id
    }
  });
}

Phase 3: Remove Workflow Checking from emp-job-queue (1 day)

Objective: Remove redundant workflow completion detection from lightweight-api-server.

3.1 Identify Code to Remove

In apps/api/src/lightweight-api-server.ts:

  1. publishWorkflowCompletion() (~line 5404)

    • This function publishes workflow_completed to Redis
    • No longer needed - emprops-api handles webhooks
  2. attemptWorkflowRecovery() (~line 5512)

    • Polls EmProps API to verify workflow status
    • No longer needed - emprops-api knows when outputs are saved
  3. handleWorkflowCompletionWithEmpropsConfirmation()

    • Retries workflow status verification
    • No longer needed
  4. Redis publish calls:

    • redis.publish('workflow_completed', ...)
    • redis.publish('workflow_failed', ...)

3.2 Keep What's Needed

emp-job-queue still needs:

  • Job processing and routing
  • Worker communication
  • Progress updates
  • Machine monitoring

Just remove the webhook-related completion checking.


Phase 4: Remove webhook-service (1 day)

Objective: Delete the entire webhook-service application.

4.1 Remove Directory

bash
rm -rf apps/webhook-service

4.2 Update Configuration

  • Remove from turbo.json
  • Remove from docker-compose.yml
  • Remove from CI/CD pipelines
  • Remove from monitoring

4.3 Clean Up Redis

bash
# Remove webhook-related keys (run in production Redis)
# --scan iterates incrementally instead of blocking the server the way KEYS does
redis-cli --scan --pattern "webhooks:*" | xargs redis-cli DEL

Phase 5: Testing (1 day)

5.1 End-to-End Test

  1. Submit workflow from mini-app
  2. Wait for processing
  3. Verify mini-app receives Svix webhook
  4. Verify payload matches expected format

5.2 Test Webhook Payload

typescript
// Expected webhook payload at mini-app
{
  "workflow_id": "wf_abc123",
  "status": "completed",
  "completed_at": "2025-11-26T10:30:00Z",
  "outputs": [
    { "type": "image", "url": "https://..." }
  ],
  "metadata": {
    "miniapp_user_id": "user_123",
    "original_request": { ... }
  }
}

5.3 Test Failure Webhook

  1. Submit workflow that will fail
  2. Verify mini-app receives workflow.failed webhook
  3. Verify error message is included

5.4 Test Svix Retry

  1. Temporarily make mini-app endpoint return 500
  2. Verify Svix retries delivery
  3. Fix endpoint and verify webhook eventually delivered

Consequences

Benefits

1. Reliable Delivery

  • Automatic retries with configurable backoff
  • Dead letter queue for failed deliveries
  • Replay capability for missed events
  • Guaranteed at-least-once delivery

2. Customer-Facing Features

  • App Portal for customers to manage endpoints
  • Event history and debugging tools
  • Signature verification built-in
  • Filtering by event type

3. Operational Simplicity

  • One less service to deploy and monitor
  • Managed infrastructure (Svix handles scaling, reliability)
  • Built-in dashboard for monitoring
  • API logs for debugging

4. Security

  • HMAC signatures on all webhooks
  • Secret rotation support
  • IP allowlisting options
  • Rate limiting per endpoint
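
On the consumer side, signature verification can be sketched as below. It follows Svix's documented manual scheme: HMAC-SHA256 over `{svix-id}.{svix-timestamp}.{body}`, keyed with the base64-decoded part of the `whsec_` secret, and compared against the `v1,<base64>` entries in the `svix-signature` header. The function names and the 300-second tolerance default are assumptions for this sketch. Production code would more likely use the `Webhook` class from the svix package.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Compute the base64 signature for one delivery.
function signSvixPayload(secret: string, msgId: string, timestamp: string, body: string): string {
  const key = Buffer.from(secret.replace(/^whsec_/, ""), "base64");
  return createHmac("sha256", key).update(`${msgId}.${timestamp}.${body}`).digest("base64");
}

interface SvixHeaders {
  id: string;        // svix-id
  timestamp: string; // svix-timestamp (Unix seconds)
  signature: string; // svix-signature (space-separated "v1,<base64>" entries)
}

// Returns true when the timestamp is fresh and at least one v1 signature matches.
function verifySvixSignature(
  secret: string,
  headers: SvixHeaders,
  body: string,
  nowSeconds: number,
  toleranceSeconds: number = 300,
): boolean {
  const ts = Number(headers.timestamp);
  if (!Number.isFinite(ts) || Math.abs(nowSeconds - ts) > toleranceSeconds) {
    return false; // reject stale or malformed timestamps to limit replay
  }
  const expected = Buffer.from(signSvixPayload(secret, headers.id, headers.timestamp, body));
  return headers.signature.split(" ").some((entry) => {
    const [version, sig] = entry.split(",");
    if (version !== "v1" || sig === undefined) return false;
    const candidate = Buffer.from(sig);
    // timingSafeEqual requires equal lengths, so check that first.
    return candidate.length === expected.length && timingSafeEqual(candidate, expected);
  });
}
```

The body must be the raw request body exactly as received, since re-serializing parsed JSON can change the bytes and invalidate the signature.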

5. Architecture Simplification

  • Remove Redis pub/sub hop for webhooks
  • Direct Svix call from emprops-api - Right where outputs are saved
  • Remove workflow checking from emp-job-queue - No more polling EmProps API
  • Clear responsibility - emprops-api handles webhooks, emp-job-queue handles job processing

Drawbacks

1. External Dependency

Issue: Svix becomes a critical dependency for webhook delivery.

Mitigation:

  • Svix has 99.99% uptime SLA
  • Can self-host Svix if needed
  • Implement circuit breaker for Svix calls
  • Queue webhook calls if Svix unavailable
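
The circuit breaker mitigation can be sketched as a small state machine. The class name, thresholds, and the `"skipped"` return value are assumptions for illustration. Time is passed in explicitly so the behavior is easy to test.

```typescript
type BreakerState = "closed" | "open" | "half-open";

class CircuitBreaker {
  private failures = 0;
  private openedAt: number | null = null;
  private readonly failureThreshold: number;
  private readonly cooldownMs: number;

  constructor(failureThreshold: number, cooldownMs: number) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
  }

  state(now: number): BreakerState {
    if (this.openedAt === null) return "closed";
    return now - this.openedAt >= this.cooldownMs ? "half-open" : "open";
  }

  // Calls are allowed when closed, and again once the cooldown has elapsed.
  canAttempt(now: number): boolean {
    return this.state(now) !== "open";
  }

  recordSuccess(): void {
    this.failures = 0;
    this.openedAt = null;
  }

  // Opens the breaker at the threshold; a failed half-open probe re-opens it.
  recordFailure(now: number): void {
    this.failures += 1;
    if (this.failures >= this.failureThreshold) this.openedAt = now;
  }
}

// Wrap a send: skip the call while the breaker is open so the caller can queue the webhook.
async function sendWithBreaker<T>(
  breaker: CircuitBreaker,
  send: () => Promise<T>,
  now: () => number = () => Date.now(),
): Promise<T | "skipped"> {
  if (!breaker.canAttempt(now())) return "skipped";
  try {
    const result = await send();
    breaker.recordSuccess();
    return result;
  } catch (err) {
    breaker.recordFailure(now());
    throw err;
  }
}
```

A `"skipped"` result is the signal to place the webhook on a local queue and retry after the cooldown, which covers the last mitigation in the list.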

2. Cost

Issue: Svix has usage-based pricing.

Mitigation:

  • Free tier covers 50k messages/month
  • Paid plans are reasonable for production use
  • Self-hosting option available

3. Migration Effort

Issue: Need to recreate webhook registrations in Svix.

Mitigation:

  • Export existing webhook configs
  • Script to create Svix applications and endpoints
  • Run both systems in parallel during migration

Migration Strategy

Step 1: Svix Setup (Day 1)

  1. Create Svix account
  2. Add Svix SDK to emprops-api
  3. Create event types in Svix dashboard
  4. Create mini-app application and endpoint in Svix
  5. Implement SvixWebhookService in emprops-api

Step 2: Parallel Operation (Days 2-3)

  1. Add Svix call in emprops-api output handler
  2. Keep old webhook-service running (dual delivery)
  3. Verify mini-app receives Svix webhooks
  4. Compare payloads between old and new systems
  5. Monitor for any delivery failures

Step 3: Remove Old System (Days 4-5)

  1. Remove workflow checking from emp-job-queue
  2. Stop webhook-service deployment
  3. Clean up Redis webhook keys
  4. Update documentation

Success Metrics

Quantitative

  • Delivery success rate: > 99.9% (vs current ~95%)
  • Delivery latency: < 5 seconds P95 (vs current ~10s)
  • Retry success rate: > 99% of initially failed deliveries eventually succeed

Qualitative

  • Reduced operational burden (one less service)
  • Customer self-service via App Portal
  • Better debugging with event history

Alternative Approaches Considered

Option A: Keep Custom webhook-service

Rejected: Doesn't solve reliability, monitoring, or feature gaps.

Option B: Build Enhanced webhook-service

Rejected: Significant engineering effort to rebuild what Svix provides.

Option C: Use AWS EventBridge / SNS

Rejected: More complex setup, less webhook-specific features.

Option D: Use Svix (THIS ADR)

Chosen: Purpose-built webhook platform, managed reliability, customer portal.


Dependencies

Infrastructure

  • Svix account (cloud or self-hosted)
  • SVIX_AUTH_TOKEN environment variable in emprops-api

Code Changes

In emprops-api (or emprops-studio):

  • src/lib/svix-webhook-service.ts (NEW) - Svix SDK wrapper
  • Output handler (MODIFIED) - Add Svix webhook call after saving outputs
  • Failure handler (MODIFIED) - Add Svix webhook call on workflow failure

In emp-job-queue (apps/api):

  • src/lightweight-api-server.ts (MODIFIED) - Remove workflow completion checking for webhooks
  • Remove publishWorkflowCompletion(), attemptWorkflowRecovery(), etc.

Removed entirely:

  • apps/webhook-service/ (REMOVED) - Entire service directory

Package Dependencies

json
// In apps/emprops-api/package.json (or apps/emprops-studio/package.json)
{
  "dependencies": {
    "svix": "^1.40.0"
  }
}

Appendix: Svix Concepts

Applications

One application per customer/tenant. Each application has its own endpoints and event history.

Endpoints

Webhook URLs registered by customers. Can filter by event type.

Messages

Individual webhook deliveries. Svix handles retries and logs all attempts.

Event Types

Schema definitions for webhook payloads. Enables filtering and documentation.

App Portal

Embedded UI for customers to manage their own webhooks without accessing your dashboard.


End of ADR
