Environment Management System

Overview

The emp-job-queue system manages environment configuration through a component-based system built on the @emp/env-management package. It provides type-safe, hierarchical configuration across local development, staging, and production, and supports multiple deployment targets.

Key Design Goals:

Separation of Concerns: Component files define capabilities; service interfaces define requirements
Type Safety: Service interfaces enforce required variables at build time
Flexibility: Profile-based composition allows mixing components for different scenarios
Security: Automatic separation of public and secret variables
Multi-environment: Single codebase supports local dev, staging, and production

Architecture Overview

System Components

1. Component Files (`config/environments/components/`)

Component files define what configuration is available for each component (API, Redis, Machine, Worker, etc.). They use INI format with environment-specific sections and namespace prefixing.

Structure:

ini

NAMESPACE=COMPONENT_NAME

[default]
# Values available in all environments
COMMON_VAR=value

[local]
# Local development overrides
LOCAL_VAR=value

[staging]
# Staging environment
STAGING_VAR=value

[production]
# Production environment
PROD_VAR=value

Example: redis.env

ini

NAMESPACE=REDIS

[default]
DB=0
MAX_CONNECTIONS=200

[local]
INGRESS=:6379
URL=redis://host.docker.internal:6379

[staging]
INGRESS=redis://default:${STAGING_REDIS_PASSWORD}@switchback.proxy.rlwy.net:48889
URL=redis://default:${STAGING_REDIS_PASSWORD}@switchback.proxy.rlwy.net:48889

[production]
INGRESS=redis://default:${PRODUCTION_REDIS_PASSWORD}@ballast.proxy.rlwy.net:30645
URL=redis://default:${PRODUCTION_REDIS_PASSWORD}@ballast.proxy.rlwy.net:30645

Key Features:

Namespace Prefixing: NAMESPACE=REDIS → all variables prefixed as REDIS_*
Environment Sections: [local], [staging], [production] for environment-specific values
Variable Substitution: ${PRODUCTION_REDIS_PASSWORD} resolves from secrets
Layering: When a profile lists several sections (for example ["default", "staging"]), later sections override earlier ones
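
The features above can be sketched in a few lines of TypeScript. This is a minimal illustration, not the @emp/env-management implementation: the `loadComponent` helper and its parsing rules (comment handling, whitespace trimming, ignoring malformed lines) are assumptions made for the example.

```typescript
// Hypothetical helper: parse an INI-style component file, then merge the
// requested sections in order, prefixing every key with the namespace.
type Vars = Record<string, string>;

function loadComponent(text: string, sections: string[]): Vars {
  let namespace = "";
  let current = ""; // "" means: before the first [section] header
  const bySection: Record<string, Vars> = {};

  for (const raw of text.split("\n")) {
    const line = raw.trim();
    if (line === "" || line.startsWith("#")) continue; // skip blanks and comments
    const header = line.match(/^\[(.+)\]$/);
    if (header) {
      current = header[1];
      continue;
    }
    const eq = line.indexOf("=");
    if (eq === -1) continue; // ignore malformed lines in this sketch
    const key = line.slice(0, eq).trim();
    const value = line.slice(eq + 1).trim();
    if (current === "" && key === "NAMESPACE") {
      namespace = value;
    } else {
      const target = bySection[current] ?? (bySection[current] = {});
      target[key] = value;
    }
  }

  const merged: Vars = {};
  for (const section of sections) {
    for (const [key, value] of Object.entries(bySection[section] ?? {})) {
      // Later sections overwrite keys set by earlier ones.
      merged[namespace ? `${namespace}_${key}` : key] = value;
    }
  }
  return merged;
}

const redisEnv = `
NAMESPACE=REDIS

[default]
DB=0
MAX_CONNECTIONS=200

[local]
URL=redis://host.docker.internal:6379
`;

const localRedis = loadComponent(redisEnv, ["default", "local"]);
console.log(localRedis);
```

Passing `["default", "local"]` mirrors the array syntax used in profiles: `[default]` is applied first and `[local]` is layered on top.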

Current Components:

api.env - Job queue API configuration
redis.env - Redis connection and settings
database.env - PostgreSQL configuration
machine.env - Machine/container orchestration
worker.env - Worker process configuration
monitor.env - Monitoring UI settings
comfyui.env - ComfyUI integration
ollama.env - Ollama LLM service
storage-provider.env - Cloud storage (S3/Azure/GCP)
telemetry.env - Observability configuration
emprops.env - EmProps platform API
Plus: api-tokens.env, openai.env, gemini.env, ngrok.env, simulation.env, webhook-service.env, telemetry-collector.env

2. Service Interfaces (`config/environments/services/`)

Service interfaces define what configuration is required for each service to run. They enforce type-safe environment validation and map component variables to service-specific names.

Structure:

typescript

export const ServiceNameEnvInterface = {
  name: "service-name",
  location: "apps/service-name",  // Where .env files are generated

  required: {
    "APP_VAR_NAME": "COMPONENT_VAR_NAME",  // Maps app var → component var
  },

  secret: {
    "SECRET_VAR": "COMPONENT_SECRET_VAR",  // Separated into .env.secret
  },

  optional: {
    "OPTIONAL_VAR": "COMPONENT_OPTIONAL_VAR",
  },

  defaults: {
    "VAR_WITH_DEFAULT": "default_value",
  }
};

Example: machine.interface.ts

typescript

export const MachineEnvInterface = {
  name: "machine",
  location: "apps/machine",

  required: {
    "HUB_REDIS_URL": "REDIS_URL",
    "EMPROPS_API_URL": "EMPROPS_JOB_QUEUE_API_URL",
    "GPU_MODE": "MACHINE_GPU_MODE",

    // Telemetry
    "OTEL_COLLECTOR_ENDPOINT": "${MACHINE_EGRESS_GRPC}${TELEMETRY_COLLECTOR_OTEL_COLLECTOR_ENDPOINT}",
    "TELEMETRY_ENV": "TELEMETRY_DASH0_DATASET",

    // Storage
    "CLOUD_PROVIDER": "STORAGE_PROVIDER_CURRENT_SERVICE",
    "CLOUD_STORAGE_CONTAINER": "STORAGE_PROVIDER_CONTAINER",

    // Ollama configuration (if using ollama workers)
    "OLLAMA_HOST": "OLLAMA_HOST",
    "OLLAMA_PORT": "OLLAMA_PORT",
    "OLLAMA_DEFAULT_MODELS": "OLLAMA_DEFAULT_MODELS",
  },

  secret: {
    "AWS_ACCESS_KEY_ID": "STORAGE_PROVIDER_AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY_ENCODED": "STORAGE_PROVIDER_AWS_SECRET_ACCESS_KEY_ENCODED",
    "OPENAI_API_KEY": "OPENAI_API_KEY",
    "AUTH_TOKEN": "API_AUTH_TOKEN",
  },

  optional: {
    "MACHINE_HEALTH_PORT": "MACHINE_HEALTH_PORT",
    "MACHINE_LOG_LEVEL": "MACHINE_LOG_LEVEL",
  },

  defaults: {
    "WORKER_MAX_CONCURRENT_JOBS": "1",
    "SERVICE_TYPE": "machine",
    "LOG_TO_FILE": "true"
  }
};

Key Features:

Variable Mapping: Maps friendly service variable names to namespaced component names
Template Substitution: Supports ${VAR} syntax for dynamic composition
Categorization: required, secret, optional, defaults enforce correct handling
Build-time Validation: Fails early if required variables are missing
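
As a rough sketch of how mapping, categorization, and validation fit together, the example below applies an interface to a pool of component variables. The `ServiceEnvInterface` type and `applyInterface` function are hypothetical names, the precedence of `defaults` (applied first, then overridden by mapped values) is an assumption, and only plain name mappings are handled (no `${VAR}` templates).

```typescript
// Hypothetical types and helper; the real @emp/env-management API may differ.
type Mapping = Record<string, string>;

interface ServiceEnvInterface {
  name: string;
  location: string;
  required?: Mapping;
  secret?: Mapping;
  optional?: Mapping;
  defaults?: Mapping;
}

interface ServiceEnv {
  publicVars: Mapping;
  secretVars: Mapping;
}

function applyInterface(iface: ServiceEnvInterface, pool: Mapping): ServiceEnv {
  const publicVars: Mapping = { ...(iface.defaults ?? {}) }; // assumed: defaults go first
  const secretVars: Mapping = {};
  const missing: string[] = [];

  // Copy each mapped component variable into `target` under its service-side name.
  const copy = (mapping: Mapping | undefined, target: Mapping, mustExist: boolean): void => {
    for (const [appVar, componentVar] of Object.entries(mapping ?? {})) {
      const value = pool[componentVar];
      if (value !== undefined) {
        target[appVar] = value;
      } else if (mustExist) {
        missing.push(componentVar);
      }
    }
  };

  copy(iface.required, publicVars, true);
  copy(iface.optional, publicVars, false);
  copy(iface.secret, secretVars, true);

  if (missing.length > 0) {
    // Fail early, before any file is written.
    throw new Error(`Service '${iface.name}' is missing required variables: ${missing.join(", ")}`);
  }
  return { publicVars, secretVars };
}

const machine = applyInterface(
  {
    name: "machine",
    location: "apps/machine",
    required: { HUB_REDIS_URL: "REDIS_URL" },
    secret: { AUTH_TOKEN: "API_AUTH_TOKEN" },
    optional: { MACHINE_LOG_LEVEL: "MACHINE_LOG_LEVEL" },
    defaults: { SERVICE_TYPE: "machine" },
  },
  { REDIS_URL: "redis://localhost:6379", API_AUTH_TOKEN: "example-token" },
);
console.log(machine.publicVars, machine.secretVars);
```

Keeping public and secret values in separate objects from the start makes it hard to write a secret into the public file by accident.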

Current Service Interfaces:

api.interface.ts - Job queue API service
machine.interface.ts - Machine container orchestration
emprops-api.interface.ts - EmProps platform API
database.interface.ts - PostgreSQL database
monitor.interface.ts - Monitoring UI
webhook-service.interface.ts - Webhook delivery service
job-evaluator.interface.ts - Job evaluation service
telemetry-collector.interface.ts - Telemetry collection service

3. Environment Profiles (`config/environments/profiles/`)

Profiles compose components together for specific use cases. They define which component environments to activate.

Structure:

json

{
  "name": "Profile Name",
  "description": "What this profile is for",
  "components": {
    "component-name": ["default", "environment"],
    "another-component": "single-environment"
  },
  "services": {}
}

Example: staging.json

json

{
  "name": "Production Testing",
  "description": "Production-like environment for testing with local Redis",
  "components": {
    "api": ["default", "staging"],
    "redis": ["default", "staging"],
    "machine": ["default", "staging"],
    "worker": ["default", "staging"],
    "database": ["default", "staging"],
    "storage-provider": ["default", "production"],
    "telemetry": ["default", "staging"]
  }
}

Available Profiles:

local-dev.json: Local development with production API + local Redis
staging.json: Production-like testing environment
production.json: Full production configuration
remote-dev.json: Remote development with ngrok tunneling
testrunner.json: Automated testing environment
local-prod.json: Local production simulation

Profile Composition Rules:

Array syntax ["default", "staging"] merges environments in order (later overrides earlier)
String syntax "staging" loads only that environment
All components load from their namespaced .env files
Secrets are automatically loaded from .env.secrets.local (gitignored)
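
The first two rules can be illustrated with a short sketch. The function names `normalizeComponents` and `mergeLayers` are hypothetical; the example only assumes the `components` shape shown in the profile structure above.

```typescript
// Hypothetical helpers illustrating the array vs. string syntax of a profile.
type EnvSpec = string | string[];
type Vars = Record<string, string>;

// Turn every component entry into an ordered list of environment names.
function normalizeComponents(components: Record<string, EnvSpec>): Record<string, string[]> {
  const result: Record<string, string[]> = {};
  for (const [component, spec] of Object.entries(components)) {
    // String syntax loads one environment; array syntax keeps its merge order.
    result[component] = Array.isArray(spec) ? [...spec] : [spec];
  }
  return result;
}

// Merge environment layers in order; keys in later layers override earlier ones.
function mergeLayers(layers: Vars[]): Vars {
  const merged: Vars = {};
  for (const layer of layers) {
    Object.assign(merged, layer);
  }
  return merged;
}

const plan = normalizeComponents({
  redis: ["default", "staging"],
  database: "staging",
});
const mergedRedis = mergeLayers([
  { DB: "0", URL: "redis://default-host:6379" }, // stands in for [default]
  { URL: "redis://staging-host:6379" },          // stands in for [staging]
]);
console.log(plan, mergedRedis);
```

With the string syntax, values from `[default]` are not included, so a component that relies on defaults needs the array form.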

4. Secrets Management (`config/environments/secrets/`)

Location: config/environments/secrets/.env.secrets.local (gitignored)

Secrets are stored in a single .env file as plain key-value pairs. The file has no NAMESPACE= line, so each key is written out in full. The build system automatically:

Loads all secrets into the variable pool
Identifies which variables are secrets based on service interfaces
Writes secrets to .env.secret.<profile> files per service
Keeps public variables in .env.<profile> files

Example: .env.secrets.local

bash

# Database
DATABASE_URL=postgresql://user:pass@localhost:5432/db

# Redis
PRODUCTION_REDIS_PASSWORD=secret_password_here

# Cloud Storage
STORAGE_PROVIDER_AWS_ACCESS_KEY_ID=AKIAXXXXXXXX
STORAGE_PROVIDER_AWS_SECRET_ACCESS_KEY_ENCODED=base64_encoded_secret

# API Keys
OPENAI_API_KEY=sk-xxxxxxxx
API_AUTH_TOKEN=bearer_token_here

# Telemetry
TELEMETRY_DASH0_AUTH_TOKEN=dash0_token_here

Security Features:

Gitignored: Never committed to repository
Auto-detected: Build system knows which vars are secrets via service interfaces
Per-service: Each service gets only its required secrets
Docker-friendly: .env.secret files are mounted at runtime, not baked into images

Build Process

Environment Builder (`@emp/env-management`)

The EnvironmentBuilder class orchestrates the entire build process:

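Build Flow: profile loading → component pool creation → secret loading → variable resolution → service validation → file generation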

Key Steps:

Profile Loading: Reads profile JSON to determine which components and environments to load

Component Pool Creation: Loads all specified component files and merges their environments

typescript

// Example: api component with ["default", "staging"]
// Loads: api.env[default] + api.env[staging]
// Result: API_NODE_ENV=production, API_PORT=3333, API_LOG_LEVEL=info

Secret Loading: Automatically loads .env.secrets.local (no profile declaration needed)

Variable Resolution: Multi-pass resolution of ${VAR} substitutions

typescript

// Pass 1: REDIS_URL=redis://default:${PRODUCTION_REDIS_PASSWORD}@host:6379
// Pass 2: REDIS_URL=redis://default:secret123@host:6379

Service Validation: For each service interface, validates all required and secret variables exist
File Generation: For each service:
- Maps component variables → service variables via interface
- Splits into public (.env) and secret (.env.secret) based on interface categorization
- Writes files to service location (apps/service-name/.env.<profile>)

Example Output:

bash

# After: pnpm env:build staging

# Generated files:
apps/api/.env.staging            # Public API vars (baked into Docker image)
apps/api/.env.secret.staging     # Secret API vars (Docker runtime inject)
apps/machine/.env.staging        # Public machine vars
apps/machine/.env.secret.staging # Secret machine vars (API keys, storage creds)
apps/worker/.env.staging         # Public worker vars
apps/worker/.env.secret.staging  # Secret worker vars
# ... etc for each service
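
The file-generation step can be sketched as a pure function that maps output paths to file contents. The helper names are hypothetical, sorting keys is a choice made here for stable output, and no quoting or escaping of values is attempted; writing the result to disk is left out.

```typescript
// Hypothetical helpers; names and the sorting choice are assumptions for illustration.
type Vars = Record<string, string>;

// Render KEY=value lines in sorted key order so rebuilds produce stable files.
// No quoting or escaping is applied in this sketch.
function renderDotenv(vars: Vars): string {
  return Object.keys(vars)
    .sort()
    .map((key) => `${key}=${vars[key]}`)
    .join("\n") + "\n";
}

// Map each output path to its file contents for one service and profile.
function planOutputs(location: string, profile: string, publicVars: Vars, secretVars: Vars): Vars {
  return {
    [`${location}/.env.${profile}`]: renderDotenv(publicVars),
    [`${location}/.env.secret.${profile}`]: renderDotenv(secretVars),
  };
}

const outputs = planOutputs(
  "apps/machine",
  "staging",
  { SERVICE_TYPE: "machine", GPU_MODE: "example" },
  { AUTH_TOKEN: "example-token" },
);
console.log(Object.keys(outputs));
```

Returning a path-to-content map keeps the step easy to test; a caller could then write each entry with Node's `fs.writeFileSync`.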

Variable Resolution

The system supports multi-pass variable substitution to handle nested references:

Resolution Algorithm:

typescript

// Maximum 10 passes to resolve all ${VAR} references
while (hasUnresolvedVars && pass < 10) {
  for each variable:
    Replace ${VAR} with:
      1. Value from resolved pool (if exists)
      2. Value from process.env (if exists)
      3. Keep ${VAR} unchanged (will warn at end)
}
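
The loop above can be written as runnable TypeScript roughly as follows. This is a sketch under stated assumptions rather than the package's actual code: the `resolveVars` name, the variable-name pattern, and the `env` parameter (where a caller could pass `process.env`) are all hypothetical.

```typescript
// Minimal sketch of multi-pass ${VAR} resolution; names are hypothetical.
type Vars = Record<string, string>;
type Env = Record<string, string | undefined>;

const REFERENCE = /\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g;

function resolveVars(
  pool: Vars,
  env: Env = {},
  maxPasses = 10,
): { resolved: Vars; unresolved: string[] } {
  const resolved: Vars = { ...pool };

  for (let pass = 0; pass < maxPasses; pass++) {
    let changed = false;
    for (const key of Object.keys(resolved)) {
      const before = resolved[key];
      // Lookup order: resolved pool, then the supplied environment, else keep ${VAR}.
      const after = before.replace(REFERENCE, (match: string, name: string) =>
        resolved[name] ?? env[name] ?? match,
      );
      if (after !== before) {
        resolved[key] = after;
        changed = true;
      }
    }
    if (!changed) break; // nothing left that can be substituted
  }

  // Keys whose values still contain a ${VAR} reference would be reported as warnings.
  const unresolved = Object.keys(resolved).filter((key) => /\$\{[^}]+\}/.test(resolved[key]));
  return { resolved, unresolved };
}

const result = resolveVars(
  {
    HUB_REDIS_URL: "${REDIS_URL}",
    REDIS_URL: "redis://default:${STAGING_REDIS_PASSWORD}@example.internal:6379",
    STAGING_REDIS_PASSWORD: "secret123",
    REGION_LABEL: "${REGION}",
    BROKEN: "${NOT_DEFINED}",
  },
  { REGION: "eu-west" },
);
console.log(result.resolved, result.unresolved);
```

Here `HUB_REDIS_URL` needs two passes because it is visited before `REDIS_URL` has been resolved, which is the case the pass limit exists for. The limit also guarantees termination when references are circular.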

Example Multi-layer Resolution:

bash

# Secrets file:
STAGING_REDIS_PASSWORD=secret123

# redis.env [staging] (NAMESPACE=REDIS, so URL becomes REDIS_URL):
URL=redis://default:${STAGING_REDIS_PASSWORD}@switchback.proxy.rlwy.net:48889

# machine.interface.ts:
"HUB_REDIS_URL": "REDIS_URL"

# Final apps/machine/.env.staging:
HUB_REDIS_URL=redis://default:secret123@switchback.proxy.rlwy.net:48889

Template Composition:

typescript

// machine.interface.ts uses template composition:
"OTEL_COLLECTOR_ENDPOINT": "${MACHINE_EGRESS_GRPC}${TELEMETRY_COLLECTOR_OTEL_COLLECTOR_ENDPOINT}"

// Resolves from:
//   MACHINE_EGRESS_GRPC=myhost.com (from machine.env)
//   TELEMETRY_COLLECTOR_OTEL_COLLECTOR_ENDPOINT=:4317 (from telemetry-collector.env)

// Final result:
//   OTEL_COLLECTOR_ENDPOINT=myhost.com:4317
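
One way to distinguish a plain mapping from a template is to check whether the interface value contains `${`. The sketch below assumes that rule; `resolveMapping` is a hypothetical helper, and returning `undefined` for an incomplete template is a design choice made for the example.

```typescript
// Hypothetical helper: resolve one interface mapping value against the variable pool.
function resolveMapping(value: string, pool: Record<string, string>): string | undefined {
  if (!value.includes("${")) {
    return pool[value]; // plain mapping: the value is a component variable name
  }

  // Template mapping: substitute every ${VAR} and track names that are absent.
  const missing: string[] = [];
  const composed = value.replace(
    /\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g,
    (_match: string, name: string) => {
      const found = pool[name];
      if (found === undefined) missing.push(name);
      return found ?? "";
    },
  );
  return missing.length === 0 ? composed : undefined;
}

const examplePool = {
  REDIS_URL: "redis://localhost:6379",
  MACHINE_EGRESS_GRPC: "myhost.com",
  TELEMETRY_COLLECTOR_OTEL_COLLECTOR_ENDPOINT: ":4317",
};

const plain = resolveMapping("REDIS_URL", examplePool);
const composedEndpoint = resolveMapping(
  "${MACHINE_EGRESS_GRPC}${TELEMETRY_COLLECTOR_OTEL_COLLECTOR_ENDPOINT}",
  examplePool,
);
console.log(plain, composedEndpoint);
```

Treating a partially resolved template as missing avoids emitting a half-built endpoint such as `myhost.com` with no port.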

Usage Patterns

Building Environments

CLI Commands:

bash

# Build staging environment (most common)
pnpm env:build staging

# Build production environment
pnpm env:build production

# Build local development environment
pnpm env:build local-dev

# List available profiles
pnpm env:list

What Happens:

Loads profile from config/environments/profiles/<profile>.json
Loads all component .env files specified in profile
Loads secrets from config/environments/secrets/.env.secrets.local
Validates all service interfaces have required variables
Generates .env.<profile> and .env.secret.<profile> for each service
Reports validation results and any missing variables

Development Workflow

Local Development:

bash

# 1. Create your secrets file (one-time setup)
cp config/environments/secrets/.env.secrets.local.example \
   config/environments/secrets/.env.secrets.local

# 2. Edit secrets with your credentials
vim config/environments/secrets/.env.secrets.local

# 3. Build local-dev environment
pnpm env:build local-dev

# 4. Start services with Docker Compose
docker-compose up

# Services automatically load the generated apps/*/.env.local-dev files

Staging Deployment:

bash

# 1. Ensure secrets file has staging credentials
# 2. Build staging environment
pnpm env:build staging

# 3. Deploy to staging (exact process varies by deployment target)
# For Docker:
docker-compose -f docker-compose.staging.yml up

# For Kubernetes:
kubectl apply -f k8s/staging/

Production Deployment:

bash

# 1. Build production environment
pnpm env:build production

# 2. Verify no missing variables
# 3. Deploy (secrets injected via secure CI/CD or secret management service)

Adding a New Service

Step 1: Create Service Interface

File: config/environments/services/my-service.interface.ts

typescript

export const MyServiceEnvInterface = {
  name: "my-service",
  location: "apps/my-service",

  required: {
    "SERVICE_PORT": "MY_SERVICE_PORT",
    "REDIS_URL": "REDIS_URL",
  },

  secret: {
    "API_KEY": "MY_SERVICE_API_KEY",
  },

  defaults: {
    "LOG_LEVEL": "info",
  }
};

Step 2: Add Component Configuration (if needed)

File: config/environments/components/my-service.env

ini

NAMESPACE=MY_SERVICE

[default]
PORT=8080

[local]
PORT=8080

[production]
PORT=80

Step 3: Update Profiles

File: config/environments/profiles/staging.json

json

{
  "components": {
    "my-service": ["default", "staging"],
    ...
  }
}

Step 4: Add Secrets (if needed)

File: config/environments/secrets/.env.secrets.local

bash

MY_SERVICE_API_KEY=secret_key_here

Step 5: Rebuild Environment

bash

pnpm env:build staging

Result: apps/my-service/.env.staging and apps/my-service/.env.secret.staging created

Troubleshooting

Missing Variables Error:

❌ Service 'machine' is missing required variables: REDIS_URL, MACHINE_GPU_MODE

Fix:

Check which profile you're building
Verify component is included in profile
Check component file has correct environment section
Verify namespace matches service interface expectation

Variable Not Resolving:

⚠️ Unresolved variables in HUB_REDIS_URL: ${PRODUCTION_REDIS_PASSWORD}

Fix:

Check .env.secrets.local has the variable
Verify variable name matches exactly (case-sensitive)
Check for circular dependencies in variable references
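
Checking for circular references by hand gets tedious in a large pool. A small depth-first search can find one; the sketch below is a hypothetical diagnostic, not a feature of the build system, and `findCycle` is an illustrative name.

```typescript
// Hypothetical diagnostic: find one cycle among ${VAR} references, or return null.
function findCycle(pool: Record<string, string>): string[] | null {
  const has = (name: string): boolean => Object.prototype.hasOwnProperty.call(pool, name);

  // Collect the variable names referenced by one value.
  const refsOf = (name: string): string[] => {
    const pattern = /\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g;
    const names: string[] = [];
    let m: RegExpExecArray | null;
    while ((m = pattern.exec(pool[name])) !== null) {
      names.push(m[1]);
    }
    return names;
  };

  const done = new Set<string>(); // fully explored and known to be cycle-free
  const path: string[] = [];      // variables on the current depth-first path

  const visit = (name: string): string[] | null => {
    const at = path.indexOf(name);
    if (at !== -1) return [...path.slice(at), name]; // back edge: cycle found
    if (done.has(name) || !has(name)) return null;   // explored, or defined elsewhere
    path.push(name);
    for (const ref of refsOf(name)) {
      const cycle = visit(ref);
      if (cycle) return cycle;
    }
    path.pop();
    done.add(name);
    return null;
  };

  for (const name of Object.keys(pool)) {
    const cycle = visit(name);
    if (cycle) return cycle;
  }
  return null;
}

const cycle = findCycle({
  API_URL: "${API_HOST}/v1",
  API_HOST: "${API_URL}",
  REDIS_URL: "redis://localhost:6379",
});
console.log(cycle ? cycle.join(" -> ") : "no cycle");
```

Names that are referenced but not defined in the pool are skipped here, since those show up as ordinary unresolved-variable warnings instead.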

Service Not Getting Variables:

Error: Cannot find .env.staging file

Fix:

Verify service interface has correct location field
Ensure profile includes required components
Re-run pnpm env:build <profile>

Best Practices

Variable Naming Conventions

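Component Variables: Refer to variables by their namespaced names. Inside a component file the NAMESPACE= line adds the prefix, so the bare key is written there; everywhere else (service interfaces, ${VAR} references, secrets) use the full name.

bash

# Good
REDIS_URL=...
API_PORT=...

# Bad (no namespace)
URL=...
PORT=...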

Secret Variables: Use descriptive names indicating sensitivity

bash

# Good
STORAGE_PROVIDER_AWS_SECRET_ACCESS_KEY_ENCODED=...
API_AUTH_TOKEN=...

# Bad
KEY=...
TOKEN=...

Template Variables: Use clear composition

typescript

// Good
"${SERVICE_EGRESS}${API_INGRESS}"

// Bad (unclear what's being composed)
"${A}${B}"

Component Organization

Group Related Settings: Keep all Redis settings in redis.env, not scattered across files
Use Environment Sections Appropriately:
- [default]: Common to all environments
- [local]: Local development overrides
- [staging]: Staging-specific (production-like but separate resources)
- [production]: Production values

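Don't Duplicate: Reference a value that another component already defines instead of copying it

ini

# Good (in another component's file: reuse the Redis component's value)
[staging]
URL=${REDIS_URL}

# Bad (copies the Redis connection details)
[staging]
URL=redis://default:${STAGING_REDIS_PASSWORD}@switchback.proxy.rlwy.net:48889
HOST=switchback.proxy.rlwy.net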

Security Practices

Never Commit Secrets: Always use .env.secrets.local (gitignored)

Use Secret Categorization: Mark all sensitive variables in service interfaces

typescript

secret: {
  "DATABASE_URL": "DATABASE_URL",  // ✅ Correct
}

// NOT:
required: {
  "DATABASE_URL": "DATABASE_URL",  // ❌ Wrong - should be secret
}

Rotate Credentials: Update .env.secrets.local and rebuild when rotating credentials
Per-service Secrets: Service interfaces ensure each service only gets its required secrets

Profile Design

Composition Over Duplication: Use component layering

json

{
  "components": {
    "redis": ["default", "staging"]  // ✅ Merges default + staging
  }
}

Meaningful Names: Profile names should describe their purpose clearly
- local-dev - Local development
- staging - Staging environment
- production - Production deployment
Document Profiles: Add clear descriptions in profile JSON

Advanced Topics

Custom Environment Variables

Sometimes you need environment-specific values not in component files:

Option 1: Add to component file

ini

# components/custom.env
NAMESPACE=CUSTOM

[local]
SPECIAL_VALUE=local_value

[production]
SPECIAL_VALUE=prod_value

Option 2: Use process.env passthrough

bash

# Set in shell before build
export SPECIAL_VALUE=runtime_value
pnpm env:build staging

# Variable available for ${SPECIAL_VALUE} substitution

Template Composition Patterns

Egress/Ingress Pattern: Used for services that need to know how to reach other services:

typescript

// Service A needs to reach Service B
"SERVICE_B_URL": "${SERVICE_A_EGRESS}${SERVICE_B_INGRESS}"

// Local:
SERVICE_A_EGRESS=http://host.docker.internal  // How A reaches outside
SERVICE_B_INGRESS=:3000                        // Where B listens
// Result: http://host.docker.internal:3000

// Production:
SERVICE_A_EGRESS=https://api.example.com  // Public endpoint
SERVICE_B_INGRESS=/service-b               // Path prefix
// Result: https://api.example.com/service-b

Docker Compose Integration

The builder can generate Docker Compose files from profiles:

json

{
  "docker": {
    "services": {
      "redis": {
        "image": "redis:7",
        "ports": ["6379:6379"],
        "condition": "components.redis includes 'local'"
      }
    }
  }
}

Features:

Conditional Services: Only include if component is active
Variable Substitution: Use ${VAR} in Docker Compose config
Auto-platform: Adds platform: linux/amd64 for cross-architecture builds
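
To make the conditional-service idea concrete, the sketch below evaluates a condition of the exact form shown in the example against a profile's `components` map. The supported grammar is an assumption inferred from that single example, and `isServiceActive` is a hypothetical name; the real builder may accept other expressions.

```typescript
// Hypothetical evaluator for conditions of the form "components.<name> includes '<env>'".
type Components = Record<string, string | string[]>;

function isServiceActive(condition: string, components: Components): boolean {
  const parsed = condition.match(/^components\.([\w-]+) includes '([^']+)'$/);
  if (!parsed) {
    throw new Error(`Unsupported condition: ${condition}`);
  }
  const [, component, env] = parsed;
  const spec = components[component];
  if (spec === undefined) return false; // component is not part of this profile
  const active = Array.isArray(spec) ? spec : [spec];
  return active.includes(env);
}

const localProfile: Components = { redis: ["default", "local"], api: "staging" };
const includeRedis = isServiceActive("components.redis includes 'local'", localProfile);
const includeForStaging = isServiceActive("components.redis includes 'staging'", localProfile);
console.log(includeRedis, includeForStaging);
```

In this example the local Redis container would be included only for profiles that activate the `local` section of the redis component.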

Related Documentation

Machine/Worker Architecture: See machine-worker-system.md for how environments are used in deployment
Testing Procedures: See TESTING_PROCEDURES.md for environment-specific testing
CLAUDE.md: See project root for development workflow and conventions

Key Takeaways

Component files define what's available; service interfaces define what's required
Profiles compose components for different use cases
Secrets are automatic - just add to .env.secrets.local
Build fails fast - missing variables caught at build time, not runtime
Multi-pass resolution - complex variable chains resolve automatically
Per-service outputs - each service gets exactly what it needs, nothing more
