Environment Management System
Overview
The emp-job-queue system manages configuration with a component-based environment management system built on the @emp/env-management package. It provides type-safe, hierarchical environment configuration across development, staging, and production while supporting multiple deployment targets.
Key Design Goals:
- Separation of Concerns: Component files define capabilities, service interfaces define requirements
- Type Safety: Service interfaces enforce required variables at build time
- Flexibility: Profile-based composition allows mixing components for different scenarios
- Security: Automatic separation of public and secret variables
- Multi-environment: Single codebase supports local dev, staging, and production
Architecture Overview
System Components
1. Component Files (config/environments/components/)
Component files define what configuration is available for each component (API, Redis, Machine, Worker, etc.). They use INI format with environment-specific sections and namespace prefixing.
Structure:
NAMESPACE=COMPONENT_NAME
[default]
# Values available in all environments
COMMON_VAR=value
[local]
# Local development overrides
LOCAL_VAR=value
[staging]
# Staging environment
STAGING_VAR=value
[production]
# Production environment
PROD_VAR=value
Example: redis.env
NAMESPACE=REDIS
[default]
DB=0
MAX_CONNECTIONS=200
[local]
INGRESS=:6379
URL=redis://host.docker.internal:6379
[staging]
INGRESS=redis://default:${STAGING_REDIS_PASSWORD}@switchback.proxy.rlwy.net:48889
URL=redis://default:${STAGING_REDIS_PASSWORD}@switchback.proxy.rlwy.net:48889
[production]
INGRESS=redis://default:${PRODUCTION_REDIS_PASSWORD}@ballast.proxy.rlwy.net:30645
URL=redis://default:${PRODUCTION_REDIS_PASSWORD}@ballast.proxy.rlwy.net:30645
Key Features:
- Namespace Prefixing: NAMESPACE=REDIS → all variables prefixed as REDIS_*
- Environment Sections: [local], [staging], [production] for environment-specific values
- Variable Substitution: ${PRODUCTION_REDIS_PASSWORD} resolves from secrets
- Layering: Later sections override earlier ones
Current Components:
- api.env - Job queue API configuration
- redis.env - Redis connection and settings
- database.env - PostgreSQL configuration
- machine.env - Machine/container orchestration
- worker.env - Worker process configuration
- monitor.env - Monitoring UI settings
- comfyui.env - ComfyUI integration
- ollama.env - Ollama LLM service
- storage-provider.env - Cloud storage (S3/Azure/GCP)
- telemetry.env - Observability configuration
- emprops.env - EmProps platform API
- Plus: api-tokens.env, openai.env, gemini.env, ngrok.env, simulation.env, webhook-service.env, telemetry-collector.env
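The namespace prefixing and section layering described above can be sketched in a few lines of TypeScript. This is a minimal illustration, not the @emp/env-management implementation: the names parseComponentFile and mergeSections are hypothetical, and the sketch assumes that NAMESPACE appears before the first section header and that prefixed names take the form NAMESPACE_KEY.

```typescript
// Hypothetical helpers illustrating the component file format; not the real package API.
type Sections = Record<string, Record<string, string>>;

interface ParsedComponent {
  namespace: string;
  sections: Sections;
}

// Parses "NAMESPACE=..." plus "[section]" blocks of KEY=value lines.
function parseComponentFile(text: string): ParsedComponent {
  let namespace = "";
  let current: string | null = null;
  const sections: Sections = {};

  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue; // skip blanks and comments

    const header = /^\[(.+)\]$/.exec(line);
    if (header) {
      current = header[1];
      sections[current] = sections[current] ?? {};
      continue;
    }

    const eq = line.indexOf("=");
    if (eq === -1) continue; // this sketch ignores malformed lines
    const key = line.slice(0, eq).trim();
    const value = line.slice(eq + 1).trim();

    if (current === null) {
      if (key === "NAMESPACE") namespace = value;
    } else {
      sections[current][key] = value;
    }
  }
  return { namespace, sections };
}

// Merges sections in the given order (later wins) and applies the namespace prefix.
function mergeSections(component: ParsedComponent, order: string[]): Record<string, string> {
  const merged: Record<string, string> = {};
  for (const name of order) {
    for (const [key, value] of Object.entries(component.sections[name] ?? {})) {
      merged[`${component.namespace}_${key}`] = value;
    }
  }
  return merged;
}

const redis = parseComponentFile(`
NAMESPACE=REDIS
[default]
DB=0
MAX_CONNECTIONS=200
[local]
URL=redis://host.docker.internal:6379
`);
const redisPool = mergeSections(redis, ["default", "local"]);
console.log(redisPool.REDIS_URL);
```

With the [default] and [local] sections selected, the resulting pool holds REDIS_DB, REDIS_MAX_CONNECTIONS, and REDIS_URL.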
2. Service Interfaces (config/environments/services/)
Service interfaces define what configuration is required for each service to run. They enforce type-safe environment validation and map component variables to service-specific names.
Structure:
export const ServiceNameEnvInterface = {
name: "service-name",
location: "apps/service-name", // Where .env files are generated
required: {
"APP_VAR_NAME": "COMPONENT_VAR_NAME", // Maps app var → component var
},
secret: {
"SECRET_VAR": "COMPONENT_SECRET_VAR", // Separated into .env.secret
},
optional: {
"OPTIONAL_VAR": "COMPONENT_OPTIONAL_VAR",
},
defaults: {
"VAR_WITH_DEFAULT": "default_value",
}
};
Example: machine.interface.ts
export const MachineEnvInterface = {
name: "machine",
location: "apps/machine",
required: {
"HUB_REDIS_URL": "REDIS_URL",
"EMPROPS_API_URL": "EMPROPS_JOB_QUEUE_API_URL",
"GPU_MODE": "MACHINE_GPU_MODE",
// Telemetry
"OTEL_COLLECTOR_ENDPOINT": "${MACHINE_EGRESS_GRPC}${TELEMETRY_COLLECTOR_OTEL_COLLECTOR_ENDPOINT}",
"TELEMETRY_ENV": "TELEMETRY_DASH0_DATASET",
// Storage
"CLOUD_PROVIDER": "STORAGE_PROVIDER_CURRENT_SERVICE",
"CLOUD_STORAGE_CONTAINER": "STORAGE_PROVIDER_CONTAINER",
// Ollama configuration (if using ollama workers)
"OLLAMA_HOST": "OLLAMA_HOST",
"OLLAMA_PORT": "OLLAMA_PORT",
"OLLAMA_DEFAULT_MODELS": "OLLAMA_DEFAULT_MODELS",
},
secret: {
"AWS_ACCESS_KEY_ID": "STORAGE_PROVIDER_AWS_ACCESS_KEY_ID",
"AWS_SECRET_ACCESS_KEY_ENCODED": "STORAGE_PROVIDER_AWS_SECRET_ACCESS_KEY_ENCODED",
"OPENAI_API_KEY": "OPENAI_API_KEY",
"AUTH_TOKEN": "API_AUTH_TOKEN",
},
optional: {
"MACHINE_HEALTH_PORT": "MACHINE_HEALTH_PORT",
"MACHINE_LOG_LEVEL": "MACHINE_LOG_LEVEL",
},
defaults: {
"WORKER_MAX_CONCURRENT_JOBS": "1",
"SERVICE_TYPE": "machine",
"LOG_TO_FILE": "true"
}
};
Key Features:
- Variable Mapping: Maps friendly service variable names to namespaced component names
- Template Substitution: Supports ${VAR} syntax for dynamic composition
- Categorization: required, secret, optional, defaults enforce correct handling
- Build-time Validation: Fails early if required variables are missing
Current Service Interfaces:
- api.interface.ts - Job queue API service
- machine.interface.ts - Machine container orchestration
- emprops-api.interface.ts - EmProps platform API
- database.interface.ts - PostgreSQL database
- monitor.interface.ts - Monitoring UI
- webhook-service.interface.ts - Webhook delivery service
- job-evaluator.interface.ts - Job evaluation service
- telemetry-collector.interface.ts - Telemetry collection service
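To make the build-time validation concrete, here is a minimal sketch of checking a service interface against a pool of resolved component variables. The ServiceEnvInterface type and the findMissing function are assumptions made for illustration; the actual package may type and validate interfaces differently. Template entries (values containing ${...}) are skipped in this sketch because they are composed rather than looked up directly.

```typescript
// Hypothetical type mirroring the interface shape shown above.
interface ServiceEnvInterface {
  name: string;
  location: string;
  required?: Record<string, string>;
  secret?: Record<string, string>;
  optional?: Record<string, string>;
  defaults?: Record<string, string>;
}

// Returns the component variable names that the service needs but the pool lacks.
function findMissing(iface: ServiceEnvInterface, pool: Record<string, string>): string[] {
  const sources = [
    ...Object.values(iface.required ?? {}),
    ...Object.values(iface.secret ?? {}),
  ];
  return sources.filter((source) => !source.includes("${") && !(source in pool));
}

const machineInterface: ServiceEnvInterface = {
  name: "machine",
  location: "apps/machine",
  required: { HUB_REDIS_URL: "REDIS_URL", GPU_MODE: "MACHINE_GPU_MODE" },
  secret: { AUTH_TOKEN: "API_AUTH_TOKEN" },
};

// Only REDIS_URL is present, so the other two names are reported as missing.
const missing = findMissing(machineInterface, { REDIS_URL: "redis://localhost:6379" });
if (missing.length > 0) {
  console.error(`Service '${machineInterface.name}' is missing required variables: ${missing.join(", ")}`);
}
```

A real build would presumably stop at this point instead of only logging, so that a misconfigured service never receives a partial .env file.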
3. Environment Profiles (config/environments/profiles/)
Profiles compose components together for specific use cases. They define which component environments to activate.
Structure:
{
"name": "Profile Name",
"description": "What this profile is for",
"components": {
"component-name": ["default", "environment"],
"another-component": "single-environment"
},
"services": {}
}
Example: staging.json
{
"name": "Production Testing",
"description": "Production-like environment for testing with local Redis",
"components": {
"api": ["default", "staging"],
"redis": ["default", "staging"],
"machine": ["default", "staging"],
"worker": ["default", "staging"],
"database": ["default", "staging"],
"storage-provider": ["default", "production"],
"telemetry": ["default", "staging"]
}
}
Available Profiles:
- local-dev.json: Local development with production API + local Redis
- staging.json: Production-like testing environment
- production.json: Full production configuration
- remote-dev.json: Remote development with ngrok tunneling
- testrunner.json: Automated testing environment
- local-prod.json: Local production simulation
Profile Composition Rules:
- Array syntax ["default", "staging"] merges environments in order (later overrides earlier)
- String syntax "staging" loads only that environment
- All components load from their namespaced .env files
- Secrets are automatically loaded from .env.secrets.local (gitignored)
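The two selector syntaxes in the composition rules can be illustrated with a short sketch. The type and function names here (EnvSelector, toSectionOrder, composeComponent) are hypothetical, and the sample values are made up for the example rather than taken from the real api.env.

```typescript
// Hypothetical types: a selector is either one section name or an ordered list.
type EnvSelector = string | string[];
type SectionMap = Record<string, Record<string, string>>;

// Normalizes both selector syntaxes to an ordered list of section names.
function toSectionOrder(selector: EnvSelector): string[] {
  return Array.isArray(selector) ? selector : [selector];
}

// Merges the selected sections left to right, so later sections win.
function composeComponent(sections: SectionMap, selector: EnvSelector): Record<string, string> {
  const merged: Record<string, string> = {};
  for (const name of toSectionOrder(selector)) {
    Object.assign(merged, sections[name] ?? {});
  }
  return merged;
}

// Illustrative values only.
const apiSections: SectionMap = {
  default: { API_PORT: "3333", API_LOG_LEVEL: "info" },
  staging: { API_LOG_LEVEL: "debug" },
};

const layered = composeComponent(apiSections, ["default", "staging"]);
const single = composeComponent(apiSections, "staging");
console.log(layered); // API_PORT from default, API_LOG_LEVEL overridden by staging
console.log(single); // only the staging keys
```

The string form is useful when a component should not inherit anything from [default]; the array form is the usual choice when defaults should be layered under environment overrides.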
4. Secrets Management (config/environments/secrets/)
Location: config/environments/secrets/.env.secrets.local (gitignored)
Secrets are stored in a single .env file with plain key-value pairs (no namespacing). The build system automatically:
- Loads all secrets into the variable pool
- Identifies which variables are secrets based on service interfaces
- Writes secrets to .env.secret.<profile> files per service
- Keeps public variables in .env.<profile> files
Example: .env.secrets.local
# Database
DATABASE_URL=postgresql://user:pass@localhost:5432/db
# Redis
PRODUCTION_REDIS_PASSWORD=secret_password_here
# Cloud Storage
STORAGE_PROVIDER_AWS_ACCESS_KEY_ID=AKIAXXXXXXXX
STORAGE_PROVIDER_AWS_SECRET_ACCESS_KEY_ENCODED=base64_encoded_secret
# API Keys
OPENAI_API_KEY=sk-xxxxxxxx
API_AUTH_TOKEN=bearer_token_here
# Telemetry
TELEMETRY_DASH0_AUTH_TOKEN=dash0_token_here
Security Features:
- Gitignored: Never committed to repository
- Auto-detected: Build system knows which vars are secrets via service interfaces
- Per-service: Each service gets only its required secrets
- Docker-friendly: .env.secret files mount at runtime, not baked into images
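The public/secret split can be sketched as a pure function that turns one service interface and a variable pool into the contents of the two output files. This is an assumption-laden illustration: renderServiceEnv and RenderedEnv are hypothetical names, and the sketch assumes that defaults are written to the public file and that optional variables are silently dropped when absent.

```typescript
// Hypothetical type mirroring the service interface shape.
interface ServiceEnvInterface {
  name: string;
  location: string;
  required?: Record<string, string>;
  secret?: Record<string, string>;
  optional?: Record<string, string>;
  defaults?: Record<string, string>;
}

interface RenderedEnv {
  publicEnv: string; // contents for .env.<profile>
  secretEnv: string; // contents for .env.secret.<profile>
}

function renderServiceEnv(iface: ServiceEnvInterface, pool: Record<string, string>): RenderedEnv {
  // Maps "service name -> component name" pairs to KEY=value lines, skipping absent sources.
  const mapLines = (mapping: Record<string, string> = {}): string[] =>
    Object.entries(mapping)
      .filter(([, source]) => source in pool)
      .map(([target, source]) => `${target}=${pool[source]}`);

  // Assumption: defaults are literal values and belong in the public file.
  const defaultLines = Object.entries(iface.defaults ?? {}).map(([key, value]) => `${key}=${value}`);

  return {
    publicEnv: [...mapLines(iface.required), ...mapLines(iface.optional), ...defaultLines].join("\n"),
    secretEnv: mapLines(iface.secret).join("\n"),
  };
}

const myService: ServiceEnvInterface = {
  name: "my-service",
  location: "apps/my-service",
  required: { SERVICE_PORT: "MY_SERVICE_PORT", REDIS_URL: "REDIS_URL" },
  secret: { API_KEY: "MY_SERVICE_API_KEY" },
  defaults: { LOG_LEVEL: "info" },
};

const rendered = renderServiceEnv(myService, {
  MY_SERVICE_PORT: "8080",
  REDIS_URL: "redis://localhost:6379",
  MY_SERVICE_API_KEY: "secret_key_here",
  OPENAI_API_KEY: "sk-xxxxxxxx", // present in the pool but not requested by this service
});
console.log(rendered.publicEnv);
console.log(rendered.secretEnv);
```

Because only names listed in the interface are looked up, OPENAI_API_KEY never reaches either output for my-service, which is the per-service isolation the list above describes.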
Build Process
Environment Builder (@emp/env-management)
The EnvironmentBuilder class orchestrates the entire build process:
Build Flow:
Key Steps:
1. Profile Loading: Reads profile JSON to determine which components and environments to load
2. Component Pool Creation: Loads all specified component files and merges their environments
   // Example: api component with ["default", "staging"]
   // Loads: api.env[default] + api.env[staging]
   // Result: API_NODE_ENV=production, API_PORT=3333, API_LOG_LEVEL=info
3. Secret Loading: Automatically loads .env.secrets.local (no profile declaration needed)
4. Variable Resolution: Multi-pass resolution of ${VAR} substitutions
   // Pass 1: REDIS_URL=redis://default:${PRODUCTION_REDIS_PASSWORD}@host:6379
   // Pass 2: REDIS_URL=redis://default:secret123@host:6379
5. Service Validation: For each service interface, validates all required and secret variables exist
6. File Generation: For each service:
   - Maps component variables → service variables via interface
   - Splits into public (.env) and secret (.env.secret) based on interface categorization
   - Writes files to service location (apps/service-name/.env.<profile>)
Example Output:
# After: pnpm env:build staging
# Generated files:
apps/api/.env.staging # Public API vars (baked into Docker image)
apps/api/.env.secret.staging # Secret API vars (Docker runtime inject)
apps/machine/.env.staging # Public machine vars
apps/machine/.env.secret.staging # Secret machine vars (API keys, storage creds)
apps/worker/.env.staging # Public worker vars
apps/worker/.env.secret.staging # Secret worker vars
# ... etc for each service
Variable Resolution
The system supports multi-pass variable substitution to handle nested references:
Resolution Algorithm:
// Maximum 10 passes to resolve all ${VAR} references
while (hasUnresolvedVars && pass < 10) {
  for each variable:
    Replace ${VAR} with:
      1. Value from resolved pool (if exists)
      2. Value from process.env (if exists)
      3. Keep ${VAR} unchanged (will warn at end)
  pass++
}
Example Multi-layer Resolution:
# Secrets file:
PRODUCTION_REDIS_PASSWORD=secret123
# redis.env [staging]:
REDIS_URL=redis://default:${PRODUCTION_REDIS_PASSWORD}@switchback.proxy.rlwy.net:48889
# machine.interface.ts:
"HUB_REDIS_URL": "REDIS_URL"
# Final apps/machine/.env.staging:
HUB_REDIS_URL=redis://default:secret123@switchback.proxy.rlwy.net:48889
Template Composition:
// machine.interface.ts uses template composition:
"OTEL_COLLECTOR_ENDPOINT": "${MACHINE_EGRESS_GRPC}${TELEMETRY_COLLECTOR_OTEL_COLLECTOR_ENDPOINT}"
// Resolves from:
MACHINE_EGRESS_GRPC=myhost.com (from machine.env)
TELEMETRY_COLLECTOR_OTEL_COLLECTOR_ENDPOINT=:4317 (from telemetry-collector.env)
// Final result:
OTEL_COLLECTOR_ENDPOINT=myhost.com:4317
Usage Patterns
Building Environments
CLI Commands:
# Build staging environment (most common)
pnpm env:build staging
# Build production environment
pnpm env:build production
# Build local development environment
pnpm env:build local-dev
# List available profiles
pnpm env:list
What Happens:
- Loads profile from config/environments/profiles/<profile>.json
- Loads all component .env files specified in profile
- Loads secrets from config/environments/secrets/.env.secrets.local
- Validates all service interfaces have required variables
- Generates .env.<profile> and .env.secret.<profile> for each service
- Reports validation results and any missing variables
Development Workflow
Local Development:
# 1. Create your secrets file (one-time setup)
cp config/environments/secrets/.env.secrets.local.example \
   config/environments/secrets/.env.secrets.local
# 2. Edit secrets with your credentials
vim config/environments/secrets/.env.secrets.local
# 3. Build local-dev environment
pnpm env:build local-dev
# 4. Start services with Docker Compose
docker-compose up
# Services automatically load apps/*/.env.local-dev files
Staging Deployment:
# 1. Ensure secrets file has staging credentials
# 2. Build staging environment
pnpm env:build staging
# 3. Deploy to staging (exact process varies by deployment target)
# For Docker:
docker-compose -f docker-compose.staging.yml up
# For Kubernetes:
kubectl apply -f k8s/staging/
Production Deployment:
# 1. Build production environment
pnpm env:build production
# 2. Verify no missing variables
# 3. Deploy (secrets injected via secure CI/CD or secret management service)
Adding a New Service
Step 1: Create Service Interface
File: config/environments/services/my-service.interface.ts
export const MyServiceEnvInterface = {
name: "my-service",
location: "apps/my-service",
required: {
"SERVICE_PORT": "MY_SERVICE_PORT",
"REDIS_URL": "REDIS_URL",
},
secret: {
"API_KEY": "MY_SERVICE_API_KEY",
},
defaults: {
"LOG_LEVEL": "info",
}
};
Step 2: Add Component Configuration (if needed)
File: config/environments/components/my-service.env
NAMESPACE=MY_SERVICE
[default]
PORT=8080
[local]
PORT=8080
[production]
PORT=80
Step 3: Update Profiles
File: config/environments/profiles/staging.json
{
"components": {
"my-service": ["default", "staging"],
...
}
}
Step 4: Add Secrets (if needed)
File: config/environments/secrets/.env.secrets.local
MY_SERVICE_API_KEY=secret_key_here
Step 5: Rebuild Environment
pnpm env:build staging
Result: apps/my-service/.env.staging and apps/my-service/.env.secret.staging created
Troubleshooting
Missing Variables Error:
❌ Service 'machine' is missing required variables: REDIS_URL, MACHINE_GPU_MODE
Fix:
- Check which profile you're building
- Verify component is included in profile
- Check component file has correct environment section
- Verify namespace matches service interface expectation
Variable Not Resolving:
⚠️ Unresolved variables in HUB_REDIS_URL: ${PRODUCTION_REDIS_PASSWORD}
Fix:
- Check .env.secrets.local has the variable
- Verify variable name matches exactly (case-sensitive)
- Check for circular dependencies in variable references
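To see why a circular reference ends up as an unresolved-variable warning, it helps to sketch the multi-pass resolver. The following is a minimal illustration under stated assumptions, not the package's code: resolveVariables is a hypothetical name, the environment fallback is passed in as a plain object instead of reading process.env directly, and the 10-pass limit is taken from the resolution algorithm described in this document.

```typescript
// Matches ${NAME} references; the global flag lets replace() handle every occurrence.
const VAR_PATTERN = /\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g;

interface ResolutionResult {
  resolved: Record<string, string>;
  unresolved: string[]; // keys whose values still contain ${...} after the final pass
}

function resolveVariables(
  pool: Record<string, string>,
  env: Record<string, string | undefined> = {},
  maxPasses = 10
): ResolutionResult {
  const resolved: Record<string, string> = { ...pool };

  for (let pass = 0; pass < maxPasses; pass++) {
    let changed = false;
    for (const [key, value] of Object.entries(resolved)) {
      const next = value.replace(VAR_PATTERN, (match: string, name: string) => {
        // Prefer the pool, then the environment, otherwise keep the reference as-is.
        const candidate = name in resolved ? resolved[name] : env[name];
        return candidate !== undefined ? candidate : match;
      });
      if (next !== value) {
        resolved[key] = next;
        changed = true;
      }
    }
    if (!changed) break; // nothing left to substitute
  }

  const unresolved = Object.keys(resolved).filter((key) => /\$\{[^}]+\}/.test(resolved[key]));
  return { resolved, unresolved };
}

const result = resolveVariables({
  PRODUCTION_REDIS_PASSWORD: "secret123",
  REDIS_URL: "redis://default:${PRODUCTION_REDIS_PASSWORD}@host:6379",
  HUB_REDIS_URL: "${REDIS_URL}",
  LOOP_A: "${LOOP_B}", // circular pair, included to show the failure mode
  LOOP_B: "${LOOP_A}",
});
console.log(result.resolved.HUB_REDIS_URL);
console.log(result.unresolved);
```

In this sketch the chained reference resolves fully, while the circular pair never settles and is reported in unresolved, which corresponds to the warning shown above.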
Service Not Getting Variables:
Error: Cannot find .env.staging file
Fix:
- Verify service interface has correct location field
- Ensure profile includes required components
- Re-run pnpm env:build <profile>
Best Practices
Variable Naming Conventions
Component Variables: Use namespace prefix
# Good
REDIS_URL=...
API_PORT=...
# Bad (no namespace)
URL=...
PORT=...
Secret Variables: Use descriptive names indicating sensitivity
# Good
STORAGE_PROVIDER_AWS_SECRET_ACCESS_KEY_ENCODED=...
API_AUTH_TOKEN=...
# Bad
KEY=...
TOKEN=...
Template Variables: Use clear composition
// Good
"${SERVICE_EGRESS}${API_INGRESS}"
// Bad (unclear what's being composed)
"${A}${B}"
Component Organization
Group Related Settings: Keep all Redis settings in redis.env, not scattered across files
Use Environment Sections Appropriately:
- [default]: Common to all environments
- [local]: Local development overrides
- [staging]: Staging-specific (production-like but separate resources)
- [production]: Production values
Don't Duplicate: Use variable references instead of duplicating values
# Good
[staging]
URL=${REDIS_URL}
# Bad
[staging]
URL=redis://default:${PRODUCTION_REDIS_PASSWORD}@switchback.proxy.rlwy.net:48889
HOST=switchback.proxy.rlwy.net
Security Practices
Never Commit Secrets: Always use .env.secrets.local (gitignored)
Use Secret Categorization: Mark all sensitive variables in service interfaces
secret: {
  "DATABASE_URL": "DATABASE_URL", // ✅ Correct
}
// NOT:
required: {
  "DATABASE_URL": "DATABASE_URL", // ❌ Wrong - should be secret
}
Rotate Credentials: Update .env.secrets.local and rebuild when rotating credentials
Per-service Secrets: Service interfaces ensure each service only gets its required secrets
Profile Design
Composition Over Duplication: Use component layering
{
  "components": {
    "redis": ["default", "staging"] // ✅ Merges default + staging
  }
}
Meaningful Names: Profile names should describe their purpose clearly
- local-dev - Local development
- staging - Staging environment
- production - Production deployment
Document Profiles: Add clear descriptions in profile JSON
Advanced Topics
Custom Environment Variables
Sometimes you need environment-specific values not in component files:
Option 1: Add to component file
# components/custom.env
NAMESPACE=CUSTOM
[local]
SPECIAL_VALUE=local_value
[production]
SPECIAL_VALUE=prod_value
Option 2: Use process.env passthrough
# Set in shell before build
export SPECIAL_VALUE=runtime_value
pnpm env:build staging
# Variable available for ${SPECIAL_VALUE} substitution
Template Composition Patterns
Egress/Ingress Pattern: Used for services that need to know how to reach other services:
// Service A needs to reach Service B
"SERVICE_B_URL": "${SERVICE_A_EGRESS}${SERVICE_B_INGRESS}"
// Local:
SERVICE_A_EGRESS=http://host.docker.internal // How A reaches outside
SERVICE_B_INGRESS=:3000 // Where B listens
// Result: http://host.docker.internal:3000
// Production:
SERVICE_A_EGRESS=https://api.example.com // Public endpoint
SERVICE_B_INGRESS=/service-b // Path prefix
// Result: https://api.example.com/service-b
Docker Compose Integration
The builder can generate Docker Compose files from profiles:
{
"docker": {
"services": {
"redis": {
"image": "redis:7",
"ports": ["6379:6379"],
"condition": "components.redis includes 'local'"
}
}
}
}
Features:
- Conditional Services: Only include if component is active
- Variable Substitution: Use ${VAR} in Docker Compose config
- Auto-platform: Adds platform: linux/amd64 for cross-architecture builds
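As a rough illustration of how a conditional service could be evaluated, the sketch below handles only the exact condition shape shown in the example ("components.<name> includes '<environment>'"). The condition grammar is an assumption inferred from that single example, and evaluateCondition is a hypothetical function; the builder's actual rules may be richer or different.

```typescript
// Hypothetical: the "components" block of a profile, in either selector syntax.
type ProfileComponents = Record<string, string | string[]>;

// Evaluates conditions of the form: components.<name> includes '<environment>'
function evaluateCondition(condition: string, components: ProfileComponents): boolean {
  const match = /^components\.([\w-]+) includes '([^']+)'$/.exec(condition.trim());
  if (!match) {
    throw new Error(`Unsupported condition: ${condition}`);
  }
  const [, component, environment] = match;
  const selected = components[component];
  if (selected === undefined) return false; // component not part of this profile
  const environments = Array.isArray(selected) ? selected : [selected];
  return environments.includes(environment);
}

const localProfile: ProfileComponents = { redis: ["default", "local"] };
const stagingProfile: ProfileComponents = { redis: ["default", "staging"] };

const includeLocally = evaluateCondition("components.redis includes 'local'", localProfile);
const includeInStaging = evaluateCondition("components.redis includes 'local'", stagingProfile);
console.log(includeLocally, includeInStaging);
```

Under these assumptions the redis container would be included only for profiles that activate the [local] section of the redis component.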
Related Documentation
- Machine/Worker Architecture: See machine-worker-system.md for how environments are used in deployment
- Testing Procedures: See TESTING_PROCEDURES.md for environment-specific testing
- CLAUDE.md: See project root for development workflow and conventions
Key Takeaways
- Component files define what's available, service interfaces define what's required
- Profiles compose components for different use cases
- Secrets are automatic - just add to .env.secrets.local
- Build fails fast - missing variables caught at build time, not runtime
- Multi-pass resolution - complex variable chains resolve automatically
- Per-service outputs - each service gets exactly what it needs, nothing more
