Technical Architecture
This document provides a deeper technical dive into the systems that power the EmProps Arbitrum integration. It's written for engineers and technical reviewers who want to understand how the pieces fit together.
The Technology Stack
Every technology choice reflects a real constraint or lesson learned. Here's what we use and why.
Frontend: Next.js 14 + TypeScript
EmProps Studio is a Next.js application using the Pages Router (not App Router—we prioritized stability over bleeding-edge features). TypeScript with strict mode catches errors at compile time rather than runtime, which matters when you're handling user funds and blockchain transactions.
Why Next.js: Server-side rendering for SEO, API routes for sensitive operations, excellent Vercel deployment story. The ecosystem is mature—every problem has a solution.
Why not App Router: When we started, the App Router was unstable. Pages Router is proven, well-documented, and our team knows it deeply. We'll migrate when the benefits clearly outweigh the cost.
Key dependencies:
- Jotai for state management (atomic, no boilerplate)
- SWR for data fetching (stale-while-revalidate pattern)
- Radix UI for accessible components (headless, no design lock-in)
- Tailwind CSS for styling (utility-first, consistent)
- Framer Motion for animations (declarative, performant)
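As a small illustration of how two of these work together, here is a hedged sketch pairing a Jotai atom with an SWR fetch; the atom name and API route are hypothetical, not EmProps code.

```typescript
// Sketch only: the atom name and API route are hypothetical.
import { atom, useAtom } from "jotai";
import useSWR from "swr";

// Client-side UI state lives in an atom (no store boilerplate).
const selectedCollectionIdAtom = atom<string | null>(null);

const fetcher = (url: string) => fetch(url).then((res) => res.json());

export function useSelectedCollection() {
  const [collectionId, setCollectionId] = useAtom(selectedCollectionIdAtom);

  // Server state is fetched with stale-while-revalidate semantics.
  const { data, error, isLoading } = useSWR(
    collectionId ? `/api/collections/${collectionId}` : null,
    fetcher
  );

  return { collection: data, error, isLoading, setCollectionId };
}
```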
Web3: wagmi + viem + Dynamic Labs
For Ethereum-compatible chains (including Arbitrum), we use wagmi and viem—the modern replacement for ethers.js. Viem is lower-level and faster; wagmi provides React hooks on top.
Why wagmi/viem over ethers.js: Better TypeScript support, smaller bundle size, more predictable behavior. The ethers.js v5-to-v6 migration was painful; viem's API is cleaner.
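For a feel of the API surface, a minimal viem sketch that reads a collection's total supply on Arbitrum; the contract address parameter and ABI fragment are placeholders.

```typescript
// Sketch only: the contract address parameter and ABI fragment are placeholders.
import { createPublicClient, http, parseAbi } from "viem";
import { arbitrum } from "viem/chains";

const client = createPublicClient({
  chain: arbitrum,
  transport: http(),
});

const erc721Abi = parseAbi([
  "function totalSupply() view returns (uint256)",
]);

export async function readTotalSupply(collection: `0x${string}`) {
  // Typed read call: viem infers the bigint return type from the ABI.
  return client.readContract({
    address: collection,
    abi: erc721Abi,
    functionName: "totalSupply",
  });
}
```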
Dynamic Labs handles wallet connection and user onboarding. It's an enterprise Web3 auth solution that supports 200+ wallet types, provides embedded wallets for users without existing wallets, and handles the complexity of multi-chain authentication.
Why Dynamic Labs: Building wallet connection from scratch is deceptively hard. Edge cases (mobile wallets, hardware wallets, network switching, session management) consume engineering time better spent on product. Dynamic Labs handles this and actively maintains compatibility as wallets evolve.
Backend: Express.js + Redis + PostgreSQL
Our API layer uses Express.js—not because it's exciting, but because it's proven. Fifteen years of production use, extensive middleware ecosystem, and every engineer knows it.
Redis serves three purposes:
- Job queue: Sorted sets with priority scores
- State store: Job metadata, worker status, machine health
- Pub/sub: Real-time event distribution
Running one service instead of three (separate queue, cache, pub/sub) reduces operational complexity dramatically. Redis is fast enough (sub-millisecond operations) and reliable enough (Redis Cluster for HA) for our scale.
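A hedged sketch of the sorted-set queue idea using ioredis follows; the key names and scoring scheme are illustrative rather than the production layout.

```typescript
// Sketch only: key names and scoring are illustrative, not the production layout.
import Redis from "ioredis";

const redis = new Redis(process.env.REDIS_URL);

// Lower score = higher priority; ties broken by enqueue time.
export async function enqueueJob(jobId: string, priority: number) {
  const score = priority * 1e13 + Date.now();
  await redis.zadd("jobs:pending", score, jobId);
}

// Peek at the highest-priority pending job without removing it.
export async function peekNextJob(): Promise<string | null> {
  const [jobId] = await redis.zrange("jobs:pending", 0, 0);
  return jobId ?? null;
}
```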
PostgreSQL stores persistent data: user accounts, collection metadata, generation history, credit balances. We use Prisma as the ORM because it provides type-safe database access and excellent migration tooling.
Why Prisma: The generated TypeScript types match your schema exactly. Queries are validated at compile time. Migrations are versioned and auditable. The developer experience is significantly better than raw SQL or older ORMs.
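To illustrate the type-safety claim, a small query sketch against the Collection model described later in this document; treat the field names as assumptions rather than the exact schema.

```typescript
// Sketch only: model and field names follow the data-model section, not the real schema.
import { PrismaClient } from "@prisma/client";

const prisma = new PrismaClient();

// The return type is generated from the schema, so a typo in a field
// name or a wrong filter type fails at compile time, not at runtime.
export async function listDeployedCollections(creatorId: string) {
  return prisma.collection.findMany({
    where: { creatorId, status: "deployed" },
    orderBy: { createdAt: "desc" },
    select: { id: true, name: true, contractAddress: true, maxSupply: true },
  });
}
```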
AI Generation: ComfyUI + Custom Nodes
ComfyUI is a node-based interface for Stable Diffusion. Users (or our backend) compose workflows by connecting nodes: text encoder → sampler → VAE decoder → output. This visual representation maps cleanly to the underlying diffusion pipeline.
We run ComfyUI on GPU workers with 64+ custom nodes pre-installed. These nodes extend functionality: ControlNet for pose/edge guidance, LoRA loading for style fine-tuning, upscalers for resolution enhancement, IP-Adapter for image-guided generation.
Why ComfyUI over alternatives: Flexibility. A1111's WebUI is designed for interactive use; ComfyUI's workflow-as-JSON approach fits API-driven generation. We can version workflows, share them between users, and execute them programmatically.
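To make the workflow-as-JSON point concrete, here is a hedged sketch of submitting a serialized workflow to a ComfyUI instance over HTTP; the host, client ID handling, and response handling are assumptions rather than our exact connector code.

```typescript
// Sketch only: host, client ID handling, and workflow shape are assumptions,
// not the exact connector implementation.
type ComfyWorkflow = Record<
  string,
  { class_type: string; inputs: Record<string, unknown> }
>;

export async function submitWorkflow(
  host: string,              // e.g. "http://worker-gpu-01:8188" (hypothetical)
  workflow: ComfyWorkflow,   // the node graph, serialized as JSON
  clientId: string           // used to correlate WebSocket progress events
): Promise<string> {
  const res = await fetch(`${host}/prompt`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ prompt: workflow, client_id: clientId }),
  });
  if (!res.ok) throw new Error(`ComfyUI rejected workflow: ${res.status}`);
  const { prompt_id } = await res.json();
  return prompt_id; // completion arrives via the WebSocket or history endpoint
}
```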
Custom node installation: Our machine startup script clones node repositories in parallel (5 at a time), installs requirements, and handles environment configuration. The 64 default nodes cover most use cases; users can request additional nodes.
Blockchain Indexing: Ponder
Ponder is a TypeScript-native blockchain indexer. It watches contract events, processes them through handlers you write, and stores results in PostgreSQL. Unlike The Graph, it's self-hosted—you control your infrastructure and data.
Why Ponder over The Graph:
- TypeScript throughout (our whole stack is TypeScript)
- No subgraph deployment complexity
- Built-in WebSocket server for real-time updates
- Direct PostgreSQL access for custom queries
Schema design: we index five entity types:
- Apps (deployed collections)
- App tokens (individual NFTs)
- Mint events (who minted what, when)
- Transfer events (ownership history)
- Token upgrades (version tracking)
This covers the queries our UI needs: "show all collections by this creator," "show mint history for this collection," "show NFTs owned by this address."
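For a sense of what indexing these entities looks like, a hedged sketch of a Ponder-style event handler; the contract name, event shape, and database call are illustrative, and the exact import path and store API vary by Ponder version.

```typescript
// Sketch only: contract name, event shape, and table are illustrative; the exact
// import path and database API depend on the Ponder version in use.
import { ponder } from "@/generated";

ponder.on("Emerge_721:Minted", async ({ event, context }) => {
  // Each handler receives the decoded event plus a database handle,
  // and writes rows that the UI can query directly from PostgreSQL.
  await context.db.AppTokenMint.create({
    id: `${event.transaction.hash}-${event.log.logIndex}`,
    data: {
      appId: event.log.address,
      recipient: event.args.to,
      quantity: event.args.quantity,
      timestamp: event.block.timestamp,
    },
  });
});
```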
Infrastructure: Ephemeral GPU Compute
Our workers run on spot instances from vast.ai, a decentralized GPU marketplace where individuals rent out idle compute. Prices are 70-90% below AWS/GCP.
The tradeoff: No guarantees. Machines can disappear without warning. There's no shared filesystem. Network quality varies.
Our solutions:
- Job state in Redis: Workers are stateless; job progress survives machine churn
- Heartbeat-based health: Workers send heartbeats; missing heartbeats trigger job requeue (sketched after this list)
- Pull-based job claiming: Workers request work when ready; no stale worker registries
- Blob storage for outputs: Generated images go to Azure Blob Storage, not local disk
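To make the heartbeat and requeue behavior concrete, here is a minimal monitor sketch; the key names, 60-second timeout, and requeue path are assumptions about the general approach, not the exact implementation.

```typescript
// Sketch only: key names, timeout, and requeue path are assumptions about the
// general approach, not the exact monitor implementation.
import Redis from "ioredis";

const redis = new Redis(process.env.REDIS_URL);
const HEARTBEAT_TIMEOUT_MS = 60_000; // assumed threshold

export async function requeueOrphanedJobs() {
  const workerIds = await redis.smembers("workers:active");
  for (const workerId of workerIds) {
    const worker = await redis.hgetall(`worker:${workerId}`);
    const lastBeat = Number(worker.last_heartbeat ?? 0);

    if (Date.now() - lastBeat > HEARTBEAT_TIMEOUT_MS && worker.current_job) {
      // The machine is presumed gone: put its job back in the queue
      // and mark the worker as stale so it stops receiving work.
      await redis
        .multi()
        .hset(`job:${worker.current_job}`, "status", "pending")
        .zadd("jobs:pending", Date.now(), worker.current_job)
        .srem("workers:active", workerId)
        .exec();
    }
  }
}
```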
Smart Contract Architecture
Contract Hierarchy
The Arbitrum deployment uses a factory pattern with upgradeable and non-upgradeable components:
Factory (Upgradeable via UUPS)
- Deploys new collections as minimal proxies
- Uses CREATE2 for deterministic addresses
- Registers versioned implementations (Emerge_721 V1, V2, etc.)
- Upgradeable so we can improve deployment logic
Emerge_721 (Non-upgradeable minimal proxy)
- The actual NFT collection contract
- ERC721A for gas-efficient batch minting
- Token upgrade system with version tracking
- Dynamic metadata via HTTP API
- Immutable once deployed (security guarantee for collectors)
- Each deployment is a lightweight proxy pointing to shared implementation
Why This Design
Minimal proxies (ERC1167): Deploying a full contract for each collection would cost significant gas. Minimal proxies store only a pointer to the implementation; all logic is shared. Deployment cost drops to ~45,000 gas regardless of collection complexity.
CREATE2 determinism: Given the same inputs, CREATE2 produces the same address on any EVM chain. This means:
- We can predict addresses before deployment (useful for metadata preparation)
- Cross-chain collections can have the same address (if we deploy to multiple L2s)
- URLs can include contract addresses before deployment
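Address prediction can happen entirely off-chain. Below is a hedged viem sketch; the factory address, salt derivation, and minimal-proxy init code are placeholders for whatever the factory actually uses on-chain.

```typescript
// Sketch only: factory address, salt derivation, and proxy init code are
// placeholders for whatever the factory actually uses on-chain.
import { getContractAddress, keccak256, toHex, type Hex } from "viem";

export function predictCollectionAddress(
  factory: `0x${string}`,   // the deployed Factory proxy (placeholder)
  collectionId: string,     // whatever the factory hashes into the salt
  proxyInitCode: Hex        // ERC-1167 minimal proxy creation code + implementation
): `0x${string}` {
  return getContractAddress({
    opcode: "CREATE2",
    from: factory,
    salt: keccak256(toHex(collectionId)),
    bytecode: proxyInitCode,
  });
}
```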
UUPS upgradeability (for factory only): We want the ability to fix bugs and add features without breaking existing collections. UUPS puts upgrade logic in the implementation contract, reducing proxy overhead.
Immutable collection contracts: Once deployed, a collection contract cannot change. This is a security feature—collectors know the rules can't be altered after they mint.
Centralized minting for MVP: All minting is done by the platform admin via mintTo(). Users never pay gas. Payments are handled off-chain (Diamo, credit cards, crypto to platform wallet).
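A hedged sketch of what the admin mint path might look like with a viem wallet client; the mintTo signature, ABI fragment, and key handling here are assumptions rather than the audited contract interface.

```typescript
// Sketch only: the ABI fragment, function signature, and key handling are
// assumptions, not the audited contract interface.
import { createWalletClient, http, parseAbi } from "viem";
import { privateKeyToAccount } from "viem/accounts";
import { arbitrum } from "viem/chains";

const adminAccount = privateKeyToAccount(process.env.MINTER_KEY as `0x${string}`);

const walletClient = createWalletClient({
  account: adminAccount,
  chain: arbitrum,
  transport: http(),
});

const mintAbi = parseAbi(["function mintTo(address to, uint256 quantity)"]);

// Called by the backend after off-chain payment is confirmed;
// the user never signs a transaction or pays gas.
export async function mintForUser(
  collection: `0x${string}`,
  recipient: `0x${string}`,
  quantity: bigint
) {
  return walletClient.writeContract({
    address: collection,
    abi: mintAbi,
    functionName: "mintTo",
    args: [recipient, quantity],
  });
}
```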
0xSplits Integration
0xSplits provides immutable, trustless revenue distribution. Each collection has two Split contracts:
Primary Sales Split (includes Arbitrum Foundation):
- Creator wallet (percentage TBD)
- Arbitrum Foundation wallet
- Emerge platform wallet
Royalties Split (secondary sales):
- 80% to creator wallet
- 20% to Emerge platform wallet
The primary sales split is used by the platform when forwarding mint payments. The royalties split is stored in the contract and returned by royaltyInfo() for marketplaces.
Why 0xSplits:
- Audited contracts with extensive production use
- Immutable splits (creator can't change terms after launch)
- Clean UX (recipients see pending balance and withdraw when ready)
- Gas-efficient (batch withdrawals, minimal storage)
Gas Optimization
ERC721A's batch minting saves significant gas:
| Tokens Minted | Standard ERC721 | ERC721A | Savings |
|---|---|---|---|
| 1 | ~51,000 | ~51,000 | 0% |
| 5 | ~255,000 | ~53,000 | 79% |
| 10 | ~510,000 | ~55,000 | 89% |
| 20 | ~1,020,000 | ~57,000 | 94% |
The savings come from deferred ownership tracking. Standard ERC721 writes ownership for each token individually. ERC721A writes once and infers ownership for subsequent tokens in the batch.
Tradeoff: the first transfer of a batch-minted token costs slightly more, because ownership that was previously inferred must be written explicitly. For mint-heavy use cases like NFT drops, this is an excellent trade.
Generation Pipeline
Job Flow
- User submits generation request (via Studio UI or API)
- API creates job with parameters, stores in Redis
- Job enters priority queue (sorted set with priority score)
- Worker polls for work matching its capabilities
- Redis function atomically matches and claims job (see the claim sketch after this list)
- Worker executes via appropriate connector (ComfyUI, Ollama, etc.)
- Worker reports progress (0-100%) via Redis pub/sub
- Worker uploads output to blob storage
- Worker completes job with result URL
- Client receives completion via WebSocket
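The atomic claim referenced above keeps two workers from grabbing the same job. Below is a minimal sketch of that idea as a Lua script invoked through ioredis; key names are simplified and the capability matching is omitted, so this is not the production function.

```typescript
// Sketch only: key names are simplified and the capability check is omitted;
// the real matching logic is richer than this.
import Redis from "ioredis";

const redis = new Redis(process.env.REDIS_URL);

// Pop the highest-priority job and mark it claimed in one atomic step,
// so two workers polling at the same time can never claim the same job.
const CLAIM_SCRIPT = `
  local jobId = redis.call('ZRANGE', KEYS[1], 0, 0)[1]
  if not jobId then return nil end
  redis.call('ZREM', KEYS[1], jobId)
  redis.call('HSET', 'job:' .. jobId, 'status', 'claimed', 'claimed_by', ARGV[1])
  return jobId
`;

export async function claimNextJob(workerId: string): Promise<string | null> {
  const jobId = await redis.eval(CLAIM_SCRIPT, 1, "jobs:pending", workerId);
  return (jobId as string) ?? null;
}
```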
Connector Pattern
Different AI services have different protocols (WebSocket, HTTP streaming, REST). The connector pattern normalizes this:
    abstract class BaseConnector {
      abstract processJob(job: Job, progress: ProgressCallback): Promise<JobResult>
    }

Implementations:
- ComfyUIConnector: WebSocket to ComfyUI, workflow execution, image download
- OllamaConnector: HTTP streaming for text generation
- OpenAIConnector: REST API for GPT/DALL-E
Each connector handles service-specific retry logic, error classification, and progress reporting. The worker doesn't need to know protocol details.
Batch Generation for NFT Collections
When generating an NFT collection (say, 1,000 pieces), we don't create 1,000 sequential jobs. Instead:
- Prepare trait combinations: Based on collection config, generate the trait matrix
- Create batch of jobs: One job per NFT, all queued simultaneously
- Distributed execution: Jobs spread across available workers
- Aggregate results: As jobs complete, collect output URLs
- Generate metadata: Create ERC721 metadata JSON for each token
- Upload to IPFS: Metadata files pinned to IPFS via Pinata
- Prepare for mint: Return base URI for on-chain deployment
A 1,000-piece collection with 10 available workers completes in ~30 minutes (assuming ~20s per generation). The parallelization is automatic—the job queue handles distribution.
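The metadata step emits one standard ERC-721 JSON document per token. A minimal sketch follows; the image URI scheme and attribute shape are assumptions.

```typescript
// Sketch only: the image URI scheme and attribute shape are assumptions.
interface GeneratedPiece {
  tokenId: number;
  imageIpfsHash: string;            // CID of the uploaded image
  traits: Record<string, string>;   // e.g. { Background: "Dusk", Palette: "Neon" }
}

// Produces one ERC-721 metadata document per token, ready to pin to IPFS.
export function buildTokenMetadata(collectionName: string, piece: GeneratedPiece) {
  return {
    name: `${collectionName} #${piece.tokenId}`,
    image: `ipfs://${piece.imageIpfsHash}`,
    attributes: Object.entries(piece.traits).map(([trait_type, value]) => ({
      trait_type,
      value,
    })),
  };
}
```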
Data Models
PostgreSQL (via Prisma)
Key models for NFT functionality:
Collection: Represents a configured collection template
- name, description, symbol
- maxSupply, mintPrice
- chainId (which blockchain)
- contractAddress (once deployed)
- creatorId (user who created it)
- status (draft, deployed, active)
FlatFile: Generated asset
- collectionId (parent collection)
- ipfsHash (content address)
- metadata (trait values, generation params)
- tokenId (if minted)
JobHistory: Execution log
- jobId, status, duration
- input parameters, output URLs
- error details (if failed)
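As a small illustration of how these models connect, here is a hedged sketch that links a generated FlatFile to its token ID once the mint confirms; model and field names follow the descriptions above rather than the actual schema.

```typescript
// Sketch only: model and field names follow the descriptions above, not the real schema.
import { PrismaClient } from "@prisma/client";

const prisma = new PrismaClient();

// After a mint confirms on-chain, link the generated asset to its token ID.
export async function attachTokenId(flatFileId: string, tokenId: number) {
  return prisma.flatFile.update({
    where: { id: flatFileId },
    data: { tokenId },
  });
}
```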
Redis (Job State)
Job data stored as Redis hashes:
    job:{id}
      status: pending|claimed|completed|failed
      payload: JSON (generation parameters)
      priority: number
      created_at: timestamp
      claimed_by: worker_id (if claimed)
      progress: 0-100
      result: JSON (output URLs, if completed)

Worker data:
    worker:{id}
      status: idle|busy
      capabilities: JSON (services, hardware, models)
      current_job: job_id (if busy)
      last_heartbeat: timestamp

Ponder Indexed Data
Blockchain events indexed to PostgreSQL:
- apps: Deployed collections (Emerge_721 instances)
- app_tokens: Individual NFTs in collections
- app_token_mints: Mint events with recipient, quantity, transaction
- app_token_transfers: Transfer history
- app_token_upgrades: Token upgrade events with version tracking
Security Considerations
Smart Contract Security
- OpenZeppelin implementations for standard functionality
- ReentrancyGuard on functions that transfer value
- Access control (Ownable) on admin functions
- Rate limiting on mints (configurable per collection)
- Audit planned before mainnet deployment
Backend Security
- API rate limiting per user/IP
- Wallet signature verification for authenticated endpoints (sketched after this list)
- Input validation on all parameters
- Environment variables for secrets (not committed)
- CORS configuration for allowed origins
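For the signature check called out above, here is a hedged Express middleware sketch using viem's verifyMessage; the header names and signed message format are assumptions, not the production auth flow.

```typescript
// Sketch only: header names and the signed message format are assumptions.
import type { NextFunction, Request, Response } from "express";
import { verifyMessage } from "viem";

// Expects the client to sign a known message and send address + signature
// in headers; rejects the request if the signature does not match.
export async function requireWalletAuth(req: Request, res: Response, next: NextFunction) {
  const address = req.header("x-wallet-address") as `0x${string}` | undefined;
  const signature = req.header("x-wallet-signature") as `0x${string}` | undefined;
  if (!address || !signature) return res.status(401).json({ error: "missing credentials" });

  const valid = await verifyMessage({
    address,
    message: `Sign in to EmProps Studio as ${address}`,
    signature,
  });
  if (!valid) return res.status(401).json({ error: "invalid signature" });

  next();
}
```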
Revenue Security
- 0xSplits contracts are immutable (can't change split after deployment)
- Platform wallet uses multi-sig
- Split configuration verified before collection deployment
- On-chain verification (anyone can audit the split)
Observability
OpenTelemetry Integration
Distributed tracing across the entire request path:
    HTTP Request → API → Redis → Worker → Connector → External Service
      Span 1       └─ Span 2    └─ Span 3           └─ Span 4

All spans share a trace ID, enabling end-to-end debugging. When a generation fails, we can trace from the user's request through job matching, worker execution, and external service calls.
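For reference, a hedged sketch of how a worker-side span could be opened with the OpenTelemetry API; the tracer name and attributes are illustrative rather than our exact instrumentation.

```typescript
// Sketch only: tracer name and attributes are illustrative, not the real instrumentation.
import { SpanStatusCode, trace } from "@opentelemetry/api";

const tracer = trace.getTracer("emprops-worker");

export async function processWithTracing<T>(jobId: string, work: () => Promise<T>): Promise<T> {
  // The active context carries the trace ID set by the API layer, so this
  // span joins the same end-to-end trace as the original HTTP request.
  return tracer.startActiveSpan("job.process", async (span) => {
    span.setAttribute("job.id", jobId);
    try {
      return await work();
    } catch (err) {
      span.recordException(err as Error);
      span.setStatus({ code: SpanStatusCode.ERROR });
      throw err;
    } finally {
      span.end();
    }
  });
}
```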
Structured Logging
Winston with OTLP exporter. Every log entry includes:
- Trace ID (correlate with spans)
- Service name
- Log level
- Structured context (job ID, user ID, etc.)
Metrics
Key metrics we track:
- Job completion time (by service type, worker)
- Queue depth and wait time
- Worker utilization
- Error rates (by category)
- WebSocket connection count and latency
Performance Characteristics
Latency
| Operation | Typical Latency |
|---|---|
| Job claim | <1ms (Redis Lua function) |
| Progress update delivery | <100ms (WebSocket) |
| Single image generation | 5-20s (depends on complexity) |
| Collection deployment | 1-5s (blockchain confirmation) |
Throughput
Current capacity with 10 workers: ~500 generations/hour. With 50 workers: ~2,500 generations/hour.
The system scales linearly with workers—no central bottleneck.
Reliability
- 95%+ job success rate
- Automatic retry on transient failures
- Job state survives worker/machine failures
- 24-hour job retention for debugging
Related Documentation:
- Overview - The full story of what we're building
- Current State - Inventory of existing infrastructure
- Legacy Tezos - Production Tezos system documentation
