Open Beta: Archipelag.io is in open beta until June 2026. All credits and earnings are virtual.

System Overview

High-level architecture of the Archipelag.io distributed compute network

Archipelag.io is built as a distributed system with three main components: the Coordinator (control plane), Islands (running the Island software), and the Message Fabric (NATS) connecting them.

Repository Structure

archipelag-io/
├── app/                  # Coordinator (Phoenix LiveView)
├── node-agent/           # Island software (Rust)
├── proto-contracts/      # Protobuf definitions
├── workload-containers/  # Docker images for Cargos
├── website/              # Marketing site (Zola)
├── docs/                 # This documentation
└── infra/                # Docker Compose, Terraform

Component Diagram

                                      ┌───────────────┐
                                      │  website      │  (public)
                                      └──────┬────────┘
                                             │ links/docs/API
┌──────────────────┐        publishes        │
│ proto-contracts  │◄────────────────────────┘
│ (gRPC/Protobuf)  │
└──────┬───────────┘
       │ versioned schema (tags)
       │
       │            consumes                   consumes
┌──────▼───────────┐    ┌─────────────────┐    ┌─────────────────┐
│   app/           │    │   node-agent    │    │   sdk-js/python │
│ (Phoenix/Elixir) │    │   (Rust)        │    │  (client APIs)  │
└───┬──────────────┘    └───┬─────────────┘    └────────┬────────┘
    │ uses images           │ pulls images               │
    │ and manifests         │                            │ used by
    │                       │                            │ websites/apps
┌───▼───────────────────────▼───┐
│     workload-containers       │ (OCI images: vLLM, SD, etc.)
└───────────────────────────────┘

Runtime Data Flow

[ End User ]
    │  HTTP/WS
    ▼
┌───────────────┐
│  Coordinator  │  (app/)
│ - LiveView UI │
│ - Oban jobs   │
│ - Placement   │
└─┬───────────┬─┘
  │           │ gRPC (schemas from proto-contracts)
  │           ▼
  │     ┌──────────────┐              telemetry/metrics
  │     │ NATS/JetStrm │◄──────────────┐
  │     └──────┬───────┘               │
  │            │ jobs/heartbeats       │
  │            ▼                       │
  │     ┌───────────────┐   pulls OCI  │
  │     │  node-agent   │──────────────┘
  │     │  (on Island)  │   images from workload-containers
  │     └──────┬────────┘
  │            │ containerd/NVIDIA
  │            ▼
  │     ┌───────────────┐
  │     │    Cargos     │  (vLLM / SD / FFmpeg / WASM)
  │     └───────────────┘
  │
  │  PostgreSQL  ⇦ coordinator owns schema & billing
  │  ──────────────────────────────────────────────
  └─► Prometheus/Grafana for observability

Core Components

Coordinator (app/)

The Coordinator is the control plane, built with Phoenix LiveView.

Responsibilities:

  • Web UI and API
  • User authentication and authorization
  • Job placement and dispatch
  • Island health monitoring
  • Billing and credit management
  • Background jobs via Oban
  • Database migrations

GET /api/v1/health
Health check endpoint for load balancers and monitoring.

POST /api/v1/chat/completions [Required]
Submit a chat completion request. Returns a streaming SSE response.
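
As a rough illustration of consuming the streaming SSE response, the sketch below parses Server-Sent Events framing from an iterable of text lines. It makes no network call. The chunk shape ({"delta": ...}) and the "[DONE]" sentinel are assumptions for the example; this page does not specify the response schema.

```python
import json
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete SSE event."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith(":"):
            continue  # SSE comment, often used as a keep-alive
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
            continue
        if line.startswith("data:"):
            value = line[len("data:"):]
            # SSE strips a single leading space after the colon.
            buffer.append(value[1:] if value.startswith(" ") else value)


def collect_text(lines: Iterable[str]) -> str:
    """Join streamed text chunks (assumed shape: {"delta": "..."})."""
    parts = []
    for data in iter_sse_data(lines):
        if data == "[DONE]":  # assumed end-of-stream sentinel
            break
        parts.append(json.loads(data).get("delta", ""))
    return "".join(parts)


# Hypothetical stream, shaped the way this sketch assumes.
SAMPLE_STREAM = [
    'data: {"delta": "Hel"}\n',
    "\n",
    ": ping\n",
    'data: {"delta": "lo"}\n',
    "\n",
    "data: [DONE]\n",
    "\n",
]
```

In a real client, the lines would come from the HTTP response body of the POST request instead of a list.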

Island Software (node-agent/)

The Island software is a single Rust binary that runs on each Island.

Responsibilities:

  • Hardware capability detection (GPU, CPU, memory)
  • Heartbeat and health reporting
  • Container image management
  • Cargo execution and resource isolation
  • Output streaming back to coordinator
  • Resource metering for billing
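
To make the heartbeat responsibility concrete, here is a minimal sketch of how a receiver could decide which Islands are still healthy. The class name, the 15-second timeout, and the Island IDs are hypothetical; the actual heartbeat interval and message format are not described on this page.

```python
from typing import Dict, List


class HeartbeatTracker:
    """Tracks the last heartbeat time per Island (illustrative only)."""

    def __init__(self, timeout_s: float = 15.0) -> None:
        self.timeout_s = timeout_s  # assumed value, not from the docs
        self._last_seen: Dict[str, float] = {}

    def record(self, island_id: str, now: float) -> None:
        """Store the time at which a heartbeat arrived."""
        self._last_seen[island_id] = now

    def healthy(self, now: float) -> List[str]:
        """Islands whose last heartbeat is within the timeout."""
        return sorted(
            i for i, seen in self._last_seen.items()
            if now - seen <= self.timeout_s
        )

    def stale(self, now: float) -> List[str]:
        """Islands that have missed the timeout window."""
        return sorted(
            i for i, seen in self._last_seen.items()
            if now - seen > self.timeout_s
        )
```

Passing the clock value in explicitly keeps the logic easy to test; a running service would pass `time.monotonic()`.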

Message Fabric (NATS)

NATS with JetStream provides the messaging layer.

Key Features:

  • Durable message delivery for job dispatch
  • Subject-based routing for Island targeting
  • Streaming for real-time output
  • Automatic reconnection handling
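
Subject-based routing can be illustrated with a small matcher that follows NATS wildcard rules: `*` matches exactly one token and `>` matches one or more trailing tokens. The subject layout `islands.<island_id>.<channel>` is a hypothetical naming scheme for this sketch, not the network's documented one.

```python
def island_subject(island_id: str, channel: str) -> str:
    """Build a subject for one Island (hypothetical naming scheme)."""
    return f"islands.{island_id}.{channel}"


def subject_matches(pattern: str, subject: str) -> bool:
    """Return True if a NATS-style pattern matches a concrete subject."""
    p_tokens = pattern.split(".")
    s_tokens = subject.split(".")
    for i, p in enumerate(p_tokens):
        if p == ">":
            # '>' is only valid as the last token and needs at least
            # one remaining subject token to match.
            return i == len(p_tokens) - 1 and len(s_tokens) > i
        if i >= len(s_tokens):
            return False
        if p != "*" and p != s_tokens[i]:
            return False
    return len(p_tokens) == len(s_tokens)
```

For example, a subscriber on `islands.isl-42.>` would see everything addressed to that Island, while `islands.*.heartbeat` would collect heartbeats from all Islands.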

Cargo Containers

Pre-built, signed Docker images for specific Cargos.

Current Cargos:

Cargo      Image                                Description
llm-chat   ghcr.io/archipelag-io/llm-chat:v1    LLM inference with llama.cpp
image-gen  ghcr.io/archipelag-io/image-gen:v1   Image generation with SD

Controlled Cargos
Version 1 runs only network-approved, signed containers. Consumers cannot execute arbitrary code. This is a fundamental security constraint.
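
One way to picture the "approved containers only" rule is an allowlist that pins each image reference to a content digest. This is a simplified stand-in: it checks a SHA-256 digest, not a cryptographic signature, and the class and method names are hypothetical. The real signing and verification mechanism is not described on this page.

```python
import hashlib
import hmac
from typing import Dict


def digest_of(manifest_bytes: bytes) -> str:
    """Content digest in the common 'sha256:<hex>' form."""
    return "sha256:" + hashlib.sha256(manifest_bytes).hexdigest()


class CargoAllowlist:
    """Maps approved image references to pinned digests (illustrative)."""

    def __init__(self) -> None:
        self._approved: Dict[str, str] = {}

    def approve(self, image_ref: str, manifest_bytes: bytes) -> None:
        """Pin an image reference to the digest of its manifest."""
        self._approved[image_ref] = digest_of(manifest_bytes)

    def may_run(self, image_ref: str, manifest_bytes: bytes) -> bool:
        """Allow execution only for approved refs with a matching digest."""
        expected = self._approved.get(image_ref)
        if expected is None:
            return False  # unknown image: never run
        return hmac.compare_digest(expected, digest_of(manifest_bytes))
```

Anything not explicitly approved is rejected by default, which mirrors the constraint that consumers cannot execute arbitrary code.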

Technology Stack

Component          Technology                  Rationale
Coordinator        Phoenix LiveView (Elixir)   Real-time native, fault-tolerant
Job Queue          Oban (Elixir)               Persistent, reliable
Message Fabric     NATS JetStream              Durable, fast, reconnection
Island Software    Rust                        Memory-safe, single binary
Database           PostgreSQL                  Reliable, Ecto migrations
Container Runtime  Docker/containerd           Ubiquitous, GPU support
Cargo Protocol     gRPC (Protobuf)             Typed contracts, streaming

Design Principles

Security First
Consumers don't trust Islands. Islands don't trust consumer code. The coordinator is the single source of authority. All Cargos are signed and verified.
Assume Failure
Islands run on consumer hardware with unreliable connections. The system is designed for retry, requeue, and failover at every layer.
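
A minimal sketch of the retry idea, assuming exponential backoff and a final hand-off to requeueing. The delay values, attempt count, and the choice of `ConnectionError` as the retryable failure are assumptions for illustration, not the system's actual policy.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Exponential backoff: base * 2^attempt, capped at `cap` seconds."""
    return min(cap, base * (2 ** attempt))


def run_with_retry(
    task: Callable[[], T],
    max_attempts: int = 4,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `task`, retrying on ConnectionError with backoff between tries."""
    last_error = None
    for attempt in range(max_attempts):
        try:
            return task()
        except ConnectionError as exc:
            last_error = exc
            # No point sleeping after the final failed attempt.
            if attempt < max_attempts - 1:
                sleep(backoff_delay(attempt))
    # At this point the caller would requeue the job for another Island.
    raise RuntimeError("task failed after retries") from last_error
```

The `sleep` parameter is injectable so the behaviour can be exercised without real waiting.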
Local-First
Prefer nearby Islands. Measure real RTT, not just geography. Lower latency improves the experience and reduces costs.
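
As a sketch of "measure real RTT", the function below picks the Island with the lowest median round-trip time from a set of measured samples. Using the median (so a single slow probe does not dominate) is an assumption of this example, as are the Island IDs; the actual placement logic is not described here.

```python
from statistics import median
from typing import Dict, List, Optional


def pick_island(rtt_samples_ms: Dict[str, List[float]]) -> Optional[str]:
    """Return the Island ID with the lowest median RTT, or None."""
    # Islands with no samples are skipped: no measurement, no placement.
    medians = {
        island_id: median(samples)
        for island_id, samples in rtt_samples_ms.items()
        if samples
    }
    if not medians:
        return None
    return min(medians, key=medians.get)
```

For instance, an Island with samples of 40, 42, and 300 ms (median 42) would be preferred over one that is steady at about 55 ms.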
Streaming
LLM responses stream token-by-token. All APIs are designed for incremental output delivery.

Next Steps

Coordinator Details (/architecture/coordinator/)
Deep dive into the Phoenix application architecture.

Island
How the Island software manages Cargos.

Cargos
Container specifications and execution model.