
Placement & Fit Scoring

How the coordinator selects the best Island for each job — multi-dimensional scoring, GPU bandwidth estimation, and runtime-specific strategies

When a job is submitted, the coordinator doesn’t just find any qualifying Island — it ranks candidates by how well they can run the Cargo and picks the best one. This is the fit scoring system.

Overview

Traditional placement checks binary requirements: “does this Island have enough VRAM?” Fit scoring goes further by estimating how well the Cargo will perform on each candidate, producing a composite 0–100 score across multiple dimensions.

Job submitted
    │
    ▼
Filter: online, approved, not in cooldown,
        runtime compatible, meets min requirements
    │
    ▼
Score: up to 10 candidates ranked by fit
    │
    ▼
Select: highest composite score wins
    │
    ▼
Region tiebreaker: same region > adjacent > any

The system is implemented in Coordinator.Placement.FitScoring and Coordinator.Placement.GpuSpecs.

Scoring Dimensions

Every candidate Island is scored on three dimensions, each 0–100:

| Dimension | What it measures | Example |
|---|---|---|
| Speed | Estimated throughput relative to a target | GPU with 100 tok/s scores higher than one with 20 tok/s |
| Fit | Resource utilization sweet spot: not too tight, not wasted | 60% VRAM utilization scores higher than 95% or 20% |
| Headroom | Capacity for concurrent work | An idle Island scores higher than one running 3 jobs |

The composite score is a weighted average. Weights vary by runtime type:

| Runtime | Speed | Fit | Headroom |
|---|---|---|---|
| llmcpp | 50% | 35% | 15% |
| container | 30% | 40% | 30% |
| wasm | 25% | 40% | 35% |
| coreml / onnx | 45% | 40% | 15% |
Why different weights?
LLM Cargos prioritize speed because users are waiting for token-by-token responses. Container Cargos prioritize fit and headroom because they need reliable resources and often run concurrently. WASM Cargos are lightweight and benefit most from headroom.
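The weighted average described above can be sketched in a few lines. This is an illustrative Python sketch using the weights from the table; the `WEIGHTS` map and `composite_score` function are hypothetical names, not the actual `Coordinator.Placement.FitScoring` API.

```python
# Per-runtime weights, mirroring the table above.
WEIGHTS = {
    "llmcpp":    {"speed": 0.50, "fit": 0.35, "headroom": 0.15},
    "container": {"speed": 0.30, "fit": 0.40, "headroom": 0.30},
    "wasm":      {"speed": 0.25, "fit": 0.40, "headroom": 0.35},
    "coreml":    {"speed": 0.45, "fit": 0.40, "headroom": 0.15},
    "onnx":      {"speed": 0.45, "fit": 0.40, "headroom": 0.15},
}

def composite_score(runtime: str, speed: float, fit: float, headroom: float) -> float:
    """Weighted average of the three 0-100 dimension scores."""
    w = WEIGHTS[runtime]
    return w["speed"] * speed + w["fit"] * fit + w["headroom"] * headroom

# An llmcpp candidate that is fast but tight on VRAM:
print(composite_score("llmcpp", speed=90, fit=40, headroom=60))
```

Note how the same dimension scores would rank differently under the `wasm` weights, where headroom dominates over raw speed.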

Fit Levels

Based on dimension scores, each Island gets a fit level:

| Level | Meaning | When assigned |
|---|---|---|
| Perfect | Optimal match: GPU-accelerated with headroom | Fit ≥ 80, Speed ≥ 60, has GPU (for GPU Cargos) |
| Good | Solid match with 20%+ headroom | Fit ≥ 50, Speed ≥ 30 |
| Marginal | Will work but may be slow or unstable | Fit ≥ 20 |
| Too tight | Does not meet minimum requirements | Filtered out before scoring |

Islands scored as “Too tight” are never selected. The remaining candidates are sorted by composite score.
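The threshold logic in the table maps directly to a cascade of checks. A minimal Python sketch (function name and argument shape are illustrative):

```python
def fit_level(fit: float, speed: float, has_gpu: bool, needs_gpu: bool) -> str:
    """Assign a fit level from dimension scores, per the thresholds above."""
    if fit >= 80 and speed >= 60 and (has_gpu or not needs_gpu):
        return "perfect"   # optimal match, GPU-accelerated when required
    if fit >= 50 and speed >= 30:
        return "good"      # solid match with headroom
    if fit >= 20:
        return "marginal"  # works, but may be slow or unstable
    return "too_tight"     # would be filtered out before scoring

print(fit_level(85, 70, has_gpu=True, needs_gpu=True))   # perfect
print(fit_level(55, 35, has_gpu=False, needs_gpu=False)) # good
```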

Runtime-Specific Scoring

LLM Cargos (llmcpp)

The LLM scorer estimates tokens per second using GPU memory bandwidth:

tok/s = gpu_bandwidth_GB/s / model_size_GB × 0.55 × mode_factor

Where:

  • gpu_bandwidth is looked up from a 120+ GPU model table (or reported by the Island in heartbeats)
  • model_size is estimated from the Cargo’s VRAM requirement (~70% of required VRAM)
  • 0.55 is an empirical efficiency factor
  • mode_factor is 1.0 for full GPU, 0.5 for CPU offload, 0.3 for CPU-only

Speed score: (tok_s / 40) × 100 — targeting 40 tok/s for interactive chat.

Fit score: VRAM utilization sweet spot:

  • 50–80% utilization → 100
  • Under 50% → 60–100 (wasting resources)
  • Over 90% → 20–70 (risk of OOM, KV cache pressure)
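Putting the pieces together, the LLM scorer can be sketched as follows. The formula and constants come from the text above; the interpolation slopes between the documented score bands, and all function names, are assumptions for illustration:

```python
def estimate_tok_s(bandwidth_gb_s: float, required_vram_gb: float,
                   mode: str = "gpu") -> float:
    """tok/s = bandwidth / model_size * 0.55 * mode_factor (formula from the docs)."""
    mode_factor = {"gpu": 1.0, "cpu_offload": 0.5, "cpu": 0.3}[mode]
    model_size_gb = required_vram_gb * 0.70  # model weights ~70% of required VRAM
    return bandwidth_gb_s / model_size_gb * 0.55 * mode_factor

def speed_score(tok_s: float, target: float = 40.0) -> float:
    """Score against the 40 tok/s interactive-chat target (cap at 100 is assumed)."""
    return min(tok_s / target * 100.0, 100.0)

def vram_fit_score(utilization: float) -> float:
    """Sweet-spot scoring; the linear ramps between bands are assumptions."""
    if utilization > 0.90:
        return max(20.0, 70.0 - (utilization - 0.90) * 500.0)  # 90% -> 70, 100% -> 20
    if utilization > 0.80:
        return 100.0 - (utilization - 0.80) * 300.0            # 80% -> 100, 90% -> 70
    if utilization >= 0.50:
        return 100.0                                           # the 50-80% sweet spot
    return 60.0 + utilization / 0.50 * 40.0                    # 0% -> 60, 50% -> 100

# An RTX 4090 (1008 GB/s) serving a Cargo requiring 10 GB VRAM: ~79 tok/s
print(estimate_tok_s(1008.0, 10.0))
```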

Container Cargos

Speed: CPU ratio — how many times the Island exceeds the requirement. An Island with 16 cores running a 4-core Cargo scores higher than one with 4 cores.

Fit: Average of CPU and RAM utilization. The sweet spot is 30–70% utilization.

Headroom: Based on available concurrent slots: max(cpu_cores / required_cores, 1) - active_jobs.
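Two of these inputs can be sketched directly from the formulas above. The slot formula is quoted from the text; the fit ramp outside the 30–70% band is an assumption, as are the function names:

```python
def concurrent_slots(cpu_cores: int, required_cores: int, active_jobs: int) -> int:
    """Headroom input: max(cpu_cores / required_cores, 1) - active_jobs."""
    return max(cpu_cores // required_cores, 1) - active_jobs

def utilization_fit(cpu_util: float, ram_util: float) -> float:
    """Average of CPU and RAM utilization, best in the 30-70% sweet spot.
    The falloff slope outside the band is illustrative only."""
    avg = (cpu_util + ram_util) / 2.0
    if 0.30 <= avg <= 0.70:
        return 100.0
    dist = (0.30 - avg) if avg < 0.30 else (avg - 0.70)
    return max(0.0, 100.0 - dist * 250.0)

# A 16-core Island running one job, placing a 4-core Cargo: 3 slots left
print(concurrent_slots(16, 4, 1))
```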

WASM Cargos

WASM modules are lightweight and single-threaded. Scoring emphasizes:

  • Speed: CPU core count as proxy for quality
  • Fit: RAM utilization (WASM linear memory vs available)
  • Headroom: Many can run concurrently — scored generously

Mobile Cargos (coreml, onnx)

Mobile scoring accounts for hardware accelerators and device health:

  • Speed: Neural Engine (CoreML) or NPU (ONNX) presence is the primary factor. Metal GPU adds a bonus.
  • Fit: Available device memory vs model size. Considers memory_used_mb if reported.
  • Penalties: Thermal state (critical = -30, serious = -15) and low battery without charging (-10 to -20).
Battery and thermal awareness
A device in `critical` thermal state or below 20% battery (not charging) will score significantly lower, reducing the chance it receives jobs. This protects user devices from overheating or unexpected shutdowns.
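The penalty adjustments can be sketched as a post-processing step on the mobile score. The penalty magnitudes come from the list above; the battery cutoffs within the -10 to -20 band, the state names (which follow Apple's thermal states), and the clamping to zero are assumptions:

```python
# Thermal penalties from the docs; "nominal"/"fair" assumed penalty-free.
THERMAL_PENALTY = {"nominal": 0, "fair": 0, "serious": -15, "critical": -30}

def apply_mobile_penalties(score: float, thermal_state: str,
                           battery_pct: int, charging: bool) -> float:
    """Reduce a mobile Island's score for thermal stress and low battery."""
    score += THERMAL_PENALTY.get(thermal_state, 0)
    if not charging:
        if battery_pct < 10:
            score -= 20  # lower end of the -10 to -20 band (cutoff assumed)
        elif battery_pct < 20:
            score -= 10
    return max(score, 0.0)

# A hot phone on 15% battery drops from 80 to 40:
print(apply_mobile_penalties(80, "critical", battery_pct=15, charging=False))
```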

GPU Bandwidth Table

The coordinator maintains a lookup table mapping GPU model names to memory bandwidth in GB/s. This is the primary input for LLM performance estimation.

| Category | Examples | Bandwidth Range |
|---|---|---|
| NVIDIA Data Center | H100, A100, L40S | 300–3350 GB/s |
| NVIDIA Consumer | RTX 4090, 3090, 3060 | 224–1008 GB/s |
| AMD Radeon | RX 7900 XTX, 6800 XT | 224–960 GB/s |
| AMD Instinct | MI300X, MI250X | 1228–5300 GB/s |
| Apple Silicon | M1–M4 (all tiers) | 68–819 GB/s |
| Intel Arc | A770, A750 | 186–560 GB/s |

When a GPU model isn’t found in the table, the system falls back to a conservative estimate based on VRAM size (~40 GB/s per GB of VRAM).

Island-reported bandwidth
Islands running the node agent report their GPU bandwidth in heartbeats. When available, the coordinator prefers this value over the lookup table — it accounts for the actual hardware detected on the machine.

Performance Estimates in Heartbeats

The Island node agent computes and reports performance estimates with every heartbeat (every 10 seconds):

| Field | Type | Description |
|---|---|---|
| gpu_bandwidth_gb_s | float | GPU memory bandwidth (looked up from model) |
| estimated_llm_tok_s | float | Estimated tok/s for a reference 7B Q4 model |
| max_concurrent_containers | int | Based on CPU cores and RAM |
| wasm_memory_limit_mb | int | Available WASM linear memory |
| supported_runtimes | array | Runtime types this Island can serve |

These estimates are stored in the performance_estimates JSONB column on the hosts table and used by the fit scorer to improve accuracy over time.
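For a concrete sense of the shape, here is a hypothetical `performance_estimates` payload built from the field table above. The values are illustrative only; the actual heartbeat encoding is defined by the node agent:

```python
# Hypothetical performance_estimates fragment of a heartbeat (values made up).
heartbeat_estimates = {
    "gpu_bandwidth_gb_s": 1008.0,
    "estimated_llm_tok_s": 79.2,
    "max_concurrent_containers": 4,
    "wasm_memory_limit_mb": 2048,
    "supported_runtimes": ["llmcpp", "container", "wasm"],
}
```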

Integration Points

Job Dispatch (find_host_for_workload)

The main entry point is Coordinator.Hosts.find_host_for_workload/2. It:

  1. Filters candidates by binary requirements (online, approved, runtime, VRAM, CPU, RAM)
  2. Fetches up to 10 qualifying candidates
  3. Scores all candidates with FitScoring.rank_hosts/2
  4. Selects the highest composite score
  5. Falls back to DB-ordered first candidate if all score :too_tight

Regional preference is applied as a tiered fallback: same region → adjacent regions → any region.

Cargo Registry

Each Cargo card in the Cargo Registry shows network availability:

  • Number of Islands ready to serve it
  • Best fit level and composite score
  • Estimated throughput
  • Fit level breakdown across available Islands

Island Dashboard

The Island dashboard shows a “Compatible Cargos” section listing every Cargo the Island can run, sorted by fit score. This helps Island operators understand what their hardware is best suited for.

Pricing

Fit scoring is separate from pricing. The hardware tier system determines pricing multipliers (enterprise 2.0×, high-end 1.5×, etc.), while fit scoring determines which Island gets the job. A more capable Island earns more per job through tier multipliers, and fit scoring ensures it gets matched to appropriate Cargos.

Source Code

| Module | Purpose |
|---|---|
| Coordinator.Placement.GpuSpecs | GPU bandwidth lookup table (120+ models) |
| Coordinator.Placement.FitScoring | Multi-dimensional scoring engine |
| Coordinator.Hosts.find_host_for_workload/2 | Placement entry point |
| node-agent/src/metrics/gpu.rs | Island-side bandwidth lookup and estimation |

See Also

  • Cargos — Runtime types, requirements matching, trust levels
  • Karma System — How reputation affects Island selection priority
  • Credits & Pricing — Hardware tiers, pricing multipliers, billing