Placement & Fit Scoring
How the coordinator selects the best Island for each job — multi-dimensional scoring, GPU bandwidth estimation, and runtime-specific strategies
When a job is submitted, the coordinator doesn’t just find any qualifying Island — it ranks candidates by how well they can run the Cargo and picks the best one. This is the fit scoring system.
Overview
Traditional placement checks binary requirements: “does this Island have enough VRAM?” Fit scoring goes further by estimating how well the Cargo will perform on each candidate, producing a composite 0–100 score across multiple dimensions.
Job submitted
│
▼
Filter: online, approved, not in cooldown,
runtime compatible, meets min requirements
│
▼
Score: up to 10 candidates ranked by fit
│
▼
Select: highest composite score wins
│
▼
Region tiebreaker: same region > adjacent > any
The system is implemented in Coordinator.Placement.FitScoring and Coordinator.Placement.GpuSpecs.
Scoring Dimensions
Every candidate Island is scored on three dimensions, each 0–100:
| Dimension | What it measures | Example |
|---|---|---|
| Speed | Estimated throughput relative to a target | GPU with 100 tok/s scores higher than one with 20 tok/s |
| Fit | Resource utilization sweet spot — not too tight, not wasted | 60% VRAM utilization scores higher than 95% or 20% |
| Headroom | Capacity for concurrent work | An idle Island scores higher than one running 3 jobs |
The composite score is a weighted average. Weights vary by runtime type:
| Runtime | Speed | Fit | Headroom |
|---|---|---|---|
| llmcpp | 50% | 35% | 15% |
| container | 30% | 40% | 30% |
| wasm | 25% | 40% | 35% |
| coreml / onnx | 45% | 40% | 15% |
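As an illustration of the weighted average, the sketch below combines three dimension scores using the weights from the table. It is written in Python for readability and is not the coordinator's code; the names `WEIGHTS` and `composite_score` are hypothetical.

```python
# Weights from the table above as (speed, fit, headroom) fractions.
# The dictionary layout and function name are illustrative assumptions.
WEIGHTS = {
    "llmcpp": (0.50, 0.35, 0.15),
    "container": (0.30, 0.40, 0.30),
    "wasm": (0.25, 0.40, 0.35),
    "coreml": (0.45, 0.40, 0.15),
    "onnx": (0.45, 0.40, 0.15),
}


def composite_score(runtime: str, speed: float, fit: float, headroom: float) -> float:
    """Weighted average of three 0-100 dimension scores for one runtime."""
    w_speed, w_fit, w_headroom = WEIGHTS[runtime]
    return w_speed * speed + w_fit * fit + w_headroom * headroom


# The same dimension scores produce different composites per runtime.
llm = composite_score("llmcpp", speed=80, fit=90, headroom=50)
container = composite_score("container", speed=80, fit=90, headroom=50)
print(round(llm, 1), round(container, 1))
```

With speed 80, fit 90, and headroom 50, the llmcpp weights give about 79 and the container weights give about 75, because container placement counts headroom twice as heavily and speed less.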
Fit Levels
Based on dimension scores, each Island gets a fit level:
| Level | Meaning | When assigned |
|---|---|---|
| Perfect | Optimal match — GPU-accelerated with headroom | Fit ≥ 80, Speed ≥ 60, has GPU (for GPU Cargos) |
| Good | Solid match with 20%+ headroom | Fit ≥ 50, Speed ≥ 30 |
| Marginal | Will work but may be slow or unstable | Fit ≥ 20 |
| Too tight | Does not meet minimum requirements | Filtered out before scoring |
Islands scored as “Too tight” are never selected. The remaining candidates are sorted by composite score.
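The thresholds in the table can be read as a top-down cascade. The following Python sketch shows one such reading. The function name and the `needs_gpu` flag are hypothetical, and treating a Fit score below 20 as too tight is an assumption, since the table only describes that level as filtered out before scoring.

```python
def fit_level(fit: float, speed: float, has_gpu: bool, needs_gpu: bool) -> str:
    """Map dimension scores to a fit level, checking the strictest level first."""
    # The GPU condition only applies to Cargos that need a GPU.
    gpu_ok = has_gpu or not needs_gpu
    if fit >= 80 and speed >= 60 and gpu_ok:
        return "perfect"
    if fit >= 50 and speed >= 30:
        return "good"
    if fit >= 20:
        return "marginal"
    # Assumption: anything below the Marginal threshold counts as too tight.
    return "too_tight"


print(fit_level(fit=85, speed=70, has_gpu=True, needs_gpu=True))
print(fit_level(fit=85, speed=70, has_gpu=False, needs_gpu=True))
```

In this reading, an Island without a GPU that otherwise meets the Perfect thresholds for a GPU Cargo drops to Good rather than being excluded.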
Runtime-Specific Scoring
LLM Cargos (llmcpp)
The LLM scorer estimates tokens per second using GPU memory bandwidth:
tok/s = gpu_bandwidth_GB/s / model_size_GB × 0.55 × mode_factor
Where:
- gpu_bandwidth is looked up from a 120+ GPU model table (or reported by the Island in heartbeats)
- model_size is estimated from the Cargo’s VRAM requirement (~70% of required VRAM)
- 0.55 is an empirical efficiency factor
- mode_factor is 1.0 for full GPU, 0.5 for CPU offload, 0.3 for CPU-only
Speed score: (tok_s / 40) × 100 — targeting 40 tok/s for interactive chat.
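A worked example of the estimate may help. The Python sketch below applies the formula and the speed score as stated. The function names and mode keys are hypothetical, and capping the speed score at 100 is an assumption based on each dimension being scored 0–100.

```python
# Mode factors from the list above; the key names are illustrative.
MODE_FACTOR = {"gpu": 1.0, "cpu_offload": 0.5, "cpu_only": 0.3}
EFFICIENCY = 0.55        # empirical efficiency factor
MODEL_SIZE_RATIO = 0.70  # model size is roughly 70% of required VRAM
TARGET_TOK_S = 40.0      # interactive chat target


def estimate_tok_s(bandwidth_gb_s: float, required_vram_gb: float, mode: str = "gpu") -> float:
    """Estimate tokens per second from memory bandwidth and model size."""
    model_size_gb = MODEL_SIZE_RATIO * required_vram_gb
    return bandwidth_gb_s / model_size_gb * EFFICIENCY * MODE_FACTOR[mode]


def speed_score(tok_s: float) -> float:
    """Score throughput against the 40 tok/s target (cap at 100 is assumed)."""
    return min(100.0, tok_s / TARGET_TOK_S * 100.0)


# A 1008 GB/s GPU running a Cargo that requires 10 GB of VRAM.
full_gpu = estimate_tok_s(1008, 10, "gpu")
cpu_only = estimate_tok_s(1008, 10, "cpu_only")
print(round(full_gpu, 1), round(speed_score(full_gpu), 1))
print(round(cpu_only, 1), round(speed_score(cpu_only), 1))
```

Here the model is estimated at 7 GB, so full GPU mode lands near 79 tok/s and saturates the speed score, while the CPU-only factor cuts the estimate to about 24 tok/s and the score to roughly 59.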
Fit score: VRAM utilization sweet spot:
- 50–80% utilization → 100
- Under 50% → 60–100 (wasting resources)
- Over 90% → 20–70 (risk of OOM, KV cache pressure)
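The bands above give score ranges but not the exact curve. The Python sketch below is one possible shape, assuming linear interpolation inside each band and a linear ramp from 100 down to 70 across the 80–90% band, which the list does not specify. The function name is hypothetical, and the real scorer may use a different curve.

```python
def vram_fit_score(utilization: float) -> float:
    """Score VRAM utilization (0.0-1.0) against the 50-80% sweet spot.

    All interpolation between band endpoints is an assumption.
    """
    u = utilization
    if u > 1.0:
        return 0.0  # does not fit; such Islands are filtered out earlier
    if 0.5 <= u <= 0.8:
        return 100.0  # sweet spot
    if u < 0.5:
        return 60.0 + (u / 0.5) * 40.0  # 60 at 0% rising to 100 at 50%
    if u <= 0.9:
        return 100.0 - (u - 0.8) / 0.1 * 30.0  # assumed ramp: 100 down to 70
    return 70.0 - (u - 0.9) / 0.1 * 50.0  # 70 at 90% falling to 20 at 100%


for u in (0.20, 0.60, 0.95):
    print(u, round(vram_fit_score(u), 1))
```

Under these assumptions, 60% utilization scores higher than either 20% or 95%, which matches the example given for the Fit dimension.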
Container Cargos
Speed: CPU ratio — how many times the Island exceeds the requirement. An Island with 16 cores running a 4-core Cargo scores higher than one with 4 cores.
Fit: Average of CPU and RAM utilization. The sweet spot is 30–70% utilization.
Headroom: Based on available concurrent slots: max(cpu_cores / required_cores, 1) - active_jobs.
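The slot formula can be checked with a few numbers. This Python sketch evaluates it directly; the function name is hypothetical, and how the slot count maps onto the 0–100 headroom score is not stated here, so that step is left out.

```python
def free_slots(cpu_cores: int, required_cores: int, active_jobs: int) -> float:
    """Available concurrent slots: max(cpu_cores / required_cores, 1) - active_jobs."""
    return max(cpu_cores / required_cores, 1) - active_jobs


# A 16-core Island running one job can still take three more 4-core Cargos.
print(free_slots(cpu_cores=16, required_cores=4, active_jobs=1))
# A 4-core Island already running one 4-core job has no free slot.
print(free_slots(cpu_cores=4, required_cores=4, active_jobs=1))
```

The `max(..., 1)` term guarantees at least one nominal slot, so an idle Island that exactly meets the core requirement still registers headroom.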
WASM Cargos
WASM modules are lightweight and single-threaded. Scoring emphasizes:
- Speed: CPU core count as proxy for quality
- Fit: RAM utilization (WASM linear memory vs available)
- Headroom: Many can run concurrently — scored generously
Mobile Cargos (coreml, onnx)
Mobile scoring accounts for hardware accelerators and device health:
- Speed: Neural Engine (CoreML) or NPU (ONNX) presence is the primary factor. Metal GPU adds a bonus.
- Fit: Available device memory vs model size. Considers memory_used_mb if reported.
- Penalties: Thermal state (critical = -30, serious = -15) and low battery without charging (-10 to -20).
GPU Bandwidth Table
The coordinator maintains a lookup table mapping GPU model names to memory bandwidth in GB/s. This is the primary input for LLM performance estimation.
| Category | Examples | Bandwidth Range |
|---|---|---|
| NVIDIA Data Center | H100, A100, L40S | 300–3350 GB/s |
| NVIDIA Consumer | RTX 4090, 3090, 3060 | 224–1008 GB/s |
| AMD Radeon | RX 7900 XTX, 6800 XT | 224–960 GB/s |
| AMD Instinct | MI300X, MI250X | 1228–5300 GB/s |
| Apple Silicon | M1–M4 (all tiers) | 68–819 GB/s |
| Intel Arc | A770, A750 | 186–560 GB/s |
When a GPU model isn’t found in the table, the system falls back to a conservative estimate based on VRAM size (~40 GB/s per GB of VRAM).
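The lookup with its VRAM-based fallback can be sketched as follows. This Python example carries only four entries, using published memory bandwidth figures for those cards. The real table is described as having 120+ models, and the exact-name matching used here is an assumption, as are the names `GPU_BANDWIDTH_GB_S` and `lookup_bandwidth`.

```python
# Illustrative subset of a bandwidth table, in GB/s.
GPU_BANDWIDTH_GB_S = {
    "H100": 3350.0,
    "RTX 4090": 1008.0,
    "RX 7900 XTX": 960.0,
    "MI300X": 5300.0,
}

# Fallback for unknown models: roughly 40 GB/s per GB of VRAM.
FALLBACK_GB_S_PER_GB_VRAM = 40.0


def lookup_bandwidth(model_name: str, vram_gb: float) -> float:
    """Return table bandwidth for a known GPU, else estimate from VRAM size."""
    known = GPU_BANDWIDTH_GB_S.get(model_name)
    if known is not None:
        return known
    return vram_gb * FALLBACK_GB_S_PER_GB_VRAM


print(lookup_bandwidth("RTX 4090", vram_gb=24))
print(lookup_bandwidth("Unlisted GPU", vram_gb=8))
```

An unlisted 8 GB card falls back to 320 GB/s, which then feeds the tok/s estimate in the same way a table value would.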
Performance Estimates in Heartbeats
The Island node agent computes and reports performance estimates with every heartbeat (every 10 seconds):
| Field | Type | Description |
|---|---|---|
| gpu_bandwidth_gb_s | float | GPU memory bandwidth (looked up from model) |
| estimated_llm_tok_s | float | Estimated tok/s for a reference 7B Q4 model |
| max_concurrent_containers | int | Based on CPU cores and RAM |
| wasm_memory_limit_mb | int | Available WASM linear memory |
| supported_runtimes | array | Runtime types this Island can serve |
These estimates are stored in the performance_estimates JSONB column on the hosts table and used by the fit scorer to improve accuracy over time.
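For orientation, a payload with these fields might look like the Python sketch below. The field names and types come from the table; every value is made up for illustration, and the `invalid_fields` helper is hypothetical rather than part of the node agent or coordinator.

```python
import json

# Field names and types from the table above; the values are invented.
estimates = {
    "gpu_bandwidth_gb_s": 1008.0,
    "estimated_llm_tok_s": 120.0,
    "max_concurrent_containers": 4,
    "wasm_memory_limit_mb": 2048,
    "supported_runtimes": ["llmcpp", "container", "wasm"],
}

EXPECTED_TYPES = {
    "gpu_bandwidth_gb_s": float,
    "estimated_llm_tok_s": float,
    "max_concurrent_containers": int,
    "wasm_memory_limit_mb": int,
    "supported_runtimes": list,
}


def invalid_fields(payload: dict) -> list:
    """Return names of fields that are missing or have an unexpected type."""
    return [
        name
        for name, expected in EXPECTED_TYPES.items()
        if not isinstance(payload.get(name), expected)
    ]


# Serialize as it might be stored in a JSON column, then check the shape.
encoded = json.dumps(estimates)
print(invalid_fields(json.loads(encoded)))
```

A shape check like this is one way a consumer could reject a malformed heartbeat before the estimates reach the scorer.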
Integration Points
Job Dispatch (find_host_for_workload)
The main entry point is Coordinator.Hosts.find_host_for_workload/2. It:
- Filters candidates by binary requirements (online, approved, runtime, VRAM, CPU, RAM)
- Fetches up to 10 qualifying candidates
- Scores all candidates with FitScoring.rank_hosts/2
- Selects the highest composite score
- Falls back to DB-ordered first candidate if all score :too_tight
Regional preference is applied as a tiered fallback: same region → adjacent regions → any region.
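The selection flow can be sketched end to end. The Python example below is a simplified reading, not the behavior of find_host_for_workload/2 itself: it assumes candidates arrive already filtered and scored, and it applies region preference as a tiebreaker between equal composite scores. The `Candidate` type and all function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Set

MAX_CANDIDATES = 10  # up to 10 qualifying candidates are scored


@dataclass
class Candidate:
    name: str
    region: str
    composite: float  # 0-100, assumed to be computed by the scorer
    level: str        # "perfect", "good", "marginal", or "too_tight"


def region_tier(region: str, job_region: str, adjacent: Set[str]) -> int:
    """0 = same region, 1 = adjacent region, 2 = any other region."""
    if region == job_region:
        return 0
    return 1 if region in adjacent else 2


def select_host(qualifying: List[Candidate], job_region: str,
                adjacent: Set[str]) -> Optional[Candidate]:
    """Pick the best candidate from a list given in database order."""
    candidates = qualifying[:MAX_CANDIDATES]
    if not candidates:
        return None
    viable = [c for c in candidates if c.level != "too_tight"]
    if not viable:
        return candidates[0]  # fallback: first candidate in DB order
    # Highest composite wins; the region tier breaks ties.
    return min(viable, key=lambda c: (-c.composite,
                                      region_tier(c.region, job_region, adjacent)))


hosts = [
    Candidate("island-a", "eu", 80.0, "good"),
    Candidate("island-b", "us", 80.0, "good"),
]
print(select_host(hosts, job_region="us", adjacent={"ca"}).name)
```

With two Islands tied at 80, the one in the job's own region is chosen. If region preference is instead applied as a tiered search before scoring, the filter step would change but the ranking step would stay the same.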
Cargo Registry
Each Cargo card in the Cargo Registry shows network availability:
- Number of Islands ready to serve it
- Best fit level and composite score
- Estimated throughput
- Fit level breakdown across available Islands
Island Dashboard
The Island dashboard shows a “Compatible Cargos” section listing every Cargo the Island can run, sorted by fit score. This helps Island operators understand what their hardware is best suited for.
Pricing
Fit scoring is separate from pricing. The hardware tier system determines pricing multipliers (enterprise 2.0×, high-end 1.5×, etc.), while fit scoring determines which Island gets the job. A more capable Island earns more per job through tier multipliers, and fit scoring ensures it gets matched to appropriate Cargos.
Source Code
| Module | Purpose |
|---|---|
| Coordinator.Placement.GpuSpecs | GPU bandwidth lookup table (120+ models) |
| Coordinator.Placement.FitScoring | Multi-dimensional scoring engine |
| Coordinator.Hosts.find_host_for_workload/2 | Placement entry point |
| node-agent/src/metrics/gpu.rs | Island-side bandwidth lookup and estimation |
See Also
- Cargos — Runtime types, requirements matching, trust levels
- Karma System — How reputation affects Island selection priority
- Credits & Pricing — Hardware tiers, pricing multipliers, billing
