Federated Fine-Tuning
Train models across distributed Islands without centralizing data — your data never leaves your machines
Fine-tune AI models without centralizing your data. Each participating Island trains on its own local data and sends only gradient updates (mathematical deltas, not raw data) to the coordinator. The coordinator aggregates these updates into an improved model. After N rounds, you have a fine-tuned model — and the training data never left the Islands.
How It Works
Your data stays on your Islands:
Island A (hospital data)    Island B (clinic data)    Island C (research data)
          │                           │                           │
          ├── Train locally ──────────┼── Train locally ──────────┤
          │                           │                           │
          ├── Send gradients ─────────┼── Send gradients ─────────┤
          │     (NOT data)            │     (NOT data)            │
          ▼                           ▼                           ▼
                         Coordinator aggregates
                          (weighted average)
                                   │
                          Sends updated model
                          back to all Islands
                                   │
                          Repeat for N rounds
                                   │
                        Fine-tuned model ready
Starting a Training Session
POST /api/v1/federated/sessions
{
"name": "Customer support fine-tune",
"base_workload_id": 42,
"total_rounds": 10,
"config": {
"algorithm": "fed_avg",
"local_epochs": 3,
"learning_rate": 0.001,
"batch_size": 32,
"min_participants": 3
}
}
The coordinator automatically finds eligible Islands, distributes the base model, and starts the training loop.
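A minimal client-side sketch of issuing this request from Python. The API base URL and bearer-token header are assumptions for illustration, not documented specifics; only the endpoint path and JSON body come from the example above.

```python
import json
import urllib.request

API_BASE = "https://app.archipelag.io/api/v1"  # hypothetical base URL

def build_session_request(api_key: str) -> urllib.request.Request:
    """Build the POST /federated/sessions request from the example above."""
    payload = {
        "name": "Customer support fine-tune",
        "base_workload_id": 42,
        "total_rounds": 10,
        "config": {
            "algorithm": "fed_avg",
            "local_epochs": 3,
            "learning_rate": 0.001,
            "batch_size": 32,
            "min_participants": 3,
        },
    }
    return urllib.request.Request(
        f"{API_BASE}/federated/sessions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json",
                 "Authorization": f"Bearer {api_key}"},  # auth scheme assumed
        method="POST",
    )

# To send it:
# with urllib.request.urlopen(build_session_request(api_key)) as resp:
#     session = json.load(resp)
```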
Privacy Guarantees
| Protection | How It Works |
|---|---|
| Data stays local | Islands train on their own data — raw data is never transmitted |
| Gradient exchange only | Only mathematical weight deltas cross the network |
| Differential privacy | Optional noise injection (dp_sigma) adds calibrated noise to gradients, giving a formal guarantee that individual data points cannot be reliably reconstructed from them |
Set dp_sigma in the config to enable differential privacy. Higher values give stronger privacy guarantees at the cost of some model accuracy.
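To make the trade-off concrete, here is a sketch of the standard clip-then-noise mechanism that dp_sigma-style parameters typically control. The platform's exact clipping and noise scheme is not documented here; `privatize` and `clip_norm` are illustrative names.

```python
import numpy as np

def privatize(gradient: np.ndarray, clip_norm: float, dp_sigma: float,
              rng: np.random.Generator) -> np.ndarray:
    """Clip the gradient to bound any one record's influence, then add
    Gaussian noise scaled by dp_sigma (the Gaussian mechanism)."""
    norm = np.linalg.norm(gradient)
    clipped = gradient * min(1.0, clip_norm / max(norm, 1e-12))
    noise = rng.normal(0.0, dp_sigma * clip_norm, size=gradient.shape)
    return clipped + noise

rng = np.random.default_rng(0)
g = np.array([3.0, 4.0])  # norm 5, will be clipped to norm 1
private_g = privatize(g, clip_norm=1.0, dp_sigma=0.5, rng=rng)
```

Larger dp_sigma means more noise per round, which is exactly where the accuracy cost comes from.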
Aggregation Algorithms
| Algorithm | Best For |
|---|---|
| FedAvg (default) | Most use cases — weighted average proportional to each Island’s data size |
| FedProx | When Islands have very different data distributions — adds regularization to prevent divergence |
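The FedAvg rule itself is small enough to show in full: each Island's update is weighted by how many samples it trained on. A sketch (the sample counts reuse the monitoring example below):

```python
import numpy as np

def fed_avg(updates, sample_counts):
    """Average parameter updates, weighted by each Island's data size."""
    weights = np.asarray(sample_counts, dtype=float)
    weights /= weights.sum()
    return sum(w * u for w, u in zip(weights, updates))

# Three Islands with 5000, 3200, and 4100 training samples.
updates = [np.array([1.0, 2.0]), np.array([3.0, 0.0]), np.array([2.0, 2.0])]
avg = fed_avg(updates, [5000, 3200, 4100])
```

FedProx keeps the same aggregation step but adds a proximal term to each Island's local loss, penalizing drift from the global model.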
Monitoring Progress
GET /api/v1/federated/sessions/{id}
{
"id": "session-uuid",
"name": "Customer support fine-tune",
"status": "training",
"current_round": 4,
"total_rounds": 10,
"participants": [
{"host_id": "island-a", "status": "gradient_sent", "rounds_completed": 4, "total_samples": 5000},
{"host_id": "island-b", "status": "training", "rounds_completed": 3, "total_samples": 3200},
{"host_id": "island-c", "status": "gradient_sent", "rounds_completed": 4, "total_samples": 4100}
]
}
Or use the training dashboard at app.archipelag.io/training for visual monitoring with progress bars and participant status cards.
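When polling from a script instead of the dashboard, a small helper can condense a session response like the one above into a one-line status (the `summarize` function is illustrative, not part of the API):

```python
def summarize(session: dict) -> str:
    """One-line progress summary of a federated session response."""
    done = sum(1 for p in session["participants"]
               if p["status"] == "gradient_sent")
    return (f"round {session['current_round']}/{session['total_rounds']}, "
            f"{done}/{len(session['participants'])} participants reported")

session = {
    "current_round": 4, "total_rounds": 10,
    "participants": [
        {"host_id": "island-a", "status": "gradient_sent"},
        {"host_id": "island-b", "status": "training"},
        {"host_id": "island-c", "status": "gradient_sent"},
    ],
}
print(summarize(session))  # round 4/10, 2/3 participants reported
```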
Secure Aggregation
For maximum privacy, gradients can be masked so that even the coordinator cannot see individual participant updates — only the aggregate. Pairwise masks cancel out when summed, ensuring mathematical privacy without sacrificing model quality.
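The cancellation property is easy to verify in a toy example. Each pair of participants shares a random mask; the lower-indexed party adds it and the higher-indexed party subtracts it, so each upload looks random on its own but the masks vanish in the sum. (In a real protocol the pairwise masks are derived from a key agreement rather than a shared RNG; this sketch only demonstrates the arithmetic.)

```python
import numpy as np

rng = np.random.default_rng(42)
gradients = [rng.normal(size=4) for _ in range(3)]

# One shared mask per pair (i, j) with i < j.
masks = {(i, j): rng.normal(size=4)
         for i in range(3) for j in range(i + 1, 3)}

def masked_upload(i: int, grad: np.ndarray) -> np.ndarray:
    """What participant i sends: its gradient plus/minus pairwise masks."""
    out = grad.copy()
    for (a, b), m in masks.items():
        if a == i:
            out += m  # lower-indexed party adds the mask
        elif b == i:
            out -= m  # higher-indexed party subtracts it
    return out

uploads = [masked_upload(i, g) for i, g in enumerate(gradients)]
# The coordinator sees only `uploads`; their sum equals the true aggregate.
```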
Model Versioning
Completed training sessions produce a new Cargo (fine-tuned model) linked to its training lineage. You can see which base model was used, how many rounds of training ran, and how many participants contributed. Roll back to a previous version if needed.
Fault Tolerance
If an Island disconnects during training, the session continues with the remaining participants. The session fails only if more than half the participants drop out.
Billing
Each participant earns credits proportional to the training rounds completed and samples processed — the same way Islands earn for inference jobs. The session creator pays the total training cost.
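As a back-of-the-envelope illustration of "proportional to rounds completed and samples processed": the rate constant below is a made-up placeholder, since the actual pricing is not specified here.

```python
CREDIT_PER_SAMPLE_ROUND = 0.001  # hypothetical rate, for illustration only

def credits_earned(rounds_completed: int, total_samples: int) -> float:
    """Credits proportional to rounds completed and samples processed."""
    return rounds_completed * total_samples * CREDIT_PER_SAMPLE_ROUND

# Participant stats from the monitoring example: (rounds, samples).
participants = {"island-a": (4, 5000), "island-b": (3, 3200),
                "island-c": (4, 4100)}
total_cost = sum(credits_earned(r, s) for r, s in participants.values())
```

Under this placeholder rate, the session creator would owe the sum of all participants' earnings.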
Use Cases
| Scenario | Why Federated |
|---|---|
| Healthcare | Train on patient data across hospitals without HIPAA violations |
| Finance | Fine-tune on transaction data without exposing sensitive records |
| Multi-tenant SaaS | Each customer’s data trains the shared model without cross-contamination |
| Edge devices | Fine-tune on mobile data without uploading to cloud |
| Legal | Train on case documents without breaching attorney-client privilege |
