SynapseX Training System
Fine-tune models with your data using LoRA adapters, Supervised Fine-Tuning (SFT), and Direct Preference Optimization (DPO).
Overview
SynapseX training enables:
- Per-Tenant Customization - Each customer gets their own model adaptation
- Feedback-Driven Learning - Automatic training from user corrections
- Lightweight Adapters - LoRA enables efficient fine-tuning
- HPC Infrastructure - LUMI supercomputer for fast training
Training Methods
LoRA (Low-Rank Adaptation)
Parameter-efficient fine-tuning that adds small trainable low-rank matrices alongside the frozen base model weights:
- Memory Efficient - Train with minimal GPU memory
- Fast Training - Complete in hours, not days
- Easy Switching - Load different adapters per tenant
- Preserves Base Knowledge - Only modifies target behaviors
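To see where the efficiency comes from, compare parameter counts: fully fine-tuning a d × k weight matrix trains d·k parameters, while a rank-r LoRA decomposition (B of shape d × r times A of shape r × k) trains only r·(d + k). A quick sketch of that arithmetic; the 4096 × 4096 dimensions are illustrative, not SynapseX internals:

```python
def lora_trainable_params(d: int, k: int, r: int) -> tuple:
    """Return (full fine-tune params, LoRA params) for one d x k weight matrix."""
    full = d * k        # updating every entry of the matrix
    lora = r * (d + k)  # low-rank factors B (d x r) and A (r x k)
    return full, lora

# A 4096 x 4096 projection at r=16: ~128x fewer trainable parameters
full, lora = lora_trainable_params(4096, 4096, 16)
print(f"full: {full:,}  lora: {lora:,}  ratio: {full / lora:.0f}x")
```

At r=16 this is 131,072 trainable parameters instead of 16,777,216, which is why adapters stay small enough to store and swap per tenant.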
SFT (Supervised Fine-Tuning)
Train on input-output pairs from your data:
Input: "What are the store hours?"
Output: "Our store is open Monday-Friday 9am-6pm, Saturday 10am-4pm."
DPO (Direct Preference Optimization)
Train on chosen/rejected pairs to align model behavior with user preferences:
Prompt: "Explain our return policy"
Chosen: "You can return items within 30 days with receipt..." ✓
Rejected: "Returns are complicated and require manager approval..." ✗
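Conceptually, DPO minimizes a loss that grows a log-probability margin for the chosen response over the rejected one, relative to a frozen reference model, scaled by β. A minimal per-pair sketch with hypothetical log-probability values (not SynapseX's actual training code):

```python
import math

def dpo_loss(logp_chosen: float, logp_rejected: float,
             ref_logp_chosen: float, ref_logp_rejected: float,
             beta: float = 0.1) -> float:
    """Per-pair DPO loss: -log sigmoid(beta * (chosen log-ratio - rejected log-ratio))."""
    margin = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# Equal preference (margin 0) gives loss ln 2; a positive margin drives it lower
print(round(dpo_loss(-1.0, -5.0, -2.0, -2.0), 4))  # 0.513
```

β is the same beta parameter exposed by the DPO training endpoint: larger values penalize drifting from the reference model more strongly.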
Architecture
┌─────────────────────────────────────────────────────────────────────────────┐
│ SYNAPSEX TRAINING SYSTEM │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ DATA COLLECTION │ │
│ │ │ │
│ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │ │
│ │ │ Feedback │ │ Datasets │ │Corrections│ │ Ratings │ │ │
│ │ │ API │ │ (DataHub)│ │ │ │ │ │ │
│ │ └────┬─────┘ └────┬─────┘ └─────┬─────┘ └────┬─────┘ │ │
│ │ │ │ │ │ │ │
│ │ └─────────────┴───────────────┴─────────────┘ │ │
│ │ │ │ │
│ │ ▼ │ │
│ │ ┌──────────────────┐ │ │
│ │ │ Training Queue │ │ │
│ │ └────────┬─────────┘ │ │
│ └─────────────────────────────┼───────────────────────────────────────┘ │
│ │ │
│ ┌─────────────────────────────▼───────────────────────────────────────┐ │
│ │ LUMI HPC CLUSTER │ │
│ │ │ │
│ │ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ │ │
│ │ │ MI250X │ │ MI250X │ │ MI250X │ │ MI250X │ │ │
│ │ │ 64GB │ │ 64GB │ │ 64GB │ │ 64GB │ │ │
│ │ └─────────┘ └─────────┘ └─────────┘ └─────────┘ │ │
│ │ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ │ │
│ │ │ MI250X │ │ MI250X │ │ MI250X │ │ MI250X │ │ │
│ │ │ 64GB │ │ 64GB │ │ 64GB │ │ 64GB │ │ │
│ │ └─────────┘ └─────────┘ └─────────┘ └─────────┘ │ │
│ │ │ │
│ │ Total: 512GB VRAM | DeepSpeed ZeRO-3 | ROCm 6.x │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ LoRA ADAPTERS │ │
│ │ │ │
│ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │ │
│ │ │ tenant_001 │ │ tenant_002 │ │ tenant_003 │ │ │
│ │ │ lora_v3.bin │ │ lora_v5.bin │ │ lora_v2.bin │ │ │
│ │ └──────────────┘ └──────────────┘ └──────────────┘ │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Feedback Collection
Collecting User Feedback
Feedback drives automatic training:
```python
import requests

# Submit feedback via API
response = requests.post(
    "https://api.synapsex.ai/v1/tenants/my_tenant/feedback",
    headers={"Authorization": "Bearer sk-xxx"},
    json={
        "conversation_id": "conv_123",
        "message_id": "msg_456",
        "type": "correction",
        "correction": "The correct answer is 500mg, not 250mg",
        "metadata": {
            "user_id": "user_789",
            "category": "dosage"
        }
    }
)
```
Feedback Types
| Type | Description | Training Use |
|---|---|---|
| thumbs_up | Positive signal | Positive example for DPO |
| thumbs_down | Negative signal | Negative example for DPO |
| correction | User-provided fix | Direct training example |
| rating | 1-5 score | Weight for training |
| report | Content issue | Exclusion from training |
Check Training Readiness
```python
response = requests.get(
    "https://api.synapsex.ai/v1/tenants/my_tenant/feedback/stats",
    headers={"Authorization": "Bearer sk-xxx"}
)
stats = response.json()
print(f"Total feedback: {stats['total']}")
print(f"Ready for training: {stats['ready_for_training']}")

# Ready when: corrections >= 50 or (thumbs_up + thumbs_down) >= 100
```
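The same readiness rule can be checked client-side; this helper mirrors the thresholds in the comment above, and the field names are assumed to match the stats payload:

```python
def ready_for_training(stats: dict) -> bool:
    """Ready when corrections >= 50 or (thumbs_up + thumbs_down) >= 100."""
    corrections = stats.get("corrections", 0)
    votes = stats.get("thumbs_up", 0) + stats.get("thumbs_down", 0)
    return corrections >= 50 or votes >= 100

print(ready_for_training({"corrections": 12, "thumbs_up": 60, "thumbs_down": 45}))  # True
```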
Triggering Training
LoRA Training
```python
response = requests.post(
    "https://api.synapsex.ai/v1/tenants/my_tenant/train/lora",
    headers={"Authorization": "Bearer sk-xxx"},
    json={
        "min_feedback_count": 100,
        "lora_r": 16,
        "lora_alpha": 32,
        "epochs": 3,
        "learning_rate": 2e-4,
        "batch_size": 4
    }
)
job = response.json()
print(f"Training job: {job['job_id']}")
print(f"Estimated time: {job['estimated_duration_minutes']} minutes")
```
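Rather than re-running the status request by hand, a small poller can block until the job reaches a terminal state. fetch_status is an injected callable here (an assumption, kept so the sketch stays testable); wrap the job-status GET request in it:

```python
import time

def wait_for_job(fetch_status, poll_seconds: float = 30.0,
                 timeout_seconds: float = 7200.0) -> dict:
    """Poll until the job status is terminal; raise if the timeout elapses."""
    terminal = {"completed", "failed", "cancelled"}
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        status = fetch_status()
        if status["status"] in terminal:
            return status
        time.sleep(poll_seconds)
    raise TimeoutError("training job did not finish before the timeout")

# Usage with a stub fetcher standing in for the real API call:
states = iter([{"status": "queued"}, {"status": "running"}, {"status": "completed"}])
print(wait_for_job(lambda: next(states), poll_seconds=0))
```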
Training Parameters
| Parameter | Default | Description |
|---|---|---|
| min_feedback_count | 50 | Minimum feedback samples required |
| lora_r | 16 | LoRA rank (higher = more capacity) |
| lora_alpha | 32 | LoRA scaling factor |
| epochs | 3 | Training epochs |
| learning_rate | 2e-4 | Learning rate |
| batch_size | 4 | Training batch size |
| warmup_ratio | 0.1 | Warmup ratio |
| weight_decay | 0.01 | L2 regularization |
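Because warmup_ratio is a fraction of total optimizer steps, the absolute warmup length depends on dataset size, batch size, and epochs. A sketch of that arithmetic, assuming one optimizer step per batch (no gradient accumulation):

```python
import math

def warmup_steps(num_samples: int, batch_size: int, epochs: int,
                 warmup_ratio: float = 0.1) -> int:
    """warmup steps = warmup_ratio * total optimizer steps."""
    steps_per_epoch = math.ceil(num_samples / batch_size)
    return int(steps_per_epoch * epochs * warmup_ratio)

print(warmup_steps(num_samples=2000, batch_size=4, epochs=3))  # 150
```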
DPO Training
For preference-based alignment:
```python
response = requests.post(
    "https://api.synapsex.ai/v1/tenants/my_tenant/train/dpo",
    headers={"Authorization": "Bearer sk-xxx"},
    json={
        "min_preference_pairs": 100,
        "beta": 0.1,
        "epochs": 2,
        "learning_rate": 5e-5
    }
)
```
Monitoring Training
Check Job Status
```python
job_id = job["job_id"]  # from the training trigger response

response = requests.get(
    f"https://api.synapsex.ai/v1/tenants/my_tenant/train/{job_id}",
    headers={"Authorization": "Bearer sk-xxx"}
)
status = response.json()
print(f"Status: {status['status']}")
print(f"Progress: {status['progress'] * 100:.1f}%")
print(f"Current epoch: {status['current_epoch']}/{status['total_epochs']}")
print(f"Loss: {status['metrics']['loss']:.4f}")
```
Job Statuses
| Status | Description |
|---|---|
| queued | Waiting for resources |
| preparing | Loading data and model |
| running | Training in progress |
| completed | Successfully finished |
| failed | Training failed |
| cancelled | Cancelled by user |
Training Metrics
```json
{
  "job_id": "train_abc123",
  "status": "running",
  "progress": 0.65,
  "current_epoch": 2,
  "total_epochs": 3,
  "metrics": {
    "loss": 0.234,
    "learning_rate": 1.8e-4,
    "grad_norm": 0.45,
    "samples_processed": 1500,
    "tokens_processed": 450000
  },
  "resources": {
    "gpus": 4,
    "gpu_memory_used_gb": 180,
    "training_speed": "1200 tokens/sec"
  }
}
```
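The progress field lends itself to a rough ETA if you track elapsed wall-clock time yourself; this linear extrapolation is a client-side convenience, not a field the API returns:

```python
def estimate_remaining_seconds(progress: float, elapsed_seconds: float) -> float:
    """Linear ETA: remaining = elapsed * (1 - progress) / progress."""
    if not 0.0 < progress <= 1.0:
        raise ValueError("progress must be in (0, 1]")
    return elapsed_seconds * (1.0 - progress) / progress

# 65% done after 1300 s of training -> roughly 700 s to go
print(round(estimate_remaining_seconds(0.65, 1300)))  # 700
```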
LoRA Management
List LoRA Versions
```python
response = requests.get(
    "https://api.synapsex.ai/v1/tenants/my_tenant/lora",
    headers={"Authorization": "Bearer sk-xxx"}
)
loras = response.json()
for lora in loras['versions']:
    print(f"Version: {lora['version']}")
    print(f"  Created: {lora['created_at']}")
    print(f"  Samples: {lora['training_samples']}")
    print(f"  Active: {lora['is_active']}")
```
Activate LoRA Version
```python
response = requests.post(
    "https://api.synapsex.ai/v1/tenants/my_tenant/lora/v3/activate",
    headers={"Authorization": "Bearer sk-xxx"}
)
```
Rollback to Previous Version
```python
response = requests.post(
    "https://api.synapsex.ai/v1/tenants/my_tenant/lora/v2/activate",
    headers={"Authorization": "Bearer sk-xxx"}
)
```
A/B Testing
Route traffic between LoRA versions:
```python
response = requests.patch(
    "https://api.synapsex.ai/v1/tenants/my_tenant/settings",
    headers={"Authorization": "Bearer sk-xxx"},
    json={
        "lora_ab_test": {
            "enabled": True,
            "versions": {
                "v2": 0.3,  # 30% traffic
                "v3": 0.7   # 70% traffic
            }
        }
    }
)
```
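If you want each user to consistently land on the same version while the split is active, hash the user ID into a [0, 1) bucket. This sticky-routing sketch is a client-side assumption; the platform may already pin users server-side:

```python
import hashlib

def pick_version(user_id: str, weights: dict) -> str:
    """Deterministically map a user to a version in proportion to the weights."""
    digest = hashlib.sha256(user_id.encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # stable value in [0, 1)
    cumulative = 0.0
    for version, weight in sorted(weights.items()):
        cumulative += weight
        if bucket < cumulative:
            return version
    return version  # guard against floating-point rounding at the top end

weights = {"v2": 0.3, "v3": 0.7}
assert pick_version("user_789", weights) == pick_version("user_789", weights)
```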
Data Preparation
Using Datasets (DataHub)
Upload training data via DataHub:
```python
# 1. Register dataset
response = requests.post(
    "https://api.synapsex.ai/v1/tenants/my_tenant/datasets/register",
    headers={"Authorization": "Bearer sk-xxx"},
    json={
        "name": "training_conversations",
        "type": "conversations",
        "data_use": {
            "allow_rag": False,
            "allow_private_training": True
        }
    }
)
dataset_id = response.json()['id']

# 2. Upload conversations
response = requests.post(
    f"https://api.synapsex.ai/v1/tenants/my_tenant/datasets/{dataset_id}/upload/conversations",
    headers={"Authorization": "Bearer sk-xxx"},
    json={
        "conversations": [
            {
                "messages": [
                    {"role": "user", "content": "What's your return policy?"},
                    {"role": "assistant", "content": "Items can be returned within 30 days..."}
                ]
            }
        ]
    }
)
```
Data Format
SFT Format
```json
{
  "conversations": [
    {
      "messages": [
        {"role": "system", "content": "You are a helpful pharmacy assistant."},
        {"role": "user", "content": "What is the dosage for ibuprofen?"},
        {"role": "assistant", "content": "For adults, the typical dosage is 200-400mg every 4-6 hours..."}
      ]
    }
  ]
}
```
DPO Format
```json
{
  "preferences": [
    {
      "prompt": "Explain our shipping policy",
      "chosen": "We offer free shipping on orders over $50. Standard shipping takes 3-5 business days...",
      "rejected": "Check the website for shipping info."
    }
  ]
}
```
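A lightweight structural check before uploading catches malformed records early; this validator is a sketch that enforces only the shapes shown above, not the API's full validation:

```python
def validate_sft(record: dict) -> list:
    """Check an SFT conversation: known roles and non-empty content."""
    errors = []
    messages = record.get("messages", [])
    if not messages:
        errors.append("no messages")
    for i, msg in enumerate(messages):
        if msg.get("role") not in {"system", "user", "assistant"}:
            errors.append(f"message {i}: bad role {msg.get('role')!r}")
        if not msg.get("content", "").strip():
            errors.append(f"message {i}: empty content")
    return errors

def validate_dpo(record: dict) -> list:
    """Check a DPO pair: prompt, chosen, and rejected must all be non-empty."""
    return [f"missing or empty {key!r}" for key in ("prompt", "chosen", "rejected")
            if not record.get(key, "").strip()]

print(validate_dpo({"prompt": "Explain our shipping policy", "chosen": "We offer..."}))
```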
Best Practices
Data Quality
✅ Do:
- Use high-quality, diverse examples
- Include edge cases and corrections
- Balance positive and negative examples for DPO
- Validate data format before upload
❌ Don't:
- Use synthetic/generated data only
- Include PII in training data
- Overtrain on small datasets (causes overfitting)
Training Configuration
| Use Case | LoRA r | Epochs | Learning Rate |
|---|---|---|---|
| Small corrections | 8 | 2 | 3e-4 |
| Domain adaptation | 16 | 3 | 2e-4 |
| Major behavior change | 32 | 5 | 1e-4 |
| Continuous learning | 16 | 1 | 5e-5 |
Monitoring
- Check loss curves for convergence
- Validate on held-out examples
- A/B test before full deployment
- Monitor production metrics after activation
Pricing
| Plan | Training Jobs/Month | LoRA Storage |
|---|---|---|
| Free | 0 | - |
| Pro | 2 | 1 version |
| Hybrid | 10 | 5 versions |
| Enterprise | Unlimited | Unlimited |
Training costs:
- SFT: $0.10 per 1000 training samples
- DPO: $0.15 per 1000 preference pairs
- GPU time included in job cost
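Those per-sample rates give a quick back-of-the-envelope estimate; the figures are copied from the list above, so confirm current pricing before budgeting:

```python
def training_cost_usd(method: str, num_samples: int) -> float:
    """Estimate job cost from the documented per-1000-sample rates."""
    rates = {"sft": 0.10, "dpo": 0.15}  # USD per 1000 samples / preference pairs
    return rates[method] * num_samples / 1000

print(f"${training_cost_usd('sft', 5000):.2f}")  # $0.50
```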
Next Steps
- 📖 API Reference - Training endpoints
- 🔄 Multi-Tenancy - Per-tenant models
- ⚛️ Quantum Reranking - Enhance trained models