SynapseX Training System
Fine-tune models with your data using LoRA adapters, Supervised Fine-Tuning (SFT), and Direct Preference Optimization (DPO).
Overview
SynapseX training enables:
- Per-Tenant Customization - Each customer gets their own model adaptation
- Feedback-Driven Learning - Automatic training from user corrections
- Lightweight Adapters - LoRA enables efficient fine-tuning
- HPC Infrastructure - LUMI supercomputer for fast training
Training Methods
LoRA (Low-Rank Adaptation)
Efficient fine-tuning that adds small trainable matrices to the base model (see the configuration sketch after this list):
- Memory Efficient - Train with minimal GPU memory
- Fast Training - Complete in hours, not days
- Easy Switching - Load different adapters per tenant
- Preserves Base Knowledge - Only modifies target behaviors
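Conceptually, a LoRA adapter freezes the base weights and learns two small low-rank matrices per target layer; the lora_r and lora_alpha settings in the training API control their rank and scaling. The sketch below is illustrative only and uses the open-source peft library, which is an assumption about tooling rather than part of the SynapseX API:
# Illustrative only: roughly what lora_r / lora_alpha correspond to when expressed
# with the open-source peft library. SynapseX runs the equivalent server-side.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")  # placeholder base model
config = LoraConfig(
    r=16,                                 # lora_r: rank of the trainable low-rank matrices
    lora_alpha=32,                        # lora_alpha: scaling applied to the adapter output
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)
model.print_trainable_parameters()        # typically well under 1% of the base weights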
SFT (Supervised Fine-Tuning)
Train on input-output pairs from your data:
Input: "What are the store hours?"
Output: "Our store is open Monday-Friday 9am-6pm, Saturday 10am-4pm."
DPO (Direct Preference Optimization)
Train on chosen/rejected response pairs to align the model with user preferences:
Prompt: "Explain our return policy"
Chosen: "You can return items within 30 days with receipt..." ✓
Rejected: "Returns are complicated and require manager approval..." ✗
Architecture
SYNAPSEX TRAINING SYSTEM

  DATA COLLECTION
    Feedback API | Datasets (DataHub) | Corrections | Ratings
                        |
                        v
                 Training Queue
                        |
                        v
  LUMI HPC CLUSTER
    8 x AMD MI250X (64 GB each)
    Total: 512 GB VRAM | DeepSpeed ZeRO-3 | ROCm 6.x
                        |
                        v
  LoRA ADAPTERS (one per tenant)
    tenant_001: lora_v3.bin | tenant_002: lora_v5.bin | tenant_003: lora_v2.bin
Feedback Collection
Collecting User Feedback
Feedback drives automatic training:
import requests
# Submit feedback via API
response = requests.post(
"https://api.synapsex.ai/v1/tenants/my_tenant/feedback",
headers={"Authorization": "Bearer sk-xxx"},
json={
"conversation_id": "conv_123",
"message_id": "msg_456",
"type": "correction",
"correction": "The correct answer is 500mg, not 250mg",
"metadata": {
"user_id": "user_789",
"category": "dosage"
}
}
)
Feedback Types
| Type | Description | Training Use |
|---|---|---|
| thumbs_up | Positive signal | Positive example for DPO |
| thumbs_down | Negative signal | Negative example for DPO |
| correction | User-provided fix | Direct training example |
| rating | 1-5 score | Weight for training |
| report | Content issue | Exclusion from training |
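The correction example above covers only one of these types. The others go to the same feedback endpoint; the snippet below assumes the payload mirrors the correction example, and the rating field name is a guess made by analogy with the type name:
import requests

HEADERS = {"Authorization": "Bearer sk-xxx"}
FEEDBACK_URL = "https://api.synapsex.ai/v1/tenants/my_tenant/feedback"

# Thumbs-up on a specific assistant message (positive signal for DPO)
requests.post(FEEDBACK_URL, headers=HEADERS, json={
    "conversation_id": "conv_123",
    "message_id": "msg_456",
    "type": "thumbs_up",
})

# 1-5 rating used to weight the example during training
# (the "rating" field name is an assumption, by analogy with "correction")
requests.post(FEEDBACK_URL, headers=HEADERS, json={
    "conversation_id": "conv_123",
    "message_id": "msg_456",
    "type": "rating",
    "rating": 4,
})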
Check Training Readiness
response = requests.get(
"https://api.synapsex.ai/v1/tenants/my_tenant/feedback/stats",
headers={"Authorization": "Bearer sk-xxx"}
)
stats = response.json()
print(f"Total feedback: {stats['total']}")
print(f"Ready for training: {stats['ready_for_training']}")
# Ready when: corrections >= 50 or (thumbs_up + thumbs_down) >= 100
Triggering Training
LoRA Training
response = requests.post(
"https://api.synapsex.ai/v1/tenants/my_tenant/train/lora",
headers={"Authorization": "Bearer sk-xxx"},
json={
"min_feedback_count": 100,
"lora_r": 16,
"lora_alpha": 32,
"epochs": 3,
"learning_rate": 2e-4,
"batch_size": 4
}
)
job = response.json()
print(f"Training job: {job['job_id']}")
print(f"Estimated time: {job['estimated_duration_minutes']} minutes")
Training Parameters
| Parameter | Default | Description |
|---|---|---|
| min_feedback_count | 50 | Minimum feedback samples required |
| lora_r | 16 | LoRA rank (higher = more capacity) |
| lora_alpha | 32 | LoRA scaling factor |
| epochs | 3 | Training epochs |
| learning_rate | 2e-4 | Learning rate |
| batch_size | 4 | Training batch size |
| warmup_ratio | 0.1 | Warmup ratio |
| weight_decay | 0.01 | L2 regularization |
DPO Training
For preference-based alignment:
response = requests.post(
"https://api.synapsex.ai/v1/tenants/my_tenant/train/dpo",
headers={"Authorization": "Bearer sk-xxx"},
json={
"min_preference_pairs": 100,
"beta": 0.1,
"epochs": 2,
"learning_rate": 5e-5
}
)
Monitoring Training
Check Job Status
response = requests.get(
f"https://api.synapsex.ai/v1/tenants/my_tenant/train/{job_id}",
headers={"Authorization": "Bearer sk-xxx"}
)
status = response.json()
print(f"Status: {status['status']}")
print(f"Progress: {status['progress'] * 100:.1f}%")
print(f"Current epoch: {status['current_epoch']}/{status['total_epochs']}")
print(f"Loss: {status['metrics']['loss']:.4f}")
Job Statuses
| Status | Description |
|---|---|
| queued | Waiting for resources |
| preparing | Loading data and model |
| running | Training in progress |
| completed | Successfully finished |
| failed | Training failed |
| cancelled | Cancelled by user |
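A typical pattern is to poll the status endpoint shown above until the job reaches one of the terminal states (completed, failed, or cancelled). A minimal sketch, with the polling interval chosen arbitrarily:
import time
import requests

HEADERS = {"Authorization": "Bearer sk-xxx"}
job_id = "train_abc123"  # returned by the training request above
url = f"https://api.synapsex.ai/v1/tenants/my_tenant/train/{job_id}"

while True:
    status = requests.get(url, headers=HEADERS).json()
    print(f"{status['status']}: {status['progress'] * 100:.1f}%")
    if status["status"] in ("completed", "failed", "cancelled"):
        break
    time.sleep(60)  # jobs run for hours, so poll sparingly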
Training Metrics
{
"job_id": "train_abc123",
"status": "running",
"progress": 0.65,
"current_epoch": 2,
"total_epochs": 3,
"metrics": {
"loss": 0.234,
"learning_rate": 1.8e-4,
"grad_norm": 0.45,
"samples_processed": 1500,
"tokens_processed": 450000
},
"resources": {
"gpus": 4,
"gpu_memory_used_gb": 180,
"training_speed": "1200 tokens/sec"
}
}
LoRA Management
List LoRA Versions
response = requests.get(
"https://api.synapsex.ai/v1/tenants/my_tenant/lora",
headers={"Authorization": "Bearer sk-xxx"}
)
loras = response.json()
for lora in loras['versions']:
print(f"Version: {lora['version']}")
print(f" Created: {lora['created_at']}")
print(f" Samples: {lora['training_samples']}")
print(f" Active: {lora['is_active']}")
Activate LoRA Version
response = requests.post(
"https://api.synapsex.ai/v1/tenants/my_tenant/lora/v3/activate",
headers={"Authorization": "Bearer sk-xxx"}
)
Rollback to Previous Version
response = requests.post(
"https://api.synapsex.ai/v1/tenants/my_tenant/lora/v2/activate",
headers={"Authorization": "Bearer sk-xxx"}
)
A/B Testing
Route traffic between LoRA versions:
response = requests.patch(
"https://api.synapsex.ai/v1/tenants/my_tenant/settings",
headers={"Authorization": "Bearer sk-xxx"},
json={
"lora_ab_test": {
"enabled": True,
"versions": {
"v2": 0.3, # 30% traffic
"v3": 0.7 # 70% traffic
}
}
}
)
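When one version wins, you can end the experiment with the same settings endpoint and promote the winner with the activation endpoint shown earlier. A sketch assuming v3 won and that the versions map may be omitted when disabling the test:
import requests

HEADERS = {"Authorization": "Bearer sk-xxx"}
BASE = "https://api.synapsex.ai/v1/tenants/my_tenant"

# Stop splitting traffic (assumes "versions" can be omitted when disabling)
requests.patch(f"{BASE}/settings", headers=HEADERS,
               json={"lora_ab_test": {"enabled": False}})

# Promote the winning adapter so it serves all traffic
requests.post(f"{BASE}/lora/v3/activate", headers=HEADERS)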
Data Preparation
Using Datasets (DataHub)
Upload training data via DataHub:
# 1. Register dataset
response = requests.post(
"https://api.synapsex.ai/v1/tenants/my_tenant/datasets/register",
headers={"Authorization": "Bearer sk-xxx"},
json={
"name": "training_conversations",
"type": "conversations",
"data_use": {
"allow_rag": False,
"allow_private_training": True
}
}
)
dataset_id = response.json()['id']
# 2. Upload conversations
response = requests.post(
f"https://api.synapsex.ai/v1/tenants/my_tenant/datasets/{dataset_id}/upload/conversations",
headers={"Authorization": "Bearer sk-xxx"},
json={
"conversations": [
{
"messages": [
{"role": "user", "content": "What's your return policy?"},
{"role": "assistant", "content": "Items can be returned within 30 days..."}
]
}
]
}
)
Data Format
SFT Format
{
"conversations": [
{
"messages": [
{"role": "system", "content": "You are a helpful pharmacy assistant."},
{"role": "user", "content": "What is the dosage for ibuprofen?"},
{"role": "assistant", "content": "For adults, the typical dosage is 200-400mg every 4-6 hours..."}
]
}
]
}
DPO Format
{
"preferences": [
{
"prompt": "Explain our shipping policy",
"chosen": "We offer free shipping on orders over $50. Standard shipping takes 3-5 business days...",
"rejected": "Check the website for shipping info."
}
]
}
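It can be worth sanity-checking records against these two shapes locally before uploading. A minimal validator sketch (the checks are illustrative, not an official schema):
def validate_sft(conversation: dict) -> list[str]:
    """Return problems found in a single SFT conversation record."""
    errors = []
    messages = conversation.get("messages", [])
    if not messages:
        errors.append("conversation has no messages")
    for m in messages:
        if m.get("role") not in ("system", "user", "assistant"):
            errors.append(f"unknown role: {m.get('role')!r}")
        if not m.get("content"):
            errors.append("message with empty content")
    return errors

def validate_dpo(pair: dict) -> list[str]:
    """Return problems found in a single DPO preference pair."""
    return [f"missing field: {field}"
            for field in ("prompt", "chosen", "rejected") if not pair.get(field)]

print(validate_dpo({"prompt": "Explain our shipping policy", "chosen": "Free over $50...", "rejected": ""}))
# -> ['missing field: rejected']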
Best Practices
Data Quality
✅ Do:
- Use high-quality, diverse examples
- Include edge cases and corrections
- Balance positive and negative examples for DPO
- Validate data format before upload
❌ Don't:
- Rely solely on synthetic/generated data
- Include PII in training data
- Overtrain on small datasets (causes overfitting)
Training Configuration
| Use Case | LoRA r | Epochs | Learning Rate |
|---|---|---|---|
| Small corrections | 8 | 2 | 3e-4 |
| Domain adaptation | 16 | 3 | 2e-4 |
| Major behavior change | 32 | 5 | 1e-4 |
| Continuous learning | 16 | 1 | 5e-5 |
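These rows translate directly into request payloads for the /train/lora endpoint shown earlier. A small helper that keeps the presets in one place (the preset names are labels for this sketch, not API values):
import requests

# Suggested starting points from the table above
PRESETS = {
    "small_corrections":     {"lora_r": 8,  "epochs": 2, "learning_rate": 3e-4},
    "domain_adaptation":     {"lora_r": 16, "epochs": 3, "learning_rate": 2e-4},
    "major_behavior_change": {"lora_r": 32, "epochs": 5, "learning_rate": 1e-4},
    "continuous_learning":   {"lora_r": 16, "epochs": 1, "learning_rate": 5e-5},
}

def start_lora_training(tenant: str, preset: str, api_key: str) -> dict:
    """Start a LoRA job with one of the recommended configurations."""
    response = requests.post(
        f"https://api.synapsex.ai/v1/tenants/{tenant}/train/lora",
        headers={"Authorization": f"Bearer {api_key}"},
        json=PRESETS[preset],
    )
    return response.json()

job = start_lora_training("my_tenant", "domain_adaptation", "sk-xxx")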
Monitoring
- Check loss curves for convergence
- Validate on held-out examples
- A/B test before full deployment
- Monitor production metrics after activation
Pricing
| Plan | Training Jobs/Month | LoRA Storage |
|---|---|---|
| Free | 0 | - |
| Pro | 2 | 1 version |
| Hybrid | 10 | 5 versions |
| Enterprise | Unlimited | Unlimited |
Training costs:
- SFT: $0.10 per 1000 training samples
- DPO: $0.15 per 1000 preference pairs
- GPU time included in job cost
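As a rough worked example, an SFT job on 4,000 training samples would cost about 4 × $0.10 = $0.40 in per-sample fees, and a DPO job on 2,000 preference pairs about 2 × $0.15 = $0.30, assuming no charges beyond the rates listed above.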
Next Steps
- API Reference - Training endpoints
- Multi-Tenancy - Per-tenant models
- Quantum Reranking - Enhance trained models