
SynapseX Training System

Fine-tune models with your data using LoRA adapters, Supervised Fine-Tuning (SFT), and Direct Preference Optimization (DPO).

Overview

SynapseX training enables:

  • Per-Tenant Customization - Each customer gets their own model adaptation
  • Feedback-Driven Learning - Automatic training from user corrections
  • Lightweight Adapters - LoRA enables efficient fine-tuning
  • HPC Infrastructure - LUMI supercomputer for fast training

Training Methods

LoRA (Low-Rank Adaptation)

Efficient fine-tuning that freezes the base model and trains small low-rank matrices added alongside its weights:

  • Memory Efficient - Only the adapter weights need gradients and optimizer state
  • Fast Training - Complete in hours, not days
  • Easy Switching - Load different adapters per tenant
  • Preserves Base Knowledge - Base weights are not modified; only the adapter changes
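
To make the memory argument concrete, the sketch below counts trainable parameters for a single weight matrix under full fine-tuning and under LoRA. It assumes the standard LoRA formulation, where a frozen d_out x d_in weight gets two added matrices of shapes d_out x r and r x d_in. The 4096 x 4096 layer size is illustrative only and is not a SynapseX detail.

```python
def full_params(d_in: int, d_out: int) -> int:
    """Trainable parameters when the whole weight matrix is fine-tuned."""
    return d_in * d_out


def lora_params(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters for a rank-r LoRA adapter on the same matrix."""
    # One (d_out x r) matrix plus one (r x d_in) matrix.
    return r * (d_in + d_out)


# Illustrative layer size (an assumption, not taken from SynapseX).
d_in = d_out = 4096
full = full_params(d_in, d_out)
lora = lora_params(d_in, d_out, r=16)
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.4%}")
```

For this layer, rank 16 trains under 1% of the parameters that full fine-tuning would.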

SFT (Supervised Fine-Tuning)

Train on input-output pairs from your data:

Input: "What are the store hours?"
Output: "Our store is open Monday-Friday 9am-6pm, Saturday 10am-4pm."

DPO (Direct Preference Optimization)

Train on preference pairs to align with user preferences:

Prompt: "Explain our return policy"
Chosen: "You can return items within 30 days with receipt..." ✓
Rejected: "Returns are complicated and require manager approval..." ✗
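
For readers who want to see what a preference pair contributes during training, here is a minimal sketch of the standard DPO loss for one pair. It follows the published DPO objective and is not taken from the SynapseX implementation. The four log-probability arguments are assumed to be summed token log-probabilities of the chosen and rejected responses under the model being trained and under a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Standard DPO loss for a single preference pair."""
    # How much more the policy likes each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    x = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(x)) computed in a numerically stable way.
    if x >= 0:
        return math.log1p(math.exp(-x))
    return -x + math.log1p(math.exp(x))


# The loss falls below log(2) once the policy prefers the chosen response
# more strongly than the reference does.
print(dpo_loss(-5.0, -15.0, -10.0, -10.0))
```

A larger beta penalizes drifting from the reference model more strongly for the same log-probability gap.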

Architecture

┌─────────────────────────────────────────────────────────────────────────────┐
│                          SYNAPSEX TRAINING SYSTEM                           │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│   ┌─────────────────────────────────────────────────────────────────────┐   │
│   │                           DATA COLLECTION                           │   │
│   │                                                                     │   │
│   │   ┌──────────┐  ┌──────────┐  ┌───────────┐  ┌──────────┐           │   │
│   │   │ Feedback │  │ Datasets │  │Corrections│  │ Ratings  │           │   │
│   │   │   API    │  │ (DataHub)│  │           │  │          │           │   │
│   │   └────┬─────┘  └────┬─────┘  └─────┬─────┘  └────┬─────┘           │   │
│   │        │             │              │             │                 │   │
│   │        └─────────────┴──────────────┴─────────────┘                 │   │
│   │                             │                                       │   │
│   │                             ▼                                       │   │
│   │                    ┌──────────────────┐                             │   │
│   │                    │  Training Queue  │                             │   │
│   │                    └────────┬─────────┘                             │   │
│   └─────────────────────────────┼───────────────────────────────────────┘   │
│                                 │                                           │
│   ┌─────────────────────────────▼───────────────────────────────────────┐   │
│   │                          LUMI HPC CLUSTER                           │   │
│   │                                                                     │   │
│   │        ┌─────────┐   ┌─────────┐   ┌─────────┐   ┌─────────┐        │   │
│   │        │ MI250X  │   │ MI250X  │   │ MI250X  │   │ MI250X  │        │   │
│   │        │  64GB   │   │  64GB   │   │  64GB   │   │  64GB   │        │   │
│   │        └─────────┘   └─────────┘   └─────────┘   └─────────┘        │   │
│   │        ┌─────────┐   ┌─────────┐   ┌─────────┐   ┌─────────┐        │   │
│   │        │ MI250X  │   │ MI250X  │   │ MI250X  │   │ MI250X  │        │   │
│   │        │  64GB   │   │  64GB   │   │  64GB   │   │  64GB   │        │   │
│   │        └─────────┘   └─────────┘   └─────────┘   └─────────┘        │   │
│   │                                                                     │   │
│   │           Total: 512GB VRAM | DeepSpeed ZeRO-3 | ROCm 6.x           │   │
│   │                                                                     │   │
│   └─────────────────────────────────────────────────────────────────────┘   │
│                                 │                                           │
│                                 ▼                                           │
│   ┌─────────────────────────────────────────────────────────────────────┐   │
│   │                            LoRA ADAPTERS                            │   │
│   │                                                                     │   │
│   │      ┌──────────────┐    ┌──────────────┐    ┌──────────────┐       │   │
│   │      │  tenant_001  │    │  tenant_002  │    │  tenant_003  │       │   │
│   │      │ lora_v3.bin  │    │ lora_v5.bin  │    │ lora_v2.bin  │       │   │
│   │      └──────────────┘    └──────────────┘    └──────────────┘       │   │
│   │                                                                     │   │
│   └─────────────────────────────────────────────────────────────────────┘   │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

Feedback Collection

Collecting User Feedback

Feedback drives automatic training:

import requests

# Submit feedback via API
response = requests.post(
    "https://api.synapsex.ai/v1/tenants/my_tenant/feedback",
    headers={"Authorization": "Bearer sk-xxx"},
    json={
        "conversation_id": "conv_123",
        "message_id": "msg_456",
        "type": "correction",
        "correction": "The correct answer is 500mg, not 250mg",
        "metadata": {
            "user_id": "user_789",
            "category": "dosage"
        }
    }
)

Feedback Types

Type         Description        Training Use
thumbs_up    Positive signal    Positive example for DPO
thumbs_down  Negative signal    Negative example for DPO
correction   User-provided fix  Direct training example
rating       1-5 score          Weight for training
report       Content issue      Exclusion from training
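
A client can reject malformed feedback before it reaches the API. The helper below is a minimal sketch: `build_feedback` and `FEEDBACK_TYPES` are hypothetical names, and the only fields it checks are the ones shown in the request example above (`conversation_id`, `message_id`, `type`, `correction`). Whether the server enforces the same rules is an assumption.

```python
# Feedback types listed in the table above.
FEEDBACK_TYPES = {"thumbs_up", "thumbs_down", "correction", "rating", "report"}


def build_feedback(conversation_id: str, message_id: str,
                   feedback_type: str, **fields) -> dict:
    """Build a feedback request body, rejecting obviously invalid input."""
    if feedback_type not in FEEDBACK_TYPES:
        raise ValueError(f"unknown feedback type: {feedback_type!r}")
    # A correction without corrected text cannot become a training example.
    if feedback_type == "correction" and not fields.get("correction"):
        raise ValueError("type 'correction' requires a non-empty 'correction' field")
    return {
        "conversation_id": conversation_id,
        "message_id": message_id,
        "type": feedback_type,
        **fields,
    }


payload = build_feedback("conv_123", "msg_456", "correction",
                         correction="The correct answer is 500mg, not 250mg")
```

The resulting dictionary can be passed as the `json=` argument of the POST request shown earlier.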

Check Training Readiness

response = requests.get(
    "https://api.synapsex.ai/v1/tenants/my_tenant/feedback/stats",
    headers={"Authorization": "Bearer sk-xxx"}
)

stats = response.json()
print(f"Total feedback: {stats['total']}")
print(f"Ready for training: {stats['ready_for_training']}")
# Ready when: corrections >= 50 or (thumbs_up + thumbs_down) >= 100
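
The readiness rule in the comment above can be written as a small predicate, which is handy for estimating locally how far a tenant is from its first training run. This is a sketch of the stated rule only. The function name is hypothetical, and the counts are passed in explicitly because the per-type field names in the stats response are not documented here.

```python
def ready_for_training(corrections: int, thumbs_up: int, thumbs_down: int) -> bool:
    """Mirror the documented rule: 50 corrections, or 100 thumbs in total."""
    return corrections >= 50 or (thumbs_up + thumbs_down) >= 100


print(ready_for_training(corrections=12, thumbs_up=70, thumbs_down=35))
```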

Triggering Training

LoRA Training

response = requests.post(
    "https://api.synapsex.ai/v1/tenants/my_tenant/train/lora",
    headers={"Authorization": "Bearer sk-xxx"},
    json={
        "min_feedback_count": 100,
        "lora_r": 16,
        "lora_alpha": 32,
        "epochs": 3,
        "learning_rate": 2e-4,
        "batch_size": 4
    }
)

job = response.json()
print(f"Training job: {job['job_id']}")
print(f"Estimated time: {job['estimated_duration_minutes']} minutes")

Training Parameters

Parameter           Default  Description
min_feedback_count  50       Minimum feedback samples required
lora_r              16       LoRA rank (higher = more capacity)
lora_alpha          32       LoRA scaling factor
epochs              3        Training epochs
learning_rate       2e-4     Learning rate
batch_size          4        Training batch size
warmup_ratio        0.1      Warmup ratio
weight_decay        0.01     L2 regularization
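
One way to keep request bodies consistent is to start from the documented defaults and override only what changes. The sketch below does that and also computes the ratio `lora_alpha / lora_r`, which in the standard LoRA formulation scales the adapter update. The helper names are hypothetical, and whether SynapseX applies exactly this scaling is an assumption.

```python
# Defaults copied from the parameter table above.
LORA_DEFAULTS = {
    "min_feedback_count": 50,
    "lora_r": 16,
    "lora_alpha": 32,
    "epochs": 3,
    "learning_rate": 2e-4,
    "batch_size": 4,
    "warmup_ratio": 0.1,
    "weight_decay": 0.01,
}


def build_lora_config(**overrides) -> dict:
    """Merge overrides into the defaults, rejecting misspelled parameter names."""
    unknown = sorted(set(overrides) - set(LORA_DEFAULTS))
    if unknown:
        raise ValueError(f"unknown training parameters: {unknown}")
    config = {**LORA_DEFAULTS, **overrides}
    if config["lora_r"] <= 0:
        raise ValueError("lora_r must be a positive integer")
    return config


def lora_scale(config: dict) -> float:
    """Scaling applied to the adapter update in standard LoRA (alpha / r)."""
    return config["lora_alpha"] / config["lora_r"]


config = build_lora_config(epochs=5)
print(config["epochs"], lora_scale(config))
```

With the defaults the scale is 2.0. Raising `lora_r` without also raising `lora_alpha` lowers it, which is worth keeping in mind when comparing runs at different ranks.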

DPO Training

For preference-based alignment:

response = requests.post(
    "https://api.synapsex.ai/v1/tenants/my_tenant/train/dpo",
    headers={"Authorization": "Bearer sk-xxx"},
    json={
        "min_preference_pairs": 100,
        "beta": 0.1,
        "epochs": 2,
        "learning_rate": 5e-5
    }
)

Monitoring Training

Check Job Status

# job_id comes from the response to the training request
job_id = job["job_id"]

response = requests.get(
    f"https://api.synapsex.ai/v1/tenants/my_tenant/train/{job_id}",
    headers={"Authorization": "Bearer sk-xxx"}
)

status = response.json()
print(f"Status: {status['status']}")
print(f"Progress: {status['progress'] * 100:.1f}%")
print(f"Current epoch: {status['current_epoch']}/{status['total_epochs']}")
print(f"Loss: {status['metrics']['loss']:.4f}")

Job Statuses

Status     Description
queued     Waiting for resources
preparing  Loading data and model
running    Training in progress
completed  Successfully finished
failed     Training failed
cancelled  Cancelled by user
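
Since `completed`, `failed`, and `cancelled` are the terminal states in the table above, a client can poll until one of them appears. The following is a minimal polling sketch. `wait_for_job` is a hypothetical helper, and the 30-second interval and six-hour timeout are arbitrary assumptions rather than documented limits. The status fetcher is passed in as a function so the loop can be exercised without network access.

```python
import time

# Statuses after which the job will not change again (from the table above).
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_for_job(fetch_status, poll_seconds=30.0, timeout_seconds=6 * 3600,
                 sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until the job reaches a terminal status."""
    deadline = clock() + timeout_seconds
    while True:
        status = fetch_status()
        if status["status"] in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"job still {status['status']!r} after timeout")
        sleep(poll_seconds)


# Offline demonstration with a canned sequence of responses.
canned = iter([{"status": "queued"}, {"status": "running"}, {"status": "completed"}])
final = wait_for_job(lambda: next(canned), poll_seconds=0, sleep=lambda s: None)
print(final["status"])
```

In real use, `fetch_status` would be a small function that performs the GET request from Check Job Status and returns `response.json()`.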

Training Metrics

{
  "job_id": "train_abc123",
  "status": "running",
  "progress": 0.65,
  "current_epoch": 2,
  "total_epochs": 3,
  "metrics": {
    "loss": 0.234,
    "learning_rate": 1.8e-4,
    "grad_norm": 0.45,
    "samples_processed": 1500,
    "tokens_processed": 450000
  },
  "resources": {
    "gpus": 4,
    "gpu_memory_used_gb": 180,
    "training_speed": "1200 tokens/sec"
  }
}

LoRA Management

List LoRA Versions

response = requests.get(
    "https://api.synapsex.ai/v1/tenants/my_tenant/lora",
    headers={"Authorization": "Bearer sk-xxx"}
)

loras = response.json()
for lora in loras['versions']:
    print(f"Version: {lora['version']}")
    print(f"  Created: {lora['created_at']}")
    print(f"  Samples: {lora['training_samples']}")
    print(f"  Active: {lora['is_active']}")

Activate LoRA Version

response = requests.post(
    "https://api.synapsex.ai/v1/tenants/my_tenant/lora/v3/activate",
    headers={"Authorization": "Bearer sk-xxx"}
)

Rollback to Previous Version

response = requests.post(
    "https://api.synapsex.ai/v1/tenants/my_tenant/lora/v2/activate",
    headers={"Authorization": "Bearer sk-xxx"}
)
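
Rollback is simply activation of an older version, so the only extra step is deciding which version to activate. The sketch below picks the version created immediately before the active one from a list shaped like the List LoRA Versions response. `previous_version` is a hypothetical helper and the timestamps are made-up sample data. It assumes `created_at` values are ISO 8601 strings in a uniform format, so that sorting them as text is chronological.

```python
def previous_version(versions):
    """Return the version created just before the active one, or None."""
    ordered = sorted(versions, key=lambda v: v["created_at"])
    for index, entry in enumerate(ordered):
        if entry["is_active"]:
            return ordered[index - 1]["version"] if index > 0 else None
    return None  # no active version found


# Made-up sample data in the shape of the listing response.
sample = [
    {"version": "v1", "created_at": "2024-01-05T10:00:00Z", "is_active": False},
    {"version": "v3", "created_at": "2024-03-02T09:30:00Z", "is_active": True},
    {"version": "v2", "created_at": "2024-02-11T14:15:00Z", "is_active": False},
]
print(previous_version(sample))
```

The returned version string can be substituted into the activate URL shown above.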

A/B Testing

Route traffic between LoRA versions:

response = requests.patch(
    "https://api.synapsex.ai/v1/tenants/my_tenant/settings",
    headers={"Authorization": "Bearer sk-xxx"},
    json={
        "lora_ab_test": {
            "enabled": True,
            "versions": {
                "v2": 0.3,  # 30% traffic
                "v3": 0.7   # 70% traffic
            }
        }
    }
)
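
The traffic weights are expected to add up to 1, and checking that before sending the request avoids a confusing split. The sketch below validates a `versions` mapping and also illustrates one common way weighted routing can be made sticky, by hashing a stable key such as a conversation ID. How SynapseX actually routes requests is not documented here, so `pick_version` is an illustration of the general technique only, and both function names are hypothetical.

```python
import hashlib


def validate_split(versions: dict) -> None:
    """Raise ValueError unless weights are non-negative and sum to 1."""
    if any(weight < 0 for weight in versions.values()):
        raise ValueError("traffic weights must be non-negative")
    total = sum(versions.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError(f"traffic weights sum to {total}, expected 1.0")


def pick_version(versions: dict, key: str) -> str:
    """Deterministically map a key to a version according to the weights."""
    validate_split(versions)
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    point = int.from_bytes(digest[:8], "big") / 2**64  # uniform in [0, 1)
    cumulative = 0.0
    names = sorted(versions)
    for name in names:
        cumulative += versions[name]
        if point < cumulative:
            return name
    return names[-1]  # guard against floating-point rounding


split = {"v2": 0.3, "v3": 0.7}
print(pick_version(split, "conv_123"))
```

Because the choice depends only on the key, the same conversation always lands on the same adapter version for a given split.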

Data Preparation

Using Datasets (DataHub)

Upload training data via DataHub:

# 1. Register dataset
response = requests.post(
    "https://api.synapsex.ai/v1/tenants/my_tenant/datasets/register",
    headers={"Authorization": "Bearer sk-xxx"},
    json={
        "name": "training_conversations",
        "type": "conversations",
        "data_use": {
            "allow_rag": False,
            "allow_private_training": True
        }
    }
)

dataset_id = response.json()['id']

# 2. Upload conversations
response = requests.post(
    f"https://api.synapsex.ai/v1/tenants/my_tenant/datasets/{dataset_id}/upload/conversations",
    headers={"Authorization": "Bearer sk-xxx"},
    json={
        "conversations": [
            {
                "messages": [
                    {"role": "user", "content": "What's your return policy?"},
                    {"role": "assistant", "content": "Items can be returned within 30 days..."}
                ]
            }
        ]
    }
)

Data Format

SFT Format

{
  "conversations": [
    {
      "messages": [
        {"role": "system", "content": "You are a helpful pharmacy assistant."},
        {"role": "user", "content": "What is the dosage for ibuprofen?"},
        {"role": "assistant", "content": "For adults, the typical dosage is 200-400mg every 4-6 hours..."}
      ]
    }
  ]
}
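
Existing question-and-answer data often lives as plain input/output pairs rather than message lists. The sketch below converts such pairs into the SFT structure shown above. `pairs_to_sft` is a hypothetical helper, and it assumes single-turn conversations with an optional shared system prompt.

```python
def pairs_to_sft(pairs, system_prompt=None):
    """Convert (user_text, assistant_text) pairs into the SFT payload shape."""
    conversations = []
    for user_text, assistant_text in pairs:
        messages = []
        if system_prompt:
            messages.append({"role": "system", "content": system_prompt})
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
        conversations.append({"messages": messages})
    return {"conversations": conversations}


sft_payload = pairs_to_sft(
    [("What are the store hours?",
      "Our store is open Monday-Friday 9am-6pm, Saturday 10am-4pm.")],
    system_prompt="You are a helpful store assistant.",
)
```

The result has the same top-level `conversations` key that the upload endpoint in Using Datasets (DataHub) receives.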

DPO Format

{
  "preferences": [
    {
      "prompt": "Explain our shipping policy",
      "chosen": "We offer free shipping on orders over $50. Standard shipping takes 3-5 business days...",
      "rejected": "Check the website for shipping info."
    }
  ]
}
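
A quick structural check before upload catches the most common problems in preference data. The validator below is a minimal sketch based only on the three keys shown above. `validate_preferences` is a hypothetical helper, and the rule that `chosen` and `rejected` must differ is an assumption (an identical pair carries no preference signal), not a documented API constraint.

```python
REQUIRED_KEYS = ("prompt", "chosen", "rejected")


def validate_preferences(payload: dict) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    records = payload.get("preferences")
    if not isinstance(records, list):
        return ["'preferences' must be a list"]
    errors = []
    for i, record in enumerate(records):
        if not isinstance(record, dict):
            errors.append(f"preferences[{i}] must be an object")
            continue
        for key in REQUIRED_KEYS:
            value = record.get(key)
            if not isinstance(value, str) or not value.strip():
                errors.append(f"preferences[{i}].{key} must be a non-empty string")
        chosen = record.get("chosen")
        if chosen is not None and chosen == record.get("rejected"):
            errors.append(f"preferences[{i}]: chosen and rejected are identical")
    return errors


good = {"preferences": [{
    "prompt": "Explain our shipping policy",
    "chosen": "We offer free shipping on orders over $50.",
    "rejected": "Check the website for shipping info.",
}]}
print(validate_preferences(good))
```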

Best Practices

Data Quality

Do:

  • Use high-quality, diverse examples
  • Include edge cases and corrections
  • Balance positive and negative examples for DPO
  • Validate data format before upload

Don't:

  • Rely only on synthetic or generated data
  • Include PII in training data
  • Overtrain on small datasets (causes overfitting)

Training Configuration

Use Case               LoRA r  Epochs  Learning Rate
Small corrections      8       2       3e-4
Domain adaptation      16      3       2e-4
Major behavior change  32      5       1e-4
Continuous learning    16      1       5e-5
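
The recommendations above can be kept as named presets so that training requests stay consistent across a team. This is a sketch only: the preset keys and the `preset_payload` helper are hypothetical names, and the values reflect the table as read here.

```python
# Hypothetical preset names; values follow the table above.
PRESETS = {
    "small_corrections":     {"lora_r": 8,  "epochs": 2, "learning_rate": 3e-4},
    "domain_adaptation":     {"lora_r": 16, "epochs": 3, "learning_rate": 2e-4},
    "major_behavior_change": {"lora_r": 32, "epochs": 5, "learning_rate": 1e-4},
    "continuous_learning":   {"lora_r": 16, "epochs": 1, "learning_rate": 5e-5},
}


def preset_payload(use_case: str, **overrides) -> dict:
    """Return a copy of the preset for use_case with optional overrides."""
    if use_case not in PRESETS:
        raise KeyError(f"unknown use case: {use_case!r}")
    return {**PRESETS[use_case], **overrides}


body = preset_payload("major_behavior_change", batch_size=4)
print(body)
```

The returned dictionary is intended as the `json=` body of the LoRA training request.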

Monitoring

  • Check loss curves for convergence
  • Validate on held-out examples
  • A/B test before full deployment
  • Monitor production metrics after activation
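
Validating on held-out examples requires setting some data aside before upload. The sketch below does a seeded shuffle-and-split so the same examples are held out on every run. `split_holdout` is a hypothetical helper and the 10% default is an arbitrary assumption, not a SynapseX recommendation.

```python
import random


def split_holdout(examples, holdout_fraction=0.1, seed=0):
    """Return (train, holdout) lists using a reproducible shuffle."""
    if not 0 < holdout_fraction < 1:
        raise ValueError("holdout_fraction must be between 0 and 1")
    if len(examples) < 2:
        raise ValueError("need at least two examples to split")
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # local RNG, global state untouched
    n_holdout = max(1, int(len(shuffled) * holdout_fraction))
    return shuffled[n_holdout:], shuffled[:n_holdout]


train, holdout = split_holdout(list(range(100)))
print(len(train), len(holdout))
```

Only the `train` portion would be uploaded for training; the `holdout` portion stays local for comparing adapter versions.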

Pricing

Plan        Training Jobs/Month  LoRA Storage
Free        0                    -
Pro         2                    1 version
Hybrid      10                   5 versions
Enterprise  Unlimited            Unlimited

Training costs:

  • SFT: $0.10 per 1000 training samples
  • DPO: $0.15 per 1000 preference pairs
  • GPU time included in job cost
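
As a worked example of the per-sample rates above, the following sketch estimates the cost of a job from its sample count. It assumes costs scale linearly with the count. Whether billing is prorated or rounded up to the next thousand is not stated here, so treat the result as an estimate. `estimate_cost` is a hypothetical helper.

```python
from decimal import Decimal

# USD per 1000 items, from the list above.
RATES_PER_1000 = {"sft": Decimal("0.10"), "dpo": Decimal("0.15")}


def estimate_cost(method: str, count: int) -> Decimal:
    """Estimated cost in USD, assuming linear (prorated) pricing."""
    if count < 0:
        raise ValueError("count must be non-negative")
    return RATES_PER_1000[method] * Decimal(count) / Decimal(1000)


print(estimate_cost("sft", 1500))  # 1500 SFT samples
print(estimate_cost("dpo", 2000))  # 2000 preference pairs
```

Under these assumptions, 1500 SFT samples come to $0.15 and 2000 preference pairs to $0.30.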

Next Steps