SynapseX Training System
Fine-tune models with your data using LoRA adapters, Supervised Fine-Tuning (SFT), and Direct Preference Optimization (DPO).
Overview
SynapseX training enables:
- Per-Tenant Customization - Each customer gets their own model adaptation
- Feedback-Driven Learning - Automatic training from user corrections
- Lightweight Adapters - LoRA enables efficient fine-tuning
- HPC Infrastructure - LUMI supercomputer for fast training
Training Methods
LoRA (Low-Rank Adaptation)
Parameter-efficient fine-tuning that adds small trainable low-rank matrices alongside the frozen base model weights:
- Memory Efficient - Train with minimal GPU memory
- Fast Training - Complete in hours, not days
- Easy Switching - Load different adapters per tenant
- Preserves Base Knowledge - Only modifies target behaviors
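To see where the efficiency comes from, compare parameter counts: fully fine-tuning a d × k weight matrix trains d·k parameters, while a rank-r LoRA decomposition (B of shape d × r times A of shape r × k) trains only r·(d + k). A quick sketch of that arithmetic; the 4096 × 4096 dimensions are illustrative, not SynapseX internals:

```python
def lora_trainable_params(d: int, k: int, r: int) -> tuple:
    """Return (full fine-tune params, LoRA params) for one d x k weight matrix."""
    full = d * k        # updating every entry of the matrix
    lora = r * (d + k)  # low-rank factors B (d x r) and A (r x k)
    return full, lora

# A 4096 x 4096 projection at r=16: ~128x fewer trainable parameters
full, lora = lora_trainable_params(4096, 4096, 16)
print(f"full: {full:,}  lora: {lora:,}  ratio: {full / lora:.0f}x")
```

At r=16 this is 131,072 trainable parameters instead of 16,777,216, which is why adapters stay small enough to store and swap per tenant.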
SFT (Supervised Fine-Tuning)
Train on input-output pairs from your data:
Input: "What are the store hours?"
Output: "Our store is open Monday-Friday 9am-6pm, Saturday 10am-4pm."
DPO (Direct Preference Optimization)
Train on chosen/rejected pairs to align model behavior with user preferences:
Prompt: "Explain our return policy"
Chosen: "You can return items within 30 days with receipt..." ✓
Rejected: "Returns are complicated and require manager approval..." ✗
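Conceptually, DPO minimizes a loss that grows a log-probability margin for the chosen response over the rejected one, relative to a frozen reference model, scaled by β. A minimal per-pair sketch with hypothetical log-probability values (not SynapseX's actual training code):

```python
import math

def dpo_loss(logp_chosen: float, logp_rejected: float,
             ref_logp_chosen: float, ref_logp_rejected: float,
             beta: float = 0.1) -> float:
    """Per-pair DPO loss: -log sigmoid(beta * (chosen log-ratio - rejected log-ratio))."""
    margin = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# Equal preference (margin 0) gives loss ln 2; a positive margin drives it lower
print(round(dpo_loss(-1.0, -5.0, -2.0, -2.0), 4))  # 0.513
```

β is the same beta parameter exposed by the DPO training endpoint: larger values penalize drifting from the reference model more strongly.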
Architecture
┌─────────────────────────────────────────────────────────────────────────────┐
│ SYNAPSEX TRAINING SYSTEM │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ DATA COLLECTION │ │
│ │ │ │
│ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │ │
│ │ │ Feedback │ │ Datasets │ │Corrections│ │ Ratings │ │ │
│ │ │ API │ │ (DataHub)│ │ │ │ │ │ │
│ │ └────┬─────┘ └────┬─────┘ └─────┬─────┘ └────┬─────┘ │ │
│ │ │ │ │ │ │ │
│ │ └─────────────┴───────────────┴─────────────┘ │ │
│ │ │ │ │
│ │ ▼ │ │
│ │ ┌──────────────────┐ │ │
│ │ │ Training Queue │ │ │
│ │ └────────┬─────────┘ │ │
│ └─────────────────────────────┼───────────────────────────────────────┘ │
│ │ │
│ ┌─────────────────────────────▼───────────────────────────────────────┐ │
│ │ LUMI HPC CLUSTER │ │
│ │ │ │
│ │ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ │ │
│ │ │ MI250X │ │ MI250X │ │ MI250X │ │ MI250X │ │ │
│ │ │ 64GB │ │ 64GB │ │ 64GB │ │ 64GB │ │ │
│ │ └─────────┘ └─────────┘ └─────────┘ └─────────┘ │ │
│ │ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ │ │
│ │ │ MI250X │ │ MI250X │ │ MI250X │ │ MI250X │ │ │
│ │ │ 64GB │ │ 64GB │ │ 64GB │ │ 64GB │ │ │
│ │ └─────────┘ └─────────┘ └─────────┘ └─────────┘ │ │
│ │ │ │
│ │ Total: 512GB VRAM | DeepSpeed ZeRO-3 | ROCm 6.x │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ LoRA ADAPTERS │ │
│ │ │ │
│ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │ │
│ │ │ tenant_001 │ │ tenant_002 │ │ tenant_003 │ │ │
│ │ │ lora_v3.bin │ │ lora_v5.bin │ │ lora_v2.bin │ │ │
│ │ └──────────────┘ └──────────────┘ └──────────────┘ │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Feedback Collection
Collecting User Feedback
Feedback drives automatic training:
```python
import requests

# Submit feedback via API
response = requests.post(
    "https://api.synapsex.ai/v1/tenants/my_tenant/feedback",
    headers={"Authorization": "Bearer sk-xxx"},
    json={
        "conversation_id": "conv_123",
        "message_id": "msg_456",
        "type": "correction",
        "correction": "The correct answer is 500mg, not 250mg",
        "metadata": {
            "user_id": "user_789",
            "category": "dosage"
        }
    }
)
```
Feedback Types
| Type | Description | Training Use |
|---|---|---|
| thumbs_up | Positive signal | Positive example for DPO |
| thumbs_down | Negative signal | Negative example for DPO |
| correction | User-provided fix | Direct training example |
| rating | 1-5 score | Weight for training |
| report | Content issue | Exclusion from training |
Check Training Readiness
```python
response = requests.get(
    "https://api.synapsex.ai/v1/tenants/my_tenant/feedback/stats",
    headers={"Authorization": "Bearer sk-xxx"}
)
stats = response.json()
print(f"Total feedback: {stats['total']}")
print(f"Ready for training: {stats['ready_for_training']}")

# Ready when: corrections >= 50 or (thumbs_up + thumbs_down) >= 100
```
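The same readiness rule can be checked client-side; this helper mirrors the thresholds in the comment above, and the field names are assumed to match the stats payload:

```python
def ready_for_training(stats: dict) -> bool:
    """Ready when corrections >= 50 or (thumbs_up + thumbs_down) >= 100."""
    corrections = stats.get("corrections", 0)
    votes = stats.get("thumbs_up", 0) + stats.get("thumbs_down", 0)
    return corrections >= 50 or votes >= 100

print(ready_for_training({"corrections": 12, "thumbs_up": 60, "thumbs_down": 45}))  # True
```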
Triggering Training
LoRA Training
```python
response = requests.post(
    "https://api.synapsex.ai/v1/tenants/my_tenant/train/lora",
    headers={"Authorization": "Bearer sk-xxx"},
    json={
        "min_feedback_count": 100,
        "lora_r": 16,
        "lora_alpha": 32,
        "epochs": 3,
        "learning_rate": 2e-4,
        "batch_size": 4
    }
)
job = response.json()
print(f"Training job: {job['job_id']}")
print(f"Estimated time: {job['estimated_duration_minutes']} minutes")
```
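Rather than re-running the status request by hand, a small poller can block until the job reaches a terminal state. fetch_status is an injected callable here (an assumption, kept so the sketch stays testable); wrap the job-status GET request in it:

```python
import time

def wait_for_job(fetch_status, poll_seconds: float = 30.0,
                 timeout_seconds: float = 7200.0) -> dict:
    """Poll until the job status is terminal; raise if the timeout elapses."""
    terminal = {"completed", "failed", "cancelled"}
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        status = fetch_status()
        if status["status"] in terminal:
            return status
        time.sleep(poll_seconds)
    raise TimeoutError("training job did not finish before the timeout")

# Usage with a stub fetcher standing in for the real API call:
states = iter([{"status": "queued"}, {"status": "running"}, {"status": "completed"}])
print(wait_for_job(lambda: next(states), poll_seconds=0))
```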
Training Parameters
| Parameter | Default | Description |
|---|---|---|
| min_feedback_count | 50 | Minimum feedback samples required |
| lora_r | 16 | LoRA rank (higher = more capacity) |
| lora_alpha | 32 | LoRA scaling factor |
| epochs | 3 | Training epochs |
| learning_rate | 2e-4 | Learning rate |
| batch_size | 4 | Training batch size |
| warmup_ratio | 0.1 | Warmup ratio |
| weight_decay | 0.01 | L2 regularization |
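Because warmup_ratio is a fraction of total optimizer steps, the absolute warmup length depends on dataset size, batch size, and epochs. A sketch of that arithmetic, assuming one optimizer step per batch (no gradient accumulation):

```python
import math

def warmup_steps(num_samples: int, batch_size: int, epochs: int,
                 warmup_ratio: float = 0.1) -> int:
    """warmup steps = warmup_ratio * total optimizer steps."""
    steps_per_epoch = math.ceil(num_samples / batch_size)
    return int(steps_per_epoch * epochs * warmup_ratio)

print(warmup_steps(num_samples=2000, batch_size=4, epochs=3))  # 150
```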
DPO Training
For preference-based alignment:
```python
response = requests.post(
    "https://api.synapsex.ai/v1/tenants/my_tenant/train/dpo",
    headers={"Authorization": "Bearer sk-xxx"},
    json={
        "min_preference_pairs": 100,
        "beta": 0.1,
        "epochs": 2,
        "learning_rate": 5e-5
    }
)
```
Monitoring Training
Check Job Status
```python
job_id = job["job_id"]  # from the training trigger response

response = requests.get(
    f"https://api.synapsex.ai/v1/tenants/my_tenant/train/{job_id}",
    headers={"Authorization": "Bearer sk-xxx"}
)
status = response.json()
print(f"Status: {status['status']}")
print(f"Progress: {status['progress'] * 100:.1f}%")
print(f"Current epoch: {status['current_epoch']}/{status['total_epochs']}")
print(f"Loss: {status['metrics']['loss']:.4f}")
```
Job Statuses
| Status | Description |
|---|---|
| queued | Waiting for resources |
| preparing | Loading data and model |
| running | Training in progress |
| completed | Successfully finished |
| failed | Training failed |
| cancelled | Cancelled by user |
Training Metrics
```json
{
  "job_id": "train_abc123",
  "status": "running",
  "progress": 0.65,
  "current_epoch": 2,
  "total_epochs": 3,
  "metrics": {
    "loss": 0.234,
    "learning_rate": 1.8e-4,
    "grad_norm": 0.45,
    "samples_processed": 1500,
    "tokens_processed": 450000
  },
  "resources": {
    "gpus": 4,
    "gpu_memory_used_gb": 180,
    "training_speed": "1200 tokens/sec"
  }
}
```
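The progress field lends itself to a rough ETA if you track elapsed wall-clock time yourself; this linear extrapolation is a client-side convenience, not a field the API returns:

```python
def estimate_remaining_seconds(progress: float, elapsed_seconds: float) -> float:
    """Linear ETA: remaining = elapsed * (1 - progress) / progress."""
    if not 0.0 < progress <= 1.0:
        raise ValueError("progress must be in (0, 1]")
    return elapsed_seconds * (1.0 - progress) / progress

# 65% done after 1300 s of training -> roughly 700 s to go
print(round(estimate_remaining_seconds(0.65, 1300)))  # 700
```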
LoRA Management
List LoRA Versions
```python
response = requests.get(
    "https://api.synapsex.ai/v1/tenants/my_tenant/lora",
    headers={"Authorization": "Bearer sk-xxx"}
)
loras = response.json()
for lora in loras['versions']:
    print(f"Version: {lora['version']}")
    print(f"  Created: {lora['created_at']}")
    print(f"  Samples: {lora['training_samples']}")
    print(f"  Active: {lora['is_active']}")
```
Activate LoRA Version
```python
response = requests.post(
    "https://api.synapsex.ai/v1/tenants/my_tenant/lora/v3/activate",
    headers={"Authorization": "Bearer sk-xxx"}
)
```
Rollback to Previous Version
```python
response = requests.post(
    "https://api.synapsex.ai/v1/tenants/my_tenant/lora/v2/activate",
    headers={"Authorization": "Bearer sk-xxx"}
)
```
A/B Testing
Route traffic between LoRA versions:
```python
response = requests.patch(
    "https://api.synapsex.ai/v1/tenants/my_tenant/settings",
    headers={"Authorization": "Bearer sk-xxx"},
    json={
        "lora_ab_test": {
            "enabled": True,
            "versions": {
                "v2": 0.3,  # 30% traffic
                "v3": 0.7   # 70% traffic
            }
        }
    }
)
```
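If you want each user to consistently land on the same version while the split is active, hash the user ID into a [0, 1) bucket. This sticky-routing sketch is a client-side assumption; the platform may already pin users server-side:

```python
import hashlib

def pick_version(user_id: str, weights: dict) -> str:
    """Deterministically map a user to a version in proportion to the weights."""
    digest = hashlib.sha256(user_id.encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # stable value in [0, 1)
    cumulative = 0.0
    for version, weight in sorted(weights.items()):
        cumulative += weight
        if bucket < cumulative:
            return version
    return version  # guard against floating-point rounding at the top end

weights = {"v2": 0.3, "v3": 0.7}
assert pick_version("user_789", weights) == pick_version("user_789", weights)
```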
Data Preparation
Using Datasets (DataHub)
Upload training data via DataHub:
```python
# 1. Register dataset
response = requests.post(
    "https://api.synapsex.ai/v1/tenants/my_tenant/datasets/register",
    headers={"Authorization": "Bearer sk-xxx"},
    json={
        "name": "training_conversations",
        "type": "conversations",
        "data_use": {
            "allow_rag": False,
            "allow_private_training": True
        }
    }
)
dataset_id = response.json()['id']

# 2. Upload conversations
response = requests.post(
    f"https://api.synapsex.ai/v1/tenants/my_tenant/datasets/{dataset_id}/upload/conversations",
    headers={"Authorization": "Bearer sk-xxx"},
    json={
        "conversations": [
            {
                "messages": [
                    {"role": "user", "content": "What's your return policy?"},
                    {"role": "assistant", "content": "Items can be returned within 30 days..."}
                ]
            }
        ]
    }
)
```
Data Format
SFT Format
```json
{
  "conversations": [
    {
      "messages": [
        {"role": "system", "content": "You are a helpful pharmacy assistant."},
        {"role": "user", "content": "What is the dosage for ibuprofen?"},
        {"role": "assistant", "content": "For adults, the typical dosage is 200-400mg every 4-6 hours..."}
      ]
    }
  ]
}
```
DPO Format
```json
{
  "preferences": [
    {
      "prompt": "Explain our shipping policy",
      "chosen": "We offer free shipping on orders over $50. Standard shipping takes 3-5 business days...",
      "rejected": "Check the website for shipping info."
    }
  ]
}
```
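A lightweight structural check before uploading catches malformed records early; this validator is a sketch that enforces only the shapes shown above, not the API's full validation:

```python
def validate_sft(record: dict) -> list:
    """Check an SFT conversation: known roles and non-empty content."""
    errors = []
    messages = record.get("messages", [])
    if not messages:
        errors.append("no messages")
    for i, msg in enumerate(messages):
        if msg.get("role") not in {"system", "user", "assistant"}:
            errors.append(f"message {i}: bad role {msg.get('role')!r}")
        if not msg.get("content", "").strip():
            errors.append(f"message {i}: empty content")
    return errors

def validate_dpo(record: dict) -> list:
    """Check a DPO pair: prompt, chosen, and rejected must all be non-empty."""
    return [f"missing or empty {key!r}" for key in ("prompt", "chosen", "rejected")
            if not record.get(key, "").strip()]

print(validate_dpo({"prompt": "Explain our shipping policy", "chosen": "We offer..."}))
```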
Best Practices
Data Quality
✅ Do:
- Use high-quality, diverse examples
- Include edge cases and corrections
- Balance positive and negative examples for DPO
- Validate data format before upload
❌ Don't:
- Use synthetic/generated data only
- Include PII in training data
- Overtrain on small datasets (causes overfitting)
Training Configuration
| Use Case | LoRA r | Epochs | Learning Rate |
|---|---|---|---|
| Small corrections | 8 | 2 | 3e-4 |
| Domain adaptation | 16 | 3 | 2e-4 |
| Major behavior change | 32 | 5 | 1e-4 |
| Continuous learning | 16 | 1 | 5e-5 |
Monitoring
- Check loss curves for convergence
- Validate on held-out examples
- A/B test before full deployment
- Monitor production metrics after activation
Pricing
| Plan | Training Jobs/Month | LoRA Storage |
|---|---|---|
| Free | 0 | - |
| Pro | 2 | 1 version |
| Hybrid | 10 | 5 versions |
| Enterprise | Unlimited | Unlimited |
Training costs:
- SFT: $0.10 per 1000 training samples
- DPO: $0.15 per 1000 preference pairs
- GPU time included in job cost
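Those per-sample rates give a quick back-of-the-envelope estimate; the figures are copied from the list above, so confirm current pricing before budgeting:

```python
def training_cost_usd(method: str, num_samples: int) -> float:
    """Estimate job cost from the documented per-1000-sample rates."""
    rates = {"sft": 0.10, "dpo": 0.15}  # USD per 1000 samples / preference pairs
    return rates[method] * num_samples / 1000

print(f"${training_cost_usd('sft', 5000):.2f}")  # $0.50
```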
Next Steps
- 📖 API Reference - Training endpoints
- 🔄 Multi-Tenancy - Per-tenant models
- ⚛️ Quantum Reranking - Enhance trained models