LLM Providers

LYDOS operates a multi-provider LLM system with intelligent routing, automatic failover, and per-task model selection. The Q43 Adaptive Routing engine and Q159 Universal LLM Gateway manage all provider interactions transparently — you interact through a single CLI or API surface.

Supported Providers

Configure any subset of providers via environment variables. LYDOS automatically skips providers without API keys.

| Provider | Key Models | Status |
|---------------|----------------------------------|-----------|
| Groq | llama-3.3-70b, qwen3-32b | Primary |
| Anthropic | claude-sonnet-4-6, claude-opus-4 | Backup |
| Z.AI | glm-4.5, glm-5 | Secondary |
| Mistral | codestral-latest, mistral-large | Active |
| OpenAI | gpt-4o, o3-mini | Available |
| Google Gemini | gemini-2.0-flash | Available |
| DeepSeek | deepseek-chat, deepseek-reasoner | Available |
| Qwen | qwen-max, qwen-plus | Available |
| Ollama | Local models | Available |
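The key-based skip behavior can be sketched in plain shell. This is an illustration only, not LYDOS source; `check_provider` is a hypothetical helper, and the variable names match the Environment variables section later on this page.

```shell
# Illustrative sketch: a provider counts as "configured" only when the
# environment variable holding its API key is non-empty, mirroring how
# LYDOS skips keyless providers during routing.
check_provider() {
  # $1 = provider name, $2 = name of the env var holding its API key
  eval "key=\${$2:-}"
  if [ -n "$key" ]; then
    echo "$1: configured"
  else
    echo "$1: skipped (no API key)"
  fi
}

# Example: only GROQ_API_KEY is set in this shell
GROQ_API_KEY=gsk_example
check_provider groq GROQ_API_KEY
check_provider anthropic ANTHROPIC_API_KEY
```

The indirect expansion via `eval` keeps the helper POSIX-sh compatible, so the same check works in bash, dash, or zsh.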

Intelligent Routing

LYDOS routes each request to the best available model based on task complexity, cost budget, and latency requirements. Use --mode to explicitly target a routing profile, or let auto decide.

routing-modes.sh
Bash
# Auto-routes to the best available model for the task
lydos ask "Explain this codebase"
 
# Deep analysis — routes to Claude / OpenAI
lydos ask --mode deep "Analyze this architectural design"
 
# Code generation — routes to Codestral
lydos ask --mode code "Refactor this Python module"
 
# Cost-effective — routes to Groq free tier
lydos ask --mode cheap "Summarize these logs"
 
# Local inference only — never leaves your machine
lydos ask --mode local "Analyze this sensitive code"
 
# Research mode — multi-step DeerFlow pipeline
lydos ask --mode research "Survey recent LLM papers"

Available Routing Modes

| Mode | Routing Behavior |
|----------------------|------------------------------------------|
| --mode auto | Q43 decides based on task complexity |
| --mode fast | Prioritize lowest latency |
| --mode deep | Claude / OpenAI for complex reasoning |
| --mode code | Codestral or DeepSeek for code tasks |
| --mode architect | High-context architecture planning |
| --mode debug | Step-by-step debugging assistance |
| --mode security | Security-aware analysis pipeline |
| --mode multilingual | Z.AI / Qwen for non-English content |
| --mode cheap | Free-tier Groq or Qwen cloud |
| --mode local | Ollama / LM Studio, never leaves machine |
| --mode research | Multi-step DeerFlow research pipeline |

Fallback Graph

When a provider hits a rate limit, auth failure, or timeout, LYDOS automatically promotes the next provider in the chain. All fallbacks are logged and visible via lydos trace.

Automatic Fallback Chains

Deep Reasoning: Claude → OpenAI → Gemini → Z.AI
Code Tasks: Codestral → Claude → DeepSeek → Groq
Fast Chat: Groq → Qwen → Z.AI
Local / Private: Ollama → LM Studio → cloud fallback

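The promotion logic can be sketched in shell. This is an illustration of the walk-the-chain pattern, not LYDOS source; `try_provider` is a stub standing in for a real provider call.

```shell
# Illustrative sketch of fallback promotion: walk a comma-separated
# chain and stop at the first provider whose call succeeds.
try_provider() {
  # Stub: pretend only "openai" is currently healthy.
  [ "$1" = "openai" ]
}

route_with_fallback() {
  chain=$1
  # Split the chain on commas and try each provider in order.
  for provider in $(echo "$chain" | tr ',' ' '); do
    if try_provider "$provider"; then
      echo "routed to $provider"
      return 0
    fi
    echo "fallback: $provider unavailable, promoting next" >&2
  done
  echo "all providers exhausted" >&2
  return 1
}

route_with_fallback "claude,openai,gemini"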
fallback.sh
Bash
# Fallback chains are automatic — no configuration required.
# LYDOS detects rate limits, timeouts, and auth failures,
# then promotes the next provider in the chain.
 
# Override default chain for a single request:
lydos ask --fallback "groq,anthropic,openai" "Your prompt here"
 
# Force a specific provider:
lydos ask --provider mistral/codestral-latest "Refactor this module"

CLI Commands

All LLM interactions are available through the lydos CLI. Task-specific subcommands route to the most appropriate provider automatically.

cli-reference.sh
Bash
# Direct LLM interaction
lydos ask "What is the purpose of this function?"
lydos ask --mode deep "Review this architecture document"
lydos ask --mode code "Add error handling to this module"
 
# Task-specific commands
lydos code "Add unit tests for the auth module"
lydos debug "Why is this async function blocking?"
lydos review path/to/file.py
lydos architect "Design a distributed cache layer"
lydos explain path/to/complex_module.py
lydos research "Best practices for agent memory systems"
 
# Provider management
lydos providers # Health overview for all providers
lydos models # Full model catalog
lydos models recommend "code review" # Get a model recommendation
lydos llm list # Detailed provider + model list
lydos llm use groq/llama-3.3-70b-versatile # Set default provider/model
lydos llm test # Test current provider (round-trip)
lydos llm test --provider anthropic # Test a specific provider
 
# Observability
lydos trace <task-id> # Inspect full execution trace, model used, latency
lydos usage # Token usage and cost summary per provider
| Command | Description |
|-------------------------------|----------------------------------------------|
| lydos ask | Direct LLM interaction |
| lydos code | Code-focused tasks |
| lydos debug | Debugging assistance |
| lydos review | Review a file or path |
| lydos architect | Architecture planning |
| lydos explain | Explain a file or module |
| lydos providers | Health overview for all providers |
| lydos models | Full model catalog |
| lydos models recommend "…" | Get a model recommendation |
| lydos models test | Test model availability |
| lydos llm list | Detailed provider + model list |
| lydos llm use <provider/model> | Set default provider/model |
| lydos llm test | Test current provider (round-trip) |
| lydos config show | Show current configuration |
| lydos config set <key> <val> | Set a configuration value |
| lydos trace <task-id> | Inspect execution trace, model used, latency |
| lydos usage | Token usage and cost summary per provider |

Configuration

Configure LYDOS via a YAML config file and set provider API keys as environment variables. Never commit API keys to version control.

Config file

~/.config/lydos/config.yaml
YAML
# ~/.config/lydos/config.yaml
api_url: http://localhost:8888
default_provider: groq
default_model: llama-3.3-70b-versatile
routing_mode: auto
 
# Optional: per-mode overrides
routing_overrides:
  deep: anthropic/claude-sonnet-4-6
  code: mistral/codestral-latest
  cheap: groq/llama-3.3-70b-versatile
  local: ollama/llama3.2:3b
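For quickly inspecting a config file from scripts, a sed one-liner covers the flat `key: value` lines shown above. This is a sketch, not a LYDOS command; `get_config_key` is a hypothetical helper and only handles top-level scalar keys, not the nested `routing_overrides` map.

```shell
# Sketch: extract a top-level scalar from the config without a YAML
# parser. Good enough for flat `key: value` lines only.
get_config_key() {
  # $1 = key name, $2 = config file path
  sed -n "s/^$1:[[:space:]]*//p" "$2" | head -n 1
}

# Demo against a throwaway copy of the config
cat > /tmp/lydos-config-demo.yaml <<'EOF'
default_provider: groq
default_model: llama-3.3-70b-versatile
EOF

get_config_key default_provider /tmp/lydos-config-demo.yaml
```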

Environment variables

.env
Bash
# .env — place in LYDOS project root (never commit)
 
# Primary (required for most tasks)
GROQ_API_KEY=gsk_...
 
# Deep analysis / architecture
ANTHROPIC_API_KEY=sk-ant-...
 
# General reasoning / fallback
OPENAI_API_KEY=sk-...
 
# Code generation
MISTRAL_API_KEY=...
CODESTRAL_API_KEY=...
 
# Multilingual / long context
ZAI_API_KEY=...
 
# Long context (1M token window)
GOOGLE_API_KEY=...
 
# Cost-effective coding
DEEPSEEK_API_KEY=sk-...
 
# Free cloud tier
DASHSCOPE_API_KEY=... # Qwen via Alibaba Cloud
 
# Local inference (no key required)
OLLAMA_API_BASE=http://localhost:11434
Security: Keep your .env file in .gitignore. LYDOS never logs or transmits raw API keys. Providers without keys are automatically marked as unconfigured and skipped during routing.
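One way to enforce the .gitignore rule locally is a pre-commit guard like the sketch below. `ensure_env_ignored` is a hypothetical helper, not a LYDOS command; it checks for a literal `.env` line, while a stricter variant could shell out to `git check-ignore -q .env`.

```shell
# Sketch: fail fast if .env is not covered by the project's .gitignore.
ensure_env_ignored() {
  # $1 = project root
  if grep -Fqx '.env' "$1/.gitignore" 2>/dev/null; then
    echo ".env is ignored"
  else
    echo "WARNING: add .env to $1/.gitignore before committing" >&2
    return 1
  fi
}

# Usage: ensure_env_ignored /path/to/lydos-project
```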

Local Runtime

Use Ollama for privacy-first local inference. With --mode local, LYDOS guarantees that no data leaves your machine — the request never reaches any cloud provider.

ollama-setup.sh
Bash
# Step 1: Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh
 
# Step 2: Pull a model
ollama pull llama3.2:3b # Fast, 3B params, CPU-friendly
ollama pull codellama:7b # Code-focused, 7B params
ollama pull mistral:7b # General purpose
 
# Step 3: Point LYDOS to your Ollama instance
export OLLAMA_API_BASE=http://localhost:11434
 
# Step 4: Run in local-only mode
lydos ask --mode local "Analyze this code — never leaves this machine"
 
# Step 5: Verify local routing
lydos providers # Should show Ollama as healthy

Recommended Local Models

| Model | Size | Best For | Profile |
|---------------------|--------|------------------------------------------|----------|
| llama3.2:3b | 2.0 GB | Fast, CPU-friendly general tasks | fast |
| llama3.1:8b | 4.7 GB | Balanced quality and speed | balanced |
| codellama:7b | 3.8 GB | Code generation and completion | code |
| mistral:7b | 4.1 GB | General reasoning, instruction following | general |
| deepseek-coder:6.7b | 3.8 GB | Code review and debugging | code |

LM Studio is also supported as a local runtime alternative. Set OLLAMA_API_BASE=http://localhost:1234/v1 to point LYDOS at your LM Studio instance. The --mode local flag uses whichever local runtime is reachable.
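A reachability probe for the two runtimes could look like the sketch below. It is an illustration, not LYDOS source: the ports are the defaults mentioned above, `/api/tags` is Ollama's model-list endpoint, and `/v1/models` is LM Studio's OpenAI-compatible listing.

```shell
# Sketch: report the first local runtime that answers on its default port.
detect_local_runtime() {
  if curl -fsS --max-time 2 "http://localhost:11434/api/tags" >/dev/null 2>&1; then
    echo "ollama"
  elif curl -fsS --max-time 2 "http://localhost:1234/v1/models" >/dev/null 2>&1; then
    echo "lmstudio"
  else
    echo "none"
  fi
}

detect_local_runtime
```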

API Reference

All LLM operations are available via the LYDOS REST API. The /api/llm namespace is powered by the Q159 Universal LLM Gateway.

| Method | Endpoint |
|--------|----------------------|
| POST | /api/llm/chat |
| POST | /api/llm/stream |
| GET | /api/llm/providers |
| GET | /api/llm/models |
| POST | /api/llm/test |
| GET | /api/q43/route |
| GET | /api/q159/llm/health |
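A chat request might be assembled as in the sketch below. The JSON field names (`model`, `messages`, `mode`) are assumptions about the request schema, not documented here, and the URL defaults to the `api_url` value from the Configuration section.

```shell
# Sketch: build a chat request body and (optionally) send it to the
# gateway. Field names in the payload are assumed, not authoritative.
API_URL=${LYDOS_API_URL:-http://localhost:8888}

payload=$(cat <<'JSON'
{
  "model": "groq/llama-3.3-70b-versatile",
  "mode": "auto",
  "messages": [
    {"role": "user", "content": "Explain this codebase"}
  ]
}
JSON
)

echo "$payload"

# Uncomment to send against a running LYDOS instance:
# curl -sS -X POST "$API_URL/api/llm/chat" \
#   -H "Content-Type: application/json" \
#   -d "$payload"
```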