# LLM Providers
LYDOS operates a multi-provider LLM system with intelligent routing, automatic failover, and per-task model selection. The Q43 Adaptive Routing engine and Q159 Universal LLM Gateway manage all provider interactions transparently — you interact through a single CLI or API surface.
## Supported Providers
Configure any subset of providers via environment variables. LYDOS automatically skips providers without API keys.
| Provider | Key Models | Best For | Cost | Status |
|---|---|---|---|---|
| Groq | llama-3.3-70b, qwen3-32b | Fast inference, dev chat | Free tier | Primary |
| Anthropic | claude-sonnet-4-6, claude-opus-4 | Deep analysis, architecture | Premium | Backup |
| Z.AI | glm-4.5, glm-5 | Multilingual, long context | Low | Secondary |
| Mistral | codestral-latest, mistral-large | Code generation | Medium | Active |
| OpenAI | gpt-4o, o3-mini | General reasoning | High | Available |
| Google Gemini | gemini-2.0-flash | Long context (1M tokens) | Medium | Available |
| DeepSeek | deepseek-chat, deepseek-reasoner | Cost-effective coding | Low | Available |
| Qwen | qwen-max, qwen-plus | Multilingual assistant | Free cloud | Available |
| Ollama | Local models | Privacy, offline inference | Free (local) | Available |
## Intelligent Routing
LYDOS routes each request to the best available model based on task complexity, cost budget, and latency requirements. Use `--mode` to explicitly target a routing profile, or let `auto` decide.
```bash
# Auto-routes to the best available model for the task
lydos ask "Explain this codebase"

# Deep analysis — routes to Claude / OpenAI
lydos ask --mode deep "Analyze this architectural design"

# Code generation — routes to Codestral
lydos ask --mode code "Refactor this Python module"

# Cost-effective — routes to Groq free tier
lydos ask --mode cheap "Summarize these logs"

# Local inference only — never leaves your machine
lydos ask --mode local "Analyze this sensitive code"

# Research mode — multi-step DeerFlow pipeline
lydos ask --mode research "Survey recent LLM papers"
```
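The mode-to-model mapping can be pictured as a simple lookup. The sketch below is an illustrative toy, not the Q43 engine's actual logic; the provider/model strings mirror the `routing_overrides` example in the Configuration section.

```python
# Toy sketch of per-mode route resolution (NOT Q43 internals).
# Mappings mirror the routing_overrides example in the Configuration section.
ROUTING_TABLE = {
    "deep": "anthropic/claude-sonnet-4-6",    # deep analysis, architecture
    "code": "mistral/codestral-latest",       # code generation
    "cheap": "groq/llama-3.3-70b-versatile",  # cost-effective
    "local": "ollama/llama3.2:3b",            # never leaves the machine
}

def resolve_route(mode: str, default: str = "groq/llama-3.3-70b-versatile") -> str:
    """Return provider/model for a routing mode; unknown modes (e.g. auto) fall back."""
    return ROUTING_TABLE.get(mode, default)
```

In the real system the `auto` path weighs complexity, cost, and latency rather than falling straight through to a default.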
## Fallback Graph
When a provider hits a rate limit, auth failure, or timeout, LYDOS automatically promotes the next provider in the chain. All fallbacks are logged and visible via `lydos trace`.
```bash
# Fallback chains are automatic — no configuration required.
# LYDOS detects rate limits, timeouts, and auth failures,
# then promotes the next provider in the chain.

# Override default chain for a single request:
lydos ask --fallback "groq,anthropic,openai" "Your prompt here"

# Force a specific provider:
lydos ask --provider mistral/codestral-latest "Refactor this module"
```
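The promotion behavior amounts to trying each provider in chain order and moving on when one fails. A minimal sketch, assuming a hypothetical `call(provider, prompt)` transport function; LYDOS's real classification of rate limits versus auth failures is internal:

```python
def ask_with_fallback(prompt, chain, call):
    """Try providers in chain order; promote the next one on any failure.
    `call(provider, prompt)` is a hypothetical transport function."""
    failures = {}
    for provider in chain:
        try:
            return provider, call(provider, prompt)
        except Exception as exc:  # rate limit, timeout, auth failure, ...
            failures[provider] = str(exc)
    raise RuntimeError(f"all providers failed: {failures}")
```

Recording per-provider failures, as above, is what makes a later `lydos trace` able to show why each hop was skipped.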
## CLI Commands
All LLM interactions are available through the `lydos` CLI. Task-specific subcommands route to the most appropriate provider automatically.
```bash
# Direct LLM interaction
lydos ask "What is the purpose of this function?"
lydos ask --mode deep "Review this architecture document"
lydos ask --mode code "Add error handling to this module"

# Task-specific commands
lydos code "Add unit tests for the auth module"
lydos debug "Why is this async function blocking?"
lydos review path/to/file.py
lydos architect "Design a distributed cache layer"
lydos explain path/to/complex_module.py
lydos research "Best practices for agent memory systems"

# Provider management
lydos providers                               # Health overview for all providers
lydos models                                  # Full model catalog
lydos models recommend "code review"          # Get a model recommendation
lydos llm list                                # Detailed provider + model list
lydos llm use groq/llama-3.3-70b-versatile    # Set default provider/model
lydos llm test                                # Test current provider (round-trip)
lydos llm test --provider anthropic           # Test a specific provider

# Observability
lydos trace <task-id>                         # Inspect full execution trace, model used, latency
lydos usage                                   # Token usage and cost summary per provider
```
| Command | Description | Routing Mode |
|---|---|---|
| `lydos ask` | Direct LLM interaction | `auto` |
| `lydos code` | Code generation and modification | `code` |
| `lydos debug` | Debugging assistance | `debug` |
| `lydos review` | Code review with structured output | `deep` |
| `lydos architect` | Architecture planning | `architect` |
| `lydos explain` | Code explanation and documentation | `auto` |
| `lydos providers` | View provider health | — |
| `lydos models` | Browse model catalog | — |
| `lydos models recommend "…"` | Get model recommendation for a task | — |
| `lydos models test` | Smoke-test all configured providers | — |
| `lydos llm list` | List all providers and their status | — |
| `lydos llm use <provider/model>` | Set default provider and model | — |
| `lydos llm test` | Test current provider (round-trip) | — |
| `lydos config show` | Display merged configuration (env + file) | — |
| `lydos config set <key> <val>` | Persist a config setting to disk | — |
| `lydos trace <task-id>` | Inspect full execution trace | — |
| `lydos usage` | View token usage and cost per provider | — |
## Configuration
Configure LYDOS via a YAML config file and set provider API keys as environment variables. Never commit API keys to version control.
### Config file
```yaml
# ~/.config/lydos/config.yaml
api_url: http://localhost:8888
default_provider: groq
default_model: llama-3.3-70b-versatile
routing_mode: auto

# Optional: per-mode overrides
routing_overrides:
  deep: anthropic/claude-sonnet-4-6
  code: mistral/codestral-latest
  cheap: groq/llama-3.3-70b-versatile
  local: ollama/llama3.2:3b
```
### Environment variables
```bash
# .env — place in LYDOS project root (never commit)

# Primary (required for most tasks)
GROQ_API_KEY=gsk_...

# Deep analysis / architecture
ANTHROPIC_API_KEY=sk-ant-...

# General reasoning / fallback
OPENAI_API_KEY=sk-...

# Code generation
MISTRAL_API_KEY=...
CODESTRAL_API_KEY=...

# Multilingual / long context
ZAI_API_KEY=...

# Long context (1M token window)
GOOGLE_API_KEY=...

# Cost-effective coding
DEEPSEEK_API_KEY=sk-...

# Free cloud tier
DASHSCOPE_API_KEY=...  # Qwen via Alibaba Cloud

# Local inference (no key required)
OLLAMA_API_BASE=http://localhost:11434
```
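Providers whose key variable is unset are marked unconfigured and skipped during routing. A minimal sketch of that check — the env-var names come from the list above, while the helper itself is illustrative, not LYDOS code:

```python
import os

# Env-var names per the list above; the helper itself is illustrative.
PROVIDER_KEYS = {
    "groq": "GROQ_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
    "mistral": "MISTRAL_API_KEY",
    "deepseek": "DEEPSEEK_API_KEY",
}

def configured_providers(env=None):
    """Return providers whose API-key variable is set and non-empty."""
    env = os.environ if env is None else env
    return [p for p, var in PROVIDER_KEYS.items() if env.get(var)]
```

Note that an empty string counts as unconfigured, so a stray `GROQ_API_KEY=` line in `.env` would still exclude Groq from routing.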
Keep the `.env` file in `.gitignore`. LYDOS never logs or transmits raw API keys, and providers without keys are automatically marked as unconfigured and skipped during routing.
## Local Runtime
Use Ollama for privacy-first local inference. With `--mode local`, LYDOS guarantees that no data leaves your machine — the request never reaches any cloud provider.
```bash
# Step 1: Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh

# Step 2: Pull a model
ollama pull llama3.2:3b    # Fast, 3B params, CPU-friendly
ollama pull codellama:7b   # Code-focused, 7B params
ollama pull mistral:7b     # General purpose

# Step 3: Point LYDOS to your Ollama instance
export OLLAMA_API_BASE=http://localhost:11434

# Step 4: Run in local-only mode
lydos ask --mode local "Analyze this code — never leaves this machine"

# Step 5: Verify local routing
lydos providers            # Should show Ollama as healthy
```
| Model | Size | Best For | Profile |
|---|---|---|---|
| `llama3.2:3b` | 2.0 GB | Fast, CPU-friendly general tasks | fast |
| `llama3.1:8b` | 4.7 GB | Balanced quality and speed | balanced |
| `codellama:7b` | 3.8 GB | Code generation and completion | code |
| `mistral:7b` | 4.1 GB | General reasoning, instruction following | general |
| `deepseek-coder:6.7b` | 3.8 GB | Code review and debugging | code |

LM Studio is also supported as a local runtime alternative. Set `OLLAMA_API_BASE=http://localhost:1234/v1` to point LYDOS at your LM Studio instance. The `--mode local` flag uses whichever local runtime is reachable.
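The two local runtimes expose different model-listing endpoints: Ollama serves `GET /api/tags`, while an OpenAI-compatible server such as LM Studio serves `GET /v1/models`. A small helper that derives the right endpoint from `OLLAMA_API_BASE` — the helper is illustrative, but both endpoints are real:

```python
def models_endpoint(base_url: str) -> str:
    """Model-listing endpoint for a local runtime.
    An OpenAI-compatible base (e.g. LM Studio's http://localhost:1234/v1)
    uses /models; a plain Ollama base uses /api/tags."""
    base = base_url.rstrip("/")
    if base.endswith("/v1"):
        return f"{base}/models"
    return f"{base}/api/tags"
```

Fetching that URL (e.g. with `curl`) is a quick way to confirm which runtime LYDOS will reach before running `lydos providers`.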
## API Reference
All LLM operations are available via the LYDOS REST API. The `/api/llm` namespace is powered by the Q159 Universal LLM Gateway.
| Method | Endpoint | Description |
|---|---|---|
| `POST` | `/api/llm/chat` | Send a message, auto-route provider |
| `POST` | `/api/llm/stream` | Streaming chat response (SSE) |
| `GET` | `/api/llm/providers` | List all providers with health status |
| `GET` | `/api/llm/models` | Full model catalog |
| `POST` | `/api/llm/test` | Run a connectivity test for a provider |
| `GET` | `/api/q43/route` | Q43 routing decision for a prompt |
| `GET` | `/api/q159/llm/health` | Q159 gateway health across all backends |
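As a sketch, a chat request could be issued with Python's standard library. The endpoint and the `api_url` default come from this page; the request-body fields (`message`, `mode`) are assumptions, since the table above documents endpoints but not payload shapes:

```python
import json
import urllib.request

API_URL = "http://localhost:8888"  # api_url from the config example above

def chat_request(message: str, mode: str = "auto") -> urllib.request.Request:
    """Build a POST /api/llm/chat request.
    Body fields `message` and `mode` are assumed, not documented."""
    body = json.dumps({"message": message, "mode": mode}).encode()
    return urllib.request.Request(
        f"{API_URL}/api/llm/chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# To send (requires a running LYDOS API):
# with urllib.request.urlopen(chat_request("Explain this codebase")) as resp:
#     print(json.load(resp))
```

For `/api/llm/stream`, the same request shape would apply, with the response consumed as server-sent events instead of a single JSON body.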