# OpenAI Setup
AxonFlow supports OpenAI's GPT models for LLM routing and orchestration. OpenAI is available in the Community edition.
## Prerequisites
- OpenAI account
- API key from OpenAI Platform
## Quick Start

### 1. Get API Key
- Go to OpenAI Platform
- Click "Create new secret key"
- Copy the generated key
### 2. Configure AxonFlow

```bash
# Required
export OPENAI_API_KEY=sk-your-api-key-here

# Optional: Specify model (default: gpt-4o)
export OPENAI_MODEL=gpt-4o-mini
```
### 3. Start AxonFlow

```bash
docker compose up -d
```
## Supported Models

| Model | Context Window | Best For |
|---|---|---|
| `gpt-4o` | 128K tokens | Latest flagship model (default) |
| `gpt-4o-mini` | 128K tokens | Cost-effective, fast |
| `gpt-4-turbo` | 128K tokens | Previous flagship |
| `gpt-3.5-turbo` | 16K tokens | Budget-friendly |
| `o1-preview` | 128K tokens | Advanced reasoning |
| `o1-mini` | 128K tokens | Fast reasoning |
## Configuration Options

### Environment Variables

| Variable | Required | Default | Description |
|---|---|---|---|
| `OPENAI_API_KEY` | Yes | - | OpenAI API key |
| `OPENAI_MODEL` | No | `gpt-4o` | Default model |
| `OPENAI_ENDPOINT` | No | `https://api.openai.com` | API endpoint |
| `OPENAI_TIMEOUT_SECONDS` | No | `120` | Request timeout (seconds) |
Always set `OPENAI_MODEL` explicitly if your API key has limited model access. AxonFlow uses the configured model for all requests to this provider and falls back to `gpt-4o` when the variable is unset.
### YAML Configuration
For more control, use YAML configuration:
```yaml
# axonflow.yaml
llm_providers:
  openai:
    enabled: true
    config:
      model: gpt-4o
      max_tokens: 4096
      timeout: 120s
    credentials:
      api_key: ${OPENAI_API_KEY}
    priority: 10
    weight: 0.5
```
## Capabilities
The OpenAI provider supports:
- Chat completions - Conversational AI
- Streaming responses - Real-time token streaming
- Function calling - Tool use and structured output (see the sketch after this list)
- Vision - Image understanding (GPT-4o, GPT-4-turbo)
- JSON mode - Structured output
- Code generation - Programming assistance
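
In gateway mode (described below), your application calls OpenAI directly, so any feature of the official SDK is available. As an illustration, here is a minimal function-calling sketch using the OpenAI Python SDK; the `get_weather` tool is a made-up example, not part of AxonFlow:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Describe a tool the model may call; "get_weather" is hypothetical
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

completion = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
)

# The model returns a structured tool call instead of free text
print(completion.choices[0].message.tool_calls)
```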
## Usage Examples

### Proxy Mode (Python SDK)
Proxy mode routes requests through AxonFlow for simple integration:
```python
import asyncio

from axonflow import AxonFlow

async def main() -> None:
    async with AxonFlow(agent_url="http://localhost:8080") as client:
        response = await client.execute_query(
            user_token="user-123",
            query="Explain machine learning",
            request_type="chat",
            context={"provider": "openai", "model": "gpt-4o"},
        )
        print(response.content)

asyncio.run(main())
```
### Proxy Mode (cURL)
```bash
curl -X POST http://localhost:8080/api/request \
  -H "Content-Type: application/json" \
  -H "X-User-Token: user-123" \
  -d '{
    "query": "What is machine learning?",
    "provider": "openai",
    "model": "gpt-4o",
    "max_tokens": 500
  }'
```
### Gateway Mode (TypeScript SDK)
Gateway mode gives you full control over the LLM call while AxonFlow handles policy enforcement and audit logging:
```typescript
import { AxonFlow } from '@axonflow/sdk';
import OpenAI from 'openai';

const axonflow = new AxonFlow({
  endpoint: 'http://localhost:8080',
  apiKey: 'your-axonflow-key'
});

// 1. Pre-check: Get policy approval
const ctx = await axonflow.getPolicyApprovedContext({
  userToken: 'user-123',
  query: 'Explain machine learning'
});

if (!ctx.approved) {
  throw new Error(`Blocked: ${ctx.blockReason}`);
}

// 2. Call OpenAI directly, timing the request for the audit record
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const start = Date.now();
const completion = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: ctx.approvedData.query }]
});
const latencyMs = Date.now() - start;
const response = completion.choices[0].message.content ?? '';

// 3. Audit the call (usage can be undefined, so guard the token counts)
await axonflow.auditLLMCall({
  contextId: ctx.contextId,
  responseSummary: response.substring(0, 100),
  provider: 'openai',
  model: 'gpt-4o',
  tokenUsage: {
    promptTokens: completion.usage?.prompt_tokens ?? 0,
    completionTokens: completion.usage?.completion_tokens ?? 0,
    totalTokens: completion.usage?.total_tokens ?? 0
  },
  latencyMs
});
```
## Pricing
OpenAI pricing (as of December 2025):
| Model | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|
| GPT-4o | $2.50 | $10.00 |
| GPT-4o-mini | $0.15 | $0.60 |
| GPT-4-turbo | $10.00 | $30.00 |
| GPT-3.5-turbo | $0.50 | $1.50 |
| o1-preview | $15.00 | $60.00 |
| o1-mini | $3.00 | $12.00 |
AxonFlow provides cost estimation via the `/api/cost/estimate` endpoint.
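
A call to that endpoint might look like the sketch below. The request fields (`provider`, `model`, `input_tokens`, `output_tokens`) are assumptions for illustration, not a documented schema, so check the API reference for the actual shape:

```python
import httpx

# Hypothetical payload: these field names are assumptions, not a documented schema
estimate = httpx.post(
    "http://localhost:8080/api/cost/estimate",
    json={
        "provider": "openai",
        "model": "gpt-4o",
        "input_tokens": 1200,
        "output_tokens": 400,
    },
).json()
print(estimate)
```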
## Multi-Provider Routing
Configure OpenAI alongside other providers for intelligent routing:
```yaml
llm_providers:
  openai:
    enabled: true
    config:
      model: gpt-4o
    credentials:
      api_key: ${OPENAI_API_KEY}
    priority: 100
  anthropic:
    enabled: true
    config:
      model: claude-3-5-sonnet-20241022
    credentials:
      api_key: ${ANTHROPIC_API_KEY}
    priority: 50
routing:
  strategy: priority
  fallback_enabled: true
```
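
With this configuration, proxy-mode requests go to OpenAI first (priority 100) and fail over to Anthropic. You can also pin a single request to a specific provider via `context`, using the same SDK call as the proxy example above (assuming `context` overrides routing, as that example suggests):

```python
import asyncio

from axonflow import AxonFlow

async def main() -> None:
    async with AxonFlow(agent_url="http://localhost:8080") as client:
        # Pin this request to Anthropic even though OpenAI has higher priority
        response = await client.execute_query(
            user_token="user-123",
            query="Summarize this contract",
            request_type="chat",
            context={"provider": "anthropic", "model": "claude-3-5-sonnet-20241022"},
        )
        print(response.content)

asyncio.run(main())
```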
## Health Checks
Check the OpenAI provider health status:
```bash
# Check specific provider health
curl http://localhost:8081/api/v1/llm-providers/openai/health
```
Response:
```json
{
  "provider": "openai",
  "status": "healthy",
  "latency_ms": 45,
  "model": "gpt-4o"
}
```
To check all configured providers at once:
```bash
curl http://localhost:8081/api/v1/llm-providers/status
```
Response:
```json
{
  "providers": {
    "openai": {
      "status": "healthy",
      "latency_ms": 45,
      "model": "gpt-4o"
    },
    "anthropic": {
      "status": "healthy",
      "latency_ms": 52,
      "model": "claude-3-5-sonnet-20241022"
    }
  }
}
```
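
The status endpoint can also drive a simple monitoring check. A sketch using `httpx` (any HTTP client works), parsing the response shape shown above:

```python
import httpx

# Flag any provider that is not reporting healthy
status = httpx.get("http://localhost:8081/api/v1/llm-providers/status").json()
for name, info in status["providers"].items():
    if info["status"] != "healthy":
        print(f"ALERT: provider {name} is {info['status']} "
              f"(latency: {info.get('latency_ms', 'n/a')} ms)")
```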
## Error Handling
Common error codes from OpenAI:
| Status | Reason | Action |
|---|---|---|
| 401 | Invalid API key | Verify OPENAI_API_KEY |
| 429 | Rate limit exceeded | Implement backoff/retry |
| 500 | Server error | Retry with exponential backoff |
| 503 | Service unavailable | Retry or failover to another provider |
AxonFlow automatically handles retries for transient errors (429, 500, 503).
## Best Practices
- Use appropriate models - GPT-4o for quality, GPT-4o-mini for cost
- Set reasonable timeouts - 120s default is good for most use cases
- Enable fallback providers - Configure Anthropic/Gemini as backup
- Monitor costs - Use AxonFlow's cost dashboard to track usage
- Handle rate limits - Implement client-side retry logic for high-volume apps (see the sketch below)
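
For proxy-mode calls, a minimal client-side backoff sketch (using `httpx`; the endpoint and headers follow the cURL example above):

```python
import random
import time

import httpx

TRANSIENT = {429, 500, 503}

def post_with_backoff(url: str, payload: dict, max_retries: int = 5) -> httpx.Response:
    """POST with exponential backoff and jitter on transient statuses."""
    for attempt in range(max_retries):
        resp = httpx.post(url, json=payload, headers={"X-User-Token": "user-123"})
        if resp.status_code not in TRANSIENT:
            return resp
        time.sleep(min(2 ** attempt + random.random(), 30.0))
    resp.raise_for_status()  # out of retries: surface the last error
    return resp
```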
## Troubleshooting

### "Invalid API key"
- Verify the key at OpenAI Platform
- Ensure the key hasn't been revoked
- Check for leading/trailing whitespace
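
To confirm the key itself works, independent of AxonFlow, a minimal check with the official OpenAI Python SDK:

```python
from openai import OpenAI

# Raises an AuthenticationError (401) if the key is invalid or revoked
client = OpenAI()  # reads OPENAI_API_KEY from the environment
print(client.models.list().data[0].id)
```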
"Model not found"
- Verify model name (e.g.,
gpt-4o, notgpt-4-o) - Check if your account has access to the model
- Some models require specific API access
"Rate limit exceeded"
- Check usage at OpenAI Usage
- Consider upgrading your plan
- Implement exponential backoff
## Next Steps
- LLM Providers Overview - All supported providers
- Anthropic Setup - Alternative provider
- Google Gemini Setup - Multimodal capabilities
- Custom Provider SDK - Build custom providers