Anthropic Claude Setup

AxonFlow supports Anthropic's Claude models for LLM routing and orchestration. Claude is available in the Community edition and is known for its safety-focused design and long context capabilities.

Prerequisites

  • An Anthropic account with an API key
  • A running AxonFlow deployment (Claude is available in the Community edition)
  • Docker with Docker Compose, for the quick start below

Quick Start

1. Get API Key

  1. Go to the Anthropic Console (https://console.anthropic.com)
  2. Click "Create Key"
  3. Copy the generated key

2. Configure AxonFlow

# Required
export ANTHROPIC_API_KEY=sk-ant-your-api-key-here

# Optional: Specify model (default: claude-3-5-sonnet-20241022)
export ANTHROPIC_MODEL=claude-3-5-haiku-20241022

3. Start AxonFlow

docker compose up -d

Supported Models

| Model | Context Window | Best For |
|---|---|---|
| claude-sonnet-4-20250514 | 200K tokens | Latest, balanced quality/speed |
| claude-opus-4-20250514 | 200K tokens | Highest capability |
| claude-3-5-sonnet-20241022 | 200K tokens | Fast, high quality (default) |
| claude-3-5-haiku-20241022 | 200K tokens | Fastest, cost-effective |
| claude-3-opus-20240229 | 200K tokens | Complex reasoning |
| claude-3-sonnet-20240229 | 200K tokens | Balanced (legacy) |
| claude-3-haiku-20240307 | 200K tokens | Fast responses (legacy) |

Configuration Options

Environment Variables

| Variable | Required | Default | Description |
|---|---|---|---|
| ANTHROPIC_API_KEY | Yes | - | Anthropic API key |
| ANTHROPIC_MODEL | No | claude-3-5-sonnet-20241022 | Default model |
| ANTHROPIC_ENDPOINT | No | https://api.anthropic.com | API endpoint |
| ANTHROPIC_TIMEOUT_SECONDS | No | 120 | Request timeout (seconds) |

Model Configuration

Always set ANTHROPIC_MODEL explicitly if your API key has limited model access (e.g., only claude-3-haiku-20240307). AxonFlow uses the configured model for all requests to this provider.

YAML Configuration

For more control, use YAML configuration:

# axonflow.yaml
llm_providers:
  anthropic:
    enabled: true
    config:
      model: claude-3-5-sonnet-20241022
      max_tokens: 8192
      timeout: 120s
    credentials:
      api_key: ${ANTHROPIC_API_KEY}
    priority: 10
    weight: 0.5
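
The priority and weight fields are consumed by multi-provider routing; see Multi-Provider Routing below for how they are used alongside other providers.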

Capabilities

The Anthropic provider supports:

  • Chat completions - Conversational AI
  • Streaming responses - Real-time token streaming
  • Long context - Up to 200K tokens
  • Vision - Image understanding
  • Tool use - Function calling (see the sketch after this list)
  • Code generation - Programming assistance
  • Constitutional AI - Built-in safety alignment
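
For tool use in gateway mode you call the Anthropic Messages API directly. The TypeScript sketch below shows the API's tools parameter; the get_weather tool name and schema are illustrative, not part of AxonFlow:

import Anthropic from '@anthropic-ai/sdk';

const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

// Describe a tool the model may call. The name and schema here are
// illustrative; define whatever functions your application exposes.
const message = await anthropic.messages.create({
  model: 'claude-3-5-sonnet-20241022',
  max_tokens: 1024,
  tools: [{
    name: 'get_weather',
    description: 'Get the current weather for a city',
    input_schema: {
      type: 'object',
      properties: { city: { type: 'string' } },
      required: ['city']
    }
  }],
  messages: [{ role: 'user', content: 'What is the weather in Paris?' }]
});

// When the model decides to call the tool, it returns a tool_use block
// with the arguments; your code runs the function and sends the result back.
for (const block of message.content) {
  if (block.type === 'tool_use') {
    console.log(block.name, block.input); // e.g. get_weather { city: 'Paris' }
  }
}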

Usage Examples

Proxy Mode (Python SDK)

Proxy mode routes requests through AxonFlow for simple integration:

from axonflow import AxonFlow

async with AxonFlow(agent_url="http://localhost:8080") as client:
    response = await client.execute_query(
        user_token="user-123",
        query="Explain quantum computing",
        request_type="chat",
        context={"provider": "anthropic", "model": "claude-3-5-sonnet-20241022"}
    )
    print(response.content)

Proxy Mode (cURL)

curl -X POST http://localhost:8080/api/request \
  -H "Content-Type: application/json" \
  -H "X-User-Token: user-123" \
  -d '{
    "query": "What is quantum computing?",
    "provider": "anthropic",
    "model": "claude-3-5-sonnet-20241022",
    "max_tokens": 500
  }'

Gateway Mode (TypeScript SDK)

Gateway mode gives you full control over the LLM call while AxonFlow handles policy enforcement and audit logging:

import { AxonFlow } from '@axonflow/sdk';
import Anthropic from '@anthropic-ai/sdk';

const axonflow = new AxonFlow({
  endpoint: 'http://localhost:8080',
  apiKey: 'your-axonflow-key'
});

// 1. Pre-check: Get policy approval
const ctx = await axonflow.getPolicyApprovedContext({
  userToken: 'user-123',
  query: 'Explain quantum computing'
});

if (!ctx.approved) {
  throw new Error(`Blocked: ${ctx.blockReason}`);
}

// 2. Call Anthropic directly, timing the request for the audit record
const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });
const start = Date.now();
const message = await anthropic.messages.create({
  model: 'claude-3-5-sonnet-20241022',
  max_tokens: 1024,
  messages: [{ role: 'user', content: ctx.approvedData.query }]
});
// Content blocks are a union type; only text blocks carry .text
const firstBlock = message.content[0];
const response = firstBlock.type === 'text' ? firstBlock.text : '';

// 3. Audit the call
await axonflow.auditLLMCall({
  contextId: ctx.contextId,
  responseSummary: response.substring(0, 100),
  provider: 'anthropic',
  model: 'claude-3-5-sonnet-20241022',
  tokenUsage: {
    promptTokens: message.usage.input_tokens,
    completionTokens: message.usage.output_tokens,
    totalTokens: message.usage.input_tokens + message.usage.output_tokens
  },
  latencyMs: Date.now() - start
});
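
Note the privacy property of this flow: the prompt and full completion never transit AxonFlow. Only the policy check and a truncated responseSummary plus token counts reach the audit log.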

Streaming

Anthropic supports server-sent events (SSE) for streaming responses:

import Anthropic from '@anthropic-ai/sdk';

const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

const stream = await anthropic.messages.stream({
  model: 'claude-3-5-sonnet-20241022',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Write a story' }]
});

for await (const chunk of stream) {
  // Deltas can carry text or tool-call JSON; only text deltas have .text
  if (chunk.type === 'content_block_delta' && chunk.delta.type === 'text_delta') {
    process.stdout.write(chunk.delta.text);
  }
}
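
The messages.stream() helper also buffers the complete response, so after the loop you can read the final message, which is convenient for recording token usage:

// Resolves once the stream ends with the fully assembled message
const finalMessage = await stream.finalMessage();
console.log(finalMessage.usage.input_tokens, finalMessage.usage.output_tokens);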

Pricing

Anthropic pricing (as of December 2025):

| Model | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|
| Claude Sonnet 4 | $3.00 | $15.00 |
| Claude Opus 4 | $15.00 | $75.00 |
| Claude 3.5 Sonnet | $3.00 | $15.00 |
| Claude 3.5 Haiku | $0.80 | $4.00 |
| Claude 3 Opus | $15.00 | $75.00 |

AxonFlow provides cost estimation via the /api/cost/estimate endpoint.
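
A minimal sketch of calling that endpoint from TypeScript. The port and the request/response field names used here (input_tokens, output_tokens, estimated_cost_usd) are assumptions about the payload shape; check your deployment's API reference:

// Field names below are illustrative assumptions, not a documented schema
const res = await fetch('http://localhost:8080/api/cost/estimate', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    provider: 'anthropic',
    model: 'claude-3-5-sonnet-20241022',
    input_tokens: 2000,
    output_tokens: 500
  })
});
console.log(await res.json()); // e.g. { estimated_cost_usd: ... }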

Multi-Provider Routing

Configure Anthropic alongside other providers for intelligent routing:

llm_providers:
  anthropic:
    enabled: true
    config:
      model: claude-3-5-sonnet-20241022
    credentials:
      api_key: ${ANTHROPIC_API_KEY}
    priority: 100

  openai:
    enabled: true
    config:
      model: gpt-4o
    credentials:
      api_key: ${OPENAI_API_KEY}
    priority: 50

routing:
  strategy: priority
  fallback_enabled: true
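
With strategy: priority, requests go to the higher-priority provider first (Anthropic at 100 here), and because fallback_enabled is true, AxonFlow fails over to OpenAI if Anthropic becomes unavailable.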

Health Checks

Check the Anthropic provider health status:

# Check specific provider health
curl http://localhost:8081/api/v1/llm-providers/anthropic/health

Response:

{
  "provider": "anthropic",
  "status": "healthy",
  "latency_ms": 52,
  "model": "claude-3-5-sonnet-20241022"
}

To check all configured providers at once:

curl http://localhost:8081/api/v1/llm-providers/status

Response:

{
  "providers": {
    "anthropic": {
      "status": "healthy",
      "latency_ms": 52,
      "model": "claude-3-5-sonnet-20241022"
    },
    "openai": {
      "status": "healthy",
      "latency_ms": 45,
      "model": "gpt-4o"
    }
  }
}

Error Handling

Common error codes from Anthropic:

| Status | Reason | Action |
|---|---|---|
| 400 | Invalid request | Check request format |
| 401 | Invalid API key | Verify ANTHROPIC_API_KEY |
| 429 | Rate limit exceeded | Implement backoff/retry |
| 500 | Server error | Retry with exponential backoff |
| 529 | API overloaded | Retry with backoff |

AxonFlow automatically handles retries for transient errors (429, 500, 529).
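
If you call Anthropic directly in gateway mode, retries are your responsibility. A minimal exponential-backoff sketch in TypeScript, with illustrative retry counts and delays:

import Anthropic from '@anthropic-ai/sdk';

const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

// Retry transient failures (429, 500, 529) with exponential backoff + jitter
async function createWithRetry(
  params: Anthropic.MessageCreateParamsNonStreaming,
  maxRetries = 3
) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await anthropic.messages.create(params);
    } catch (err) {
      const status = err instanceof Anthropic.APIError ? err.status : undefined;
      const transient = status === 429 || status === 500 || status === 529;
      if (!transient || attempt >= maxRetries) throw err;
      const delayMs = 2 ** attempt * 1000 + Math.random() * 250;
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}

The Anthropic SDK also retries some transient errors automatically (see its maxRetries client option), so an explicit loop like this is only needed when you want a custom retry policy.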

Best Practices

  1. Use appropriate models - Sonnet for most tasks, Haiku for speed, Opus for complex reasoning
  2. Set reasonable timeouts - 120s default is good for most use cases
  3. Enable fallback providers - Configure OpenAI/Gemini as backup
  4. Monitor costs - Use AxonFlow's cost dashboard to track usage
  5. Leverage long context - Claude handles up to 200K tokens well

Troubleshooting

"Invalid API key"

  • Verify the key at Anthropic Console
  • Ensure the key hasn't been disabled
  • Check for leading/trailing whitespace

"Model not found"

  • Verify model name format (e.g., claude-3-5-sonnet-20241022)
  • Check if model is available in your region
  • Note: Older model versions may be deprecated

"Rate limit exceeded"

  • Check usage at Anthropic Console
  • Request a rate limit increase
  • Implement exponential backoff

Next Steps