Proxy Mode - Zero-Code AI Governance
Proxy Mode is the simplest way to add governance to your AI applications. Send your queries to AxonFlow and it handles everything: policy enforcement, LLM routing, PII detection, and audit logging.
How It Works
Instead of calling an LLM provider directly, your application sends each request to AxonFlow. AxonFlow checks the request against your policies, routes it to a configured LLM provider, filters the response, and writes an audit record before returning the result.
Key Benefits:
- You don't manage LLM API keys - AxonFlow routes to configured providers
- Automatic audit trail for every request
- Response filtering for PII before it reaches your app
- One API call for policy check + LLM execution + audit
Quick Start
TypeScript
import { AxonFlow } from '@axonflow/sdk'; // v1.4.0+
const axonflow = new AxonFlow({
endpoint: process.env.AXONFLOW_AGENT_URL,
licenseKey: process.env.AXONFLOW_LICENSE_KEY,
});
// Single call handles: policy check → LLM routing → audit
const response = await axonflow.executeQuery({
userToken: 'user-123',
query: 'What are the benefits of AI governance?',
requestType: 'llm_chat',
context: {
provider: 'openai',
model: 'gpt-4',
},
});
if (response.blocked) {
console.log('Blocked:', response.blockReason);
} else if (response.success) {
console.log('Response:', response.data);
}
Python
import asyncio
import os

from axonflow import AxonFlow  # v0.3.0+


async def main() -> None:
    async with AxonFlow(
        agent_url=os.environ["AXONFLOW_AGENT_URL"],
        client_id=os.environ["AXONFLOW_CLIENT_ID"],
        client_secret=os.environ["AXONFLOW_CLIENT_SECRET"],
    ) as client:
        response = await client.execute_query(
            user_token="user-123",
            query="What are the benefits of AI governance?",
            request_type="llm_chat",
            context={
                "provider": "openai",
                "model": "gpt-4",
            },
        )

        if response.blocked:
            print(f"Blocked: {response.block_reason}")
        else:
            print(f"Response: {response.data}")


asyncio.run(main())
Go
import "github.com/getaxonflow/axonflow-sdk-go" // v1.5.0+
client := axonflow.NewClient(axonflow.AxonFlowConfig{
AgentURL: os.Getenv("AXONFLOW_AGENT_URL"),
LicenseKey: os.Getenv("AXONFLOW_LICENSE_KEY"),
})
response, err := client.ExecuteQuery(
"user-123", // userToken
"What are the benefits of AI governance?", // query
"llm_chat", // requestType
map[string]interface{}{ // context
"provider": "openai",
"model": "gpt-4",
},
)
if err != nil {
log.Fatal(err)
}
if response.Blocked {
fmt.Printf("Blocked: %s\n", response.BlockReason)
} else {
fmt.Printf("Response: %v\n", response.Data)
}
Java
import com.getaxonflow.sdk.AxonFlow; // v1.1.0+
import com.getaxonflow.sdk.ExecuteQueryRequest;
import com.getaxonflow.sdk.ExecuteQueryResponse;
import java.util.Map;
AxonFlow client = AxonFlow.builder()
.agentUrl(System.getenv("AXONFLOW_AGENT_URL"))
.licenseKey(System.getenv("AXONFLOW_LICENSE_KEY"))
.build();
ExecuteQueryResponse response = client.executeQuery(
ExecuteQueryRequest.builder()
.userToken("user-123")
.query("What are the benefits of AI governance?")
.requestType("llm_chat")
.context(Map.of(
"provider", "openai",
"model", "gpt-4"
))
.build()
);
if (response.isBlocked()) {
System.out.println("Blocked: " + response.getBlockReason());
} else {
System.out.println("Response: " + response.getData());
}
When to Use Proxy Mode
Best For
- Greenfield projects - Starting fresh with AI governance
- Simple integrations - Single API call for everything
- Response filtering - Automatic PII detection and redaction
- 100% audit coverage - Every call automatically logged
- Multi-provider routing - AxonFlow handles LLM provider selection
Example Use Cases
| Scenario | Why Proxy Mode |
|---|---|
| Customer support chatbot | Simple, automatic audit trail |
| Internal Q&A assistant | Zero-code governance |
| Document summarization | Response filtering for PII |
| Code generation | Block prompt injection attacks |
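For example, the customer support chatbot in the first row can route every user message through a single executeQuery call. The sketch below assumes an Express-style HTTP handler and illustrative request fields (userId, message); only the AxonFlow client usage comes from the SDK examples above.
import express from 'express';
import { AxonFlow } from '@axonflow/sdk';

const app = express();
app.use(express.json());

const axonflow = new AxonFlow({
  endpoint: process.env.AXONFLOW_AGENT_URL,
  licenseKey: process.env.AXONFLOW_LICENSE_KEY,
});

// Each chat message is proxied through AxonFlow: policy check, LLM call,
// and audit logging all happen inside executeQuery.
app.post('/chat', async (req, res) => {
  const response = await axonflow.executeQuery({
    userToken: req.body.userId, // ties the audit record to the end user (illustrative field)
    query: req.body.message,    // the user's chat message (illustrative field)
    requestType: 'llm_chat',
    context: { provider: 'openai', model: 'gpt-4' },
  });

  if (response.blocked) {
    // Policy violation: show a friendly message rather than the raw block reason
    res.status(400).json({ error: 'This request was blocked by policy.' });
  } else if (response.success) {
    res.json({ reply: response.data });
  } else {
    res.status(502).json({ error: 'The AI request failed. Please try again.' });
  }
});

app.listen(3000);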
Response Handling
All SDKs return a consistent response structure, shown here as the TypeScript interface; other SDKs follow their language's naming conventions (for example, block_reason in Python and BlockReason in Go):
interface ExecuteQueryResponse {
success: boolean; // True if LLM call succeeded
blocked: boolean; // True if blocked by policy
blockReason?: string; // Why it was blocked
data?: string; // LLM response content
policyInfo?: {
policiesEvaluated: string[]; // Policies that were checked
contextId: string; // Audit context ID
};
tokenUsage?: {
promptTokens: number;
completionTokens: number;
totalTokens: number;
};
}
Example: Handling Different Cases
const response = await axonflow.executeQuery({
userToken: 'user-123',
query: userInput,
requestType: 'llm_chat',
context: { provider: 'openai', model: 'gpt-4' },
});
if (response.blocked) {
// Policy violation - show user-friendly message
console.log('Request blocked:', response.blockReason);
// Log which policies triggered
if (response.policyInfo?.policiesEvaluated) {
console.log('Policies:', response.policyInfo.policiesEvaluated.join(', '));
}
} else if (response.success) {
// Success - display the response
console.log('AI Response:', response.data);
// Track token usage for billing
if (response.tokenUsage) {
console.log(`Tokens used: ${response.tokenUsage.totalTokens}`);
}
} else {
// Error (network, LLM provider issue, etc.)
console.error('Request failed');
}
LLM Provider Configuration
In Proxy Mode, AxonFlow routes requests to your configured LLM providers. Specify the provider in the context:
// OpenAI
const response = await axonflow.executeQuery({
query: 'Hello!',
context: { provider: 'openai', model: 'gpt-4' },
});
// Anthropic
const response = await axonflow.executeQuery({
query: 'Hello!',
context: { provider: 'anthropic', model: 'claude-3-sonnet' },
});
// Google Gemini
const response = await axonflow.executeQuery({
query: 'Hello!',
context: { provider: 'gemini', model: 'gemini-2.0-flash' },
});
// Ollama (self-hosted)
const response = await axonflow.executeQuery({
query: 'Hello!',
context: { provider: 'ollama', model: 'llama2' },
});
// AWS Bedrock
const response = await axonflow.executeQuery({
query: 'Hello!',
context: { provider: 'bedrock', model: 'anthropic.claude-3-sonnet' },
});
Features
1. Automatic Policy Enforcement
All requests are checked against your policies before reaching the LLM:
// PII is blocked before reaching OpenAI
const response = await axonflow.executeQuery({
userToken: 'user-123',
query: 'Process payment for SSN 123-45-6789',
requestType: 'llm_chat',
context: { provider: 'openai', model: 'gpt-4' },
});
// response.blocked = true
// response.blockReason = "PII detected: US Social Security Number"
2. Automatic Audit Logging
Every request is logged for compliance - no additional code needed:
const response = await axonflow.executeQuery({
userToken: 'user-123', // User identifier for audit
query: 'Analyze this data...',
requestType: 'llm_chat',
context: { provider: 'openai', model: 'gpt-4' },
});
// Audit automatically includes:
// - Timestamp
// - User token
// - Request content (sanitized)
// - Response summary
// - Policies evaluated
// - Token usage
// - Latency
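AxonFlow writes the audit record itself, but the contextId returned in policyInfo (see the response structure above) can be threaded into your own application logs to correlate the two systems. A minimal sketch; the structured console logging shown here is illustrative, not part of the SDK:
const response = await axonflow.executeQuery({
  userToken: 'user-123',
  query: 'Analyze this data...',
  requestType: 'llm_chat',
  context: { provider: 'openai', model: 'gpt-4' },
});

// Emit an application-side log entry that references AxonFlow's audit context ID,
// so a compliance reviewer can jump from your logs to the AxonFlow audit trail.
console.log(JSON.stringify({
  event: 'llm_request',
  user: 'user-123',
  axonflowContextId: response.policyInfo?.contextId,
  blocked: response.blocked,
  totalTokens: response.tokenUsage?.totalTokens,
}));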
3. Response Filtering (Enterprise)
PII in LLM responses can be automatically redacted:
const response = await axonflow.executeQuery({
query: "What is John Smith's email?",
context: { provider: 'openai', model: 'gpt-4' },
});
// response.data = "The customer's email is [EMAIL REDACTED]"
4. Multi-Provider Routing
Route to different providers based on request type or load:
// Fast queries to Gemini
const fastResponse = await axonflow.executeQuery({
query: 'Quick question',
context: { provider: 'gemini', model: 'gemini-2.0-flash' },
});
// Complex reasoning to Claude
const complexResponse = await axonflow.executeQuery({
query: 'Analyze this complex document...',
context: { provider: 'anthropic', model: 'claude-3-opus' },
});
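How you decide between providers is up to your application. As a sketch, a hypothetical pickProvider helper could encode the fast-vs-complex split above using prompt length (the classification rule is illustrative only; any signal you have, such as request type or current load, works the same way):
type LlmTarget = { provider: string; model: string };

// Illustrative rule: short prompts go to a fast model, long prompts to a
// stronger reasoning model. Replace with whatever signal fits your workload.
function pickProvider(query: string): LlmTarget {
  if (query.length < 200) {
    return { provider: 'gemini', model: 'gemini-2.0-flash' };
  }
  return { provider: 'anthropic', model: 'claude-3-opus' };
}

const response = await axonflow.executeQuery({
  userToken: 'user-123',
  query: userInput,
  requestType: 'llm_chat',
  context: pickProvider(userInput), // AxonFlow routes to whichever provider was chosen
});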
Latency Considerations
Proxy Mode adds latency because requests go through AxonFlow:
| Deployment | Additional Latency |
|---|---|
| SaaS endpoint | ~50-100ms |
| In-VPC deployment | ~10-20ms |
For latency-sensitive applications, consider Gateway Mode.
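To verify the overhead for your own deployment, a simple wall-clock measurement around executeQuery is enough. This is a sketch using plain Date.now() timing, nothing AxonFlow-specific; compare the result against a direct call to your LLM provider to estimate the proxy overhead:
const start = Date.now();

const response = await axonflow.executeQuery({
  userToken: 'user-123',
  query: 'Quick latency probe',
  requestType: 'llm_chat',
  context: { provider: 'openai', model: 'gpt-4' },
});

// Total round trip = AxonFlow overhead + LLM provider latency
console.log(`Proxy Mode round trip: ${Date.now() - start} ms`);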
Comparison with Gateway Mode
| Feature | Proxy Mode | Gateway Mode |
|---|---|---|
| Integration Effort | Minimal | Moderate |
| Code Changes | Single executeQuery() call | Pre-check + LLM call + Audit |
| Latency Overhead | Higher (~50-100ms) | Lower (~10-20ms) |
| Response Filtering | Yes (automatic) | No |
| Audit Coverage | 100% automatic | Manual |
| LLM API Keys | Managed by AxonFlow | Managed by you |
| LLM Control | AxonFlow routes | You call directly |
| Best For | Simple apps, full governance | Frameworks, lowest latency |
See Choosing a Mode for detailed guidance.
Error Handling
try {
const response = await axonflow.executeQuery({
userToken: 'user-123',
query: 'Hello!',
requestType: 'llm_chat',
context: { provider: 'openai', model: 'gpt-4' },
});
if (response.blocked) {
// Policy violation - handle gracefully
showUserMessage('Your request was blocked: ' + response.blockReason);
} else if (response.success) {
// Display the response
displayResponse(response.data);
}
} catch (error) {
// Network/SDK errors
if (error.code === 'ECONNREFUSED') {
console.error('Cannot reach AxonFlow - check your endpoint');
} else if (error.code === 'TIMEOUT') {
console.error('Request timed out');
} else {
console.error('Unexpected error:', error.message);
}
}
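For transient failures such as timeouts, you may want a small retry wrapper around executeQuery. The helper below is an illustrative sketch, not part of the SDK; it only retries thrown errors, never policy blocks (those come back as a normal response with blocked set), and it assumes duplicate LLM calls are acceptable for your use case:
// Retry transient network errors with exponential backoff.
async function executeWithRetry(
  request: Parameters<typeof axonflow.executeQuery>[0],
  maxAttempts = 3,
) {
  for (let attempt = 1; ; attempt++) {
    try {
      return await axonflow.executeQuery(request);
    } catch (error: any) {
      const retryable = error?.code === 'ECONNREFUSED' || error?.code === 'TIMEOUT';
      if (!retryable || attempt >= maxAttempts) throw error;
      // Back off before the next attempt: 1s, 2s, 4s, ...
      await new Promise((resolve) => setTimeout(resolve, 500 * 2 ** attempt));
    }
  }
}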
Configuration
TypeScript
const axonflow = new AxonFlow({
endpoint: 'https://your-axonflow.example.com',
licenseKey: process.env.AXONFLOW_LICENSE_KEY,
tenant: 'your-tenant-id',
debug: false,
timeout: 30000,
});
Python
client = AxonFlow(
agent_url="https://your-axonflow.example.com",
client_id=os.environ["AXONFLOW_CLIENT_ID"],
client_secret=os.environ["AXONFLOW_CLIENT_SECRET"],
tenant="your-tenant-id",
timeout=30.0,
)
Go
client := axonflow.NewClient(axonflow.AxonFlowConfig{
AgentURL: "https://your-axonflow.example.com",
LicenseKey: os.Getenv("AXONFLOW_LICENSE_KEY"),
Tenant: "your-tenant-id",
Timeout: 30 * time.Second,
})
Java
AxonFlow client = AxonFlow.builder()
.agentUrl("https://your-axonflow.example.com")
.licenseKey(System.getenv("AXONFLOW_LICENSE_KEY"))
.tenant("your-tenant-id")
.timeout(Duration.ofSeconds(30))
.build();
Next Steps
- Gateway Mode - For lowest latency with your own LLM calls
- Choosing a Mode - Decision guide
- LLM Interceptors - Wrapper functions for LLM clients (Python, Go, Java)
- TypeScript SDK - Full TypeScript documentation
- Python SDK - Full Python documentation
- Go SDK - Full Go documentation
- Java SDK - Full Java documentation