Proxy Mode - Zero-Code AI Governance
Proxy Mode is the simplest way to add governance to your AI applications. Send your queries to AxonFlow and it handles everything: policy enforcement, LLM routing, PII detection, and audit logging.
How It Works
Instead of calling an LLM provider directly, your application sends each request to AxonFlow. AxonFlow checks the request against your policies, routes it to a configured LLM provider, filters the response, and writes an audit record before returning the result.
Key Benefits:
- You don't manage LLM API keys - AxonFlow routes to configured providers
- Automatic audit trail for every request
- Response filtering for PII before it reaches your app
- One API call for policy check + LLM execution + audit
Quick Start
TypeScript
import { AxonFlow } from '@axonflow/sdk'; // v1.4.0+
const axonflow = new AxonFlow({
endpoint: process.env.AXONFLOW_AGENT_URL,
licenseKey: process.env.AXONFLOW_LICENSE_KEY,
});
// Single call handles: policy check → LLM routing → audit
const response = await axonflow.executeQuery({
userToken: 'user-123',
query: 'What are the benefits of AI governance?',
requestType: 'llm_chat',
context: {
provider: 'openai',
model: 'gpt-4',
},
});
if (response.blocked) {
console.log('Blocked:', response.blockReason);
} else if (response.success) {
console.log('Response:', response.data);
}
Python
import asyncio
import os

from axonflow import AxonFlow  # v0.3.0+


async def main() -> None:
    async with AxonFlow(
        agent_url=os.environ["AXONFLOW_AGENT_URL"],
        client_id=os.environ["AXONFLOW_CLIENT_ID"],
        client_secret=os.environ["AXONFLOW_CLIENT_SECRET"],
    ) as client:
        response = await client.execute_query(
            user_token="user-123",
            query="What are the benefits of AI governance?",
            request_type="llm_chat",
            context={
                "provider": "openai",
                "model": "gpt-4",
            },
        )

        if response.blocked:
            print(f"Blocked: {response.block_reason}")
        else:
            print(f"Response: {response.data}")


asyncio.run(main())
Go
import "github.com/getaxonflow/axonflow-sdk-go" // v1.5.0+
client := axonflow.NewClient(axonflow.AxonFlowConfig{
AgentURL: os.Getenv("AXONFLOW_AGENT_URL"),
LicenseKey: os.Getenv("AXONFLOW_LICENSE_KEY"),
})
response, err := client.ExecuteQuery(
"user-123", // userToken
"What are the benefits of AI governance?", // query
"llm_chat", // requestType
map[string]interface{}{ // context
"provider": "openai",
"model": "gpt-4",
},
)
if err != nil {
log.Fatal(err)
}
if response.Blocked {
fmt.Printf("Blocked: %s\n", response.BlockReason)
} else {
fmt.Printf("Response: %v\n", response.Data)
}
Java
import com.getaxonflow.sdk.AxonFlow; // v1.1.0+
import com.getaxonflow.sdk.ExecuteQueryRequest;
import com.getaxonflow.sdk.ExecuteQueryResponse;
import java.util.Map;
AxonFlow client = AxonFlow.builder()
.agentUrl(System.getenv("AXONFLOW_AGENT_URL"))
.licenseKey(System.getenv("AXONFLOW_LICENSE_KEY"))
.build();
ExecuteQueryResponse response = client.executeQuery(
ExecuteQueryRequest.builder()
.userToken("user-123")
.query("What are the benefits of AI governance?")
.requestType("llm_chat")
.context(Map.of(
"provider", "openai",
"model", "gpt-4"
))
.build()
);
if (response.isBlocked()) {
System.out.println("Blocked: " + response.getBlockReason());
} else {
System.out.println("Response: " + response.getData());
}
When to Use Proxy Mode
Best For
- Greenfield projects - Starting fresh with AI governance
- Simple integrations - Single API call for everything
- Response filtering - Automatic PII detection and redaction
- 100% audit coverage - Every call automatically logged
- Multi-provider routing - AxonFlow handles LLM provider selection
Example Use Cases
| Scenario | Why Proxy Mode |
|---|---|
| Customer support chatbot | Simple, automatic audit trail |
| Internal Q&A assistant | Zero-code governance |
| Document summarization | Response filtering for PII |
| Code generation | Block prompt injection attacks |
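For example, the customer support chatbot in the first row can route every user message through a single executeQuery call. The sketch below assumes an Express-style HTTP handler and illustrative request fields (userId, message); only the AxonFlow client usage comes from the SDK examples above.
import express from 'express';
import { AxonFlow } from '@axonflow/sdk';

const app = express();
app.use(express.json());

const axonflow = new AxonFlow({
  endpoint: process.env.AXONFLOW_AGENT_URL,
  licenseKey: process.env.AXONFLOW_LICENSE_KEY,
});

// Each chat message is proxied through AxonFlow: policy check, LLM call,
// and audit logging all happen inside executeQuery.
app.post('/chat', async (req, res) => {
  const response = await axonflow.executeQuery({
    userToken: req.body.userId, // ties the audit record to the end user (illustrative field)
    query: req.body.message,    // the user's chat message (illustrative field)
    requestType: 'llm_chat',
    context: { provider: 'openai', model: 'gpt-4' },
  });

  if (response.blocked) {
    // Policy violation: show a friendly message rather than the raw block reason
    res.status(400).json({ error: 'This request was blocked by policy.' });
  } else if (response.success) {
    res.json({ reply: response.data });
  } else {
    res.status(502).json({ error: 'The AI request failed. Please try again.' });
  }
});

app.listen(3000);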
Response Handling
All SDKs return a consistent response structure, shown here as the TypeScript interface; other SDKs follow their language's naming conventions (for example, block_reason in Python and BlockReason in Go):
interface ExecuteQueryResponse {
success: boolean; // True if LLM call succeeded
blocked: boolean; // True if blocked by policy
blockReason?: string; // Why it was blocked
data?: string; // LLM response content
policyInfo?: {
policiesEvaluated: string[]; // Policies that were checked
contextId: string; // Audit context ID
};
tokenUsage?: {
promptTokens: number;
completionTokens: number;
totalTokens: number;
};
}
Example: Handling Different Cases
const response = await axonflow.executeQuery({
userToken: 'user-123',
query: userInput,
requestType: 'llm_chat',
context: { provider: 'openai', model: 'gpt-4' },
});
if (response.blocked) {
// Policy violation - show user-friendly message
console.log('Request blocked:', response.blockReason);
// Log which policies triggered
if (response.policyInfo?.policiesEvaluated) {
console.log('Policies:', response.policyInfo.policiesEvaluated.join(', '));
}
} else if (response.success) {
// Success - display the response
console.log('AI Response:', response.data);
// Track token usage for billing
if (response.tokenUsage) {
console.log(`Tokens used: ${response.tokenUsage.totalTokens}`);
}
} else {
// Error (network, LLM provider issue, etc.)
console.error('Request failed');
}
LLM Provider Configuration
In Proxy Mode, AxonFlow routes requests to your configured LLM providers. Specify the provider in the context:
// OpenAI
const response = await axonflow.executeQuery({
query: 'Hello!',
context: { provider: 'openai', model: 'gpt-4' },
});
// Anthropic
const response = await axonflow.executeQuery({
query: 'Hello!',
context: { provider: 'anthropic', model: 'claude-3-sonnet' },
});
// Google Gemini
const response = await axonflow.executeQuery({
query: 'Hello!',
context: { provider: 'gemini', model: 'gemini-2.0-flash' },
});
// Ollama (self-hosted)
const response = await axonflow.executeQuery({
query: 'Hello!',
context: { provider: 'ollama', model: 'llama2' },
});
// AWS Bedrock
const response = await axonflow.executeQuery({
query: 'Hello!',
context: { provider: 'bedrock', model: 'anthropic.claude-3-sonnet' },
});
Features
1. Automatic Policy Enforcement
All requests are checked against your policies before reaching the LLM:
// PII is blocked before reaching OpenAI
const response = await axonflow.executeQuery({
userToken: 'user-123',
query: 'Process payment for SSN 123-45-6789',
requestType: 'llm_chat',
context: { provider: 'openai', model: 'gpt-4' },
});
// response.blocked = true
// response.blockReason = "PII detected: US Social Security Number"
2. Automatic Audit Logging
Every request is logged for compliance - no additional code needed:
const response = await axonflow.executeQuery({
userToken: 'user-123', // User identifier for audit
query: 'Analyze this data...',
requestType: 'llm_chat',
context: { provider: 'openai', model: 'gpt-4' },
});
// Audit automatically includes:
// - Timestamp
// - User token
// - Request content (sanitized)
// - Response summary
// - Policies evaluated
// - Token usage
// - Latency
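AxonFlow writes the audit record itself, but the contextId returned in policyInfo (see the response structure above) can be threaded into your own application logs to correlate the two systems. A minimal sketch; the structured console logging shown here is illustrative, not part of the SDK:
const response = await axonflow.executeQuery({
  userToken: 'user-123',
  query: 'Analyze this data...',
  requestType: 'llm_chat',
  context: { provider: 'openai', model: 'gpt-4' },
});

// Emit an application-side log entry that references AxonFlow's audit context ID,
// so a compliance reviewer can jump from your logs to the AxonFlow audit trail.
console.log(JSON.stringify({
  event: 'llm_request',
  user: 'user-123',
  axonflowContextId: response.policyInfo?.contextId,
  blocked: response.blocked,
  totalTokens: response.tokenUsage?.totalTokens,
}));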
3. Response Filtering (Enterprise)
PII in LLM responses can be automatically redacted:
const response = await axonflow.executeQuery({
query: "What is John Smith's email?",
context: { provider: 'openai', model: 'gpt-4' },
});
// response.data = "The customer's email is [EMAIL REDACTED]"
4. Multi-Provider Routing
Route to different providers based on request type or load:
// Fast queries to Gemini
const fastResponse = await axonflow.executeQuery({
query: 'Quick question',
context: { provider: 'gemini', model: 'gemini-2.0-flash' },
});
// Complex reasoning to Claude
const complexResponse = await axonflow.executeQuery({
query: 'Analyze this complex document...',
context: { provider: 'anthropic', model: 'claude-3-opus' },
});
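How you decide between providers is up to your application. As a sketch, a hypothetical pickProvider helper could encode the fast-vs-complex split above using prompt length (the classification rule is illustrative only; any signal you have, such as request type or current load, works the same way):
type LlmTarget = { provider: string; model: string };

// Illustrative rule: short prompts go to a fast model, long prompts to a
// stronger reasoning model. Replace with whatever signal fits your workload.
function pickProvider(query: string): LlmTarget {
  if (query.length < 200) {
    return { provider: 'gemini', model: 'gemini-2.0-flash' };
  }
  return { provider: 'anthropic', model: 'claude-3-opus' };
}

const response = await axonflow.executeQuery({
  userToken: 'user-123',
  query: userInput,
  requestType: 'llm_chat',
  context: pickProvider(userInput), // AxonFlow routes to whichever provider was chosen
});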
Latency Considerations
Proxy Mode adds latency because requests go through AxonFlow:
| Deployment | Additional Latency |
|---|---|
| SaaS endpoint | ~50-100ms |
| In-VPC deployment | ~10-20ms |
For latency-sensitive applications, consider Gateway Mode.
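To verify the overhead for your own deployment, a simple wall-clock measurement around executeQuery is enough. This is a sketch using plain Date.now() timing, nothing AxonFlow-specific; compare the result against a direct call to your LLM provider to estimate the proxy overhead:
const start = Date.now();

const response = await axonflow.executeQuery({
  userToken: 'user-123',
  query: 'Quick latency probe',
  requestType: 'llm_chat',
  context: { provider: 'openai', model: 'gpt-4' },
});

// Total round trip = AxonFlow overhead + LLM provider latency
console.log(`Proxy Mode round trip: ${Date.now() - start} ms`);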
Comparison with Gateway Mode
| Feature | Proxy Mode | Gateway Mode |
|---|---|---|
| Integration Effort | Minimal | Moderate |
| Code Changes | Single executeQuery() call | Pre-check + LLM call + Audit |
| Latency Overhead | Higher (~50-100ms) | Lower (~10-20ms) |
| Response Filtering | Yes (automatic) | No |
| Audit Coverage | 100% automatic | Manual |
| LLM API Keys | Managed by AxonFlow | Managed by you |
| LLM Control | AxonFlow routes | You call directly |
| Best For | Simple apps, full governance | Frameworks, lowest latency |
See Choosing a Mode for detailed guidance.
Error Handling
try {
const response = await axonflow.executeQuery({
userToken: 'user-123',
query: 'Hello!',
requestType: 'llm_chat',
context: { provider: 'openai', model: 'gpt-4' },
});
if (response.blocked) {
// Policy violation - handle gracefully
showUserMessage('Your request was blocked: ' + response.blockReason);
} else if (response.success) {
// Display the response
displayResponse(response.data);
}
} catch (error) {
// Network/SDK errors
if (error.code === 'ECONNREFUSED') {
console.error('Cannot reach AxonFlow - check your endpoint');
} else if (error.code === 'TIMEOUT') {
console.error('Request timed out');
} else {
console.error('Unexpected error:', error.message);
}
}
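For transient failures such as timeouts, you may want a small retry wrapper around executeQuery. The helper below is an illustrative sketch, not part of the SDK; it only retries thrown errors, never policy blocks (those come back as a normal response with blocked set), and it assumes duplicate LLM calls are acceptable for your use case:
// Retry transient network errors with exponential backoff.
async function executeWithRetry(
  request: Parameters<typeof axonflow.executeQuery>[0],
  maxAttempts = 3,
) {
  for (let attempt = 1; ; attempt++) {
    try {
      return await axonflow.executeQuery(request);
    } catch (error: any) {
      const retryable = error?.code === 'ECONNREFUSED' || error?.code === 'TIMEOUT';
      if (!retryable || attempt >= maxAttempts) throw error;
      // Back off before the next attempt: 1s, 2s, 4s, ...
      await new Promise((resolve) => setTimeout(resolve, 500 * 2 ** attempt));
    }
  }
}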
Configuration
TypeScript
const axonflow = new AxonFlow({
endpoint: 'https://your-axonflow.example.com',
licenseKey: process.env.AXONFLOW_LICENSE_KEY,
tenant: 'your-tenant-id',
debug: false,
timeout: 30000,
});
Python
client = AxonFlow(
agent_url="https://your-axonflow.example.com",
client_id=os.environ["AXONFLOW_CLIENT_ID"],
client_secret=os.environ["AXONFLOW_CLIENT_SECRET"],
tenant="your-tenant-id",
timeout=30.0,
)
Go
client := axonflow.NewClient(axonflow.AxonFlowConfig{
AgentURL: "https://your-axonflow.example.com",
LicenseKey: os.Getenv("AXONFLOW_LICENSE_KEY"),
Tenant: "your-tenant-id",
Timeout: 30 * time.Second,
})
Java
AxonFlow client = AxonFlow.builder()
.agentUrl("https://your-axonflow.example.com")
.licenseKey(System.getenv("AXONFLOW_LICENSE_KEY"))
.tenant("your-tenant-id")
.timeout(Duration.ofSeconds(30))
.build();
Next Steps
- Gateway Mode - For lowest latency with your own LLM calls
- Choosing a Mode - Decision guide
- LLM Interceptors - Wrapper functions for LLM clients (Python, Go, Java)
- TypeScript SDK - Full TypeScript documentation
- Python SDK - Full Python documentation
- Go SDK - Full Go documentation
- Java SDK - Full Java documentation