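Response speed depends on: 1) the quality of your current network route; 2) the response time of the selected model; 3) upstream API load.
Recommendation: prefer lightweight models such as Claude 3.5 Haiku or GPT-4o Mini. If speed remains abnormal, contact your admin to check the IP pool status.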
5-Minute Setup Guide
AI-Bridge provides a unified API compatible with the OpenAI format. Connect Claude Code, Codex, Cursor, and more through native API channels, with no code changes required.
Sign up and choose a plan
Go to Sign Up to create an account, then purchase a token plan from Products. Tokens never expire.
Create an API key
After login, go to API Keys and click "Create New Key". Copy the generated ab-... key.
Configure your client
Point your client's Base URL to AI-Bridge and use your ab-... key.
That's it! Direct native channels. No VPN needed. No overseas credit card. Instantly connect to Claude Code, Codex, and Cursor workflows.
Client Configuration
Claude Code / Claude Desktop
Set the following environment variables in your terminal or .env file:
# Claude client config
ANTHROPIC_BASE_URL=https://your-server.com:3001
ANTHROPIC_API_KEY=ab-xxxxxxxxxxxxxxxx
Note: Replace your-server.com with the actual server address and ab-xxxxxxxxxxxxxxxx with your AI-Bridge key.
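A script that wraps the client can check these two variables before it starts. The following is a minimal sketch, not part of AI-Bridge: the helper name `load_bridge_config` is hypothetical, and the `ab-` prefix check assumes every AI-Bridge key follows the ab-... pattern described above.

```python
import os


def load_bridge_config(env=None):
    """Read ANTHROPIC_BASE_URL and ANTHROPIC_API_KEY and sanity-check them.

    Hypothetical helper for illustration; AI-Bridge does not ship it.
    """
    env = os.environ if env is None else env
    base_url = env.get("ANTHROPIC_BASE_URL", "").rstrip("/")
    api_key = env.get("ANTHROPIC_API_KEY", "")

    if not base_url.startswith(("http://", "https://")):
        raise ValueError("ANTHROPIC_BASE_URL must be an http(s) URL")
    if not api_key.startswith("ab-"):
        raise ValueError("ANTHROPIC_API_KEY should be an AI-Bridge key (ab-...)")
    # Catch the documentation placeholders that were never replaced.
    if "your-server.com" in base_url or set(api_key[3:]) == {"x"}:
        raise ValueError("placeholder values have not been replaced")

    return {"base_url": base_url, "api_key": api_key}
```

Calling `load_bridge_config()` with no argument reads the real process environment; passing a dict makes the check easy to try without touching your shell.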
Supported Models
AI-Bridge supports top-tier international models through a unified interface with automatic protocol conversion and failover.
Model             | Provider  | Strength                            | Context | Status
------------------|-----------|-------------------------------------|---------|------------
Claude 3.5 Sonnet | Anthropic | Best for code, accurate reasoning   | 200K    | Recommended
Claude 3.5 Haiku  | Anthropic | Fast responses, low cost            | 200K    | Available
Claude 3 Opus     | Anthropic | Strongest general ability           | 200K    | Available
GPT-4o            | OpenAI    | Multimodal, strong general purpose  | 128K    | Backup
GPT-4o Mini       | OpenAI    | High value, fast                    | 128K    | Backup
Gemini 2.0 Pro    | Google    | Ultra-long context, large codebases | 1M+     | Backup
Gemini 2.0 Flash  | Google    | Ultra-fast, low latency             | 1M+     | Backup
Note: Available models depend on admin configuration. If a model is unavailable, the system automatically fails over to alternatives.
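When a prompt is large, the Context column is the first thing to check. The sketch below is only an illustration: it copies the context sizes listed above, treats "1M+" as a floor of 1,000,000 tokens, and uses a hypothetical helper name and a hypothetical default output reserve. It does not query AI-Bridge, so models your admin has disabled may still appear in the result.

```python
# Context windows copied from the list above (approximate token counts).
# "1M+" is treated as a floor of 1,000,000 tokens.
MODELS = [
    ("Claude 3.5 Sonnet", 200_000, "Recommended"),
    ("Claude 3.5 Haiku", 200_000, "Available"),
    ("Claude 3 Opus", 200_000, "Available"),
    ("GPT-4o", 128_000, "Backup"),
    ("GPT-4o Mini", 128_000, "Backup"),
    ("Gemini 2.0 Pro", 1_000_000, "Backup"),
    ("Gemini 2.0 Flash", 1_000_000, "Backup"),
]


def models_that_fit(prompt_tokens, reserve_for_output=4_096):
    """Return model names whose context window holds the prompt plus a reply.

    reserve_for_output is an illustrative default, not an AI-Bridge setting.
    """
    needed = prompt_tokens + reserve_for_output
    return [name for name, window, _status in MODELS if window >= needed]
```

For example, `models_that_fit(500_000)` leaves only the two Gemini entries, which matches the recommendation to use Gemini for large codebases.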
Claude Models
Claude is AI-Bridge's primary model, with world-class code generation and ultra-long context support, ideal for Claude Code and other programming tools.
Available Model IDs
🟠 Claude 3.5 Sonnet: Best for code, recommended default. Model ID: claude-3-5-sonnet-20241022
⚡ Claude 3.5 Haiku: Fastest, ideal for simple tasks. Model ID: claude-3-5-haiku-20241022
🏆 Claude 3 Opus: Strongest general, complex reasoning. Model ID: claude-3-opus-20240229
GPT Models (OpenAI)
GPT-4o is OpenAI's flagship model with strong multimodal capabilities, serving as a backup option to Claude.
Available Model IDs
🟢 GPT-4o: Flagship model, multimodal. Model ID: gpt-4o
💡 GPT-4o Mini: High value, fast responses. Model ID: gpt-4o-mini
Gemini Models (Google)
Gemini's biggest advantage is ultra-long context (up to 2M tokens), ideal for analyzing large codebases or long documents.
Available Model IDs
🔵 Gemini 2.0 Pro: 1M+ context, large codebases. Model ID: gemini-2.0-pro-exp
⚡ Gemini 2.0 Flash: Ultra-fast, low latency. Model ID: gemini-2.0-flash-exp
Streaming (SSE)
AI-Bridge fully supports Server-Sent Events streaming. Claude Code and similar tools use streaming by default.
Enable Streaming
Add "stream": true to your request body:
# Streaming output, use -N for real-time display
curl -N -X POST https://your-server.com:3001/v1/chat/completions \
-H "Authorization: Bearer ab-xxxxxxxxxxxxxxxx" \
-H "Content-Type: application/json" \
-d '{"model":"claude-3-5-sonnet-20241022","messages":[{"role":"user","content":"Hello"}],"stream":true}'
Python Streaming Example
import anthropic

client = anthropic.Anthropic(
    base_url="https://your-server.com:3001",
    api_key="ab-xxxxxxxxxxxxxxxx",
)

# Print each text fragment as soon as it arrives
with client.messages.stream(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Tell a story"}],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
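If you call /v1/chat/completions directly instead of going through an SDK, you have to read the event stream yourself. The sketch below assumes AI-Bridge emits the usual OpenAI-style chunks (`data: {...}` lines carrying `choices[].delta.content`, ending with `data: [DONE]`); that format is an assumption based on the stated OpenAI compatibility, and `iter_stream_text` is a hypothetical helper name. The sample lines are made up for illustration.

```python
import json


def iter_stream_text(lines):
    """Yield text fragments from OpenAI-style SSE lines (assumed format)."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Made-up sample stream; a real one would come from the HTTP response body.
sample = [
    'data: {"choices":[{"delta":{"role":"assistant"}}]}',
    "",
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"delta":{"content":"lo"}}]}',
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))  # prints "Hello"
```

Any iterable of decoded lines works as input, so the same generator can be fed from whatever HTTP client you already use.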
Auto Failover
When a model API is unavailable, AI-Bridge automatically switches to a backup model to ensure 99%+ service availability.
Failover Chain
User requests Claude 3.5 Sonnet
↓
Claude API timeout / failure
↓
Auto switch to GPT-4o (backup 1)
↓
GPT-4o failure
↓
Auto switch to Gemini 2.0 Pro (backup 2)
↓
Return response (includes actual model used)
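The chain above can be pictured as a loop over an ordered list of models. This is a rough sketch of the idea, not AI-Bridge's actual implementation: `UpstreamError`, `complete_with_failover`, and `fake_call` are hypothetical names, and the real service may use different retry rules or timeouts.

```python
class UpstreamError(Exception):
    """Hypothetical error standing in for an upstream timeout or failure."""


# Order taken from the chain above.
FAILOVER_CHAIN = ["claude-3-5-sonnet-20241022", "gpt-4o", "gemini-2.0-pro-exp"]


def complete_with_failover(call_model, prompt, chain=FAILOVER_CHAIN):
    """Try each model in order and report which one actually answered."""
    last_error = None
    for model in chain:
        try:
            text = call_model(model, prompt)
        except UpstreamError as exc:
            last_error = exc  # remember the failure and try the next backup
            continue
        return {"model": model, "content": text}
    raise RuntimeError("all models in the chain failed") from last_error


def fake_call(model, prompt):
    """Stand-in upstream call: Claude times out, every other model answers."""
    if model.startswith("claude"):
        raise UpstreamError("timeout")
    return f"[{model}] reply to: {prompt}"


result = complete_with_failover(fake_call, "Hello")
print(result["model"])  # prints "gpt-4o"
```

Returning the model name alongside the content mirrors the last step of the chain, where the response reports the model that was actually used.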
Response includes actual model used
// Response example: requested claude, system used gpt-4o
{
  "id": "chatcmpl-xxx",
  "model": "gpt-4o",      // actual model used
  "provider": "openai",   // actual provider
  "choices": [...]
}
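On the client side, you can detect a failover by comparing the model you asked for with the `model` field of the response. A minimal sketch, assuming the response has already been parsed into a dict with the fields shown above; `failover_info` is a hypothetical helper name.

```python
def failover_info(requested_model, response):
    """Compare the requested model with the one reported in the response."""
    actual = response.get("model")
    return {
        "requested": requested_model,
        "actual": actual,
        "provider": response.get("provider"),
        # True only when the response names a different model.
        "failed_over": actual is not None and actual != requested_model,
    }


info = failover_info(
    "claude-3-5-sonnet-20241022",
    {"id": "chatcmpl-xxx", "model": "gpt-4o", "provider": "openai", "choices": []},
)
print(info["failed_over"])  # prints "True"
```

Logging this flag per request makes it easier to reconcile billing later, since tokens are charged for the model that actually answered.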
Billing: During failover, tokens are billed based on the model actually used. Different models have different token multipliers. See Products for details.
Codex CLI Integration
AI-Bridge provides a dedicated passthrough route /v1/openai for Codex CLI and other native OpenAI tools, fully compatible with all OpenAI APIs.
Configuration
# Codex CLI configuration
export OPENAI_API_KEY=ab-xxxxxxxxxxxxxxxx
export OPENAI_BASE_URL=https://your-server.com:3001/v1/openai

# Then use codex normally
codex "fix this bug"
Verify Connection
# Test models endpoint (Codex calls this on startup)
curl https://your-server.com:3001/v1/openai/models \
-H "Authorization: Bearer ab-xxxxxxxxxxxxxxxx"

# Test streaming chat
curl -N https://your-server.com:3001/v1/openai/chat/completions \
-H "Authorization: Bearer ab-xxxxxxxxxxxxxxxx" \
-H "Content-Type: application/json" \
-d '{"model":"gpt-4o","messages":[{"role":"user","content":"hi"}],"stream":true}'
Prerequisite: Admin must configure OPENAI_API_KEY in the system settings. Contact your service provider to confirm.
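The models check can also be scripted with the Python standard library. This sketch assumes the passthrough route returns the usual OpenAI list shape (`{"data": [{"id": ...}, ...]}`); both function names are hypothetical, and nothing is sent over the network until `list_model_ids` is called.

```python
import json
import urllib.request


def build_models_request(base_url, api_key):
    """Build a GET request for <base_url>/models with a Bearer token."""
    url = base_url.rstrip("/") + "/models"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})


def list_model_ids(base_url, api_key, timeout=10):
    """Fetch model IDs, assuming an OpenAI-style {"data": [...]} response."""
    request = build_models_request(base_url, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.load(resp)
    return [item["id"] for item in body.get("data", [])]
```

A typical call would be `list_model_ids("https://your-server.com:3001/v1/openai", "ab-xxxxxxxxxxxxxxxx")` with your real server address and key substituted.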
FAQ
Do tokens expire?
No. Purchased tokens are permanent with no time limit. As long as your account is active, tokens remain in your balance.
How is token usage calculated?
Based on the actual usage field returned by the API, including input (prompt) and output (completion) tokens. After each request, the system deducts exactly the amount reported in that usage field. You can view detailed consumption on the "Quota Management" page.
Do different models consume tokens at the same rate?
No. Different models have different token multipliers. Premium models like Claude and GPT-4o have higher rates, while lightweight models like GPT-4o Mini have lower rates. See the Products page for details.
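As a rough illustration of how usage and multipliers might combine, the sketch below multiplies the prompt and completion tokens by a per-model rate. Everything specific here is an assumption: the multiplier values are invented, the `prompt_tokens` and `completion_tokens` field names follow the OpenAI response format, and the real formula may differ. The Products page is the authoritative source.

```python
# Invented multipliers for illustration only; real rates are on the Products page.
EXAMPLE_MULTIPLIERS = {
    "claude-3-5-sonnet-20241022": 3.0,
    "gpt-4o-mini": 0.5,
}


def billed_tokens(response, multipliers=EXAMPLE_MULTIPLIERS):
    """Estimate the deduction as (prompt + completion) * model multiplier.

    The model is taken from the response, so a failover is billed at the
    rate of the model that actually answered.
    """
    usage = response["usage"]
    raw = usage["prompt_tokens"] + usage["completion_tokens"]
    return raw * multipliers[response["model"]]


example = {
    "model": "gpt-4o-mini",
    "usage": {"prompt_tokens": 120, "completion_tokens": 80},
}
print(billed_tokens(example))  # prints "100.0"
```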
Why is the response model different from what I requested?
This means auto failover was triggered. When your specified model is temporarily unavailable, the system automatically switches to a backup model. The model field in the response reflects the actual model used. Contact admin to disable this if needed.
What's the difference between an AI-Bridge key and an Anthropic key?
AI-Bridge API key (ab-...): Used to authenticate with AI-Bridge, consuming your purchased tokens. This is the key you use in your client.
Upstream Anthropic key: If you have your own official Anthropic API key, you can add it in Settings for priority routing with independent quota.
How do I request a refund?
Unused token balance can be refunded. Find the order on the "My Orders" page, or contact support with your order number for manual processing.
What if responses are slow?
Response speed depends on: 1) Network quality; 2) The model's inherent response time; 3) Upstream API load.
Tips: Use lightweight models like Claude 3.5 Haiku or GPT-4o Mini. If speed issues persist, contact admin to check IP pool status.