Why AI Gateway?
Instead of managing API keys per-model and per-environment, the AI Gateway provides a centralized, authenticated entry point for all LLM interactions. It handles authentication, agent ID validation, and policy enforcement - while maintaining full OpenAI SDK compatibility.
The gateway wraps the OpenAI SDK, so any framework that works with OpenAI works with the AI Gateway without changes to your calling code.
Prerequisite: Ensure you’ve installed the SDK and configured your environment before proceeding.
Quick start
1. Set up environment variables
Before using the gateway, configure your credentials:
# Required for AI Gateway
AI_GATEWAY_API_KEY=your-api-key
AI_GATEWAY_ENDPOINT=your-ai-gateway-endpoint
Never commit API keys to version control. Add .env to your .gitignore.
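Before creating the gateway, it can help to fail fast when a variable is missing. The following is a minimal sketch, not part of the SDK: the helper name `missing_gateway_env` is hypothetical, and only the two variable names come from this page.

```python
import os

# Variable names taken from this page; the helper itself is a hypothetical
# convenience and is not provided by bb_ai_sdk.
REQUIRED_VARS = ("AI_GATEWAY_API_KEY", "AI_GATEWAY_ENDPOINT")


def missing_gateway_env(env=None):
    """Return the required variable names that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_gateway_env()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("Gateway environment looks complete.")
```

Running this at application startup surfaces a missing variable before the first request is made.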
2. Make your first call
from bb_ai_sdk.ai_gateway import AIGateway

gateway = AIGateway.create(
    model_id="gpt-4o",
    agent_id="550e8400-e29b-41d4-a716-446655440000"
)

response = gateway.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)
That’s it - you’re making LLM calls through the Backbase AI Platform.
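A single call is stateless, so a multi-turn conversation means resending the earlier messages each time. The sketch below shows one way to do that with the standard OpenAI message format. The helper `chat_turn` is hypothetical and only assumes the `gateway.chat.completions.create` interface shown in the quick start.

```python
def chat_turn(gateway, history, user_text, model="gpt-4o"):
    """Send one user message along with prior history and record the reply."""
    history.append({"role": "user", "content": user_text})
    response = gateway.chat.completions.create(model=model, messages=history)
    reply = response.choices[0].message.content
    # Store the assistant reply so the next turn includes it as context.
    history.append({"role": "assistant", "content": reply})
    return reply


# Usage (assumes `gateway` was created as in the quick start):
# history = [{"role": "system", "content": "You are a concise assistant."}]
# print(chat_turn(gateway, history, "Hello!"))
# print(chat_turn(gateway, history, "Can you repeat that in French?"))
```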
Tracing with observability
Call configure_observability() before creating the gateway (see the integration steps):
from bb_ai_sdk.observability import configure_observability
from bb_ai_sdk.ai_gateway import AIGateway

configure_observability(service_name="my-agent")

gateway = AIGateway.create(model_id="gpt-4o", agent_id="...")
response = gateway.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)
For FastAPI apps, pass fastapi_app=app. For Agno, LangChain, or LangGraph, set framework= and install the matching [instrument-*] extra.
Sync vs Async
Choose based on your application architecture:
Use AIGateway for synchronous applications (scripts, simple APIs):
from bb_ai_sdk.ai_gateway import AIGateway

gateway = AIGateway.create(
    model_id="gpt-4o",
    agent_id="550e8400-e29b-41d4-a716-446655440000"
)
response = gateway.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)
Use AsyncAIGateway for async applications (FastAPI, LangGraph):
from bb_ai_sdk.ai_gateway import AsyncAIGateway

gateway = AsyncAIGateway.create(
    model_id="gpt-4o",
    agent_id="550e8400-e29b-41d4-a716-446655440000"
)
# The await below must run inside an async function (for example, a FastAPI route)
response = await gateway.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)
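One practical reason to choose the async client is that several requests can be in flight at once. The sketch below illustrates this with `asyncio.gather`. The helper `ask_all` is hypothetical and assumes only the awaitable `chat.completions.create` call shown above.

```python
import asyncio


async def ask_all(gateway, prompts, model="gpt-4o"):
    """Send several prompts concurrently and return replies in prompt order."""

    async def ask(prompt):
        response = await gateway.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        )
        return response.choices[0].message.content

    # gather() returns results in the same order as its arguments.
    return await asyncio.gather(*(ask(p) for p in prompts))


# Usage (assumes an AsyncAIGateway instance named `gateway`):
# replies = asyncio.run(ask_all(gateway, ["Hello!", "Tell me a joke"]))
```

Concurrent requests still count against any rate limit the gateway enforces, so large batches may need throttling.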
Common use cases
Streaming responses
For real-time responses (chatbots, interactive UIs), enable streaming:
Sync streaming
stream = gateway.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Write a story"}],
    stream=True
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
Async streaming
# Requires an AsyncAIGateway and must run inside an async function
stream = await gateway.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Write a story"}],
    stream=True
)
async for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
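Applications often need the full text as well as the live output, for example to store the reply after displaying it. The following sketch accumulates the streamed deltas into one string. The helper `collect_stream` is hypothetical. It also skips chunks whose `choices` list is empty, which some OpenAI-compatible backends can emit. Whether the gateway does so is an assumption.

```python
def collect_stream(stream, on_delta=None):
    """Concatenate streamed text deltas, optionally passing each to on_delta."""
    parts = []
    for chunk in stream:
        # Skip chunks that carry no choices at all.
        if not chunk.choices:
            continue
        text = chunk.choices[0].delta.content
        if text:
            parts.append(text)
            if on_delta is not None:
                on_delta(text)
    return "".join(parts)


# Usage with the sync stream from above:
# full_text = collect_stream(stream, on_delta=lambda t: print(t, end=""))
```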
Framework adapters
The AI Gateway is OpenAI-compatible out of the box, but if you’re using LangChain, LangGraph, or Agno, adapters convert the gateway into framework-native objects - no manual configuration required.
LangChain
from bb_ai_sdk.ai_gateway import AIGateway
from bb_ai_sdk.ai_gateway.adapters.langchain import to_langchain
from langchain_core.output_parsers import StrOutputParser

gateway = AIGateway.create(model_id="gpt-4o", agent_id="...")
model = to_langchain(gateway)  # Returns a ChatOpenAI-compatible model

# Use with LangChain components
chain = model | StrOutputParser()
response = chain.invoke("Tell me a joke")
LangGraph
from bb_ai_sdk.ai_gateway import AsyncAIGateway
from bb_ai_sdk.ai_gateway.adapters.langchain import to_langchain_async

gateway = AsyncAIGateway.create(model_id="gpt-4o", agent_id="...")
model = to_langchain_async(gateway)  # Returns async-compatible model

# Use in LangGraph nodes
async def generate(state):
    response = await model.ainvoke(state["messages"])
    return {"messages": [response]}
Agno
from bb_ai_sdk.ai_gateway import AIGateway
from bb_ai_sdk.ai_gateway.adapters.agno import to_agno
from agno.agent import Agent

gateway = AIGateway.create(model_id="gpt-4o", agent_id="...")
model = to_agno(gateway)  # Returns Agno-compatible model

agent = Agent(
    name="Assistant",
    model=model,
    instructions="You are helpful."
)
response = agent.run("Hello!")
Common patterns
Token usage tracking
Extract token consumption from responses:
from bb_ai_sdk.ai_gateway import get_token_usage

response = gateway.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)

# usage may be empty, so check it before reading the counts
usage = get_token_usage(response)
if usage:
    print(f"Prompt tokens: {usage.prompt_tokens}")
    print(f"Completion tokens: {usage.completion_tokens}")
    print(f"Total tokens: {usage.total_tokens}")
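For cost tracking, per-call numbers are usually summed over a session or request. The sketch below keeps running totals. `UsageTotals` is a hypothetical helper that relies only on the three attribute names printed above, and it ignores a missing usage value, since the `if usage:` guard suggests one is possible.

```python
from dataclasses import dataclass


@dataclass
class UsageTotals:
    """Running token totals across several calls (hypothetical helper)."""

    prompt_tokens: int = 0
    completion_tokens: int = 0
    total_tokens: int = 0
    calls: int = 0

    def add(self, usage):
        """Add one usage object; a None value is ignored."""
        if usage is None:
            return
        self.prompt_tokens += usage.prompt_tokens
        self.completion_tokens += usage.completion_tokens
        self.total_tokens += usage.total_tokens
        self.calls += 1


# Usage:
# totals = UsageTotals()
# totals.add(get_token_usage(response))
# print(totals)
```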
Error handling
Catch the SDK’s specific exception types to handle each failure scenario separately:
from bb_ai_sdk.ai_gateway import (
    AIGateway,
    InvalidAgentIdError,
    ConfigurationError,
    AuthenticationError,
    RateLimitError,
)

# Handle creation errors
try:
    gateway = AIGateway.create(
        model_id="gpt-4o",
        agent_id="invalid"
    )
except InvalidAgentIdError:
    print("Invalid agent ID format - must be UUID v4")
except ConfigurationError:
    print("Missing API key or gateway URL")

# Handle request errors
try:
    response = gateway.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello!"}]
    )
except AuthenticationError:
    print("Invalid API key")
except RateLimitError:
    print("Rate limit exceeded - implement backoff")
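The rate-limit branch above only prints a message. A minimal sketch of the backoff it mentions could look like the following. `call_with_backoff` is a hypothetical helper, not an SDK function, and the delay values are illustrative defaults rather than documented limits.

```python
import random
import time


def call_with_backoff(fn, retry_on, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying with exponential backoff and jitter on retry_on."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts, so surface the last error
            # Delay doubles each attempt (1x, 2x, 4x ... base_delay),
            # plus up to 10% random jitter to avoid synchronized retries.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay * 0.1))


# Usage (RateLimitError imported from bb_ai_sdk.ai_gateway as above):
# response = call_with_backoff(
#     lambda: gateway.chat.completions.create(
#         model="gpt-4o",
#         messages=[{"role": "user", "content": "Hello!"}],
#     ),
#     retry_on=RateLimitError,
# )
```

Only the exception passed as `retry_on` is retried. Errors such as an invalid API key propagate immediately, since repeating the call would not help.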
Error types reference
| Error | HTTP Code | Description |
| --- | --- | --- |
| InvalidAgentIdError | - | Agent ID not in UUID v4 format |
| ConfigurationError | - | Missing API key or invalid gateway URL |
| AuthenticationError | 401 | Invalid or expired API key |
| AuthorizationError | 403 | Insufficient permissions for this operation |
| RateLimitError | 429 | Rate limit exceeded |
| ValidationError | 400 | Invalid request parameters |
| ModelNotFoundError | 404 | Requested model not available |
| ServiceError | 500+ | Server-side error |
| NetworkError | - | Connection failed |
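Since InvalidAgentIdError is raised for agent IDs that are not UUID v4, the ID can be pre-checked, for example when loading it from configuration. The sketch below uses Python's standard `uuid` module. The helper `is_uuid_v4` is hypothetical, and the SDK's own validation may be stricter, because `uuid.UUID` also accepts forms such as strings without hyphens.

```python
import uuid


def is_uuid_v4(value):
    """Return True if value parses as a version-4 UUID string."""
    try:
        parsed = uuid.UUID(str(value))
    except ValueError:
        return False
    return parsed.version == 4


# Usage:
# if not is_uuid_v4(agent_id):
#     raise SystemExit("agent_id must be a UUID v4")
```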
Configuration
Environment variables
Configure credentials via environment variables (recommended):
# Required
AI_GATEWAY_API_KEY=your-api-key
AI_GATEWAY_ENDPOINT=your-ai-gateway-endpoint
Never commit API keys to version control. Add .env to your .gitignore.
Create parameters
model_id
Model identifier (e.g., gpt-4o, gpt-4o-mini, gpt-4-turbo).
agent_id
Your agent’s unique identifier in UUID v4 format. Obtained from the platform when you register your agent.
API key for authentication. Falls back to AI_GATEWAY_API_KEY, then AZURE_OPENAI_API_KEY, if not provided.
Gateway URL. Falls back to AI_GATEWAY_ENDPOINT environment variable or platform default.
api_version
string
default: "2024-10-21"
API version for the gateway.
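The API key fallback order described above (explicit argument, then AI_GATEWAY_API_KEY, then AZURE_OPENAI_API_KEY) can be pictured with a short sketch. This illustrates the documented precedence only. It is not the SDK's implementation, and `resolve_api_key` is a hypothetical name.

```python
import os


def resolve_api_key(explicit=None, env=None):
    """Return the first non-empty key: argument, AI_GATEWAY_API_KEY, AZURE_OPENAI_API_KEY."""
    env = os.environ if env is None else env
    candidates = (
        explicit,
        env.get("AI_GATEWAY_API_KEY"),
        env.get("AZURE_OPENAI_API_KEY"),
    )
    for candidate in candidates:
        if candidate:
            return candidate
    return None  # nothing configured; per the error table, the SDK reports ConfigurationError
```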
Advanced: Accessing the underlying client
Access the underlying OpenAI client or raw configuration for advanced use cases:
get_client()
Get the underlying OpenAI client for direct SDK access:
client = gateway.get_client()
# Returns an OpenAI (sync) or AsyncOpenAI (async) instance
get_config()
Get configuration dictionary for manual framework setup:
config = gateway.get_config()
# Returns:
# {
# "api_key": "...",
# "base_url": "...",
# "default_headers": {"x-agent-id": "...", "api-key": "..."},
# "default_query": {"api-version": "2024-10-21"},
# "model": "gpt-4o"
# }
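For manual setup, the dictionary above maps closely onto the OpenAI client constructor, which accepts `api_key`, `base_url`, `default_headers`, and `default_query`. The `model` entry is passed per request instead. The sketch below separates the two. `split_config` is a hypothetical helper and assumes the config has exactly the shape shown above.

```python
def split_config(config):
    """Separate OpenAI client constructor arguments from the model name."""
    client_kwargs = {key: value for key, value in config.items() if key != "model"}
    return client_kwargs, config.get("model")


# Usage (requires the openai package):
# from openai import OpenAI
# client_kwargs, model = split_config(gateway.get_config())
# client = OpenAI(**client_kwargs)
# response = client.chat.completions.create(
#     model=model,
#     messages=[{"role": "user", "content": "Hello!"}],
# )
```

The returned dictionary contains the API key, so avoid logging it as-is.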
API reference
AIGateway
| Property/Method | Returns | Description |
| --- | --- | --- |
| chat | Chat interface | OpenAI-compatible chat completions |
| model_id | str | Configured model ID |
| agent_id | str | Validated agent ID |
| get_client() | OpenAI | Underlying OpenAI client |
| get_config() | dict | Configuration dictionary |
| create() | AIGateway | Factory method (class method) |
AsyncAIGateway
Same interface as AIGateway, but get_client() returns an AsyncOpenAI client and chat calls must be awaited.
Next steps
Observability: Add tracing and monitoring to your agents
Starter kits: See AI Gateway integrated in production templates
Get started: Build your first agent end-to-end
Examples: View complete working examples