API Reference
Complete reference documentation for all BrainTwin.ai API endpoints.
Base URL
https://inference.braintwin.ai/v1
All API endpoints are relative to this base URL.
Every request must include the Authorization: Bearer YOUR_API_KEY header.
/v1/chat/completions
Create Chat Completion
Creates a model response for the given chat conversation. This is the main endpoint for generating AI responses.
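As a rough illustration of how the base URL, the endpoint path, and the Authorization header fit together, here is a minimal Python sketch that builds a request without sending it. The helper name build_chat_request is hypothetical, and the POST method is an assumption inferred from the cURL examples, which send a JSON body with -d.

```python
import json
import urllib.request

BASE_URL = "https://inference.braintwin.ai/v1"  # base URL given in this reference


def build_chat_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a chat completion request.

    Hypothetical helper for illustration only.
    """
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",  # assumption: implied by the cURL examples' use of -d
    )


request = build_chat_request(
    "YOUR_API_KEY",
    {"model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "Hello!"}]},
)
```

Passing the returned object to urllib.request.urlopen would send it. That call is left out here so the sketch runs without network access.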
Parameters
model (string, required): ID of the model to use. Available models: gpt-3.5-turbo, gpt-4
messages (array, required): A list of messages comprising the conversation so far.
temperature (number): Sampling temperature between 0 and 2. Higher values make output more random. Default: 1
max_tokens (integer): Maximum number of tokens to generate. Default: 16
stream (boolean): Whether to stream back partial progress. Default: false
presence_penalty (number): Penalty for new tokens based on presence in text. Range: -2.0 to 2.0
frequency_penalty (number): Penalty for new tokens based on frequency in text. Range: -2.0 to 2.0
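The ranges and defaults listed above can be checked on the client before a request is sent. The following Python sketch is one possible way to do that. The function build_payload is hypothetical, and whether the server itself rejects out-of-range values is not stated in this reference.

```python
def build_payload(model, messages, temperature=1.0, max_tokens=16, stream=False,
                  presence_penalty=None, frequency_penalty=None):
    """Assemble a request body using the documented defaults and ranges.

    Hypothetical client-side helper; raises ValueError on out-of-range input.
    """
    if not isinstance(model, str) or not model:
        raise ValueError("model is required and must be a non-empty string")
    if not isinstance(messages, list):
        raise ValueError("messages is required and must be a list")
    if not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")

    payload = {
        "model": model,
        "messages": messages,
        "temperature": temperature,
        "max_tokens": max_tokens,
        "stream": stream,
    }
    # No default is documented for the penalties, so they are only
    # included when the caller supplies them.
    for name, value in (("presence_penalty", presence_penalty),
                        ("frequency_penalty", frequency_penalty)):
        if value is None:
            continue
        if not -2.0 <= value <= 2.0:
            raise ValueError(f"{name} must be between -2.0 and 2.0")
        payload[name] = value
    return payload


payload = build_payload(
    "gpt-3.5-turbo",
    [{"role": "user", "content": "Hello!"}],
    temperature=0.7,
    max_tokens=150,
)
```

Note that the documented default for max_tokens is small, so most callers will want to set it explicitly, as the usage line does.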
Examples
cURL Example
curl https://inference.braintwin.ai/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"model": "gpt-3.5-turbo",
"messages": [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Hello!"}
],
"temperature": 0.7,
"max_tokens": 150,
"stream": false
}'
Available Models
Choose the right model for your use case:
gpt-3.5-turbo (Most Popular)
Fast and efficient model optimized for chat and general tasks. Best balance of speed and capability.
gpt-4 (Most Capable)
More capable model with superior reasoning, analysis, and complex task handling.
Streaming Responses
Enable streaming to receive partial responses as they're generated:
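On the client side, a streamed response has to be read line by line instead of waiting for the full body. The Python sketch below shows one way to do that, assuming each event arrives as a line of the form data: <payload>. Two details are assumptions that this reference does not confirm: that each payload is a JSON object, and that the stream ends with a data: [DONE] sentinel. The function name iter_stream_payloads is hypothetical.

```python
import json


def iter_stream_payloads(lines):
    """Yield decoded payloads from an iterable of streamed lines.

    Accepts bytes or str lines, such as those produced by iterating
    over an HTTP response object.
    """
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and any other fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # assumption: end-of-stream sentinel
            break
        yield json.loads(payload)  # assumption: payloads are JSON


sample = [b'data: {"text": "Hel"}\n', b"\n", b'data: {"text": "lo"}\n', b"data: [DONE]\n"]
chunks = list(iter_stream_payloads(sample))
```

The sample payload shape ({"text": ...}) is made up for the demonstration. Inspect a real response to find the actual field names.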
Streaming Example
curl https://inference.braintwin.ai/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"model": "gpt-3.5-turbo",
"messages": [{"role": "user", "content": "Write a haiku"}],
"stream": true
}'
Each streamed event contains a data: field with partial completion data.
Error Codes
The API uses standard HTTP status codes to indicate success or failure:
200: Request successful
400: Invalid request parameters or malformed JSON
401: Invalid or missing API key
429: Rate limit exceeded
500: Server error - please try again later
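Client code usually needs to decide which of these outcomes are worth retrying. The Python sketch below is one way to make that decision. It assumes the conventional HTTP codes for the outcomes listed above (200, 400, 401, 429, 500), and the function name classify_status is hypothetical.

```python
def classify_status(status: int) -> str:
    """Map an HTTP status code to a coarse client action.

    Returns "ok", "retry", or "fail". Hypothetical helper, assuming
    conventional HTTP status codes.
    """
    if 200 <= status < 300:
        return "ok"
    if status == 429 or status >= 500:
        # Rate limits and server errors are transient: retry after a delay.
        return "retry"
    # Bad requests and authentication failures will not succeed unchanged.
    return "fail"


actions = {code: classify_status(code) for code in (200, 400, 401, 429, 500)}
```

Retrying a 400 or 401 response without changing the request or the API key would only consume more of the rate limit, which is why those are treated as permanent failures here.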
Rate Limiting
API requests are rate limited based on your subscription plan. Rate limit information is included in response headers:
X-RateLimit-Limit: Total requests allowed per time window
X-RateLimit-Remaining: Remaining requests in current window
X-RateLimit-Reset: Unix timestamp when the rate limit resets
Requests that exceed the limit receive a 429 Too Many Requests error. Consider implementing exponential backoff in your applications.
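A minimal Python sketch of the suggested exponential backoff follows. It combines backoff with the X-RateLimit-Reset header when that header is present. All function names are hypothetical, and the base delay, the cap, and the use of random jitter are illustrative choices, not values taken from this reference.

```python
import random
import time


def seconds_until_reset(headers, now=None):
    """Return seconds until X-RateLimit-Reset, or None if unavailable."""
    now = time.time() if now is None else now
    lowered = {k.lower(): v for k, v in headers.items()}  # header names are case-insensitive
    try:
        reset = float(lowered["x-ratelimit-reset"])
    except (KeyError, TypeError, ValueError):
        return None
    return max(0.0, reset - now)


def backoff_delay(attempt, base=1.0, cap=60.0, jitter=True):
    """Exponential backoff: base * 2**attempt, capped, optionally jittered."""
    delay = min(cap, base * 2 ** attempt)
    return random.uniform(0, delay) if jitter else delay


def retry_delay(attempt, headers, now=None):
    """Prefer the server's reset time; fall back to exponential backoff."""
    wait = seconds_until_reset(headers, now)
    return wait if wait is not None else backoff_delay(attempt)


# Example: on a 429 response, sleep for retry_delay(attempt, response_headers)
# seconds before sending the request again.
```

Jitter spreads retries from many clients over time so they do not all hit the API at the same instant after a limit resets.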