Base URL: https://ai.izitechnologies.com
All endpoints require Authorization: Bearer YOUR_API_KEY unless marked as public.
Authenticate and receive a session cookie. Public endpoint (no API key required).
| Parameter | Type | Required | Description |
|---|---|---|---|
| email | string | required | User email |
| password | string | required | User password |
curl -X POST https://ai.izitechnologies.com/auth/login -H "Content-Type: application/json" \
-d '{"email":"user@example.com","password":"secret"}'
# Response: {"user": {"id":1,"email":"...","role":"customer",...}}
# Sets session cookie
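The login flow above can be sketched in Python with the standard library alone. This is a minimal sketch, assuming the session cookie is an ordinary HTTP cookie that a cookie jar can capture; the helper names `build_login_request` and `make_session` are illustrative and not part of the API.

```python
import json
import urllib.request
from http.cookiejar import CookieJar

BASE_URL = "https://ai.izitechnologies.com"


def build_login_request(email: str, password: str) -> urllib.request.Request:
    """Build (but do not send) the POST /auth/login request."""
    body = json.dumps({"email": email, "password": password}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/auth/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def make_session() -> urllib.request.OpenerDirector:
    """Return an opener that stores cookies so later calls reuse the session."""
    return urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(CookieJar())
    )


# Usage (performs a real network call, so it is left commented out):
# session = make_session()
# with session.open(build_login_request("user@example.com", "secret")) as resp:
#     user = json.load(resp)["user"]
```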
Clear session cookie.
Get the authenticated user's profile.
# Response: {"id":1,"email":"...","name":"...","role":"customer","rate_limit":60,...}
Service health check. Public endpoint (no API key required).
curl https://ai.izitechnologies.com/health
# {"status":"ok","services":{"ollama":true,"chatterbox":true,"whisper":true}}
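A monitoring script might reduce the health payload to a single pass/fail signal. The sketch below assumes the `services` map always holds booleans, as in the sample response; `unavailable_services` and `is_healthy` are hypothetical helper names.

```python
def unavailable_services(health: dict) -> list:
    """Return the names of backing services reported as not available."""
    services = health.get("services", {})
    return sorted(name for name, ok in services.items() if not ok)


def is_healthy(health: dict) -> bool:
    """True when the status is "ok" and every listed service is up."""
    return health.get("status") == "ok" and not unavailable_services(health)
```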
Streaming chat completion with tool/function calling. Returns Server-Sent Events (SSE). If the user has a Knowledge Base, relevant context is auto-injected.
| Parameter | Type | Required | Description |
|---|---|---|---|
| messages | array | required | Array of message objects with role (system/user/assistant/tool) and content |
| tools | array | optional | Tool/function definitions for function calling |
curl -N -X POST https://ai.izitechnologies.com/api/chat \
-H "Authorization: Bearer aigw_..." \
-H "Content-Type: application/json" \
-d '{
"messages": [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Hello!"}
],
"tools": [
{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get weather for a city",
"parameters": {"type":"object","properties":{"city":{"type":"string"}},"required":["city"]}
}
}
]
}'
# Token streaming
event: token
data: {"content": "Hi", "done": false}
# Tool/function call (LLM wants to invoke a tool)
event: tool_call
data: {"tool_calls": [{"function": {"name": "get_weather", "arguments": {"city": "Paris"}}}]}
# Completion
event: done
data: {"content": "", "done": true, "total_duration": 1234567, "eval_count": 42}
# Error
event: error
data: {"error": "something went wrong"}
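The event types listed above can be consumed with a small parser. This is a sketch only: it assumes the stream follows the standard SSE framing (one `event:` line, one or more `data:` lines, then a blank line) and that every `data:` payload is JSON. `parse_sse` and `collect_reply` are illustrative names, not part of the API.

```python
import json
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[tuple]:
    """Yield (event_name, decoded_data) pairs from a stream of SSE lines."""
    event, data_lines = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith(":"):
            continue  # SSE comment or keep-alive
        if line == "":
            # A blank line terminates one event.
            if data_lines:
                yield event, json.loads("\n".join(data_lines))
            event, data_lines = "message", []
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            # Payloads are JSON, so leading whitespace is insignificant.
            data_lines.append(line[len("data:"):].lstrip())
    if data_lines:  # stream ended without a trailing blank line
        yield event, json.loads("\n".join(data_lines))


def collect_reply(events: Iterable[tuple]) -> tuple:
    """Join token contents and gather tool calls until the done event."""
    text, tool_calls = [], []
    for name, data in events:
        if name == "token":
            text.append(data.get("content", ""))
        elif name == "tool_call":
            tool_calls.extend(data.get("tool_calls", []))
        elif name == "error":
            raise RuntimeError(data.get("error", "unknown error"))
        elif name == "done":
            break
    return "".join(text), tool_calls
```

With an HTTP client that exposes the response body line by line, `collect_reply(parse_sse(lines))` returns the assembled text and any requested tool calls.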
After receiving tool_call, execute the tool and send the result back as a message with role: "tool".
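A sketch of that round trip in Python follows. The exact shape of the tool message is not specified here, so the `content` field carrying a JSON string is an assumption, as is the stand-in `get_weather` implementation; `TOOLS` and `tool_result_messages` are hypothetical names.

```python
import json


def get_weather(city: str) -> dict:
    """Stand-in tool; a real implementation would query a weather service."""
    return {"city": city, "forecast": "unknown"}


# Maps tool names (as declared in the request's "tools" array) to callables.
TOOLS = {"get_weather": get_weather}


def tool_result_messages(tool_calls: list) -> list:
    """Run each requested tool and wrap its result as a role "tool" message."""
    messages = []
    for call in tool_calls:
        fn = call["function"]
        args = fn["arguments"]
        if isinstance(args, str):  # tolerate JSON-encoded arguments
            args = json.loads(args)
        result = TOOLS[fn["name"]](**args)
        # Assumption: the result travels as a JSON string in "content".
        messages.append({"role": "tool", "content": json.dumps(result)})
    return messages
```

The returned messages would be appended to the conversation and sent in a follow-up request to the chat endpoint.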
Synthesize speech from text. Returns audio/wav. Supports 26+ languages, 6 preset voices, and custom cloned voices.
| Parameter | Type | Required | Description |
|---|---|---|---|
| text | string | required | Text to synthesize (max ~300 chars recommended) |
| voice_id | string | optional | Voice ID. Default: preset_aria |
| language | string | optional | ISO language code. Default: en |
| exaggeration | float | optional | Expressiveness 0.25-2.0. Default: 0.5 |
| temperature | float | optional | Variation 0.05-2.0. Default: 0.8 |
| cfg_weight | float | optional | Stability 0.0-1.0. Default: 0.5 |
curl -X POST https://ai.izitechnologies.com/api/tts \
-H "Authorization: Bearer aigw_..." \
-H "Content-Type: application/json" \
-d '{"text":"Hello world!","voice_id":"preset_aria","language":"en"}' \
-o speech.wav
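Client code may want to check the tuning parameters against the documented ranges before sending a request. The sketch below builds the JSON body and rejects out-of-range values; whether the server also rejects them is not stated here, and `build_tts_payload` is a hypothetical helper.

```python
# Documented ranges for the optional tuning parameters.
LIMITS = {
    "exaggeration": (0.25, 2.0),
    "temperature": (0.05, 2.0),
    "cfg_weight": (0.0, 1.0),
}


def build_tts_payload(text: str, voice_id: str = "preset_aria",
                      language: str = "en", **tuning: float) -> dict:
    """Return the request body for the TTS endpoint, validating ranges."""
    if not text:
        raise ValueError("text is required")
    payload = {"text": text, "voice_id": voice_id, "language": language}
    for name, value in tuning.items():
        if name not in LIMITS:
            raise ValueError(f"unknown parameter: {name}")
        low, high = LIMITS[name]
        if not low <= value <= high:
            raise ValueError(f"{name} must be between {low} and {high}")
        payload[name] = value
    return payload
```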
List all available voices (preset and cloned).
# Response:
{"voices": [
{"voice_id":"preset_aria","name":"Aria","type":"preset","preview_url":"..."},
{"voice_id":"clone_myvoice","name":"myvoice","type":"clone","preview_url":"..."}
]}
List all supported TTS languages (26+).
# Response: {"languages": {"en":"English","bn":"Bengali","fr":"French",...}}
Create a cloned voice from a 5-30 second audio sample.
| Parameter | Type | Required | Description |
|---|---|---|---|
| name | string | required | Voice name |
| file | file | required | Audio sample (WAV, MP3, WebM) |
curl -X POST https://ai.izitechnologies.com/api/voices/clone \
-H "Authorization: Bearer aigw_..." \
-F "name=My Voice" -F "file=@sample.wav"
# Response: {"voice_id":"clone_my_voice","name":"My Voice","type":"clone"}
Delete a cloned voice. Preset voices cannot be deleted.
Transcribe audio to text using Whisper large-v3. Supports 90+ languages with auto-detection.
| Parameter | Type | Required | Description |
|---|---|---|---|
| file | file | required | Audio file (WAV, MP3, WebM, FLAC, OGG) |
| language | string | optional | Language hint (ISO code). Auto-detected if omitted. |
curl -X POST https://ai.izitechnologies.com/api/stt \
-H "Authorization: Bearer aigw_..." \
-F "file=@recording.wav" -F "language=en"
# Response:
{
"text": "Hello world",
"language": "en",
"language_probability": 0.99,
"duration": 3.456,
"segments": [
{"start": 0.0, "end": 1.5, "text": "Hello world"}
]
}
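The `segments` array lends itself to timestamped output. The following sketch assumes `start` and `end` are offsets in seconds, which is consistent with the `duration` value in the sample but not stated explicitly; `format_timestamp` and `segment_lines` are illustrative names.

```python
def format_timestamp(seconds: float) -> str:
    """Render a second offset as MM:SS.mmm."""
    total_ms = round(seconds * 1000)
    minutes, rest = divmod(total_ms, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{minutes:02d}:{secs:02d}.{ms:03d}"


def segment_lines(response: dict) -> list:
    """Turn each transcription segment into one timestamped line."""
    return [
        f"[{format_timestamp(s['start'])} - {format_timestamp(s['end'])}] "
        f"{s['text'].strip()}"
        for s in response.get("segments", [])
    ]
```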
Full pipeline in a single request: Audio in → STT → LLM → TTS → Audio out. Returns audio/wav.
| Parameter | Type | Required | Description |
|---|---|---|---|
| file | file | required | Audio recording |
| history | string | optional | JSON array of previous messages for multi-turn |
curl -X POST https://ai.izitechnologies.com/api/voicechat \
-H "Authorization: Bearer aigw_..." \
-F "file=@recording.wav" -F 'history=[]' -o response.wav
# Response headers:
# X-Transcription: what the user said
# X-LLM-Response: what the AI replied
# Body: audio/wav of the spoken response
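For multi-turn voice chat, the two response headers can be folded back into the `history` field of the next request. This sketch assumes history entries use the same role/content message shape as /api/chat and that the header values arrive as plain text; both are assumptions, and `append_turn` and `history_field` are hypothetical helpers.

```python
import json

ALLOWED_ROLES = {"system", "user", "assistant", "tool"}


def append_turn(history: list, transcription: str, reply: str) -> list:
    """Return a new history extended with one user/assistant exchange."""
    return history + [
        {"role": "user", "content": transcription},
        {"role": "assistant", "content": reply},
    ]


def history_field(history: list) -> str:
    """Serialize the history into the string sent as the form field."""
    for message in history:
        if message.get("role") not in ALLOWED_ROLES or "content" not in message:
            raise ValueError(f"invalid message: {message!r}")
    return json.dumps(history, ensure_ascii=False)
```

After each response, `append_turn(history, headers["X-Transcription"], headers["X-LLM-Response"])` would give the history for the following request.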
Extract text from images via OCR (Tesseract). Supports 12 languages.
| Parameter | Type | Required | Description |
|---|---|---|---|
| file | file | required | Image file (PNG, JPG, TIFF, BMP, WebP) |
| language | string | optional | Language code (en, bn, fr, es, de, ar, hi, ja, ko, zh, pt, ru). Default: en |
curl -X POST https://ai.izitechnologies.com/api/ocr \
-H "Authorization: Bearer aigw_..." \
-F "file=@document.png" -F "language=en"
# Response: {"text":"extracted text","language":"en","filename":"document.png"}
AI-powered image analysis. Send any image with a custom prompt and get an intelligent description.
| Parameter | Type | Required | Description |
|---|---|---|---|
| file | file | required | Image file |
| prompt | string | optional | Analysis prompt. Default: "Describe this image in detail." |
curl -X POST https://ai.izitechnologies.com/api/vision/analyze \
-H "Authorization: Bearer aigw_..." \
-F "file=@photo.jpg" \
-F "prompt=What objects are in this image? List them."
# Response: {"analysis": "The image contains a laptop, a coffee cup, ..."}
Same as above but streams the response as SSE (same format as /api/chat).
Document intelligence. Extracts structured JSON data from identity documents, invoices, receipts, and business cards. Uses OCR + AI vision together for maximum accuracy.
| Parameter | Type | Required | Description |
|---|---|---|---|
| file | file | required | Document image |
| type | string | optional | passport, drivers_license, id_card, invoice, receipt, business_card, or auto |
| language | string | optional | Document language for OCR. Default: en |
curl -X POST https://ai.izitechnologies.com/api/vision/scan \
-H "Authorization: Bearer aigw_..." \
-F "file=@passport.jpg" -F "type=passport"
# Response:
{
"document_type": "passport",
"raw_text": "OCR extracted text...",
"ai_analysis": "AI interpretation...",
"extracted": {
"full_name": "John Michael Smith",
"surname": "Smith",
"given_names": "John Michael",
"nationality": "United States",
"date_of_birth": "15 Jan 1990",
"passport_number": "AB1234567",
"date_of_issue": "01 Mar 2020",
"date_of_expiry": "01 Mar 2030",
"sex": "M",
"mrz_line_1": "P<USASMITH<<JOHN<MICHAEL...",
"mrz_line_2": "AB1234567..."
},
"filename": "passport.jpg"
}
Passport fields: full_name, surname, given_names, nationality, date_of_birth, sex, place_of_birth, date_of_issue, date_of_expiry, passport_number, issuing_authority, mrz_line_1, mrz_line_2.
Driver's license fields: full_name, date_of_birth, address, license_number, class, issue_date, expiry_date, sex, height, restrictions.
Invoice fields: vendor_name, invoice_number, invoice_date, due_date, line_items, subtotal, tax, total_amount, currency.
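Downstream code typically validates the `extracted` object before trusting it. The sketch below checks for missing passport fields and tests expiry. It assumes dates always follow the "01 Mar 2030" shape seen in the sample response, which may not hold for every document, and it relies on an English locale for month abbreviations. The helper names are illustrative.

```python
from datetime import date, datetime


def parse_doc_date(value: str) -> date:
    """Parse dates shaped like "01 Mar 2030" (the sample response format)."""
    return datetime.strptime(value, "%d %b %Y").date()


def missing_fields(scan: dict, required: list) -> list:
    """Return the required field names that are absent or empty."""
    extracted = scan.get("extracted", {})
    return [name for name in required if not extracted.get(name)]


def is_expired(scan: dict, today: date) -> bool:
    """True when the document's date_of_expiry is before the given day."""
    return parse_doc_date(scan["extracted"]["date_of_expiry"]) < today
```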
Per-user isolated knowledge bases. Documents are automatically chunked, embedded, and injected into all chat and agent conversations.
List your knowledge bases.
# Response: [{"id":1,"name":"Product Docs","doc_count":5,"description":"..."}]
Create a knowledge base.
| Parameter | Type | Required | Description |
|---|---|---|---|
| name | string | required | KB name |
| description | string | optional | Description |
Delete a KB and all its documents and embeddings.
List documents in a knowledge base.
Upload a document. Supports text, PDF, CSV, images (OCR), and URL scraping.
| Parameter | Type | Required | Description |
|---|---|---|---|
| file | file | option A | File upload: .txt, .pdf, .csv, .md, .png, .jpg, .tiff |
| url | string | option B | URL to scrape and ingest |
| language | string | optional | Language for image OCR |
# Upload a file
curl -X POST https://ai.izitechnologies.com/api/kb/1/documents -H "Authorization: Bearer aigw_..." \
-F "file=@pricing.pdf"
# Scrape a URL
curl -X POST https://ai.izitechnologies.com/api/kb/1/documents -H "Authorization: Bearer aigw_..." \
-F "url=https://example.com/faq"
# Upload an image (OCR extracted automatically)
curl -X POST https://ai.izitechnologies.com/api/kb/1/documents -H "Authorization: Bearer aigw_..." \
-F "file=@scanned_doc.png" -F "language=en"
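The three upload variants differ only in their form fields. A small helper can generate the `-F` arguments and enforce that exactly one source is given. Treating `file` and `url` as mutually exclusive is an inference from the "option A" and "option B" labels, and `upload_form_args` is a hypothetical name.

```python
from typing import Optional


def upload_form_args(file_path: Optional[str] = None,
                     url: Optional[str] = None,
                     language: Optional[str] = None) -> list:
    """Return curl -F arguments for one document upload."""
    if (file_path is None) == (url is None):
        raise ValueError("provide exactly one of file_path or url")
    if file_path is not None:
        args = ["-F", f"file=@{file_path}"]  # option A: file upload
    else:
        args = ["-F", f"url={url}"]  # option B: URL to scrape
    if language is not None:
        args += ["-F", f"language={language}"]
    return args
```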
Delete a document and its embeddings.
Test semantic search against a knowledge base.
| Parameter | Type | Required | Description |
|---|---|---|---|
| query | string | required | Search query |
curl -X POST https://ai.izitechnologies.com/api/kb/1/query -H "Authorization: Bearer aigw_..." \
-H "Content-Type: application/json" -d '{"query":"pricing"}'
# Response: {"results":[{"content":"...","source":"pricing.pdf","score":0.89}]}
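When testing retrieval quality, it helps to keep only the strongest matches. This sketch assumes a higher `score` means a closer match, which the sample value suggests but does not confirm. The threshold and limit defaults are arbitrary, and `top_matches` is an illustrative name.

```python
def top_matches(response: dict, min_score: float = 0.5, limit: int = 3) -> list:
    """Return up to `limit` results at or above `min_score`, best first."""
    hits = [
        r for r in response.get("results", [])
        if r.get("score", 0.0) >= min_score
    ]
    return sorted(hits, key=lambda r: r["score"], reverse=True)[:limit]
```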
Pre-built and custom AI agents with specialized system prompts and KB access.
List agents available to the current user (public + own private agents).
# Response: [{"id":1,"name":"Customer Service","description":"...","system_prompt":"...","icon":"chat","color":"#6366f1","capabilities":"[\"KB\",\"Voice\"]","is_public":true}]
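In the sample response, `capabilities` is a JSON array encoded as a string inside the JSON body, so it needs a second decoding step. A minimal sketch follows; `decode_capabilities` is a hypothetical helper, and accepting an already-decoded list is a defensive assumption.

```python
import json


def decode_capabilities(agent: dict) -> list:
    """Return the agent's capability tags as a Python list."""
    raw = agent.get("capabilities") or "[]"
    # The sample shows a JSON-encoded string; accept a plain list too.
    return json.loads(raw) if isinstance(raw, str) else list(raw)
```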
Get a single agent's details.
Create a custom agent.
| Parameter | Type | Required | Description |
|---|---|---|---|
| name | string | required | Agent name |
| system_prompt | string | required | System instructions for the AI |
| description | string | optional | Description |
| icon | string | optional | Icon: bot, chat, mic, chart, pen, code, globe |
| color | string | optional | Hex color. Default: #6366f1 |
| capabilities | string | optional | JSON array of capability tags |
| is_public | bool | optional | Make visible to all users (admin only) |
Update an agent. Same parameters as create.
Delete an agent. Must be owner or admin.
Chat with an agent. SSE streaming (same format as /api/chat). The agent's system prompt is prepended, and RAG context from your KBs is auto-injected.
| Parameter | Type | Required | Description |
|---|---|---|---|
| messages | array | required | Conversation messages (same format as /api/chat) |
curl -N -X POST https://ai.izitechnologies.com/api/agents/1/chat \
-H "Authorization: Bearer aigw_..." \
-H "Content-Type: application/json" \
-d '{"messages":[{"role":"user","content":"What is the refund policy?"}]}'
# SSE stream: event: token → event: done (same as /api/chat)
Full-duplex streaming voice interaction. Send audio chunks, receive live transcription, LLM token streaming, and TTS audio — all over a single WebSocket.
Client to server messages:
| Type | Format | Description |
|---|---|---|
| JSON | {"type":"start","voice_id":"preset_aria","language":"en"} | Configure session |
| Binary | Audio bytes (WebM/Opus or PCM 16kHz) | Audio input chunks |
| JSON | {"type":"stop"} | End of utterance — triggers pipeline |
| JSON | {"type":"end"} | Close session |
Server to client messages:
| Type | Format | Description |
|---|---|---|
| JSON | {"type":"stt","text":"hello world","partial":false} | STT transcript |
| JSON | {"type":"llm","token":"Hi"} | LLM streaming token |
| JSON | {"type":"llm_done"} | LLM finished |
| Binary | WAV audio bytes | TTS audio chunk |
| JSON | {"type":"tts_done"} | TTS finished |
| JSON | {"type":"error","message":"..."} | Error |
wscat -c "wss://ai.izitechnologies.com/ws/iva?token=aigw_..."
→ {"type":"start","voice_id":"preset_aria","language":"en"}
→ [binary audio frames...]
→ {"type":"stop"}
← {"type":"stt","text":"hello","partial":false}
← {"type":"llm","token":"Hi"}
← {"type":"llm","token":" there!"}
← {"type":"llm_done"}
← [binary TTS audio]
← {"type":"tts_done"}
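On the client side, the server frames from the session above can be folded into a per-turn state object. The sketch assumes a WebSocket library that delivers text frames as `str` and binary frames as `bytes`, and it treats `partial: true` transcripts as interim results to skip. `TurnState` and `handle_message` are illustrative names.

```python
import json
from dataclasses import dataclass, field


@dataclass
class TurnState:
    transcript: str = ""
    reply: str = ""
    audio: bytearray = field(default_factory=bytearray)
    finished: bool = False


def handle_message(state: TurnState, message) -> TurnState:
    """Fold one server frame (str for JSON, bytes for audio) into the state."""
    if isinstance(message, (bytes, bytearray)):
        state.audio.extend(message)  # binary frame: TTS audio chunk
        return state
    msg = json.loads(message)
    kind = msg.get("type")
    if kind == "stt" and not msg.get("partial", False):
        state.transcript = msg["text"]
    elif kind == "llm":
        state.reply += msg["token"]
    elif kind == "tts_done":
        state.finished = True
    elif kind == "error":
        raise RuntimeError(msg.get("message", "unknown error"))
    return state  # other types such as llm_done need no action here
```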
List your API keys.
Create a new API key.
| Parameter | Type | Required | Description |
|---|---|---|---|
| name | string | required | Key label (e.g. "My App") |
curl -X POST https://ai.izitechnologies.com/api/keys -H "Authorization: Bearer aigw_..." \
-H "Content-Type: application/json" -d '{"name":"Production"}'
# Response: {"id":1,"key":"aigw_abc123...","name":"Production"}
# ⚠️ The key value is only shown once — copy it immediately.
Delete an API key permanently.
List all users.
Create a user. Params: email, password, name, role (customer/admin), rate_limit.
Update a user. Params: name, role, active, rate_limit.
Deactivate a user.
Usage statistics. Query param: period (24h, 7d, 30d).
# Response: [{"user_id":1,"email":"...","endpoint":"/api/chat","count":42}]
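The usage rows are per user and per endpoint, so a dashboard would usually aggregate them. This sketch sums call counts per endpoint and assumes each row carries the `endpoint` and `count` fields shown in the sample; `calls_per_endpoint` is a hypothetical helper.

```python
from collections import Counter


def calls_per_endpoint(rows: list) -> Counter:
    """Sum the call counts of all users for each endpoint."""
    totals = Counter()
    for row in rows:
        totals[row["endpoint"]] += row["count"]
    return totals
```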
List all API keys across all users.
Generate a key for any user. Params: user_id, name.
Delete any API key.
List all knowledge bases across all users.
38 API endpoints · SSE streaming · WebSocket · REST