Models - Hamsa API

Learn about the models that power the Hamsa API.

Flagship models

Text to Speech

Jobs API

Async TTS via /v1/jobs/text-to-speech

Natural-sounding output optimized for Arabic dialects

Multiple Arabic dialects + English

Async job-based — result delivered via webhook

Realtime API

Sync TTS via /v1/realtime/tts

Low latency — returns WAV audio directly

Arabic dialects + English

Optimized for conversational AI and voice agents

Speech to Text

Batch API

Async STT via /v1/jobs/transcribe

High accuracy transcription for Arabic dialects

Word-level timestamps

Speaker diarization support

Async job-based — result delivered via webhook

Realtime API

Sync STT via /v1/realtime/stt

Arabic dialects + English

Base64-encoded audio input

Returns transcription directly

End-of-speech detection


Models overview

The Hamsa API offers audio processing optimized for the Arabic language, with support for multiple dialects and English.

Endpoint	Description	Languages
`/v1/jobs/text-to-speech`	Async TTS — job-based with webhook delivery	Arabic dialects, English
`/v1/realtime/tts`	Sync TTS — returns WAV audio directly	Arabic dialects, English
`/v1/jobs/transcribe`	Async STT — job-based with webhook delivery	Arabic, English
`/v1/realtime/stt`	Sync STT — returns transcription directly	Arabic, English

Hamsa TTS — Jobs API

The Jobs API (/v1/jobs/text-to-speech) is an async TTS endpoint. It creates a job and delivers the audio result via webhook. It is best suited to batch processing and media content generation.

Use cases:

Content Creation: Generate Arabic audio content, podcasts, and videos
Accessibility: Audio versions of written Arabic content
E-Learning: Educational content in Arabic with natural pronunciation
Media Production: Professional-quality voiceovers

Parameters: text, voiceId, webhookUrl, webhookAuth → See the TTS Quickstart for examples.
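A minimal sketch of how a Jobs API request might be assembled, assuming a JSON body whose keys match the parameter names above. The base URL, the `Authorization` header scheme, and the structure of `webhookAuth` are not stated on this page, so the values below are placeholders; the TTS Quickstart is the authority on the real ones.

```python
import json
import urllib.request

# Assumption: placeholder host; this page does not state the API base URL.
BASE_URL = "https://api.example.com"


def build_tts_job_request(api_key, text, voice_id, webhook_url, webhook_auth=None):
    """Build (but do not send) a POST request for /v1/jobs/text-to-speech."""
    body = {"text": text, "voiceId": voice_id, "webhookUrl": webhook_url}
    if webhook_auth is not None:
        # Passed through unchanged; its expected structure is not documented here.
        body["webhookAuth"] = webhook_auth
    return urllib.request.Request(
        url=BASE_URL + "/v1/jobs/text-to-speech",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: header name and scheme are illustrative only.
            "Authorization": "Token " + api_key,
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would submit the job. Because the endpoint is asynchronous, the audio would arrive later at `webhookUrl` rather than in that response.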

Hamsa TTS — Realtime API

The Realtime API (/v1/realtime/tts) returns WAV audio directly in the response. It is designed for real-time applications and voice agents.

Use cases:

Voice Agents: Real-time voice agents and phone calls
Interactive Applications: Chatbots requiring immediate voice response
Live Conversations: Conversational AI applications

Parameters: text, speaker, dialect, mulaw

Supported dialects

Code	Dialect	Example voices
`pls`	Palestinian	Amjad, Layan
`egy`	Egyptian	Mariam, Samir
`syr`	Syrian	Dalal, Mais
`irq`	Iraqi	Lyali, Fatma
`jor`	Jordanian	Lana, Jasem
`leb`	Lebanese	Carla, Majd
`ksa`	Saudi	Hiba, Fahd
`uae`	Emirati	Salma, Dima
`bah`	Bahraini	Mazen, Ruba
`qat`	Qatari	Deema, Faisal
`kuw`	Kuwaiti	Mai, Hatem
`oma`	Omani	Aisha, Jaber
`msa`	Modern Standard Arabic	Salem, Tamim
`ar-sa`	Arabic – Gulf	Khalid, Rahma
`en`	English	Emma, James

→ See the TTS Quickstart for examples.
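As an illustration, the request body for the Realtime API could be assembled and checked against the dialect table before sending. This is a sketch only: it assumes a JSON body keyed by the parameter names above and that `mulaw` is a boolean flag, neither of which this page confirms. The function name is hypothetical.

```python
# Dialect codes copied from the "Supported dialects" table.
DIALECT_CODES = {
    "pls", "egy", "syr", "irq", "jor", "leb", "ksa", "uae",
    "bah", "qat", "kuw", "oma", "msa", "ar-sa", "en",
}


def build_realtime_tts_payload(text, speaker, dialect, mulaw=False):
    """Return a dict intended as the body of a /v1/realtime/tts request."""
    if dialect not in DIALECT_CODES:
        raise ValueError(f"unknown dialect code: {dialect!r}")
    # Assumption: mulaw is a boolean; this page does not give its type.
    return {"text": text, "speaker": speaker, "dialect": dialect, "mulaw": mulaw}
```

Validating the code locally catches typos such as `"sa"` or `"eg"` before a request is spent on them.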

Hamsa STT — Batch API

The Batch API (/v1/jobs/transcribe) is an async STT endpoint. Submit a media URL and receive the transcription via webhook or polling. Choose from two models:

Model ID	Best for
`Hamsa-General-V2.0`	General-purpose — media, podcasts, pre-recorded content
`Hamsa-Conversational-V1.0`	Conversational audio — meetings, calls, dialogues

Use cases:

Transcription Services: Convert Arabic audio/video content to text
Meeting Documentation: Capture and document Arabic conversations with speaker identification
Media Subtitling: Generate SRT subtitles for Arabic media content
Content Analysis: Process and index Arabic audio content

Key features:

Word-level timestamps for each transcribed segment
Speaker diarization for multi-speaker audio
Automatic Arabic dialect detection (set language to ar)
SRT subtitle export with configurable formatting
Automatic punctuation and formatting

Parameters: mediaUrl, model, language, webhookUrl, returnSrtFormat, srtOptions → See the STT Quickstart for examples.
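The parameters above might be combined into a request body along these lines. This is a hedged sketch: the function name, the default values, and the assumption that `returnSrtFormat` is a boolean and `srtOptions` an object are illustrative and not taken from this page.

```python
# Model IDs copied from the table above.
MODELS = {"Hamsa-General-V2.0", "Hamsa-Conversational-V1.0"}


def build_transcribe_payload(media_url, model="Hamsa-General-V2.0", language="ar",
                             webhook_url=None, return_srt=False, srt_options=None):
    """Return a dict intended as the body of a /v1/jobs/transcribe request."""
    if model not in MODELS:
        raise ValueError(f"unknown model ID: {model!r}")
    payload = {"mediaUrl": media_url, "model": model, "language": language}
    if webhook_url is not None:
        # Without a webhook, the result would be retrieved by polling instead.
        payload["webhookUrl"] = webhook_url
    if return_srt:
        payload["returnSrtFormat"] = True
        if srt_options is not None:
            # Passed through unchanged; the option names are not listed here.
            payload["srtOptions"] = srt_options
    return payload
```

Leaving `language` at `"ar"` corresponds to the automatic dialect detection described under Key features.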

Hamsa STT — Realtime API

The Realtime API (/v1/realtime/stt) accepts base64-encoded audio and returns the transcription directly. For streaming, use the WebSocket API.

Use cases:

Voice Agents: Real-time speech recognition for conversational AI
Live Call Transcription: Transcribe Arabic calls in real time
Interactive Applications: Immediate transcription for chatbots and voice interfaces

Key features:

Synchronous — returns transcription in the response
End-of-speech detection with configurable threshold
Arabic and English language support

Parameters: audioBase64, language, isEosEnabled, eosThreshold → See the STT Quickstart for examples.
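Since this endpoint takes audio as base64 text, the raw bytes need encoding before they go into the body. The sketch below shows one way to do that with the standard library. The function name, the defaults, and the types of `isEosEnabled` (boolean) and `eosThreshold` (number) are assumptions.

```python
import base64


def build_realtime_stt_payload(audio_bytes, language="ar",
                               eos_enabled=False, eos_threshold=None):
    """Return a dict intended as the body of a /v1/realtime/stt request."""
    payload = {
        # b64encode returns bytes; decode to str so the dict is JSON-serializable.
        "audioBase64": base64.b64encode(audio_bytes).decode("ascii"),
        "language": language,
        "isEosEnabled": eos_enabled,
    }
    if eos_enabled and eos_threshold is not None:
        # Only meaningful when end-of-speech detection is switched on.
        payload["eosThreshold"] = eos_threshold
    return payload
```

In practice `audio_bytes` would typically come from reading a recorded file in binary mode, for example `open(path, "rb").read()`.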

Model selection guide

Requirements

Batch / media content

Use the Jobs API (/v1/jobs/text-to-speech) for async processing with webhook delivery.

Real-time / voice agents

Use the Realtime API (/v1/realtime/tts) or WebSocket for low-latency streaming.

Arabic Dialects

Both TTS endpoints support the Arabic dialects listed under Supported dialects, plus English. Choose based on latency requirements.

Use case

Content creation

Use the Jobs API for professional Arabic content, media, and video narration.

Voice Agents

Use the Realtime API / WebSocket for real-time conversational applications.

Transcription

Use the Batch API (/v1/jobs/transcribe) with Hamsa-General-V2.0 for media transcription or Hamsa-Conversational-V1.0 for conversational audio.
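The guidance in this section can be condensed into a small lookup. The helper names below are hypothetical; the paths and model IDs are the ones documented on this page.

```python
def choose_endpoint(task, realtime):
    """Pick an endpoint path. task is 'tts' or 'stt'; realtime is a bool."""
    table = {
        ("tts", False): "/v1/jobs/text-to-speech",  # batch / media content
        ("tts", True): "/v1/realtime/tts",          # voice agents
        ("stt", False): "/v1/jobs/transcribe",      # media transcription
        ("stt", True): "/v1/realtime/stt",          # live recognition
    }
    return table[(task, realtime)]


def choose_batch_stt_model(conversational):
    """Pick a Batch API model ID based on the kind of audio."""
    if conversational:
        return "Hamsa-Conversational-V1.0"
    return "Hamsa-General-V2.0"
```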

Character limits

Endpoint	Character limit
WebSocket TTS	2,000 characters per message

For longer content, consider splitting the input into multiple requests.
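One possible way to split long input is to break at whitespace so that no chunk exceeds the limit. The sketch below does that; it assumes the limit is counted in Unicode code points (what Python's `len` measures), which this page does not specify. Runs of whitespace, including newlines, are collapsed to single spaces.

```python
def split_text(text, limit=2000):
    """Split text into chunks of at most `limit` characters, breaking at whitespace."""
    chunks = []
    current = ""
    for word in text.split():
        # A single word longer than the limit has no whitespace to break at,
        # so it is cut into limit-sized pieces.
        while len(word) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:limit])
            word = word[limit:]
        candidate = word if not current else current + " " + word
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be sent as its own message. Breaking at sentence boundaries instead of arbitrary whitespace would likely give more natural prosody, at the cost of a slightly more involved splitter.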

Audio duration limits

Endpoint	Audio duration limit	File size limit
Batch API (`/v1/jobs/transcribe`)	60 minutes	500 MB
Realtime API (`/v1/realtime/stt`)	Per-request	N/A
WebSocket (`/v1/realtime/ws`)	Streaming	N/A
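A client could check a file against the Batch API limits before submitting it. The sketch below assumes that "500 MB" means decimal megabytes and that both limits are inclusive; neither is specified on this page, and the function name is hypothetical.

```python
MAX_DURATION_SECONDS = 60 * 60         # 60 minutes
MAX_FILE_BYTES = 500 * 1000 * 1000     # Assumption: 1 MB = 1,000,000 bytes


def batch_limit_problems(duration_seconds, size_bytes):
    """Return a list of human-readable problems; empty if the file fits."""
    problems = []
    if duration_seconds > MAX_DURATION_SECONDS:
        problems.append("audio is longer than 60 minutes")
    if size_bytes > MAX_FILE_BYTES:
        problems.append("file is larger than 500 MB")
    return problems
```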

Plans and Usage Limits

Your subscription plan determines your monthly usage limits and concurrent call capacity.

Plan Comparison

Plan	Price	Credits	Voice Agent	Speech to Text	Text to Speech	Concurrency	KB Storage
Free	$0/mo	50	9 min	50 min	25 min	1	1 MB
Starter	$5/mo	100	17 min	100 min	50 min	1	5 MB
Creator	$15/mo	500	84 min	500 min	250 min	2	10 MB
Pro	$100/mo	5,000	834 min	5,000 min	2,500 min	5	50 MB
Business	$320/mo	20,000	3,334 min	20,000 min	10,000 min	10	100 MB
Enterprise	Custom	Custom	Unlimited	Unlimited	Unlimited	Unlimited	300 MB

Plan Features

Feature	Free	Starter	Creator	Pro	Business	Enterprise
Access to All Models	✓	✓	✓	✓	✓	✓
Fine-tuned AI Models	-	-	-	-	✓	✓
Basic Cloud Support	-	-	-	✓	-	-
Full Cloud Support	-	-	-	-	✓	✓
On-Premise Solution	-	-	-	-	-	✓

To increase your usage limits and concurrent calls, upgrade your subscription plan. Enterprise customers can request custom limits by contacting sales.

API requests per minute vs concurrent requests

API requests per minute and concurrent requests are different metrics. The relationship between them depends on how long each request takes and how the requests are batched.

Example 1: Spaced requests. If you send 60 requests per minute, each taking 1 second to complete, and space them 1 second apart, the maximum number of concurrent requests is 1 and the average is 1.

Example 2: Batched requests. If you send 60 requests per minute, each taking 3 seconds to complete, and fire them all at once, the maximum number of concurrent requests is 60 and the average over the minute is 3.

Because our system cares about concurrency, requests per minute matter less than how long each request takes and the pattern in which requests are sent.
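The two examples can be reproduced with a short calculation. This is an illustrative sketch, not part of the API: it assumes every request has the same duration and falls entirely inside the measurement window, and it computes the average as total request-seconds divided by the window length.

```python
def concurrency_stats(starts, duration, window=60.0):
    """Return (max_concurrent, average_concurrent) for equal-length requests."""
    events = []
    for s in starts:
        events.append((s, 1))              # request begins
        events.append((s + duration, -1))  # request ends
    # At the same instant, process ends (-1) before starts (+1) so that
    # back-to-back requests are not counted as overlapping.
    events.sort(key=lambda e: (e[0], e[1]))
    current = peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    average = len(starts) * duration / window
    return peak, average


# Example 1: 60 one-second requests, one per second.
print(concurrency_stats(list(range(60)), 1))   # (1, 1.0)
# Example 2: 60 three-second requests, all fired at t=0.
print(concurrency_stats([0] * 60, 3))          # (60, 3.0)
```

Both workloads are 60 requests per minute, yet the second needs a concurrency allowance of 60 while the first needs only 1.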
