Hamsa provides two options for real-time speech-to-text:
- Realtime API (`POST /v1/realtime/stt`): Send base64-encoded audio and receive the transcription directly in the response. Best for short audio clips.
- WebSocket (`wss://api.tryhamsa.com/v1/realtime/ws`): Persistent bidirectional connection for streaming audio. Best for live conversations and continuous transcription.
Realtime API
The Realtime API accepts base64-encoded audio and returns the transcription synchronously. See the Quickstart for usage examples.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| `audioBase64` | string | Yes | Base64-encoded audio data (WAV format) |
| `language` | string | No | Language code: `ar` (default) or `en` |
| `isEosEnabled` | boolean | No | Enable end-of-speech detection (default: `false`) |
| `eosThreshold` | number | No | End-of-speech detection threshold, 0.0-1.0 (default: 0.3) |
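
As a minimal sketch of how these parameters fit together, the Python helper below builds a request body from raw WAV bytes. The helper name `build_stt_payload` is hypothetical, and sending the parameters as a JSON body is an assumption; the Quickstart remains the authoritative usage reference. Authentication and the HTTP call itself are left out.

```python
import base64
import json

REALTIME_STT_PATH = "/v1/realtime/stt"  # path taken from the endpoint list on this page


def build_stt_payload(wav_bytes, language="ar", is_eos_enabled=False, eos_threshold=0.3):
    """Build a request body from raw WAV bytes (hypothetical helper).

    Field names and defaults mirror the parameter table; the JSON shape
    itself is an assumption.
    """
    if language not in ("ar", "en"):
        raise ValueError("language must be 'ar' or 'en'")
    if not 0.0 <= eos_threshold <= 1.0:
        raise ValueError("eosThreshold must be between 0.0 and 1.0")
    return {
        "audioBase64": base64.b64encode(wav_bytes).decode("ascii"),
        "language": language,
        "isEosEnabled": is_eos_enabled,
        "eosThreshold": eos_threshold,
    }


# Placeholder bytes stand in for a real WAV file read from disk.
payload = build_stt_payload(b"RIFF-placeholder-WAVE", language="en", is_eos_enabled=True)
body = json.dumps(payload)  # string to send as the POST body
```

Validating `language` and `eosThreshold` on the client side catches out-of-range values before a request is made.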
WebSocket Streaming
The WebSocket API provides a persistent connection for streaming audio in real time. Send audio chunks as you record and receive transcription results as they become available.
Endpoint
`wss://api.tryhamsa.com/v1/realtime/ws`
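
To illustrate sending audio in chunks, the Python sketch below slices a recorded buffer into short, base64-encoded pieces ready to be sent over the connection. The helper name `iter_base64_chunks`, the 16 kHz 16-bit mono format, and the 100 ms chunk length are all assumptions for illustration. Base64 encoding is borrowed from the Realtime API and may not match what the WebSocket endpoint expects. Opening the socket and framing each message are omitted.

```python
import base64

WS_URL = "wss://api.tryhamsa.com/v1/realtime/ws"  # endpoint from this page


def iter_base64_chunks(pcm_bytes, sample_rate=16000, sample_width=2, chunk_ms=100):
    """Yield base64-encoded slices of about chunk_ms milliseconds each.

    The audio format and chunk duration are assumed values, not documented ones.
    """
    frames_per_chunk = sample_rate * chunk_ms // 1000
    bytes_per_chunk = frames_per_chunk * sample_width  # stays aligned to whole samples
    if bytes_per_chunk <= 0:
        raise ValueError("chunk_ms is too small for the given sample rate")
    for start in range(0, len(pcm_bytes), bytes_per_chunk):
        chunk = pcm_bytes[start:start + bytes_per_chunk]
        yield base64.b64encode(chunk).decode("ascii")


# 250 ms of silence at the assumed format: two full chunks and one partial chunk.
silence = bytes(8000)
chunks = list(iter_base64_chunks(silence))
```

Shorter chunks reduce latency at the cost of more messages; the value to use depends on the service's limits.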
Authentication
Authenticate via query parameter or header:
Request Format
Send a JSON message with `type: "stt"`: