Python SDK — s2t
Speech-to-text from Python over two transports: REST for batch jobs and WebSocket for streaming.
Install
Requires Python 3.10 or later. The package is imported as kotoba.
For live-microphone examples, install with the optional mic extra
(pulls in sounddevice, which needs PortAudio on the system):
Configure
KotobaClient() reads its credentials and per-route URLs from environment
variables. For ASR you need:
Pass them in code instead if you’d rather not rely on the environment:
See Authentication for the full handshake details.
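Because the client reads its configuration from the environment, a pre-flight check can report a missing variable before any request is made. The sketch below is an illustration, not part of the SDK: missing_env is a hypothetical helper, and KOTOBA_ASR_REST_URL is the only variable name taken from this page, so extend the tuple with whatever else your deployment requires.

```python
import os

# Only KOTOBA_ASR_REST_URL is named on this page; add the other variables
# your deployment needs (they are not listed here, so none are guessed).
REQUIRED_VARS = ("KOTOBA_ASR_REST_URL",)


def missing_env(names=REQUIRED_VARS, environ=None):
    """Return the names that are unset or empty in the environment."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


def describe_environment(environ=None):
    """Build a one-line, human-readable summary of the check."""
    missing = missing_env(environ=environ)
    if missing:
        return "Missing environment variables: " + ", ".join(missing)
    return "ASR environment looks complete"
```

Calling describe_environment() at start-up and logging the result is usually enough to catch a misconfigured shell or container.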
Pick a transport
REST batch (transcribe)
POSTs the file to KOTOBA_ASR_REST_URL, polls until the job is done,
and returns the final transcript:
transcribe() accepts anything soundfile can decode (WAV / FLAC / OGG
/ MP3 / …). When with_timestamps=True, result.segments is populated
with Segment(text, start, end) entries. Polling knobs:
TranscriptionError is raised on a server-reported failure, and TimeoutError
is raised if the polling deadline elapses.
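Timestamped segments are easy to turn into captions. The following is a minimal sketch that renders Segment(text, start, end) entries as SubRip (SRT) text. The Segment class here is a local stand-in with the same three fields, and it assumes start and end are seconds expressed as floats, which this page does not state; adjust the conversion if the SDK uses another unit.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    # Stand-in mirroring the Segment(text, start, end) shape described above.
    text: str
    start: float  # assumed: seconds from the start of the audio
    end: float    # assumed: seconds from the start of the audio


def srt_timestamp(seconds: float) -> str:
    """Format seconds as HH:MM:SS,mmm (the SubRip timestamp layout)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments) -> str:
    """Render an iterable of segments as SubRip caption text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}\n"
            f"{seg.text.strip()}\n"
        )
    # Caption blocks are separated by a blank line.
    return "\n".join(blocks)
```

Any object with text, start, and end attributes works, so result.segments could be passed in directly if the unit assumption holds.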
REST low-level (submit_job / get_job)
If you’d rather poll yourself — for example to drive a job queue or surface progress in a UI — call the two REST endpoints directly:
JobStatus.state is one of processing | done | error.
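A hand-rolled polling loop around those three states might look like the sketch below. It is deliberately decoupled from the SDK: get_job is passed in as a callable, the JobStatus dataclass is a local stand-in (only the state field comes from this page; text is hypothetical), and the poll_interval, timeout, and on_poll parameters are names chosen for this example rather than SDK options.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class JobStatus:
    # Stand-in for the SDK's JobStatus; only `state` is documented above.
    state: str                   # "processing" | "done" | "error"
    text: Optional[str] = None   # hypothetical field for this sketch


def wait_for_job(
    get_job: Callable[[str], JobStatus],
    job_id: str,
    poll_interval: float = 1.0,
    timeout: float = 300.0,
    on_poll: Optional[Callable[[JobStatus], None]] = None,
) -> JobStatus:
    """Poll get_job(job_id) until the job leaves the 'processing' state."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_job(job_id)
        if on_poll is not None:
            on_poll(status)  # e.g. refresh a progress indicator in a UI
        if status.state in ("done", "error"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still processing after {timeout}s")
        time.sleep(poll_interval)
```

The caller decides what to do with an error state, which keeps the loop reusable for a job queue where failures are retried rather than raised.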
WebSocket streaming (transcribe_stream)
For the realtime / mic case — where transcript deltas should surface
while audio is still being captured — pass a generator of PCM16 LE
mono bytes to transcribe_stream(...). The feeder and receiver run
concurrently, so the first delta can fire before your source is
exhausted:
Optional knobs on both stream(...) and transcribe_stream(...):
- language: "en", "ja", "ko", or "zh"
- sample_rate: defaults to 24 kHz; the session resamples internally
- keywords: list of hotword biases, e.g. ["Kotobatech", "LLM"]
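The generator of PCM16 LE mono bytes that transcribe_stream(...) consumes can be built with the standard library alone. The sketch below reads a 16-bit mono WAV file and yields fixed-duration chunks; pcm16_chunks and its chunk_ms parameter are hypothetical names for this example, and a live microphone source would replace the file reads with capture callbacks.

```python
import wave
from typing import Iterator


def pcm16_chunks(path: str, chunk_ms: int = 100) -> Iterator[bytes]:
    """Yield PCM16 LE mono chunks of roughly chunk_ms from a WAV file."""
    with wave.open(path, "rb") as wav:
        # WAV stores 16-bit samples as signed little-endian, so the raw
        # frames are already PCM16 LE once the format checks pass.
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono WAV input")
        frames_per_chunk = max(1, wav.getframerate() * chunk_ms // 1000)
        while True:
            chunk = wav.readframes(frames_per_chunk)
            if not chunk:
                break
            yield chunk
```

If the file's sample rate differs from the 24 kHz default, passing the matching sample_rate alongside the generator should let the session resample as described above.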
Async
Every ASR entry point has an async equivalent via AsyncKotobaClient:
The sync wrapper runs an asyncio loop on a background daemon thread, so the
underlying transport is identical and only the call style differs.
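The background-loop pattern described here can be illustrated with the standard library. This is a generic sketch of the technique, not the SDK's actual implementation: BackgroundLoop and fake_transcribe are hypothetical names, and the coroutine stands in for a real async call.

```python
import asyncio
import threading


class BackgroundLoop:
    """Run coroutines from blocking code on a loop in a daemon thread."""

    def __init__(self) -> None:
        self._loop = asyncio.new_event_loop()
        self._thread = threading.Thread(
            target=self._loop.run_forever, daemon=True
        )
        self._thread.start()

    def run(self, coro, timeout=None):
        """Submit a coroutine to the loop and block until it finishes."""
        future = asyncio.run_coroutine_threadsafe(coro, self._loop)
        return future.result(timeout)

    def close(self) -> None:
        """Stop the loop, wait for the thread, then release the loop."""
        self._loop.call_soon_threadsafe(self._loop.stop)
        self._thread.join()
        self._loop.close()


async def fake_transcribe(text: str) -> str:
    # Hypothetical stand-in for an async SDK call.
    await asyncio.sleep(0.01)
    return text.upper()
```

A blocking caller would write BackgroundLoop().run(fake_transcribe("hello")), while async code awaits the same coroutine directly, which is the sense in which only the call style differs.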
What’s in the box (ASR)
See the API reference for the on-the-wire protocol that this SDK wraps: Live (WebSocket) · Batch (REST).