API Reference
Complete reference for the Scriptivox transcription API. All endpoints require an API key passed via the Authorization header (with or without Bearer prefix) or the X-Api-Key header.
Base URL: https://api.scriptivox.com/v1
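The three accepted ways of sending the key can be sketched as a small helper. The function name `auth_headers` and its `style` argument are illustrative only and not part of the API; just the header names and the optional Bearer prefix come from the description above.

```python
def auth_headers(api_key: str, style: str = "authorization") -> dict:
    """Build request headers for one of the documented key-passing styles.

    `style` is a hypothetical switch used only in this sketch:
      "authorization" -> Authorization: <key>
      "bearer"        -> Authorization: Bearer <key>
      "x-api-key"     -> X-Api-Key: <key>
    """
    if style == "authorization":
        return {"Authorization": api_key}
    if style == "bearer":
        return {"Authorization": f"Bearer {api_key}"}
    if style == "x-api-key":
        return {"X-Api-Key": api_key}
    raise ValueError(f"unknown style: {style!r}")


BASE_URL = "https://api.scriptivox.com/v1"
headers = auth_headers("sk_live_YOUR_KEY", style="bearer")
```

Any one of the three forms should be sufficient on its own; there is no need to send both headers.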
Transcribe
Send audio for transcription. You can either pass a URL (we download it) or upload your own file.
From a URL
The simplest path — one POST request. We download the file, validate it, and start transcription automatically. Supports Google Drive, Dropbox, and OneDrive sharing links.
POST /v1/transcribe
Start a transcription from a public URL. The file is downloaded and validated in the background. Poll GET /v1/transcribe/{id} for status updates. Duration and cost are determined after download.
Authorization: sk_live_YOUR_KEY
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| url | string | Required | Public URL to an audio/video file (http or https). Supports Google Drive, Dropbox, and OneDrive sharing links. Max 2048 characters. |
| language | string | Optional | ISO 639-1 language code (e.g. "en", "es", "fr"). Omit for auto-detection. Warning: forcing a wrong language may produce a translation instead of a transcription. |
| diarize | boolean | Optional | Enable speaker diarization to identify who said what. Default: false. |
| speaker_count | integer | Optional | Expected number of speakers (1–50). Requires diarize to be true. If omitted, the model auto-detects the number of speakers. |
| align | boolean | Optional | Enable word-level timestamps with start/end times and confidence scores for every word. Default: false. |
| webhook_url | string | Optional | URL to receive completion/failure webhook. HTTPS recommended. |
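The constraints in the table above (http/https scheme, the 2048-character limit, the speaker_count range and its dependence on diarize) can be checked client-side before the request is sent. The sketch below is one way to do that; `build_transcribe_payload` is a hypothetical helper rather than part of any Scriptivox SDK, and the server remains the authority on validation.

```python
from typing import Optional


def build_transcribe_payload(
    url: str,
    language: Optional[str] = None,
    diarize: bool = False,
    speaker_count: Optional[int] = None,
    align: bool = False,
    webhook_url: Optional[str] = None,
) -> dict:
    """Check the documented constraints and return a JSON-ready dict."""
    if not url.startswith(("http://", "https://")):
        raise ValueError("url must use http or https")
    if len(url) > 2048:
        raise ValueError("url must be at most 2048 characters")
    if speaker_count is not None:
        if not diarize:
            raise ValueError("speaker_count requires diarize=True")
        if not 1 <= speaker_count <= 50:
            raise ValueError("speaker_count must be between 1 and 50")

    payload = {"url": url, "diarize": diarize, "align": align}
    # Optional fields are omitted when unset, so language falls back to
    # auto-detection and the speaker count is detected by the model.
    if language is not None:
        payload["language"] = language
    if speaker_count is not None:
        payload["speaker_count"] = speaker_count
    if webhook_url is not None:
        payload["webhook_url"] = webhook_url
    return payload
```

The returned dict can be passed as the `json=` argument of the POST request.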
Response

    {
      "id": "b2c3d4e5-f6a7-8901-bcde-f12345678901",
      "status": "created",
      "message": "Transcription created. The file will be downloaded and processed. Poll GET /v1/transcribe/{id} for status updates."
    }
Code Examples

    import time

    import requests

    HEADERS = {"Authorization": "sk_live_YOUR_KEY"}

    # Submit the URL for transcription
    resp = requests.post(
        "https://api.scriptivox.com/v1/transcribe",
        headers=HEADERS,
        json={
            "url": "https://example.com/podcast-episode.mp3",
            "diarize": True,
            "language": "en",
        },
    )
    job = resp.json()

    # Poll every 5 seconds until the job reaches a terminal status
    while True:
        result = requests.get(
            f"https://api.scriptivox.com/v1/transcribe/{job['id']}",
            headers=HEADERS,
        ).json()
        if result["status"] in ("completed", "failed"):
            break
        time.sleep(5)
From a file upload
Upload your own file when you need full control or don't have a public URL.
POST /v1/upload
Get a presigned URL to upload an audio or video file. The URL expires in 1 hour. Upload your file to the returned URL with a PUT request, then pass the upload_id to POST /v1/transcribe.
Authorization: sk_live_YOUR_KEY
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| filename | string | Required | Name of the file being uploaded (e.g. "meeting.mp3"). Maximum 255 characters. |
Response

    {
      "upload_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
      "upload_url": "https://storage.supabase.co/...",
      "expires_in": 3600,
      "method": "PUT",
      "headers": {
        "Content-Type": "audio/mpeg"
      }
    }
Code Examples

    import requests

    # Step 1: request a presigned upload URL
    resp = requests.post(
        "https://api.scriptivox.com/v1/upload",
        headers={"Authorization": "sk_live_YOUR_KEY"},
        json={"filename": "meeting.mp3"},
    )
    upload = resp.json()

    # Step 2: PUT the file to the presigned URL using the returned headers
    with open("meeting.mp3", "rb") as f:
        requests.put(upload["upload_url"], headers=upload["headers"], data=f)
POST /v1/transcribe
Start a transcription from an uploaded file. Pass the upload_id from POST /v1/upload. The file is validated in the background. Poll GET /v1/transcribe/{id} for status updates. Duration and cost are determined after validation.
Authorization: sk_live_YOUR_KEY
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| upload_id | string | Required | The upload ID from POST /v1/upload. |
| language | string | Optional | ISO 639-1 language code (e.g. "en", "es", "fr"). Omit for auto-detection. Warning: forcing a wrong language may produce a translation instead of a transcription. |
| diarize | boolean | Optional | Enable speaker diarization to identify who said what. Default: false. |
| speaker_count | integer | Optional | Expected number of speakers (1–50). Requires diarize to be true. If omitted, the model auto-detects the number of speakers. |
| align | boolean | Optional | Enable word-level timestamps with start/end times and confidence scores for every word. Default: false. |
| webhook_url | string | Optional | URL to receive completion/failure webhook. HTTPS recommended. |
Response

    {
      "id": "b2c3d4e5-f6a7-8901-bcde-f12345678901",
      "status": "created",
      "message": "Transcription created. The file will be validated and processed. Poll GET /v1/transcribe/{id} for status updates."
    }
Code Examples

    import time

    import requests

    HEADERS = {"Authorization": "sk_live_YOUR_KEY"}

    # Start transcription of a previously uploaded file
    resp = requests.post(
        "https://api.scriptivox.com/v1/transcribe",
        headers=HEADERS,
        json={
            "upload_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
            "diarize": True,
            "speaker_count": 2,
            "language": "en",
        },
    )
    job = resp.json()

    # Poll every 5 seconds until the job reaches a terminal status
    while True:
        result = requests.get(
            f"https://api.scriptivox.com/v1/transcribe/{job['id']}",
            headers=HEADERS,
        ).json()
        if result["status"] in ("completed", "failed"):
            break
        time.sleep(5)
Get result
GET /v1/transcribe/{id}
Get the status and result of a transcription. Poll this endpoint until status is completed or failed, or use webhooks for real-time notifications.
Authorization: sk_live_YOUR_KEY
Path Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| id | string | Required | The transcription ID returned from POST /v1/transcribe (passed in the URL path) |
Response

    {
      "id": "b2c3d4e5-f6a7-8901-bcde-f12345678901",
      "status": "completed",
      "audio_duration_seconds": 120,
      "file_size_bytes": 1920000,
      "language": "en",
      "diarize": true,
      "speaker_count": 2,
      "align": true,
      "cost_cents": 0.5,
      "source_url": "https://example.com/podcast.mp3",
      "progress": "Transcription completed successfully.",
      "created_at": "2025-01-15T10:30:00Z",
      "started_at": "2025-01-15T10:30:01Z",
      "completed_at": "2025-01-15T10:30:45Z",
      "result": {
        "full_transcript": "Hello, thanks for joining...",
        "language": "en",
        "duration_seconds": 120,
        "speakers": ["SPEAKER 1", "SPEAKER 2"],
        "utterances": [
          {
            "start": 0.5,
            "end": 3.2,
            "text": "Hello, thanks for joining the call today.",
            "speaker": "SPEAKER 1",
            "confidence": 0.95,
            "words": [
              {
                "word": "Hello,",
                "start": 0.5,
                "end": 0.9,
                "confidence": 0.98,
                "speaker": "SPEAKER 1"
              }
            ]
          }
        ]
      }
    }
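Code Examples

    import requests

    # ID returned by POST /v1/transcribe
    transcription_id = "b2c3d4e5-f6a7-8901-bcde-f12345678901"

    resp = requests.get(
        f"https://api.scriptivox.com/v1/transcribe/{transcription_id}",
        headers={"Authorization": "sk_live_YOUR_KEY"},
    )
    result = resp.json()

    # The result object is only read once the job has completed
    if result["status"] == "completed":
        print(result["result"]["full_transcript"])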
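Once a job is completed, the utterances array in the sample response can be turned into a readable, speaker-labelled transcript. This is a minimal sketch that assumes the result object has the shape shown in the sample; `format_timestamp` and `format_utterances` are illustrative names, and falling back to "UNKNOWN" when an utterance has no speaker field is an assumption about non-diarized output, not documented behavior.

```python
from typing import List


def format_timestamp(seconds: float) -> str:
    """Render a second offset as MM:SS.s."""
    minutes, secs = divmod(seconds, 60)
    return f"{int(minutes):02d}:{secs:04.1f}"


def format_utterances(result: dict) -> List[str]:
    """Return one '[start - end] speaker: text' line per utterance."""
    lines = []
    for utt in result.get("utterances", []):
        # Assumption: "speaker" may be missing when diarize was false.
        speaker = utt.get("speaker", "UNKNOWN")
        span = f"{format_timestamp(utt['start'])} - {format_timestamp(utt['end'])}"
        lines.append(f"[{span}] {speaker}: {utt['text']}")
    return lines


sample_result = {
    "utterances": [
        {
            "start": 0.5,
            "end": 3.2,
            "text": "Hello, thanks for joining the call today.",
            "speaker": "SPEAKER 1",
        }
    ]
}
for line in format_utterances(sample_result):
    print(line)
```

In real use, pass the `result` object of the GET response in place of `sample_result`.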
Status values
| Status | Description |
|---|---|
| created | Transcription created, file download/validation starting |
| downloading | File is being downloaded from the provided URL (URL flow only) |
| processing | Audio is being transcribed |
| completed | Transcription finished successfully |
| failed | Something went wrong; check the error field |
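Since only completed and failed are terminal, a polling loop can treat every other status as "keep waiting". The sketch below adds a timeout, which the basic polling examples omit. `poll_until_done` is a hypothetical helper; it takes the status fetch as a callable so the loop stays independent of any HTTP library, and the 5-second interval and one-hour timeout are arbitrary defaults rather than recommended values.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"completed", "failed"}


def poll_until_done(
    fetch: Callable[[], dict],
    interval: float = 5.0,
    timeout: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Call fetch() until the job is completed or failed, or time runs out.

    fetch must return the parsed JSON body of GET /v1/transcribe/{id}.
    sleep and clock are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout
    while True:
        job = fetch()
        if job["status"] in TERMINAL_STATUSES:
            return job
        if clock() >= deadline:
            raise TimeoutError(f"job still {job['status']!r} after {timeout} seconds")
        sleep(interval)
```

With the requests library, `fetch` could be a lambda wrapping `requests.get(...).json()` for the job's URL.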
Balance
GET /v1/balance
Get your current account balance, including reserved amounts for in-progress transcriptions. Note: balance is reserved once the audio duration is determined (at `processing` status), not at `created` or `downloading` status. For URL-based transcriptions, `reserved_cents` will only reflect the job after the file has been downloaded and validated.
Authorization: sk_live_YOUR_KEY
Response

    {
      "balance_cents": 1500,
      "reserved_cents": 100,
      "available_cents": 1400,
      "price_per_hour_cents": 20,
      "estimated_hours_available": 93.3,
      "deposit_url": "https://platform.scriptivox.com/billing",
      "updated_at": "2025-01-15T10:30:00Z"
    }
Code Examples

    import requests

    resp = requests.get(
        "https://api.scriptivox.com/v1/balance",
        headers={"Authorization": "sk_live_YOUR_KEY"},
    )
    balance = resp.json()

    # Amounts are in cents; convert to dollars for display
    print(f"Available: ${balance['available_cents'] / 100:.2f}")
    print(f"Hours remaining: {balance['estimated_hours_available']:.1f}")
Error Codes
All errors follow the same format:

    {
      "error": {
        "code": "ERROR_CODE",
        "message": "Human-readable description"
      }
    }
| HTTP | Code | Description |
|---|---|---|
| 400 | INVALID_REQUEST | Malformed request body or missing required fields |
| 400 | INVALID_FILENAME | Filename is required |
| 400 | INVALID_MEDIA_FORMAT | Unsupported file format |
| 400 | FILE_NOT_UPLOADED | File not found at the upload URL |
| 400 | FILE_TOO_LARGE | File exceeds 5GB limit |
| 400 | DURATION_TOO_LONG | Audio exceeds 10 hour limit |
| 400 | UPLOAD_ALREADY_USED | Upload already used for a transcription |
| 400 | UPLOAD_EXPIRED | Upload URL expired (1 hour TTL) |
| 400 | URL_NOT_ACCESSIBLE | Provided URL is not accessible (4xx/5xx or network error) |
| 401 | INVALID_API_KEY | Invalid or missing API key |
| 401 | API_KEY_REVOKED | API key has been revoked |
| 402 | INSUFFICIENT_BALANCE | Not enough balance for this transcription |
| 402 | ZERO_BALANCE | Balance is $0 — deposit required |
| 404 | UPLOAD_NOT_FOUND | Upload ID does not exist |
| 404 | TRANSCRIPTION_NOT_FOUND | Transcription ID does not exist |
| 429 | RATE_LIMIT_EXCEEDED | Too many requests — see Rate Limits |
| 500 | INTERNAL_ERROR | Server error |
| 500 | PROCESSING_ERROR | Transcription processing failed |
| 500 | DOWNLOAD_FAILED | File download was interrupted (URL flow) |
Important notes
Language parameter behavior
When you specify a language code, the model is forced to interpret the audio as that language. If the audio is in a different language, the model may translate rather than transcribe. For example, setting language: "es" on English audio can produce a Spanish translation of the speech. Omit the language parameter to let the model auto-detect the correct language.
File size limit
Files larger than 5GB are rejected. Note that very large uploads may receive an HTML error page (HTTP 413) from the network layer instead of a JSON error response. Your client should handle non-JSON error responses gracefully.
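One way to handle both the JSON error envelope and a non-JSON body such as an HTML 413 page is to attempt the JSON parse and fall back to the raw text. The sketch below is illustrative: `ApiError` and `parse_error` are hypothetical names, and `NON_JSON_ERROR` is a client-side placeholder code invented for this example, not a code the API returns.

```python
import json


class ApiError(Exception):
    """Client-side wrapper for a failed API response."""

    def __init__(self, status: int, code: str, message: str):
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code
        self.message = message


def parse_error(status: int, body: str) -> ApiError:
    """Turn an error response into an ApiError, whatever the body format."""
    try:
        err = json.loads(body)["error"]
        return ApiError(status, err["code"], err["message"])
    except (ValueError, KeyError, TypeError):
        # The body is not the documented envelope (for example an HTML
        # page from the network layer), so keep a short excerpt instead.
        return ApiError(status, "NON_JSON_ERROR", body[:200].strip())
```

With the requests library this might be called as `parse_error(resp.status_code, resp.text)` whenever `resp.ok` is false.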
Rate Limits
Rate limits are enforced per API key per endpoint. If you exceed the limit, the API returns a 429 status with a Retry-After header indicating when to retry.
| Endpoint | Limit | Notes |
|---|---|---|
| /v1/upload | 60/min | Presigned URL generation |
| /v1/transcribe | 60/min | Job submission |
| /v1/transcribe/{id} | 200/min | Higher limit for polling |
| /v1/balance | 100/min | Balance checks |
Rate Limit Headers
All responses include rate limit headers:
| Header | Description |
|---|---|
| X-RateLimit-Limit | Maximum requests allowed per minute |
| X-RateLimit-Remaining | Requests remaining in current window |
| X-RateLimit-Reset | Unix timestamp when the window resets |
| Retry-After | Seconds until you can retry (only on 429) |
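The headers in the table above are enough to compute how long a client should wait after a 429. The sketch below prefers Retry-After and falls back to X-RateLimit-Reset. `retry_delay_seconds` is a hypothetical helper, the one-second default when neither header is present is an arbitrary choice, and the code assumes Retry-After carries a number of seconds as the table states.

```python
import time
from typing import Mapping, Optional


def retry_delay_seconds(headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """Return how many seconds to wait before retrying a rate-limited call.

    Header lookup is case-sensitive on a plain dict; the headers object of
    the requests library is case-insensitive, so either works there.
    """
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        return max(0.0, float(retry_after))

    reset = headers.get("X-RateLimit-Reset")
    if reset is not None:
        # X-RateLimit-Reset is a Unix timestamp, so subtract the current time.
        current = time.time() if now is None else now
        return max(0.0, float(reset) - current)

    return 1.0  # arbitrary fallback when no hint is available
```

A caller would typically sleep for the returned value and then resend the same request, with a cap on the number of attempts.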
Supported Formats
The following audio and video formats are accepted. Maximum file size is 5GB, maximum duration is 10 hours.
Audio
| Extension | Format |
|---|---|
| .mp3 | MPEG Audio |
| .wav | Waveform Audio |
| .m4a | MPEG-4 Audio |
| .aac | Advanced Audio Coding |
| .ogg | Ogg Vorbis |
| .flac | Free Lossless Audio |
| .opus | Opus |
| .wma | Windows Media Audio |
| .aiff | Audio Interchange |
| .caf | Core Audio Format |
Video
| Extension | Format |
|---|---|
| .mp4 | MPEG-4 Video |
| .mov | QuickTime |
| .avi | Audio Video Interleave |
| .mkv | Matroska Video |
| .webm | WebM |
| .wmv | Windows Media Video |
| .flv | Flash Video |
| .m4v | MPEG-4 Video (iTunes) |
| .3gp | 3GPP |
| .mpeg | MPEG Video |
| .mts | AVCHD |
| .ogv | Ogg Video |
| .ts | MPEG Transport Stream |
| .vob | DVD Video Object |
| .f4v | Flash MP4 Video |
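A client can reject obviously unsupported files before requesting an upload URL by checking the extension against the two tables above. This is a rough sketch: `is_supported_filename` is a hypothetical helper, the 255-character limit is the filename limit stated for the upload endpoint, and matching on the extension alone is an assumption, since the server may inspect file contents as well.

```python
import os

AUDIO_EXTENSIONS = {
    ".mp3", ".wav", ".m4a", ".aac", ".ogg",
    ".flac", ".opus", ".wma", ".aiff", ".caf",
}
VIDEO_EXTENSIONS = {
    ".mp4", ".mov", ".avi", ".mkv", ".webm", ".wmv", ".flv", ".m4v",
    ".3gp", ".mpeg", ".mts", ".ogv", ".ts", ".vob", ".f4v",
}
SUPPORTED_EXTENSIONS = AUDIO_EXTENSIONS | VIDEO_EXTENSIONS
MAX_FILENAME_LENGTH = 255


def is_supported_filename(filename: str) -> bool:
    """Check filename length and extension (case-insensitive)."""
    if not filename or len(filename) > MAX_FILENAME_LENGTH:
        return False
    extension = os.path.splitext(filename)[1].lower()
    return extension in SUPPORTED_EXTENSIONS
```

For example, `is_supported_filename("meeting.mp3")` passes, while a name ending in `.txt` or one with no extension does not.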
Supported Languages
Pass the ISO 639-1 language code in the language parameter. Omit it or pass null for automatic language detection.
| Language | Code |
|---|---|
| Afrikaans | af |
| Albanian | sq |
| Amharic | am |
| Arabic | ar |
| Armenian | hy |
| Assamese | as |
| Azerbaijani | az |
| Bashkir | ba |
| Basque | eu |
| Belarusian | be |
| Bengali | bn |
| Bosnian | bs |
| Breton | br |
| Bulgarian | bg |
| Cantonese | yue |
| Catalan | ca |
| Chinese | zh |
| Croatian | hr |
| Czech | cs |
| Danish | da |
| Dutch | nl |
| English | en |
| Estonian | et |
| Faroese | fo |
| Finnish | fi |
| French | fr |
| Galician | gl |
| Georgian | ka |
| German | de |
| Greek | el |
| Gujarati | gu |
| Haitian Creole | ht |
| Hausa | ha |
| Hawaiian | haw |
| Hebrew | he |
| Hindi | hi |
| Hungarian | hu |
| Icelandic | is |
| Indonesian | id |
| Italian | it |
| Japanese | ja |
| Javanese | jw |
| Kannada | kn |
| Kazakh | kk |
| Khmer | km |
| Korean | ko |
| Lao | lo |
| Latin | la |
| Latvian | lv |
| Lingala | ln |
| Lithuanian | lt |
| Luxembourgish | lb |
| Macedonian | mk |
| Malagasy | mg |
| Malay | ms |
| Malayalam | ml |
| Maltese | mt |
| Maori | mi |
| Marathi | mr |
| Mongolian | mn |
| Myanmar | my |
| Nepali | ne |
| Norwegian | no |
| Nynorsk | nn |
| Occitan | oc |
| Pashto | ps |
| Persian | fa |
| Polish | pl |
| Portuguese | pt |
| Punjabi | pa |
| Romanian | ro |
| Russian | ru |
| Sanskrit | sa |
| Serbian | sr |
| Shona | sn |
| Sindhi | sd |
| Sinhala | si |
| Slovak | sk |
| Slovenian | sl |
| Somali | so |
| Spanish | es |
| Sundanese | su |
| Swahili | sw |
| Swedish | sv |
| Tagalog | tl |
| Tajik | tg |
| Tamil | ta |
| Tatar | tt |
| Telugu | te |
| Thai | th |
| Tibetan | bo |
| Turkish | tr |
| Turkmen | tk |
| Ukrainian | uk |
| Urdu | ur |
| Uzbek | uz |
| Vietnamese | vi |
| Welsh | cy |
| Yiddish | yi |
| Yoruba | yo |