    API Reference

    Complete reference for the Scriptivox transcription API. All endpoints require an API key passed via the Authorization header (with or without Bearer prefix) or the X-Api-Key header.
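
The accepted header styles can be sketched as a small helper. This is only an illustration: the function name `auth_headers` and its `scheme` argument are assumptions made for this example and are not part of the API.

```python
def auth_headers(api_key, scheme="authorization"):
    """Build request headers for one of the documented authentication styles.

    `scheme` is a hypothetical switch used only in this sketch:
      "authorization" -> Authorization: <key>
      "bearer"        -> Authorization: Bearer <key>
      "x-api-key"     -> X-Api-Key: <key>
    """
    if scheme == "authorization":
        return {"Authorization": api_key}
    if scheme == "bearer":
        return {"Authorization": f"Bearer {api_key}"}
    if scheme == "x-api-key":
        return {"X-Api-Key": api_key}
    raise ValueError(f"unknown scheme: {scheme!r}")
```

Any of the three dictionaries can be passed as `headers=` to `requests.get` or `requests.post`.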

    Base URL: https://api.scriptivox.com/v1


    Transcribe

    Send audio for transcription. You can either pass a URL (we download it) or upload your own file.

    From a URL

    The simplest path — one POST request. We download the file, validate it, and start transcription automatically. Supports Google Drive, Dropbox, and OneDrive sharing links.

    POST /v1/transcribe

    Start a transcription from a public URL. The file is downloaded and validated in the background. Poll GET /v1/transcribe/{id} for status updates. Duration and cost are determined after download.

    Authentication: Authorization: sk_live_YOUR_KEY

    Request Body

    | Parameter | Type | Required | Description |
    | --- | --- | --- | --- |
    | url | string | Required | Public URL to an audio/video file (http or https). Supports Google Drive, Dropbox, and OneDrive sharing links. Max 2048 characters. |
    | language | string | Optional | ISO 639-1 language code (e.g. "en", "es", "fr"). Omit for auto-detection. Warning: forcing a wrong language may produce a translation instead of a transcription. |
    | diarize | boolean | Optional | Enable speaker diarization to identify who said what. Default: false. |
    | speaker_count | integer | Optional | Expected number of speakers (1–50). Requires diarize to be true. If omitted, the model auto-detects the number of speakers. |
    | align | boolean | Optional | Enable word-level timestamps with start/end times and confidence scores for every word. Default: false. |
    | webhook_url | string | Optional | URL to receive completion/failure webhook. HTTPS recommended. |
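
The parameter rules above can be checked on the client before a request is sent. The following is a minimal sketch under that assumption; `build_transcribe_payload` is a hypothetical helper, and the server remains the authority on what is valid.

```python
from typing import Optional


def build_transcribe_payload(
    url: str,
    language: Optional[str] = None,
    diarize: bool = False,
    speaker_count: Optional[int] = None,
    align: bool = False,
    webhook_url: Optional[str] = None,
) -> dict:
    """Check the documented constraints locally and return a JSON-ready body."""
    if not url.startswith(("http://", "https://")):
        raise ValueError("url must use http or https")
    if len(url) > 2048:
        raise ValueError("url must be at most 2048 characters")
    if speaker_count is not None:
        if not diarize:
            raise ValueError("speaker_count requires diarize=True")
        if not 1 <= speaker_count <= 50:
            raise ValueError("speaker_count must be between 1 and 50")

    payload = {"url": url, "diarize": diarize, "align": align}
    # Optional fields are left out entirely so the API applies its defaults
    # (for example, language auto-detection when "language" is omitted).
    if language is not None:
        payload["language"] = language
    if speaker_count is not None:
        payload["speaker_count"] = speaker_count
    if webhook_url is not None:
        payload["webhook_url"] = webhook_url
    return payload
```

The returned dictionary can be passed as `json=` to `requests.post`.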

    Response

    json
    {
      "id": "b2c3d4e5-f6a7-8901-bcde-f12345678901",
      "status": "created",
      "message": "Transcription created. The file will be downloaded and processed. Poll GET /v1/transcribe/{id} for status updates."
    }

    Code Examples

    import time

    import requests

    # Submit the job; the file is downloaded and validated in the background
    resp = requests.post("https://api.scriptivox.com/v1/transcribe",
        headers={"Authorization": "sk_live_YOUR_KEY"},
        json={
            "url": "https://example.com/podcast-episode.mp3",
            "diarize": True,
            "language": "en"
        })
    job = resp.json()

    # Poll until the job reaches a terminal status
    while True:
        result = requests.get(
            f"https://api.scriptivox.com/v1/transcribe/{job['id']}",
            headers={"Authorization": "sk_live_YOUR_KEY"}).json()
        if result["status"] in ("completed", "failed"):
            break
        time.sleep(5)
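
The polling loop above runs until the job finishes, however long that takes. A bounded variant might look like the sketch below. The name `wait_for_job` and the default timeout are assumptions for illustration; the fetch function is passed in so the loop can be exercised without network access.

```python
import time
from typing import Callable

TERMINAL_STATUSES = ("completed", "failed")


def wait_for_job(
    fetch_job: Callable[[], dict],
    interval: float = 5.0,
    timeout: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Call fetch_job() until the job is completed or failed, or time runs out."""
    deadline = clock() + timeout
    while True:
        job = fetch_job()
        if job["status"] in TERMINAL_STATUSES:
            return job
        # Give up if the next poll would land after the deadline.
        if clock() + interval > deadline:
            raise TimeoutError(f"job still {job['status']!r} after {timeout} seconds")
        sleep(interval)
```

With `requests`, the fetch function could be `lambda: requests.get(status_url, headers=headers).json()`, where `status_url` points at GET /v1/transcribe/{id}.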

    From a file upload

    Upload your own file when you need full control or don't have a public URL.

    POST /v1/upload

    Get a presigned URL to upload an audio or video file. The URL expires in 1 hour. Upload your file to the returned URL with a PUT request, then pass the upload_id to POST /v1/transcribe.

    Authentication: Authorization: sk_live_YOUR_KEY

    Request Body

    | Parameter | Type | Required | Description |
    | --- | --- | --- | --- |
    | filename | string | Required | Name of the file being uploaded (e.g. "meeting.mp3"). Maximum 255 characters. |

    Response

    json
    {
      "upload_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
      "upload_url": "https://storage.supabase.co/...",
      "expires_in": 3600,
      "method": "PUT",
      "headers": {
        "Content-Type": "audio/mpeg"
      }
    }

    Code Examples

    import requests

    # Step 1: request a presigned upload URL
    resp = requests.post("https://api.scriptivox.com/v1/upload",
        headers={"Authorization": "sk_live_YOUR_KEY"},
        json={"filename": "meeting.mp3"})
    upload = resp.json()

    # Step 2: PUT the file to the presigned URL using the returned headers
    with open("meeting.mp3", "rb") as f:
        requests.put(upload["upload_url"],
            headers=upload["headers"], data=f)
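
Because the presigned URL expires after `expires_in` seconds, a client that queues uploads may want to check the remaining lifetime before starting the PUT. The sketch below assumes the expiry is tracked locally; `UploadTicket` and its `safety_margin` are hypothetical and not part of the API.

```python
from dataclasses import dataclass


@dataclass
class UploadTicket:
    """Hypothetical wrapper around the POST /v1/upload response."""
    upload_id: str
    upload_url: str
    headers: dict
    expires_at: float  # local timestamp in seconds, same clock as issued_at

    @classmethod
    def from_response(cls, body: dict, issued_at: float) -> "UploadTicket":
        return cls(
            upload_id=body["upload_id"],
            upload_url=body["upload_url"],
            headers=body.get("headers", {}),
            expires_at=issued_at + body["expires_in"],
        )

    def is_expired(self, now: float, safety_margin: float = 60.0) -> bool:
        # Treat the URL as expired slightly early to leave room for the PUT itself.
        return now >= self.expires_at - safety_margin
```

A typical call would be `UploadTicket.from_response(resp.json(), issued_at=time.time())`, followed by a fresh POST /v1/upload whenever `is_expired(time.time())` returns true.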

    POST /v1/transcribe

    Start a transcription from an uploaded file. Pass the upload_id from POST /v1/upload. The file is validated in the background. Poll GET /v1/transcribe/{id} for status updates. Duration and cost are determined after validation.

    Authentication: Authorization: sk_live_YOUR_KEY

    Request Body

    | Parameter | Type | Required | Description |
    | --- | --- | --- | --- |
    | upload_id | string | Required | The upload ID from POST /v1/upload. |
    | language | string | Optional | ISO 639-1 language code (e.g. "en", "es", "fr"). Omit for auto-detection. Warning: forcing a wrong language may produce a translation instead of a transcription. |
    | diarize | boolean | Optional | Enable speaker diarization to identify who said what. Default: false. |
    | speaker_count | integer | Optional | Expected number of speakers (1–50). Requires diarize to be true. If omitted, the model auto-detects the number of speakers. |
    | align | boolean | Optional | Enable word-level timestamps with start/end times and confidence scores for every word. Default: false. |
    | webhook_url | string | Optional | URL to receive completion/failure webhook. HTTPS recommended. |

    Response

    json
    {
      "id": "b2c3d4e5-f6a7-8901-bcde-f12345678901",
      "status": "created",
      "message": "Transcription created. The file will be validated and processed. Poll GET /v1/transcribe/{id} for status updates."
    }

    Code Examples

    import time

    import requests

    # Submit the job using the upload_id returned by POST /v1/upload
    resp = requests.post("https://api.scriptivox.com/v1/transcribe",
        headers={"Authorization": "sk_live_YOUR_KEY"},
        json={
            "upload_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
            "diarize": True,
            "speaker_count": 2,
            "language": "en"
        })
    job = resp.json()

    # Poll until the job reaches a terminal status
    while True:
        result = requests.get(
            f"https://api.scriptivox.com/v1/transcribe/{job['id']}",
            headers={"Authorization": "sk_live_YOUR_KEY"}).json()
        if result["status"] in ("completed", "failed"):
            break
        time.sleep(5)

    Get result

    GET /v1/transcribe/{id}

    Get the status and result of a transcription. Poll this endpoint until status is completed or failed, or use webhooks for real-time notifications.

    Authentication: Authorization: sk_live_YOUR_KEY

    Path Parameters

    | Parameter | Type | Required | Description |
    | --- | --- | --- | --- |
    | id | string | Required | The transcription ID returned from POST /v1/transcribe. Passed in the URL path; this request has no body. |

    Response

    json
    {
      "id": "b2c3d4e5-f6a7-8901-bcde-f12345678901",
      "status": "completed",
      "audio_duration_seconds": 120,
      "file_size_bytes": 1920000,
      "language": "en",
      "diarize": true,
      "speaker_count": 2,
      "align": true,
      "cost_cents": 0.5,
      "source_url": "https://example.com/podcast.mp3",
      "progress": "Transcription completed successfully.",
      "created_at": "2025-01-15T10:30:00Z",
      "started_at": "2025-01-15T10:30:01Z",
      "completed_at": "2025-01-15T10:30:45Z",
      "result": {
        "full_transcript": "Hello, thanks for joining...",
        "language": "en",
        "duration_seconds": 120,
        "speakers": ["SPEAKER 1", "SPEAKER 2"],
        "utterances": [
          {
            "start": 0.5,
            "end": 3.2,
            "text": "Hello, thanks for joining the call today.",
            "speaker": "SPEAKER 1",
            "confidence": 0.95,
            "words": [
              {
                "word": "Hello,",
                "start": 0.5,
                "end": 0.9,
                "confidence": 0.98,
                "speaker": "SPEAKER 1"
              }
            ]
          }
        ]
      }
    }

    Code Examples

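    import requests

    # ID returned by POST /v1/transcribe
    transcription_id = "b2c3d4e5-f6a7-8901-bcde-f12345678901"

    resp = requests.get(
        f"https://api.scriptivox.com/v1/transcribe/{transcription_id}",
        headers={"Authorization": "sk_live_YOUR_KEY"})
    result = resp.json()
    if result["status"] == "completed":
        print(result["result"]["full_transcript"])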
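
When diarization is enabled, the utterances array in the response shown earlier can be turned into a speaker-labelled transcript. The sketch below works on the `result` object of a completed job. `format_utterances` is a hypothetical helper, and the "UNKNOWN" fallback assumes the `speaker` key may be absent when diarize is false, which the documentation does not confirm.

```python
def format_utterances(result: dict) -> list:
    """Render each utterance as '[MM:SS] SPEAKER: text'."""
    lines = []
    for utt in result.get("utterances", []):
        minutes, seconds = divmod(int(utt["start"]), 60)
        # Assumption: "speaker" may be missing when diarization is off.
        speaker = utt.get("speaker", "UNKNOWN")
        lines.append(f"[{minutes:02d}:{seconds:02d}] {speaker}: {utt['text']}")
    return lines
```

Calling `"\n".join(format_utterances(job["result"]))` on a completed job yields one line per utterance.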

    Status values

    | Status | Description |
    | --- | --- |
    | created | Transcription created, file download/validation starting |
    | downloading | File is being downloaded from the provided URL (URL flow only) |
    | processing | Audio is being transcribed |
    | completed | Transcription finished successfully |
    | failed | Something went wrong; check the error field |

    Balance

    GET /v1/balance

    Get your current account balance, including amounts reserved for in-progress transcriptions. Balance is reserved once the audio duration is known, which happens when a job reaches processing status, not while it is still created or downloading. For URL-based transcriptions, reserved_cents therefore reflects a job only after its file has been downloaded and validated.

    Authentication: Authorization: sk_live_YOUR_KEY

    Response

    json
    {
      "balance_cents": 1500,
      "reserved_cents": 100,
      "available_cents": 1400,
      "price_per_hour_cents": 20,
      "estimated_hours_available": 70.0,
      "deposit_url": "https://platform.scriptivox.com/billing",
      "updated_at": "2025-01-15T10:30:00Z"
    }

    Code Examples

    import requests

    resp = requests.get("https://api.scriptivox.com/v1/balance",
        headers={"Authorization": "sk_live_YOUR_KEY"})
    balance = resp.json()
    # Amounts are returned in cents; divide by 100 to display dollars
    print(f"Available: ${balance['available_cents'] / 100:.2f}")
    print(f"Hours remaining: {balance['estimated_hours_available']:.1f}")
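
The balance fields can also be used for a rough pre-flight check before submitting a long recording. The sketch below assumes cost scales linearly with audio duration at `price_per_hour_cents`, with no rounding or minimum charge. That is an assumption, not documented billing behaviour; the authoritative figure is the `cost_cents` field returned by GET /v1/transcribe/{id}. Both function names are hypothetical.

```python
def estimate_cost_cents(duration_seconds: float, price_per_hour_cents: float) -> float:
    """Linear estimate: hours of audio multiplied by the hourly price."""
    return duration_seconds / 3600 * price_per_hour_cents


def can_afford(balance: dict, duration_seconds: float) -> bool:
    """Compare the estimate with available_cents from GET /v1/balance."""
    cost = estimate_cost_cents(duration_seconds, balance["price_per_hour_cents"])
    return cost <= balance["available_cents"]
```

For example, a one-hour file at 20 cents per hour is estimated at 20 cents.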

    Error Codes

    All errors follow the same format:

    json
    {
      "error": {
        "code": "ERROR_CODE",
        "message": "Human-readable description"
      }
    }

    | HTTP | Code | Description |
    | --- | --- | --- |
    | 400 | INVALID_REQUEST | Malformed request body or missing required fields |
    | 400 | INVALID_FILENAME | Filename is required |
    | 400 | INVALID_MEDIA_FORMAT | Unsupported file format |
    | 400 | FILE_NOT_UPLOADED | File not found at the upload URL |
    | 400 | FILE_TOO_LARGE | File exceeds 5GB limit |
    | 400 | DURATION_TOO_LONG | Audio exceeds 10 hour limit |
    | 400 | UPLOAD_ALREADY_USED | Upload already used for a transcription |
    | 400 | UPLOAD_EXPIRED | Upload URL expired (1 hour TTL) |
    | 400 | URL_NOT_ACCESSIBLE | Provided URL is not accessible (4xx/5xx or network error) |
    | 401 | INVALID_API_KEY | Invalid or missing API key |
    | 401 | API_KEY_REVOKED | API key has been revoked |
    | 402 | INSUFFICIENT_BALANCE | Not enough balance for this transcription |
    | 402 | ZERO_BALANCE | Balance is $0; deposit required |
    | 404 | UPLOAD_NOT_FOUND | Upload ID does not exist |
    | 404 | TRANSCRIPTION_NOT_FOUND | Transcription ID does not exist |
    | 429 | RATE_LIMIT_EXCEEDED | Too many requests; see Rate Limits |
    | 500 | INTERNAL_ERROR | Server error |
    | 500 | PROCESSING_ERROR | Transcription processing failed |
    | 500 | DOWNLOAD_FAILED | File download was interrupted (URL flow) |

    Important notes

    Language parameter behavior

    When you specify a language code, the model is forced to interpret the audio as that language. If the audio is in a different language, the model may translate rather than transcribe. For example, setting language: "es" on English audio can produce a Spanish translation of the speech. Omit the language parameter to let the model auto-detect the correct language.

    File size limit

    Files larger than 5GB are rejected. Note that very large uploads may receive an HTML error page (HTTP 413) from the network layer instead of a JSON error response. Your client should handle non-JSON error responses gracefully.
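
One way to handle both the JSON error format and a non-JSON body such as the HTML 413 page is sketched below. `parse_error` is a hypothetical helper, and the "HTTP_<status>" fallback code is a placeholder invented for this example, not an API error code.

```python
import json


def parse_error(status_code: int, body_text: str) -> tuple:
    """Return (code, message) for an error response, even if the body is not JSON."""
    try:
        err = json.loads(body_text)["error"]
        return err["code"], err["message"]
    except (ValueError, KeyError, TypeError):
        # Fallback for bodies that are not in the documented error format,
        # for example an HTML page returned by the network layer.
        return f"HTTP_{status_code}", body_text[:200].strip()
```

With `requests`, this could be called as `parse_error(resp.status_code, resp.text)` whenever `resp.ok` is false.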


    Rate Limits

    Rate limits are enforced per API key per endpoint. If you exceed the limit, the API returns a 429 status with a Retry-After header indicating when to retry.

    | Endpoint | Limit | Notes |
    | --- | --- | --- |
    | /v1/upload | 60/min | Presigned URL generation |
    | /v1/transcribe | 60/min | Job submission |
    | /v1/transcribe/{id} | 200/min | Higher limit for polling |
    | /v1/balance | 100/min | Balance checks |

    Rate Limit Headers

    All responses include rate limit headers:

    | Header | Description |
    | --- | --- |
    | X-RateLimit-Limit | Maximum requests allowed per minute |
    | X-RateLimit-Remaining | Requests remaining in current window |
    | X-RateLimit-Reset | Unix timestamp when the window resets |
    | Retry-After | Seconds until you can retry (only on 429) |
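
A client can use these headers to decide how long to pause before the next request. The sketch below is illustrative: `seconds_to_wait` is a hypothetical helper, it assumes X-RateLimit-Reset is expressed in seconds, and it expects header names exactly as spelled in the table (the `requests` library's `resp.headers` is case-insensitive, a plain dict is not).

```python
import time
from typing import Optional


def seconds_to_wait(status_code: int, headers: dict, now: Optional[float] = None) -> float:
    """Return how long to pause before the next request, based on the documented headers."""
    # On a 429, the server states the wait explicitly.
    if status_code == 429 and "Retry-After" in headers:
        return float(headers["Retry-After"])
    # Otherwise, pause only if the current window is exhausted.
    if headers.get("X-RateLimit-Remaining") == "0" and "X-RateLimit-Reset" in headers:
        now = time.time() if now is None else now
        return max(0.0, float(headers["X-RateLimit-Reset"]) - now)
    return 0.0
```

A polling loop could call `time.sleep(seconds_to_wait(resp.status_code, resp.headers))` before each retry.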

    Supported Formats

    The following audio and video formats are accepted. Maximum file size is 5GB, maximum duration is 10 hours.

    Audio

    | Extension | Format |
    | --- | --- |
    | .mp3 | MPEG Audio |
    | .wav | Waveform Audio |
    | .m4a | MPEG-4 Audio |
    | .aac | Advanced Audio Coding |
    | .ogg | Ogg Vorbis |
    | .flac | Free Lossless Audio |
    | .opus | Opus |
    | .wma | Windows Media Audio |
    | .aiff | Audio Interchange |
    | .caf | Core Audio Format |

    Video

    | Extension | Format |
    | --- | --- |
    | .mp4 | MPEG-4 Video |
    | .mov | QuickTime |
    | .avi | Audio Video Interleave |
    | .mkv | Matroska Video |
    | .webm | WebM |
    | .wmv | Windows Media Video |
    | .flv | Flash Video |
    | .m4v | MPEG-4 Video (iTunes) |
    | .3gp | 3GPP |
    | .mpeg | MPEG Video |
    | .mts | AVCHD |
    | .ogv | Ogg Video |
    | .ts | MPEG Transport Stream |
    | .vob | DVD Video Object |
    | .f4v | Flash MP4 Video |
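
A local pre-check against the format and size limits can save an upload that would be rejected anyway. The sketch below is an assumption-laden example: `check_media_file` is hypothetical, the check uses only the file extension (the API may inspect the content itself), and "5GB" is read conservatively as 5 × 1000³ bytes because the documentation does not say whether it means GB or GiB.

```python
from pathlib import Path

AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a", ".aac", ".ogg", ".flac",
                    ".opus", ".wma", ".aiff", ".caf"}
VIDEO_EXTENSIONS = {".mp4", ".mov", ".avi", ".mkv", ".webm", ".wmv", ".flv",
                    ".m4v", ".3gp", ".mpeg", ".mts", ".ogv", ".ts", ".vob", ".f4v"}
MAX_FILE_BYTES = 5 * 1000**3  # assumption: conservative reading of the 5GB limit


def check_media_file(filename: str, size_bytes: int) -> None:
    """Raise ValueError if the file is unlikely to be accepted."""
    ext = Path(filename).suffix.lower()
    if ext not in AUDIO_EXTENSIONS | VIDEO_EXTENSIONS:
        raise ValueError(f"unsupported extension: {ext or '(none)'}")
    if size_bytes > MAX_FILE_BYTES:
        raise ValueError("file exceeds the 5GB limit")
```

The size can be taken from `os.path.getsize(path)` before requesting an upload URL.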

    Supported Languages

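    Pass the language code from the table below in the language parameter. Most are two-letter ISO 639-1 codes; Cantonese (yue) and Hawaiian (haw) use three-letter codes, and Javanese is listed as jw. Omit the parameter or pass null for automatic language detection.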

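    | Language | Code |
    | --- | --- |
    | Afrikaans | af |
    | Albanian | sq |
    | Amharic | am |
    | Arabic | ar |
    | Armenian | hy |
    | Assamese | as |
    | Azerbaijani | az |
    | Bashkir | ba |
    | Basque | eu |
    | Belarusian | be |
    | Bengali | bn |
    | Bosnian | bs |
    | Breton | br |
    | Bulgarian | bg |
    | Cantonese | yue |
    | Catalan | ca |
    | Chinese | zh |
    | Croatian | hr |
    | Czech | cs |
    | Danish | da |
    | Dutch | nl |
    | English | en |
    | Estonian | et |
    | Faroese | fo |
    | Finnish | fi |
    | French | fr |
    | Galician | gl |
    | Georgian | ka |
    | German | de |
    | Greek | el |
    | Gujarati | gu |
    | Haitian Creole | ht |
    | Hausa | ha |
    | Hawaiian | haw |
    | Hebrew | he |
    | Hindi | hi |
    | Hungarian | hu |
    | Icelandic | is |
    | Indonesian | id |
    | Italian | it |
    | Japanese | ja |
    | Javanese | jw |
    | Kannada | kn |
    | Kazakh | kk |
    | Khmer | km |
    | Korean | ko |
    | Lao | lo |
    | Latin | la |
    | Latvian | lv |
    | Lingala | ln |
    | Lithuanian | lt |
    | Luxembourgish | lb |
    | Macedonian | mk |
    | Malagasy | mg |
    | Malay | ms |
    | Malayalam | ml |
    | Maltese | mt |
    | Maori | mi |
    | Marathi | mr |
    | Mongolian | mn |
    | Myanmar | my |
    | Nepali | ne |
    | Norwegian | no |
    | Nynorsk | nn |
    | Occitan | oc |
    | Pashto | ps |
    | Persian | fa |
    | Polish | pl |
    | Portuguese | pt |
    | Punjabi | pa |
    | Romanian | ro |
    | Russian | ru |
    | Sanskrit | sa |
    | Serbian | sr |
    | Shona | sn |
    | Sindhi | sd |
    | Sinhala | si |
    | Slovak | sk |
    | Slovenian | sl |
    | Somali | so |
    | Spanish | es |
    | Sundanese | su |
    | Swahili | sw |
    | Swedish | sv |
    | Tagalog | tl |
    | Tajik | tg |
    | Tamil | ta |
    | Tatar | tt |
    | Telugu | te |
    | Thai | th |
    | Tibetan | bo |
    | Turkish | tr |
    | Turkmen | tk |
    | Ukrainian | uk |
    | Urdu | ur |
    | Uzbek | uz |
    | Vietnamese | vi |
    | Welsh | cy |
    | Yiddish | yi |
    | Yoruba | yo |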

    Webhooks

    Real-time completion notifications

    Pricing

    Pay-as-you-go at $0.20/hour