Version 1.0 · REST API

API Documentation

Everything you need to integrate audio transcription into your application. Submit audio, poll for results, and download structured transcripts with speaker labels, PII redaction, and analytics.

Quick Start

Get your first transcription running in under a minute.

1. Create an account and get your API key

Sign up at tuplets.ai/signup, verify your email, and generate an API key from your dashboard settings. Keys are prefixed with tb_ and are shown only once.

2. Set your API key

Pass your API key as a Bearer token in the Authorization header on every request.

Example request
export TUPLETS_API_KEY="tb_your_key_here"

# Verify the key works
curl -H "Authorization: Bearer $TUPLETS_API_KEY" \
  https://api.tuplets.ai/health

3. Submit audio for transcription

Upload an audio file directly, use the browser upload flow, or point to a remote URL on a public HTTP/HTTPS host. We support MP3, WAV, M4A, FLAC, OGG, AAC, WMA, WebM, and Opus.

Remote URLs are validated before processing starts. Private, loopback, link-local, and otherwise non-public hosts are rejected, and redirect targets must also stay on public internet hosts.

Example request
# Upload a local file
curl -X POST https://api.tuplets.ai/jobs \
  -H "Authorization: Bearer $TUPLETS_API_KEY" \
  -F "audio_file=@interview.mp3" \
  -F "language=en" \
  -F "diarization=true"

# Or submit a remote URL on a public host
curl -X POST https://api.tuplets.ai/jobs \
  -H "Authorization: Bearer $TUPLETS_API_KEY" \
  -F "remote_url=https://storage.example.com/recording.mp3" \
  -F "language=en"

4. Poll for completion and download

POST /jobs returns the new job ID immediately, along with status_url, cancel_url, and cancel_token. The example below shows the normal flow: submit, capture the returned job ID, poll that job until its status is completed, then download the transcript.

Example request
# Submit and capture the returned job ID
job_id=$(curl -s -X POST https://api.tuplets.ai/jobs \
  -H "Authorization: Bearer $TUPLETS_API_KEY" \
  -F "audio_file=@interview.mp3" \
  -F "language=en" \
  -F "diarization=true" | jq -r '.id')

# Poll job status
curl https://api.tuplets.ai/jobs/$job_id \
  -H "Authorization: Bearer $TUPLETS_API_KEY"

# Download completed transcript JSON
curl https://api.tuplets.ai/jobs/$job_id/download \
  -H "Authorization: Bearer $TUPLETS_API_KEY"
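
The polling step can also be scripted. The following is a minimal Python sketch using only the standard library, not the official SDK. It assumes completed and failed are the only terminal statuses (the four statuses listed in the job field table); the helper names, the 5-second interval, and the poll limit are illustrative choices rather than API requirements.

```python
import json
import time
import urllib.request

API_BASE = "https://api.tuplets.ai"
TERMINAL_STATUSES = {"completed", "failed"}  # assumption: no other terminal status


def fetch_job(job_id, api_key):
    """GET /jobs/{job_id} and return the decoded JSON body."""
    request = urllib.request.Request(
        f"{API_BASE}/jobs/{job_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def wait_for_job(job_id, fetch, interval_seconds=5.0, max_polls=120, sleep=time.sleep):
    """Poll until the job reaches a terminal status or max_polls is exhausted."""
    for _ in range(max_polls):
        job = fetch(job_id)
        if job.get("status") in TERMINAL_STATUSES:
            return job
        sleep(interval_seconds)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")
```

A caller would typically run wait_for_job(job_id, lambda jid: fetch_job(jid, api_key)). Passing the fetch function in keeps the loop easy to exercise without network access.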

SDKs

Use our official client libraries to integrate Tuplets into your application.

All SDKs are open-source, type-safe, and cover the full Tuplets API — including job creation, polling, downloads, browser uploads, and solutions inquiries.

Python SDK

Install from PyPI and use the sync or async client.

pip install tuplets-ai

JavaScript / TypeScript SDK

Install from npm and use with Node.js or browser environments.

npm install @tupletsai/sdk

Authentication

All API requests require authentication via an API key or a session cookie.

API Keys

Generate API keys from your account settings. Keys begin with tb_ and are 43 characters total.

Include the key in the Authorization header as a Bearer token:

Example request
curl -H "Authorization: Bearer tb_your_key_here" \
  https://api.tuplets.ai/jobs

Endpoint                           Method  Description
/account/api-keys                  POST    Create a new API key
/account/api-keys                  GET     List active API keys
/account/api-keys/{key_id}/rotate  POST    Rotate (replace) an API key
/account/api-keys/{key_id}         DELETE  Revoke an API key

Example request
# Create a new API key
curl -X POST https://api.tuplets.ai/account/api-keys \
  -H "Authorization: Bearer $TUPLETS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"name": "My Integration Key"}'

Key Rotation & Security

Rotate keys regularly and revoke compromised keys immediately.

  • Keys are stored in a secure, non-reversible manner.
  • The raw key is shown only once at creation time.
  • Rotate keys via POST /account/api-keys/{key_id}/rotate.
  • Revoke keys via DELETE /account/api-keys/{key_id}.
  • Each key is scoped to a single account.
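
Because a raw key is shown only once, a client may want to catch a truncated or mis-pasted key before sending it. The sketch below checks only the two properties stated above (the tb_ prefix and the 43-character total length); the function name is hypothetical, and the character set of the remaining 40 characters is not documented here, so it is not validated.

```python
def auth_headers(api_key):
    """Return the Authorization header for a key, rejecting malformed keys early."""
    if not api_key.startswith("tb_") or len(api_key) != 43:
        raise ValueError("expected a 43-character key starting with tb_")
    return {"Authorization": f"Bearer {api_key}"}
```

Reading the key from an environment variable such as TUPLETS_API_KEY, as in the curl examples, keeps it out of source control.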

API Reference

Base URL: https://api.tuplets.ai — All endpoints return JSON.

Create a Transcription Job

Submit audio for processing. You must provide exactly one of: audio_file, remote_url, or uploaded_audio_key. Browser-direct upload is recommended for large or private files.

POST /jobs · Submit a new transcription job

Analytics requirements

  • Send analytics as a JSON string on multipart form posts. Example: {"profile":"full","domain":"insurance"}.
  • profile accepts basic (summary-oriented modules) or full (adds evidence-linked audit fields). Set profile to none or off to omit the generic profile while still requesting custom analytics.
  • Optional domain selects a built-in custom pack: insurance, support, sales, healthcare, or financial_services. You may combine a profile with a domain pack.
  • Generic profile analytics and custom analytics each require at least 15 seconds of audio.
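
The requirements above can be encoded in a small helper that builds the analytics form value. This is a sketch under stated assumptions: the function name is hypothetical, and raising an error for audio shorter than 15 seconds is this sketch's choice, since the server's reaction to shorter audio is not described here.

```python
import json

PROFILES = {"basic", "full", "none", "off"}
DOMAINS = {"insurance", "support", "sales", "healthcare", "financial_services"}
MIN_AUDIO_SECONDS = 15


def analytics_field(profile=None, domain=None, audio_seconds=None):
    """Serialize the analytics form field, or return None when nothing is requested."""
    if profile is not None and profile not in PROFILES:
        raise ValueError(f"unknown profile: {profile}")
    if domain is not None and domain not in DOMAINS:
        raise ValueError(f"unknown domain pack: {domain}")
    if audio_seconds is not None and audio_seconds < MIN_AUDIO_SECONDS:
        raise ValueError("analytics requires at least 15 seconds of audio")
    payload = {}
    if profile is not None:
        payload["profile"] = profile
    if domain is not None:
        payload["domain"] = domain
    # Compact separators reproduce the form used in the curl examples.
    return json.dumps(payload, separators=(",", ":")) if payload else None
```

Custom schema definitions (the schema field) are left out because their structure is not covered in this section.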

Transcription model selection

  • transcription_model accepts standard or premium. The default is standard.
  • premium routes to the higher-accuracy premium transcription tier for harder audio and long-form recordings.
  • premium can be combined with diarization=true for speaker-attributed premium transcripts.
  • speaker_embeddings=true is optional on premium diarized jobs. It adds per-speaker voice embedding vectors to diarization.speaker_embeddings for cross-recording speaker matching (no extra charge).

Parameter             Type           Required     Description
audio_file            file           conditional  Audio file (MP3, WAV, M4A, FLAC, OGG, AAC, WMA, WebM, Opus). Max 1 GB.
remote_url            string         conditional  Public HTTP/HTTPS URL pointing directly to audio or video media on a public internet host. Private or localhost-style addresses are rejected, redirects are revalidated, and the same 1 GB limit applies.
uploaded_audio_key    string         conditional  Object key from a browser upload. Must be paired with uploaded_audio_token.
uploaded_audio_token  string         conditional  Upload token from the browser upload flow.
language              string         optional     Language code (ISO 639-1). Default: auto (automatic detection).
transcription_model   string         optional     Transcription tier: standard or premium. Default: standard.
diarization           boolean        optional     Enable speaker diarization (separate speakers). Default: false.
speaker_embeddings    boolean        optional     Include per-speaker voice embedding vectors in diarization output. Requires transcription_model=premium and diarization=true. Default: false.
pii_processing        boolean        optional     Enable transcript-level PII redaction for supported transcript languages only: English, Spanish, French, German, Portuguese, Italian, Dutch, and Polish (en, es, fr, de, pt, it, nl, pl). Unsupported languages skip PII and are not billed for it. Default: false.
analytics             string (JSON)  optional     Analytics request as JSON. Fields: profile (basic | full | none | off), domain (built-in pack key), schema (custom field definitions). Omit for no analytics.

Example request
# Upload file with standard transcription + all add-ons
curl -X POST https://api.tuplets.ai/jobs \
  -H "Authorization: Bearer $TUPLETS_API_KEY" \
  -F "audio_file=@call_recording.mp3" \
  -F "language=en" \
  -F "transcription_model=standard" \
  -F "diarization=true" \
  -F "pii_processing=true" \
  -F 'analytics={"profile":"full","domain":"insurance"}'

# Remote URL on a public host with premium transcription
curl -X POST https://api.tuplets.ai/jobs \
  -H "Authorization: Bearer $TUPLETS_API_KEY" \
  -F "remote_url=https://example.com/audio.mp3" \
  -F "transcription_model=premium" \
  -F "language=auto"

# Premium diarization with per-speaker embedding vectors
curl -X POST https://api.tuplets.ai/jobs \
  -H "Authorization: Bearer $TUPLETS_API_KEY" \
  -F "audio_file=@meeting.mp3" \
  -F "transcription_model=premium" \
  -F "diarization=true" \
  -F "speaker_embeddings=true"

Response

{
  "status": "accepted",
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "status_url": "https://api.tuplets.ai/jobs/550e8400-e29b-41d4-a716-446655440000",
  "cancel_url": "https://api.tuplets.ai/jobs/550e8400-e29b-41d4-a716-446655440000",
  "cancel_token": "<TOKEN>"
}
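
The parameter rules for this endpoint (exactly one audio source, key and token sent together, speaker_embeddings only on premium diarized jobs) can be checked client-side before a request goes out. The helper below is a hypothetical sketch, not part of the SDK; it returns the text form fields and leaves attaching the audio_file part to the caller.

```python
def job_form_fields(audio_file=None, remote_url=None, uploaded_audio_key=None,
                    uploaded_audio_token=None, transcription_model="standard",
                    diarization=False, speaker_embeddings=False):
    """Check the documented parameter rules and return the text form fields."""
    sources = [s for s in (audio_file, remote_url, uploaded_audio_key) if s is not None]
    if len(sources) != 1:
        raise ValueError("provide exactly one of audio_file, remote_url, uploaded_audio_key")
    if (uploaded_audio_key is None) != (uploaded_audio_token is None):
        raise ValueError("uploaded_audio_key and uploaded_audio_token go together")
    if transcription_model not in ("standard", "premium"):
        raise ValueError("transcription_model must be standard or premium")
    if speaker_embeddings and not (transcription_model == "premium" and diarization):
        raise ValueError("speaker_embeddings needs premium and diarization=true")

    # Booleans are sent as the strings used in the curl examples.
    fields = {
        "transcription_model": transcription_model,
        "diarization": "true" if diarization else "false",
        "speaker_embeddings": "true" if speaker_embeddings else "false",
    }
    if remote_url is not None:
        fields["remote_url"] = remote_url
    if uploaded_audio_key is not None:
        fields["uploaded_audio_key"] = uploaded_audio_key
        fields["uploaded_audio_token"] = uploaded_audio_token
    return fields  # audio_file, if any, is attached separately as the file part
```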

Submission response

The create endpoint returns the new id immediately, plus a status_url, cancel_url, and shared cancel_token. Your client can start polling the returned job URL right away.

The API does not yet support client-provided request IDs or echoed metadata on create. If you need retry-safe deduplication, keep your own idempotency layer on the client side for now.
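
One possible shape for such a client-side idempotency layer is sketched below. It is an illustration only: the key derivation (SHA-256 over the audio bytes plus the sorted job options), the class name, and the in-memory storage are all assumptions, and a production version would need durable storage.

```python
import hashlib
import json


def submission_key(audio_bytes, options):
    """Derive a stable key from the audio content and the job options."""
    digest = hashlib.sha256(audio_bytes)
    digest.update(b"\0")  # separator between audio content and options
    digest.update(json.dumps(options, sort_keys=True).encode("utf-8"))
    return digest.hexdigest()


class SubmissionLedger:
    """In-memory map from submission key to job ID."""

    def __init__(self):
        self._jobs = {}

    def submit_once(self, audio_bytes, options, submit):
        """Call submit only for unseen (audio, options) pairs; return the job ID."""
        key = submission_key(audio_bytes, options)
        if key not in self._jobs:
            self._jobs[key] = submit(audio_bytes, options)  # submit returns the new job ID
        return self._jobs[key]
```

This does not cover a crash between the POST succeeding and the job ID being recorded; closing that gap would need server-side support, which the API does not currently offer.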

PII processing details

  • Detection runs once on the full transcript, then the same canonical spans are projected back to segment output for consistent redaction.
  • Structured identifiers such as email, SSN, IP address, payment card numbers, and phone-like values use deterministic validation before redaction.
  • Names and address-like phrases use contextual entity detection instead of regex alone. Generic organization and location names are excluded by default to reduce false positives and preserve downstream topic quality.
  • A final automated review pass checks the masked transcript for any surviving direct identifiers before the redacted result is returned.
  • PII currently runs only for supported transcript languages: English, Spanish, French, German, Portuguese, Italian, Dutch, and Polish (en, es, fr, de, pt, it, nl, pl). If auto-detect or an explicit language resolves outside that set, the PII pass is skipped and you are not charged for PII processing.
  • PII mode is coverage-first: the pipeline prefers over-redaction to missed identifiers, so short person names and ambiguous address-like text can still be redacted.
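
A client that requests pii_processing=true may want to know up front whether the pass can run. The sketch below encodes the supported-language list from this section; the function name is hypothetical, and it returns None while the language is still unresolved (auto with no detection result yet).

```python
PII_LANGUAGES = {"en", "es", "fr", "de", "pt", "it", "nl", "pl"}


def pii_expected(requested_language, detected_language=None):
    """Return True or False when the outcome is known, else None."""
    language = detected_language or requested_language
    if language in (None, "auto"):
        return None  # auto-detect has not resolved yet
    return language.lower() in PII_LANGUAGES
```

The result field feature_execution.pii_processing_applied remains the authoritative signal once the job completes.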

Remote URL rules

  • Use a direct http:// or https:// link to publicly reachable media.
  • Hosts that resolve to private, loopback, link-local, or other non-public IP ranges are rejected.
  • Redirects are allowed only when every hop stays on a public internet host.
  • If the file is private or requires signed storage access, use POST /jobs/upload-target and submit uploaded_audio_key instead.
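
A local pre-check can catch the most obvious violations of these rules before a job is submitted. The sketch below is a best-effort approximation, not the server's validator: it checks the scheme, rejects localhost names and literal IP addresses outside the global range, and lets DNS names through because only the server-side resolution and redirect checks can decide those. The function name is hypothetical.

```python
import ipaddress
from urllib.parse import urlsplit


def looks_public(url):
    """Best-effort local check: http(s) scheme and no obviously non-public host."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    host = parts.hostname
    if host == "localhost" or host.endswith(".localhost"):
        return False
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        return True  # a DNS name: the server-side check has the final say
    return address.is_global
```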

Get Job Status

Retrieve the current status and result of a transcription job. This endpoint returns the full nested result object once the job completes.

GET /jobs/{job_id} · Get a single job by ID
Example request
curl https://api.tuplets.ai/jobs/550e8400-e29b-41d4-a716-446655440000 \
  -H "Authorization: Bearer $TUPLETS_API_KEY"

Response

{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "completed",
  "result": {
    "text": "Welcome everyone. Today we are reviewing launch readiness and confirming final ownership.",
    "segments": [
      {
        "start": 0,
        "end": 6.2,
        "speaker": "SPEAKER_00",
        "text": "Welcome everyone. Today we are reviewing launch readiness."
      },
      {
        "start": 6.2,
        "end": 12.8,
        "speaker": "SPEAKER_01",
        "text": "Support still needs the updated help-center copy before release."
      }
    ],
    "language": "en",
    "audio_duration_seconds": 1842.4,
    "transcription_meta": {
      "coverage_ratio": 0.94,
      "total_covered_seconds": 1731.8
    },
    "diarization": {
      "speaker_count": 4,
      "confidence": "medium",
      "warning": "Speaker attribution may be less accurate around short backchannels and overlapping speech.",
      "speaker_embeddings": {
        "model": "pyannote/speaker-diarization-community-1",
        "dimension": 256,
        "vectors": {
          "SPEAKER_00": [
            0.12,
            -0.04,
            0.08
          ],
          "SPEAKER_01": [
            -0.03,
            0.11,
            0.05
          ]
        }
      }
    },
    "pii_findings": [
      {
        "type": "EMAIL_ADDRESS",
        "start": 214,
        "end": 230,
        "confidence": 0.85,
        "detector_count": 1
      }
    ],
    "analytics": {
      "profile": "full",
      "tier": "deep",
      "modules": [
        "summary",
        "sentiment",
        "key_topics",
        "action_items",
        "qa",
        "decisions",
        "risks",
        "chapters",
        "keywords",
        "statistics",
        "speaker_stats",
        "urgency"
      ],
      "summary": "A launch-readiness meeting covering remaining documentation work, support copy approval, and release ownership.",
      "sentiment": {
        "overall": "neutral",
        "explanation": "The discussion is procedural, with mild urgency around launch blockers and final handoff items.",
        "distribution": {
          "positive": {
            "count": 5,
            "percentage": 22.7,
            "avg_score": 0.71
          },
          "neutral": {
            "count": 14,
            "percentage": 63.6,
            "avg_score": 0.8
          },
          "negative": {
            "count": 3,
            "percentage": 13.7,
            "avg_score": 0.64
          }
        },
        "per_speaker": [
          {
            "speaker": "SPEAKER_00",
            "positive_count": 2,
            "neutral_count": 6,
            "negative_count": 0,
            "total_segments": 8,
            "avg_score": 0.77
          }
        ],
        "segments": [
          {
            "start": 0,
            "end": 6.2,
            "speaker": "SPEAKER_00",
            "text": "Welcome everyone. Today we are reviewing launch readiness.",
            "label": "neutral",
            "score": 0.79
          }
        ]
      },
      "key_topics": [
        {
          "topic": "launch readiness",
          "evidence": [
            {
              "start": 0,
              "end": 6.2,
              "speaker": "SPEAKER_00",
              "text": "Welcome everyone. Today we are reviewing launch readiness."
            }
          ]
        }
      ],
      "action_items": [
        {
          "action": "Update the help-center copy",
          "owner": "SPEAKER_01",
          "due_date": "Friday",
          "confidence": "high",
          "source_quote": "I'll send the revised support copy by Friday.",
          "evidence": [
            {
              "start": 142.4,
              "end": 148.9,
              "speaker": "SPEAKER_01",
              "text": "I'll send the revised support copy by Friday."
            }
          ]
        }
      ],
      "qa_pairs": [],
      "decisions": [
        {
          "decision": "Keep the staged release for next week",
          "evidence": [
            {
              "start": 312.7,
              "end": 319.4,
              "speaker": "SPEAKER_00",
              "text": "We'll keep the staged release for next week."
            }
          ]
        }
      ],
      "risks": [
        {
          "risk": "Support copy approval may slip the release checklist",
          "severity": "medium",
          "evidence": [
            {
              "start": 6.2,
              "end": 12.8,
              "speaker": "SPEAKER_01",
              "text": "Support still needs the updated help-center copy before release."
            }
          ]
        }
      ],
      "missed_opportunities": [],
      "chapters": [
        {
          "title": "Launch review",
          "start": 0,
          "end": 420,
          "summary": "The team reviews blockers, content updates, and release ownership."
        }
      ],
      "urgency": {
        "score": 5,
        "reasoning": "The release is on track, but one remaining approval item still needs to land."
      },
      "keywords": [
        "launch readiness",
        "support copy",
        "staged release",
        "handoff"
      ],
      "statistics": {
        "word_count": 2841,
        "segment_count": 221,
        "total_duration_seconds": 1731.8,
        "speaking_rate_wpm": 98.4
      },
      "speaker_stats": [
        {
          "speaker": "SPEAKER_00",
          "talk_time_seconds": 612.4,
          "word_count": 980,
          "segment_count": 77,
          "percentage": 35.4
        },
        {
          "speaker": "SPEAKER_01",
          "talk_time_seconds": 484.2,
          "word_count": 801,
          "segment_count": 61,
          "percentage": 28
        }
      ],
      "skipped_reason": null,
      "custom": {
        "schema_version": "2026-05-31",
        "domain": "insurance",
        "fields": {
          "call_reason": {
            "status": "matched",
            "selected": [
              {
                "id": 1,
                "code": "CLAIM_STATUS",
                "label": "Claim status",
                "confidence": 0.91,
                "evidence_segment_ids": [
                  4,
                  5
                ]
              }
            ],
            "reasoning": "The caller asked for an update on an existing claim."
          },
          "policy_line": {
            "status": "matched",
            "selected": [
              {
                "id": 1,
                "code": "AUTO",
                "label": "Auto",
                "confidence": 0.88,
                "evidence_segment_ids": [
                  2,
                  3
                ]
              }
            ],
            "reasoning": "The caller referenced their auto policy number."
          },
          "cancellation_risk": {
            "status": "matched",
            "value": 2,
            "confidence": 0.74,
            "evidence_segment_ids": [
              6
            ],
            "reasoning": "The caller expressed mild frustration but did not ask to cancel."
          }
        },
        "skipped_reason": null
      }
    },
    "feature_execution": {
      "transcription_requested": true,
      "transcription_applied": true,
      "transcription_model_requested": "standard",
      "transcription_model_applied": "standard",
      "transcription_elapsed_seconds": 12.4,
      "diarization_requested": true,
      "diarization_applied": true,
      "diarization_elapsed_seconds": 3.1,
      "analytics_requested": true,
      "analytics_profile_requested": "full",
      "analytics_applied": true,
      "analytics_elapsed_seconds": 9.9,
      "generic_analytics_requested": true,
      "generic_analytics_tier_requested": "deep",
      "generic_analytics_applied": true,
      "generic_analytics_tier_applied": "deep",
      "generic_analytics_elapsed_seconds": 7.8,
      "custom_analytics_requested": true,
      "custom_analytics_applied": true,
      "custom_analytics_elapsed_seconds": 2.1,
      "pii_processing_requested": true,
      "pii_processing_language": "en",
      "pii_processing_applied": true,
      "pii_processing_elapsed_seconds": 1.2
    },
    "processing_warnings": []
  },
  "error_message": null,
  "audio_duration_seconds": 120.5,
  "transcription_model": "standard",
  "diarization": true,
  "pii_processing": true,
  "analytics": {
    "profile": "full",
    "tier": "deep",
    "modules": [
      "summary",
      "sentiment",
      "key_topics",
      "action_items",
      "qa",
      "decisions",
      "risks",
      "chapters",
      "keywords",
      "statistics",
      "speaker_stats",
      "urgency"
    ],
    "custom_schema": {
      "domain": "insurance",
      "schema_version": "2026-05-31",
      "fields": []
    }
  },
  "estimated_cost_usd": 0.01,
  "billed_cost_usd": 0.01,
  "billing_status": "charged",
  "source_type": "upload",
  "result_download_available": true,
  "source_audio_available": true,
  "progress_percent": 100,
  "estimated_seconds_remaining": null,
  "cancel_token": null,
  "created_at": "2025-03-15T10:30:00Z",
  "started_at": "2025-03-15T10:30:02Z",
  "completed_at": "2025-03-15T10:31:45Z",
  "runtime_ms": 103000
}

Field                        Type             Description
id                           UUID             Unique job identifier
status                       string           queued | running | completed | failed
result                       object | null    Full transcript result for completed jobs. Use GET /jobs/{job_id}/download to fetch the same object as a file attachment.
error_message                string | null    Error detail if failed
audio_duration_seconds       number | null    Detected audio duration
transcription_model          string           Selected transcription tier: standard or premium
diarization                  boolean          Whether speaker diarization was requested
speaker_embeddings           boolean          Whether per-speaker voice embedding vectors were requested (premium + diarization only)
pii_processing               boolean          Whether PII processing was requested
analytics                    object | null    Normalized analytics request: profile, tier, modules, and custom_schema when analytics was requested.
estimated_cost_usd           number | null    Pre-run cost estimate
billed_cost_usd              number | null    Actual amount charged
billing_status               string | null    pending | charged | skipped | failed
source_type                  string | null    upload | remote_url
result_download_available    boolean          Whether JSON download is currently available
source_audio_available       boolean          Whether the original source audio is still available on the backend
progress_percent             number | null    Best-effort progress snapshot for queued/running jobs
estimated_seconds_remaining  integer | null   Best-effort ETA for queued/running jobs
cancel_token                 string | null    Token to cancel this job while it is queued or running
created_at                   datetime         Job creation timestamp
started_at                   datetime | null  When processing actually began
completed_at                 datetime | null  Job completion timestamp
runtime_ms                   number | null    Processing time in milliseconds

Result payload notes

  • GET /jobs/{job_id} returns the full nested result object after completion. GET /jobs returns job rows with a result text preview only (see List Jobs).
  • The nested result object is feature-dependent. Your client should treat optional sub-objects such as diarization, diarization.speaker_embeddings, pii_findings, feature_execution, processing_warnings, and analytics as conditional.
  • diarization.speaker_embeddings contains model, dimension, and vectors (map of speaker label to float array). Only present when the job was submitted with speaker_embeddings=true on premium diarized audio.
  • Phrase-level segments[] always include start, end, and text. speaker is present when diarization ran. Standard non-diarized jobs may also include decoder fields such as avg_logprob and no_speech_prob.
  • The premium model usually returns both words[] (word-level timestamps) and segments[] (phrase-level). With diarization=true, treat segments[] with speaker as the primary timeline. See the schema tables below.
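
As a sketch of what treating sub-objects as conditional can look like in client code, the hypothetical helper below renders segments[] as text lines and uses the speaker label only when it is present. It relies solely on the start, end, and text fields that the notes above say are always included.

```python
def transcript_lines(result):
    """Render segments as text lines, adding speaker labels only when they exist."""
    lines = []
    for segment in result.get("segments", []):
        stamp = f"[{segment['start']:.1f}-{segment['end']:.1f}]"
        speaker = segment.get("speaker")
        prefix = f"{stamp} {speaker}:" if speaker else stamp
        lines.append(f"{prefix} {segment['text']}")
    return lines
```

The same pattern (dict.get with a fallback) applies to diarization, pii_findings, feature_execution, processing_warnings, and analytics.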

List Jobs

Browse transcription jobs with optional status filtering.

GET /jobs · List jobs for the authenticated account

Query Parameter  Type     Required  Description
status           string   optional  Filter by status: queued, running, completed, or failed
limit            integer  optional  Max results (1–100). Default: 20.

Example request
# List recent completed jobs
curl "https://api.tuplets.ai/jobs?status=completed&limit=10" \
  -H "Authorization: Bearer $TUPLETS_API_KEY"

Response

{
  "items": [
    {
      "id": "550e8400-e29b-41d4-a716-446655440000",
      "status": "completed",
      "result": {
        "text": "Welcome everyone. Today we are reviewing launch readiness and confirming final ownership…"
      },
      "audio_duration_seconds": 120.5,
      "transcription_model": "standard",
      "diarization": true,
      "estimated_cost_usd": 0.012,
      "billed_cost_usd": 0.012,
      "created_at": "2025-03-15T10:30:00Z"
    }
  ],
  "total_items": 1,
  "status_filter": "completed"
}

Results are sorted newest-first. Use this endpoint to browse recent jobs, not to discover the ID of a newly submitted one. POST /jobs now returns the job ID directly. For completed jobs, result is a preview object with only text (truncated to about 1,000 characters). Use GET /jobs/{job_id} or GET /jobs/{job_id}/download for the full transcript JSON.
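
The query parameters can be validated before the request is built. The following sketch mirrors the documented constraints (the four status values and a limit between 1 and 100); the function name is hypothetical.

```python
from urllib.parse import urlencode

JOB_STATUSES = ("queued", "running", "completed", "failed")


def list_jobs_url(status=None, limit=20, base="https://api.tuplets.ai"):
    """Build the GET /jobs URL after checking the documented parameter ranges."""
    if status is not None and status not in JOB_STATUSES:
        raise ValueError(f"status must be one of {JOB_STATUSES}")
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    params = {}
    if status is not None:
        params["status"] = status
    params["limit"] = limit
    return f"{base}/jobs?{urlencode(params)}"
```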

Cancel a Job

Cancel a queued or running job. There are two methods: by job ID (authenticated) or by cancel token (shared).

Cancellation billing policy

Tuplets does not add a separate cancellation fee. Cancelled jobs are billed under the standard pricing rules, including minimum charges, so cancellation should be treated as a request to stop further work rather than a guarantee that the job becomes free.

Cancel by Job ID

DELETE /jobs/{job_id} · Cancel a job
Example request
curl -X DELETE https://api.tuplets.ai/jobs/550e8400-e29b-41d4-a716-446655440000 \
  -H "Authorization: Bearer $TUPLETS_API_KEY"

Cancel by Token

POST /jobs/cancel · Cancel a job using a shared cancel token
Example request
curl -X POST https://api.tuplets.ai/jobs/cancel \
  -H "Authorization: Bearer $TUPLETS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"cancel_token": "<cancel_token_from_job_status>"}'
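
For reference, the two cancel calls translate to Python's standard library roughly as follows. This sketch only constructs the requests, mirroring the curl examples above (including the Authorization header on the token variant); sending them with urllib.request.urlopen and handling the response body are left to the caller, since the cancel response format is not shown here. The function names are hypothetical.

```python
import json
import urllib.request

API_BASE = "https://api.tuplets.ai"


def cancel_by_id_request(job_id, api_key):
    """Build DELETE /jobs/{job_id}."""
    return urllib.request.Request(
        f"{API_BASE}/jobs/{job_id}",
        method="DELETE",
        headers={"Authorization": f"Bearer {api_key}"},
    )


def cancel_by_token_request(cancel_token, api_key):
    """Build POST /jobs/cancel with the shared cancel token as a JSON body."""
    body = json.dumps({"cancel_token": cancel_token}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/jobs/cancel",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
```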

Download Transcript

Download the full transcript JSON for a completed job. Under the current backend defaults, results are typically retained for 7 days after completion and each job allows up to 20 attachment downloads.

GET /jobs/{job_id}/download · Download transcript JSON as an attachment
Example request
curl https://api.tuplets.ai/jobs/550e8400-e29b-41d4-a716-446655440000/download \
  -H "Authorization: Bearer $TUPLETS_API_KEY" \
  -o transcript.json

The downloaded JSON is the same nested object returned in result from GET /jobs/{job_id}. That means feature-specific fields are present only when the related feature ran.

Availability and abuse limits

  • Completed-job transcript JSON is usually available for about 7 days under the current retention setting.
  • Each job currently allows up to 20 attachment downloads. After that the API returns 429.
  • If the retained JSON has expired, the API returns 410.
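
Both limits surface as HTTP status codes, so a client can map them to distinct errors. In the sketch below the exception and function names are hypothetical. Note that 429 here signals the per-job download cap rather than a short-lived rate limit, so retrying with backoff is unlikely to help.

```python
class TranscriptExpired(Exception):
    """The retained transcript JSON is no longer available (HTTP 410)."""


class DownloadLimitReached(Exception):
    """The per-job attachment download cap was hit (HTTP 429)."""


def check_download_status(status_code):
    """Raise a specific error for the documented failure codes; return None otherwise."""
    if status_code == 410:
        raise TranscriptExpired("transcript JSON has expired")
    if status_code == 429:
        raise DownloadLimitReached("download cap reached for this job")
    if status_code >= 400:
        raise RuntimeError(f"unexpected download status {status_code}")
    return None
```

Saving the first successful download locally avoids spending the remaining downloads on repeat fetches.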

Result Schema

The transcript JSON changes depending on which transcription model and optional features were requested.

Standard Model

Returns phrase-level segments. Can be combined with diarization, PII, and analytics.

Field                           Type     Notes
text                            string   Full transcript text. Redacted when PII processing ran.
segments[]                      array    Phrase-level segments with start, end, and text.
segments[].speaker              string?  Speaker label (e.g. SPEAKER_00). Present when diarization=true.
segments[].avg_logprob          number?  Decoder confidence (standard, non-diarized). Omitted on diarized segments.
segments[].no_speech_prob       number?  No-speech probability (standard, non-diarized). Omitted on diarized segments.
segments[].language             string?  Per-segment language when detected (standard path).
language                        string?  Detected or requested language code.
audio_duration_seconds          number?  Duration of the source audio.
transcription_meta              object?  Coverage metrics (coverage_ratio, total_covered_seconds) plus decode health fields when available.
diarization                     object?  Speaker count, confidence (medium or low), and warning. Present when diarization=true.
diarization.speaker_embeddings  object?  Per-speaker embedding vectors (model, dimension, vectors). Present when speaker_embeddings=true on premium diarized jobs.
pii_findings                    array?   Span-level PII findings. Present when pii_processing=true and the PII pass ran.
feature_execution               object?  Execution metadata for transcription tier, diarization, PII, generic analytics, and custom analytics timing.
processing_warnings             array?   Warnings when a feature was skipped or downgraded.
analytics                       object?  Canonical analytics payload (profile, modules, and module outputs). Present when analytics was requested.

Premium Model

Returns word-level words[] timestamps when forced alignment succeeds, plus phrase-level segments[] grouped from those words. Can be combined with diarization=true; in that case segments[] includes speaker labels and is the primary timeline for playback and analytics. If alignment fails, you may receive transcript-only segments[] without words[].

Field                                           Type      Notes
text                                            string    Full transcript text. Redacted when PII processing ran.
words[]                                         array?    Word-level timestamps. Each item has start, end, word, and language. Omitted if alignment did not run.
segments[]                                      array     Phrase-level segments. Include speaker when diarization=true.
segments[].speaker                              string?   Speaker label (e.g. SPEAKER_00). Present when diarization=true.
language                                        string?   Detected or requested language code.
audio_duration_seconds                          number?   Duration of the source audio.
transcription_meta                              object?   Coverage metrics plus decode health fields when available.
diarization                                     object?   Speaker count, confidence, and warning. Present when diarization=true.
diarization.speaker_embeddings                  object?   Per-speaker embedding vectors. Present when speaker_embeddings=true.
diarization.speaker_embeddings.model            string    Embedding model identifier (e.g. pyannote/speaker-diarization-community-1).
diarization.speaker_embeddings.dimension        number    Length of each embedding vector.
diarization.speaker_embeddings.vectors          object    Map of speaker label (e.g. SPEAKER_00) to float array.
pii_findings                                    array?    Span-level PII findings. Present when pii_processing=true and the PII pass ran.
feature_execution                               object?   Execution metadata for transcription tier, diarization, speaker embeddings, PII, and analytics timing.
feature_execution.speaker_embeddings_requested  boolean?  Whether speaker_embeddings was requested.
feature_execution.speaker_embeddings_applied    boolean?  Whether embedding vectors were included in the result.
processing_warnings                             array?    Warnings when a feature was skipped or downgraded.
analytics                                       object?   Canonical analytics payload. Present when analytics was requested.
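
The vectors map is intended for cross-recording speaker matching. The sketch below compares embeddings with cosine similarity, which is a common choice for speaker embeddings but an assumption here: this document does not specify a metric or a match threshold, and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length float vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def best_match(vector, known_vectors):
    """Return (label, score) for the most similar entry in a label-to-vector map."""
    scored = ((label, cosine_similarity(vector, known)) for label, known in known_vectors.items())
    return max(scored, key=lambda pair: pair[1])
```

Here known_vectors would be the vectors map from an earlier recording. Checking that both results report the same model and dimension before comparing is advisable, since vectors from different embedding models are not comparable.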

Advanced and diagnostic fields

These fields are optional. Most integrations can ignore them; they are useful for debugging decode quality, routing, and partial feature failures.

Field                                            Type       Notes
decode_attempts[]                                array?     Per-attempt decode profiles, params, decisions, and health metrics (standard path).
routing_metadata                                 object?    Internal routing codes (route_code, engine_code, align_code) when segments and duration are known.
transcription_meta.decode_attempts               number?    Count of decode attempts included in transcription_meta.
transcription_meta.issues                        string[]?  Decode quality flags such as low_coverage or repetition_detected.
pii_processing_error                             string?    Present when PII processing was requested but failed; findings may be empty.
pii_review_required                              boolean?   True when PII output needs manual review after a processing error.
feature_execution.pii_processing_skip_reason     string?    e.g. unsupported_language when PII was skipped.
feature_execution.transcription_model_requested  string?    Requested transcription tier: standard or premium.
feature_execution.transcription_model_applied    string?    Transcription tier that actually ran. Present on completed jobs.
analytics.statistics.advisory_limit_exceeded     string?    Present when generic analytics output hit an advisory cap.
analytics.generic_skipped_reason                 string?    Present when a generic profile was requested but the extraction path was skipped or downgraded.

Integration guidance

  • Model your client types so optional feature blocks are nullable or absent instead of required.
  • Do not assume PII ran just because pii_processing=true on the job. Check result.feature_execution.pii_processing_applied and result.processing_warnings for skip or degradation messages.
  • For analytics, read job.analytics on status responses and result.analytics on completed jobs. Use result.feature_execution.analytics_applied and result.feature_execution.custom_analytics_applied to confirm each path ran.
  • For downloaded JSON, read result.feature_execution.transcription_model_requested and result.feature_execution.transcription_model_applied to know which transcription tier produced the file. The job status field transcription_model mirrors the requested tier.
  • If you need a file download, GET /jobs/{job_id}/download returns the same JSON object as the nested result field.
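
The checks in this list can be gathered into one helper that reports what actually ran. This is a sketch with a hypothetical function name; it reads only fields named in this document and treats every block as optional.

```python
def feature_report(job):
    """Summarize which requested features actually ran on a job status response."""
    result = job.get("result") or {}
    execution = result.get("feature_execution") or {}
    return {
        "pii_applied": bool(execution.get("pii_processing_applied")),
        "pii_skip_reason": execution.get("pii_processing_skip_reason"),
        "analytics_applied": bool(execution.get("analytics_applied")),
        "custom_analytics_applied": bool(execution.get("custom_analytics_applied")),
        "model_applied": execution.get("transcription_model_applied"),
        "warnings": list(result.get("processing_warnings") or []),
    }
```

For a downloaded transcript file, which is the result object itself, wrap it first: feature_report({"result": downloaded}).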

Browser Upload Flow

For large files (up to 1 GB), use the direct-to-storage upload flow to avoid proxying through your server.

The browser upload flow has three steps:

  1. Request an upload target via POST /jobs/upload-target.
  2. Upload the file directly to the returned URL using a PUT request.
  3. Submit the job referencing uploaded_audio_key and uploaded_audio_token.

Step 1: Get an upload target

POST /jobs/upload-target · Get a signed upload URL
Example request
curl -X POST https://api.tuplets.ai/jobs/upload-target \
  -H "Authorization: Bearer $TUPLETS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"filename": "meeting.mp3", "content_type": "audio/mpeg"}'

Step 2: Upload the file directly

Example request
# Upload directly using the signed URL
curl -X PUT "{upload_url}" \
  -H "Content-Type: audio/mpeg" \
  --data-binary @meeting.mp3

Step 3: Submit the job

Example request
curl -X POST https://api.tuplets.ai/jobs \
  -H "Authorization: Bearer $TUPLETS_API_KEY" \
  -F "uploaded_audio_key={object_key}" \
  -F "uploaded_audio_token={upload_token}" \
  -F "language=en" \
  -F "diarization=true"
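The request bodies for steps 1 and 3 can be sketched in Python as below. This is only an illustration: the helper names are hypothetical, and it assumes the step 1 response exposes object_key and upload_token under those names, mirroring the placeholders in the curl examples. Sending the requests is left to whatever HTTP client you already use.

```python
import json


def upload_target_body(filename, content_type):
    """JSON body for POST /jobs/upload-target (step 1)."""
    return json.dumps({"filename": filename, "content_type": content_type})


def job_form_fields(target, language="en", diarization=True):
    """Form fields for POST /jobs (step 3), built from the step 1 response.

    ASSUMPTION: the response carries `object_key` and `upload_token`.
    """
    missing = [k for k in ("object_key", "upload_token") if not target.get(k)]
    if missing:
        raise ValueError("upload target is missing: " + ", ".join(missing))
    return {
        "uploaded_audio_key": target["object_key"],
        "uploaded_audio_token": target["upload_token"],
        "language": language,
        "diarization": "true" if diarization else "false",
    }
```

Failing early on a missing key or token keeps an incomplete upload target from turning into a rejected job submission.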

Analytics Output

Completed jobs return result.analytics when analytics was requested. This is the canonical analytics payload for generic profiles and custom analytics.

Submit analytics with the analytics form field. Profile basic maps to generic tier fast; profile full maps to tier deep. Custom domain packs and custom schemas bill separately and can be combined with either profile.

Generic profile and custom analytics each require at least 15 seconds of audio. Evidence arrays on audit fields are hydrated deterministically from transcript segments after the model returns cited segment IDs.

Model hosting boundary

We do not send customer audio or transcripts to third-party LLM APIs for processing. Inference runs inside our controlled pipeline using self-hosted models.

Full profile output caps

Full profile (deep tier) generic modules are intentionally capped so the structured JSON stays complete on long recordings. Treat these as maximums, not guaranteed counts.

  • evidence is hydrated from at most 3 transcript segments per item.
  • qa_pairs: up to 12 items.
  • action_items: up to 12 items.
  • key_topics: up to 12 items.
  • risks: up to 12 items.
  • missed_opportunities: up to 12 items.
  • chapters: up to 12 items.
  • action_items[*].source_quote is derived from the first hydrated evidence snippet rather than copied verbatim from model output.

Basic Profile Example

Representative result.analytics for profile basic (tier fast).

{
  "profile": "basic",
  "tier": "fast",
  "modules": [
    "summary",
    "sentiment",
    "key_topics",
    "statistics",
    "speaker_stats",
    "chapters",
    "keywords"
  ],
  "summary": "A product review meeting covering launch readiness, one support blocker, and the agreed next steps.",
  "sentiment": {
    "overall": "neutral",
    "explanation": "The discussion is mostly procedural, with mild urgency around a remaining launch blocker.",
    "distribution": {
      "neutral": {
        "count": 10,
        "percentage": 71.4,
        "avg_score": 0.81
      },
      "positive": {
        "count": 4,
        "percentage": 28.6,
        "avg_score": 0.73
      }
    },
    "per_speaker": [
      {
        "speaker": "SPEAKER_00",
        "positive_count": 1,
        "neutral_count": 4,
        "negative_count": 0,
        "total_segments": 5,
        "avg_score": 0.79
      }
    ],
    "segments": [
      {
        "start": 0,
        "end": 14.2,
        "speaker": "SPEAKER_00",
        "text": "Let's review launch readiness and open blockers.",
        "label": "neutral",
        "score": 0.81
      }
    ]
  },
  "key_topics": [
    {
      "topic": "Launch readiness",
      "evidence": [
        {
          "start": 0,
          "end": 14.2,
          "speaker": "SPEAKER_00",
          "text": "Let's review launch readiness and open blockers."
        }
      ]
    },
    {
      "topic": "Support approval blocker",
      "evidence": [
        {
          "start": 48.1,
          "end": 56.4,
          "speaker": "SPEAKER_02",
          "text": "Support still needs to approve the migration warning copy."
        }
      ]
    }
  ],
  "action_items": [],
  "qa_pairs": [],
  "decisions": [],
  "risks": [],
  "missed_opportunities": [],
  "chapters": [
    {
      "title": "Open blockers",
      "start": 0,
      "end": 58,
      "summary": "The team reviewed support and rollout blockers."
    },
    {
      "title": "Next steps",
      "start": 58,
      "end": 108,
      "summary": "The team aligned on rollout timing and follow-up work."
    }
  ],
  "urgency": {
    "score": 1,
    "reasoning": ""
  },
  "keywords": [
    "launch readiness",
    "support approval blocker",
    "rollout checklist"
  ],
  "statistics": {
    "word_count": 842,
    "segment_count": 14,
    "total_duration_seconds": 108.6,
    "speaking_rate_wpm": 465.2
  },
  "speaker_stats": [
    {
      "speaker": "SPEAKER_00",
      "talk_time_seconds": 37.8,
      "word_count": 290,
      "segment_count": 5,
      "percentage": 34.8
    },
    {
      "speaker": "SPEAKER_01",
      "talk_time_seconds": 42.3,
      "word_count": 340,
      "segment_count": 5,
      "percentage": 39
    }
  ],
  "skipped_reason": null
}

Full Profile Example

Representative result.analytics for profile full (tier deep).

{
  "profile": "full",
  "tier": "deep",
  "modules": [
    "summary",
    "sentiment",
    "key_topics",
    "action_items",
    "qa",
    "decisions",
    "risks",
    "missed_opportunities",
    "urgency",
    "chapters",
    "keywords",
    "statistics",
    "speaker_stats"
  ],
  "summary": "A product review meeting covering launch readiness, support risks, and two follow-up actions.",
  "sentiment": {
    "overall": "neutral",
    "explanation": "The conversation was mostly procedural with one positive confirmation at the end.",
    "distribution": {
      "neutral": {
        "count": 10,
        "percentage": 71.4,
        "avg_score": 0.81
      },
      "positive": {
        "count": 4,
        "percentage": 28.6,
        "avg_score": 0.73
      }
    },
    "per_speaker": [
      {
        "speaker": "SPEAKER_00",
        "positive_count": 1,
        "neutral_count": 4,
        "negative_count": 0,
        "total_segments": 5,
        "avg_score": 0.79
      }
    ],
    "segments": [
      {
        "start": 0,
        "end": 14.2,
        "speaker": "SPEAKER_00",
        "text": "Let's review launch readiness and open blockers.",
        "label": "neutral",
        "score": 0.81
      }
    ]
  },
  "key_topics": [
    {
      "topic": "Launch readiness",
      "evidence": [
        {
          "start": 0,
          "end": 14.2,
          "speaker": "SPEAKER_00",
          "text": "Let's review launch readiness and open blockers."
        }
      ]
    }
  ],
  "action_items": [
    {
      "action": "Publish the revised rollout checklist",
      "owner": "SPEAKER_01",
      "due_date": "Friday",
      "confidence": "high",
      "source_quote": "I'll send the revised rollout checklist by Friday.",
      "evidence": [
        {
          "start": 92.5,
          "end": 100.3,
          "speaker": "SPEAKER_01",
          "text": "I'll send the revised rollout checklist by Friday."
        }
      ]
    }
  ],
  "qa_pairs": [
    {
      "question": "Are there any open support blockers?",
      "answer": "Only the migration warning copy still needs review.",
      "evidence": [
        {
          "start": 31.4,
          "end": 36.2,
          "speaker": "SPEAKER_00",
          "text": "Are there any open support blockers?"
        }
      ]
    }
  ],
  "decisions": [
    {
      "decision": "Proceed with the staged rollout next week",
      "evidence": [
        {
          "start": 66.8,
          "end": 73.1,
          "speaker": "SPEAKER_00",
          "text": "We'll keep the staged rollout for next week."
        }
      ]
    }
  ],
  "risks": [
    {
      "risk": "Migration warning copy may delay support approval",
      "severity": "high",
      "evidence": [
        {
          "start": 48.1,
          "end": 56.4,
          "speaker": "SPEAKER_02",
          "text": "Support still needs to approve the migration warning copy."
        }
      ]
    }
  ],
  "missed_opportunities": [
    {
      "opportunity": "No owner was assigned for the support approval follow-up",
      "type": "follow-up",
      "impact": "medium",
      "evidence": [
        {
          "start": 48.1,
          "end": 56.4,
          "speaker": "SPEAKER_02",
          "text": "Support still needs to approve the migration warning copy."
        }
      ]
    }
  ],
  "chapters": [
    {
      "title": "Open blockers",
      "start": 0,
      "end": 58,
      "summary": "The team reviewed support and rollout blockers."
    },
    {
      "title": "Next steps",
      "start": 58,
      "end": 108,
      "summary": "The team agreed on rollout timing and follow-up actions."
    }
  ],
  "urgency": {
    "score": 6,
    "reasoning": "A blocker remains open before next week's rollout."
  },
  "keywords": [
    "launch readiness",
    "support blockers",
    "rollout checklist"
  ],
  "statistics": {
    "word_count": 842,
    "segment_count": 14,
    "total_duration_seconds": 108.6,
    "speaking_rate_wpm": 465.2
  },
  "speaker_stats": [
    {
      "speaker": "SPEAKER_00",
      "talk_time_seconds": 37.8,
      "word_count": 290,
      "segment_count": 5,
      "percentage": 34.8
    },
    {
      "speaker": "SPEAKER_01",
      "talk_time_seconds": 42.3,
      "word_count": 340,
      "segment_count": 5,
      "percentage": 39
    }
  ],
  "skipped_reason": null
}
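
As one way to consume this payload, the sketch below flattens action_items into simple task records, for example to feed a task tracker. The function name and the output keys are hypothetical; only the input field names come from the example above.

```python
def action_items_to_tasks(analytics):
    """Flatten analytics["action_items"] into simple task records."""
    tasks = []
    # Basic profile returns an empty array; treat a missing key the same way.
    for item in analytics.get("action_items") or []:
        evidence = item.get("evidence") or []
        first = evidence[0] if evidence else {}
        tasks.append({
            "title": item.get("action"),
            "owner": item.get("owner"),
            "due": item.get("due_date"),
            "quote": item.get("source_quote"),
            # Timestamp of the first evidence snippet, if any was hydrated.
            "at_seconds": first.get("start"),
        })
    return tasks
```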

Domain Pack Output Example

Representative result.analytics.custom for domain insurance (schema_version 2026-05-31).

{
  "schema_version": "2026-05-31",
  "domain": "insurance",
  "fields": {
    "call_reason": {
      "status": "matched",
      "selected": [
        {
          "id": 1,
          "code": "CLAIM_STATUS",
          "label": "Claim status",
          "confidence": 0.91,
          "evidence_segment_ids": [
            4,
            5
          ]
        }
      ],
      "reasoning": "The caller asked for an update on an existing claim."
    },
    "policy_line": {
      "status": "matched",
      "selected": [
        {
          "id": 1,
          "code": "AUTO",
          "label": "Auto",
          "confidence": 0.88,
          "evidence_segment_ids": [
            2,
            3
          ]
        },
        {
          "id": 2,
          "code": "HOME",
          "label": "Home",
          "confidence": 0.76,
          "evidence_segment_ids": [
            7
          ]
        }
      ],
      "reasoning": "The caller referenced both auto and home policies during the conversation."
    },
    "claim_stage": {
      "status": "no_match",
      "selected": [],
      "reasoning": "The transcript does not contain enough evidence to determine claim stage."
    },
    "cancellation_risk": {
      "status": "matched",
      "value": 2,
      "confidence": 0.74,
      "evidence_segment_ids": [
        6
      ],
      "reasoning": "The caller expressed mild frustration but did not ask to cancel."
    },
    "missed_retention_opportunity": {
      "status": "matched",
      "value": false,
      "confidence": 0.82,
      "evidence_segment_ids": [
        8,
        9
      ],
      "reasoning": "The agent addressed the caller's billing concern directly."
    },
    "compliance_disclosure_present": {
      "status": "matched",
      "value": true,
      "confidence": 0.95,
      "evidence_segment_ids": [
        0
      ],
      "reasoning": "The call opening included a recording disclosure."
    }
  },
  "skipped_reason": null
}

Field Reference

result.analytics contains generic profile fields, custom fields, or both depending on the analytics request.

Field                 Type           Description
profile               string | null  Requested analytics profile: basic, full, or null for custom-only jobs.
tier                  string | null  Resolved generic analytics tier: fast for profile basic, deep for profile full, or null for custom-only jobs.
modules               array          Module keys included in this response (summary, sentiment, key_topics, action_items, qa, decisions, risks, missed_opportunities, urgency, chapters, keywords, statistics, speaker_stats).
summary               string         Compact narrative summary of the recording.
sentiment             object         Overall label, explanation, distribution, per-speaker rollup, and segment timeline.
key_topics            array          Topics with evidence snippets using transcript timestamps and speaker labels. Maximum 12 items.
action_items          array          Full profile only. Tasks, owners, due dates, confidence, derived source quotes, and evidence. Basic profile returns an empty array.
qa_pairs              array          Full profile only. Question and answer pairs with supporting evidence. Basic profile returns an empty array.
decisions             array          Full profile only. Decisions made in the recording with evidence references. Basic profile returns an empty array.
risks                 array          Full profile only. Risks or blockers with severity and evidence. Basic profile returns an empty array.
missed_opportunities  array          Full profile only. Follow-up, clarification, or unanswered openings surfaced from the transcript. Basic profile returns an empty array.
chapters              array          Topical time ranges with a title and short summary. Maximum 12 items.
urgency               object         Full profile only for substantive scoring. Basic profile returns the default object with score=1 and empty reasoning.
keywords              array          CPU-extracted keywords for quick indexing and filtering.
statistics            object         Word count, segment count, total duration, and speaking rate.
speaker_stats         array | null   Per-speaker talk time, word count, segment count, and share of speaking time when diarization is enabled.
skipped_reason        string | null  Present when a generic profile was requested but the full extraction path was skipped or downgraded.
custom                object?        Domain-pack or schema-driven custom fields when custom analytics ran.

Custom Analytics Request

Field                        Type                                             Description
profile                      basic | full | none | off                        Optional generic analytics profile. Use none/off for custom-only extraction.
domain                       string                                           Optional built-in domain pack. When set, the pack's fields are injected automatically; see Built-in Domain Packs below for keys, types, and option codes.
schema.schema_version        string                                           Optional caller-defined schema version. Defaults to 2026-05-31.
schema.fields                array                                            Optional custom field definitions. When domain is set, pack fields are injected automatically and caller-defined fields are appended. Duplicate keys are rejected at validation.
schema.fields[].key          string                                           Stable snake_case field key. Must start with a lowercase letter and contain lowercase letters, numbers, or underscores.
schema.fields[].label        string                                           Human-readable field label.
schema.fields[].type         single_select | multi_select | boolean | rating  Controls the validated output shape.
schema.fields[].description  string?                                          Optional extraction guidance for this field.
schema.fields[].required     boolean?                                         Marks the field as expected by the caller; unmatched required fields still return status no_match.
schema.fields[].options      array?                                           Required for select fields. Options include id, uppercase code, and label.
schema.fields[].min / max    integer?                                         Bounds for rating fields. Defaults are 1 and 5.
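
A client-side pre-check of schema.fields against the rules in this table might look like the sketch below. It is illustrative only: server-side validation remains authoritative, the function name is hypothetical, and the check that min must be below max is an assumption that the table does not state.

```python
import re

KEY_PATTERN = re.compile(r"[a-z][a-z0-9_]*")
FIELD_TYPES = {"single_select", "multi_select", "boolean", "rating"}


def validate_fields(fields):
    """Return a list of problems found in schema.fields (empty when clean)."""
    problems = []
    seen = set()
    for field in fields:
        key = field.get("key", "")
        if not KEY_PATTERN.fullmatch(key):
            problems.append(f"{key!r}: key must be snake_case starting with a lowercase letter")
        if key in seen:
            problems.append(f"{key!r}: duplicate key")
        seen.add(key)
        ftype = field.get("type")
        if ftype not in FIELD_TYPES:
            problems.append(f"{key!r}: unknown type {ftype!r}")
        if ftype in ("single_select", "multi_select") and not field.get("options"):
            problems.append(f"{key!r}: select fields require options")
        if ftype == "rating":
            low, high = field.get("min", 1), field.get("max", 5)  # documented defaults
            if low >= high:  # ASSUMPTION: bounds must be strictly ordered
                problems.append(f"{key!r}: min must be below max")
    return problems
```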

Custom Analytics Output

Field                                 Type                          Description
custom.schema_version                 string | null                 Schema version used for extraction.
custom.domain                         string | null                 Domain pack key when one was requested.
custom.fields                         object                        Map of custom field key to normalized field result.
custom.fields.*.status                matched | no_match | skipped  Whether the field matched transcript evidence, had no supported answer, or was skipped.
custom.fields.*.selected              array                         Selected options for single_select and multi_select fields, each with id, code, label, confidence, and evidence_segment_ids.
custom.fields.*.value                 boolean | number | null       Boolean or rating value for boolean and rating fields.
custom.fields.*.confidence            number?                       Model confidence for boolean/rating fields when matched.
custom.fields.*.evidence_segment_ids  number[]                      Transcript segment indexes supporting the classification. Maximum 8 per field.
custom.fields.*.reasoning             string                        Short explanation for the selected value or no_match result.
custom.skipped_reason                 string | null                 Present when custom analytics could not run or generation failed.
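
To show how the two output shapes can be read uniformly, here is a small sketch that reduces custom.fields to plain values. The function name is hypothetical. Select fields become a list of option codes (a single_select yields a one-element list), boolean and rating fields become their value, and anything not matched becomes None.

```python
def flatten_custom(custom):
    """Map each custom field key to a plain value, or None when not matched."""
    flat = {}
    for key, field in (custom.get("fields") or {}).items():
        if field.get("status") != "matched":
            flat[key] = None  # no_match or skipped
        elif "selected" in field:
            flat[key] = [option["code"] for option in field["selected"]]
        else:
            flat[key] = field.get("value")
    return flat
```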

Built-in Domain Packs

Stable field catalogs for insurance, support, sales, healthcare, and financial_services. Requesting a domain alone injects all pack fields — schema.fields is optional. Caller-defined fields are appended after pack fields; duplicate keys are rejected at validation. Every field key below appears in custom.fields on completed jobs; option codes are stable across API versions.

Supported domain keys: insurance, support, sales, healthcare, financial_services.

Insurance (insurance)

Claims, policy lines, retention signals, and compliance disclosures for insurance call centers.

Key                            Label                          Type           Required  Options / bounds  Description
call_reason                    Reason for call                single_select  yes       CLAIM_STATUS, NEW_POLICY, POLICY_CHANGE, BILLING, COVERAGE_QUESTION, CANCELLATION, OTHER
policy_line                    Policy line                    multi_select   no        AUTO, HOME, RENTERS, LIFE, HEALTH, COMMERCIAL, UNKNOWN
claim_stage                    Claim stage                    single_select  no        FIRST_NOTICE, DOCUMENT_COLLECTION, ADJUSTER_REVIEW, SETTLEMENT, DENIAL_APPEAL, NOT_A_CLAIM
cancellation_risk              Cancellation risk              rating         no        1–5  1 means no cancellation signal; 5 means explicit cancellation intent.
missed_retention_opportunity   Missed retention opportunity   boolean        no        -  True when the agent could have addressed cancellation, price, or coverage concern but did not.
compliance_disclosure_present  Compliance disclosure present  boolean        no        -  True when required disclosure, recording notice, or authorization language is present.

Support (support)

Contact reasons, resolution outcomes, escalation signals, and churn risk for customer support interactions.

Key                  Label                Type           Required  Options / bounds  Description
contact_reason       Contact reason       single_select  yes       ACCOUNT_ACCESS, BILLING, TECHNICAL_ISSUE, ORDER_STATUS, REFUND, CANCELLATION, OTHER
resolution_status    Resolution status    single_select  no        RESOLVED, PARTIALLY_RESOLVED, ESCALATED, CALLBACK_REQUIRED, UNRESOLVED
escalation_required  Escalation required  boolean        no        -
churn_risk           Churn risk           rating         no        1–5
refund_requested     Refund requested     boolean        no        -

Sales (sales)

Buying intent, pipeline stage, objections, and upsell signals for sales conversations.

Key                        Label                      Type           Required  Options / bounds  Description
buying_intent              Buying intent              rating         no        1–5  1 means no interest; 5 means explicit intent to buy or advance.
lead_stage                 Lead stage                 single_select  no        UNQUALIFIED, QUALIFYING, QUALIFIED, PROPOSAL, NEGOTIATION, CLOSED_WON_SIGNAL, CLOSED_LOST_SIGNAL
objection_type             Objection type             multi_select   no        PRICE, TIMING, AUTHORITY, NEED, COMPETITOR, IMPLEMENTATION, NONE
competitor_mentioned       Competitor mentioned       boolean        no        -
next_step_committed        Next step committed        boolean        no        -
missed_upsell_opportunity  Missed upsell opportunity  boolean        no        -

Healthcare (healthcare)

Patient intent, escalation needs, referral issues, and consent language for healthcare front-desk calls.

Key                       Label                                           Type           Required  Options / bounds  Description
patient_intent            Patient intent                                  single_select  no        SCHEDULE_APPOINTMENT, RESCHEDULE, PRESCRIPTION, TEST_RESULTS, BILLING_INSURANCE, SYMPTOM_TRIAGE, OTHER
escalation_required       Clinical or administrative escalation required  boolean        no        -
referral_issue            Referral issue                                  boolean        no        -
consent_language_present  Consent language present                        boolean        no        -

Financial Services (financial_services)

Service reasons, fraud signals, complaint risk, and regulatory language for banking and fintech calls.

Key                          Label                        Type           Required  Options / bounds  Description
service_reason               Service reason               single_select  no        ACCOUNT_ACCESS, TRANSACTION_DISPUTE, FRAUD_CONCERN, KYC_DOCUMENTS, LOAN_OR_CREDIT, COMPLAINT, OTHER
fraud_signal                 Fraud signal                 boolean        no        -
complaint_risk               Complaint risk               rating         no        1–5
regulatory_language_present  Regulatory language present  boolean        no        -

Common Use Cases

How teams leverage structured analytics across different scenarios.

Sales Calls

Extract next steps, objections, competitor mentions, and buyer sentiment. Auto-generate CRM notes and follow-up tasks from action_items and risks.

Customer Support

Surface escalations, sentiment drops, and unresolved issues via qa_pairs and sentiment.segments. Identify frustrated customers before they churn.

Internal Meetings

Track decisions, action items, and topic coverage across recurring meetings. Use speaker_stats and sentiment.per_speaker to monitor participation balance.

Recruiting Interviews

Evaluate candidate responses, extract Q&A patterns, and generate structured interview summaries with qa_pairs, summary, and key_topics.

Audio Formats & Constraints

Supported input formats and file constraints for transcription jobs.

Format  Extension  Notes
MP3     .mp3       Most common. Good balance of quality and size.
WAV     .wav       Uncompressed. Best quality but largest size.
FLAC    .flac      Lossless compression. High quality.
OGG     .ogg       Open format. Good for web use.
M4A     .m4a       Common in Apple ecosystem.
AAC     .aac       Advanced Audio Codec.
WMA     .wma       Windows Media Audio.
WebM    .webm      Web-optimized format.
Opus    .opus      Low latency, high compression.

Constraints

  • Minimum audio duration: 5 seconds
  • Maximum audio duration: 8 hours (WAV files are further limited by the 4 GiB RIFF container)
  • Maximum file size (all submission methods — direct upload, browser upload, and remote URL): 1 GB
  • Generic profile analytics and custom analytics each require at least 15 seconds of audio
  • At most one generic profile (basic or full) per job; custom domain packs can be added on top
  • Remote URLs must be publicly accessible HTTP/HTTPS endpoints pointing directly to media on a public internet host; private and localhost-style targets are rejected, including redirect targets
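
These limits can be checked before submitting a job. The sketch below is a client-side convenience, not the server's validation logic. The function name is hypothetical, and reading "1 GB" as 10**9 bytes is an assumption; the service may count the limit differently.

```python
MIN_SECONDS = 5
MAX_SECONDS = 8 * 60 * 60
ANALYTICS_MIN_SECONDS = 15
MAX_BYTES = 1_000_000_000  # ASSUMPTION: "1 GB" read as 10**9 bytes


def preflight(duration_seconds, size_bytes, wants_analytics=False):
    """Return a list of constraint violations (empty when the file looks acceptable)."""
    problems = []
    if duration_seconds < MIN_SECONDS:
        problems.append("audio is shorter than 5 seconds")
    if duration_seconds > MAX_SECONDS:
        problems.append("audio is longer than 8 hours")
    if size_bytes > MAX_BYTES:
        problems.append("file is larger than 1 GB")
    if wants_analytics and duration_seconds < ANALYTICS_MIN_SECONDS:
        problems.append("analytics requires at least 15 seconds of audio")
    return problems
```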

Supported Languages

Standard transcription supports 96 languages. Premium transcription supports Auto-detect plus 30 languages.

Specify the language with its ISO 639-1 two-letter code where one exists (e.g., en for English, es for Spanish, ja for Japanese); a few languages in the lists below use three-letter codes (haw, fil, yue). Use auto or omit the parameter for automatic language detection.

Premium uses a narrower language set today: ar, cs, da, de, el, en, es, fa, fi, fil, fr, hi, hu, id, it, ja, ko, mk, ms, nl, pl, pt, ro, ru, sv, th, tr, vi, yue, zh. If you need a language outside that list, use standard.
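
The fallback rule above can be expressed as a small lookup. This is a sketch with hypothetical function names; the language set is copied from the list above and would need updating whenever that list changes.

```python
PREMIUM_LANGUAGES = frozenset(
    "ar cs da de el en es fa fi fil fr hi hu id it ja ko mk ms nl "
    "pl pt ro ru sv th tr vi yue zh".split()
)


def premium_supports(language):
    """True when premium can handle the language; None or "auto" means auto-detect."""
    if language is None or language == "auto":
        return True
    return language in PREMIUM_LANGUAGES


def pick_tier(language, prefer_premium=True):
    """Choose premium when wanted and supported, otherwise fall back to standard."""
    return "premium" if prefer_premium and premium_supports(language) else "standard"
```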

Standard-tier language codes:

af, am, ar, as, az, ba, be, bg, bn, bo, br, bs, ca, cs, cy, da, de, el, en, es, et, eu, fa, fi, fo, fr, gl, gu, ha, haw, hi, hr, hu, hy, id, is, it, ja, jw, ka, kk, km, kn, ko, la, lb, ln, lo, lt, lv, mg, mi, mk, ml, mn, mr, ms, mt, my, ne, nl, nn, no, oc, pa, pl, ps, pt, ro, ru, sa, sd, si, sk, sl, sn, so, sr, su, sv, sw, ta, te, tg, th, tk, tl, tr, tt, uk, ur, uz, vi, yi, yo, zh

Pricing

You are charged per second of audio processed, with a minimum charge based on the selected feature bundle. New accounts receive $2.50 in free credits to get started.

Rules that affect what you pay

  • Premium transcription is a standalone tier billed at its published hourly rate and carries a minimum charge of $0.035 per job.
  • Generic profile and custom analytics each require at least 15 seconds of audio.
  • Basic analytics minimum charge: $0.08. Full analytics minimum: $0.125. Custom analytics minimum: $0.15 (stackable with a profile).
  • Premium supports Auto-detect plus these languages: ar, cs, da, de, el, en, es, fa, fi, fil, fr, hi, hu, id, it, ja, ko, mk, ms, nl, pl, pt, ro, ru, sv, th, tr, vi, yue, zh.
  • At most one generic profile per job; custom analytics can be added to any profile.
  • Cancellation does not add a separate fee, and cancelled jobs are billed under the normal pricing rules.

Feature                 Rate          Minimum charge  Required
Standard transcription  $0.14 / hour  $0.01           Always included
Premium transcription   $0.25 / hour  $0.035          Always included
Speaker diarization     $0.08 / hour  $0.035          Optional
PII processing          $0.06 / hour  $0.035          Optional
Basic analytics         $0.24 / hour  $0.08           Optional
Full analytics          $0.60 / hour  $0.125          Optional
Custom analytics        $0.45 / hour  $0.15           Optional

Bundle                                    Minimum billable charge
Transcript only                           $0.01
Premium transcription only                $0.035
Transcript + diarization                  $0.035
Transcript + PII                          $0.035
Premium transcription + PII               $0.08
Transcript + basic analytics              $0.08
Transcript + full analytics               $0.125
Transcript + custom analytics             $0.15
Transcript + full analytics + custom      $0.15
Transcript + diarization + PII            $0.065
Any bundle that includes basic analytics  $0.08
Any bundle that includes full analytics   $0.125
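
The per-second rule can be sketched as below. This is an illustrative estimate, not the billing engine: it assumes the hourly rates of the selected features simply add up, it applies no rounding, and it takes the bundle minimum as a value you look up in the bundle table above. The feature keys (standard, pii, and so on) are hypothetical labels, not API parameters.

```python
from decimal import Decimal

# Hourly rates copied from the feature table; the keys are hypothetical labels.
HOURLY_RATES = {
    "standard": Decimal("0.14"),
    "premium": Decimal("0.25"),
    "diarization": Decimal("0.08"),
    "pii": Decimal("0.06"),
    "basic_analytics": Decimal("0.24"),
    "full_analytics": Decimal("0.60"),
    "custom_analytics": Decimal("0.45"),
}


def estimate_cost(seconds, features, bundle_minimum):
    """Per-second usage cost, floored at the bundle minimum from the table.

    ASSUMPTION: feature rates are additive and no rounding is applied.
    """
    hourly = sum((HOURLY_RATES[name] for name in features), Decimal("0"))
    usage = hourly * Decimal(seconds) / Decimal(3600)
    return max(usage, Decimal(bundle_minimum))
```

For example, estimate_cost(30, ["standard"], "0.01") stays at the $0.01 minimum because 30 seconds of standard transcription costs far less than one cent under these assumptions.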

Rate Limits

Rate limits protect the platform from abuse. Limits apply per IP address and per user account.

Scope                      Limit         Window
Global (IP)                300 requests  5 minutes
Job creation (IP)          300 requests  1 hour
Job creation (user)        120 requests  1 hour
Job creation burst (user)  20 requests   5 minutes
Download (IP)              120 requests  1 hour
Download (user)            40 requests   15 minutes

When rate limited

The API returns 429 Too Many Requests. Implement exponential backoff in your integration before retrying.
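
To stay under a limit in the first place, a client can throttle itself. The sketch below is a simple sliding-window guard with an injectable clock. The class name is hypothetical, and the server may count its windows differently (fixed rather than sliding, for example), so treat this as a conservative approximation.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side guard: allow at most `limit` calls per `window` seconds."""

    def __init__(self, limit, window, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self._stamps = deque()

    def try_acquire(self):
        """Record a call and return True, or return False if the window is full."""
        now = self.clock()
        # Drop timestamps that have left the window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()
        if len(self._stamps) >= self.limit:
            return False
        self._stamps.append(now)
        return True


# Mirrors the per-user job creation burst row: 20 requests per 5 minutes.
job_burst = SlidingWindowLimiter(limit=20, window=300)
```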

Error Codes

Every error response includes a detail field explaining what went wrong and how to fix it.

Code  Label                Detail                                                        Fix
400   Bad Request          Invalid input, unsafe remote URL, or missing required field.  Check your request parameters and ensure remote URLs use a public HTTP/HTTPS host.
401   Unauthorized         Missing or invalid API key / session.                         Provide a valid Authorization: Bearer header.
402   Insufficient Credit  Account balance is too low for the estimated cost.            Top up your account and retry.
403   Forbidden            Email not verified or action not allowed.                     Verify your email address first.
404   Not Found            The requested resource does not exist.                        Check the job ID or resource path.
409   Conflict             Concurrent job limit reached or resource unavailable.         Wait for transcription to complete.
410   Gone                 Result download is no longer available.                       Download results within the published retention window.
429   Rate Limited         Too many requests in the current window.                      Back off and retry with exponential delay.
503   Service Unavailable  Processing capacity temporarily unavailable.                  Retry after a short delay.

Error responses follow this format:

{
  "detail": "You do not have enough credit for this job."
}
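
A client might normalize error responses along these lines. This is a sketch with a hypothetical function name. It treats only 429 and 503 as automatically retryable, following the Fix column above; a 409 also clears once a running job finishes, but waiting for that is left to the caller here.

```python
import json

RETRYABLE = {429, 503}


def describe_error(status, body):
    """Pull `detail` out of an error body and say whether a retry may help."""
    try:
        payload = json.loads(body)
        detail = payload.get("detail") if isinstance(payload, dict) else None
    except json.JSONDecodeError:
        detail = None  # not JSON; fall back to the raw body below
    return {
        "status": status,
        "detail": detail or body,
        "retryable": status in RETRYABLE,
    }
```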

Best Practices

Recommendations for building a reliable integration.

Poll with backoff

Poll job status every 2–5 seconds at first, then back off exponentially for longer-running jobs. Most jobs with under 1 hour of audio complete within 1–5 minutes.
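
One way to produce such a polling schedule is a generator of sleep intervals. The sketch below is illustrative: the function name, the growth factor of 1.5, and the 30-second ceiling are assumptions, not documented values.

```python
from itertools import islice


def poll_delays(initial=2.0, factor=1.5, ceiling=30.0):
    """Yield sleep intervals: start at `initial`, grow by `factor`, cap at `ceiling`."""
    delay = initial
    while True:
        yield min(delay, ceiling)
        delay *= factor


# First five intervals with the default settings.
first_five = list(islice(poll_delays(), 5))
```

A polling loop would sleep for each yielded interval between status requests and stop once the job reaches a terminal state.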

Store cancel tokens

Save the cancel_token returned by POST /jobs or by job status responses if you need to implement user-facing cancellation for queued or running jobs. The token must still be sent in an authenticated request from your account. There is no separate cancellation fee, and cancelled jobs are billed under the normal pricing rules.

Use browser uploads for large files

Files over 25 MB should use the browser upload flow. This avoids double-proxying through your server and is significantly faster.

Specify language for accuracy

Automatic detection is accurate, but specifying the language code yields better results, especially for code-switching or accented speech.

Download results promptly

Transcript JSON downloads are typically available for about 7 days after job completion under the current retention setting, and each job currently has a hard cap of 20 attachment downloads. Download and archive results you need to keep.

Handle rate limits gracefully

Check for 429 responses and implement retry logic with exponential backoff. Avoid assuming a Retry-After header is present.
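
A retry delay that honors Retry-After when the header happens to be present, and otherwise falls back to jittered exponential backoff, could look like this. The function name and the base and ceiling values are assumptions; nothing here implies the API actually sends the header.

```python
import random


def retry_delay(attempt, retry_after=None, base=1.0, ceiling=60.0, rng=random.random):
    """Seconds to wait before retry number `attempt` (0-based) after a 429."""
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # HTTP-date form or unparseable value: fall back to backoff
    backoff = min(ceiling, base * (2 ** attempt))
    # Full jitter spreads retries so clients do not wake up in lockstep.
    return backoff * rng()
```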