Tuplets

Version 1.0 · REST API

API Documentation

Everything you need to integrate audio transcription into your application. Submit audio, poll for results, and download structured transcripts with speaker labels, PII redaction, and analytics.


Quick Start

Get your first transcription running in under a minute.

1. Create an account and get your API key

Sign up at tuplets.ai/signup, verify your email, and generate an API key from your dashboard settings. Keys are prefixed with tb_ and are shown only once.

2. Set your API key

Pass your API key as a Bearer token in the Authorization header on every request.

Example request

export TUPLETS_API_KEY="tb_your_key_here"

# Verify the key works
curl -H "Authorization: Bearer $TUPLETS_API_KEY" \
  https://api.tuplets.ai/health

3. Submit audio for transcription

Upload an audio file directly, use the browser upload flow, or point to a remote URL on a public HTTP/HTTPS host. We support MP3, WAV, M4A, FLAC, OGG, AAC, WMA, WebM, and Opus.

Remote URLs are validated before processing starts. Private, loopback, link-local, and otherwise non-public hosts are rejected, and redirect targets must also stay on public internet hosts.

Example request

# Upload a local file
curl -X POST https://api.tuplets.ai/jobs \
  -H "Authorization: Bearer $TUPLETS_API_KEY" \
  -F "audio_file=@interview.mp3" \
  -F "language=en" \
  -F "diarization=true"

# Or submit a remote URL on a public host
curl -X POST https://api.tuplets.ai/jobs \
  -H "Authorization: Bearer $TUPLETS_API_KEY" \
  -F "remote_url=https://storage.example.com/recording.mp3" \
  -F "language=en"

4. Poll for completion and download

POST /jobs returns the new job ID immediately, along with status_url, cancel_url, and cancel_token. The examples below show the normal flow: submit, capture the returned job ID, then poll that job directly.

Example request

# Submit and capture the returned job ID
job_id=$(curl -s -X POST https://api.tuplets.ai/jobs \
  -H "Authorization: Bearer $TUPLETS_API_KEY" \
  -F "audio_file=@interview.mp3" \
  -F "language=en" \
  -F "diarization=true" | jq -r '.id')

# Poll job status
curl https://api.tuplets.ai/jobs/$job_id \
  -H "Authorization: Bearer $TUPLETS_API_KEY"

# Download completed transcript JSON
curl https://api.tuplets.ai/jobs/$job_id/download \
  -H "Authorization: Bearer $TUPLETS_API_KEY"
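
The polling step can also be sketched in Python. This is a minimal illustration, not an official client: poll_until_done and its parameters are hypothetical names, the backoff schedule is an arbitrary choice, and the terminal statuses are taken from the status values listed for GET /jobs/{job_id}. The fetch function is injected so the loop stays independent of any HTTP library.

```python
import time
from typing import Callable

# Terminal statuses as listed in the job status table (assumption: a cancelled
# job also ends up in one of these; adjust if your jobs report another value).
TERMINAL_STATUSES = {"completed", "failed"}


def poll_until_done(
    fetch_job: Callable[[], dict],
    interval: float = 2.0,
    max_interval: float = 30.0,
    timeout: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_job() until the job reaches a terminal status.

    fetch_job is any callable returning the parsed JSON body of
    GET /jobs/{job_id}. The delay grows by 1.5x per attempt up to max_interval.
    """
    deadline = time.monotonic() + timeout
    delay = interval
    while True:
        job = fetch_job()
        if job.get("status") in TERMINAL_STATUSES:
            return job
        if time.monotonic() + delay > deadline:
            raise TimeoutError("job did not finish before the timeout")
        sleep(delay)
        delay = min(delay * 1.5, max_interval)
```

Pass a small closure that performs the authenticated GET and returns the decoded JSON; the sleep parameter exists mainly so the loop can be exercised in tests without waiting.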

SDKs

Use our official client libraries to integrate Tuplets into your application.

All SDKs are open-source, type-safe, and cover the full Tuplets API — including job creation, polling, downloads, browser uploads, and solutions inquiries.

Python SDK

Install from PyPI and use the sync or async client.

pip install tuplets-ai

tuplets-ai on PyPI →

JavaScript / TypeScript SDK

Install from npm and use with Node.js or browser environments.

npm install @tupletsai/sdk

@tupletsai/sdk on npm →

Authentication

All API requests require authentication via an API key or a session cookie.

API Keys

Generate API keys from your account settings. Keys begin with tb_ and are 43 characters total.

Include the key in the Authorization header as a Bearer token:

Example request

curl -H "Authorization: Bearer tb_your_key_here" \
  https://api.tuplets.ai/jobs
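
If you build requests in Python, a small helper can catch obviously malformed keys before any network call. This is a sketch under the format stated above (tb_ prefix, 43 characters total); auth_headers is a hypothetical name, and the check says nothing about whether the key is active.

```python
def auth_headers(api_key: str) -> dict:
    """Return the Authorization header for a key that matches the documented shape."""
    # Shape check only: the documented prefix and total length.
    if not api_key.startswith("tb_") or len(api_key) != 43:
        raise ValueError("API key should start with 'tb_' and be 43 characters long")
    return {"Authorization": f"Bearer {api_key}"}
```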

Endpoint	Method	Description
/account/api-keys	POST	Create a new API key
/account/api-keys	GET	List active API keys
/account/api-keys/{key_id}/rotate	POST	Rotate (replace) an API key
/account/api-keys/{key_id}	DELETE	Revoke an API key

Example request

# Create a new API key
curl -X POST https://api.tuplets.ai/account/api-keys \
  -H "Authorization: Bearer $TUPLETS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"name": "My Integration Key"}'

Key Rotation & Security

Rotate keys regularly and revoke compromised keys immediately.

Keys are stored in a secure, non-reversible manner.
The raw key is shown only once at creation time.
Rotate keys via POST /account/api-keys/{key_id}/rotate.
Revoke keys via DELETE /account/api-keys/{key_id}.
Each key is scoped to a single account.

API Reference

Base URL: https://api.tuplets.ai — All endpoints return JSON.

Create a Transcription Job

Submit audio for processing. You must provide exactly one of: audio_file, remote_url, or uploaded_audio_key. Browser-direct upload is recommended for large or private files.

POST /jobs: Submit a new transcription job

Analytics requirements

Prefer analytics as a JSON string on multipart form posts. Example: {"profile":"full","domain":"insurance"}.
profile accepts basic (summary-oriented modules) or full (adds evidence-linked audit fields). Set none or off to omit a profile while still requesting custom analytics.
Optional domain selects a built-in custom pack: insurance, support, sales, healthcare, or financial_services. You may combine a profile with a domain pack.
Generic profile analytics and custom analytics each require at least 15 seconds of audio.
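
One way to produce the analytics form value is to validate the inputs and serialize them in one place. The sketch below assumes only the profile and domain values listed above; build_analytics_field is a hypothetical helper, and the custom schema field is deliberately left out because its structure is not described here.

```python
import json

PROFILES = {"basic", "full", "none", "off"}
DOMAINS = {"insurance", "support", "sales", "healthcare", "financial_services"}
MIN_AUDIO_SECONDS = 15  # documented minimum for generic and custom analytics


def build_analytics_field(profile=None, domain=None) -> str:
    """Return the JSON string to send as the `analytics` multipart field."""
    payload = {}
    if profile is not None:
        if profile not in PROFILES:
            raise ValueError(f"unknown profile: {profile!r}")
        payload["profile"] = profile
    if domain is not None:
        if domain not in DOMAINS:
            raise ValueError(f"unknown domain pack: {domain!r}")
        payload["domain"] = domain
    if not payload:
        raise ValueError("omit the analytics field entirely for no analytics")
    # Compact separators match the style of the documented example.
    return json.dumps(payload, separators=(",", ":"))


def long_enough_for_analytics(audio_seconds: float) -> bool:
    """Client-side check against the documented 15-second minimum."""
    return audio_seconds >= MIN_AUDIO_SECONDS
```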

Transcription model selection

transcription_model accepts standard or premium. The default is standard.
premium routes to the higher-accuracy premium transcription tier for harder audio and long-form recordings.
premium can be combined with diarization=true for speaker-attributed premium transcripts.
speaker_embeddings=true is optional on premium diarized jobs. It adds per-speaker voice embedding vectors to diarization.speaker_embeddings for cross-recording speaker matching (no extra charge).

Parameter	Type	Required	Description
audio_file	file	conditional	Audio file (MP3, WAV, M4A, FLAC, OGG, AAC, WMA, WebM, Opus). Max 1 GB.
remote_url	string	conditional	Public HTTP/HTTPS URL pointing directly to audio or video media on a public internet host. Private or localhost-style addresses are rejected, redirects are revalidated, and the same 1 GB limit applies.
uploaded_audio_key	string	conditional	Object key from a browser upload. Must be paired with uploaded_audio_token.
uploaded_audio_token	string	conditional	Upload token from the browser upload flow.
language	string	optional	Language code (ISO 639-1). Default: auto (automatic detection).
transcription_model	string	optional	Transcription tier: standard or premium. Default: standard.
diarization	boolean	optional	Enable speaker diarization (separate speakers). Default: false.
speaker_embeddings	boolean	optional	Include per-speaker voice embedding vectors in diarization output. Requires transcription_model=premium and diarization=true. Default: false.
pii_processing	boolean	optional	Enable transcript-level PII redaction for supported transcript languages only: English, Spanish, French, German, Portuguese, Italian, Dutch, and Polish (en, es, fr, de, pt, it, nl, pl). Unsupported languages skip PII and are not billed for it. Default: false.
analytics	string (JSON)	optional	Analytics request as JSON. Fields: profile (basic | full | none | off), domain (built-in pack key), schema (custom field definitions). Omit for no analytics.
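
The parameter rules in the table can be checked client-side before submitting. The following is a rough sketch, assuming the constraints exactly as stated above; validate_job_params is a hypothetical helper and the server remains the authority on what is accepted.

```python
SOURCE_FIELDS = ("audio_file", "remote_url", "uploaded_audio_key")


def validate_job_params(params: dict) -> list:
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []

    # Exactly one audio source must be provided.
    sources = [name for name in SOURCE_FIELDS if params.get(name)]
    if len(sources) != 1:
        problems.append("provide exactly one of audio_file, remote_url, uploaded_audio_key")

    # A browser-upload key must be paired with its token.
    if params.get("uploaded_audio_key") and not params.get("uploaded_audio_token"):
        problems.append("uploaded_audio_key requires uploaded_audio_token")

    model = params.get("transcription_model", "standard")
    if model not in ("standard", "premium"):
        problems.append("transcription_model must be standard or premium")

    # Speaker embeddings are only available on premium diarized jobs.
    if params.get("speaker_embeddings") and not (model == "premium" and params.get("diarization")):
        problems.append("speaker_embeddings requires transcription_model=premium and diarization=true")

    return problems
```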

Example request

# Upload file with standard transcription + all add-ons
curl -X POST https://api.tuplets.ai/jobs \
  -H "Authorization: Bearer $TUPLETS_API_KEY" \
  -F "audio_file=@call_recording.mp3" \
  -F "language=en" \
  -F "transcription_model=standard" \
  -F "diarization=true" \
  -F "pii_processing=true" \
  -F 'analytics={"profile":"full","domain":"insurance"}'

# Remote URL on a public host with premium transcription
curl -X POST https://api.tuplets.ai/jobs \
  -H "Authorization: Bearer $TUPLETS_API_KEY" \
  -F "remote_url=https://example.com/audio.mp3" \
  -F "transcription_model=premium" \
  -F "language=auto"

# Premium diarization with per-speaker embedding vectors
curl -X POST https://api.tuplets.ai/jobs \
  -H "Authorization: Bearer $TUPLETS_API_KEY" \
  -F "audio_file=@meeting.mp3" \
  -F "transcription_model=premium" \
  -F "diarization=true" \
  -F "speaker_embeddings=true"

Response

{
  "status": "accepted",
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "status_url": "https://api.tuplets.ai/jobs/550e8400-e29b-41d4-a716-446655440000",
  "cancel_url": "https://api.tuplets.ai/jobs/550e8400-e29b-41d4-a716-446655440000",
  "cancel_token": "<TOKEN>"
}

Submission response

The create endpoint returns the new id immediately, plus a status_url, cancel_url, and shared cancel_token. Your client can start polling the returned job URL right away.

The create endpoint does not yet accept client-provided request IDs or echo metadata back, so if you need retry-safe deduplication, keep your own idempotency layer on the client side.
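
A client-side idempotency layer can be as simple as fingerprinting each submission and remembering the job ID it produced. The sketch below is illustrative only: SubmissionLedger and submission_fingerprint are hypothetical names, the store is an in-memory dict (a real integration would persist it), and it only prevents resubmission after a recorded success. A request whose response was lost in transit still needs separate reconciliation.

```python
import hashlib
import json


def submission_fingerprint(audio_bytes: bytes, options: dict) -> str:
    """Hash the audio content together with the job options."""
    digest = hashlib.sha256()
    digest.update(audio_bytes)
    digest.update(b"\x00")  # separator between audio and options
    digest.update(json.dumps(options, sort_keys=True).encode("utf-8"))
    return digest.hexdigest()


class SubmissionLedger:
    """Remembers which fingerprints already have a job ID."""

    def __init__(self):
        self._jobs = {}

    def submit_once(self, audio_bytes: bytes, options: dict, submit):
        """Call submit(audio_bytes, options) only for unseen submissions.

        submit is any callable that performs POST /jobs and returns the job ID.
        """
        key = submission_fingerprint(audio_bytes, options)
        if key not in self._jobs:
            self._jobs[key] = submit(audio_bytes, options)
        return self._jobs[key]
```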

PII processing details

Detection runs once on the full transcript, then the same canonical spans are projected back to segment output for consistent redaction.
Structured identifiers such as email, SSN, IP address, payment card numbers, and phone-like values use deterministic validation before redaction.
Names and address-like phrases use contextual entity detection instead of regex alone. Generic organization and location names are excluded by default to reduce false positives and preserve downstream topic quality.
A final automated review pass checks the masked transcript for any surviving direct identifiers before the redacted result is returned.
PII currently runs only for supported transcript languages: English, Spanish, French, German, Portuguese, Italian, Dutch, and Polish (en, es, fr, de, pt, it, nl, pl). If auto-detect or an explicit language resolves outside that set, the PII pass is skipped and you are not charged for PII processing.
PII mode is coverage-first: it favors catching possible identifiers over avoiding false positives, so short person names and ambiguous address-like text may be redacted even when they are not actually PII.

Remote URL rules

Use a direct http:// or https:// link to publicly reachable media.
Hosts that resolve to private, loopback, link-local, or other non-public IP ranges are rejected.
Redirects are allowed only when every hop stays on a public internet host.
If the file is private or requires signed storage access, use POST /jobs/upload-target and submit uploaded_audio_key instead.
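
A quick client-side pre-check can reject the most obvious non-public URLs before you submit them. This sketch uses only the Python standard library and is not a reproduction of the server's validation: it inspects the scheme and literal IP hosts, does not resolve DNS, and does not follow redirects. precheck_remote_url is a hypothetical helper name.

```python
import ipaddress
from urllib.parse import urlsplit


def precheck_remote_url(url: str):
    """Return (ok, reason). A True result only means no obvious problem was found."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False, "scheme must be http or https"
    host = parts.hostname
    if not host:
        return False, "URL has no host"
    if host == "localhost" or host.endswith(".localhost"):
        return False, "localhost-style hosts are rejected"
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # A DNS name: the server resolves and validates it, this check cannot.
        return True, "hostname not checked locally"
    if not ip.is_global:
        return False, "IP address is not on the public internet"
    return True, "ok"
```

Private, signed, or internal media should go through the upload-target flow instead, as noted in the last rule.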

Get Job Status

Retrieve the current status and result of a transcription job. This endpoint returns the full nested result object once the job completes.

GET /jobs/{job_id}: Get a single job by ID

Example request

curl https://api.tuplets.ai/jobs/550e8400-e29b-41d4-a716-446655440000 \
  -H "Authorization: Bearer $TUPLETS_API_KEY"

Response

{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "completed",
  "result": {
    "text": "Welcome everyone. Today we are reviewing launch readiness and confirming final ownership.",
    "segments": [
      {
        "start": 0,
        "end": 6.2,
        "speaker": "SPEAKER_00",
        "text": "Welcome everyone. Today we are reviewing launch readiness."
      },
      {
        "start": 6.2,
        "end": 12.8,
        "speaker": "SPEAKER_01",
        "text": "Support still needs the updated help-center copy before release."
      }
    ],
    "language": "en",
    "audio_duration_seconds": 1842.4,
    "transcription_meta": {
      "coverage_ratio": 0.94,
      "total_covered_seconds": 1731.8
    },
    "diarization": {
      "speaker_count": 4,
      "confidence": "medium",
      "warning": "Speaker attribution may be less accurate around short backchannels and overlapping speech.",
      "speaker_embeddings": {
        "model": "pyannote/speaker-diarization-community-1",
        "dimension": 256,
        "vectors": {
          "SPEAKER_00": [
            0.12,
            -0.04,
            0.08
          ],
          "SPEAKER_01": [
            -0.03,
            0.11,
            0.05
          ]
        }
      }
    },
    "pii_findings": [
      {
        "type": "EMAIL_ADDRESS",
        "start": 214,
        "end": 230,
        "confidence": 0.85,
        "detector_count": 1
      }
    ],
    "analytics": {
      "profile": "full",
      "tier": "deep",
      "modules": [
        "summary",
        "sentiment",
        "key_topics",
        "action_items",
        "qa",
        "decisions",
        "risks",
        "chapters",
        "keywords",
        "statistics",
        "speaker_stats",
        "urgency"
      ],
      "summary": "A launch-readiness meeting covering remaining documentation work, support copy approval, and release ownership.",
      "sentiment": {
        "overall": "neutral",
        "explanation": "The discussion is procedural, with mild urgency around launch blockers and final handoff items.",
        "distribution": {
          "positive": {
            "count": 5,
            "percentage": 22.7,
            "avg_score": 0.71
          },
          "neutral": {
            "count": 14,
            "percentage": 63.6,
            "avg_score": 0.8
          },
          "negative": {
            "count": 3,
            "percentage": 13.7,
            "avg_score": 0.64
          }
        },
        "per_speaker": [
          {
            "speaker": "SPEAKER_00",
            "positive_count": 2,
            "neutral_count": 6,
            "negative_count": 0,
            "total_segments": 8,
            "avg_score": 0.77
          }
        ],
        "segments": [
          {
            "start": 0,
            "end": 6.2,
            "speaker": "SPEAKER_00",
            "text": "Welcome everyone. Today we are reviewing launch readiness.",
            "label": "neutral",
            "score": 0.79
          }
        ]
      },
      "key_topics": [
        {
          "topic": "launch readiness",
          "evidence": [
            {
              "start": 0,
              "end": 6.2,
              "speaker": "SPEAKER_00",
              "text": "Welcome everyone. Today we are reviewing launch readiness."
            }
          ]
        }
      ],
      "action_items": [
        {
          "action": "Update the help-center copy",
          "owner": "SPEAKER_01",
          "due_date": "Friday",
          "confidence": "high",
          "source_quote": "I'll send the revised support copy by Friday.",
          "evidence": [
            {
              "start": 142.4,
              "end": 148.9,
              "speaker": "SPEAKER_01",
              "text": "I'll send the revised support copy by Friday."
            }
          ]
        }
      ],
      "qa_pairs": [],
      "decisions": [
        {
          "decision": "Keep the staged release for next week",
          "evidence": [
            {
              "start": 312.7,
              "end": 319.4,
              "speaker": "SPEAKER_00",
              "text": "We'll keep the staged release for next week."
            }
          ]
        }
      ],
      "risks": [
        {
          "risk": "Support copy approval may slip the release checklist",
          "severity": "medium",
          "evidence": [
            {
              "start": 6.2,
              "end": 12.8,
              "speaker": "SPEAKER_01",
              "text": "Support still needs the updated help-center copy before release."
            }
          ]
        }
      ],
      "missed_opportunities": [],
      "chapters": [
        {
          "title": "Launch review",
          "start": 0,
          "end": 420,
          "summary": "The team reviews blockers, content updates, and release ownership."
        }
      ],
      "urgency": {
        "score": 5,
        "reasoning": "The release is on track, but one remaining approval item still needs to land."
      },
      "keywords": [
        "launch readiness",
        "support copy",
        "staged release",
        "handoff"
      ],
      "statistics": {
        "word_count": 2841,
        "segment_count": 221,
        "total_duration_seconds": 1731.8,
        "speaking_rate_wpm": 98.4
      },
      "speaker_stats": [
        {
          "speaker": "SPEAKER_00",
          "talk_time_seconds": 612.4,
          "word_count": 980,
          "segment_count": 77,
          "percentage": 35.4
        },
        {
          "speaker": "SPEAKER_01",
          "talk_time_seconds": 484.2,
          "word_count": 801,
          "segment_count": 61,
          "percentage": 28
        }
      ],
      "skipped_reason": null,
      "custom": {
        "schema_version": "2026-05-31",
        "domain": "insurance",
        "fields": {
          "call_reason": {
            "status": "matched",
            "selected": [
              {
                "id": 1,
                "code": "CLAIM_STATUS",
                "label": "Claim status",
                "confidence": 0.91,
                "evidence_segment_ids": [
                  4,
                  5
                ]
              }
            ],
            "reasoning": "The caller asked for an update on an existing claim."
          },
          "policy_line": {
            "status": "matched",
            "selected": [
              {
                "id": 1,
                "code": "AUTO",
                "label": "Auto",
                "confidence": 0.88,
                "evidence_segment_ids": [
                  2,
                  3
                ]
              }
            ],
            "reasoning": "The caller referenced their auto policy number."
          },
          "cancellation_risk": {
            "status": "matched",
            "value": 2,
            "confidence": 0.74,
            "evidence_segment_ids": [
              6
            ],
            "reasoning": "The caller expressed mild frustration but did not ask to cancel."
          }
        },
        "skipped_reason": null
      }
    },
    "feature_execution": {
      "transcription_requested": true,
      "transcription_applied": true,
      "transcription_model_requested": "standard",
      "transcription_model_applied": "standard",
      "transcription_elapsed_seconds": 12.4,
      "diarization_requested": true,
      "diarization_applied": true,
      "diarization_elapsed_seconds": 3.1,
      "analytics_requested": true,
      "analytics_profile_requested": "full",
      "analytics_applied": true,
      "analytics_elapsed_seconds": 9.9,
      "generic_analytics_requested": true,
      "generic_analytics_tier_requested": "deep",
      "generic_analytics_applied": true,
      "generic_analytics_tier_applied": "deep",
      "generic_analytics_elapsed_seconds": 7.8,
      "custom_analytics_requested": true,
      "custom_analytics_applied": true,
      "custom_analytics_elapsed_seconds": 2.1,
      "pii_processing_requested": true,
      "pii_processing_language": "en",
      "pii_processing_applied": true,
      "pii_processing_elapsed_seconds": 1.2
    },
    "processing_warnings": []
  },
  "error_message": null,
  "audio_duration_seconds": 120.5,
  "transcription_model": "standard",
  "diarization": true,
  "pii_processing": true,
  "analytics": {
    "profile": "full",
    "tier": "deep",
    "modules": [
      "summary",
      "sentiment",
      "key_topics",
      "action_items",
      "qa",
      "decisions",
      "risks",
      "chapters",
      "keywords",
      "statistics",
      "speaker_stats",
      "urgency"
    ],
    "custom_schema": {
      "domain": "insurance",
      "schema_version": "2026-05-31",
      "fields": []
    }
  },
  "estimated_cost_usd": 0.01,
  "billed_cost_usd": 0.01,
  "billing_status": "charged",
  "source_type": "upload",
  "result_download_available": true,
  "source_audio_available": true,
  "progress_percent": 100,
  "estimated_seconds_remaining": null,
  "cancel_token": null,
  "created_at": "2025-03-15T10:30:00Z",
  "started_at": "2025-03-15T10:30:02Z",
  "completed_at": "2025-03-15T10:31:45Z",
  "runtime_ms": 103000
}

Field	Type	Description
id	UUID	Unique job identifier
status	string	queued | running | completed | failed
result	object | null	Full transcript result for completed jobs. Use GET /jobs/{job_id}/download to fetch the same object as a file attachment.
error_message	string | null	Error detail if failed
audio_duration_seconds	number | null	Detected audio duration
transcription_model	string	Selected transcription tier: standard or premium
diarization	boolean	Whether speaker diarization was requested
speaker_embeddings	boolean	Whether per-speaker voice embedding vectors were requested (premium + diarization only)
pii_processing	boolean	Whether PII processing was requested
analytics	object | null	Normalized analytics request: profile, tier, modules, and custom_schema when analytics was requested.
estimated_cost_usd	number | null	Pre-run cost estimate
billed_cost_usd	number | null	Actual amount charged
billing_status	string | null	pending | charged | skipped | failed
source_type	string | null	upload | remote_url
result_download_available	boolean	Whether JSON download is currently available
source_audio_available	boolean	Whether the original source audio is still available on the backend
progress_percent	number | null	Best-effort progress snapshot for queued/running jobs
estimated_seconds_remaining	integer | null	Best-effort ETA for queued/running jobs
cancel_token	string | null	Token to cancel this job while it is queued or running
created_at	datetime	Job creation timestamp
started_at	datetime | null	When processing actually began
completed_at	datetime | null	Job completion timestamp
runtime_ms	number | null	Processing time in milliseconds

Result payload notes

GET /jobs/{job_id} returns the full nested result object after completion. GET /jobs returns job rows with a result text preview only (see List Jobs).
The nested result object is feature-dependent. Your client should treat optional sub-objects such as diarization, diarization.speaker_embeddings, pii_findings, feature_execution, processing_warnings, and analytics as conditional.
diarization.speaker_embeddings contains model, dimension, and vectors (map of speaker label to float array). Only present when the job was submitted with speaker_embeddings=true on premium diarized audio.
Phrase-level segments[] always include start, end, and text. speaker is present when diarization ran. Standard non-diarized jobs may also include decoder fields such as avg_logprob and no_speech_prob.
The premium model usually returns both words[] (word-level timestamps) and segments[] (phrase-level). With diarization=true, treat segments[] with speaker as the primary timeline. See the schema tables below.
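
Because most of the result is conditional, reading it defensively avoids KeyError surprises. The sketch below turns segments[] into printable lines and includes the speaker label only when it is present; format_transcript is a hypothetical helper, not part of any SDK.

```python
def format_transcript(result: dict) -> list:
    """Render phrase-level segments as text lines.

    start, end, and text are always present on segments; speaker is optional
    and only appears when diarization ran.
    """
    lines = []
    for segment in result.get("segments", []):
        prefix = f"[{segment['start']:.1f}-{segment['end']:.1f}]"
        speaker = segment.get("speaker")
        if speaker:
            prefix += f" {speaker}:"
        lines.append(f"{prefix} {segment['text']}")
    return lines
```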

List Jobs

Browse transcription jobs with optional status filtering.

GET /jobs: List jobs for the authenticated account

Query Parameter	Type	Required	Description
status	string	optional	Filter by status: queued, running, completed, or failed
limit	integer	optional	Max results (1–100). Default: 20.

Example request

# List recent completed jobs
curl "https://api.tuplets.ai/jobs?status=completed&limit=10" \
  -H "Authorization: Bearer $TUPLETS_API_KEY"

Response

{
  "items": [
    {
      "id": "550e8400-e29b-41d4-a716-446655440000",
      "status": "completed",
      "result": {
        "text": "Welcome everyone. Today we are reviewing launch readiness and confirming final ownership…"
      },
      "audio_duration_seconds": 120.5,
      "transcription_model": "standard",
      "diarization": true,
      "estimated_cost_usd": 0.012,
      "billed_cost_usd": 0.012,
      "created_at": "2025-03-15T10:30:00Z"
    }
  ],
  "total_items": 1,
  "status_filter": "completed"
}

Results are sorted newest-first. Use this endpoint to browse recent jobs, not to discover the ID of a newly submitted one. POST /jobs now returns the job ID directly. For completed jobs, result is a preview object with only text (truncated to about 1,000 characters). Use GET /jobs/{job_id} or GET /jobs/{job_id}/download for the full transcript JSON.
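
When building the list URL in code, it helps to enforce the documented query constraints up front. A minimal sketch, assuming the status values and the 1 to 100 limit range from the table above; list_jobs_url is a hypothetical helper.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.tuplets.ai"
JOB_STATUSES = {"queued", "running", "completed", "failed"}


def list_jobs_url(status=None, limit: int = 20) -> str:
    """Build the GET /jobs URL with validated query parameters."""
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    query = {}
    if status is not None:
        if status not in JOB_STATUSES:
            raise ValueError(f"unknown status filter: {status!r}")
        query["status"] = status
    query["limit"] = limit
    return f"{BASE_URL}/jobs?{urlencode(query)}"
```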

Cancel a Job

Cancel a queued or running job. Two methods: by ID (authenticated) or by cancel token (shared).

Cancellation billing policy

Tuplets does not add a separate cancellation fee. Cancelled jobs are billed under the standard pricing rules, including minimum charges, so cancellation should be treated as a request to stop further work rather than a guarantee that the job becomes free.

Cancel by Job ID

DELETE /jobs/{job_id}: Cancel a job

Example request

curl -X DELETE https://api.tuplets.ai/jobs/550e8400-e29b-41d4-a716-446655440000 \
  -H "Authorization: Bearer $TUPLETS_API_KEY"

Cancel by Token

POST /jobs/cancel: Cancel a job using a shared cancel token

Example request

curl -X POST https://api.tuplets.ai/jobs/cancel \
  -H "Authorization: Bearer $TUPLETS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"cancel_token": "<cancel_token_from_job_status>"}'

Download Transcript

Download the full transcript JSON for a completed job. Under the current backend defaults, results are typically retained for 7 days after completion and each job allows up to 20 attachment downloads.

GET /jobs/{job_id}/download: Download transcript JSON as an attachment

Example request

curl https://api.tuplets.ai/jobs/550e8400-e29b-41d4-a716-446655440000/download \
  -H "Authorization: Bearer $TUPLETS_API_KEY" \
  -o transcript.json

The downloaded JSON is the same nested object returned in result from GET /jobs/{job_id}. That means feature-specific fields are present only when the related feature ran.

Availability and abuse limits

Completed-job transcript JSON is usually available for about 7 days under the current retention setting.
Each job currently allows up to 20 attachment downloads. After that the API returns 429.
If the retained JSON has expired, the API returns 410.
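
A client can map these outcomes to explicit states instead of treating every non-success as a generic failure. This is a sketch based only on the status codes named above and the result_download_available field from the job status response; the function names and returned labels are hypothetical.

```python
def can_download(job: dict) -> bool:
    """Check the job status response before attempting a download."""
    return job.get("status") == "completed" and bool(job.get("result_download_available"))


def classify_download_response(status_code: int) -> str:
    """Translate the HTTP status of GET /jobs/{job_id}/download into a label."""
    if 200 <= status_code < 300:
        return "ok"
    if status_code == 410:
        return "expired"  # retained JSON is gone
    if status_code == 429:
        return "download_limit_reached"  # per-job download cap was hit
    return "unexpected"
```

Since the 429 described here reflects a per-job cap rather than a short-lived rate limit, a plain retry is unlikely to help; storing the first successful download on your side avoids the issue.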

Result Schema

The transcript JSON changes depending on which transcription model and optional features were requested.

Standard Model

Returns phrase-level segments. Can be combined with diarization, PII, and analytics.

Field	Type	Notes
text	string	Full transcript text. Redacted when PII processing ran.
segments[]	array	Phrase-level segments with start, end, and text.
segments[].speaker	string?	Speaker label (e.g. SPEAKER_00). Present when diarization=true.
segments[].avg_logprob	number?	Decoder confidence (standard, non-diarized). Omitted on diarized segments.
segments[].no_speech_prob	number?	No-speech probability (standard, non-diarized). Omitted on diarized segments.
segments[].language	string?	Per-segment language when detected (standard path).
language	string?	Detected or requested language code.
audio_duration_seconds	number?	Duration of the source audio.
transcription_meta	object?	Coverage metrics (coverage_ratio, total_covered_seconds) plus decode health fields when available.
diarization	object?	Speaker count, confidence (medium or low), and warning. Present when diarization=true.
diarization.speaker_embeddings	object?	Per-speaker embedding vectors (model, dimension, vectors). Present when speaker_embeddings=true on premium diarized jobs.
pii_findings	array?	Span-level PII findings. Present when pii_processing=true and the PII pass ran.
feature_execution	object?	Execution metadata for transcription tier, diarization, PII, generic analytics, and custom analytics timing.
processing_warnings	array?	Warnings when a feature was skipped or downgraded.
analytics	object?	Canonical analytics payload (profile, modules, and module outputs). Present when analytics was requested.

Premium Model

Returns word-level words[] timestamps when forced alignment succeeds, plus phrase-level segments[] grouped from those words. Can be combined with diarization=true; in that case segments[] includes speaker labels and is the primary timeline for playback and analytics. If alignment fails, you may receive transcript-only segments[] without words[].

Field	Type	Notes
text	string	Full transcript text. Redacted when PII processing ran.
words[]	array?	Word-level timestamps. Each item has start, end, word, and language. Omitted if alignment did not run.
segments[]	array	Phrase-level segments. Include speaker when diarization=true.
segments[].speaker	string?	Speaker label (e.g. SPEAKER_00). Present when diarization=true.
language	string?	Detected or requested language code.
audio_duration_seconds	number?	Duration of the source audio.
transcription_meta	object?	Coverage metrics plus decode health fields when available.
diarization	object?	Speaker count, confidence, and warning. Present when diarization=true.
diarization.speaker_embeddings	object?	Per-speaker embedding vectors. Present when speaker_embeddings=true.
diarization.speaker_embeddings.model	string	Embedding model identifier (e.g. pyannote/speaker-diarization-community-1).
diarization.speaker_embeddings.dimension	number	Length of each embedding vector.
diarization.speaker_embeddings.vectors	object	Map of speaker label (e.g. SPEAKER_00) to float array.
pii_findings	array?	Span-level PII findings. Present when pii_processing=true and the PII pass ran.
feature_execution	object?	Execution metadata for transcription tier, diarization, speaker embeddings, PII, and analytics timing.
feature_execution.speaker_embeddings_requested	boolean?	Whether speaker_embeddings was requested.
feature_execution.speaker_embeddings_applied	boolean?	Whether embedding vectors were included in the result.
processing_warnings	array?	Warnings when a feature was skipped or downgraded.
analytics	object?	Canonical analytics payload. Present when analytics was requested.
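
For cross-recording speaker matching, one common approach is to compare embedding vectors with cosine similarity. The documentation does not prescribe a comparison method, so treat everything below as an assumption: the metric, the 0.7 threshold (chosen arbitrarily for illustration), and the helper names are all hypothetical. Inputs are the vectors maps from two jobs' diarization.speaker_embeddings.

```python
import math


def cosine_similarity(a, b) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_speakers(vectors_a: dict, vectors_b: dict, threshold: float = 0.7) -> dict:
    """Map each speaker label in recording A to its best match in recording B.

    A label maps to None when no candidate reaches the threshold.
    """
    matches = {}
    for label_a, vec_a in vectors_a.items():
        best_label, best_score = None, threshold
        for label_b, vec_b in vectors_b.items():
            score = cosine_similarity(vec_a, vec_b)
            if score >= best_score:
                best_label, best_score = label_b, score
        matches[label_a] = best_label
    return matches
```

Speaker labels such as SPEAKER_00 are assigned per recording, so the same label in two jobs does not imply the same person; tune the threshold against your own audio before relying on it.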

Advanced and diagnostic fields

These fields are optional. Most integrations can ignore them; they are useful for debugging decode quality, routing, and partial feature failures.

Field	Type	Notes
decode_attempts[]	array?	Per-attempt decode profiles, params, decisions, and health metrics (standard path).
routing_metadata	object?	Internal routing codes (route_code, engine_code, align_code) when segments and duration are known.
transcription_meta.decode_attempts	number?	Count of decode attempts included in transcription_meta.
transcription_meta.issues	string[]?	Decode quality flags such as low_coverage or repetition_detected.
pii_processing_error	string?	Present when PII processing was requested but failed; findings may be empty.
pii_review_required	boolean?	True when PII output needs manual review after a processing error.
feature_execution.pii_processing_skip_reason	string?	e.g. unsupported_language when PII was skipped.
feature_execution.transcription_model_requested	string?	Requested transcription tier: standard or premium.
feature_execution.transcription_model_applied	string?	Transcription tier that actually ran. Present on completed jobs.
analytics.statistics.advisory_limit_exceeded	string?	Present when generic analytics output hit an advisory cap.
analytics.generic_skipped_reason	string?	Present when a generic profile was requested but the extraction path was skipped or downgraded.

Integration guidance

Model your client types so optional feature blocks are nullable or absent instead of required.
Do not assume PII ran just because pii_processing=true on the job. Check result.feature_execution.pii_processing_applied and result.processing_warnings for skip or degradation messages.
For analytics, read job.analytics on status responses and result.analytics on completed jobs. Use result.feature_execution.analytics_applied and result.feature_execution.custom_analytics_applied to confirm each path ran.
For downloaded JSON, read result.feature_execution.transcription_model_requested and result.feature_execution.transcription_model_applied to know which transcription tier produced the file. The job status field transcription_model mirrors the requested tier.
If you need a file download, GET /jobs/{job_id}/download returns the same JSON object as the nested result field.
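
The PII guidance above can be condensed into a single check on the result object. This sketch assumes that pii_processing_error and pii_review_required sit at the top level of result, as the diagnostic fields table suggests; pii_outcome and its return labels are hypothetical.

```python
def pii_outcome(result: dict) -> str:
    """Summarize whether PII redaction actually ran for a completed job."""
    execution = result.get("feature_execution") or {}
    if not execution.get("pii_processing_requested"):
        return "not_requested"
    # A processing error means the output may contain unredacted identifiers.
    if result.get("pii_processing_error") or result.get("pii_review_required"):
        return "needs_review"
    if execution.get("pii_processing_applied"):
        return "applied"
    reason = execution.get("pii_processing_skip_reason", "unknown")
    return f"skipped:{reason}"
```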

Browser Upload Flow

For large files (up to 1 GB), use the direct-to-storage upload flow to avoid proxying through your server.

The browser upload flow has three steps:

1. Request an upload target via POST /jobs/upload-target.
2. Upload the file directly to the returned URL using a PUT request.
3. Submit the job referencing uploaded_audio_key and uploaded_audio_token.

Step 1: Get an upload target

POST /jobs/upload-target: Get a signed upload URL

Example request

curl -X POST https://api.tuplets.ai/jobs/upload-target \
  -H "Authorization: Bearer $TUPLETS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"filename": "meeting.mp3", "content_type": "audio/mpeg"}'

Step 2: Upload the file directly

Example request

# Upload directly using the signed URL
curl -X PUT "{upload_url}" \
  -H "Content-Type: audio/mpeg" \
  --data-binary @meeting.mp3

Step 3: Submit the job

Example request

curl -X POST https://api.tuplets.ai/jobs \
  -H "Authorization: Bearer $TUPLETS_API_KEY" \
  -F "uploaded_audio_key={object_key}" \
  -F "uploaded_audio_token={upload_token}" \
  -F "language=en" \
  -F "diarization=true"
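
The three steps can be chained in client code. The Python sketch below only builds the two follow-up requests from a parsed upload-target response; it sends nothing over the network. The response keys upload_url, object_key, and upload_token are assumed from the placeholders in the curl examples above, and plan_browser_upload is a hypothetical helper.

```python
from typing import Any

API_BASE = "https://api.tuplets.ai"


def plan_browser_upload(target: dict[str, Any], content_type: str,
                        language: str = "en", diarization: bool = True) -> dict[str, Any]:
    """Describe steps 2 and 3 given the step 1 response (hypothetical helper)."""
    missing = [k for k in ("upload_url", "object_key", "upload_token") if not target.get(k)]
    if missing:
        raise ValueError(f"upload target response is missing: {', '.join(missing)}")
    return {
        # Step 2: PUT the raw bytes to the signed URL. No API key is attached,
        # matching the curl example above.
        "upload": {
            "method": "PUT",
            "url": target["upload_url"],
            "headers": {"Content-Type": content_type},
        },
        # Step 3: create the job, referencing the uploaded object.
        "submit": {
            "method": "POST",
            "url": f"{API_BASE}/jobs",
            "form": {
                "uploaded_audio_key": target["object_key"],
                "uploaded_audio_token": target["upload_token"],
                "language": language,
                "diarization": "true" if diarization else "false",
            },
        },
    }


plan = plan_browser_upload(
    {"upload_url": "https://storage.example.com/signed", "object_key": "k1", "upload_token": "t1"},
    content_type="audio/mpeg",
)
print(plan["upload"]["method"], plan["submit"]["form"]["uploaded_audio_key"])
```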

Analytics Output

Completed jobs return result.analytics when analytics was requested. This is the canonical analytics payload for generic profiles and custom analytics.

Submit analytics with the analytics form field. Profile basic maps to generic tier fast; profile full maps to tier deep. Custom domain packs and custom schemas are billed separately and can be combined with either profile.

Generic profile and custom analytics each require at least 15 seconds of audio. Evidence arrays on audit fields are hydrated deterministically from transcript segments after the model returns cited segment IDs.

Model hosting boundary

We do not send customer audio or transcripts to third-party LLM APIs for processing. Inference runs inside our controlled pipeline using self-hosted models.

Full profile output caps

Full profile (deep tier) generic modules are intentionally capped so the structured JSON stays complete on long recordings. Treat these as maximums, not guaranteed counts.

evidence is hydrated from at most 3 transcript segments per item.
qa_pairs: up to 12 items.
action_items: up to 12 items.
key_topics: up to 12 items.
risks: up to 12 items.
missed_opportunities: up to 12 items.
chapters: up to 12 items.
action_items[*].source_quote is derived from the first hydrated evidence snippet rather than copied verbatim from model output.
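
Because these are maximums, a client may want to know when an array has reached its cap. The Python sketch below flags such modules; reading "exactly at the cap" as "possibly truncated" is an assumption, and modules_at_cap is a hypothetical helper.

```python
from typing import Any

# Per-module caps for the full profile, copied from the list above.
FULL_PROFILE_CAPS = {
    "qa_pairs": 12,
    "action_items": 12,
    "key_topics": 12,
    "risks": 12,
    "missed_opportunities": 12,
    "chapters": 12,
}


def modules_at_cap(analytics: dict[str, Any]) -> list[str]:
    """Return module keys whose arrays have reached their documented cap."""
    return [
        key for key, cap in FULL_PROFILE_CAPS.items()
        if len(analytics.get(key) or []) >= cap
    ]


sample = {"action_items": [{}] * 12, "risks": [{}] * 3, "chapters": None}
print(modules_at_cap(sample))
```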

Basic Profile Example

Representative result.analytics for profile basic (tier fast).

{
  "profile": "basic",
  "tier": "fast",
  "modules": [
    "summary",
    "sentiment",
    "key_topics",
    "statistics",
    "speaker_stats",
    "chapters",
    "keywords"
  ],
  "summary": "A product review meeting covering launch readiness, one support blocker, and the agreed next steps.",
  "sentiment": {
    "overall": "neutral",
    "explanation": "The discussion is mostly procedural, with mild urgency around a remaining launch blocker.",
    "distribution": {
      "neutral": {
        "count": 10,
        "percentage": 71.4,
        "avg_score": 0.81
      },
      "positive": {
        "count": 4,
        "percentage": 28.6,
        "avg_score": 0.73
      }
    },
    "per_speaker": [
      {
        "speaker": "SPEAKER_00",
        "positive_count": 1,
        "neutral_count": 4,
        "negative_count": 0,
        "total_segments": 5,
        "avg_score": 0.79
      }
    ],
    "segments": [
      {
        "start": 0,
        "end": 14.2,
        "speaker": "SPEAKER_00",
        "text": "Let's review launch readiness and open blockers.",
        "label": "neutral",
        "score": 0.81
      }
    ]
  },
  "key_topics": [
    {
      "topic": "Launch readiness",
      "evidence": [
        {
          "start": 0,
          "end": 14.2,
          "speaker": "SPEAKER_00",
          "text": "Let's review launch readiness and open blockers."
        }
      ]
    },
    {
      "topic": "Support approval blocker",
      "evidence": [
        {
          "start": 48.1,
          "end": 56.4,
          "speaker": "SPEAKER_02",
          "text": "Support still needs to approve the migration warning copy."
        }
      ]
    }
  ],
  "action_items": [],
  "qa_pairs": [],
  "decisions": [],
  "risks": [],
  "missed_opportunities": [],
  "chapters": [
    {
      "title": "Open blockers",
      "start": 0,
      "end": 58,
      "summary": "The team reviewed support and rollout blockers."
    },
    {
      "title": "Next steps",
      "start": 58,
      "end": 108,
      "summary": "The team aligned on rollout timing and follow-up work."
    }
  ],
  "urgency": {
    "score": 1,
    "reasoning": ""
  },
  "keywords": [
    "launch readiness",
    "support approval blocker",
    "rollout checklist"
  ],
  "statistics": {
    "word_count": 842,
    "segment_count": 14,
    "total_duration_seconds": 108.6,
    "speaking_rate_wpm": 465.2
  },
  "speaker_stats": [
    {
      "speaker": "SPEAKER_00",
      "talk_time_seconds": 37.8,
      "word_count": 290,
      "segment_count": 5,
      "percentage": 34.8
    },
    {
      "speaker": "SPEAKER_01",
      "talk_time_seconds": 42.3,
      "word_count": 340,
      "segment_count": 5,
      "percentage": 39
    }
  ],
  "skipped_reason": null
}

Full Profile Example

Representative result.analytics for profile full (tier deep).

{
  "profile": "full",
  "tier": "deep",
  "modules": [
    "summary",
    "sentiment",
    "key_topics",
    "action_items",
    "qa",
    "decisions",
    "risks",
    "missed_opportunities",
    "urgency",
    "chapters",
    "keywords",
    "statistics",
    "speaker_stats"
  ],
  "summary": "A product review meeting covering launch readiness, support risks, and two follow-up actions.",
  "sentiment": {
    "overall": "neutral",
    "explanation": "The conversation was mostly procedural with one positive confirmation at the end.",
    "distribution": {
      "neutral": {
        "count": 10,
        "percentage": 71.4,
        "avg_score": 0.81
      },
      "positive": {
        "count": 4,
        "percentage": 28.6,
        "avg_score": 0.73
      }
    },
    "per_speaker": [
      {
        "speaker": "SPEAKER_00",
        "positive_count": 1,
        "neutral_count": 4,
        "negative_count": 0,
        "total_segments": 5,
        "avg_score": 0.79
      }
    ],
    "segments": [
      {
        "start": 0,
        "end": 14.2,
        "speaker": "SPEAKER_00",
        "text": "Let's review launch readiness and open blockers.",
        "label": "neutral",
        "score": 0.81
      }
    ]
  },
  "key_topics": [
    {
      "topic": "Launch readiness",
      "evidence": [
        {
          "start": 0,
          "end": 14.2,
          "speaker": "SPEAKER_00",
          "text": "Let's review launch readiness and open blockers."
        }
      ]
    }
  ],
  "action_items": [
    {
      "action": "Publish the revised rollout checklist",
      "owner": "SPEAKER_01",
      "due_date": "Friday",
      "confidence": "high",
      "source_quote": "I'll send the revised rollout checklist by Friday.",
      "evidence": [
        {
          "start": 92.5,
          "end": 100.3,
          "speaker": "SPEAKER_01",
          "text": "I'll send the revised rollout checklist by Friday."
        }
      ]
    }
  ],
  "qa_pairs": [
    {
      "question": "Are there any open support blockers?",
      "answer": "Only the migration warning copy still needs review.",
      "evidence": [
        {
          "start": 31.4,
          "end": 36.2,
          "speaker": "SPEAKER_00",
          "text": "Are there any open support blockers?"
        }
      ]
    }
  ],
  "decisions": [
    {
      "decision": "Proceed with the staged rollout next week",
      "evidence": [
        {
          "start": 66.8,
          "end": 73.1,
          "speaker": "SPEAKER_00",
          "text": "We'll keep the staged rollout for next week."
        }
      ]
    }
  ],
  "risks": [
    {
      "risk": "Migration warning copy may delay support approval",
      "severity": "high",
      "evidence": [
        {
          "start": 48.1,
          "end": 56.4,
          "speaker": "SPEAKER_02",
          "text": "Support still needs to approve the migration warning copy."
        }
      ]
    }
  ],
  "missed_opportunities": [
    {
      "opportunity": "No owner was assigned for the support approval follow-up",
      "type": "follow-up",
      "impact": "medium",
      "evidence": [
        {
          "start": 48.1,
          "end": 56.4,
          "speaker": "SPEAKER_02",
          "text": "Support still needs to approve the migration warning copy."
        }
      ]
    }
  ],
  "chapters": [
    {
      "title": "Open blockers",
      "start": 0,
      "end": 58,
      "summary": "The team reviewed support and rollout blockers."
    },
    {
      "title": "Next steps",
      "start": 58,
      "end": 108,
      "summary": "The team agreed on rollout timing and follow-up actions."
    }
  ],
  "urgency": {
    "score": 6,
    "reasoning": "A blocker remains open before next week's rollout."
  },
  "keywords": [
    "launch readiness",
    "support blockers",
    "rollout checklist"
  ],
  "statistics": {
    "word_count": 842,
    "segment_count": 14,
    "total_duration_seconds": 108.6,
    "speaking_rate_wpm": 465.2
  },
  "speaker_stats": [
    {
      "speaker": "SPEAKER_00",
      "talk_time_seconds": 37.8,
      "word_count": 290,
      "segment_count": 5,
      "percentage": 34.8
    },
    {
      "speaker": "SPEAKER_01",
      "talk_time_seconds": 42.3,
      "word_count": 340,
      "segment_count": 5,
      "percentage": 39
    }
  ],
  "skipped_reason": null
}
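
One way to consume this payload is to flatten action_items into task records for a tracker or CRM. The Python sketch below uses only fields shown in the example above; the output record shape and the helper name action_items_to_tasks are hypothetical.

```python
from typing import Any


def action_items_to_tasks(analytics: dict[str, Any]) -> list[dict[str, Any]]:
    """Flatten action_items into simple task records (hypothetical shape)."""
    tasks = []
    for item in analytics.get("action_items") or []:
        evidence = item.get("evidence") or []
        first = evidence[0] if evidence else {}
        tasks.append({
            "title": item.get("action"),
            "owner": item.get("owner"),
            # Free text in the example above ("Friday"), not a parsed date.
            "due": item.get("due_date"),
            "quote": item.get("source_quote"),
            # Start time of the first evidence snippet, in seconds.
            "at_seconds": first.get("start"),
        })
    return tasks


analytics = {
    "action_items": [{
        "action": "Publish the revised rollout checklist",
        "owner": "SPEAKER_01",
        "due_date": "Friday",
        "confidence": "high",
        "source_quote": "I'll send the revised rollout checklist by Friday.",
        "evidence": [{"start": 92.5, "end": 100.3, "speaker": "SPEAKER_01",
                      "text": "I'll send the revised rollout checklist by Friday."}],
    }]
}
tasks = action_items_to_tasks(analytics)
print(tasks[0]["owner"], tasks[0]["at_seconds"])
```

For a basic profile response, where action_items is an empty array, the function returns an empty list.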

Domain Pack Output Example

Representative result.analytics.custom for domain insurance (schema_version 2026-05-31).

{
  "schema_version": "2026-05-31",
  "domain": "insurance",
  "fields": {
    "call_reason": {
      "status": "matched",
      "selected": [
        {
          "id": 1,
          "code": "CLAIM_STATUS",
          "label": "Claim status",
          "confidence": 0.91,
          "evidence_segment_ids": [
            4,
            5
          ]
        }
      ],
      "reasoning": "The caller asked for an update on an existing claim."
    },
    "policy_line": {
      "status": "matched",
      "selected": [
        {
          "id": 1,
          "code": "AUTO",
          "label": "Auto",
          "confidence": 0.88,
          "evidence_segment_ids": [
            2,
            3
          ]
        },
        {
          "id": 2,
          "code": "HOME",
          "label": "Home",
          "confidence": 0.76,
          "evidence_segment_ids": [
            7
          ]
        }
      ],
      "reasoning": "The caller referenced both auto and home policies during the conversation."
    },
    "claim_stage": {
      "status": "no_match",
      "selected": [],
      "reasoning": "The transcript does not contain enough evidence to determine claim stage."
    },
    "cancellation_risk": {
      "status": "matched",
      "value": 2,
      "confidence": 0.74,
      "evidence_segment_ids": [
        6
      ],
      "reasoning": "The caller expressed mild frustration but did not ask to cancel."
    },
    "missed_retention_opportunity": {
      "status": "matched",
      "value": false,
      "confidence": 0.82,
      "evidence_segment_ids": [
        8,
        9
      ],
      "reasoning": "The agent addressed the caller's billing concern directly."
    },
    "compliance_disclosure_present": {
      "status": "matched",
      "value": true,
      "confidence": 0.95,
      "evidence_segment_ids": [
        0
      ],
      "reasoning": "The call opening included a recording disclosure."
    }
  },
  "skipped_reason": null
}

Field Reference

result.analytics contains generic profile fields, custom fields, or both depending on the analytics request.

Field	Type	Description
profile	string | null	Requested analytics profile: basic, full, or null for custom-only jobs.
tier	string | null	Resolved generic analytics tier: fast for profile basic, deep for profile full, or null for custom-only jobs.
modules	array	Module keys included in this response (summary, sentiment, key_topics, action_items, qa, decisions, risks, missed_opportunities, urgency, chapters, keywords, statistics, speaker_stats).
summary	string	Compact narrative summary of the recording.
sentiment	object	Overall label, explanation, distribution, per-speaker rollup, and segment timeline.
key_topics	array	Topics with evidence snippets using transcript timestamps and speaker labels. Maximum 12 items.
action_items	array	Full profile only. Tasks, owners, due dates, confidence, derived source quotes, and evidence. Basic profile returns an empty array.
qa_pairs	array	Full profile only. Question and answer pairs with supporting evidence. Basic profile returns an empty array.
decisions	array	Full profile only. Decisions made in the recording with evidence references. Basic profile returns an empty array.
risks	array	Full profile only. Risks or blockers with severity and evidence. Basic profile returns an empty array.
missed_opportunities	array	Full profile only. Follow-up, clarification, or unanswered openings surfaced from the transcript. Basic profile returns an empty array.
chapters	array	Topical time ranges with a title and short summary. Maximum 12 items.
urgency	object	Full profile only for substantive scoring. Basic profile returns the default object with score=1 and empty reasoning.
keywords	array	CPU-extracted keywords for quick indexing and filtering.
statistics	object	Word count, segment count, total duration, and speaking rate.
speaker_stats	array | null	Per-speaker talk time, word count, segment count, and share of speaking time when diarization is enabled.
skipped_reason	string | null	Present when a generic profile was requested but the full extraction path was skipped or downgraded.
custom	object?	Domain-pack or schema-driven custom fields when custom analytics ran.

Custom Analytics Request

Field	Type	Description
profile	basic | full | none | off	Optional generic analytics profile. Use none/off for custom-only extraction.
domain	string	Optional built-in domain pack. When set, the pack's fields are injected automatically; see Built-in Domain Packs below for keys, types, and option codes.
schema.schema_version	string	Optional caller-defined schema version. Defaults to 2026-05-31.
schema.fields	array	Optional custom field definitions. When domain is set, pack fields are injected automatically and caller-defined fields are appended. Duplicate keys are rejected at validation.
schema.fields[].key	string	Stable snake_case field key. Must start with a lowercase letter and contain lowercase letters, numbers, or underscores.
schema.fields[].label	string	Human-readable field label.
schema.fields[].type	single_select | multi_select | boolean | rating	Controls the validated output shape.
schema.fields[].description	string?	Optional extraction guidance for this field.
schema.fields[].required	boolean?	Marks the field as expected by the caller; unmatched required fields still return status no_match.
schema.fields[].options	array?	Required for select fields. Options include id, uppercase code, and label.
schema.fields[].min / max	integer?	Bounds for rating fields. Defaults are 1 and 5.
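
The validation rules in this table can be checked client-side before submitting. The Python sketch below builds a custom-only request and validates schema.fields against the key pattern, the options requirement, and the duplicate-key rule. The helper validate_fields and the sample fields are hypothetical, and serializing the object as JSON for the analytics form field is an assumption; the server's own validation may differ.

```python
import json
import re
from typing import Any

KEY_RE = re.compile(r"^[a-z][a-z0-9_]*$")
FIELD_TYPES = {"single_select", "multi_select", "boolean", "rating"}


def validate_fields(fields: list[dict[str, Any]]) -> list[str]:
    """Return a list of problems found in schema.fields (empty means OK)."""
    problems, seen = [], set()
    for f in fields:
        key = f.get("key", "")
        if not KEY_RE.match(key):
            problems.append(f"{key!r}: key must be lowercase snake_case")
        if key in seen:
            problems.append(f"{key!r}: duplicate key")
        seen.add(key)
        ftype = f.get("type")
        if ftype not in FIELD_TYPES:
            problems.append(f"{key!r}: unknown type {ftype!r}")
        elif ftype.endswith("_select"):
            options = f.get("options") or []
            if not options:
                problems.append(f"{key!r}: select fields need options")
            for opt in options:
                code = str(opt.get("code", ""))
                if not code or code != code.upper():
                    problems.append(f"{key!r}: option code {code!r} must be uppercase")
    return problems


analytics = {
    "profile": "none",  # custom-only extraction
    "schema": {
        "schema_version": "2026-05-31",
        "fields": [
            {"key": "refund_promised", "label": "Refund promised", "type": "boolean"},
            {"key": "agent_tone", "label": "Agent tone", "type": "single_select",
             "options": [{"id": 1, "code": "CALM", "label": "Calm"},
                         {"id": 2, "code": "TENSE", "label": "Tense"}]},
        ],
    },
}
print(validate_fields(analytics["schema"]["fields"]))
analytics_form_value = json.dumps(analytics)  # assumed encoding of the form field
```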

Custom Analytics Output

Field	Type	Description
custom.schema_version	string | null	Schema version used for extraction.
custom.domain	string | null	Domain pack key when one was requested.
custom.fields	object	Map of custom field key to normalized field result.
custom.fields.*.status	matched | no_match | skipped	Whether the field matched transcript evidence, had no supported answer, or was skipped.
custom.fields.*.selected	array	Selected options for single_select and multi_select fields, each with id, code, label, confidence, and evidence_segment_ids.
custom.fields.*.value	boolean | number | null	Boolean or rating value for boolean and rating fields.
custom.fields.*.confidence	number?	Model confidence for boolean/rating fields when matched.
custom.fields.*.evidence_segment_ids	number[]	Transcript segment indexes supporting the classification. Maximum 8 per field.
custom.fields.*.reasoning	string	Short explanation for the selected value or no_match result.
custom.skipped_reason	string | null	Present when custom analytics could not run or generation failed.
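
Since select fields report their answer in selected while boolean and rating fields use value, a client needs to branch on the shape. The Python sketch below reduces each field to a plain value, using a trimmed copy of the insurance example above. The helper names are hypothetical, and returning None for anything other than matched is a design choice, not documented behavior.

```python
from typing import Any


def read_custom_field(field: dict[str, Any]) -> Any:
    """Return a plain value for one custom.fields entry, or None if unmatched."""
    if field.get("status") != "matched":
        return None  # no_match and skipped carry no usable answer
    if "selected" in field:
        # single_select and multi_select: return the option codes.
        return [opt["code"] for opt in field["selected"]]
    # boolean and rating fields: the answer is in "value".
    return field.get("value")


def read_custom(custom: dict[str, Any]) -> dict[str, Any]:
    """Map every custom field key to its plain value."""
    return {key: read_custom_field(f) for key, f in (custom.get("fields") or {}).items()}


custom = {
    "schema_version": "2026-05-31",
    "domain": "insurance",
    "fields": {
        "policy_line": {"status": "matched", "selected": [
            {"id": 1, "code": "AUTO", "label": "Auto", "confidence": 0.88, "evidence_segment_ids": [2, 3]},
            {"id": 2, "code": "HOME", "label": "Home", "confidence": 0.76, "evidence_segment_ids": [7]},
        ]},
        "claim_stage": {"status": "no_match", "selected": []},
        "cancellation_risk": {"status": "matched", "value": 2, "confidence": 0.74},
        "missed_retention_opportunity": {"status": "matched", "value": False, "confidence": 0.82},
    },
    "skipped_reason": None,
}
flat = read_custom(custom)
print(flat)
```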

Built-in Domain Packs

Stable field catalogs for the five built-in domains. Requesting a domain alone injects all of that pack's fields, so schema.fields is optional. Caller-defined fields are appended after pack fields; duplicate keys are rejected at validation. Every field key listed for the requested pack appears in custom.fields on completed jobs; option codes are stable across API versions.

Supported domain keys: insurance, support, sales, healthcare, financial_services.

Insurance (insurance)

Claims, policy lines, retention signals, and compliance disclosures for insurance call centers.

Key	Label	Type	Required	Options / bounds	Description
call_reason	Reason for call	single_select	yes	CLAIM_STATUS, NEW_POLICY, POLICY_CHANGE, BILLING, COVERAGE_QUESTION, CANCELLATION, OTHER	—
policy_line	Policy line	multi_select	no	AUTO, HOME, RENTERS, LIFE, HEALTH, COMMERCIAL, UNKNOWN	—
claim_stage	Claim stage	single_select	no	FIRST_NOTICE, DOCUMENT_COLLECTION, ADJUSTER_REVIEW, SETTLEMENT, DENIAL_APPEAL, NOT_A_CLAIM	—
cancellation_risk	Cancellation risk	rating	no	1–5	1 means no cancellation signal; 5 means explicit cancellation intent.
missed_retention_opportunity	Missed retention opportunity	boolean	no	—	True when the agent could have addressed cancellation, price, or coverage concern but did not.
compliance_disclosure_present	Compliance disclosure present	boolean	no	—	True when required disclosure, recording notice, or authorization language is present.

Support (support)

Contact reasons, resolution outcomes, escalation signals, and churn risk for customer support interactions.

Key	Label	Type	Required	Options / bounds	Description
contact_reason	Contact reason	single_select	yes	ACCOUNT_ACCESS, BILLING, TECHNICAL_ISSUE, ORDER_STATUS, REFUND, CANCELLATION, OTHER	—
resolution_status	Resolution status	single_select	no	RESOLVED, PARTIALLY_RESOLVED, ESCALATED, CALLBACK_REQUIRED, UNRESOLVED	—
escalation_required	Escalation required	boolean	no	—	—
churn_risk	Churn risk	rating	no	1–5	—
refund_requested	Refund requested	boolean	no	—	—

Sales (sales)

Buying intent, pipeline stage, objections, and upsell signals for sales conversations.

Key	Label	Type	Required	Options / bounds	Description
buying_intent	Buying intent	rating	no	1–5	1 means no interest; 5 means explicit intent to buy or advance.
lead_stage	Lead stage	single_select	no	UNQUALIFIED, QUALIFYING, QUALIFIED, PROPOSAL, NEGOTIATION, CLOSED_WON_SIGNAL, CLOSED_LOST_SIGNAL	—
objection_type	Objection type	multi_select	no	PRICE, TIMING, AUTHORITY, NEED, COMPETITOR, IMPLEMENTATION, NONE	—
competitor_mentioned	Competitor mentioned	boolean	no	—	—
next_step_committed	Next step committed	boolean	no	—	—
missed_upsell_opportunity	Missed upsell opportunity	boolean	no	—	—

Healthcare (healthcare)

Patient intent, escalation needs, referral issues, and consent language for healthcare front-desk calls.

Key	Label	Type	Required	Options / bounds	Description
patient_intent	Patient intent	single_select	no	SCHEDULE_APPOINTMENT, RESCHEDULE, PRESCRIPTION, TEST_RESULTS, BILLING_INSURANCE, SYMPTOM_TRIAGE, OTHER	—
escalation_required	Clinical or administrative escalation required	boolean	no	—	—
referral_issue	Referral issue	boolean	no	—	—
consent_language_present	Consent language present	boolean	no	—	—

Financial Services (financial_services)

Service reasons, fraud signals, complaint risk, and regulatory language for banking and fintech calls.

Key	Label	Type	Required	Options / bounds	Description
service_reason	Service reason	single_select	no	ACCOUNT_ACCESS, TRANSACTION_DISPUTE, FRAUD_CONCERN, KYC_DOCUMENTS, LOAN_OR_CREDIT, COMPLAINT, OTHER	—
fraud_signal	Fraud signal	boolean	no	—	—
complaint_risk	Complaint risk	rating	no	1–5	—
regulatory_language_present	Regulatory language present	boolean	no	—	—

Common Use Cases

How teams leverage structured analytics across different scenarios.

Sales Calls

Extract next steps, objections, competitor mentions, and buyer sentiment. Auto-generate CRM notes and follow-up tasks from action_items and risks.

Customer Support

Surface escalations, sentiment drops, and unresolved issues via qa_pairs and sentiment.segments. Identify frustrated customers before they churn.

Internal Meetings

Track decisions, action items, and topic coverage across recurring meetings. Use speaker_stats and sentiment.per_speaker to monitor participation balance.

Recruiting Interviews

Evaluate candidate responses, extract Q&A patterns, and generate structured interview summaries with qa_pairs, summary, and key_topics.

Audio Formats & Constraints

Supported input formats and file constraints for transcription jobs.

Format	Extension	Notes
MP3	.mp3	Most common. Good balance of quality and size.
WAV	.wav	Uncompressed. Best quality but largest size.
FLAC	.flac	Lossless compression. High quality.
OGG	.ogg	Open format. Good for web use.
M4A	.m4a	Common in Apple ecosystem.
AAC	.aac	Advanced Audio Codec.
WMA	.wma	Windows Media Audio.
WebM	.webm	Web-optimized format.
Opus	.opus	Low latency, high compression.

Constraints

Minimum audio duration: 5 seconds
Maximum audio duration: 8 hours (WAV files are further limited by the 4 GiB RIFF container)
Maximum file size (all submission methods — direct upload, browser upload, and remote URL): 1 GB
Generic profile analytics and custom analytics each require at least 15 seconds of audio
At most one generic profile (basic or full) per job; custom domain packs can be added on top
Remote URLs must be publicly accessible HTTP/HTTPS endpoints pointing directly to media on a public internet host; private and localhost-style targets are rejected, including redirect targets
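
These constraints can be checked before uploading to avoid a rejected request. The Python sketch below is a minimal client-side preflight; it assumes "1 GB" means 10**9 bytes and that the duration is already known to the caller, and it checks the file extension only, which may not match how the server identifies formats. The helper name preflight is hypothetical.

```python
from pathlib import PurePath

SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".flac", ".ogg", ".m4a", ".aac", ".wma", ".webm", ".opus"}
MIN_SECONDS = 5
MAX_SECONDS = 8 * 3600
ANALYTICS_MIN_SECONDS = 15
MAX_BYTES = 1_000_000_000  # assumption: "1 GB" read as 10**9 bytes


def preflight(filename: str, size_bytes: int, duration_seconds: float,
              wants_analytics: bool = False) -> list[str]:
    """Return the documented constraints this file would violate."""
    problems = []
    if PurePath(filename).suffix.lower() not in SUPPORTED_EXTENSIONS:
        problems.append("unsupported file extension")
    if size_bytes > MAX_BYTES:
        problems.append("file larger than 1 GB")
    if duration_seconds < MIN_SECONDS:
        problems.append("audio shorter than 5 seconds")
    elif duration_seconds > MAX_SECONDS:
        problems.append("audio longer than 8 hours")
    if wants_analytics and duration_seconds < ANALYTICS_MIN_SECONDS:
        problems.append("analytics needs at least 15 seconds of audio")
    return problems


print(preflight("call.MP3", 5_000_000, 12.0, wants_analytics=True))
```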

Supported Languages

Standard transcription supports 96 languages. Premium transcription supports Auto-detect plus 30 languages.

Specify the language with its ISO 639-1 two-letter code where one exists (e.g., en for English, es for Spanish, ja for Japanese); a few languages in the lists below use three-letter codes (haw, fil, yue). Use auto or omit the parameter for automatic language detection.

Premium uses a narrower language set today: ar, cs, da, de, el, en, es, fa, fi, fil, fr, hi, hu, id, it, ja, ko, mk, ms, nl, pl, pt, ro, ru, sv, th, tr, vi, yue, zh. If you need a language outside that list, use standard.

Standard-tier language codes:

af, am, ar, as, az, ba, be, bg, bn, bo, br, bs, ca, cs, cy, da, de, el, en, es, et, eu, fa, fi, fo, fr, gl, gu, ha, haw, hi, hr, hu, hy, id, is, it, ja, jw, ka, kk, km, kn, ko, la, lb, ln, lo, lt, lv, mg, mi, mk, ml, mn, mr, ms, mt, my, ne, nl, nn, no, oc, pa, pl, ps, pt, ro, ru, sa, sd, si, sk, sl, sn, so, sr, su, sv, sw, ta, te, tg, th, tk, tl, tr, tt, uk, ur, uz, vi, yi, yo, zh
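
A client can pick the tier from the two lists above. The Python sketch below is one possible selection rule, with a hypothetical helper pick_tier; note that, going by the lists as printed, fil and yue appear only in the premium set.

```python
from typing import Optional

PREMIUM = frozenset(
    "ar cs da de el en es fa fi fil fr hi hu id it ja ko mk ms nl "
    "pl pt ro ru sv th tr vi yue zh".split()
)
STANDARD = frozenset(
    "af am ar as az ba be bg bn bo br bs ca cs cy da de el en es et eu fa fi fo "
    "fr gl gu ha haw hi hr hu hy id is it ja jw ka kk km kn ko la lb ln lo lt lv "
    "mg mi mk ml mn mr ms mt my ne nl nn no oc pa pl ps pt ro ru sa sd si sk sl "
    "sn so sr su sv sw ta te tg th tk tl tr tt uk ur uz vi yi yo zh".split()
)


def pick_tier(language: Optional[str], prefer: str = "premium") -> str:
    """Return 'premium' or 'standard' for a language code, honoring prefer when possible."""
    if language is None or language == "auto":
        return prefer  # both tiers accept automatic detection
    tiers = [name for name, codes in (("premium", PREMIUM), ("standard", STANDARD))
             if language in codes]
    if not tiers:
        raise ValueError(f"language {language!r} is not in either list")
    return prefer if prefer in tiers else tiers[0]


print(pick_tier("en"), pick_tier("uk"), pick_tier("yue", prefer="standard"))
```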

Pricing

You are charged per second of audio processed, with a minimum charge based on the selected feature bundle. New accounts receive $2.50 in free credits to get started.

Rules that affect what you pay

Premium transcription is a standalone tier billed at its published hourly rate and carries a minimum charge of $0.035 per job.
Generic profile and custom analytics each require at least 15 seconds of audio.
Basic analytics minimum charge: $0.08. Full analytics minimum: $0.125. Custom analytics minimum: $0.15 (stackable with a profile).
Premium supports Auto-detect plus these languages: ar, cs, da, de, el, en, es, fa, fi, fil, fr, hi, hu, id, it, ja, ko, mk, ms, nl, pl, pt, ro, ru, sv, th, tr, vi, yue, zh.
At most one generic profile per job; custom analytics can be added to any profile.
Cancellation does not add a separate fee, and cancelled jobs are billed under the normal pricing rules.

Feature	Rate	Minimum charge	Required
Standard transcription	`$0.14 / hour`	`$0.01`	Always included
Premium transcription	`$0.25 / hour`	`$0.035`	Always included
Speaker diarization	`$0.08 / hour`	`$0.035`	Optional
PII processing	`$0.06 / hour`	`$0.035`	Optional
Basic analytics	`$0.24 / hour`	`$0.08`	Optional
Full analytics	`$0.60 / hour`	`$0.125`	Optional
Custom analytics	`$0.45 / hour`	`$0.15`	Optional

Bundle	Minimum billable charge
Transcript only	$0.01
Premium transcription only	$0.035
Transcript + diarization	$0.035
Transcript + PII	$0.035
Premium transcription + PII	$0.08
Transcript + basic analytics	$0.08
Transcript + full analytics	$0.125
Transcript + custom analytics	$0.15
Transcript + full analytics + custom	$0.15
Transcript + diarization + PII	$0.065
Any bundle that includes basic analytics	$0.08
Any bundle that includes full analytics	$0.125
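
For budgeting, a rough estimate can be computed from the two tables above. The Python sketch below assumes that per-feature hourly rates add up and are prorated per second, and that the result is floored at the bundle minimum; neither the summation nor any rounding rule is stated here, so treat the output as an estimate only. The bundle minimum is passed in from the table rather than derived, and the feature labels and the helper estimate_cost are hypothetical.

```python
# Hourly rates in USD, copied from the feature table above.
HOURLY_RATES = {
    "standard": 0.14,
    "premium": 0.25,
    "diarization": 0.08,
    "pii": 0.06,
    "basic_analytics": 0.24,
    "full_analytics": 0.60,
    "custom_analytics": 0.45,
}


def estimate_cost(duration_seconds: float, features: list[str], bundle_minimum: float) -> float:
    """Estimate a job's charge in USD under the assumptions stated above."""
    hourly = sum(HOURLY_RATES[name] for name in features)  # assumed additive
    metered = hourly * duration_seconds / 3600              # prorated per second
    return round(max(metered, bundle_minimum), 4)


# One hour of standard transcription with diarization.
print(estimate_cost(3600, ["standard", "diarization"], bundle_minimum=0.035))
# A 20-second clip with full analytics falls back to the bundle minimum.
print(estimate_cost(20, ["standard", "full_analytics"], bundle_minimum=0.125))
```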

Rate Limits

Rate limits protect the platform from abuse. Limits apply per IP address and per user account.

Scope	Limit	Window
Global (IP)	300 requests	5 minutes
Job creation (IP)	300 requests	1 hour
Job creation (user)	120 requests	1 hour
Job creation burst (user)	20 requests	5 minutes
Download (IP)	120 requests	1 hour
Download (user)	40 requests	15 minutes

When rate limited

The API returns 429 Too Many Requests. Implement exponential backoff in your integration before retrying.
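
A minimal Python sketch of that retry pattern follows, using capped exponential delays with jitter. The send callable stands in for whatever HTTP client you use and is assumed to return a (status, body) pair; the base delay, cap, and attempt count are illustrative values, not published limits.

```python
import random
import time
from typing import Callable, Iterator, Tuple


def backoff_delays(base: float = 1.0, cap: float = 60.0, attempts: int = 6) -> Iterator[float]:
    """Yield capped exponential delays: base, 2*base, 4*base, ... up to cap."""
    for attempt in range(attempts):
        yield min(cap, base * (2 ** attempt))


def send_with_retry(send: Callable[[], Tuple[int, dict]],
                    sleep: Callable[[float], None] = time.sleep,
                    jitter: Callable[[], float] = random.random) -> Tuple[int, dict]:
    """Call send() until it stops returning 429 or the retries run out."""
    status, body = send()
    for delay in backoff_delays():
        if status != 429:
            break
        # Scale each delay by a random factor so clients do not retry in lockstep.
        sleep(delay * jitter())
        status, body = send()
    return status, body


# Demo with a fake transport and a recording sleep, so nothing actually waits.
responses = iter([(429, {}), (429, {}), (200, {"id": "j1"})])
slept = []
status, body = send_with_retry(lambda: next(responses), sleep=slept.append, jitter=lambda: 1.0)
print(status, body, slept)
```

Injecting sleep and jitter keeps the function testable; in production the defaults apply.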

Error Codes

Every error response includes a detail field explaining what went wrong and how to fix it.

Code	Label	Detail	Fix
`400`	Bad Request	Invalid input, unsafe remote URL, or missing required field.	Check your request parameters and ensure remote URLs use a public HTTP/HTTPS host.
`401`	Unauthorized	Missing or invalid API key / session.	Provide a valid Authorization: Bearer header.
`402`	Insufficient Credit	Account balance is too low for the estimated cost.	Top up your account and retry.
`403`	Forbidden	Email not verified or action not allowed.	Verify your email address first.
`404`	Not Found	The requested resource does not exist.	Check the job ID or resource path.
`409`	Conflict	Concurrent job limit reached or resource unavailable.	Wait for a running job to finish, then retry.
`410`	Gone	Result download is no longer available.	Download results within the published retention window.
`429`	Rate Limited	Too many requests in the current window.	Back off and retry with exponential delay.
`503`	Service Unavailable	Processing capacity temporarily unavailable.	Retry after a short delay.

Error responses follow this format:

{
  "detail": "You do not have enough credit for this job."
}
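
A client can turn the status code and detail field into a next step. The Python sketch below maps the codes from the table above to coarse actions; the action labels and the helper classify_error are hypothetical, and the fallback for non-JSON bodies is a defensive assumption.

```python
import json
from typing import Any

ACTIONS = {
    401: "check_api_key",
    402: "top_up_credit",
    403: "verify_email_or_permissions",
    404: "check_job_id",
    409: "wait_for_running_job",
    410: "result_expired",
    429: "retry_with_backoff",
    503: "retry_with_backoff",
}


def classify_error(status: int, body_text: str) -> dict[str, Any]:
    """Extract the detail message and suggest a next step for an error response."""
    try:
        detail = json.loads(body_text).get("detail", "")
    except (ValueError, AttributeError):
        detail = body_text  # body was not a JSON object
    # Anything unlisted (including 400) is treated as a request to fix.
    return {"status": status, "detail": detail, "action": ACTIONS.get(status, "fix_request")}


print(classify_error(402, '{"detail": "You do not have enough credit for this job."}'))
```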

Best Practices

Recommendations for building a reliable integration.

Poll with backoff

Start by polling job status every 2–5 seconds, then lengthen the interval with exponential backoff for longer-running jobs. Most jobs with under 1 hour of audio complete within 1–5 minutes.
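
A minimal Python sketch of such a polling loop follows. The fetch_status callable stands in for a GET on the job's status_url; the terminal status names (completed, failed, cancelled), the growth factor, and the 30-second cap are assumptions for illustration.

```python
import time
from typing import Callable, Iterator

TERMINAL = ("completed", "failed", "cancelled")  # assumed status names


def poll_delays(start: float = 2.0, factor: float = 1.5, cap: float = 30.0) -> Iterator[float]:
    """Yield 2s, 3s, 4.5s, ... growing by factor and capped at cap seconds."""
    delay = start
    while True:
        yield delay
        delay = min(cap, delay * factor)


def wait_for_job(fetch_status: Callable[[], dict],
                 sleep: Callable[[float], None] = time.sleep,
                 timeout: float = 3600.0) -> dict:
    """Poll until the job reaches a terminal status or the timeout is exceeded."""
    waited = 0.0
    for delay in poll_delays():
        job = fetch_status()
        if job.get("status") in TERMINAL:
            return job
        if waited + delay > timeout:
            raise TimeoutError(f"job still {job.get('status')!r} after {waited:.0f}s")
        sleep(delay)
        waited += delay


# Demo with canned responses and a recording sleep, so nothing actually waits.
states = iter([{"status": "queued"}, {"status": "running"}, {"status": "completed", "id": "j1"}])
naps = []
done = wait_for_job(lambda: next(states), sleep=naps.append)
print(done["status"], naps)
```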

Store cancel tokens

Save the cancel_token returned by POST /jobs or job status responses if you need to implement user-facing cancellation for queued or running jobs. The token must still be sent as part of an authenticated account request. There is no separate cancellation fee, and cancelled jobs are billed under the normal pricing rules.

Use browser uploads for large files

Files over 25 MB should use the browser upload flow. This avoids double-proxying through your server and is significantly faster.

Specify language for accuracy

Automatic detection is generally accurate, but specifying the language code when you know it tends to give better results, especially for code-switched or accented speech.

Download results promptly

Transcript JSON downloads are typically available for about 7 days after job completion under the current retention setting, and each job currently has a hard cap of 20 attachment downloads. Download and archive results you need to keep.

Handle rate limits gracefully

Check for 429 responses and implement retry logic with exponential backoff. Avoid assuming a Retry-After header is present.