Speaker identification, 100+ languages, word-level timestamps. Perfect for depositions, hearings, and interviews.
Endpoint
POST /voice/transcription
Upload your audio to a vault, then transcribe with automatic result storage. The transcript is saved back to your vault when complete.
# 1. Get upload URL
curl -X POST https://api.case.dev/vault/VAULT_ID/upload \
  -H "Authorization: Bearer sk_case_YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"filename": "deposition.mp3", "contentType": "audio/mpeg", "auto_index": false}'

# 2. Upload file to the returned uploadUrl (PUT request)

# 3. Start transcription
curl -X POST https://api.case.dev/voice/transcription \
  -H "Authorization: Bearer sk_case_YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "vault_id": "VAULT_ID",
    "object_id": "OBJECT_ID",
    "speaker_labels": true
  }'
Response
{
  "id": "tr_rvy731o5zxur0dg72sh3mjar",
  "status": "processing",
  "vault_id": "vault_abc123",
  "source_object_id": "obj_xyz789"
}
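
The start-transcription call in step 3 can also be issued from code. The following is a minimal Python sketch using only the standard library. The helper name `build_transcription_request` is hypothetical, and the function only constructs the request so that it can be inspected before anything is sent.

```python
import json
import urllib.request

API_BASE = "https://api.case.dev"  # base URL taken from the curl examples above


def build_transcription_request(api_key, vault_id, object_id, speaker_labels=True):
    """Build (but do not send) the POST /voice/transcription request for Vault Mode."""
    payload = {
        "vault_id": vault_id,
        "object_id": object_id,
        "speaker_labels": speaker_labels,
    }
    return urllib.request.Request(
        f"{API_BASE}/voice/transcription",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_transcription_request("sk_case_YOUR_API_KEY", "VAULT_ID", "OBJECT_ID")
print(req.get_method(), req.full_url)
# To send it: urllib.request.urlopen(req), then json.load() the response body.
```

Separating construction from sending keeps the payload easy to log or unit-test; the response should carry the job `id` used for polling.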

Get Results (Vault Mode)

curl https://api.case.dev/voice/transcription/tr_rvy731o5zxur0dg72sh3mjar \
  -H "Authorization: Bearer sk_case_YOUR_API_KEY"
Response (completed)
{
  "id": "tr_rvy731o5zxur0dg72sh3mjar",
  "status": "completed",
  "vault_id": "vault_abc123",
  "source_object_id": "obj_xyz789",
  "result_object_id": "obj_abc456",
  "audio_duration": 238,
  "word_count": 594,
  "confidence": 97
}
Vault Mode Benefits:
  • Transcript automatically saved to your vault
  • No webhook setup required
  • Simpler polling with result_object_id
  • Audio stored securely in your vault

Direct URL Mode

For audio hosted elsewhere, provide a public URL directly.
curl -X POST https://api.case.dev/voice/transcription \
  -H "Authorization: Bearer sk_case_YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "audio_url": "https://storage.example.com/deposition.m4a",
    "speaker_labels": true,
    "auto_chapters": true
  }'
Response
{
  "id": "474e21cf-fd65-45d4-97fd-87558f7caf9b",
  "status": "queued",
  "audio_url": "https://storage.example.com/deposition.m4a",
  "created_at": "2025-11-04T09:15:30Z"
}

Parameters

Vault Mode

Parameter       Type     Required  Description
vault_id        string   Yes       Vault containing the audio file
object_id       string   Yes       Object ID of the audio file
format          string   No        Output format: json (default) or text
speaker_labels  boolean  No        Identify different speakers
language_code   string   No        Language code (auto-detected if omitted)

Direct URL Mode

Parameter    Type    Required  Description
audio_url    string  Yes       URL to audio/video file (max 5GB, 10 hours)
webhook_url  string  No        URL for completion notification

Shared Options

Parameter              Type     Default                             Description
speaker_labels         boolean  false                               Identify different speakers
speakers_expected      number   -                                   Expected number of speakers
language_code          string   auto                                Language code (en, es, fr, de, etc.)
speech_models          array    ["universal-3-pro", "universal-2"]  Priority-ordered speech models to use
punctuate              boolean  true                                Add punctuation
format_text            boolean  true                                Format numbers, dates, etc.
word_boost             array    -                                   Boost specific words (e.g., legal terms)
auto_highlights        boolean  false                               Detect key phrases
content_safety_labels  boolean  false                               Flag sensitive content
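
The parameter tables above can be turned into a small request-body builder. This is a sketch only: the function name `build_payload` is hypothetical, and it assumes the two modes are mutually exclusive, which the tables suggest but do not state outright.

```python
def build_payload(*, vault_id=None, object_id=None, audio_url=None, **options):
    """Return a request body for exactly one mode, plus any shared options."""
    vault_mode = vault_id is not None or object_id is not None
    if vault_mode and audio_url is not None:
        # Assumption: a request targets either a vault object or a URL, not both.
        raise ValueError("use either vault_id/object_id or audio_url, not both")
    if vault_mode:
        if vault_id is None or object_id is None:
            raise ValueError("Vault Mode requires both vault_id and object_id")
        payload = {"vault_id": vault_id, "object_id": object_id}
    elif audio_url is not None:
        payload = {"audio_url": audio_url}
    else:
        raise ValueError("provide vault_id/object_id or audio_url")
    # Drop options left as None so the API can apply its documented defaults.
    payload.update({k: v for k, v in options.items() if v is not None})
    return payload


body = build_payload(
    vault_id="vault_depositions",
    object_id="obj_xyz789",
    speaker_labels=True,
    speakers_expected=4,
    word_boost=["plaintiff", "defendant", "objection"],
)
print(body)
```

Omitting unset options, rather than sending explicit nulls, leaves defaults such as `punctuate` and `format_text` to the service.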

Get Results (Direct URL Mode)

curl https://api.case.dev/voice/transcription/JOB_ID \
  -H "Authorization: Bearer sk_case_YOUR_API_KEY"
Response (completed)
{
  "id": "474e21cf-fd65-45d4-97fd-87558f7caf9b",
  "status": "completed",
  "audio_duration": 3847000,
  "confidence": 0.94,
  "text": "Q: Can you state your name for the record?\nA: My name is Dr. Sarah Johnson...",
  "utterances": [
    {
      "speaker": "A",
      "text": "Can you state your name for the record?",
      "start": 120,
      "end": 2450
    },
    {
      "speaker": "B",
      "text": "My name is Dr. Sarah Johnson.",
      "start": 2450,
      "end": 4820
    }
  ],
  "chapters": [
    {
      "headline": "Witness Introduction",
      "summary": "Introduction and witness identification",
      "start": 120,
      "end": 15000
    }
  ]
}
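
A completed response like the one above can be rendered as a readable transcript. The sketch below assumes that `start` and `end` are in milliseconds, which the sample values suggest but the page does not state; the helper names are hypothetical.

```python
def format_timestamp(ms):
    """Convert milliseconds to HH:MM:SS (assumes the API reports milliseconds)."""
    total_seconds = ms // 1000
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def render_transcript(result):
    """Turn the `utterances` array into one '[timestamp] Speaker X: text' line each."""
    lines = []
    for utterance in result.get("utterances", []):
        stamp = format_timestamp(utterance["start"])
        lines.append(f"[{stamp}] Speaker {utterance['speaker']}: {utterance['text']}")
    return "\n".join(lines)


# Sample data copied from the response above.
result = {
    "utterances": [
        {"speaker": "A", "text": "Can you state your name for the record?", "start": 120, "end": 2450},
        {"speaker": "B", "text": "My name is Dr. Sarah Johnson.", "start": 2450, "end": 4820},
    ]
}
print(render_transcript(result))
```

Speaker labels arrive as letters ("A", "B"); mapping them to names such as "Counsel" or "Witness" would be a separate step in the calling application.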

Status Values

Status      Meaning
queued      Waiting to start
processing  Transcribing
completed   Done, results ready
failed      Error occurred
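
These statuses map naturally onto a polling loop that stops at `completed` or `failed`. The following is a minimal sketch, not an official client: `fetch_job` is a hypothetical caller-supplied function that performs GET /voice/transcription/{id} and returns the parsed JSON, and the interval and timeout defaults are arbitrary.

```python
import time

TERMINAL_STATUSES = {"completed", "failed"}  # from the Status Values table


def wait_for_transcription(fetch_job, job_id, interval=5.0, timeout=1800.0, sleep=time.sleep):
    """Poll until the job reaches a terminal status and return the final job dict."""
    waited = 0.0
    while True:
        job = fetch_job(job_id)
        if job["status"] in TERMINAL_STATUSES:
            return job
        if waited >= timeout:
            raise TimeoutError(f"job {job_id} still {job['status']} after {waited:.0f}s")
        sleep(interval)
        waited += interval


# Demo with a stand-in fetcher that reports "queued", "processing", then "completed".
responses = iter(["queued", "processing", "completed"])


def fake_fetch(job_id):
    return {"id": job_id, "status": next(responses)}


final = wait_for_transcription(fake_fetch, "JOB_ID", interval=0, sleep=lambda s: None)
print(final["status"])
```

Injecting `fetch_job` and `sleep` keeps the loop testable without network access. A `failed` job is returned rather than raised, so the caller decides how to handle it.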

Processing Times

Audio Length  Time
1 minute      ~15 seconds
10 minutes    ~1-2 minutes
1 hour        ~8-10 minutes
3 hours       ~20-30 minutes
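
When choosing a polling timeout, a rough estimate can be interpolated from the figures above. This is a heuristic sketch only: it uses the upper bound of each range, interpolates linearly between rows, and scales proportionally outside the table. None of this is a documented guarantee, and the function name is hypothetical.

```python
# (audio minutes, upper-bound processing seconds) taken from the table above
ESTIMATES = [(1, 15), (10, 120), (60, 600), (180, 1800)]


def estimate_processing_seconds(audio_minutes):
    """Interpolate between table rows; scale proportionally beyond either end."""
    first_minutes, first_seconds = ESTIMATES[0]
    if audio_minutes <= first_minutes:
        return first_seconds * audio_minutes / first_minutes
    for (m0, s0), (m1, s1) in zip(ESTIMATES, ESTIMATES[1:]):
        if audio_minutes <= m1:
            return s0 + (s1 - s0) * (audio_minutes - m0) / (m1 - m0)
    last_minutes, last_seconds = ESTIMATES[-1]
    return last_seconds * audio_minutes / last_minutes


for minutes in (1, 35, 60, 180):
    print(minutes, "min of audio ->", round(estimate_processing_seconds(minutes)), "s")
```

In practice a timeout of two or three times this estimate leaves headroom for queueing.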

Examples

Deposition with Speaker Labels (Vault Mode)

casedev voice:transcription create \
  --vault-id vault_depositions \
  --object-id "$OBJECT_ID" \
  --speaker-labels \
  --speakers-expected 4 \
  --word-boost "plaintiff,defendant,objection,sustained,overruled"

Court Recording (Direct URL with Webhook)

casedev voice:transcription create \
  --audio-url "https://storage.example.com/3-hour-hearing.m4a" \
  --speaker-labels \
  --webhook-url "https://your-app.com/webhooks/transcription"

Supported Formats

Audio: MP3, M4A, WAV, FLAC, OGG, OPUS, WebM
Video: MP4, WebM, MOV, AVI, MKV (audio track extracted)
Languages: 100+ including English, Spanish, French, German, Chinese, Japanese
Pricing: $0.01/minute. A 2-hour deposition costs $1.20.
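
The pricing arithmetic can be checked with a short calculation. This sketch assumes billing is strictly proportional to audio length; how partial minutes are rounded is not stated here, so rounding to the nearest cent is an assumption.

```python
from decimal import Decimal

PRICE_PER_MINUTE = Decimal("0.01")  # USD, from the pricing line above


def transcription_cost(audio_minutes):
    """Return the estimated cost in USD for the given audio length in minutes."""
    cost = Decimal(str(audio_minutes)) * PRICE_PER_MINUTE
    return cost.quantize(Decimal("0.01"))  # assumption: round to the nearest cent


print(transcription_cost(120))  # a 2-hour deposition
```

`Decimal` avoids the binary floating-point rounding that would otherwise creep into currency totals.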