Process document

Submit a document for OCR. We extract text, detect tables, and optionally generate a searchable PDF. Processing is asynchronous: you get a job ID immediately, then poll for results or use webhooks.

Endpoint

POST /ocr/v1/process

curl -X POST https://api.case.dev/ocr/v1/process \
  -H "Authorization: Bearer sk_case_YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "document_url": "https://storage.example.com/scanned-deposition.pdf"
  }'

Response

{
  "id": "1f4a195e-026b-41ff-b367-c61089f5f367",
  "status": "pending",
  "document_url": "https://storage.example.com/scanned-deposition.pdf",
  "engine": "doctr",
  "created_at": "2025-11-04T09:30:12Z",
  "links": {
    "self": "https://api.case.dev/ocr/v1/1f4a195e-026b-41ff-b367-c61089f5f367",
    "text": "https://api.case.dev/ocr/v1/1f4a195e-026b-41ff-b367-c61089f5f367/download/text",
    "json": "https://api.case.dev/ocr/v1/1f4a195e-026b-41ff-b367-c61089f5f367/download/json"
  }
}
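As a minimal sketch of how a client might read this response, the Python below parses the JSON shown above into a small record. The `OcrJob` class and `parse_job` function are illustrative names, not part of any official SDK, and the sketch assumes the response always has the fields shown in the sample.

```python
import json
from dataclasses import dataclass


@dataclass
class OcrJob:
    """Illustrative container for the fields in the sample response."""
    id: str
    status: str
    engine: str
    self_url: str
    text_url: str
    json_url: str


def parse_job(body: str) -> OcrJob:
    """Parse a process response body (JSON text) into an OcrJob."""
    data = json.loads(body)
    links = data.get("links", {})
    return OcrJob(
        id=data["id"],
        status=data["status"],
        engine=data.get("engine", ""),
        self_url=links.get("self", ""),
        text_url=links.get("text", ""),
        json_url=links.get("json", ""),
    )


# The sample response from above, trimmed to the fields the parser reads.
SAMPLE = """
{
  "id": "1f4a195e-026b-41ff-b367-c61089f5f367",
  "status": "pending",
  "engine": "doctr",
  "links": {
    "self": "https://api.case.dev/ocr/v1/1f4a195e-026b-41ff-b367-c61089f5f367",
    "text": "https://api.case.dev/ocr/v1/1f4a195e-026b-41ff-b367-c61089f5f367/download/text",
    "json": "https://api.case.dev/ocr/v1/1f4a195e-026b-41ff-b367-c61089f5f367/download/json"
  }
}
"""

job = parse_job(SAMPLE)
print(job.id, job.status)
```

Keeping the `links` URLs from the response, rather than rebuilding them from the job ID, avoids hard-coding the URL layout in client code.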

Parameters

Required

Parameter	Type	Description
`document_url`	string	URL to your document. HTTP/HTTPS or `s3://`

Optional

Parameter	Type	Default	Description
`document_id`	string	auto-generated	Your internal reference ID
`engine`	string	`doctr`	OCR engine (see below)
`callback_url`	string	—	Webhook URL for completion notification
`features`	object	`{}`	Additional processing options
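The parameter tables above could be turned into a request body with a small helper like the following. This is a sketch under stated assumptions: `build_process_payload` is a hypothetical helper, and the client-side checks (URL scheme, engine name) only mirror what the tables describe; the API may accept or reject values differently.

```python
from typing import Any, Dict, Optional
from urllib.parse import urlparse

# Values taken from the parameter and engine tables in this page.
ALLOWED_SCHEMES = {"http", "https", "s3"}
KNOWN_ENGINES = {"doctr", "paddleocr"}


def build_process_payload(
    document_url: str,
    document_id: Optional[str] = None,
    engine: Optional[str] = None,
    callback_url: Optional[str] = None,
    features: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build the JSON body for POST /ocr/v1/process, omitting unset options."""
    scheme = urlparse(document_url).scheme
    if scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"unsupported document_url scheme: {scheme!r}")
    if engine is not None and engine not in KNOWN_ENGINES:
        raise ValueError(f"unknown engine: {engine!r}")

    payload: Dict[str, Any] = {"document_url": document_url}
    optional = {
        "document_id": document_id,
        "engine": engine,
        "callback_url": callback_url,
        "features": features,
    }
    # Leave unset options out so the server-side defaults apply.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


payload = build_process_payload(
    "https://storage.example.com/scanned-deposition.pdf",
    document_id="smith-depo-2024",
    engine="doctr",
)
print(payload)
```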

OCR engines

Engine	Best for	Speed
`doctr`	Clean printed text, typed documents	Fast
`paddleocr`	Tables, forms, complex layouts, handwriting	Medium

For legal documents, start with `doctr`. If you get poor results on forms or tables, try `paddleocr`.
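If you already know something about a document before submitting it, the guidance in the engine table can be expressed as a simple rule. The function name and its flags below are hypothetical; they only restate the "Best for" column and are not a feature of the API.

```python
def choose_engine(
    has_tables: bool = False,
    has_forms: bool = False,
    handwritten: bool = False,
    complex_layout: bool = False,
) -> str:
    """Pick an engine name based on the 'Best for' column of the engine table."""
    if has_tables or has_forms or handwritten or complex_layout:
        return "paddleocr"
    # Default: clean printed or typed text.
    return "doctr"


print(choose_engine())                  # typed deposition
print(choose_engine(handwritten=True))  # handwritten notes
```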

Features

Enable additional processing by adding a `features` object to the request body. `embed` generates a searchable PDF; `tables` extracts tables in the given `format` (here, CSV).

JSON

{
  "features": {
    "embed": {},
    "tables": {
      "format": "csv"
    }
  }
}
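For readers calling the endpoint from Python rather than curl, the sketch below builds the same kind of request with a `features` object using only the standard library. `build_features` and `build_request` are illustrative helpers, and the request is constructed but not sent, so the snippet runs without network access or a real API key.

```python
import json
import urllib.request

API_URL = "https://api.case.dev/ocr/v1/process"


def build_features(searchable_pdf: bool = False, tables_csv: bool = False) -> dict:
    """Build the `features` object from two on/off switches."""
    features = {}
    if searchable_pdf:
        features["embed"] = {}
    if tables_csv:
        features["tables"] = {"format": "csv"}
    return features


def build_request(api_key: str, document_url: str, features: dict) -> urllib.request.Request:
    """Create (but do not send) the POST request for the process endpoint."""
    body = json.dumps({"document_url": document_url, "features": features}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_request(
    "sk_case_YOUR_API_KEY",
    "https://storage.example.com/patient-records.pdf",
    build_features(searchable_pdf=True, tables_csv=True),
)
# To actually submit the job, pass `req` to urllib.request.urlopen(req).
print(req.get_method(), req.full_url)
```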

Checking status

Poll the job to check if processing is complete:

casedev ocr:v1 retrieve --id $JOB_ID

Using webhooks

For large documents, use webhooks instead of polling:

casedev ocr:v1 process \
  --document-url "https://storage.example.com/500-page-discovery.pdf" \
  --callback-url "https://your-app.com/api/ocr-complete"

We POST the completed job to your callback URL when processing finishes.
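On the receiving side, a callback handler mostly needs to parse the posted job and pull out what your application cares about. The sketch below assumes the callback body has the same shape as the process response shown earlier (`id`, `status`, `links`); that shape is an assumption, as is the `completed` status value in the sample. `handle_ocr_callback` is a hypothetical function you would call from your web framework's route for the callback URL.

```python
import json
from typing import Any, Dict


def handle_ocr_callback(raw_body: bytes) -> Dict[str, Any]:
    """Parse a callback body and return the fields a typical app would store."""
    job = json.loads(raw_body.decode("utf-8"))
    if not isinstance(job, dict) or "id" not in job:
        raise ValueError("callback body does not look like an OCR job")
    links = job.get("links", {})
    return {
        "job_id": job["id"],
        "status": job.get("status"),
        "text_url": links.get("text"),
        "json_url": links.get("json"),
    }


# Hypothetical callback body, modeled on the process response above.
sample_body = json.dumps({
    "id": "1f4a195e-026b-41ff-b367-c61089f5f367",
    "status": "completed",  # ASSUMPTION: status name for a finished job
    "links": {
        "text": "https://api.case.dev/ocr/v1/1f4a195e-026b-41ff-b367-c61089f5f367/download/text",
        "json": "https://api.case.dev/ocr/v1/1f4a195e-026b-41ff-b367-c61089f5f367/download/json",
    },
}).encode("utf-8")

summary = handle_ocr_callback(sample_body)
print(summary["job_id"], summary["status"])
```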

S3 URLs

If your document is in S3, use an s3:// URL:

casedev ocr:v1 process \
  --document-url "s3://your-bucket/documents/deposition.pdf"

We automatically generate a presigned URL to access the file.
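Since a malformed `s3://` URL only fails once the job is submitted, it can help to check the URL locally first. The helper below is a small illustrative sketch, not part of the API: it splits an `s3://` URL into bucket and key and rejects anything that lacks either part.

```python
from typing import Tuple
from urllib.parse import urlparse


def split_s3_url(url: str) -> Tuple[str, str]:
    """Return (bucket, key) for an s3:// URL, or raise ValueError if malformed."""
    parsed = urlparse(url)
    bucket = parsed.netloc
    key = parsed.path.lstrip("/")
    if parsed.scheme != "s3" or not bucket or not key:
        raise ValueError(f"not a valid s3:// URL: {url!r}")
    return bucket, key


bucket, key = split_s3_url("s3://your-bucket/documents/deposition.pdf")
print(bucket, key)
```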

Examples

Scanned deposition

casedev ocr:v1 process \
  --document-url "https://storage.example.com/deposition-smith.pdf" \
  --document-id smith-depo-2024 \
  --engine doctr \
  --features.embed '{}'

Medical records with tables

casedev ocr:v1 process \
  --document-url "https://storage.example.com/patient-records.pdf" \
  --engine paddleocr \
  --features.tables '{"format": "csv"}' \
  --features.embed '{}' \
  --callback-url "https://your-app.com/webhooks/ocr"

Handwritten notes

casedev ocr:v1 process \
  --document-url "https://storage.example.com/witness-notes.jpg" \
  --engine paddleocr
