This is the core endpoint for all AI-powered features: summarization, extraction, analysis, and drafting.
POST /llm/v1/chat/completions
cURL
curl -X POST https://api.case.dev/llm/v1/chat/completions \
  -H "Authorization: Bearer sk_case_YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-sonnet-4.5",
    "messages": [
      {"role": "user", "content": "Summarize this deposition in 3 bullet points."}
    ]
  }'
{
  "id": "gen_01K972J7KV4Y0MJZ3SRTA6YYMH",
  "object": "chat.completion",
  "model": "anthropic/claude-sonnet-4.5",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Here are the key points:\n\n• Witness testified that...\n• Documents reviewed include...\n• Timeline established from..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 245,
    "completion_tokens": 87,
    "total_tokens": 332,
    "cost": 0.000105
  }
}
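The same request and response can be handled from Python with only the standard library. This is a minimal sketch rather than an official client: the helper names (`build_request`, `first_reply`, `send`) are hypothetical, and the URL, headers, and body fields are taken from the cURL example above.

```python
import json
import urllib.request

API_URL = "https://api.case.dev/llm/v1/chat/completions"


def build_request(api_key, messages, model="anthropic/claude-sonnet-4.5"):
    """Build (but do not send) the POST request shown in the cURL example."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def first_reply(response):
    """Return the assistant text from the first choice of a parsed response."""
    return response["choices"][0]["message"]["content"]


def send(request):
    """Send the request and decode the JSON body (performs a network call)."""
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)
```

Calling `first_reply(send(build_request(key, messages)))` would return the summary text; `finish_reason` and `usage` remain available on the parsed dictionary.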
Parameters
Required
| Parameter | Type | Description |
| --- | --- | --- |
| messages | array | The conversation. Each message has a role and content. |
Optional
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| model | string | casemark/core-large | Which model to use. 195+ models are available. |
| max_tokens | number | 4096 | Maximum tokens to generate |
| temperature | number | 1 | Randomness (0-2). Use 0 for factual tasks. |
| stream | boolean | false | Stream response token-by-token |
| stop | array | null | Stop generation when these strings appear |
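One way to assemble a request body from these parameters is sketched below. The function name `build_payload` is hypothetical, and the client-side checks are an assumption about what is worth catching early; the API may validate differently. Optional fields left as `None` are omitted so the server-side defaults listed above apply.

```python
def build_payload(messages, model=None, max_tokens=None,
                  temperature=None, stream=None, stop=None):
    """Return a request body dict, omitting optional fields that are unset."""
    if not messages:
        raise ValueError("messages is required and must not be empty")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    if max_tokens is not None and max_tokens <= 0:
        raise ValueError("max_tokens must be positive")

    payload = {"messages": messages}
    optional = {
        "model": model,
        "max_tokens": max_tokens,
        "temperature": temperature,
        "stream": stream,
        "stop": stop,
    }
    # Only send what the caller set explicitly.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload
```

Passing `temperature=0` is preserved (it is not `None`), which matters because 0 is the recommended value for factual tasks.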
Messages
Each message in the messages array:
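| Field | Type | Description |
| --- | --- | --- |
| role | string | system, user, or assistant |
| content | string | The message text. Vision requests pass an array of content parts instead (see Vision). |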
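A small pre-flight check can catch malformed messages before a request is sent. The sketch below is illustrative only: `validate_messages` is a hypothetical helper, and the rules it enforces are inferred from the field table and the examples on this page, not from a published schema.

```python
ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_messages(messages):
    """Raise ValueError if any message lacks a valid role or content."""
    if not isinstance(messages, list) or not messages:
        raise ValueError("messages must be a non-empty list")
    for i, msg in enumerate(messages):
        role = msg.get("role")
        if role not in ALLOWED_ROLES:
            raise ValueError(f"message {i}: unknown role {role!r}")
        content = msg.get("content")
        # Plain text is a string; vision requests use a list of parts.
        if not isinstance(content, (str, list)):
            raise ValueError(f"message {i}: content must be a string or list")
    return messages
```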
System prompts
Set the AI’s behavior with a system message:
CLI
casedev llm:v1:chat create-completion \
--model anthropic/claude-sonnet-4.5 \
--message '{role: system, content: "You are a legal assistant. Be concise. Cite case law when relevant."}' \
--message '{role: user, content: "What are the elements of negligence?"}'
Multi-turn conversations
Include previous messages to maintain context:
CLI
casedev llm:v1:chat create-completion \
--model openai/gpt-4o \
--message '{role: user, content: "What is a deposition?"}' \
--message '{role: assistant, content: "A deposition is sworn testimony taken outside of court..."}' \
--message '{role: user, content: "How long do they typically last?"}'
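Because the full history is sent with every request, it helps to keep it in one place. The following is a minimal sketch of such a holder; the `Conversation` class and its method names are hypothetical and not part of any Case.dev SDK.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Conversation:
    """Accumulates turns and renders them as a messages array."""
    system: Optional[str] = None
    turns: List[dict] = field(default_factory=list)

    def add_user(self, text):
        self.turns.append({"role": "user", "content": text})

    def add_assistant(self, text):
        self.turns.append({"role": "assistant", "content": text})

    def messages(self):
        # The system message, if any, always goes first.
        prefix = [{"role": "system", "content": self.system}] if self.system else []
        return prefix + list(self.turns)
```

After each response, append the assistant's reply with `add_assistant` before adding the next user turn, so the following request carries the complete exchange.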
Streaming
Get responses token-by-token as they’re generated:
CLI
casedev llm:v1:chat create-completion \
--model anthropic/claude-sonnet-4.5 \
--message '{role: user, content: "Write a case summary."}' \
--stream
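This page does not show the wire format of a streamed response. The sketch below assumes the common OpenAI-style server-sent events layout: lines of the form `data: {json}`, with text in `choices[0].delta.content`, ending with `data: [DONE]`. That layout is an assumption, not something confirmed here, and `iter_stream_text` is a hypothetical helper.

```python
import json


def iter_stream_text(lines):
    """Yield text fragments from an iterable of SSE lines (assumed format)."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # assumed end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text
```

With a real HTTP response, the lines would come from decoding the response body line by line; joining the yielded fragments reconstructs the full reply.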
Vision
Send images to models that support vision (Claude, GPT-4o):
// Assumes `client` is an already-initialized SDK client.
const response = await client.llm.v1.chat.createCompletion({
  model: 'anthropic/claude-sonnet-4.5',
  messages: [
    {
      role: 'user',
      content: [
        { type: 'text', text: 'What medical equipment is visible in this image?' },
        { type: 'image_url', image_url: { url: 'https://example.com/exhibit-a.jpg' } }
      ]
    }
  ]
});
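The content-parts structure in the example above can also be built as a plain dictionary for a raw HTTP request. This is a sketch only: `image_message` is a hypothetical helper, and it mirrors the `text` and `image_url` part shapes shown above without assuming any other part types.

```python
def image_message(text, *image_urls):
    """Build a user message with one text part and one part per image URL."""
    parts = [{"type": "text", "text": text}]
    for url in image_urls:
        parts.append({"type": "image_url", "image_url": {"url": url}})
    return {"role": "user", "content": parts}
```

The returned dictionary goes into the `messages` array like any other message.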
Usage and costs
Every response includes token counts and cost:
{
  "usage": {
    "prompt_tokens": 1245,
    "completion_tokens": 387,
    "total_tokens": 1632,
    "cost": 0.004896
  }
}
Reduce costs: Use temperature: 0 for factual extraction. Try cheaper models like deepseek/deepseek-chat or qwen/qwen-2.5-72b-instruct for simpler tasks.
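To track spend across many calls, the `usage` object from each response can be summed. The sketch below is illustrative; `UsageTotals` is a hypothetical name, and it relies only on the four usage fields shown in the examples on this page.

```python
from dataclasses import dataclass


@dataclass
class UsageTotals:
    """Running totals of token counts and cost across responses."""
    prompt_tokens: int = 0
    completion_tokens: int = 0
    total_tokens: int = 0
    cost: float = 0.0

    def add(self, response):
        # Missing fields count as zero so partial responses do not fail.
        usage = response.get("usage", {})
        self.prompt_tokens += usage.get("prompt_tokens", 0)
        self.completion_tokens += usage.get("completion_tokens", 0)
        self.total_tokens += usage.get("total_tokens", 0)
        self.cost += usage.get("cost", 0.0)
```

Comparing totals for the same workload on two models is a simple way to check whether a cheaper model is worth the switch.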
Common patterns
Deposition summary
CLI
casedev llm:v1:chat create-completion \
--model anthropic/claude-sonnet-4.5 \
--message '{role: system, content: "Summarize depositions with: 1. Key admissions 2. Timeline of events 3. Credibility issues 4. Contradictions with other testimony"}' \
--message '{role: user, content: "<deposition text>"}' \
--temperature 0.3 \
--max-tokens 2000
Indemnification clause extraction
CLI
casedev llm:v1:chat create-completion \
--model openai/gpt-4o \
--message '{role: system, content: "Extract all indemnification clauses. Return JSON: [{clause_text, page, party_protected}]"}' \
--message '{role: user, content: "<contract text>"}' \
--temperature 0
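When a prompt asks for JSON, the reply still arrives as a string in `message.content` and needs parsing. The sketch below is one cautious way to do that, assuming the model may wrap its answer in a Markdown code fence; `parse_clauses` is a hypothetical helper, and the required keys come from the prompt above.

```python
import json

FENCE = "`" * 3  # Markdown code fence marker
REQUIRED_KEYS = {"clause_text", "page", "party_protected"}


def parse_clauses(content):
    """Parse the model's reply into a list of clause dicts."""
    text = content.strip()
    if text.startswith(FENCE):
        # Drop the opening fence line (which may carry a language tag).
        text = text.split("\n", 1)[1] if "\n" in text else ""
        text = text.rstrip()
        if text.endswith(FENCE):
            text = text[: -len(FENCE)]
    clauses = json.loads(text)
    if not isinstance(clauses, list):
        raise ValueError("expected a JSON array of clauses")
    for i, clause in enumerate(clauses):
        if not isinstance(clause, dict):
            raise ValueError(f"clause {i} is not an object")
        missing = REQUIRED_KEYS - set(clause)
        if missing:
            raise ValueError(f"clause {i} is missing {sorted(missing)}")
    return clauses
```

A `ValueError` here (including `json.JSONDecodeError`, which subclasses it) is a signal to retry the request or flag the document for manual review.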
Medical record review
CLI
casedev llm:v1:chat create-completion \
--model anthropic/claude-opus-4 \
--message '{role: system, content: "You are a medical-legal expert. Identify standard-of-care deviations and timeline inconsistencies."}' \
--message '{role: user, content: "<medical records>"}' \
--max-tokens 5000