Skip to main content

Documentation Index

Fetch the complete documentation index at: https://docs.case.dev/llms.txt

Use this file to discover all available pages before exploring further.

What You’ll Build

An intelligent agent that can:
  • Store documents in an encrypted vault with automatic OCR and embedding generation
  • Answer questions by retrieving relevant document chunks and synthesizing responses
  • Add knowledge dynamically as users provide new information
  • Cite sources with page numbers and document references

Why RAG?

Large Language Models are powerful, but they can only reason on their training data. RAG solves this by:
  1. Embedding your documents into a searchable vector space
  2. Retrieving relevant chunks when a user asks a question
  3. Augmenting the LLM’s context with those chunks
  4. Generating an accurate, grounded response
With Case.dev, you don’t need to manage embeddings, vector databases, or chunking strategies — Vaults handle all of this automatically.

Architecture

Prerequisites

  • Case.dev API key (get one here)
  • Node.js 18+ or Python 3.9+
  • Vercel AI SDK (optional, for streaming UI)

Project Setup

Step 1: Install dependencies

# No installation needed — just set your API key
export CASEDEV_API_KEY="sk_case_YOUR_API_KEY"

Step 2: Set up environment variables

Environment
CASEDEV_API_KEY=sk_case_your_api_key

Step 3: Create a vault for your knowledge base

curl -X POST https://api.case.dev/vault \
  -H "Authorization: Bearer $CASEDEV_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Knowledge Base",
    "description": "Document intelligence agent knowledge store"
  }'

Core Functions

1. Add Documents to Knowledge Base

When a user uploads a document or provides information, store it in the vault:
casedev vault upload \
  --id $VAULT_ID \
  --filename "document.pdf" \
  --content-type "application/pdf"

2. Retrieve Relevant Information

Search the knowledge base for content relevant to a user’s question:
casedev vault search \
  --id $VAULT_ID \
  --query "search query"

3. Generate Responses with Context

Use the LLM Gateway to generate responses grounded in your documents:
# Search knowledge base
casedev vault search --id $VAULT_ID \
  --query "What are the key terms?" --method hybrid --limit 10

# Answer with LLM
casedev llm:v1:chat create-completion \
  --model anthropic/claude-sonnet-4.5 \
  --message '{role: system, content: "Answer using only the provided context. Cite sources."}' \
  --message '{role: user, content: "Context: <search results>\n\nQuestion: What are the key terms?"}' \
  --temperature 0.3 --max-tokens 1000

Building the Agent with Tools

For a more sophisticated agent that can decide when to search vs. add knowledge, use tool calling:
# Tool-calling agents require programmatic control flow.
# Use the Go or TypeScript SDK for agent loops.
# For simple questions, pipe vault search into LLM:
casedev vault search --id $VAULT_ID --query "What deadlines are coming up?"

casedev llm:v1:chat create-completion \
  --model anthropic/claude-sonnet-4.5 \
  --message '{role: system, content: "You are a document intelligence agent."}' \
  --message '{role: user, content: "Based on these documents: <search results>\n\nWhat deadlines are coming up?"}'

Integration with Vercel AI SDK

For Next.js applications, integrate with the Vercel AI SDK for streaming responses:
Typescript
// app/api/chat/route.ts
import { streamText, tool } from 'ai';
import { createOpenAICompatible } from '@ai-sdk/openai-compatible';
import { z } from 'zod';
import Casedev from 'casedev';

const client = new Casedev({ apiKey: process.env.CASEDEV_API_KEY });
const VAULT_ID = process.env.VAULT_ID;

// Set up Case.dev as an OpenAI-compatible provider for Vercel AI SDK
const casedev = createOpenAICompatible({
  name: 'casedev',
  baseURL: 'https://api.case.dev/llm/v1',
  headers: { Authorization: `Bearer ${process.env.CASEDEV_API_KEY}` },
});

export async function POST(req: Request) {
  const { messages } = await req.json();

  const result = streamText({
    model: casedev('anthropic/claude-sonnet-4.5'),
    system: `You are a document intelligence assistant. 
Check your knowledge base before answering questions.
Only respond using information from tool calls.
If no relevant information is found, say "I don't have information about that."`,
    messages,
    maxSteps: 5,
    tools: {
      searchDocuments: tool({
        description: 'Search the document knowledge base',
        parameters: z.object({
          query: z.string().describe('The search query')
        }),
        execute: async ({ query }) => {
          const results = await client.vault.search(VAULT_ID, {
            query,
            method: 'hybrid',
            limit: 5
          });
          return results.chunks.map(c => ({
            text: c.text,
            source: c.filename,
            page: c.page
          }));
        }
      }),
      
      addDocument: tool({
        description: 'Add information to the knowledge base',
        parameters: z.object({
          content: z.string().describe('Content to add')
        }),
        execute: async ({ content }) => {
          const upload = await client.vault.upload(VAULT_ID, {
            filename: `note-${Date.now()}.txt`,
            contentType: 'text/plain'
          });
          await fetch(upload.uploadUrl, {
            method: 'PUT',
            body: content
          });
          await client.vault.ingest(VAULT_ID, upload.objectId);
          return 'Added to knowledge base successfully';
        }
      })
    }
  });

  return result.toDataStreamResponse();
}

Example Usage

Typescript
// Add some knowledge
await addToKnowledgeBase(
  'The Smith v. Jones case was filed on March 15, 2024. The plaintiff alleges negligence in the maintenance of the property.',
  { topic: 'case-facts' }
);

await addToKnowledgeBase(
  'Deposition of John Smith on April 2, 2024: Witness stated he observed water damage on the ceiling two weeks before the incident.',
  { topic: 'depositions' }
);

// Ask questions
const result = await answerQuestion('When was the Smith v. Jones case filed?');
console.log(result.answer);
// "The Smith v. Jones case was filed on March 15, 2024 [1]."

const result2 = await answerQuestion('What did John Smith observe?');
console.log(result2.answer);
// "John Smith observed water damage on the ceiling two weeks before the incident [1]."

// Using the agent
const response = await runAgent('My favorite pizza topping is pepperoni. Remember that.');
console.log(response);
// "I've added that to my knowledge base. Your favorite pizza topping is pepperoni."

const response2 = await runAgent('What is my favorite pizza topping?');
console.log(response2);
// "According to my knowledge base, your favorite pizza topping is pepperoni."

Best Practices

Chunking is automatic. Case.dev Vaults automatically chunk documents into semantic segments optimized for retrieval. You don’t need to implement chunking yourself.
Combine semantic and keyword search for best results:
casedev vault search --id $VAULT_ID \
  --query "liability insurance coverage limits" \
  --method hybrid --limit 10

2. Set appropriate temperature

Use low temperature for factual retrieval:
curl -X POST https://api.case.dev/llm/v1/chat/createCompletion \
  -H "Authorization: Bearer $CASEDEV_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{}'

3. Structure your prompts

Be explicit about using only provided context:
Typescript
const systemPrompt = `You are a legal research assistant.

Rules:
- ONLY use information from the provided context
- If information is not in the context, say "I don't have that information"
- Always cite sources using [1], [2], etc.
- Never make up or infer facts not explicitly stated`;

4. Handle no results gracefully

Typescript
const results = await findRelevantContent(query);

if (results.length === 0 || results[0].score < 0.5) {
  return "I couldn't find relevant information in the knowledge base.";
}

Next Steps