Reading the docs and building a toy "ask a question, get an answer" demo takes about 20 minutes with the Claude API. Building something production-ready — with proper streaming, error handling, cost controls, and useful system prompts — takes a lot more thought.
This guide is about that second part. We'll cover the patterns that matter once you move past hello-world.
```bash
npm install @anthropic-ai/sdk
```

```ts
// lib/claude.ts
import Anthropic from '@anthropic-ai/sdk'

export const claude = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY!,
})
```

Always initialize the client once and export it. Creating a new instance per request wastes resources and loses connection pooling benefits.
Nobody wants to wait 8 seconds for a response to appear all at once. Always stream for user-facing features:
```ts
// app/api/chat/route.ts (Next.js)
import { claude } from '@/lib/claude'

export async function POST(req: Request) {
  const { messages, systemPrompt } = await req.json()

  const stream = await claude.messages.create({
    model: 'claude-sonnet-4-6',
    max_tokens: 1024,
    system: systemPrompt,
    messages,
    stream: true,
  })

  // Return a ReadableStream to the client
  const encoder = new TextEncoder()
  const readable = new ReadableStream({
    async start(controller) {
      for await (const event of stream) {
        if (event.type === 'content_block_delta' && event.delta.type === 'text_delta') {
          controller.enqueue(encoder.encode(event.delta.text))
        }
      }
      controller.close()
    },
  })

  return new Response(readable, {
    headers: {
      'Content-Type': 'text/plain; charset=utf-8',
      'Transfer-Encoding': 'chunked',
    },
  })
}
```

```tsx
// components/ChatStream.tsx — consuming the streamed response
'use client'
import { useState } from 'react'

export default function ChatStream() {
  const [response, setResponse] = useState('')
  const [loading, setLoading] = useState(false)

  async function sendMessage(message: string) {
    setLoading(true)
    setResponse('')

    const res = await fetch('/api/chat', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        messages: [{ role: 'user', content: message }],
        systemPrompt: 'You are a helpful coding assistant.',
      }),
    })

    const reader = res.body!.getReader()
    const decoder = new TextDecoder()

    while (true) {
      const { done, value } = await reader.read()
      if (done) break
      // { stream: true } avoids splitting multi-byte characters across chunk boundaries
      setResponse(prev => prev + decoder.decode(value, { stream: true }))
    }

    setLoading(false)
  }

  return (
    <div>
      <button onClick={() => sendMessage('Explain async/await in JavaScript')}>
        Ask Claude
      </button>
      {loading && <p>Thinking...</p>}
      <div>{response}</div>
    </div>
  )
}
```

A weak system prompt produces generic responses. A strong one shapes every response in the conversation. Invest time here.
```ts
const CODE_REVIEW_SYSTEM_PROMPT = `You are a senior software engineer doing a thorough code review.

When reviewing code:
- Focus on correctness, security, and performance, in that order
- Point out specific line numbers when referencing issues
- Explain WHY something is a problem, not just that it is one
- Suggest concrete improvements with example code
- Be direct but constructive — you're helping a colleague, not grading homework

Format your review as:
1. Summary (2-3 sentences)
2. Critical issues (must fix before merging)
3. Suggestions (nice to have)
4. What's done well

If there are no critical issues, say so clearly.`
```

```ts
const DOCUMENTATION_SYSTEM_PROMPT = `You are a technical writer who specializes in developer documentation.

Rules:
- Write for developers, not managers
- Use active voice
- Include runnable code examples for every concept
- Don't explain what something is — explain when and why to use it
- Keep paragraphs to 3 sentences maximum
- Use second person ("you"), not third person ("the developer")`
```

Tool use (function calling) lets Claude decide to call a function when it needs external data:
```ts
import Anthropic from '@anthropic-ai/sdk'

const claude = new Anthropic()

const tools: Anthropic.Tool[] = [
  {
    name: 'get_post_by_slug',
    description: 'Fetch a blog post by its URL slug. Use this when the user asks about a specific post.',
    input_schema: {
      type: 'object',
      properties: {
        slug: {
          type: 'string',
          description: 'The URL slug of the post, e.g. "nextjs-server-components"',
        },
      },
      required: ['slug'],
    },
  },
  {
    name: 'search_posts',
    description: 'Search blog posts by keyword. Use when the user asks to find posts about a topic.',
    input_schema: {
      type: 'object',
      properties: {
        query: { type: 'string', description: 'Search query' },
        limit: { type: 'number', description: 'Max results to return (default 5)' },
      },
      required: ['query'],
    },
  },
]

async function chatWithTools(userMessage: string) {
  const messages: Anthropic.MessageParam[] = [
    { role: 'user', content: userMessage },
  ]

  while (true) {
    const response = await claude.messages.create({
      model: 'claude-sonnet-4-6',
      max_tokens: 1024,
      tools,
      messages,
    })

    if (response.stop_reason === 'tool_use') {
      // Add Claude's response (with tool calls) to history
      messages.push({ role: 'assistant', content: response.content })

      // Execute each tool call
      const toolResults: Anthropic.ToolResultBlockParam[] = []
      for (const block of response.content) {
        if (block.type !== 'tool_use') continue
        let result: string
        if (block.name === 'get_post_by_slug') {
          // getPostBySlug and searchPosts are your own data-access functions
          const post = await getPostBySlug((block.input as any).slug)
          result = post ? JSON.stringify(post) : 'Post not found'
        } else if (block.name === 'search_posts') {
          const posts = await searchPosts((block.input as any).query)
          result = JSON.stringify(posts)
        } else {
          result = 'Unknown tool'
        }
        toolResults.push({
          type: 'tool_result',
          tool_use_id: block.id,
          content: result,
        })
      }

      // Add tool results to history and continue the loop
      messages.push({ role: 'user', content: toolResults })
      continue
    }

    // end_turn (or any other stop reason): return the text and exit —
    // falling through here prevents an infinite loop if the model stops early
    const textBlock = response.content.find(b => b.type === 'text')
    return textBlock?.type === 'text' ? textBlock.text : ''
  }
}
```

If your system prompt is long (documentation, a large context document), prompt caching can reduce costs by up to 90% on repeated calls:
```ts
const response = await claude.messages.create({
  model: 'claude-sonnet-4-6',
  max_tokens: 1024,
  system: [
    {
      type: 'text',
      text: longSystemPrompt, // your 2,000-word system prompt
      cache_control: { type: 'ephemeral' }, // cache this block
    },
  ],
  messages: [{ role: 'user', content: userMessage }],
})
```

The first call processes and caches the system prompt. Subsequent calls within the cache TTL (5 minutes) pay only the cache read price, which is ~10x cheaper than re-processing the prompt.
```ts
import Anthropic from '@anthropic-ai/sdk'
import { claude } from '@/lib/claude'

export async function callClaude(prompt: string, retries = 2): Promise<string> {
  try {
    const response = await claude.messages.create({
      model: 'claude-sonnet-4-6',
      max_tokens: 1024,
      messages: [{ role: 'user', content: prompt }],
    })
    const text = response.content[0]
    return text?.type === 'text' ? text.text : ''
  } catch (error) {
    if (error instanceof Anthropic.RateLimitError && retries > 0) {
      await new Promise(r => setTimeout(r, 2000))
      return callClaude(prompt, retries - 1)
    }
    if (error instanceof Anthropic.APIError) {
      // Retry server-side errors; surface everything else
      if (error.status && error.status >= 500 && retries > 0) {
        await new Promise(r => setTimeout(r, 1000))
        return callClaude(prompt, retries - 1)
      }
      throw new Error(`Claude API error: ${error.status} ${error.message}`)
    }
    throw error
  }
}
```

1. Setting `max_tokens` thoughtlessly. The Messages API requires it, but don't just pick a huge number: too small a value truncates responses unexpectedly, and too large a value lets a runaway response burn through tokens.
2. Putting conversation history in the system prompt. System prompts are for instructions, not history. Use the `messages` array for conversation turns.
3. Not handling `stop_reason`. Always check why the model stopped — `end_turn` vs `max_tokens` vs `tool_use` require different handling.
4. Building without observability. Log every API call in production: model, tokens used, latency, stop reason. You'll need this for debugging and cost tracking.
5. Exposing your API key client-side. Always proxy Claude calls through your backend. Never call the Anthropic API directly from the browser.
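Pitfall 4 is cheap to fix up front. Here is a minimal sketch of a logging wrapper — the `LogEntry` shape and the `console.log` sink are assumptions, so wire it into whatever observability stack you actually use:

```typescript
// Records the metrics pitfall 4 calls for: model, tokens, stop reason, latency.
interface LogEntry {
  model: string
  inputTokens: number
  outputTokens: number
  stopReason: string | null
  latencyMs: number
}

// The subset of response fields we read (matches the shape of an API message)
interface MessageLike {
  model: string
  stop_reason: string | null
  usage: { input_tokens: number; output_tokens: number }
}

async function withLogging<T extends MessageLike>(
  call: () => Promise<T>,
  log: (entry: LogEntry) => void = entry => console.log(JSON.stringify(entry)),
): Promise<T> {
  const started = Date.now()
  const response = await call()
  log({
    model: response.model,
    inputTokens: response.usage.input_tokens,
    outputTokens: response.usage.output_tokens,
    stopReason: response.stop_reason,
    latencyMs: Date.now() - started,
  })
  return response
}

// Usage with the real client would look like:
// const msg = await withLogging(() => claude.messages.create({ ... }))
```

Because the wrapper only needs the response's `model`, `usage`, and `stop_reason` fields, it works unchanged for tool-use loops and plain calls alike.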