Build Your First App That Talks to an AI Model
Write a small Python app that calls Anthropic's Claude API — install the SDK, keep your key out of source control, stream the response token-by-token, and handle the errors that actually happen in production.
What you'll build
You'll write a small command-line Python app that sends a prompt to Claude, streams the reply as it's generated, and fails gracefully on auth, rate-limit, and network errors. By the end you'll understand the request/response shape of the Messages API and how to wire a key in safely.
Prerequisites
- Python 3.8+ (python3 --version). The anthropic SDK requires 3.8 or newer.
- An Anthropic Console account and an API key from https://console.anthropic.com/ → Settings → API Keys. New keys start with sk-ant-.
- A funded account or active credit: the API is not free, though the calls in this tutorial cost a fraction of a cent.
- A terminal. Commands below use macOS/Linux shells; Windows notes are called out where they differ.
1. Set up an isolated project
Never install SDKs into your system Python. Use a virtual environment.
mkdir claude-quickstart && cd claude-quickstart
python3 -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
Your prompt should now be prefixed with (.venv). Install the official SDK plus a helper to load environment variables from a file:
pip install anthropic python-dotenv
2. Store your API key as an environment variable
Hard-coding keys in source is the most common way they leak into Git history. Put yours in a .env file that you never commit.
echo 'ANTHROPIC_API_KEY=sk-ant-your-real-key-here' > .env
echo '.env' >> .gitignore
echo '.venv/' >> .gitignore
The SDK automatically reads the ANTHROPIC_API_KEY variable, so you don't have to pass the key explicitly in code. python-dotenv will load .env into the process environment at startup.
If you'd rather not use a file, export it in your shell instead:
export ANTHROPIC_API_KEY=sk-ant-...
Just remember it won't persist across new terminals unless you add it to your shell profile.
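A missing key otherwise only shows up when the first request fails, so it can help to check for it at startup. The sketch below is one way to do that with only the standard library. The helper name require_api_key is hypothetical (the SDK does not provide it), and the sk-ant- prefix check is an assumption based on the note in the prerequisites.

```python
import os


def require_api_key(env=None):
    """Return the API key from the environment or raise a clear error.

    Hypothetical helper for this tutorial; it is not part of the SDK.
    """
    env = os.environ if env is None else env
    key = env.get("ANTHROPIC_API_KEY", "").strip()
    if not key:
        raise RuntimeError("ANTHROPIC_API_KEY is not set; check your .env file or shell.")
    if not key.startswith("sk-ant-"):
        # Assumption: keys start with sk-ant-, as stated in the prerequisites.
        raise RuntimeError("ANTHROPIC_API_KEY does not look like an Anthropic key.")
    return key


if __name__ == "__main__":
    try:
        require_api_key()
        print("API key found.")
    except RuntimeError as err:
        print(f"Configuration problem: {err}")
```

Call it right after load_dotenv() so that a misconfigured environment produces one readable message instead of an exception from deep inside the first request.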
3. Make your first (non-streaming) request
Start simple to confirm the plumbing works. Create hello.py:
import os
from dotenv import load_dotenv
from anthropic import Anthropic
load_dotenv() # reads .env into os.environ
# The client picks up ANTHROPIC_API_KEY from the environment automatically.
client = Anthropic()
message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=300,
    messages=[
        {"role": "user", "content": "In two sentences, what is an API?"}
    ],
)
# content is a list of content blocks; the first is the text reply.
print(message.content[0].text)
print("---")
print("Tokens used:", message.usage.input_tokens, "in /", message.usage.output_tokens, "out")
Run it:
python hello.py
A few things worth knowing:
- max_tokens is required and caps the output length. It is not a budget for the whole conversation.
- messages is the full conversation history. Each turn is a dict with a role (user or assistant) and content.
- model is a dated snapshot string. claude-sonnet-4-20250514 is a good general-purpose default; claude-3-5-haiku-20241022 is cheaper and faster for simple tasks.
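To make the "full conversation history" point concrete, here is a minimal sketch of how a messages list grows over several turns. It makes no API call. The add_turn helper is hypothetical, and the assistant text is a placeholder rather than real model output.

```python
def add_turn(history, role, content):
    """Return a new history list with one more turn appended.

    Hypothetical helper; the SDK just expects a plain list of dicts.
    """
    if role not in ("user", "assistant"):
        raise ValueError(f"role must be 'user' or 'assistant', got {role!r}")
    return history + [{"role": role, "content": content}]


history = []
history = add_turn(history, "user", "In two sentences, what is an API?")
# Placeholder reply; in a real app this would be the text returned by the model.
history = add_turn(history, "assistant", "An API is a contract between programs.")
history = add_turn(history, "user", "Give me one real-world example.")

# The whole list is what you would pass as messages= on the next request.
for turn in history:
    print(f"{turn['role']}: {turn['content']}")
```

Because every request carries the whole list, input token counts grow with each turn even though max_tokens only limits the new reply.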
4. Stream the response
For anything interactive, streaming makes the app feel instant — you print text as the model produces it instead of waiting for the whole reply. Use client.messages.stream() as a context manager. Create stream.py:
import sys
from dotenv import load_dotenv
from anthropic import Anthropic
load_dotenv()
client = Anthropic()
prompt = "Write a haiku about debugging at 2am."
with client.messages.stream(
    model="claude-sonnet-4-20250514",
    max_tokens=300,
    messages=[{"role": "user", "content": prompt}],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
    print()  # trailing newline
    final = stream.get_final_message()
print("---")
print("Stop reason:", final.stop_reason)
Run it:
python stream.py
You should see the haiku appear incrementally. stream.text_stream yields just the text deltas; get_final_message() returns the complete assembled Message once streaming finishes, including usage and stop_reason. A stop_reason of max_tokens means you hit your output cap and the reply was truncated.
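If you want to act on stop_reason rather than just print it, a small lookup like the one below is one option. This is a sketch: the describe_stop_reason and was_truncated helpers are hypothetical, and the mapping covers commonly documented values only, so treat it as possibly incomplete.

```python
def was_truncated(stop_reason):
    """True when the reply was cut off by the max_tokens cap."""
    return stop_reason == "max_tokens"


def describe_stop_reason(stop_reason):
    """Map a stop_reason string to a human-readable explanation."""
    # Assumption: this set of values may not be exhaustive.
    descriptions = {
        "end_turn": "the model finished its reply",
        "max_tokens": "the reply hit the max_tokens cap and was cut off",
        "stop_sequence": "a configured stop sequence was generated",
        "tool_use": "the model is requesting a tool call",
    }
    return descriptions.get(stop_reason, f"unrecognized stop reason: {stop_reason}")


for reason in ("end_turn", "max_tokens"):
    print(f"{reason}: {describe_stop_reason(reason)} (truncated={was_truncated(reason)})")
```

In stream.py you could pass final.stop_reason to was_truncated and, when it returns True, warn the user or retry with a larger max_tokens.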
5. Handle errors properly
Networks fail, keys get revoked, and rate limits exist. The SDK exposes typed exceptions so you can react appropriately. Here's a small, production-shaped wrapper — save as app.py:
import sys
from dotenv import load_dotenv
import anthropic
from anthropic import Anthropic
load_dotenv()
client = Anthropic()
def ask(prompt: str) -> None:
    try:
        with client.messages.stream(
            model="claude-sonnet-4-20250514",
            max_tokens=500,
            messages=[{"role": "user", "content": prompt}],
        ) as stream:
            for text in stream.text_stream:
                print(text, end="", flush=True)
            print()
    except anthropic.AuthenticationError:
        sys.exit("Auth failed: check ANTHROPIC_API_KEY is set and valid.")
    except anthropic.RateLimitError:
        sys.exit("Rate limited (429): slow down or upgrade your plan, then retry.")
    except anthropic.BadRequestError as e:
        sys.exit(f"Bad request (400): {e.message}")
    except anthropic.APIConnectionError as e:
        sys.exit(f"Network problem reaching the API: {e.__cause__}")
    except anthropic.APIStatusError as e:
        # Catch-all for other non-2xx responses (e.g. 500s).
        sys.exit(f"API returned {e.status_code}: {e.message}")
if __name__ == "__main__":
    question = " ".join(sys.argv[1:]) or "Say hello and tell me one fun fact."
    ask(question)
Run it with your own prompt:
python app.py "Explain idempotency to a junior dev"
Key exceptions, from most to least specific:
| Exception | HTTP status | Typical cause |
|---|---|---|
| AuthenticationError | 401 | Missing, wrong, or revoked key |
| PermissionDeniedError | 403 | Key lacks access to the resource/model |
| BadRequestError | 400 | Malformed params (e.g. missing max_tokens) |
| RateLimitError | 429 | Too many requests or tokens per minute |
| APIConnectionError | n/a | DNS/TLS/timeout before a response |
| APIStatusError | other 4xx/5xx | Base class; catch last as a fallback |
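The "most to least specific" ordering matters because Python runs the first except clause that matches, and a base class matches all of its subclasses. The sketch below demonstrates this with two stand-in classes that only mimic the SDK's hierarchy; they are defined locally so the example runs without the SDK or a network call.

```python
class APIStatusError(Exception):
    """Stand-in for the SDK's base class for non-2xx responses."""


class RateLimitError(APIStatusError):
    """Stand-in for the SDK's 429 exception, a subclass of the base."""


def classify(exc):
    """Specific clause first: the subclass is handled by its own branch."""
    try:
        raise exc
    except RateLimitError:
        return "rate limited"
    except APIStatusError:
        return "other API error"


def classify_wrong_order(exc):
    """Base class first: the RateLimitError branch can never run."""
    try:
        raise exc
    except APIStatusError:
        return "other API error"
    except RateLimitError:
        return "rate limited"


print(classify(RateLimitError()))              # rate limited
print(classify_wrong_order(RateLimitError()))  # other API error
```

Python does not warn about the unreachable branch, so a misordered handler silently turns every specific error into the generic fallback.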
For transient RateLimitError and 5xx responses, the SDK already retries up to two times by default, with exponential backoff between attempts. Tune it with Anthropic(max_retries=5) or per-request via client.with_options(max_retries=5).
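For intuition about what exponential backoff means when you raise max_retries, the sketch below computes a delay schedule in which each wait doubles up to a cap. The backoff_delays helper, the 0.5 second base, and the 8 second cap are illustrative assumptions, not the SDK's documented internals, and real implementations usually add random jitter.

```python
import random


def backoff_delays(max_retries, base=0.5, cap=8.0, jitter=0.0, rng=random.random):
    """Return the wait (in seconds) before each retry attempt.

    Illustrative only: base, cap, and jitter are assumed values.
    """
    delays = []
    for attempt in range(max_retries):
        delay = min(cap, base * (2 ** attempt))  # double each time, up to the cap
        delay += delay * jitter * rng()          # optional random spread
        delays.append(delay)
    return delays


print(backoff_delays(5))  # [0.5, 1.0, 2.0, 4.0, 8.0]
```

Under these assumed numbers, five retries could add more than 15 seconds of waiting before the call finally fails, which is worth keeping in mind for interactive apps.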
Verify it works
You're done when:
- python hello.py prints a two-sentence answer plus a token count line.
- python stream.py prints a haiku that visibly appears word-by-word, followed by Stop reason: end_turn.
- python app.py "..." streams answers and, if you temporarily break the key (export ANTHROPIC_API_KEY=bad), exits cleanly with Auth failed: ... instead of a raw traceback.
A successful non-streaming response object looks roughly like this when printed:
Message(id='msg_01...', content=[TextBlock(text='An API is...', type='text')],
model='claude-sonnet-4-20250514', role='assistant',
stop_reason='end_turn', usage=Usage(input_tokens=14, output_tokens=37))
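Since content is a list of blocks rather than a single string, indexing content[0] only works when the first block is text. A slightly more defensive approach is to join every text block, as sketched below. FakeTextBlock and FakeOtherBlock are hypothetical stand-ins so the example runs without the SDK; the only thing assumed about real blocks is the type and text attributes visible in the printed response above.

```python
from dataclasses import dataclass


@dataclass
class FakeTextBlock:
    """Stand-in for a text content block, for illustration only."""
    text: str
    type: str = "text"


@dataclass
class FakeOtherBlock:
    """Stand-in for any non-text block, which has no text attribute."""
    type: str = "tool_use"


def collect_text(content):
    """Join the text of every text block, skipping other block types."""
    return "".join(block.text for block in content if block.type == "text")


content = [FakeTextBlock("An API is "), FakeOtherBlock(), FakeTextBlock("a contract.")]
print(collect_text(content))  # An API is a contract.
```

In hello.py the equivalent call would be collect_text(message.content) in place of message.content[0].text.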
Troubleshooting
AuthenticationError even though the key looks right. Confirm the variable is actually loaded: add import os; print(os.environ.get("ANTHROPIC_API_KEY", "NOT SET")[:10]). If it prints NOT SET, your .env isn't being found — run the script from the directory containing .env, or pass an explicit path: load_dotenv("/full/path/.env").
ModuleNotFoundError: No module named 'anthropic'. Your virtual environment isn't active, or you installed into a different Python. Re-run source .venv/bin/activate and pip install anthropic, then which python should point inside .venv.
BadRequestError: max_tokens: field required. You omitted max_tokens. Unlike some APIs it is mandatory on every call.
RateLimitError or 400 ... credit balance is too low. A 429 means you exceeded your tier's per-minute limits — wait and retry. The credit-balance message means you need to add funds in Console → Billing; it is not a code bug.
Next steps
- System prompts and multi-turn chat: pass a top-level system="..." argument and append each assistant reply back into messages to hold a conversation.
- Tool use (function calling): let Claude call your own functions by passing a tools list; see the Anthropic docs' Tool use guide.
- Async at scale: swap Anthropic() for AsyncAnthropic() and await the calls to handle many requests concurrently.
- Prompt engineering: read Anthropic's prompt engineering documentation to get more reliable, structured outputs.
You now have the core loop every AI app is built on: authenticate, send messages, stream tokens, and handle failure. Everything else is composition on top of these five steps.