Cookbook · FastAPI · 2026

⚡ FastAPI — Everyday Patterns

The wiring you paste into every API: the project layout, the DB session dep, settings, CORS, the error envelope, auth, pagination, health checks. Pydantic v2 and current FastAPI only — no @app.on_event, no orm_mode. Code, when to reach for it, and the gotcha.

The mission: You know PHP/jQuery in your bones. This shelf is the FastAPI equivalent of your old functions.php kit, the reusable scaffolding you wire once and trust forever. Lift the snippet, swap the names, ship. Everything here is the 2026 idiom: lifespan over startup events, Annotated[...] dependencies, ConfigDict(from_attributes=True) over the old orm_mode. Most of it is already running in DocChat, so you have a live place to copy it into.

On this shelf — 13 recipes

Production project layout
DB session dependency
Settings dependency
App factory + lifespan
CORS the right way
Global error handler + envelope
Auth dependency & role guard
Pagination response
Response model from ORM
Health & readiness endpoints
Request-ID + timing middleware
File upload (validated)
Background task (fire-and-forget)

Structure & wiring 4 recipes

Production project layout

When: starting any API bigger than a toy — the folder tree that scales past one main.py without a rewrite.

project tree

# app/
#   main.py            — create_app(), lifespan, router mounting
#   core/
#     config.py        — Settings (pydantic-settings), get_settings()
#     db.py            — engine, SessionLocal, get_db() dependency
#     security.py      — JWT encode/decode, password hashing
#   models/            — SQLAlchemy ORM tables (DB shape)
#     user.py
#   schemas/           — Pydantic models (wire shape: Create/Read/Update)
#     user.py
#   routers/           — APIRouter per resource (HTTP layer only)
#     users.py
#   services/          — business logic, talks to models (no HTTP here)
#     user_service.py
#   deps.py            — shared Annotated dependency aliases
# tests/
# pyproject.toml  .env  .env.example

routers do HTTP (parse request, return response), services do logic (the part you unit-test without a client), models are the DB, schemas are the wire. Mixing them is the #1 reason a FastAPI app turns into spaghetti — keep each layer ignorant of the one above it.

don't import a router from a service or call HTTPException inside a service — that couples your logic to HTTP and makes it un-reusable. Services raise domain errors; routers translate them.
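A minimal sketch of that split, using only the standard library so it runs without a server. DocNotFound, get_doc, to_http and the in-memory _FAKE_DB are hypothetical names for illustration; in a real router the translation step would raise HTTPException or the AppError from the error-envelope recipe.

```python
from dataclasses import dataclass


class DomainError(Exception):
    """Base class for errors raised by the service layer (no HTTP knowledge)."""


class DocNotFound(DomainError):
    def __init__(self, doc_id: int):
        super().__init__(f"document {doc_id} not found")
        self.doc_id = doc_id


# --- service layer: plain Python, unit-testable without a client ---
_FAKE_DB = {1: "contract.pdf"}  # stand-in for a real table


def get_doc(doc_id: int) -> str:
    try:
        return _FAKE_DB[doc_id]
    except KeyError:
        raise DocNotFound(doc_id) from None


# --- HTTP layer: the only place that knows about status codes ---
@dataclass
class HttpError:
    status: int
    detail: str


def to_http(exc: DomainError) -> HttpError:
    if isinstance(exc, DocNotFound):
        return HttpError(404, str(exc))
    return HttpError(400, str(exc))
```

The service can now be reused from a CLI script or a background job, because nothing in it imports from the web framework.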

DB session dependency

When: every route that touches the database — one session per request, opened on entry, closed on exit, even if the route raises.

app/core/db.py

from collections.abc import Generator
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker, Session
from app.core.config import get_settings

engine = create_engine(get_settings().database_url, pool_pre_ping=True)
SessionLocal = sessionmaker(bind=engine, autoflush=False, expire_on_commit=False)

def get_db() -> Generator[Session, None, None]:  # yield type, send type, return type
    db = SessionLocal()
    try:
        yield db
    finally:
        db.close()

app/deps.py

from typing import Annotated
from fastapi import Depends
from sqlalchemy.orm import Session
from app.core.db import get_db

# Alias once, reuse everywhere: def route(db: DbDep): ...
DbDep = Annotated[Session, Depends(get_db)]

yield makes get_db a context-managed dependency — the code after yield always runs on the way out, so the session can't leak. The Annotated alias is the 2026 idiom: you type db: DbDep instead of repeating Depends(get_db) in every signature, and it's reusable across files.

pool_pre_ping=True is not optional in production — without it, a connection killed by the DB or a proxy stays in the pool and the next request blows up with "server closed the connection." Also never share one session across requests; that's how you get stale data and cross-request corruption.
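To see why the yield/finally shape can't leak a session, here is a small stand-alone sketch. FakeSession and run_route are hypothetical stand-ins (no SQLAlchemy or FastAPI involved), and contextlib.contextmanager is used only to drive the generator in roughly the way the framework does.

```python
from collections.abc import Generator
from contextlib import contextmanager


class FakeSession:
    """Stand-in for sqlalchemy.orm.Session; records whether close() ran."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


def get_db(session: FakeSession) -> Generator[FakeSession, None, None]:
    try:
        yield session
    finally:
        session.close()  # runs on normal exit and when the caller raises


def run_route(session: FakeSession, fail: bool) -> str:
    # Enter before the "route body", exit (or throw into the generator) after it.
    with contextmanager(get_db)(session) as db:
        if fail:
            raise RuntimeError("route blew up")
        return "ok" if db is session else "wrong session"
```

Both the happy path and the failing path leave session.closed set to True, which is the guarantee the recipe relies on.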

Settings dependency

When: anywhere you need config (DB URL, secret key, allowed origins) — typed, validated from env, and cached so it's parsed once.

app/core/config.py

from functools import lru_cache
from typing import Annotated
from fastapi import Depends
from pydantic import Field
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_file=".env", extra="ignore")

    app_name: str = "DocChat API"
    database_url: str
    jwt_secret: str
    jwt_alg: str = "HS256"
    cors_origins: list[str] = Field(default_factory=list)

@lru_cache
def get_settings() -> Settings:
    return Settings()  # reads env once, then cached

SettingsDep = Annotated[Settings, Depends(get_settings)]

@lru_cache means the env is read and validated exactly once per process, not on every request — and because get_settings is a normal callable, you can call it directly at import time (in db.py) and inject it via Depends in routes. Same source of truth both ways.

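a missing required field (database_url with no default) raises the first time get_settings() is called, which is at import time because db.py calls it, not at first request. That's the behaviour you want, but it means a broken .env kills the boot. In tests, override with app.dependency_overrides[get_settings] = ... rather than mutating real env; note the override only reaches injected uses, so the engine built at import in db.py still sees the real settings.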
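The caching behaviour is easy to check in isolation. The sketch below swaps pydantic-settings for a plain dataclass (an assumption made so it runs with the standard library alone) and shows that a cached get_settings() ignores later env changes until cache_clear() is called.

```python
import os
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class Settings:
    """Simplified stand-in for the pydantic-settings class above."""
    database_url: str


@lru_cache
def get_settings() -> Settings:
    # A missing variable raises KeyError here, i.e. on the first call.
    return Settings(database_url=os.environ["DATABASE_URL"])


os.environ["DATABASE_URL"] = "sqlite:///first.db"
first = get_settings()

os.environ["DATABASE_URL"] = "sqlite:///second.db"
cached = get_settings()          # cache hit: still the first object

get_settings.cache_clear()       # what a test fixture could call
fresh = get_settings()           # re-reads the environment
```

If a test really has to change the environment, clearing the cache before and after is what keeps the change from leaking into other tests.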

App factory + lifespan

When: the entry point of every app — build the FastAPI instance in a function, open/close resources with lifespan, mount routers with prefixes and tags.

app/main.py

from contextlib import asynccontextmanager
import httpx
from fastapi import FastAPI
from app.core.config import get_settings
from app.routers import users, health

@asynccontextmanager
async def lifespan(app: FastAPI):
    # startup: open pools, http clients, warm caches
    app.state.http = httpx.AsyncClient()  # example resource; swap in your own
    yield
    # shutdown: close them in reverse
    await app.state.http.aclose()

def create_app() -> FastAPI:
    settings = get_settings()
    app = FastAPI(title=settings.app_name, lifespan=lifespan)

    app.include_router(health.router, tags=["meta"])
    app.include_router(users.router, prefix="/users", tags=["users"])
    return app

app = create_app()

lifespan is the modern replacement for the deprecated @app.on_event("startup"/"shutdown") pair — one async context manager, startup before yield, shutdown after. The factory (create_app) lets tests build a fresh app with overridden deps instead of importing a global. prefix + tags keep your URL structure and /docs grouping clean.

resources opened in lifespan live on app.state, not in module globals — reach them in routes via request.app.state.http. And if startup raises before yield, the app never serves traffic; wrap risky init so a failed cache warm doesn't take down the whole service.
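One way to "wrap risky init" is to treat optional resources as degradable. This is a sketch under assumptions: warm_cache, cache and cache_warm are hypothetical names, and a SimpleNamespace stands in for the FastAPI app so the example runs with the standard library only.

```python
import asyncio
from contextlib import asynccontextmanager
from types import SimpleNamespace


async def warm_cache() -> dict:
    raise ConnectionError("cache backend is down")  # simulate a failed warm-up


@asynccontextmanager
async def lifespan(app):
    # Optional resource: degrade instead of refusing to boot.
    try:
        app.state.cache = await warm_cache()
    except ConnectionError:
        app.state.cache = {}          # start cold; routes must tolerate misses
        app.state.cache_warm = False
    else:
        app.state.cache_warm = True
    yield
    app.state.cache.clear()           # shutdown


async def main() -> tuple[bool, dict]:
    app = SimpleNamespace(state=SimpleNamespace())  # stand-in for FastAPI()
    async with lifespan(app):
        # the app would be serving traffic here
        return app.state.cache_warm, dict(app.state.cache)


result = asyncio.run(main())
```

Keep hard requirements (the database, the secret key) outside the try block: for those, failing the boot is the right outcome.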

Requests & responses 5 recipes

CORS the right way

When: a browser frontend on a different origin calls your API — the SPA, the Next.js app. Wire it from settings, never hardcode * with credentials.

app/main.py

from fastapi.middleware.cors import CORSMiddleware

app.add_middleware(
    CORSMiddleware,
    allow_origins=settings.cors_origins,   # ["https://app.docchat.io"]
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

explicit origins from settings means staging and prod ship different allowlists from the same code — no edits, just env. allow_methods/allow_headers as * is fine; it's allow_origins that needs to be tight.

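allow_origins=["*"] together with allow_credentials=True is a combination the CORS spec does not allow: browsers reject a credentialed response whose Access-Control-Allow-Origin is *, and you'll chase a phantom CORS bug. With credentials you must list real origins. Order matters too: Starlette runs the most recently added middleware outermost, so add CORS last if responses produced by your other middleware should carry CORS headers as well.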
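A small guard at startup can catch the usual allowlist mistakes before a browser does. check_cors is a hypothetical helper, not part of FastAPI or Starlette; the rules it enforces (no wildcard with credentials, scheme required, no trailing slash because browsers send the Origin header without one) are the assumptions of this sketch.

```python
def check_cors(origins: list[str], allow_credentials: bool) -> list[str]:
    """Return the origins unchanged, or raise if the combination is unsafe."""
    if allow_credentials and "*" in origins:
        raise ValueError("wildcard origin cannot be combined with credentials")
    for origin in origins:
        if origin == "*":
            continue
        if not origin.startswith(("http://", "https://")):
            raise ValueError(f"origin must include a scheme: {origin!r}")
        if origin.endswith("/"):
            raise ValueError(f"origin must not end with a slash: {origin!r}")
    return origins
```

In the factory this could wrap the setting, for example allow_origins=check_cors(settings.cors_origins, True), so a bad env value fails the boot instead of failing silently in the browser.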

Global error handler + envelope

When: you want every error — yours and FastAPI's validation ones — to come back in one predictable JSON shape the frontend can switch on.

app/core/errors.py

from fastapi import FastAPI, Request, status
from fastapi.encoders import jsonable_encoder
from fastapi.exceptions import RequestValidationError
from fastapi.responses import JSONResponse

class AppError(Exception):
    def __init__(self, code: str, message: str, status: int = 400):
        super().__init__(message)  # keeps str(exc) and tracebacks readable
        self.code, self.message, self.status = code, message, status

def register_errors(app: FastAPI) -> None:

    @app.exception_handler(AppError)
    async def _app_error(request: Request, exc: AppError):
        return JSONResponse(
            status_code=exc.status,
            content={"error": {"code": exc.code, "message": exc.message}},
        )

    @app.exception_handler(RequestValidationError)
    async def _validation(request: Request, exc: RequestValidationError):
        return JSONResponse(
            status_code=status.HTTP_422_UNPROCESSABLE_ENTITY,
            # jsonable_encoder: exc.errors() can hold values json.dumps rejects
            content={"error": {"code": "validation_error",
                                "message": "Invalid request",
                                "details": jsonable_encoder(exc.errors())}},
        )

one envelope — {"error": {"code", "message"}} — means the frontend writes one error handler, not a different parse per status. Your services raise AppError("doc_not_found", ...) with a stable machine code; routers stay clean. Call register_errors(app) in the factory.

overriding RequestValidationError changes the default 422 body, so any client (or test) that read FastAPI's stock {"detail": [...]} shape breaks — keep exc.errors() under a details key so you don't lose the field-level info. And don't catch bare Exception and leak stack traces; log them, return a generic 500 envelope.
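The "log it, return a generic 500" rule can be sketched as one pure function. to_envelope and the internal_error code are hypothetical names; AppError is repeated here only so the block runs on its own. A catch-all exception handler could return its result wrapped in a JSONResponse.

```python
import logging

log = logging.getLogger("api.errors")


class AppError(Exception):
    def __init__(self, code: str, message: str, status: int = 400):
        super().__init__(message)
        self.code, self.message, self.status = code, message, status


def to_envelope(exc: Exception) -> tuple[int, dict]:
    """Map any exception to (status_code, JSON body) in the shared shape."""
    if isinstance(exc, AppError):
        return exc.status, {"error": {"code": exc.code, "message": exc.message}}
    # Unknown failure: keep details in the log, never in the response body.
    log.error("unhandled error", exc_info=exc)
    return 500, {"error": {"code": "internal_error",
                           "message": "Something went wrong"}}
```

The frontend still gets the same {"error": {...}} shape for a crash, and the stack trace stays on the server.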

Auth dependency & role guard

When: protecting routes — decode the JWT and load the user (401 on failure), plus a factory that gates by role (403).

app/deps.py

from typing import Annotated
from fastapi import Depends, HTTPException, status
from fastapi.security import OAuth2PasswordBearer
import jwt  # pyjwt
from app.core.config import SettingsDep
from app.models.user import User

oauth2 = OAuth2PasswordBearer(tokenUrl="/auth/login")

def get_current_user(               # plain def: the sync db.get() runs in the threadpool
    token: Annotated[str, Depends(oauth2)],
    settings: SettingsDep,
    db: DbDep,
) -> User:
    creds_exc = HTTPException(status.HTTP_401_UNAUTHORIZED,
                              "Could not validate credentials",
                              headers={"WWW-Authenticate": "Bearer"})
    try:
        payload = jwt.decode(token, settings.jwt_secret, algorithms=[settings.jwt_alg])
    except jwt.PyJWTError:
        raise creds_exc from None
    user_id = payload.get("sub")
    if user_id is None:             # token without a subject claim
        raise creds_exc
    user = db.get(User, user_id)
    if user is None:
        raise creds_exc
    return user

CurrentUser = Annotated[User, Depends(get_current_user)]

def require_role(role: str):
    def _guard(user: CurrentUser) -> User:
        if user.role != role:
            raise HTTPException(status.HTTP_403_FORBIDDEN, "Insufficient role")
        return user
    return _guard

# usage: def admin_route(user: Annotated[User, Depends(require_role("admin"))]): ...

require_role is a dependency factory — it returns a fresh dep configured for the role, so you write Depends(require_role("admin")) per route with zero duplication. 401 means "who are you?" (bad/absent token); 403 means "I know you, you can't" (wrong role) — keeping them distinct is what good APIs do.

use pyjwt and always pass algorithms=[...] explicitly: decoding without pinning the algorithm is the classic JWT vulnerability (an attacker sets alg to none, or exploits HS256/RS256 key confusion). Never log the token or the secret. And keep sub a stable id, not the email, so a user changing their email doesn't invalidate live tokens.
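The factory idea extends naturally to "any of these roles". The sketch below is framework-free: Forbidden stands in for HTTPException(403), and User and require_any_role are hypothetical names, so treat it as an illustration of the closure rather than drop-in code.

```python
from collections.abc import Callable
from dataclasses import dataclass


class Forbidden(Exception):
    """Stand-in for HTTPException(403) so the sketch runs without FastAPI."""


@dataclass
class User:
    id: int
    role: str


def require_any_role(*roles: str) -> Callable[[User], User]:
    allowed = frozenset(roles)          # fixed when the guard is built

    def _guard(user: User) -> User:
        if user.role not in allowed:
            raise Forbidden(f"needs one of {sorted(allowed)}")
        return user

    return _guard


staff_only = require_any_role("admin", "editor")
```

Each call to the factory returns an independent guard, which is why one helper can serve every route without shared state.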

Pagination response

When: any list endpoint that can grow — return a page, not the whole table, in a consistent {items, total, limit, offset} shape.

app/schemas/page.py

from typing import Annotated
from fastapi import Query
from pydantic import BaseModel, Field

class Paginated[T](BaseModel):
    items: list[T]
    total: int
    limit: int
    offset: int

class PageParams(BaseModel):
    # constraints live on the model fields; Query() below marks the source
    limit: int = Field(20, ge=1, le=100)
    offset: int = Field(0, ge=0)

# query-parameter models need FastAPI 0.115+
PageDep = Annotated[PageParams, Query()]

app/routers/users.py

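from fastapi import APIRouter
from sqlalchemy import select, func
from app.deps import DbDep
from app.models.user import User
from app.schemas.page import Paginated, PageDep
from app.schemas.user import UserRead

router = APIRouter()

@router.get("", response_model=Paginated[UserRead])
def list_users(db: DbDep, page: PageDep):
    total = db.scalar(select(func.count()).select_from(User))
    rows = db.scalars(
        select(User).order_by(User.id).limit(page.limit).offset(page.offset)
    ).all()
    return Paginated(items=rows, total=total, limit=page.limit, offset=page.offset)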

Paginated[T] is a Pydantic generic — declare response_model=Paginated[UserRead] once and the items are filtered/typed for free, with the page envelope documented in /docs. le=100 caps the page size so a client can't ask for 10 million rows and OOM you.

always pair limit/offset with a deterministic order_by — without it Postgres can return rows in different order between pages and the client sees duplicates or gaps. For deep pagination (offset in the thousands) offset gets slow; switch to keyset/cursor pagination — see the Postgres shelf.

Response model from ORM

When: returning DB rows as JSON — separate Pydantic schemas for Create / Read / Update so the wire shape never leaks internal columns.

app/schemas/user.py

from datetime import datetime
from pydantic import BaseModel, ConfigDict, EmailStr

class UserCreate(BaseModel):        # what the client sends to POST
    email: EmailStr
    password: str
    name: str

class UserUpdate(BaseModel):        # PATCH — all optional
    name: str | None = None

class UserRead(BaseModel):          # what we return — no password
    model_config = ConfigDict(from_attributes=True)
    id: int
    email: EmailStr
    name: str
    created_at: datetime

app/routers/users.py

@router.post("", response_model=UserRead, status_code=201)
def create_user(payload: UserCreate, db: DbDep):
    user = user_service.create(db, payload)   # returns an ORM User
    return user                                # serialized via UserRead

ConfigDict(from_attributes=True) is the Pydantic v2 name for what used to be orm_mode — it lets the schema read attributes off an ORM object (user.email) instead of needing a dict. With response_model=UserRead, FastAPI strips anything not on the schema, so a password_hash column physically cannot leak into the response.

don't reuse one mega-schema for input and output — the input has password, the output must not, and the DB-generated id/created_at shouldn't be client-settable. Three small schemas beat one with half the fields optional. Watch lazy-loaded relationships: serializing them inside the request can fire extra queries (N+1) — load them eagerly in the service.

Cross-cutting 4 recipes

Health & readiness endpoints

When: deploying anywhere with a load balancer or orchestrator (Docker, k8s, Render) — it needs a cheap liveness check and a real readiness check.

app/routers/health.py

from fastapi import APIRouter, HTTPException
from sqlalchemy import text
from app.deps import DbDep

router = APIRouter()

@router.get("/health")             # liveness — am I running?
def health():
    return {"status": "ok"}

@router.get("/ready")              # readiness — can I serve traffic?
def ready(db: DbDep):
    try:
        db.execute(text("SELECT 1"))
    except Exception:
        raise HTTPException(503, "db unreachable")
    return {"status": "ready"}

two checks, two jobs: /health is dirt-cheap (no I/O) so the orchestrator can ping it every second to decide whether to restart the container; /ready actually pings the DB so the load balancer only sends traffic once dependencies are live. Mixing them means a slow DB makes the platform kill a healthy process.

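keep /health dependency-free: if it touches the DB, a DB blip triggers a restart loop instead of just pulling you out of rotation. And return 503 (not 500) from /ready on failure: 503 is the conventional "temporarily unavailable" signal, whereas 500 reads as a bug in the service. Kubernetes probes, for example, treat any status outside 200-399 as a failure, so the distinction mostly helps humans and dashboards.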
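Once /ready has more than one dependency to ping, it helps to aggregate them. This is a sketch with hypothetical names (readiness, the checks mapping); each check is any zero-argument callable that raises on failure, such as a function running SELECT 1.

```python
from collections.abc import Callable


def readiness(checks: dict[str, Callable[[], None]]) -> tuple[int, dict]:
    """Run every check; return (status_code, body) for the /ready response."""
    results: dict[str, str] = {}
    for name, check in checks.items():
        try:
            check()
            results[name] = "ok"
        except Exception as exc:
            # Report the error type only; messages may contain hostnames or DSNs.
            results[name] = f"fail: {type(exc).__name__}"
    ok = all(value == "ok" for value in results.values())
    body = {"status": "ready" if ok else "not_ready", "checks": results}
    return (200 if ok else 503), body
```

Returning the per-dependency map makes a failed rollout quicker to diagnose, since the response says which dependency is down.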

Request-ID + timing middleware

When: you want every request traceable across logs — a correlation id on the way out and one log line per request with method, path, status, duration.

app/core/middleware.py

import time, uuid, logging
from fastapi import Request

log = logging.getLogger("api.access")

async def request_context(request: Request, call_next):
    rid = request.headers.get("X-Request-ID") or uuid.uuid4().hex
    start = time.perf_counter()
    response = await call_next(request)
    dur_ms = (time.perf_counter() - start) * 1000
    response.headers["X-Request-ID"] = rid
    log.info(
        "%s %s -> %s %.1fms",
        request.method, request.url.path, response.status_code, dur_ms,
        extra={"request_id": rid},
    )
    return response

# wire in factory: app.middleware("http")(request_context)

honour an incoming X-Request-ID if the caller (a gateway, the frontend) sent one, otherwise mint a fresh hex — then echo it back and stamp it on every log line. When something breaks at 3am you grep one id and see the whole request's trail across services.

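http middleware wraps the router and your registered exception handlers, so errors they handle (AppError, HTTPException, validation) come back from call_next as normal responses and get logged here. A truly unhandled exception is different: it propagates out of call_next, so the lines after it never run and that request gets no log line unless you wrap the call in try/finally. If you raise inside the middleware itself, you bypass your error envelope entirely. And use perf_counter, not time.time(), for durations: the wall clock can jump backwards.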
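A try/finally around call_next keeps the access log complete even when the route crashes. The sketch below fakes the request cycle with plain coroutines (timed, ok and boom are hypothetical, and call_next returns just a status code) so the control flow can be run without a server.

```python
import asyncio
import time

records: list[tuple[str, int, bool]] = []   # (path, status, has_duration)


async def timed(path: str, call_next) -> int:
    start = time.perf_counter()
    status = 500                 # assume the worst until a response arrives
    try:
        status = await call_next(path)
        return status
    finally:
        dur_ms = (time.perf_counter() - start) * 1000
        records.append((path, status, dur_ms >= 0))   # stand-in for log.info


async def ok(path: str) -> int:
    return 200


async def boom(path: str) -> int:
    raise RuntimeError("unhandled")


async def main() -> None:
    await timed("/ok", ok)
    try:
        await timed("/boom", boom)
    except RuntimeError:
        pass  # the exception still propagates to the server's 500 handling


asyncio.run(main())
```

The crashing request is recorded with status 500 and a duration, and the exception is not swallowed, so whatever produces the 500 response further out still runs.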

File upload (validated)

When: accepting a file — DocChat's PDF ingest. Stream it, check the content type, cap the size before you trust it.

app/routers/documents.py

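from typing import Annotated
from fastapi import APIRouter, UploadFile, File, HTTPException
from app.deps import DbDep
from app.services import document_service  # assumed module path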

router = APIRouter()
MAX_BYTES = 25 * 1024 * 1024            # 25 MB
ALLOWED = {"application/pdf"}

@router.post("/documents", status_code=201)
async def upload(file: Annotated[UploadFile, File()], db: DbDep):
    if file.content_type not in ALLOWED:
        raise HTTPException(415, "Only PDF is supported")

    size = 0
    chunks: list[bytes] = []
    while chunk := await file.read(1 << 20):   # 1 MB at a time
        size += len(chunk)
        if size > MAX_BYTES:
            raise HTTPException(413, "File too large")
        chunks.append(chunk)

    data = b"".join(chunks)
    doc = document_service.ingest(db, file.filename, data)
    return {"id": doc.id, "bytes": size}

read in chunks and tally the size as you go, so you never pull more than MAX_BYTES into memory before answering 413. Note that by the time the handler runs, the multipart body has already been received and parsed: UploadFile spools large files to a temp file on disk automatically, which protects RAM but not bandwidth or disk. Needs python-multipart installed, or FastAPI raises when the route is declared.

content_type comes from the client and can be spoofed — for anything security-sensitive, sniff the real magic bytes (a PDF starts with %PDF) rather than trusting the header. Also don't reach for the per-request size limit alone; set a limit at the reverse proxy too, so a huge body never even reaches Python.
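The magic-byte check itself is tiny. looks_like_pdf is a hypothetical helper; it assumes the signature sits at the very start of the file, which is stricter than some PDF readers (they tolerate a few leading bytes), so loosen it if your uploads need that.

```python
PDF_MAGIC = b"%PDF-"


def looks_like_pdf(head: bytes) -> bool:
    """True if the first bytes of the upload carry the PDF signature."""
    return head.startswith(PDF_MAGIC)
```

In the upload handler this could run on the first chunk read from the file, before any further bytes are accepted, with a 415 on failure. Passing the check only proves the file starts like a PDF; it says nothing about the rest of the content being safe.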

Background task (fire-and-forget)

When: work that should happen after the response — a confirmation email, kicking off post-upload processing — without making the client wait.

app/routers/documents.py

from fastapi import BackgroundTasks

def process_document(doc_id: int):
    # runs after the response is sent: parse, chunk, embed...
    ...

@router.post("/documents/{doc_id}/process", status_code=202)
def start_processing(doc_id: int, tasks: BackgroundTasks):
    tasks.add_task(process_document, doc_id)
    return {"status": "accepted", "id": doc_id}

BackgroundTasks is perfect for short, best-effort work — inject it, add_task, return 202 Accepted, and FastAPI runs the function after the response flushes. Zero extra infrastructure. The client gets an instant ack instead of waiting on the side effect.

be honest about the limits: a background task runs in the same process, so if the worker restarts mid-task the work is silently lost, and a long/CPU-heavy job will hog the event loop or worker. For anything that must not be dropped, must retry, or takes real time (embedding a 200-page PDF) → reach for Celery / RQ / Arq with a proper queue. BackgroundTasks is for fire-and-forget, not guaranteed delivery.
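If you stay with BackgroundTasks, recording the outcome at least makes failures visible to a polling client. This is a sketch with hypothetical names: STATUS stands in for a status column on the document row, and tracked wraps whatever function does the real work.

```python
import logging
from collections.abc import Callable

log = logging.getLogger("api.tasks")
STATUS: dict[int, str] = {}   # stand-in for a status column in the database


def tracked(doc_id: int, work: Callable[[int], None]) -> None:
    """Run work(doc_id) and record processing / done / failed."""
    STATUS[doc_id] = "processing"
    try:
        work(doc_id)
    except Exception:
        log.exception("processing failed for document %s", doc_id)
        STATUS[doc_id] = "failed"
    else:
        STATUS[doc_id] = "done"
```

In the route it would be scheduled as tasks.add_task(tracked, doc_id, process_document). A row stuck on "processing" after a restart is the sign of a lost task, which is exactly the case a real queue handles for you.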

Source: Patterns follow the current FastAPI docs (dependencies with Annotated, lifespan, BackgroundTasks, security utilities) and Pydantic v2 (ConfigDict(from_attributes=True), pydantic-settings). All snippets are 2026-current: no deprecated @app.on_event or orm_mode.