Cookbook · FastAPI · 2026
The wiring you paste into every API: the project layout, the DB session dep, settings, CORS, the error envelope, auth, pagination, health checks. Pydantic v2 and current FastAPI only — no @app.on_event, no orm_mode. Code, when to reach for it, and the gotcha.
The starter kit: the reusable scaffolding you wire once and trust forever. Lift the snippet, swap the names, ship. Everything here is the 2026 idiom: lifespan over startup events, Annotated[...] dependencies, ConfigDict(from_attributes=True) over the old orm_mode. Most of it is already running in DocChat, so you have a live place to copy it into.
When: starting any API bigger than a toy — the folder tree that scales past one main.py without a rewrite.
project tree
# app/
#   main.py            - create_app(), lifespan, router mounting
#   core/
#     config.py        - Settings (pydantic-settings), get_settings()
#     db.py            - engine, SessionLocal, get_db() dependency
#     security.py      - JWT encode/decode, password hashing
#   models/            - SQLAlchemy ORM tables (DB shape)
#     user.py
#   schemas/           - Pydantic models (wire shape: Create/Read/Update)
#     user.py
#   routers/           - APIRouter per resource (HTTP layer only)
#     users.py
#   services/          - business logic, talks to models (no HTTP here)
#     user_service.py
#   deps.py            - shared Annotated dependency aliases
# tests/
# pyproject.toml  .env  .env.example
routers do HTTP (parse request, return response), services do logic (the part you unit-test without a client), models are the DB, schemas are the wire. Mixing them is the #1 reason a FastAPI app turns into spaghetti — keep each layer ignorant of the one above it.
Don't import a router from a service or raise HTTPException inside a service: that couples your logic to HTTP and makes it impossible to reuse. Services raise domain errors; routers translate them.
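A minimal, framework-free sketch of that split. The names DomainError, UserNotFound, EmailTaken, STATUS_FOR and to_http are hypothetical, and the in-memory dict stands in for the database.

```python
from dataclasses import dataclass


class DomainError(Exception):
    """Base class for errors raised by the service layer (hypothetical name)."""


class UserNotFound(DomainError):
    pass


class EmailTaken(DomainError):
    pass


@dataclass
class User:
    id: int
    email: str


# --- service layer: plain Python, no HTTP imports ---
_USERS = {1: User(id=1, email="a@example.com")}  # stand-in for the DB


def get_user(user_id: int) -> User:
    try:
        return _USERS[user_id]
    except KeyError:
        raise UserNotFound(f"user {user_id} does not exist") from None


# --- router layer: the only place that knows about status codes ---
STATUS_FOR = {UserNotFound: 404, EmailTaken: 409}


def to_http(exc: DomainError) -> tuple[int, dict]:
    """Translate a domain error into (status_code, body)."""
    return STATUS_FOR.get(type(exc), 400), {"detail": str(exc)}
```

In a real route, the except DomainError branch would feed to_http's result into an HTTPException (or the error envelope), so the service stays testable without a client.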
When: every route that touches the database — one session per request, opened on entry, closed on exit, even if the route raises.
app/core/db.py
from collections.abc import Generator

from sqlalchemy import create_engine
from sqlalchemy.orm import Session, sessionmaker

from app.core.config import get_settings

engine = create_engine(get_settings().database_url, pool_pre_ping=True)
SessionLocal = sessionmaker(bind=engine, autoflush=False, expire_on_commit=False)


def get_db() -> Generator[Session]:
    db = SessionLocal()
    try:
        yield db
    finally:
        db.close()
app/deps.py
from typing import Annotated

from fastapi import Depends
from sqlalchemy.orm import Session

from app.core.db import get_db

# Alias once, reuse everywhere: def route(db: DbDep): ...
DbDep = Annotated[Session, Depends(get_db)]
yield makes get_db a context-managed dependency — the code after yield always runs on the way out, so the session can't leak. The Annotated alias is the 2026 idiom: you type db: DbDep instead of repeating Depends(get_db) in every signature, and it's reusable across files.
pool_pre_ping=True is not optional in production — without it, a connection killed by the DB or a proxy stays in the pool and the next request blows up with "server closed the connection." Also never share one session across requests; that's how you get stale data and cross-request corruption.
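To see the cleanup guarantee without a database, here is a small sketch. FakeSession, OPENED and call_route are hypothetical stand-ins, and contextlib.contextmanager only approximates how a framework drives a yield dependency.

```python
from collections.abc import Generator
from contextlib import contextmanager


class FakeSession:
    """Stand-in for a SQLAlchemy Session (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


OPENED: list[FakeSession] = []  # records every session handed out


def get_db() -> Generator[FakeSession, None, None]:
    db = FakeSession()
    OPENED.append(db)
    try:
        yield db
    finally:
        db.close()  # runs on normal exit and when the route raises


session_scope = contextmanager(get_db)


def call_route(fail: bool) -> str:
    """Simulate one request: open the dependency, run the route body, clean up."""
    try:
        with session_scope() as db:
            assert not db.closed  # the session is live inside the route
            if fail:
                raise RuntimeError("route blew up")
            return "ok"
    except RuntimeError:
        return "error"  # a framework would turn this into a 500
```

Both the happy path and the failing path leave every session in OPENED closed, which is the property the finally block buys you.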
When: anywhere you need config (DB URL, secret key, allowed origins) — typed, validated from env, and cached so it's parsed once.
app/core/config.py
from functools import lru_cache
from typing import Annotated

from fastapi import Depends
from pydantic import Field
from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_file=".env", extra="ignore")

    app_name: str = "DocChat API"
    database_url: str
    jwt_secret: str
    jwt_alg: str = "HS256"
    cors_origins: list[str] = Field(default_factory=list)


@lru_cache
def get_settings() -> Settings:
    return Settings()  # reads env once, then cached


SettingsDep = Annotated[Settings, Depends(get_settings)]
@lru_cache means the env is read and validated exactly once per process, not on every request — and because get_settings is a normal callable, you can call it directly at import time (in db.py) and inject it via Depends in routes. Same source of truth both ways.
a missing required field (database_url with no default) raises at startup, not at first request — that's the behaviour you want, but it means a broken .env kills the boot. In tests, override with app.dependency_overrides[get_settings] = ... rather than mutating real env.
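A stdlib-only sketch of the caching behaviour, assuming a plain dataclass in place of the pydantic-settings class. It mutates os.environ purely to show that the second call does not re-read it; in a test suite the dependency override is still the cleaner route.

```python
import os
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class Settings:
    """Plain-dataclass stand-in for the pydantic-settings class."""
    database_url: str


@lru_cache
def get_settings() -> Settings:
    # A missing variable raises KeyError on the first call, not on every request.
    return Settings(database_url=os.environ["DATABASE_URL"])


os.environ["DATABASE_URL"] = "sqlite:///one.db"
first = get_settings()

os.environ["DATABASE_URL"] = "sqlite:///two.db"
second = get_settings()      # still the cached object; the env is not re-read

get_settings.cache_clear()   # an explicit reset, e.g. between tests
third = get_settings()       # now reflects the changed environment
```

The cache_clear() call is the thing to remember if a test does change the environment: without it the stale Settings object survives for the life of the process.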
When: the entry point of every app — build the FastAPI instance in a function, open/close resources with lifespan, mount routers with prefixes and tags.
app/main.py
from contextlib import asynccontextmanager

from fastapi import FastAPI

from app.core.config import get_settings
from app.routers import users, health


@asynccontextmanager
async def lifespan(app: FastAPI):
    # startup: open pools, http clients, warm caches
    app.state.http = ...  # e.g. httpx.AsyncClient()
    yield
    # shutdown: close them in reverse
    await app.state.http.aclose()


def create_app() -> FastAPI:
    settings = get_settings()
    app = FastAPI(title=settings.app_name, lifespan=lifespan)
    app.include_router(health.router, tags=["meta"])
    app.include_router(users.router, prefix="/users", tags=["users"])
    return app


app = create_app()
lifespan is the modern replacement for the deprecated @app.on_event("startup"/"shutdown") pair — one async context manager, startup before yield, shutdown after. The factory (create_app) lets tests build a fresh app with overridden deps instead of importing a global. prefix + tags keep your URL structure and /docs grouping clean.
resources opened in lifespan live on app.state, not in module globals — reach them in routes via request.app.state.http. And if startup raises before yield, the app never serves traffic; wrap risky init so a failed cache warm doesn't take down the whole service.
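One way to wrap risky init, sketched with asyncio only. warm_cache, the events list and the SimpleNamespace app are hypothetical stand-ins; the point is the try/except around an optional resource and the startup/yield/shutdown ordering.

```python
import asyncio
import logging
from contextlib import asynccontextmanager
from types import SimpleNamespace

log = logging.getLogger("api.lifespan")
events: list[str] = []  # records the order things happen in


async def warm_cache() -> dict:
    # Simulate a flaky, non-essential dependency.
    raise ConnectionError("cache backend unreachable")


@asynccontextmanager
async def lifespan(app):
    events.append("startup")
    try:
        app.state.cache = await warm_cache()
    except ConnectionError:
        # Optional resource: log and start cold instead of refusing to boot.
        log.warning("cache warm failed; starting with an empty cache")
        app.state.cache = {}
    yield
    events.append("shutdown")


async def main() -> None:
    app = SimpleNamespace(state=SimpleNamespace())  # stand-in for a FastAPI app
    async with lifespan(app):
        events.append("serving")


asyncio.run(main())
```

Only swallow failures for resources the service can genuinely run without; a missing database should still stop the boot.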
When: a browser frontend on a different origin calls your API — the SPA, the Next.js app. Wire it from settings, never hardcode * with credentials.
app/main.py
from fastapi.middleware.cors import CORSMiddleware

app.add_middleware(
    CORSMiddleware,
    allow_origins=settings.cors_origins,  # ["https://app.docchat.io"]
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)
explicit origins from settings means staging and prod ship different allowlists from the same code — no edits, just env. allow_methods/allow_headers as * is fine; it's allow_origins that needs to be tight.
allow_origins=["*"] together with allow_credentials=True is a combination browsers refuse: a wildcard Access-Control-Allow-Origin is not accepted on credentialed requests, so calls that carry cookies or Authorization can fail and you'll chase a phantom CORS bug. With credentials you must list real origins. Order matters too: add CORS before routers depend on it being there.
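A small guard you could run on the allowlist before handing it to the middleware. validate_cors is a hypothetical helper, not part of FastAPI or Starlette; it only encodes the rule above plus two common config typos.

```python
def validate_cors(origins: list[str], allow_credentials: bool) -> list[str]:
    """Reject allowlists that cannot work with credentialed requests."""
    if allow_credentials and "*" in origins:
        raise ValueError("list explicit origins when allow_credentials=True")
    for origin in origins:
        if origin != "*" and not origin.startswith(("http://", "https://")):
            raise ValueError(f"origin must include a scheme: {origin!r}")
    # An Origin header has no trailing slash, so strip one if the config has it.
    return [o.rstrip("/") for o in origins]
```

Calling it inside create_app() means a bad allowlist fails at boot rather than as a confusing browser error in staging.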
When: you want every error — yours and FastAPI's validation ones — to come back in one predictable JSON shape the frontend can switch on.
app/core/errors.py
from fastapi import FastAPI, Request, status
from fastapi.encoders import jsonable_encoder
from fastapi.exceptions import RequestValidationError
from fastapi.responses import JSONResponse


class AppError(Exception):
    def __init__(self, code: str, message: str, status: int = 400):
        super().__init__(message)  # so str(exc) and log output show the message
        self.code, self.message, self.status = code, message, status


def register_errors(app: FastAPI) -> None:
    @app.exception_handler(AppError)
    async def _app_error(request: Request, exc: AppError):
        return JSONResponse(
            status_code=exc.status,
            content={"error": {"code": exc.code, "message": exc.message}},
        )

    @app.exception_handler(RequestValidationError)
    async def _validation(request: Request, exc: RequestValidationError):
        return JSONResponse(
            status_code=status.HTTP_422_UNPROCESSABLE_ENTITY,
            content={"error": {
                "code": "validation_error",
                "message": "Invalid request",
                # exc.errors() can hold values that are not JSON-serializable
                "details": jsonable_encoder(exc.errors()),
            }},
        )
one envelope — {"error": {"code", "message"}} — means the frontend writes one error handler, not a different parse per status. Your services raise AppError("doc_not_found", ...) with a stable machine code; routers stay clean. Call register_errors(app) in the factory.
overriding RequestValidationError changes the default 422 body, so any client (or test) that read FastAPI's stock {"detail": [...]} shape breaks — keep exc.errors() under a details key so you don't lose the field-level info. And don't catch bare Exception and leak stack traces; log them, return a generic 500 envelope.
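The generic 500 could be built along these lines. This is a sketch: unhandled_to_envelope, the internal_error code and the error_id field are assumptions layered on the envelope shape above, not something the original kit defines.

```python
import logging
import uuid

log = logging.getLogger("api.errors")


def unhandled_to_envelope(exc: Exception) -> tuple[int, dict]:
    """Log the full traceback server-side; return a generic body to the client."""
    error_id = uuid.uuid4().hex  # lets support match a client report to a log line
    log.error("unhandled error %s", error_id, exc_info=exc)
    body = {
        "error": {
            "code": "internal_error",
            "message": "Something went wrong",
            "error_id": error_id,
        }
    }
    return 500, body
```

Inside register_errors, a handler registered for Exception could return JSONResponse(status_code=..., content=...) from this pair, so the traceback lands in the logs and never in the response.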
When: protecting routes — decode the JWT and load the user (401 on failure), plus a factory that gates by role (403).
app/deps.py
from typing import Annotated

import jwt  # pyjwt
from fastapi import Depends, HTTPException, status
from fastapi.security import OAuth2PasswordBearer

from app.core.config import SettingsDep
from app.models.user import User

# DbDep is the alias defined earlier in this same file (app/deps.py).
oauth2 = OAuth2PasswordBearer(tokenUrl="/auth/login")


# Plain def, not async def: the sync Session call then runs in FastAPI's
# threadpool instead of blocking the event loop.
def get_current_user(
    token: Annotated[str, Depends(oauth2)],
    settings: SettingsDep,
    db: DbDep,
) -> User:
    creds_exc = HTTPException(
        status.HTTP_401_UNAUTHORIZED,
        "Could not validate credentials",
        headers={"WWW-Authenticate": "Bearer"},
    )
    try:
        payload = jwt.decode(token, settings.jwt_secret, algorithms=[settings.jwt_alg])
    except jwt.PyJWTError:
        raise creds_exc
    user_id = payload.get("sub")
    if user_id is None:  # a validly signed token without a subject is still a 401
        raise creds_exc
    user = db.get(User, user_id)
    if user is None:
        raise creds_exc
    return user


CurrentUser = Annotated[User, Depends(get_current_user)]


def require_role(role: str):
    def _guard(user: CurrentUser) -> User:
        if user.role != role:
            raise HTTPException(status.HTTP_403_FORBIDDEN, "Insufficient role")
        return user
    return _guard

# usage: def admin_route(user: Annotated[User, Depends(require_role("admin"))]): ...
require_role is a dependency factory — it returns a fresh dep configured for the role, so you write Depends(require_role("admin")) per route with zero duplication. 401 means "who are you?" (bad/absent token); 403 means "I know you, you can't" (wrong role) — keeping them distinct is what good APIs do.
use pyjwt and always pass algorithms=[...] explicitly — decoding without pinning the algorithm is the classic JWT vulnerability (an attacker switches to none or HS/RS confusion). Never log the token or the secret. And keep sub a stable id, not the email, so a user changing their email doesn't invalidate live tokens.
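To make the pinning point concrete, here is a deliberately tiny HS256 signer and verifier built on the stdlib. It is an illustration of what algorithms=[...] protects against, not a replacement for pyjwt; it skips exp and every other claim check, and the helper names are hypothetical.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def _unb64url(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> bytes:
    return hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()


def sign_hs256(payload: dict, secret: str) -> str:
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = _b64url(json.dumps(payload).encode())
    return f"{header}.{body}.{_b64url(_sign(f'{header}.{body}', secret))}"


def verify(token: str, secret: str, algorithms: list[str]) -> dict:
    header_b64, payload_b64, sig_b64 = token.split(".")
    header = json.loads(_unb64url(header_b64))
    # The pin: the token's own "alg" is attacker-controlled, so it must match
    # a server-chosen allowlist before any signature check happens.
    # (This sketch only implements HS256.)
    if header.get("alg") not in algorithms or header.get("alg") != "HS256":
        raise ValueError("algorithm not allowed")
    expected = _sign(f"{header_b64}.{payload_b64}", secret)
    if not hmac.compare_digest(expected, _unb64url(sig_b64)):
        raise ValueError("bad signature")
    return json.loads(_unb64url(payload_b64))


token = sign_hs256({"sub": "42"}, "s3cret")

# A forged token that claims alg "none" and carries an empty signature.
forged = _b64url(json.dumps({"alg": "none"}).encode()) + "." + token.split(".")[1] + "."
```

A verifier that trusted the header would accept the forged token; the allowlist check rejects it before the signature is even looked at.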
When: any list endpoint that can grow — return a page, not the whole table, in a consistent {items, total, limit, offset} shape.
app/schemas/page.py
from typing import Annotated

from fastapi import Query
from pydantic import BaseModel, Field


class Paginated[T](BaseModel):
    items: list[T]
    total: int
    limit: int
    offset: int


class PageParams(BaseModel):
    # Inside a query-parameter model, constraints go on Field(); Query() is
    # applied once, on the Annotated alias below.
    limit: int = Field(20, ge=1, le=100)
    offset: int = Field(0, ge=0)


PageDep = Annotated[PageParams, Query()]
app/routers/users.py
from sqlalchemy import func, select


@router.get("", response_model=Paginated[UserRead])
def list_users(db: DbDep, page: PageDep):
    total = db.scalar(select(func.count()).select_from(User))
    rows = db.scalars(
        select(User).order_by(User.id).limit(page.limit).offset(page.offset)
    ).all()
    return Paginated(items=rows, total=total, limit=page.limit, offset=page.offset)
Paginated[T] is a Pydantic generic — declare response_model=Paginated[UserRead] once and the items are filtered/typed for free, with the page envelope documented in /docs. le=100 caps the page size so a client can't ask for 10 million rows and OOM you.
always pair limit/offset with a deterministic order_by — without it Postgres can return rows in different order between pages and the client sees duplicates or gaps. For deep pagination (offset in the thousands) offset gets slow; switch to keyset/cursor pagination — see the Postgres shelf.
When: returning DB rows as JSON — separate Pydantic schemas for Create / Read / Update so the wire shape never leaks internal columns.
app/schemas/user.py
from datetime import datetime

from pydantic import BaseModel, ConfigDict, EmailStr


class UserCreate(BaseModel):  # what the client sends to POST
    email: EmailStr
    password: str
    name: str


class UserUpdate(BaseModel):  # PATCH: all optional
    name: str | None = None


class UserRead(BaseModel):  # what we return: no password
    model_config = ConfigDict(from_attributes=True)

    id: int
    email: EmailStr
    name: str
    created_at: datetime
app/routers/users.py
@router.post("", response_model=UserRead, status_code=201)
def create_user(payload: UserCreate, db: DbDep):
    user = user_service.create(db, payload)  # returns an ORM User
    return user  # serialized via UserRead
ConfigDict(from_attributes=True) is the Pydantic v2 name for what used to be orm_mode — it lets the schema read attributes off an ORM object (user.email) instead of needing a dict. With response_model=UserRead, FastAPI strips anything not on the schema, so a password_hash column physically cannot leak into the response.
don't reuse one mega-schema for input and output — the input has password, the output must not, and the DB-generated id/created_at shouldn't be client-settable. Three small schemas beat one with half the fields optional. Watch lazy-loaded relationships: serializing them inside the request can fire extra queries (N+1) — load them eagerly in the service.
When: deploying anywhere with a load balancer or orchestrator (Docker, k8s, Render) — it needs a cheap liveness check and a real readiness check.
app/routers/health.py
from fastapi import APIRouter, HTTPException
from sqlalchemy import text

from app.deps import DbDep

router = APIRouter()


@router.get("/health")  # liveness: am I running?
def health():
    return {"status": "ok"}


@router.get("/ready")  # readiness: can I serve traffic?
def ready(db: DbDep):
    try:
        db.execute(text("SELECT 1"))
    except Exception:
        raise HTTPException(503, "db unreachable")
    return {"status": "ready"}
two checks, two jobs: /health is dirt-cheap (no I/O) so the orchestrator can ping it every second to decide whether to restart the container; /ready actually pings the DB so the load balancer only sends traffic once dependencies are live. Mixing them means a slow DB makes the platform kill a healthy process.
keep /health dependency-free — if it touches the DB, a DB blip triggers a restart loop instead of just pulling you out of rotation. And return 503 (not 500) from /ready on failure; most platforms treat 503 as "not ready, retry" and anything else as a hard error.
When: you want every request traceable across logs — a correlation id on the way out and one log line per request with method, path, status, duration.
app/core/middleware.py
import logging
import time
import uuid

from fastapi import Request

log = logging.getLogger("api.access")


async def request_context(request: Request, call_next):
    rid = request.headers.get("X-Request-ID") or uuid.uuid4().hex
    start = time.perf_counter()
    response = await call_next(request)
    dur_ms = (time.perf_counter() - start) * 1000
    response.headers["X-Request-ID"] = rid
    log.info(
        "%s %s -> %s %.1fms",
        request.method, request.url.path, response.status_code, dur_ms,
        extra={"request_id": rid},
    )
    return response

# wire in factory: app.middleware("http")(request_context)
honour an incoming X-Request-ID if the caller (a gateway, the frontend) sent one, otherwise mint a fresh hex — then echo it back and stamp it on every log line. When something breaks at 3am you grep one id and see the whole request's trail across services.
HTTP middleware wraps the exception handlers, so an error your handlers translate still arrives here as an ordinary response and gets its log line. If you raise inside the middleware itself, though, you bypass your error envelope entirely, so keep risky work in routes and dependencies where the handlers apply. And use perf_counter, not time.time(), for durations, because the wall clock can jump backwards.
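The middleware above attaches request_id only to its own access line. One common way to stamp it on every log line is a ContextVar plus a logging filter; this is a sketch under that assumption, with request_id_var, RequestIdFilter and the docchat.demo logger as hypothetical names.

```python
import contextvars
import io
import logging

request_id_var: contextvars.ContextVar[str] = contextvars.ContextVar(
    "request_id", default="-"
)


class RequestIdFilter(logging.Filter):
    def filter(self, record: logging.LogRecord) -> bool:
        record.request_id = request_id_var.get()  # stamp every record
        return True


stream = io.StringIO()  # stand-in for stdout so the output can be inspected
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(request_id)s %(name)s %(message)s"))
handler.addFilter(RequestIdFilter())

log = logging.getLogger("docchat.demo")
log.addHandler(handler)
log.setLevel(logging.INFO)
log.propagate = False

# In the middleware this would happen right after rid is computed.
token = request_id_var.set("abc123")
log.info("loading document")      # emitted with the request id
request_id_var.reset(token)
log.info("outside any request")   # falls back to the default "-"
```

With this in place, service code can log normally and never has to pass the id around by hand.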
When: accepting a file — DocChat's PDF ingest. Stream it, check the content type, cap the size before you trust it.
app/routers/documents.py
from typing import Annotated

from fastapi import APIRouter, File, HTTPException, UploadFile

from app.deps import DbDep

router = APIRouter()

MAX_BYTES = 25 * 1024 * 1024  # 25 MB
ALLOWED = {"application/pdf"}


@router.post("/documents", status_code=201)
async def upload(file: Annotated[UploadFile, File()], db: DbDep):
    if file.content_type not in ALLOWED:
        raise HTTPException(415, "Only PDF is supported")
    size = 0
    chunks: list[bytes] = []
    while chunk := await file.read(1 << 20):  # 1 MB at a time
        size += len(chunk)
        if size > MAX_BYTES:
            raise HTTPException(413, "File too large")
        chunks.append(chunk)
    data = b"".join(chunks)
    doc = document_service.ingest(db, file.filename, data)
    return {"id": doc.id, "bytes": size}
Read in chunks and tally the size as you go, so an oversized upload is rejected with 413 before you have pulled the whole thing into memory. UploadFile spools large files to a temp file on disk automatically, so receiving the upload doesn't blow up RAM. Note that the body has already been received by the time the route runs, which is why the proxy limit below matters. Needs python-multipart installed, or FastAPI raises when the route is defined.
content_type comes from the client and can be spoofed. For anything security-sensitive, sniff the real magic bytes (a PDF starts with %PDF) rather than trusting the header. Also don't rely on the per-request size limit alone; set a limit at the reverse proxy too, so a huge body never even reaches Python.
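The magic-byte check can be as small as this. looks_like_pdf is a hypothetical helper, and it is strict on purpose: it requires the marker at byte zero and says nothing about whether the rest of the file is a valid PDF.

```python
PDF_MAGIC = b"%PDF-"


def looks_like_pdf(head: bytes) -> bool:
    """Check the leading bytes of the upload, not the client-supplied header."""
    return head.startswith(PDF_MAGIC)
```

In the upload route, run it on the first chunk read from the file and return 415 if it fails, before reading anything further.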
When: work that should happen after the response — a confirmation email, kicking off post-upload processing — without making the client wait.
app/routers/documents.py
from fastapi import BackgroundTasks


def process_document(doc_id: int):
    # runs after the response is sent: parse, chunk, embed...
    ...


@router.post("/documents/{doc_id}/process", status_code=202)
def start_processing(doc_id: int, tasks: BackgroundTasks):
    tasks.add_task(process_document, doc_id)
    return {"status": "accepted", "id": doc_id}
BackgroundTasks is perfect for short, best-effort work — inject it, add_task, return 202 Accepted, and FastAPI runs the function after the response flushes. Zero extra infrastructure. The client gets an instant ack instead of waiting on the side effect.
be honest about the limits: a background task runs in the same process, so if the worker restarts mid-task the work is silently lost, and a long/CPU-heavy job will hog the event loop or worker. For anything that must not be dropped, must retry, or takes real time (embedding a 200-page PDF) → reach for Celery / RQ / Arq with a proper queue. BackgroundTasks is for fire-and-forget, not guaranteed delivery.
Covers FastAPI (Annotated, lifespan, BackgroundTasks, security utilities) and Pydantic v2 (ConfigDict(from_attributes=True), pydantic-settings). All snippets are 2026-current, with no deprecated @app.on_event or orm_mode.