Module 8 · Capstone · Deep Dive
This is the week it all becomes one thing. Seven modules of separate skills — Python, FastAPI, Postgres, auth, React, Next.js, RAG, deploy — now snap together into a single deployed app you can demo and talk through in an interview.
DocChat is a classic decoupled full-stack app: a Next.js frontend, a FastAPI backend, a Postgres database with vectors, and an LLM for generation. Each box owns one job and talks to the next over a clear boundary.

    ┌──────────┐   HTTPS    ┌────────────┐   HTTP+JWT   ┌──────────┐
    │ Browser  │ ─────────▶ │  Next.js   │ ───────────▶ │ FastAPI  │
    │  (user)  │            │ (frontend) │              │  (API)   │
    └──────────┘ ◀───────── └────────────┘ ◀─────────── └────┬─────┘
                 HTML/JSON                     JSON          │
                                                             │ SQL + vector search
                                                             ▼
                                                   ┌───────────────────┐
                                                   │ Postgres+pgvector │
                                                   │(data + embeddings)│
                                                   └─────────┬─────────┘
                                                             │ retrieved chunks
                                                             ▼
                                                   ┌───────────────────┐
                                                   │   LLM (Claude /   │
                                                   │ OpenAI) → answer  │
                                                   └───────────────────┘

Read each arrow as a contract:
| Arrow | What crosses it |
|---|---|
| Browser → Next.js | The user's clicks and form submits. Next.js renders the UI on the server (server components) and ships client components that run in the browser. |
| Next.js → FastAPI | fetch calls carrying JSON and a JWT in the Authorization header. This is the network boundary where CORS lives. |
| FastAPI → Postgres | SQLAlchemy sessions: row reads/writes and vector similarity search (ORDER BY embedding <=> query) via pgvector. |
| FastAPI → LLM | A prompt built from the user's question plus the retrieved chunks. The model returns the grounded answer text. |
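The FastAPI → LLM contract can be sketched as a plain function that turns a question and its retrieved chunks into one prompt string. This is a minimal illustration, not the course's actual prompt: the function name `build_prompt`, the instruction wording, and the `[n]` source numbering are all assumptions.

```python
def build_prompt(question: str, chunks: list[str]) -> str:
    """Combine retrieved chunks and the user's question into one prompt string."""
    # Number each chunk so the answer can be traced back to its sources.
    context = "\n\n".join(f"[{i}] {text}" for i, text in enumerate(chunks, start=1))
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Hypothetical example data, for illustration only.
prompt = build_prompt(
    "What is the refund window?",
    ["Refunds are accepted within 30 days.", "Refund requests go through support."],
)
print(prompt)
```

Keeping prompt construction in a pure function like this makes it easy to unit-test without calling the model.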
Three tables carry the whole app. You built these across Modules 3 and 6 — here's how they relate:

    users            documents             chunks
    ─────            ─────────             ──────
    id (PK)          id (PK)               id (PK)
    email            user_id (FK→users)    document_id (FK→documents)
    password_hash    filename              content (the text slice)
    created_at       status                embedding (vector, via pgvector)
                     created_at            chunk_index

One user owns many documents; one document is split into many chunks; each chunk stores both its text and its embedding (the vector). Retrieval searches the embedding column; generation reads the matching content.
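To make "retrieval searches the embedding column" concrete, here is a small pure-Python sketch of what ordering by pgvector's `<=>` (cosine distance) amounts to. In the real app Postgres does this work; the helper names `cosine_distance` and `top_k` and the toy two-dimensional vectors are assumptions for illustration.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """1 - cosine similarity: 0.0 for the same direction, up to 2.0 for opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Note: a zero vector would divide by zero; this sketch assumes non-zero embeddings.
    return 1.0 - dot / (norm_a * norm_b)


def top_k(query: list[float], chunks: list[dict], k: int = 2) -> list[dict]:
    """Return the k chunks whose embedding is closest to the query vector."""
    return sorted(chunks, key=lambda c: cosine_distance(query, c["embedding"]))[:k]


# Toy 2-dimensional embeddings; real ones have hundreds or thousands of dimensions.
chunks = [
    {"content": "Refunds are accepted within 30 days.", "embedding": [0.9, 0.1]},
    {"content": "Our office is in Berlin.", "embedding": [0.1, 0.9]},
    {"content": "Refund requests go through support.", "embedding": [0.8, 0.3]},
]

closest = top_k([1.0, 0.0], chunks, k=2)
print([c["content"] for c in closest])
```

The search ranks by `embedding`, but what gets passed on to generation is each winner's `content`.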
users + password_hash + JWT: built in Lesson 3.3 · Auth with JWT.
embedding column & vector search: built in Lesson 6.1 · RAG Foundations.
Scope every query by user_id, taken from the JWT, never from the request body. A user must only ever retrieve over their own documents. Forgetting this is the most common security bug in this kind of app.
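The ownership rule can be sketched as a query that joins chunks to documents and filters on the caller's id. This example uses the standard library's in-memory SQLite as a stand-in for Postgres and SQLAlchemy so that it runs anywhere; the function name `chunks_for` and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE documents (id INTEGER PRIMARY KEY, user_id INTEGER, filename TEXT);
    CREATE TABLE chunks (id INTEGER PRIMARY KEY, document_id INTEGER, content TEXT);
    INSERT INTO documents VALUES (1, 1, 'alice.pdf'), (2, 2, 'bob.pdf');
    INSERT INTO chunks VALUES (1, 1, 'alice chunk'), (2, 2, 'bob chunk');
    """
)


def chunks_for(current_user_id: int, document_id: int) -> list[str]:
    """Return a document's chunks only if the document belongs to the caller.

    current_user_id must come from the verified JWT, never from the request body.
    """
    rows = conn.execute(
        """
        SELECT c.content
        FROM chunks c
        JOIN documents d ON d.id = c.document_id
        WHERE d.id = ? AND d.user_id = ?
        ORDER BY c.id
        """,
        (document_id, current_user_id),
    ).fetchall()
    return [row[0] for row in rows]


print(chunks_for(current_user_id=1, document_id=1))  # the owner gets the chunk
print(chunks_for(current_user_id=1, document_id=2))  # another user's document: empty list
```

Because the filter lives in the query itself, a guessed document id returns nothing instead of leaking another user's text.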
Trace one full journey. Every step maps to a module you've already built — this is the "story" you'll tell an interviewer.
| Step | What happens | Built in |
|---|---|---|
| 1. Sign up / log in | Frontend posts credentials → FastAPI hashes/verifies → returns a JWT. Frontend stores it and sends it on every later call. | 3.3 Auth |
| 2. Upload a PDF | User picks a file → frontend POSTs it to /documents with the JWT → a documents row is created with status processing. | 2.2 FastAPI |
| 3. Ingest → chunk → embed | Backend extracts the text, splits it into chunks, embeds each chunk, and writes chunks rows with their vectors. Status flips to ready. | 6.2 Building RAG |
| 4. Ask a question | User types a question → frontend POSTs it to /ask with the JWT and a document id. | 5.2 Next.js |
| 5. Retrieve → generate | Backend embeds the question, runs vector search for the closest chunks, injects them into a prompt, and calls the LLM. | 6.2 Building RAG |
| 6. Display answer + sources | Frontend renders the answer and lists the source chunks it was grounded in — so the user can trust it. | 4.2 Hooks |
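Step 3's "splits it into chunks" can be sketched as a sliding window over the extracted text. The chunker from Lesson 6.2 is not shown here, so treat this as one possible approach: character-based windows with overlap, with `chunk_text`, `size`, and `overlap` as assumed names and defaults.

```python
def chunk_text(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into windows of `size` characters that share `overlap` characters."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the window already reached the end of the text
    return chunks


# Tiny sizes so the overlap is easy to see.
print(chunk_text("abcdefghij", size=4, overlap=1))
```

The overlap means a sentence that straddles a boundary still appears whole in at least one chunk, which tends to help retrieval.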
Two repos (or one monorepo with two folders). Keep them clean, because recruiters open these. A practical, conventional layout:

    docchat-api/
    ├── app/
    │   ├── main.py          # FastAPI app, CORS, router includes
    │   ├── config.py        # settings from env (Pydantic Settings)
    │   ├── database.py      # engine + session dependency
    │   ├── models.py        # User, Document, Chunk (SQLAlchemy)
    │   ├── schemas.py       # Pydantic request/response models
    │   ├── auth.py          # hashing, JWT create/verify, get_current_user
    │   ├── routers/
    │   │   ├── auth.py      # /register, /login
    │   │   ├── documents.py # /documents (upload, list)
    │   │   └── chat.py      # /ask
    │   └── rag/
    │       ├── ingest.py    # extract → chunk → embed → store
    │       └── retrieve.py  # embed question → vector search → prompt
    ├── alembic/             # migrations
    ├── .env                 # DATABASE_URL, JWT_SECRET, LLM_API_KEY (gitignored)
    ├── Dockerfile
    └── requirements.txt


    docchat-web/
    ├── app/
    │   ├── layout.tsx       # root layout
    │   ├── page.tsx         # landing
    │   ├── login/page.tsx   # auth form
    │   └── chat/page.tsx    # upload + ask UI
    ├── components/
    │   ├── UploadBox.tsx
    │   ├── ChatWindow.tsx
    │   └── SourceList.tsx
    ├── lib/
    │   └── api.ts           # fetch wrapper that attaches the JWT
    ├── .env.local           # NEXT_PUBLIC_API_URL (gitignored)
    └── package.json

The single most important file for integration is lib/api.ts — one place that knows the backend URL and attaches the token to every request. Centralise it and the whole frontend wires up cleanly.
You don't wire everything at once. Assemble in the order data flows, proving each seam before adding the next. Wire the backend spine first (it's the contract the frontend depends on), then the frontend, then the AI.
| # | Milestone | Proves the seam |
|---|---|---|
| 1 | DB + models migrated; FastAPI boots | Postgres ↔ SQLAlchemy |
| 2 | Register / login returns a JWT | Auth works end to end |
| 3 | Protected /documents upload + list | JWT actually guards routes |
| 4 | Ingestion pipeline fills chunks | RAG write path |
| 5 | /ask retrieves + generates an answer | RAG read path + LLM |
| 6 | Next.js login page hits the API | CORS + token storage |
| 7 | Upload + chat UI wired to the API | Full happy path |
| 8 | Deploy both; set production env vars | It's live → 7.2 Deploy & CI |
Test each endpoint in the interactive API docs (/docs) at every milestone: when the backend is solid, the frontend is just wiring. Backend spine before frontend skin.
Three integration concerns connect the boxes. Get these three right and the app holds together.
The frontend and backend live on different origins (different URL/port). Browsers block cross-origin requests unless the API explicitly allows them. In FastAPI:
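
    from fastapi import FastAPI
    from fastapi.middleware.cors import CORSMiddleware

    from app.config import settings  # assumed: the settings object defined in config.py

    app = FastAPI()

    # Allow only the frontend's origin to call this API from a browser.
    app.add_middleware(
        CORSMiddleware,
        allow_origins=[settings.FRONTEND_URL],  # e.g. http://localhost:3000
        allow_credentials=True,
        allow_methods=["*"],
        allow_headers=["*"],
    )
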
Forget this and the browser console shows the classic "blocked by CORS policy" error — even though the API itself is fine.
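The allow-list decision itself is simple, and a sketch of it can help when debugging. The function below is a simplified model of the check, not Starlette's actual implementation; `cors_headers` is a hypothetical name, and real middleware also handles preflight OPTIONS requests.

```python
from typing import Optional


def cors_headers(origin: Optional[str], allowed_origins: list[str]) -> dict[str, str]:
    """Return CORS response headers for a request, or {} if the origin is not allowed."""
    # Origins must match exactly: scheme, host, and port all count.
    if origin is None or origin not in allowed_origins:
        return {}
    return {
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Credentials": "true",
        "Vary": "Origin",
    }


allowed = ["http://localhost:3000"]
print(cors_headers("http://localhost:3000", allowed))  # headers present: browser lets the response through
print(cors_headers("http://127.0.0.1:3000", allowed))  # {}: different host string, so the browser blocks it
```

The second call shows a frequent local-dev trap: `localhost` and `127.0.0.1` are different origins as far as the browser is concerned.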
Login returns a token. The frontend stores it and attaches it as Authorization: Bearer <token> on every protected request. The backend's get_current_user dependency decodes it, finds the user, and scopes all queries to them.
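
    // lib/api.ts: attach the token once, everywhere
    export async function apiFetch(path: string, options: RequestInit = {}) {
      // localStorage only exists in the browser, so guard against server-side calls
      const token = typeof window !== "undefined" ? localStorage.getItem("token") : null;
      const headers = new Headers(options.headers);
      // Skip the header when logged out instead of sending "Bearer null"
      if (token) headers.set("Authorization", `Bearer ${token}`);
      return fetch(`${process.env.NEXT_PUBLIC_API_URL}${path}`, { ...options, headers });
    }
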
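On the backend side, "decodes it, finds the user" rests on signing and verifying the token. The sketch below builds an HS256 JWT using only the standard library to show what that involves. The course's own auth.py is not shown here, and a real app should rely on a maintained JWT library. The names `create_token` and `verify_token` and the use of the `sub` claim for the user id are assumptions.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Restore the stripped padding, then decode."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> str:
    digest = hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()
    return _b64url(digest)


def create_token(user_id: int, secret: str, ttl_seconds: int = 3600) -> str:
    """Issue a signed token whose `sub` claim carries the user id."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    claims = {"sub": str(user_id), "exp": int(time.time()) + ttl_seconds}
    payload = _b64url(json.dumps(claims).encode())
    return f"{header}.{payload}.{_sign(f'{header}.{payload}', secret)}"


def verify_token(token: str, secret: str) -> int:
    """Return the user id from a valid token; raise ValueError otherwise."""
    try:
        header, payload, signature = token.split(".")
    except ValueError:
        raise ValueError("malformed token") from None
    expected = _sign(f"{header}.{payload}", secret)
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(expected.encode(), signature.encode()):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(payload))
    if claims["exp"] < time.time():
        raise ValueError("token expired")
    return int(claims["sub"])


token = create_token(42, secret="dev-only-secret")
print(verify_token(token, secret="dev-only-secret"))
```

The user id returned by verification is the only id the backend should trust when scoping queries.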
The two apps share zero code but must agree on the boundary. Each reads its config from env:
| App | Var | Purpose |
|---|---|---|
| backend | DATABASE_URL | Postgres connection string |
| backend | JWT_SECRET | signs/verifies tokens |
| backend | LLM_API_KEY | calls the model |
| backend | FRONTEND_URL | the CORS allow-list origin |
| frontend | NEXT_PUBLIC_API_URL | where the API lives |
The two values that must match across apps: the frontend's NEXT_PUBLIC_API_URL points at the backend, and the backend's FRONTEND_URL points back at the frontend. Mismatch either and you get CORS errors or failed fetches.
This is the same idea as .env in a Laravel/Slim app: never commit secrets, read them at runtime. Next.js adds one twist: only NEXT_PUBLIC_-prefixed vars reach the browser, so never put a secret in one.
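A fail-fast settings loader for the backend variables in the table could look like the sketch below. The project layout names Pydantic Settings for config.py; this version swaps in a standard-library dataclass so it runs without dependencies. `Settings` and `load_settings` are hypothetical names, and the example values are placeholders.

```python
import os
from collections.abc import Mapping
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    DATABASE_URL: str
    JWT_SECRET: str
    LLM_API_KEY: str
    FRONTEND_URL: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, failing fast if any are missing."""
    names = ["DATABASE_URL", "JWT_SECRET", "LLM_API_KEY", "FRONTEND_URL"]
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return Settings(**{name: env[name] for name in names})


# Pass an explicit mapping here so the example runs without a real .env file.
example = load_settings(
    {
        "DATABASE_URL": "postgresql://localhost/docchat",
        "JWT_SECRET": "dev-only-secret",
        "LLM_API_KEY": "placeholder-key",
        "FRONTEND_URL": "http://localhost:3000",
    }
)
print(example.FRONTEND_URL)
```

Failing at startup with the list of missing names is usually easier to diagnose than a CORS error or a 500 that surfaces later.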
Work the milestones in order. A checklist for the run:
1. Backend up → uvicorn app.main:app --reload → open /docs
2. Register a user in /docs → copy the returned JWT
3. Authorize in /docs → upload a small PDF → confirm chunks exist
4. Call /ask with a question → confirm a grounded answer comes back
5. Frontend up → npm run dev → log in from the UI (CORS must pass)
6. Upload + ask from the UI → answer + sources render
7. Commit. You have a working capstone.
If a step fails, the drills page walks through the exact breakages (CORS, 401s, empty retrieval, env mismatch) and how to trace each one.
These test whether you understand the architecture, not just the code. Answer from memory.
Which boundary is where CORS is configured?
Where do the document embeddings actually live?
What does the JWT carry to scope queries safely?
Which piece should you wire and prove up first?
In the "ask" flow, what happens just before generation?