Drills: Capstone Build

Integration is where projects break. These drills are the breakages you will hit wiring DocChat together — CORS, 401s, empty retrieval, env mismatch — and the exact way to trace and fix each one. Do them at the keyboard, with your real app.

How to use this page

Each drill is a real failure mode of a two-app system. Try to fix it yourself against your own running DocChat before reading the solution. The goal isn't to memorise fixes; it's to learn the tracing habit: read the error, find which boundary it's on, check that boundary's config.

A · Wiring the two apps Integration

Drill 1 · CORS

You log in from the Next.js UI and the console shows: Access to fetch ... blocked by CORS policy. The API works fine in /docs. Fix it.

Solution

The API runs on a different origin than the frontend, so the browser blocks it. The API must allow the frontend's origin. In app/main.py:

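from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()  # your existing app object; don't create a second one

app.add_middleware(
    CORSMiddleware,
    allow_origins=["http://localhost:3000"],  # the frontend origin, no trailing slash
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)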

CORS is enforced by the browser, not the server. That's why curl works (it isn't a browser) and /docs works (it is served from the API's own origin), while the Next.js app on a different origin is blocked. The fix belongs on the backend, at the Next.js↔FastAPI boundary.
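To make the browser's side of this concrete, the check it applies to a response can be modelled in a few lines. This is a simplified sketch, not the full CORS algorithm (it ignores the preflight method and header checks), and the function name `browser_would_allow` is hypothetical.

```python
def browser_would_allow(origin, response_headers, with_credentials=False):
    """Simplified model of the browser's CORS check on a response.

    origin: the page's origin, e.g. "http://localhost:3000".
    response_headers: the headers the API sent back, as a dict.
    """
    # Header names are case-insensitive, so normalise them first.
    headers = {name.lower(): value for name, value in response_headers.items()}
    allowed = headers.get("access-control-allow-origin")
    if allowed is None:
        return False  # no CORS header at all: the browser blocks the response
    if with_credentials:
        # Credentialed requests need an exact origin match (not "*")
        # and an explicit Access-Control-Allow-Credentials: true.
        return (
            allowed == origin
            and headers.get("access-control-allow-credentials") == "true"
        )
    return allowed in ("*", origin)
```

To get real headers to feed into it, one option is `curl -i -H "Origin: http://localhost:3000"` against your API and reading the `access-control-allow-origin` line of the response. If that line is missing, the middleware is not configured for that origin.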

Drill 2 · JWT

Login succeeds and returns a token, but every later request to /documents returns 401 Unauthorized. Attach the token correctly.

Solution

The token has to ride on every protected call as a Bearer header. Centralise it in one fetch wrapper so you can't forget:

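// lib/api.ts
export async function apiFetch(path: string, options: RequestInit = {}) {
  const headers = new Headers(options.headers);
  if (!headers.has("Content-Type")) {
    headers.set("Content-Type", "application/json");
  }
  // The missing piece: attach the token as a Bearer header.
  // localStorage exists only in the browser, so call this from client-side code.
  const token = localStorage.getItem("token");
  if (token) {
    headers.set("Authorization", `Bearer ${token}`);
  }
  return fetch(`${process.env.NEXT_PUBLIC_API_URL}${path}`, { ...options, headers });
}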

A 401 means "I don't know who you are" — the header is missing or malformed. A 403 would mean "I know you, but you can't have this". Different boundary, different fix.
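The 401/403 split can be sketched from the server's point of view. This is an illustration only, not DocChat's actual auth code: `bearer_token`, `status_for`, and the `resolve_user` callable (standing in for JWT verification) are all hypothetical names.

```python
def bearer_token(authorization):
    """Return the token from an 'Authorization: Bearer <token>' value, or None."""
    if not authorization:
        return None
    scheme, _, token = authorization.partition(" ")
    token = token.strip()
    if scheme.lower() != "bearer" or not token:
        return None  # wrong scheme or empty token: malformed header
    return token


def status_for(authorization, resource_owner_id, resolve_user):
    """Pick the status code for a request to a resource owned by one user.

    resolve_user: hypothetical callable mapping a token to a user id,
    or None when the token cannot be verified.
    """
    token = bearer_token(authorization)
    user_id = resolve_user(token) if token else None
    if user_id is None:
        return 401  # missing, malformed, or unverifiable token
    if user_id != resource_owner_id:
        return 403  # known user, but not their resource
    return 200
```

Note that a header of `Bearer null`, which is what a frontend sends if it interpolates a missing token into the string, is well-formed but fails verification, so it also lands on 401.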

B · Tracing a broken pipeline Debugging

Drill 3 · Trace the ask

A user asks a question and gets back an empty or nonsense answer — no error, just bad output. Trace the /ask pipeline to find where it breaks.

Solution

Walk the pipeline in order and print at each stage — the break is wherever the data stops looking right:

1. Question received?     → log the incoming question string
2. Question embedded?     → log the vector length (should be non-zero)
3. Chunks retrieved?      → log how many rows vector search returned
4. Prompt built?          → log the final prompt (chunks present?)
5. LLM called?            → log the raw model response

Most common find: step 3 returns zero chunks — meaning ingestion never populated chunks for this document, or the query is scoped to the wrong user_id/document_id. If chunks come back but the answer is empty, the break is step 4: the chunks weren't injected into the prompt.

Binary-search the pipeline: check the middle (retrieval) first. If chunks are there, the bug is downstream (prompt/LLM); if not, it's upstream (ingestion/scoping).
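The five checkpoints above could be wired into one traced function, roughly like this. It is a minimal sketch under assumptions: the four stage callables (`embed`, `search`, `build_prompt`, `call_llm`) are hypothetical stand-ins for your own functions, and chunks are assumed to be plain strings.

```python
import logging

logger = logging.getLogger("docchat.ask")


def trace_ask(question, embed, search, build_prompt, call_llm):
    """Run the five /ask stages, log each one, and name the first stage that looks wrong."""
    logger.info("1 question: %r", question)

    vector = embed(question)
    logger.info("2 embedding length: %d", len(vector))

    chunks = search(vector)
    logger.info("3 chunks retrieved: %d", len(chunks))

    prompt = build_prompt(question, chunks)
    # Every retrieved chunk should appear verbatim in the final prompt.
    chunks_in_prompt = all(chunk in prompt for chunk in chunks)
    logger.info("4 prompt length: %d, chunks present: %s", len(prompt), chunks_in_prompt)

    answer = call_llm(prompt)
    logger.info("5 raw response: %r", answer)

    # Report the earliest stage whose output stopped looking right.
    if not question.strip():
        first_break = "question"
    elif len(vector) == 0:
        first_break = "embedding"
    elif not chunks:
        first_break = "retrieval"
    elif not chunks_in_prompt:
        first_break = "prompt"
    elif not answer.strip():
        first_break = "llm"
    else:
        first_break = None
    return answer, first_break
```

Call `logging.basicConfig(level=logging.INFO)` once to see the log lines. A `first_break` of `"retrieval"` points upstream (ingestion or scoping); `"prompt"` or `"llm"` points downstream.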

Drill 4 · Env vars

Locally everything works; deployed, the frontend can't reach the API. Verify the env vars across both apps to find the mismatch.

Solution

The two boundary values must agree. Print/inspect them on each side:

# backend (.env): the API must allow the deployed frontend origin
FRONTEND_URL=https://docchat-web.vercel.app

# frontend (.env): must point at the deployed API, not localhost
NEXT_PUBLIC_API_URL=https://docchat-api.onrender.com

Classic mistake: NEXT_PUBLIC_API_URL still says http://localhost:8000 in production, or FRONTEND_URL still says localhost:3000 so CORS rejects the real site. Both must be the deployed URLs, and they must point at each other.

Remember: NEXT_PUBLIC_ vars are baked in at build time — change one and you must rebuild/redeploy the frontend for it to take effect.
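A small validator can catch the classic mistakes before a deploy. This is a sketch, not part of DocChat: `check_deployed_url` is a hypothetical helper, and the `must_be_origin` rule assumes FRONTEND_URL is passed straight into `allow_origins`, where the browser's origin (scheme, host, port, no path) must match exactly.

```python
from urllib.parse import urlsplit

LOCAL_HOSTS = {"localhost", "127.0.0.1", "0.0.0.0"}


def check_deployed_url(name, value, must_be_origin=False):
    """Return a list of problems with one URL-valued env var (empty list = looks fine)."""
    if not value:
        return [f"{name} is not set"]
    parts = urlsplit(value)
    if parts.scheme not in ("http", "https"):
        return [f"{name} must start with http:// or https://"]
    problems = []
    if parts.hostname in LOCAL_HOSTS:
        problems.append(f"{name} still points at a local address")
    # A CORS origin has no path, so even a trailing slash breaks the match.
    if must_be_origin and (parts.path or parts.query or parts.fragment):
        problems.append(f"{name} must be a bare origin (no path or trailing slash)")
    return problems
```

On the backend you might call it at startup with `os.environ.get("FRONTEND_URL")` and `must_be_origin=True`, and fail fast if the list is non-empty. The frontend value has to be checked in the frontend's own build environment, since the two apps do not share a `.env` file.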

C · The full build checklist

Tick each only when it genuinely works in your app — not when you "wrote the code".

D · Build challenge Ship

The happy path, end to end

Get the full upload → ask → answer flow working from the browser: sign up, upload a PDF, ask a question about it, and see a grounded answer with its sources. One complete pass through every layer. When it runs once, you have a capstone.

Challenge · troubleshoot the common breakages

It won't work first try. Use this map to locate the failure by its symptom, then fix the boundary it points to.

| Symptom | Boundary | Fix |
| --- | --- | --- |
| "Blocked by CORS policy" in console | Next.js ↔ FastAPI | Add the frontend origin to allow_origins (Drill 1) |
| 401 on every protected call | Frontend token handling | Attach the Authorization: Bearer header in lib/api.ts (Drill 2) |
| Answer empty or irrelevant | RAG pipeline | Trace retrieval; usually zero chunks or wrong scoping (Drill 3) |
| Works local, fails deployed | Env vars | Point both URL vars at the deployed apps; rebuild frontend (Drill 4) |
| Upload succeeds, ask finds nothing | Ingestion write path | Confirm chunks rows and embeddings were actually written |

The discipline: read the error → name the boundary → check that boundary's config. Almost every integration bug is one of these five.

E · Rapid recall Flashcards

Cover the answers and say each one out loud before you check it; that's the rep that locks the architecture in.

Q: Name the four architecture layers.
A: Next.js (frontend) → FastAPI (API) → Postgres+pgvector (data) → LLM (generation).

Q: Where do the embeddings live?
A: In the chunks.embedding column in Postgres (via pgvector), not in the model.

Q: What does the JWT secure?
A: It identifies the user on every call, so the API can scope all queries to their data.

Q: Why is CORS needed at all?
A: Frontend and API are different origins; the browser blocks cross-origin calls unless the API allows them.

Q: The ingestion order?
A: Extract text → chunk → embed → store chunks (with vectors) in Postgres.

Q: Retrieval → generation: what's the handoff?
A: Vector search returns the closest chunks; those are injected into the prompt the LLM generates from.

Next: The build runs and you can demo it. Now learn to talk about it, every layer, in three minutes, under interview pressure, in Lesson 8.2: Interview Mastery.