Module 1 · Python · Deep Dive

Python Foundations

From PHP habits to fluent Python — variables, types, strings, and the collections you'll use in every backend you ever write.

Basic · Intermediate · Build

Why this matters: The entire backend of DocChat (FastAPI routes, the RAG pipeline, database calls) is Python. You can't fake fluency here; interviewers hear it in five minutes. The good news: you already know how programs are shaped from PHP. This lesson swaps the grammar and rewires a few instincts, then drills them until they're automatic.

In this lesson
  1. Running Python & the REPL
  2. Variables & dynamic typing
  3. Core types & truthiness
  4. Strings & f-strings
  5. Lists, dicts, tuples, sets
  6. Control flow
  7. Build: a tiny word-counter
  8. Check yourself

1 · Running Python

Python runs your file top to bottom, like a PHP script without the HTML. Two ways to run it:

# 1. Run a file
python hello.py

# 2. The REPL — an interactive shell. Type code, see results instantly.
python
>>> 2 + 2
4

The REPL (Read-Eval-Print Loop) is your laboratory — use it constantly to test a line before you commit it to a file. There's no PHP equivalent you used day-to-day; lean on it.

PHP bridge: a .py file ≈ a .php script, but there are no <?php ?> tags — the whole file is code.
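The lesson names hello.py without showing its contents. A minimal sketch of what such a file might hold (the `greeting` variable is an illustrative assumption, not part of the lesson):

```python
# hello.py (hypothetical contents; the lesson only names the file)
greeting = "Hello from Python"   # no <?php tag, no $ prefix, no semicolon
print(greeting)                  # statements run in order, top to bottom
```

Saving this and running `python hello.py` should print the greeting once. On systems where `python` still points at Python 2 or is missing, the command is usually `python3`.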

2 · Variables & dynamic typing

No $, no ;, no var/let. You assign with = and the variable simply exists:

name = "Sam"
age  = 29
is_active = True

Like PHP, Python is dynamically typed: a variable's type is whatever you last assigned. Every name in Python is case-sensitive (in PHP, variables are case-sensitive but function and class names are not), and names are conventionally written in snake_case (not $camelCase).
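A short sketch of dynamic typing and case-sensitivity in practice. The variable names are illustrative only:

```python
value = 29
first_type = type(value).__name__     # "int"

value = "twenty-nine"                 # same name, now bound to a str
second_type = type(value).__name__    # "str"

# Names that differ only in case are different variables.
Name = "Sam"
name = "Alex"

print(first_type, second_type, Name, name)
```

Rebinding a name to a different type is legal, but in real code it tends to confuse readers, so treat it as something Python allows rather than something to aim for.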

Naming conventions (interviewers notice): snake_case for variables and functions · PascalCase for classes · UPPER_CASE for constants. Writing getUserName in Python signals "I just arrived from another language."

3 · Core types & truthiness

| Type | Example | Notes |
|---|---|---|
| str | "hello" | Text. Single or double quotes: same thing. |
| int | 42 | Whole numbers, unlimited size. |
| float | 3.14 | Decimals. |
| bool | True / False | Capitalised. bool is a subtype of int. |
| NoneType | None | Python's null: "no value". |

Check a type with type(x). Convert with int("5"), str(5), float("1.5").
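The checks and conversions above can be tried in one go. This is a small sketch with made-up variable names; it also shows two details that tend to surprise people coming from PHP:

```python
print(type("5"))                      # <class 'str'>

n = int("5") + 1                      # 6
label = str(5) + "!"                  # "5!"
ratio = float("1.5")                  # 1.5

# bool is a subclass of int, so True behaves like 1 in arithmetic.
bool_is_int = isinstance(True, int)   # True
total = True + True                   # 2

# Conversion fails loudly on bad input; PHP's (int) "abc" would quietly give 0.
try:
    int("abc")
except ValueError as err:
    conversion_error = str(err)

print(n, label, ratio, bool_is_int, total)
print(conversion_error)
```

Because a failed conversion raises ValueError instead of returning a fallback value, input parsing in Python is normally wrapped in try/except.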

Truthiness matters because you'll lean on it constantly. These are all "falsy": False, None, zero in any numeric type (0, 0.0), and empty containers such as "", [], {}, () and set(). Nearly everything else is "truthy".

items = []
if not items:
    print("the list is empty")   # runs — [] is falsy
PHP bridge: very close to PHP's truthiness, but note "0" (the string zero) is truthy in Python, unlike PHP where it's falsy.
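One way to check these rules yourself is to pass candidate values through bool(). The list below is an illustrative sample, not an exhaustive one, and includes a few values chosen because they look empty but are not:

```python
candidates = [False, None, 0, 0.0, "", [], {}, (), set(), "0", " ", [0], -1]

for value in candidates:
    # !r shows the repr, so "" and " " are distinguishable in the output
    print(f"{value!r:>8} -> {bool(value)}")

falsy = [v for v in candidates if not v]
truthy = [v for v in candidates if v]

print(truthy)   # ['0', ' ', [0], -1]
```

The string "0", a string holding one space, and a list containing a zero are all truthy: only a genuinely empty container or a numeric zero counts as falsy.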

4 · Strings & f-strings

Forget the . concatenation operator. The modern way to build strings is the f-string — a string prefixed with f, with {expressions} inlined:

name = "Sam"
count = 3
print(f"Hi {name}, you have {count} docs")
print(f"Next year: {count + 1}")   # expressions work too

Strings are objects with useful methods — these come up in interviews and in real ingestion code (you'll clean document text in the RAG module):

s = "  Hello World  "
s.strip()            # "Hello World"  (trim whitespace)
s.lower()            # "  hello world  "
s.replace("World", "UAE")   # "  Hello UAE  "
s.split()            # ['Hello', 'World']  (split on whitespace)
",".join(["a", "b"])  # "a,b"  (join is called on the separator!)

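Strings are immutable, so every method above returns a new string and leaves the original untouched. A minimal sketch of how the methods chain into a text-cleaning step; `normalize` is a hypothetical helper written for this example, not something from the RAG module:

```python
def normalize(text):
    """Hypothetical helper: collapse runs of whitespace, then lowercase."""
    # split() with no argument splits on any whitespace and drops empty pieces,
    # so joining with a single space removes extra spaces and newlines.
    return " ".join(text.split()).lower()

raw = "  Hello   World \n"
cleaned = normalize(raw)            # "hello world"

report = f"cleaned: {cleaned!r}"    # !r adds quotes so edge whitespace is visible
print(report)
print(repr(raw))                    # the original string is unchanged
```

To keep a result you have to assign it: calling `s.strip()` on its own line computes a value and throws it away.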
5 · The four collections

This is the heart of the lesson. Master these four and most Python clicks into place.

list: ordered, changeable

docs = ["intro.pdf", "report.pdf"]
docs.append("notes.pdf")   # add to end
docs[0]                    # "intro.pdf"  (zero-indexed)
docs[-1]                   # "notes.pdf"  (negative = from the end)
docs[0:2]                  # ['intro.pdf', 'report.pdf']  (a "slice")
len(docs)                 # 3
PHP bridge: a Python list ≈ a PHP numerically-indexed array. docs.append(x) ≈ $a[] = $x.
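A few more everyday list operations, plus one habit to unlearn: PHP copies an array on assignment, whereas assigning a Python list only creates a second name for the same object. The file names are sample data:

```python
docs = ["intro.pdf", "report.pdf"]
docs.append("notes.pdf")            # ['intro.pdf', 'report.pdf', 'notes.pdf']
docs.insert(0, "cover.pdf")         # insert at a position; the rest shifts right
last = docs.pop()                   # removes and returns the last item: 'notes.pdf'
docs.remove("cover.pdf")            # removes the first matching value
has_report = "report.pdf" in docs   # True

# Assignment does NOT copy a list.
alias = docs                        # a second name for the same list
snapshot = docs[:]                  # a full slice makes a shallow copy
alias.append("draft.pdf")           # also visible through docs

print(docs)       # ['intro.pdf', 'report.pdf', 'draft.pdf']
print(snapshot)   # ['intro.pdf', 'report.pdf']
```

When a function should not modify the caller's list, pass or make a copy (`docs[:]` or `list(docs)`).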

dict: key → value pairs

doc = {"title": "Intro", "pages": 12}
doc["title"]            # "Intro"
doc["author"] = "Sam"   # add/update a key
doc.get("missing")       # None  (safe — no error)
"title" in doc          # True
doc.keys(), doc.values(), doc.items()   # views of the keys, the values, and (key, value) pairs
PHP bridge: a dict ≈ a PHP associative array (["k" => "v"]). It's the single most-used structure in web code — JSON becomes a dict.
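To make "JSON becomes a dict" concrete, here is a small sketch using the standard library's json module. The payload is made-up sample data:

```python
import json

payload = '{"title": "Intro", "pages": 12, "tags": ["ai", "uae"]}'  # sample data
doc = json.loads(payload)               # JSON object -> dict, JSON array -> list

title = doc["title"]                    # 'Intro'
author = doc.get("author", "unknown")   # second argument is the fallback value

try:
    doc["author"]                       # square brackets raise on a missing key
except KeyError as err:
    missing_key = err.args[0]           # 'author'

for key, value in doc.items():          # items() yields (key, value) pairs
    print(f"{key} = {value!r}")

doc["pages"] += 1
as_json = json.dumps(doc)               # dict -> JSON text again
print(as_json)
```

Use `doc[key]` when a missing key is a bug you want to hear about, and `doc.get(key, default)` when absence is expected.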

tuple: ordered, unchangeable

point = (10, 20)     # round brackets, fixed once created
x, y = point        # "unpacking" — x=10, y=20

Use a tuple when a group of values shouldn't change (coordinates, a row from a database). Unpacking is everywhere in Python.
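A brief sketch of both ideas, immutability and unpacking. The `row` tuple stands in for a hypothetical database row and is not tied to any particular driver:

```python
row = (1, "intro.pdf", 12)          # hypothetical (id, filename, pages) row
doc_id, filename, pages = row       # unpack by position

try:
    row[2] = 13                     # tuples cannot be modified in place
except TypeError as err:
    error_message = str(err)

a, b = 1, 2
a, b = b, a                         # swap two values with no temporary variable

single = ("only",)                  # a one-item tuple needs the trailing comma

print(doc_id, filename, pages)
print(error_message)
```

Unpacking requires the same number of names as values; a mismatch raises ValueError.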

set: unique, unordered

tags = {"ai", "uae", "ai"}   # {"ai", "uae"} — duplicates dropped
"ai" in tags                 # True (very fast membership test)

Reach for a set when you need uniqueness or fast "is this in here?" checks.
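A sketch of the two common uses, de-duplicating and comparing groups, with made-up tag data:

```python
words = ["ai", "uae", "ai", "rag", "uae"]
unique = set(words)             # duplicates dropped; order is not guaranteed

seen = {"ai", "rag"}
unseen = unique - seen          # difference: in unique but not in seen
common = unique & seen          # intersection: in both
combined = seen | {"llm"}       # union: in either

empty = set()                   # note: {} creates an empty dict, not a set

print(sorted(unique))           # sorted() returns a list in a stable order
print(unseen, common, combined)
```

Because sets are unordered, sort them (or convert to a list) before displaying or comparing output line by line.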

6 · Control flow

Blocks are defined by a colon plus indentation, not braces. Indentation is syntax, not style: get it wrong and Python raises IndentationError. Four spaces per level is the standard convention (PEP 8).

if count > 10:
    tier = "heavy"
elif count > 0:
    tier = "light"
else:
    tier = "empty"

# Loop over ITEMS directly — not an index counter
for doc in docs:
    print(doc)

# Need the index too? enumerate gives both
for i, doc in enumerate(docs):
    print(i, doc)

while count > 0:
    count -= 1      # no count-- in Python
The #1 beginner trap: Looping with for i in range(len(docs)): docs[i] is "C/PHP brain". The Pythonic way is for doc in docs:. Only reach for range() when you genuinely need numbers.
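The three loop styles side by side, as a sketch. The list contents and result names are illustrative:

```python
docs = ["intro.pdf", "report.pdf", "notes.pdf"]

# Index-counter style: works, but reads like translated C/PHP.
by_index = []
for i in range(len(docs)):
    by_index.append(docs[i])

# Pythonic: iterate over the items themselves.
direct = []
for doc in docs:
    direct.append(doc)

# Need a counter as well? enumerate supplies it; start=1 gives human numbering.
numbered = []
for position, doc in enumerate(docs, start=1):
    numbered.append(f"{position}. {doc}")

# range() is the right tool when the numbers themselves are the data.
squares = []
for n in range(1, 4):               # 1, 2, 3 (the stop value is excluded)
    squares.append(n * n)

print(numbered)   # ['1. intro.pdf', '2. report.pdf', '3. notes.pdf']
print(squares)    # [1, 4, 9]
```

The first two loops build identical lists; the second simply has fewer places for an off-by-one error to hide.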

7 · Build it

Your tangible win: Build a word counter in the REPL or a file. Given a blob of text, print the 3 most common words. You'll use exactly this shape when you chunk documents for RAG later.

Try it yourself first, then compare. A clean version:

wordcount.py
text = """the cat sat on the mat the cat ran"""

counts = {}
for word in text.split():
    counts[word] = counts.get(word, 0) + 1

# sort items by count, descending, take top 3
top = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)[:3]
for word, n in top:
    print(f"{word}: {n}")

Don't worry if lambda is fuzzy — you'll meet it properly in Lesson 1.3. The point: a dict + a loop solves a real problem in six lines.
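Once the dict-and-loop version makes sense, it is worth knowing that the standard library ships a shortcut. This sketch swaps in collections.Counter, a different technique from the hand-rolled dict above, on the same sample text:

```python
from collections import Counter

text = """the cat sat on the mat the cat ran"""

counts = Counter(text.split())      # a dict subclass that does the counting
top = counts.most_common(3)         # list of (word, count), highest count first

for word, n in top:
    print(f"{word}: {n}")

missing = counts["dog"]             # a Counter returns 0 for unseen keys
```

Writing the loop by hand first is still the better exercise: Counter is a convenience on top of the dict behaviour you just practised.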

8 · Check yourself

Answer from memory — retrieval is what moves this from "I read it" to "I know it".

Recall quiz

Which value is truthy in Python?

Which collection forbids duplicate members?

How do you safely read a maybe-missing dict key?

What defines a block of code in Python?

Which builds a string with inlined variables?

Primary source ⭐ The Official Python Tutorial — §3 An Informal Introduction. The canonical, authoritative reference for everything above. For the PHP-to-Python angle, skim From PHP to Python.