Module 1 · Python · Deep Dive
From PHP habits to fluent Python — variables, types, strings, and the collections you'll use in every backend you ever write.
Python runs your file top to bottom, like a PHP script without the HTML. Two ways to run it:
# 1. Run a file
python hello.py

# 2. The REPL: an interactive shell. Type code, see results instantly.
python
>>> 2 + 2
4
The REPL (Read-Eval-Print Loop) is your laboratory — use it constantly to test a line before you commit it to a file. There's no PHP equivalent you used day-to-day; lean on it.
PHP bridge: a .py file ≈ a .php script, but there are no <?php ?> tags; the whole file is code.
No $, no ;, no var/let. You assign with = and the variable simply exists:
name = "Sam"
age = 29
is_active = True
Like PHP, Python is dynamically typed: a variable's type is whatever you last assigned. Unlike PHP, where variable names are case-sensitive but function and class names are not, every name in Python is case-sensitive. Names are conventionally written snake_case (not $camelCase).
snake_case for variables and functions · PascalCase for classes · UPPER_CASE for constants. Writing getUserName in Python signals "I just arrived from another language."
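The three conventions side by side, as a minimal sketch. Every name here (MAX_UPLOAD_MB, DocumentStore, add_doc, get_user_name) is hypothetical and chosen only to show the casing; classes and functions are covered properly later.

```python
MAX_UPLOAD_MB = 25                  # UPPER_CASE: a constant


class DocumentStore:                # PascalCase: a class
    def __init__(self):
        self.docs = []

    def add_doc(self, file_name):   # snake_case: a method and its parameter
        self.docs.append(file_name)


def get_user_name(user):            # snake_case: a function (not getUserName)
    return user["name"]


store = DocumentStore()
store.add_doc("intro.pdf")
current_user = {"name": "Sam"}      # snake_case: a variable
print(get_user_name(current_user), store.docs)
```

Python does not enforce any of this. The casing is a signal to other readers, which is why breaking it stands out.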
| Type | Example | Notes |
|---|---|---|
| str | "hello" | Text. Single or double quotes, same thing. |
| int | 42 | Whole numbers, unlimited size. |
| float | 3.14 | Decimals. |
| bool | True / False | Capitalised. bool is a subclass of int, so True == 1 and False == 0. |
| NoneType | None | Python's null: "no value". |
Check a type with type(x). Convert with int("5"), str(5), float("1.5").
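A short sketch of checking and converting types, assuming a value that arrives as text (the name raw_age is made up for the example). The try/except blocks are there only so the script keeps running after each error; the point is that Python raises instead of guessing.

```python
raw_age = "29"                      # e.g. a form field: always text
print(type(raw_age))                # <class 'str'>

age = int(raw_age)                  # explicit conversion
print(type(age), age + 1)           # <class 'int'> 30

print(str(5) + " docs")             # 5 docs
print(float("1.5") * 2)             # 3.0
print(isinstance(True, int))        # True: bool is a subclass of int

# Unlike PHP, Python never converts silently: str + int raises TypeError.
try:
    "5" + 5
except TypeError as err:
    print("TypeError:", err)

# Text that is not a number raises ValueError rather than becoming 0.
try:
    int("abc")
except ValueError as err:
    print("ValueError:", err)
```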
Truthiness matters because you'll lean on it constantly. These are all "falsy": False, None, 0, 0.0, "", [], {}, () and set(). In short: None, False, zero, and anything empty. Nearly everything else is "truthy".
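To check the rule rather than memorise it, bool() reports how any value behaves in an if. The list of sample values below is an arbitrary selection for illustration.

```python
values = [False, None, 0, 0.0, "", [], {}, (), set(), "0", " ", [0], -1]

truthy_values = []
for value in values:
    print(repr(value), "->", bool(value))
    if value:                       # same test an if statement performs
        truthy_values.append(value)

print(truthy_values)                # ['0', ' ', [0], -1]
```

Note the last four: a non-empty string is truthy whatever it contains, a list holding a zero is still a non-empty list, and any non-zero number counts, including negatives.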
items = []
if not items:
    print("the list is empty")  # runs: [] is falsy
PHP bridge: very close to PHP's truthiness, but note that "0" (the string zero) is truthy in Python, unlike PHP where it's falsy.
Forget the . concatenation operator. The modern way to build strings is the f-string — a string prefixed with f, with {expressions} inlined:
name = "Sam"
count = 3
print(f"Hi {name}, you have {count} docs")
print(f"Next year: {count + 1}")  # expressions work too
Strings are objects with useful methods — these come up in interviews and in real ingestion code (you'll clean document text in the RAG module):
s = " Hello World "
s.strip()                # "Hello World" (trim whitespace)
s.lower()                # " hello world "
s.replace("World", "UAE")
s.split()                # ['Hello', 'World'] (split on whitespace)
",".join(["a", "b"])     # "a,b" (join is called on the separator!)
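Two things the method list does not show: strings are immutable, so every method returns a new string, and the methods chain well. The helper clean_text below is a hypothetical example of the kind of cleanup ingestion code does, not a function from any library.

```python
s = "  Hello World  "
s.strip()                        # result thrown away: s is unchanged
print(repr(s))                   # '  Hello World  '
s = s.strip()                    # reassign to keep the result
print(repr(s))                   # 'Hello World'


def clean_text(raw):
    """Collapse runs of whitespace and lowercase the result."""
    words = raw.split()          # splits on any whitespace, drops empty pieces
    return " ".join(words).lower()


raw = "  Hello   World \n from  the UAE  "
cleaned = clean_text(raw)
print(cleaned)                   # hello world from the uae
```

The split-then-join pair is a common way to normalise whitespace without reaching for regular expressions.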
This is the heart of the lesson. Master these four and most Python clicks into place.
docs = ["intro.pdf", "report.pdf"]
docs.append("notes.pdf")   # add to end
docs[0]                    # "intro.pdf" (zero-indexed)
docs[-1]                   # "notes.pdf" (negative = from the end)
docs[0:2]                  # ['intro.pdf', 'report.pdf'] (a "slice")
len(docs)                  # 3
PHP bridge: a Python list ≈ a PHP numerically-indexed array. append ≈ $a[] = x.
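Slices have a few more forms worth trying in the REPL. A small sketch, using a made-up list of file names: either bound can be omitted, and a slice is always a new list.

```python
docs = ["intro.pdf", "report.pdf", "notes.pdf", "draft.pdf"]

first_two = docs[:2]     # start omitted: from the beginning
rest = docs[1:]          # end omitted: through the last item
last_two = docs[-2:]     # negative start: counted from the end

# A slice is a copy, so changing it leaves the original alone.
first_two.append("extra.pdf")
print(first_two)         # ['intro.pdf', 'report.pdf', 'extra.pdf']
print(docs)              # still the original four items
print(last_two)          # ['notes.pdf', 'draft.pdf']
```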
doc = {"title": "Intro", "pages": 12}
doc["title"] # "Intro"
doc["author"] = "Sam" # add/update a key
doc.get("missing") # None (safe — no error)
"title" in doc # True
doc.keys(), doc.values(), doc.items()
PHP bridge: a dict ≈ a PHP associative array (["k" => "v"]). It's the single most-used structure in web code — JSON becomes a dict.
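To make "JSON becomes a dict" concrete, here is a minimal sketch with the standard library's json module. The payload is invented for the example; a real one would come from a request body or a file.

```python
import json

payload = '{"title": "Intro", "pages": 12, "tags": ["ai", "uae"], "author": null}'
doc = json.loads(payload)            # JSON object -> dict

print(type(doc))                     # <class 'dict'>
print(doc["title"])                  # Intro
print(doc["tags"][0])                # ai   (JSON arrays become lists)
print(doc["author"])                 # None (JSON null becomes None)
print(doc.get("language", "en"))     # en   (default for a missing key)

# items() yields (key, value) pairs, unpacked in the loop header
for key, value in doc.items():
    print(f"{key}: {value}")

# And back again: dict -> JSON text
print(json.dumps({"ok": True, "count": len(doc)}))   # {"ok": true, "count": 4}
```

Reading doc["language"] directly would raise KeyError, which is why .get() with a default is the usual choice for optional fields.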
point = (10, 20)   # round brackets, fixed once created
x, y = point       # "unpacking": x=10, y=20
Use a tuple when a group of values shouldn't change (coordinates, a row from a database). Unpacking is everywhere in Python.
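A sketch of where unpacking shows up in practice. The rows list stands in for what a database query might return; its columns (id, file name, page count) are assumptions made for the example.

```python
rows = [
    (1, "intro.pdf", 12),
    (2, "report.pdf", 48),
]

# Unpack each row straight into named variables in the loop header
for doc_id, file_name, pages in rows:
    print(f"#{doc_id} {file_name}: {pages} pages")

# Swapping two values needs no temporary variable
a, b = 1, 2
a, b = b, a
print(a, b)                      # 2 1

# "Fixed once created" is enforced: assignment raises TypeError
point = (10, 20)
try:
    point[0] = 99
except TypeError as err:
    print("TypeError:", err)
```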
tags = {"ai", "uae", "ai"} # {"ai", "uae"} — duplicates dropped
"ai" in tags # True (very fast membership test)
Reach for a set when you need uniqueness or fast "is this in here?" checks.
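Both uses in one sketch, with an invented list of tags. Sets are unordered, so the example sorts before printing and shows a common pattern for de-duplicating while keeping first-seen order.

```python
raw_tags = ["ai", "uae", "ai", "python", "uae"]

unique_tags = set(raw_tags)              # duplicates dropped
print(len(raw_tags), len(unique_tags))   # 5 3
print(sorted(unique_tags))               # ['ai', 'python', 'uae']

# Keep first-seen order: the set answers "seen it?", the list keeps order
seen = set()
ordered_unique = []
for tag in raw_tags:
    if tag not in seen:
        seen.add(tag)
        ordered_unique.append(tag)
print(ordered_unique)                    # ['ai', 'uae', 'python']

# Gotcha: {} is an empty dict. An empty set is written set().
print(type({}), type(set()))
```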
Blocks are defined by a colon + indentation, not braces. Four spaces per level is the convention; what Python enforces is that a block is indented, and indented consistently. That part is syntax, not style: get it wrong and Python raises IndentationError.
if count > 10:
    tier = "heavy"
elif count > 0:
    tier = "light"
else:
    tier = "empty"

# Loop over ITEMS directly, not an index counter
for doc in docs:
    print(doc)

# Need the index too? enumerate gives both
for i, doc in enumerate(docs):
    print(i, doc)

while count > 0:
    count -= 1  # no count-- in Python
for i in range(len(docs)): docs[i] is "C/PHP brain". The Pythonic way is for doc in docs:. Only reach for range() when you genuinely need numbers.
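A short sketch of the cases where numbers genuinely are the point, using a made-up docs list. enumerate accepts a start value for human-friendly numbering, and range covers "do this N times" and "count from A up to B".

```python
docs = ["intro.pdf", "report.pdf", "notes.pdf"]

# Numbering for display: start at 1 instead of 0
for position, doc in enumerate(docs, start=1):
    print(f"{position}. {doc}")

# Repeat a fixed number of times: range(3) yields 0, 1, 2
for attempt in range(3):
    print("attempt", attempt)

# Count from 1 up to, but not including, 4
for page in range(1, 4):
    print("page", page)
```

As with slices, the end of a range is exclusive.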
Try it yourself first, then compare. A clean version:
wordcount.py
text = """the cat sat on the mat the cat ran"""
counts = {}
for word in text.split():
    counts[word] = counts.get(word, 0) + 1

# sort items by count, descending, take top 3
top = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)[:3]
for word, n in top:
    print(f"{word}: {n}")
Don't worry if lambda is fuzzy — you'll meet it properly in Lesson 1.3. The point: a dict + a loop solves a real problem in six lines.
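For comparison, once the dict-and-loop version makes sense, the standard library offers collections.Counter, which packages the same counting pattern. This is an alternative sketch on the same sample text, not a replacement for doing the exercise by hand.

```python
from collections import Counter

text = "the cat sat on the mat the cat ran"

counts = Counter(text.split())   # a dict subclass: word -> count
top = counts.most_common(3)      # list of (word, count), highest first

for word, n in top:
    print(f"{word}: {n}")
```

Because a Counter is still a dict, everything from the dict section (counts["the"], .get(), .items()) works on it unchanged.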
Answer from memory — retrieval is what moves this from "I read it" to "I know it".
Which value is truthy in Python?
Which collection forbids duplicate members?
How do you safely read a maybe-missing dict key?
What defines a block of code in Python?
Which builds a string with inlined variables?