Module 1 · Python · Deep Dive

Functions, OOP & Modules

From loose scripts to reusable, organised code — functions, classes, dataclasses, and the module system that every FastAPI app is built from.


Why this matters Last lesson you learned Python's raw materials. Now you make them reusable and organised — the difference between a throwaway script and a real backend. Every DocChat route is a function with type hints; every request body is a class; every file you import is a module. Interviewers probe two things here relentlessly: the mutable-default-argument trap, and whether you actually understand self. Nail both today.
In this lesson
  1. Functions & parameters
  2. *args, **kwargs & type hints
  3. Scope & the mutable-default trap
  4. First-class functions & lambda
  5. Classes & OOP
  6. Inheritance & @property
  7. @dataclass — the modern data holder
  8. Modules, packages & venv
  9. @staticmethod & @classmethod
  10. Comparison & identity dunders
  11. Collection dunders & duck typing
  12. Build: a Document class
  13. Check yourself

1 · Functions & parameters

You define a function with def, a colon, and an indented body. return hands a value back; a function with no return gives back None.

def greet(name):
    return f"Hello, {name}"

print(greet("Sam"))   # Hello, Sam
PHP bridge: function greet($name) {} becomes def greet(name): in Python. Same idea, but with no braces, no $, and def in place of the function keyword.

defaults default arguments

Give a parameter a fallback value and callers can omit it:

def greet(name, greeting="Hello"):
    return f"{greeting}, {name}"

greet("Sam")                # "Hello, Sam"
greet("Sam", "Marhaba")     # "Marhaba, Sam"

keywords keyword arguments

You can pass arguments by name, in any order — this makes calls self-documenting:

greet(name="Sam", greeting="Hi")
greet(greeting="Hi", name="Sam")   # same result — order doesn't matter
Positional then keyword Positional arguments must come before keyword arguments. greet(name="Sam", "Hi") is a syntax error; greet("Sam", greeting="Hi") is fine.

docstrings documenting a function

A string as the first line of a function body is its docstring — it shows up in help() and in your editor's tooltips. Write them for anything non-obvious.

def word_count(text):
    """Return the number of whitespace-separated words in text."""
    return len(text.split())

2 · *args, **kwargs & type hints

Sometimes you don't know how many arguments you'll get. *args collects extra positional arguments into a tuple; **kwargs collects extra keyword arguments into a dict.

def log(*args, **kwargs):
    print("positional:", args)     # a tuple
    print("keyword:", kwargs)      # a dict

log(1, 2, user="sam", level="info")
# positional: (1, 2)
# keyword: {'user': 'sam', 'level': 'info'}

The names args and kwargs are convention, not magic — the * and ** do the work. You'll see **kwargs constantly when wrapping or forwarding to other functions.
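
As a rough sketch of the forwarding pattern mentioned above, a wrapper can accept any arguments and pass them straight through. The names timed_call and add are hypothetical, invented here for illustration.

```python
import time

def timed_call(fn, *args, **kwargs):
    """Call fn with whatever arguments were given and report how long it took."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)      # unpack the tuple and dict back into a call
    elapsed = time.perf_counter() - start
    return result, elapsed

def add(a, b=0):                      # hypothetical function to wrap
    return a + b

result, elapsed = timed_call(add, 2, b=3)
print(result)                         # 5
```

The wrapper never needs to know add's signature, which is why this shape shows up in decorators and logging helpers.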

type hints annotating signatures

Python is dynamically typed, but you can annotate what you expect. Hints don't enforce anything at runtime — they power your editor, tools like mypy, and (crucially) FastAPI and Pydantic, which read them to validate requests.

def repeat(text: str, times: int = 2) -> str:
    return text * times

repeat("ab", 3)   # "ababab"

Read text: str as "text, expected to be a str", and -> str as "returns a str". You'll write hints on every FastAPI route, so get comfortable now.

PHP bridge: like PHP 7+ type declarations (function repeat(string $text): string), except Python does not enforce them at runtime — they're documentation plus tooling fuel.
Interview hook — "Do Python type hints do anything at runtime?" Answer: no, not by themselves. The interpreter stores them on the function (in __annotations__) but never checks or enforces them. They're checked by external tools (mypy, your IDE) and used by libraries like Pydantic/FastAPI that explicitly inspect them. Saying this correctly signals real Python literacy.
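
To make the "no runtime enforcement" point concrete, here is a small sketch that reuses repeat from above. It only assumes standard Python behaviour: the hints are stored on the function, and a call with a different type still runs.

```python
def repeat(text: str, times: int = 2) -> str:
    return text * times

# The hints are stored on the function, where tools and libraries can read them...
print(repeat.__annotations__)
# {'text': <class 'str'>, 'times': <class 'int'>, 'return': <class 'str'>}

# ...but nothing stops a caller from passing a list where a str is "expected".
wrong = repeat([1, 2], 2)
print(wrong)                          # [1, 2, 1, 2]
```

A type checker such as mypy would flag the second call; the interpreter does not.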

3 · Scope & the mutable-default trap

Variables created inside a function are local — they don't leak out. To read a module-level (global) name you can just reference it; to reassign one from inside a function you'd need the global keyword (which you should almost never do).

counter = 0

def bump():
    global counter      # needed only to REASSIGN a global
    counter += 1

bump()
print(counter)        # 1

Prefer passing values in and returning them out over reaching for global — it keeps functions testable, which interviewers like to hear.
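
A minimal sketch of that advice, rewriting bump from above so it takes the value in and returns the new one. This is one possible shape, not the only one.

```python
def bump(counter: int) -> int:
    """Return the next counter value instead of mutating a global."""
    return counter + 1

counter = 0
counter = bump(counter)               # the caller decides where the result goes
print(counter)                        # 1
```

Because bump depends only on its argument, a test can call bump(41) and check for 42 without touching any module-level state.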

The mutable-default-argument trap A default argument is evaluated once, when the function is defined — not on each call. So a mutable default (a list or dict) is shared across every call. This is the single most famous Python gotcha.
# BUG: the list is created once and reused
def add_tag(tag, tags=[]):
    tags.append(tag)
    return tags

add_tag("ai")     # ['ai']
add_tag("uae")    # ['ai', 'uae']  ← leaked from the previous call!

The fix is the universal idiom: default to None, then create a fresh object inside.

# FIX: None sentinel, fresh list each call
def add_tag(tag, tags=None):
    if tags is None:
        tags = []
    tags.append(tag)
    return tags

add_tag("ai")     # ['ai']
add_tag("uae")    # ['uae']  ← correct, independent
Interview hook — "What's wrong with def f(x, items=[]):?" The mutable default is created once and persists between calls, so it accumulates state. The fix: default to None and assign a new list inside. This question appears in real UAE screening rounds — say "evaluated once at definition time" and you've passed it.
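
You can watch "evaluated once at definition time" happen by inspecting the function's __defaults__ attribute, where Python keeps the default values. The sketch below reuses the buggy add_tag from above.

```python
def add_tag(tag, tags=[]):            # the buggy version from above
    tags.append(tag)
    return tags

print(add_tag.__defaults__)           # ([],)
add_tag("ai")
add_tag("uae")
print(add_tag.__defaults__)           # (['ai', 'uae'],)  the default itself has grown
```

The single list stored on the function object is the one every call without a tags argument mutates.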

4 · First-class functions & lambda

In Python, functions are values. You can store one in a variable, pass it to another function, or return it. This is the foundation of decorators, callbacks, and FastAPI's route registration.

def shout(text):
    return text.upper()

f = shout            # no parentheses — store the function itself
print(f("hi"))       # "HI"

# pass a function as an argument
def apply(fn, value):
    return fn(value)

apply(shout, "hello")   # "HELLO"
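
The third case, returning a function, can be sketched like this. make_prefixer and warn are hypothetical names used only for illustration.

```python
def make_prefixer(prefix):
    """Return a new function that remembers prefix."""
    def add_prefix(text):
        return f"{prefix}{text}"
    return add_prefix                 # return the function itself, not a call

warn = make_prefixer("WARNING: ")     # hypothetical usage
print(warn("disk full"))              # WARNING: disk full
```

A function that builds and returns another function is the same mechanism decorators rely on.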

A lambda is a tiny anonymous function for one-off use, most often as the key= argument to sorted() or list.sort():

docs = [{"title": "B", "pages": 5}, {"title": "A", "pages": 9}]
docs.sort(key=lambda d: d["pages"], reverse=True)
# sorted by pages, biggest first
PHP bridge: a lambda ≈ a PHP fn($d) => $d['pages'] arrow function. key=lambda fills the role of PHP's usort comparator, but it returns a sort key for one item instead of comparing two items, which is usually cleaner.

5 · Classes & OOP

A class is a blueprint; an instance is one object built from it. __init__ is the constructor; self is the current instance, passed automatically as the first parameter of every instance method.

class Document:
    def __init__(self, title, body):
        self.title = title      # instance attribute
        self.body = body

    def word_count(self):
        return len(self.body.split())

doc = Document("Intro", "the cat sat")
print(doc.title)         # "Intro"
print(doc.word_count())  # 3
What is self, really? self is just the instance, handed to the method automatically. doc.word_count() is sugar for Document.word_count(doc). You must list self as the first parameter of every instance method, but you never pass it manually.
PHP bridge: self.title is Python's $this->title; __init__ is PHP's __construct. The big difference: Python makes self an explicit first parameter rather than an implicit $this.

class vs instance attributes

An attribute set on self is per-instance. An attribute set directly on the class is shared by all instances — useful for constants and defaults.

class Document:
    extension = ".txt"          # class attribute — shared

    def __init__(self, title):
        self.title = title         # instance attribute — per object

a = Document("A")
b = Document("B")
print(a.extension, b.extension)    # .txt .txt  (same shared value)
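
One consequence worth seeing once: assigning through an instance does not change the shared value, it creates a new instance attribute that shadows it. A small sketch using the same Document class:

```python
class Document:
    extension = ".txt"                # class attribute, shared

    def __init__(self, title):
        self.title = title

a = Document("A")
b = Document("B")

a.extension = ".md"                   # creates an instance attribute on a only
print(a.extension, b.extension)       # .md .txt

Document.extension = ".pdf"           # changes the shared class attribute
print(a.extension, b.extension)       # .md .pdf
```

Lookup checks the instance first and falls back to the class, which is why b follows the class while a keeps its own value.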

__repr__ a readable object

By default, printing an object shows an ugly <__main__.Document object at 0x...>. Define __repr__ to control how it appears — invaluable for debugging.

class Document:
    def __init__(self, title, body):
        self.title = title
        self.body = body

    def __repr__(self):
        return f"Document(title={self.title!r}, words={len(self.body.split())})"

print(Document("Intro", "a b c"))
# Document(title='Intro', words=3)

The !r in an f-string calls repr() on the value — that's why the title comes out quoted.

6 · Inheritance & @property

A class can inherit from another, reusing and extending its behaviour. Call the parent's constructor with super().__init__(...).

class Document:
    def __init__(self, title, body):
        self.title = title
        self.body = body

class PDFDocument(Document):
    def __init__(self, title, body, pages):
        super().__init__(title, body)   # run the parent's __init__
        self.pages = pages

pdf = PDFDocument("Report", "...text...", 12)
print(pdf.title, pdf.pages)             # Report 12
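
super() is not limited to __init__. As a sketch, a subclass can override any method and still reuse the parent's version. The describe method is hypothetical, added here only to show the pattern.

```python
class Document:
    def __init__(self, title, body):
        self.title = title
        self.body = body

    def describe(self):               # hypothetical helper method
        return f"{self.title} ({len(self.body.split())} words)"

class PDFDocument(Document):
    def __init__(self, title, body, pages):
        super().__init__(title, body)
        self.pages = pages

    def describe(self):               # override, then extend the parent's version
        return f"{super().describe()}, {self.pages} pages"

pdf = PDFDocument("Report", "one two three", 12)
print(pdf.describe())                 # Report (3 words), 12 pages
print(isinstance(pdf, Document))      # True
```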

@property computed attributes

A @property turns a method into something you access like an attribute — no parentheses. Use it for values derived from other data.

class Document:
    def __init__(self, body):
        self.body = body

    @property
    def word_count(self):
        return len(self.body.split())

doc = Document("the cat sat")
print(doc.word_count)   # 3  — no () because it's a property
PHP bridge: @property is like a PHP magic __get getter, but explicit and per-attribute — far cleaner and discoverable.
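
Two behaviours follow from the definition above and are easy to verify in a short sketch: the value is recomputed on every access, and a property with no setter is read-only.

```python
class Document:
    def __init__(self, body):
        self.body = body

    @property
    def word_count(self):
        return len(self.body.split())

doc = Document("the cat sat")
doc.body = "the cat sat down"         # change the underlying data
print(doc.word_count)                 # 4, recomputed on each access

try:
    doc.word_count = 10               # no setter was defined
except AttributeError as exc:
    print("read-only:", exc)
```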

7 · @dataclass — the modern data holder

Most classes just hold data. Writing __init__ and __repr__ by hand for those is tedious. The @dataclass decorator generates them for you from the annotated fields.

from dataclasses import dataclass

@dataclass
class Document:
    title: str
    body: str
    pages: int = 1           # field with a default

doc = Document("Intro", "the cat sat")
print(doc)                       # Document(title='Intro', body='the cat sat', pages=1)
print(doc.title)                 # "Intro"

You still add methods normally — the dataclass only generates the boilerplate (constructor, __repr__, equality). It's the idiomatic 2026 way to model a record.

Bridge to next module — read this twice A @dataclass is the exact mental model for Pydantic's BaseModel, which you meet in Module 2 (FastAPI). Same shape: a class, typed fields, defaults. The difference: Pydantic also validates and parses incoming data against those types at runtime. When you understand dataclasses, Pydantic is a two-minute leap — and every FastAPI request body is a Pydantic model.
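
A short sketch of the generated equality, plus how dataclasses handle the mutable-default trap from section 3. The tags field is a hypothetical addition; field(default_factory=list) is the standard dataclasses way to get a fresh list per instance.

```python
from dataclasses import dataclass, field

@dataclass
class Document:
    title: str
    body: str
    tags: list[str] = field(default_factory=list)   # hypothetical field, fresh list each time

    def word_count(self) -> int:      # ordinary methods still work
        return len(self.body.split())

a = Document("Intro", "the cat sat")
b = Document("Intro", "the cat sat")
print(a == b)                         # True, the generated __eq__ compares fields
a.tags.append("ai")
print(b.tags)                         # [], not shared with a
print(a.word_count())                 # 3
```

Writing tags: list[str] = [] instead would raise a ValueError when the class is defined, so the dataclass refuses the trap outright.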

8 · Modules, packages & venv

Every .py file is a module. You pull names from other modules with import.

# two import styles
import math
math.sqrt(9)              # 3.0

from math import sqrt, pi
sqrt(9)                   # 3.0  — name imported directly

Your own files work the same way. Say you have textutils.py next to your script:

textutils.py
def word_count(text: str) -> int:
    return len(text.split())
main.py
from textutils import word_count
print(word_count("the cat sat"))   # 3
PHP bridge: import ≈ require/use, but Python imports a whole module namespace at once: no manual file paths, just the module name. A folder with an __init__.py is a package (PHP's namespace-per-directory idea).

__main__ the import guard

When Python runs a file directly, it sets that file's __name__ to "__main__". When the file is imported, __name__ is the module's name instead. The guard below means "only run this when executed directly, not when imported":

def main():
    print("running directly")

if __name__ == "__main__":
    main()

Without the guard, your script's top-level code would fire every time another file imported it. This pattern is on nearly every Python entry point you'll ever see.

venv + pip isolated dependencies

A virtual environment is a per-project sandbox for packages, so Project A's FastAPI version can't collide with Project B's. Create, activate, install, then freeze your dependency list:

# create a venv in a .venv folder
python -m venv .venv

# activate it
source .venv/bin/activate     # macOS / Linux
.venv\Scripts\activate        # Windows

# install packages into THIS project only
pip install fastapi uvicorn

# record exact versions so others can reproduce it
pip freeze > requirements.txt

# later, on another machine:
pip install -r requirements.txt
Always work inside a venv Activating shows the env name in your prompt, e.g. (.venv). Commit requirements.txt to git, but never the .venv folder — add it to .gitignore. This is exactly how DocChat's dependencies will be managed.
PHP bridge: requirements.txt ≈ composer.json; pip install ≈ composer require; the .venv folder ≈ vendor/. Same dependency-management instincts, different commands.

9 · @staticmethod & @classmethod

Not every method needs an instance. Two decorators change what gets passed as the first argument. A normal method receives self (the instance); a @classmethod receives cls (the class itself); a @staticmethod receives nothing automatic — it's just a plain function that lives inside the class for namespacing.

class User:
    def __init__(self, name, email):
        self.name = name
        self.email = email

    @classmethod
    def from_row(cls, row):        # cls is User (or a subclass)
        return cls(row["name"], row["email"])

    @staticmethod
    def is_valid_email(email):    # no self, no cls — just a utility
        return "@" in email

row = {"name": "Sam", "email": "sam@x.ae"}
u = User.from_row(row)              # build a User straight from a DB row
User.is_valid_email("sam@x.ae")    # True — called on the class, no instance

The classic use of @classmethod is an alternative constructor: __init__ takes the "normal" arguments, while from_row, from_json, from_file etc. each build an instance from a different source and then call cls(...). Because it uses cls (not the hard-coded User), a subclass that calls SubUser.from_row(row) correctly gets a SubUser back.

A @staticmethod earns its place when a helper is logically part of the class but needs neither the instance nor the class — like is_valid_email above. It's a namespacing choice, nothing more.

PHP bridge: @staticmethod ≈ a PHP public static function called as User::isValidEmail(). @classmethod has no dedicated PHP construct; the closest match is a static method that relies on late static binding (new static(...)). It is a static-style method that still receives the class as cls, so it works polymorphically for subclasses.
Interview hook — "Difference between @classmethod and @staticmethod?" A classmethod receives the class as cls and is typically an alternative constructor (User.from_row(row)); a staticmethod receives nothing automatic and is just a namespaced utility. The tell of a strong answer is naming the alternative-constructor pattern — it's the single most common real use.
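
The subclass behaviour described above can be checked with a few lines. AdminUser is a hypothetical subclass added only for this sketch.

```python
class User:
    def __init__(self, name, email):
        self.name = name
        self.email = email

    @classmethod
    def from_row(cls, row):
        return cls(row["name"], row["email"])   # cls is whichever class was called

class AdminUser(User):                # hypothetical subclass
    pass

row = {"name": "Sam", "email": "sam@x.ae"}
admin = AdminUser.from_row(row)
print(type(admin).__name__)           # AdminUser
```

Had from_row returned User(...) directly, the same call would hand back a plain User.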

10 · Comparison & identity dunders

"Dunder" methods (double-underscore, like __init__) let your objects plug into Python's built-in operators. You already met __repr__; here are the ones interviewers actually ask about.

__eq__ + __hash__ value equality

By default a == b is identity: true only if they're the same object. Define __eq__ to compare by value instead. But there's a catch every interviewer loves: a class that defines __eq__ without also defining __hash__ gets __hash__ set to None, which makes its instances unhashable, so you can no longer put them in a set or use them as dict keys. If you want both, define __hash__ too.

class Tag:
    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        return isinstance(other, Tag) and self.name == other.name

    def __hash__(self):
        return hash(self.name)      # keep it hashable for sets / dict keys

Tag("ai") == Tag("ai")      # True — compared by value
{Tag("ai"), Tag("ai")}        # a set with one element, thanks to __hash__
A dataclass does this for you @dataclass generates __eq__ automatically. By default it stays unhashable (matching the rule above); add @dataclass(frozen=True) and you also get a __hash__, because a frozen instance is immutable and safe to hash.
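
A quick sketch of that difference, using two hypothetical dataclasses, Tag and FrozenTag:

```python
from dataclasses import dataclass

@dataclass
class Tag:                            # hypothetical: eq generated, not hashable
    name: str

@dataclass(frozen=True)
class FrozenTag:                      # hypothetical: eq and hash generated
    name: str

print(Tag("ai") == Tag("ai"))         # True
print(Tag.__hash__)                   # None, so instances are unhashable
print(len({FrozenTag("ai"), FrozenTag("ai")}))   # 1
```

Trying {Tag("ai")} would raise TypeError: unhashable type, which is the same rule as a hand-written __eq__ without __hash__.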

__lt__ / __gt__ ordering

Define __lt__ (less-than) and Python can sort() your objects and answer <. Rather than writing all four of __lt__, __le__, __gt__, __ge__ by hand, define __eq__ plus one ordering method and let functools.total_ordering fill in the rest.

from functools import total_ordering

@total_ordering
class Version:
    def __init__(self, n):
        self.n = n

    def __eq__(self, other):
        return self.n == other.n

    def __lt__(self, other):
        return self.n < other.n

sorted([Version(3), Version(1), Version(2)])   # sorted ascending
Version(1) >= Version(1)                       # True — synthesised from __eq__ + __lt__

__str__ vs __repr__ two kinds of string

These look similar but serve different readers. __repr__ is for developers: unambiguous, debug-friendly, ideally something you could paste back into code. __str__ is for end users: friendly and readable. print(obj) and str(obj) use __str__; the REPL, containers, and repr(obj) use __repr__. If you define only one, define __repr__, because str() falls back to it.

class Money:
    def __init__(self, dirhams):
        self.dirhams = dirhams

    def __repr__(self):
        return f"Money(dirhams={self.dirhams!r})"   # for devs / debugging

    def __str__(self):
        return f"AED {self.dirhams:.2f}"            # for users

m = Money(42)
print(m)        # AED 42.00          (uses __str__)
m               # Money(dirhams=42)  (REPL uses __repr__)
repr(m)         # "Money(dirhams=42)"
Interview answer — __str__ vs __repr__ Say this almost verbatim: "__repr__ is for developers — unambiguous and ideally eval-able, shown in the REPL, debugger, and inside containers. __str__ is the friendly, user-facing form used by print() and str(). If I write only one I write __repr__, because str() falls back to it but never the other way around." That last sentence is what separates a memorised answer from an understood one.
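
Both halves of that answer can be demonstrated in a few lines. Money is the class from above; OnlyRepr is a hypothetical class that defines __repr__ alone.

```python
class Money:
    def __init__(self, dirhams):
        self.dirhams = dirhams

    def __repr__(self):
        return f"Money(dirhams={self.dirhams!r})"

    def __str__(self):
        return f"AED {self.dirhams:.2f}"

class OnlyRepr:                       # hypothetical: no __str__ defined
    def __repr__(self):
        return "OnlyRepr()"

m = Money(42)
print(f"{m}")                         # AED 42.00, f-strings use __str__ by default
print([m])                            # [Money(dirhams=42)], containers use __repr__
print(str(OnlyRepr()))                # OnlyRepr(), str() fell back to __repr__
```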

11 · Collection dunders & duck typing

A few more dunders make your object behave like a built-in collection — Python never checks its type, only whether it supports the right methods. That's duck typing: "if it has __len__ and __getitem__, it walks like a list." Define them and len(), indexing, iteration, and in all start working on your object.

class Library:
    def __init__(self, docs):
        self.docs = docs

    def __len__(self):              # powers len(lib)
        return len(self.docs)

    def __getitem__(self, i):         # powers lib[0] AND iteration
        return self.docs[i]

    def __contains__(self, title):    # powers the `in` operator
        return any(d == title for d in self.docs)

lib = Library(["intro", "setup", "deploy"])
len(lib)                 # 3            — __len__
lib[0]                  # "intro"      — __getitem__
"setup" in lib          # True         — __contains__
for d in lib: ...        # works! — Python falls back to __getitem__

Note the bonus: with __getitem__ alone Python can iterate by calling it with 0, 1, 2… until IndexError. Implement __contains__ only if you want in to be smarter or faster than that default scan.
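
To see that fallback, the sketch below strips Library down to __getitem__ only and spells out roughly what a for loop does when there is no __iter__. The explicit while loop is an approximation for illustration, not the interpreter's actual code.

```python
class Library:
    def __init__(self, docs):
        self.docs = docs

    def __getitem__(self, i):
        return self.docs[i]           # raises IndexError past the end

lib = Library(["intro", "setup", "deploy"])

# Roughly what iteration does without __iter__: index from 0 until IndexError.
seen = []
i = 0
while True:
    try:
        seen.append(lib[i])
    except IndexError:
        break
    i += 1

print(seen)                           # ['intro', 'setup', 'deploy']
print(list(lib) == seen)              # True
print("setup" in lib)                 # True, via the same fallback scan
```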

PHP bridge: these are Python's version of PHP's SPL interfaces: __len__ ≈ Countable::count, __getitem__ ≈ ArrayAccess::offsetGet, iteration ≈ IteratorAggregate. Python just uses dunder methods instead of explicit implements declarations, which is the duck-typing philosophy.

12 · Build: a Document class

Your tangible win Build a real Document class with a constructor, a word_count() method, a summary() method, and a clean __repr__ — then split it into its own module with a __main__ guard. This is the literal seed of DocChat's document model.

Try it yourself first, then compare. A clean version:

document.py
class Document:
    """A single uploaded document in DocChat."""

    def __init__(self, title: str, body: str):
        self.title = title
        self.body = body

    def word_count(self) -> int:
        return len(self.body.split())

    def summary(self, limit: int = 8) -> str:
        words = self.body.split()
        return " ".join(words[:limit]) + ("…" if len(words) > limit else "")

    def __repr__(self) -> str:
        return f"Document(title={self.title!r}, words={self.word_count()})"


def main():
    doc = Document("Welcome", "the cat sat on the mat and then the cat ran away")
    print(doc)                  # Document(title='Welcome', words=12)
    print(doc.word_count())     # 12
    print(doc.summary())        # the cat sat on the mat and then…


if __name__ == "__main__":
    main()

Run it with python document.py. Then in another file, from document import Document — note the main() demo does not fire on import, thanks to the guard. That separation is what makes code reusable.

13 · Check yourself

Answer from memory — retrieval is what moves this from "I read it" to "I know it".

Recall quiz

What does self refer to inside a method?

Why is def f(x, items=[]): a known trap?

What collects extra keyword arguments into a dict?

What does @dataclass save you from writing?

When is a file's __name__ set to "__main__"?

Which kind of method (marked by which decorator) is the typical alternative constructor?

How should you describe __repr__ against __str__?

Primary source ⭐ The Official Python Tutorial — §9 Classes. The canonical, authoritative reference for everything in this lesson. For the dataclass details, see the dataclasses module docs.