Module 1 Book Prose

Perceptrons, MLPs, and Representation

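Deep learning begins with simple units composed into layers. A perceptron computes a weighted sum of its inputs plus a bias and passes the result through a nonlinearity (a hard threshold in the classic formulation). Multilayer perceptrons (MLPs) stack layers of such units, so that each layer transforms the output of the one before it and the network can learn hierarchical features.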
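To make the unit-versus-stack distinction concrete, here is a minimal sketch in plain Python. The function names (`perceptron`, `dense`, `xor_mlp`) and all weights are hand-picked for illustration and are not taken from the course materials. A single threshold unit suffices for AND, while XOR, which no single perceptron can represent, needs a hidden layer.

```python
def step(z):
    """Hard threshold used by the classic perceptron."""
    return 1 if z >= 0 else 0


def relu(z):
    """Rectified linear unit: max(0, z)."""
    return max(0.0, z)


def perceptron(x, w, b):
    """Single unit: weighted sum plus bias, then a hard threshold."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return step(z)


def dense(x, W, b, activation):
    """One fully connected layer: each row of W holds one unit's weights."""
    return [
        activation(sum(wi * xi for wi, xi in zip(row, x)) + bi)
        for row, bi in zip(W, b)
    ]


def xor_mlp(x):
    """Two-layer MLP with hand-set (not learned) weights that computes XOR."""
    # Hidden layer: h1 = relu(x1 + x2), h2 = relu(x1 + x2 - 1)
    h = dense(x, W=[[1.0, 1.0], [1.0, 1.0]], b=[0.0, -1.0], activation=relu)
    # Linear output: y = h1 - 2 * h2
    y = dense(h, W=[[1.0, -2.0]], b=[0.0], activation=lambda z: z)
    return y[0]


inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
and_outputs = [perceptron(x, w=[1.0, 1.0], b=-1.5) for x in inputs]
xor_outputs = [xor_mlp(x) for x in inputs]
print(and_outputs)  # [0, 0, 0, 1]
print(xor_outputs)  # [0.0, 1.0, 1.0, 0.0]
```

The hidden layer re-represents the four input points so that a linear readout can separate them, which is the sense in which stacking layers changes what the model can represent.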

Depth and Width

Adding hidden layers increases expressive power but also makes optimization harder, and widening a layer adds capacity at the cost of more parameters. Students should treat depth and width as modeling choices tied to the complexity of the data, not as defaults.
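One way to see depth and width as explicit choices is to count the parameters each configuration implies. The sketch below assumes fully connected layers with biases. The helper name `count_parameters` and the layer sizes (a 784-dimensional input and 10 outputs) are hypothetical values chosen only for illustration.

```python
def count_parameters(layer_sizes):
    """Weights plus biases for a fully connected network.

    layer_sizes lists the input size, each hidden size, and the output size.
    """
    return sum(
        n_in * n_out + n_out  # weight matrix plus one bias per output unit
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


wide_shallow = [784, 128, 10]    # one hidden layer of 128 units
narrow_deep = [784, 64, 64, 10]  # two hidden layers of 64 units

print(count_parameters(wide_shallow))  # 101770
print(count_parameters(narrow_deep))   # 55050
```

Parameter count is only a rough proxy for capacity and says nothing about how easy a network is to train, so it informs the choice without settling it.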

Activations

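ReLU and its variants dominate modern feedforward networks because they mitigate vanishing gradients in many settings: the derivative of ReLU is exactly 1 for positive inputs, whereas the derivatives of sigmoid and tanh shrink toward 0 as inputs grow in magnitude. Sigmoid and tanh remain relevant for gating and for certain output heads, such as a sigmoid output for a binary probability.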
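The vanishing-gradient argument can be illustrated numerically. The sketch below is a simplified model, not a full backpropagation: it multiplies the same activation derivative across `depth` layers and ignores the weight matrices, which also scale gradients in a real network. The function names are illustrative.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_grad(z):
    """Derivative of sigmoid; its maximum is 0.25, reached at z = 0."""
    s = sigmoid(z)
    return s * (1.0 - s)


def tanh_grad(z):
    """Derivative of tanh; its maximum is 1.0, reached at z = 0."""
    return 1.0 - math.tanh(z) ** 2


def relu_grad(z):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if z > 0 else 0.0


def chained_gradient(grad_fn, z, depth):
    """Product of one activation derivative repeated across `depth` layers."""
    return grad_fn(z) ** depth


# Even at sigmoid's best case (z = 0), ten layers shrink the factor sharply.
print(chained_gradient(sigmoid_grad, 0.0, 10))  # about 9.5e-07
# Away from zero, tanh saturates as well.
print(chained_gradient(tanh_grad, 2.0, 10))
# ReLU passes the gradient through unchanged on its active side.
print(chained_gradient(relu_grad, 1.0, 10))     # 1.0
```

The same sketch shows the cost of ReLU: `relu_grad` is 0 for non-positive inputs, which is one motivation for variants that keep a small slope on the negative side.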

Connection to the Rest of the Course

Module 1 establishes the feedforward picture that backpropagation (Module 2), optimization (Module 3), and specialized architectures (Modules 4–7) extend.