Module 4 Assignment: CNN architecture memo

Theme

Convolutional neural networks for vision

Exercises

  1. Design a CNN for a small grayscale image classification problem.

  2. Specify convolution, activation, pooling, flattening, and classification-head choices.

  3. Use the starter model to inspect output shapes after each stage.

  4. Explain why local receptive fields and weight sharing fit image data.
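For exercise 4, a parameter count is one way to make weight sharing concrete. The sketch below compares the starter model's first convolution with a hypothetical fully connected layer that produces the same number of outputs from a 16x16 input. The dense baseline and the `count_params` helper are assumptions introduced for illustration; they are not part of the starter cell.

```python
from torch import nn

def count_params(module: nn.Module) -> int:
    """Total number of learnable scalars in a module."""
    return sum(p.numel() for p in module.parameters())

# Convolution: 4 filters of size 1x3x3 plus 4 biases, reused at every pixel.
conv = nn.Conv2d(1, 4, kernel_size=3, padding=1)

# Hypothetical dense baseline (comparison only): each of the 16*16 input
# pixels connects to each of the 4*16*16 output units, with no sharing.
dense = nn.Linear(16 * 16, 4 * 16 * 16)

conv_params = count_params(conv)    # 4 * (1 * 3 * 3) + 4
dense_params = count_params(dense)  # 256 * 1024 + 1024

print(f"conv:  {conv_params}")
print(f"dense: {dense_params}")
```

The convolution's parameter count does not depend on the image size, because the same 3x3 filters are applied at every location. The dense layer's count grows with both the input and output resolution.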

Submission

Submit a 600 to 900 word technical memo, plus any code, plots, or shape traces needed to support your claims. Use the starter cell as a minimal reproducible experiment, then make at least one meaningful modification.
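One possible way to keep a modification comparable with the starter cell is to collect the shape trace in a small helper and rerun it on the changed model. The sketch below is illustrative only: `trace_shapes` is a hypothetical helper, and the variant (a 5x5 first convolution with no padding) is an arbitrary example, not a recommended design.

```python
import torch
from torch import nn

def trace_shapes(model: nn.Sequential, x: torch.Tensor):
    """Return (layer name, output shape) pairs for each stage."""
    trace = []
    with torch.no_grad():  # only shapes are needed, not gradients
        for layer in model:
            x = layer(x)
            trace.append((layer.__class__.__name__, tuple(x.shape)))
    return trace

# Illustrative variant (an assumption): the first convolution uses a 5x5
# kernel and no padding, so the feature map shrinks before pooling.
variant = nn.Sequential(
    nn.Conv2d(1, 4, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(4, 8, kernel_size=3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d((1, 1)),
    nn.Flatten(), nn.Linear(8, 2),
)

trace = trace_shapes(variant, torch.randn(5, 1, 16, 16))
for name, shape in trace:
    print(f"{name:>18}: {shape}")
```

Because the helper returns the trace instead of only printing it, the starter and modified traces can be placed side by side in the memo.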

Rubric

  • Correct use of module vocabulary and notation

  • Clear connection between design choices and data/problem structure

  • Evidence from the starter experiment or your own extension

  • Concise reflection on limitations, failure modes, or next steps

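import torch
from torch import nn

# Starter model: two convolution stages, global average pooling, and a linear head.
model = nn.Sequential(
    # Stage 1: 1 -> 4 channels; padding=1 keeps 16x16, then max pooling halves it to 8x8.
    nn.Conv2d(1, 4, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    # Stage 2: 4 -> 8 channels; adaptive average pooling reduces each feature map to 1x1.
    nn.Conv2d(4, 8, kernel_size=3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d((1, 1)),
    # Head: flatten (N, 8, 1, 1) to (N, 8), then map to 2 class logits.
    nn.Flatten(), nn.Linear(8, 2)
)

# Dummy batch: 5 grayscale images, each 16x16.
x = torch.randn(5, 1, 16, 16)
for layer in model:
    x = layer(x)
    print(f"{layer.__class__.__name__:>18}: {tuple(x.shape)}")

Output:

            Conv2d: (5, 4, 16, 16)
              ReLU: (5, 4, 16, 16)
         MaxPool2d: (5, 4, 8, 8)
            Conv2d: (5, 8, 8, 8)
              ReLU: (5, 8, 8, 8)
 AdaptiveAvgPool2d: (5, 8, 1, 1)
           Flatten: (5, 8)
            Linear: (5, 2)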
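The spatial sizes in the trace above can be checked by hand with the usual output-size relation for convolution and pooling windows, floor((n + 2p - k) / s) + 1, where n is the input size, k the kernel size, p the padding, and s the stride. The sketch below assumes dilation 1 and PyTorch's default floor rounding; `conv_out_size` is a hypothetical helper name.

```python
def conv_out_size(n: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Spatial output size of a conv or pooling window (dilation 1, floor rounding)."""
    return (n + 2 * padding - kernel) // stride + 1

size = 16
after_conv1 = conv_out_size(size, kernel=3, padding=1)        # first Conv2d
after_pool = conv_out_size(after_conv1, kernel=2, stride=2)   # MaxPool2d(2)
after_conv2 = conv_out_size(after_pool, kernel=3, padding=1)  # second Conv2d

print(after_conv1, after_pool, after_conv2)
```

The three values match the height and width reported for the first `Conv2d`, the `MaxPool2d`, and the second `Conv2d` in the trace. `AdaptiveAvgPool2d((1, 1))` is not covered by this formula, since it fixes the output size directly.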

Reflection prompts

  • What changed when you modified the starter experiment?

  • Which result surprised you, and what diagnostic would you run next?

  • What assumption would you document before handing this model to another practitioner?