Module 4 Assignment: CNN architecture memo

Theme

Convolutional neural networks for vision

Scenario

A computer-vision team is deciding whether a small CNN is adequate for grayscale inspection images before moving to larger pretrained models.

Exercises

Design a CNN for a small grayscale image classification problem.
Specify convolution, activation, pooling, flattening, and output-head choices.
Run the starter model and inspect output shapes after each stage.
Explain how local structure and weight sharing support the design.
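
For the weight-sharing exercise, a parameter count can make the argument concrete. The sketch below compares the first convolution from the starter model with a dense layer that produces an output volume of the same size. The dense layer and the `count_params` helper are illustrative assumptions, not part of the starter code.

```python
import torch
from torch import nn

# Same layer as the first convolution in the starter model.
conv = nn.Conv2d(1, 4, kernel_size=3, padding=1)

# Hypothetical dense alternative (for comparison only): maps a flattened
# 16x16 image to a flattened 4x16x16 output volume.
dense = nn.Linear(16 * 16, 4 * 16 * 16)


def count_params(module: nn.Module) -> int:
    """Total number of learnable scalars in a module."""
    return sum(p.numel() for p in module.parameters())


conv_params = count_params(conv)    # 4 filters * (1 * 3 * 3) weights + 4 biases
dense_params = count_params(dense)  # 256 * 1024 weights + 1024 biases

print(f"conv:  {conv_params}")   # conv:  40
print(f"dense: {dense_params}")  # dense: 263168
```

The convolution reuses the same 3x3 filters at every spatial position, so its parameter count does not depend on image size, while the dense layer learns a separate weight for every input-output pixel pair.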

Evidence Requirements

A short explanation of the data, tensors, objective, and evaluation signal used in the starter experiment.
At least one meaningful modification to the starter code, with the changed variable named explicitly.
A comparison against the unmodified starter result or another defensible baseline.
A limitation statement that separates what the toy experiment demonstrates from what a production model would require.
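
One possible way to produce a modification and a baseline comparison is sketched below. Everything beyond the starter architecture is an assumption made for illustration: the synthetic data (class 1 images receive a bright 4x4 patch), the helper names `make_cnn`, `make_toy_data`, and `train_and_eval`, the changed variable `hidden_channels`, and the optimizer settings. Real inspection images would replace the synthetic data, and the printed numbers depend on the seed and on the PyTorch version.

```python
import torch
from torch import nn


def make_cnn(hidden_channels: int = 4) -> nn.Sequential:
    """Starter architecture with the first-stage channel count exposed."""
    return nn.Sequential(
        nn.Conv2d(1, hidden_channels, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.Conv2d(hidden_channels, 8, kernel_size=3, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d((1, 1)), nn.Flatten(), nn.Linear(8, 2),
    )


def make_toy_data(n: int, generator: torch.Generator):
    """Synthetic stand-in data (an assumption): class 1 gets a bright 4x4 patch."""
    x = torch.randn(n, 1, 16, 16, generator=generator)
    y = torch.randint(0, 2, (n,), generator=generator)
    for i in range(n):
        if y[i] == 1:
            # Top-left corner chosen so the patch stays inside the 16x16 image.
            r, c = torch.randint(0, 13, (2,), generator=generator).tolist()
            x[i, 0, r:r + 4, c:c + 4] += 2.0
    return x, y


def train_and_eval(hidden_channels: int, seed: int = 0, steps: int = 50) -> dict:
    torch.manual_seed(seed)                    # fixes weight initialisation
    gen = torch.Generator().manual_seed(seed)  # fixes the synthetic data
    x_train, y_train = make_toy_data(256, gen)
    x_val, y_val = make_toy_data(128, gen)

    model = make_cnn(hidden_channels)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
    loss_fn = nn.CrossEntropyLoss()            # objective: cross-entropy on 2 classes

    for _ in range(steps):                     # full-batch updates, for brevity
        optimizer.zero_grad()
        loss = loss_fn(model(x_train), y_train)
        loss.backward()
        optimizer.step()

    model.eval()
    with torch.no_grad():                      # evaluation signal: held-out loss and accuracy
        logits = model(x_val)
        val_loss = loss_fn(logits, y_val).item()
        val_acc = (logits.argmax(dim=1) == y_val).float().mean().item()
    return {"hidden_channels": hidden_channels, "val_loss": val_loss, "val_acc": val_acc}


# Baseline (4 channels, as in the starter) versus one modification (16 channels).
results = [train_and_eval(hidden_channels=c) for c in (4, 16)]
for row in results:
    print(row)
```

Both runs share the same seed, data, and training budget, so `hidden_channels` is the only variable that changes between them. A single seed is weak evidence, which is the kind of caveat the limitation statement is meant to capture.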

Submission

Submit a 600 to 900 word technical memo, plus whatever code, plots, tables, or shape traces are needed to support your claims. The memo should read like a review artifact for another AI practitioner: concise, reproducible, and honest about uncertainty.

Rubric Focus

Technical correctness and appropriate neural-network vocabulary.
Evidence from the starter experiment or a documented extension.
Connection between design choices and data/problem structure.
Clear treatment of limitations, failure modes, or next experimental gates.

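import torch
from torch import nn

# Two convolutional stages followed by a linear classification head.
model = nn.Sequential(
    # Stage 1: 1 -> 4 channels; padding=1 keeps 16x16, then pooling halves it to 8x8.
    nn.Conv2d(1, 4, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    # Stage 2: 4 -> 8 channels; global average pooling collapses each map to 1x1.
    nn.Conv2d(4, 8, kernel_size=3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d((1, 1)),
    # Head: flatten to 8 features per image and map them to 2 class logits.
    nn.Flatten(), nn.Linear(8, 2)
)

# Random batch of 5 grayscale 16x16 images, shaped (N, C, H, W).
x = torch.randn(5, 1, 16, 16)
# Apply one layer at a time and print the tensor shape after each.
for layer in model:
    x = layer(x)
    print(f"{layer.__class__.__name__:>18}: {tuple(x.shape)}")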

            Conv2d: (5, 4, 16, 16)
              ReLU: (5, 4, 16, 16)
         MaxPool2d: (5, 4, 8, 8)
            Conv2d: (5, 8, 8, 8)
              ReLU: (5, 8, 8, 8)
 AdaptiveAvgPool2d: (5, 8, 1, 1)
           Flatten: (5, 8)
            Linear: (5, 2)
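
The spatial sizes in the trace above can be checked by hand with the standard output-size formula for convolution and pooling, floor((size + 2 * padding - kernel) / stride) + 1. The helper `conv_out` below is a hypothetical name, and the sketch assumes dilation of 1 and the default floor rounding (`ceil_mode=False`), which match the starter model.

```python
def conv_out(size: int, kernel: int, padding: int = 0, stride: int = 1) -> int:
    """Output length along one spatial axis for a conv or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


h0 = 16
h1 = conv_out(h0, kernel=3, padding=1)  # Conv2d(k=3, p=1): size preserved
h2 = conv_out(h1, kernel=2, stride=2)   # MaxPool2d(2): stride defaults to the kernel size
h3 = conv_out(h2, kernel=3, padding=1)  # second Conv2d: size preserved again

print(h1, h2, h3)  # 16 8 8
```

`AdaptiveAvgPool2d((1, 1))` is not covered by this formula: it fixes the output size at 1x1 regardless of input size, which is why the linear head needs only 8 input features.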

Reflection Prompts

What changed when you modified the starter experiment, and why should that change matter?
Which result surprised you, and what diagnostic would you run next?
What assumption would you document before handing this model to another practitioner?
Which failure mode from the module reading is most relevant to your result?