Module 3 Lab: SGD vs Adam on a synthetic surface
Compare how SGD and Adam behave on a small synthetic nonlinear classification problem.
Run the setup cell, inspect the printed diagnostics, and then complete the exercises at the end. The lab is intentionally small enough to run in GitHub Codespaces without a GPU.
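import torch
from torch import nn
import matplotlib.pyplot as plt

torch.manual_seed(13)  # fix the data and the weight initialization

# 220 samples with 4 features. The label depends on the product
# X[:, 0] * X[:, 1], so the generating decision boundary is nonlinear.
X = torch.randn(220, 4)
y = ((X[:, 0] * X[:, 1] + X[:, 2] - 0.25 * X[:, 3]) > 0).long()


def run(opt_name):
    """Train a small MLP with the named optimizer; return (losses, accuracy)."""
    model = nn.Sequential(nn.Linear(4, 20), nn.ReLU(), nn.Linear(20, 2))
    loss_fn = nn.CrossEntropyLoss()
    if opt_name == "Adam":
        opt = torch.optim.Adam(model.parameters(), lr=0.03)
    else:
        opt = torch.optim.SGD(model.parameters(), lr=0.15)
    losses = []
    # Full-batch training: every step uses all 220 samples.
    for _ in range(120):
        opt.zero_grad()
        loss = loss_fn(model(X), y)
        loss.backward()
        opt.step()
        losses.append(loss.item())
    # Accuracy on X after the final step.
    acc = (model(X).argmax(1) == y).float().mean().item()
    return losses, acc


sgd_losses, sgd_acc = run("SGD")
adam_losses, adam_acc = run("Adam")
print(f"SGD accuracy: {sgd_acc:.3f}")
print(f"Adam accuracy: {adam_acc:.3f}")

# Build the loss-curve figure.
plt.figure(figsize=(5, 3))
plt.plot(sgd_losses, label="SGD")
plt.plot(adam_losses, label="Adam")
plt.legend()
# The figure is closed without being displayed; add plt.show() or
# plt.savefig(<path>) before this line to view the curves.
plt.close()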
SGD accuracy: 0.891
Adam accuracy: 1.000
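One reason the two runs can differ is the update rule itself. The sketch below is not part of the lab. It writes out plain SGD (no momentum) and Adam for a single scalar weight in pure Python, using the lab's learning rates (0.15 and 0.03) and the default Adam hyperparameters in PyTorch (beta1=0.9, beta2=0.999, eps=1e-8). The objective f(w) = (w - 3)^2 and the helper names `grad`, `sgd_step`, and `adam_step` are assumptions chosen for illustration.

```python
import math


def grad(w):
    # Gradient of the hypothetical objective f(w) = (w - 3)^2.
    return 2.0 * (w - 3.0)


def sgd_step(w, g, lr):
    # Plain SGD without momentum: step against the gradient, scaled by lr.
    return w - lr * g


def adam_step(w, g, state, lr, beta1=0.9, beta2=0.999, eps=1e-8):
    # Adam keeps running averages of the gradient (m) and squared gradient (v).
    state["t"] += 1
    state["m"] = beta1 * state["m"] + (1 - beta1) * g
    state["v"] = beta2 * state["v"] + (1 - beta2) * g * g
    # Bias correction compensates for m and v starting at zero.
    m_hat = state["m"] / (1 - beta1 ** state["t"])
    v_hat = state["v"] / (1 - beta2 ** state["t"])
    return w - lr * m_hat / (math.sqrt(v_hat) + eps)


w_sgd, w_adam = 0.0, 0.0
adam_state = {"t": 0, "m": 0.0, "v": 0.0}
first_sgd_step = None
first_adam_step = None

for step in range(120):
    new_sgd = sgd_step(w_sgd, grad(w_sgd), lr=0.15)
    new_adam = adam_step(w_adam, grad(w_adam), adam_state, lr=0.03)
    if step == 0:
        # Record how far each optimizer moved on its very first update.
        first_sgd_step = new_sgd - w_sgd
        first_adam_step = new_adam - w_adam
    w_sgd, w_adam = new_sgd, new_adam

print(f"first SGD step:  {first_sgd_step:.4f}")
print(f"first Adam step: {first_adam_step:.4f}")
print(f"w after 120 steps: SGD={w_sgd:.4f}, Adam={w_adam:.4f}")
```

On the first update Adam's bias-corrected ratio reduces to roughly the sign of the gradient, so its step is about `lr` in size however large the gradient is, while the SGD step scales directly with the gradient. That difference in step scaling is one thing to keep in mind when reading the loss curves.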
Lab exercises
Change one model or data parameter and rerun the lab.
Record whether the reported accuracy improved, worsened, or stayed roughly the same.
Add one sentence connecting the result to Optimization, loss, and regularization.
Identify one limitation of this toy setup before applying the idea to a real dataset.
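For the first two exercises, one option is to wrap the lab's training loop so that a single setting can be changed per run. The sketch below is a variant written under assumptions, not the lab's own code: `make_data`, `run_variant`, and the `hidden`, `steps`, `lr`, and `seed` parameters are hypothetical names. Because it seeds the data and the weights separately, its numbers need not match the outputs printed above.

```python
import torch
from torch import nn


def make_data(n=220, seed=13):
    # Same labeling rule as the lab; n and seed are exposed so they can be varied.
    gen = torch.Generator().manual_seed(seed)
    X = torch.randn(n, 4, generator=gen)
    y = ((X[:, 0] * X[:, 1] + X[:, 2] - 0.25 * X[:, 3]) > 0).long()
    return X, y


def run_variant(opt_name, X, y, hidden=20, steps=120, lr=None, seed=0):
    # Seeding here fixes the weight initialization for each call.
    torch.manual_seed(seed)
    model = nn.Sequential(nn.Linear(4, hidden), nn.ReLU(), nn.Linear(hidden, 2))
    loss_fn = nn.CrossEntropyLoss()
    if opt_name == "Adam":
        step_size = 0.03 if lr is None else lr  # lab default for Adam
        opt = torch.optim.Adam(model.parameters(), lr=step_size)
    else:
        step_size = 0.15 if lr is None else lr  # lab default for SGD
        opt = torch.optim.SGD(model.parameters(), lr=step_size)
    for _ in range(steps):
        opt.zero_grad()
        loss = loss_fn(model(X), y)
        loss.backward()
        opt.step()
    # Accuracy on the same data used for training, as in the lab.
    with torch.no_grad():
        return (model(X).argmax(1) == y).float().mean().item()


X, y = make_data()
results = {}
# Change exactly one setting (hidden width) and keep everything else fixed.
for hidden in (20, 40):
    for name in ("SGD", "Adam"):
        results[(name, hidden)] = run_variant(name, X, y, hidden=hidden)

for (name, hidden), acc in results.items():
    print(f"{name:>4} hidden={hidden}: accuracy {acc:.3f}")
```

Keeping `seed` fixed across the compared runs means a change in accuracy comes from the setting that was varied rather than from a different weight initialization.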
# Reflection workspace
observation = ""
next_experiment = ""
print({"observation": observation, "next_experiment": next_experiment})
{'observation': '', 'next_experiment': ''}
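When filling in the reflection workspace, it can help to make "improved, worsened, or roughly the same" explicit with a tolerance. The helper below is a hypothetical sketch: `compare_metric`, its `tol` default of 0.01, and the `higher_is_better` flag are assumptions, not part of the lab. The example call uses the two accuracies printed earlier.

```python
def compare_metric(baseline, new, tol=0.01, higher_is_better=True):
    # Differences within tol are treated as noise rather than a real change.
    delta = new - baseline
    if abs(delta) <= tol:
        return "roughly the same"
    improved = delta > 0 if higher_is_better else delta < 0
    return "improved" if improved else "worsened"


# Accuracies taken from the lab output: SGD as baseline, Adam as the new run.
observation = f"Accuracy {compare_metric(0.891, 1.000)} when switching SGD to Adam."
next_experiment = ""
print({"observation": observation, "next_experiment": next_experiment})
```

For a metric where lower is better, such as the final loss, pass `higher_is_better=False`.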