Module 5 Assignment: Sequence modeling design note
Theme
Sequence models: RNNs and LSTMs
Exercises
Define a sequence prediction task and identify the input/output alignment.
Compare when a simple RNN, LSTM, GRU, or one-dimensional convolution would be appropriate.
Run the starter LSTM on synthetic ordered data and inspect tensor shapes.
Explain one long-range dependency risk and one mitigation strategy.
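For the comparison exercise, a minimal sketch that passes the same synthetic batch through each candidate layer and counts parameters may help ground the discussion. The layer sizes mirror the starter cell; the helper `count_params`, the dictionary names, and the convolution's `kernel_size=3` with `padding=1` are illustrative assumptions, not part of the assignment.

```python
import torch
from torch import nn

torch.manual_seed(0)
x = torch.randn(12, 6, 3)  # (batch, time, features), same layout as the starter


def count_params(module: nn.Module) -> int:
    """Hypothetical helper: total number of parameters in a module."""
    return sum(p.numel() for p in module.parameters())


recurrent = {
    "RNN": nn.RNN(input_size=3, hidden_size=8, batch_first=True),
    "GRU": nn.GRU(input_size=3, hidden_size=8, batch_first=True),
    "LSTM": nn.LSTM(input_size=3, hidden_size=8, batch_first=True),
}

shapes = {}
params = {}
for name, layer in recurrent.items():
    out, _ = layer(x)  # out holds the hidden state at every time step
    shapes[name] = tuple(out.shape)
    params[name] = count_params(layer)

# Conv1d expects (batch, channels, time), so swap the last two axes going in
# and coming out. Assumption: symmetric padding keeps the length at 6, which
# also lets each output step see one future step (this is not a causal conv).
conv = nn.Conv1d(in_channels=3, out_channels=8, kernel_size=3, padding=1)
conv_out = conv(x.transpose(1, 2)).transpose(1, 2)
shapes["Conv1d"] = tuple(conv_out.shape)
params["Conv1d"] = count_params(conv)

for name in shapes:
    print(f"{name:6s} output {shapes[name]}  parameters {params[name]}")
```

With these sizes every layer returns a `(12, 6, 8)` tensor, and the parameter counts are 104 (RNN), 312 (GRU), 416 (LSTM), and 80 (Conv1d). Parameter count is only one axis of the comparison; the convolution's fixed receptive field and the recurrent layers' sequential computation are worth discussing separately in the memo.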
Submission
Submit a technical memo of 600 to 900 words, plus any code, plots, or shape traces needed to support your claims. Use the starter cell as a minimal reproducible experiment, then make at least one meaningful modification.
Rubric
Correct use of module vocabulary and notation
Clear connection between design choices and data/problem structure
Evidence from the starter experiment or your own extension
Concise reflection on limitations, failure modes, or next steps
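import torch
from torch import nn

torch.manual_seed(5)  # fix the seed so the run is reproducible

X = torch.randn(12, 6, 3)  # (batch, time, features)
lstm = nn.LSTM(input_size=3, hidden_size=8, batch_first=True)
head = nn.Linear(8, 1)  # maps the 8-dimensional hidden state to one scalar

# sequence_output holds the hidden state at every time step;
# h_n and c_n hold the final hidden and cell states.
sequence_output, (h_n, c_n) = lstm(X)
prediction = head(sequence_output[:, -1, :])  # many-to-one: last time step only

print("sequence output shape:", tuple(sequence_output.shape))
print("final prediction shape:", tuple(prediction.shape))

Output:
sequence output shape: (12, 6, 8)
final prediction shape: (12, 1)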
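One way to make the input/output alignment explicit is to trace the shapes for two common alignments on the same LSTM output. The sketch below reuses the starter's sizes; the names `last_step`, `head_one`, `head_many`, `many_to_one`, and `many_to_many` are illustrative assumptions, and the per-step head is just one possible modification.

```python
import torch
from torch import nn

torch.manual_seed(5)
X = torch.randn(12, 6, 3)  # (batch, time, features)
lstm = nn.LSTM(input_size=3, hidden_size=8, batch_first=True)
sequence_output, (h_n, c_n) = lstm(X)

# h_n is always (num_layers * num_directions, batch, hidden), regardless of
# batch_first. For a single-layer, unidirectional LSTM, its last entry is the
# same tensor as the final time step of sequence_output.
last_step = sequence_output[:, -1, :]
same_summary = torch.allclose(last_step, h_n[-1])

# Many-to-one: one prediction per sequence, taken from the last time step.
head_one = nn.Linear(8, 1)
many_to_one = head_one(last_step)

# Many-to-many: nn.Linear acts on the last dimension, so applying it to the
# full output gives one prediction per time step.
head_many = nn.Linear(8, 1)
many_to_many = head_many(sequence_output)

print("h_n shape:", tuple(h_n.shape))
print("last step equals h_n[-1]:", same_summary)
print("many-to-one shape:", tuple(many_to_one.shape))
print("many-to-many shape:", tuple(many_to_many.shape))
```

The equality between `sequence_output[:, -1, :]` and `h_n[-1]` holds for this single-layer, unidirectional configuration; it would need rechecking for a bidirectional layer or for padded sequences of unequal length.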
Reflection prompts
What changed when you modified the starter experiment?
Which result surprised you, and what diagnostic would you run next?
What assumption would you document before handing this model to another practitioner?
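As one possible diagnostic for the second prompt, the sketch below measures how strongly the final prediction depends on each input time step by backpropagating to the inputs. It is a rough probe under stated assumptions rather than a prescribed method: the longer sequence length of 50 and the name `grad_per_step` are choices made here for illustration, and the model is untrained.

```python
import torch
from torch import nn

torch.manual_seed(5)
seq_len = 50  # assumption: longer than the starter so distance effects can show
X = torch.randn(12, seq_len, 3, requires_grad=True)  # track gradients on inputs

lstm = nn.LSTM(input_size=3, hidden_size=8, batch_first=True)
head = nn.Linear(8, 1)

sequence_output, _ = lstm(X)
prediction = head(sequence_output[:, -1, :])  # many-to-one, as in the starter

# Backpropagate the summed prediction to the inputs.
prediction.sum().backward()

# X.grad has shape (batch, time, features). Collapse batch and features to
# get one gradient norm per time step.
grad_per_step = X.grad.pow(2).sum(dim=(0, 2)).sqrt()

print("gradient norm at first step:", grad_per_step[0].item())
print("gradient norm at last step: ", grad_per_step[-1].item())
```

If the norms for early steps are much smaller than for late steps, the prediction is effectively ignoring distant inputs, which is the long-range dependency risk the exercises ask about. Whether that pattern appears, and how it changes after training or after swapping the layer type, is something to measure rather than assume.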