1. nn.Conv2d
A PyTorch class that implements a 2D convolutional layer.
Important parameters:
- in_channels
The number of channels in the input data (for example, 3 for an RGB image, 1 for a grayscale image).
- out_channels
The number of output channels, which equals the number of convolutional kernels.
import torch.nn as nn

# Assume you have an RGB image (3 channels) of size 32×32:
conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)
# With the default stride of 1, the output shape is: [batch_size, 16, 32, 32]
- kernel_size
The size of the convolutional kernel. An int such as 3 is shorthand for (3, 3), and 5 for (5, 5); a tuple such as (3, 5) gives a non-square kernel.
- stride
The step size with which the kernel moves across the input. A larger stride produces a smaller output feature map. It can be an int such as 2 or a tuple such as (3, 5); the default is 1.
- padding
The number of zeros added to each side of the input boundary, which controls the spatial size of the output feature map. The default is 0.
- bias
Whether to add a bias term. The default is True.
2. nn.MaxPool2d
A PyTorch class that implements a 2D max pooling layer.
The padding and stride parameters have the same meaning as in nn.Conv2d.
It has no learnable parameters.
Example:
nn.MaxPool2d(3)
• kernel_size=3: a 3x3 window slides over the input feature map.
Defaults:
• stride=kernel_size: the window moves 3 pixels at a time (non-overlapping pooling). This differs from nn.Conv2d, whose default stride is 1.
• padding=0: no edge padding is applied.
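The difference between the default stride (equal to kernel_size) and a stride of 1 can be illustrated with a small plain-Python sketch of max pooling over a single channel. The function `max_pool2d` below is a hypothetical helper written for illustration, not the PyTorch implementation, and it assumes padding=0 and a square window.

```python
def max_pool2d(feature_map, kernel_size, stride=None):
    """Max pooling over one 2D channel stored as a list of rows (padding=0)."""
    if stride is None:
        stride = kernel_size  # mirrors the nn.MaxPool2d default
    height, width = len(feature_map), len(feature_map[0])
    out_h = (height - kernel_size) // stride + 1
    out_w = (width - kernel_size) // stride + 1
    pooled = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            top, left = i * stride, j * stride
            # Collect every value inside the current window
            window = [
                feature_map[r][c]
                for r in range(top, top + kernel_size)
                for c in range(left, left + kernel_size)
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled


# 6x6 input holding the values 0..35, laid out row by row
feature_map = [[r * 6 + c for c in range(6)] for r in range(6)]

# Default stride = kernel_size: four non-overlapping 3x3 windows
non_overlapping = max_pool2d(feature_map, kernel_size=3)
# stride=1: overlapping windows, so the output is larger
overlapping = max_pool2d(feature_map, kernel_size=3, stride=1)

print(non_overlapping)
print(len(overlapping), len(overlapping[0]))
```

The function only takes maxima over windows and stores no weights, which is why a pooling layer has nothing to learn.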
3. LeNet
An early successful convolutional neural network (CNN), originally developed for handwritten digit recognition.