CNN - Fight4354

Jun 21, 2025

This post was translated from Chinese into English by AI.

1. nn.Conv2d

A PyTorch class that implements a 2D convolutional layer.
Important parameters:

in_channels
The number of channels in the input data (for example, an RGB image has 3, a grayscale image has 1)
out_channels
The number of output channels, which equals the number of convolutional kernels (each kernel produces one output channel).

import torch.nn as nn

# Assume you have an RGB image (3 channels) of size 32×32:
conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)
# With the default stride of 1, the output shape is: [batch_size, 16, 32, 32]

kernel_size
The size of the convolutional kernel. A single int k is shorthand for (k, k), so 3 means (3, 3) and 5 means (5, 5). Non-square kernels such as (3, 5) are also allowed.
stride
The step by which the kernel moves across the input. A larger stride produces a smaller output feature map. Default is 1. Examples: 2, (3, 5).
padding
The number of zeros added to each side of the input along its height and width, which controls the spatial size of the output feature map. Default is 0.
bias
Whether to add a learnable bias term to each output channel. Default is True.
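
How these parameters interact can be checked with a little arithmetic. The sketch below is plain Python rather than PyTorch; the helper names `conv2d_output_size` and `conv2d_param_count` are hypothetical, and it assumes a square kernel and a dilation of 1 (the nn.Conv2d default). Under those assumptions the output length along one spatial dimension is floor((size + 2 × padding - kernel_size) / stride) + 1.

```python
def conv2d_output_size(size, kernel_size, stride=1, padding=0):
    """Output length along one spatial dimension (assumes dilation=1)."""
    return (size + 2 * padding - kernel_size) // stride + 1


def conv2d_param_count(in_channels, out_channels, kernel_size, bias=True):
    """Learnable parameters of a conv layer with a square kernel."""
    # Each of the out_channels kernels spans every input channel.
    weights = out_channels * in_channels * kernel_size * kernel_size
    # One bias value per output channel, if enabled.
    return weights + (out_channels if bias else 0)


# 32×32 input, 3×3 kernel, padding=1, stride=1: spatial size is preserved.
same = conv2d_output_size(32, kernel_size=3, padding=1)

# Same layer with stride=2: spatial size is halved.
halved = conv2d_output_size(32, kernel_size=3, stride=2, padding=1)

# in_channels=3, out_channels=16, kernel_size=3, with and without bias.
params_with_bias = conv2d_param_count(3, 16, 3)
params_without_bias = conv2d_param_count(3, 16, 3, bias=False)

print(same, halved, params_with_bias, params_without_bias)
```

Here `same` is 32 and `halved` is 16, which is why padding=1 with a 3×3 kernel keeps the 32×32 size in the example above, while a stride of 2 roughly halves it.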

2. nn.MaxPool2d

A PyTorch class that implements a 2D max pooling layer.
The padding and stride parameters have the same meaning as in nn.Conv2d, although the default stride differs.
It has no learnable parameters.
Example:
nn.MaxPool2d(3)
• kernel_size=3: a 3×3 window slides over the input feature map.
Defaults:
• stride=kernel_size, meaning the window moves 3 pixels at a time (non-overlapping pooling). This differs from nn.Conv2d, whose default stride is 1.
• padding=0, meaning no edge padding is applied.
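
To make the default stride concrete, here is a minimal pure-Python sketch of max pooling over a single 2D feature map. It is an illustration rather than PyTorch's implementation: the function name `max_pool2d` and the nested-list input are assumptions, and padding is left out (equivalent to padding=0).

```python
def max_pool2d(feature_map, kernel_size, stride=None):
    """Max-pool one 2D feature map (a list of rows) with a square window."""
    if stride is None:
        # Mirror the nn.MaxPool2d default: stride equals kernel_size.
        stride = kernel_size
    height, width = len(feature_map), len(feature_map[0])
    out_h = (height - kernel_size) // stride + 1
    out_w = (width - kernel_size) // stride + 1
    return [
        [
            max(
                feature_map[i * stride + di][j * stride + dj]
                for di in range(kernel_size)
                for dj in range(kernel_size)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# A 6×6 feature map holding 0..35 in row-major order.
grid = [[row * 6 + col for col in range(6)] for row in range(6)]

# Default stride (= kernel_size = 3): four non-overlapping 3×3 windows.
pooled = max_pool2d(grid, kernel_size=3)

# stride=1: windows overlap, giving a larger 4×4 output.
overlapping = max_pool2d(grid, kernel_size=3, stride=1)

print(pooled)
print(len(overlapping), len(overlapping[0]))
```

With the default stride the 6×6 input shrinks to 2×2, and each output value is the maximum of its own 3×3 block. Nothing in the function is trainable, which matches the note that the layer has no learnable parameters.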

3. LeNet

An early successful convolutional neural network, introduced by Yann LeCun and colleagues for handwritten digit recognition.
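
As a rough illustration of how convolution and pooling layers combine, the sketch below traces feature map shapes through the convolutional part of a LeNet-5-style network. The layer sizes are not given in this post. They are assumed from the commonly cited LeNet-5 configuration: a 1×32×32 input, two 5×5 convolutions with 6 and 16 output channels, each followed by 2×2 pooling with stride 2. The helper `out_size` is hypothetical and assumes a dilation of 1.

```python
def out_size(size, kernel_size, stride=1, padding=0):
    """Output length along one spatial dimension (assumes dilation=1)."""
    return (size + 2 * padding - kernel_size) // stride + 1


# (name, out_channels or None if the layer keeps the channel count,
#  kernel_size, stride) -- assumed LeNet-5-style configuration.
layers = [
    ("conv1", 6, 5, 1),
    ("pool1", None, 2, 2),
    ("conv2", 16, 5, 1),
    ("pool2", None, 2, 2),
]

channels, size = 1, 32  # grayscale 32×32 input
shapes = []
for name, out_channels, kernel_size, stride in layers:
    if out_channels is not None:  # convolution changes the channel count
        channels = out_channels
    size = out_size(size, kernel_size, stride)
    shapes.append((name, channels, size, size))

# Number of features passed to the fully connected layers.
flattened = channels * size * size

for shape in shapes:
    print(shape)
print("flattened:", flattened)
```

The spatial size goes 32 → 28 → 14 → 10 → 5, so 16 × 5 × 5 = 400 features reach the fully connected layers. The original network used average-style subsampling rather than max pooling, but the shapes are the same either way.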