LeNet

Note

Now that we have all the building blocks that make up a convolutional neural network (CNN), we can assemble them into a full model.
LeNet was one of the earliest published convolutional neural networks; it was proposed by Yann LeCun in 1989 for recognizing handwritten digits.

Architecture

Overall, LeNet (LeNet-5) consists of two parts:

  1. a convolutional encoder made up of two convolutional layers

  2. a dense block made up of three fully connected layers

(figure: LeNet architecture)

Tip

In general, as we go deeper into the network:
the spatial resolution decreases, so each unit's receptive field grows and it can capture more complex patterns;
the number of channels increases, so we can capture a greater number of more specialized patterns.
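
As a rough check of the claim that receptive fields grow as resolution shrinks, the sketch below applies the standard receptive-field recursion (each layer adds kernel_size - 1 times the cumulative stride of the layers before it) to LeNet's convolution and pooling stages. The layer list is written out here purely for illustration; it mirrors the implementation further down.

# Receptive field of a single unit after each conv/pool stage of LeNet
# (name, kernel_size, stride) triples taken from the implementation below
stages = [('Conv2d(5x5)', 5, 1), ('AvgPool2d(2x2)', 2, 2),
          ('Conv2d(5x5)', 5, 1), ('AvgPool2d(2x2)', 2, 2)]
rf, jump = 1, 1  # receptive field and cumulative stride at the input
for name, k, s in stages:
    rf += (k - 1) * jump   # each stage widens the receptive field
    jump *= s              # strides compound multiplicatively
    print(f'{name}: receptive field {rf}x{rf}')
# Conv2d(5x5): receptive field 5x5
# AvgPool2d(2x2): receptive field 6x6
# Conv2d(5x5): receptive field 14x14
# AvgPool2d(2x2): receptive field 16x16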

A more compact view of the architecture:

(figure: compact LeNet architecture diagram)

Implementation

import torch
from torch import nn

# Define LeNet with nn.Sequential.
# The original LeNet uses sigmoid activations and average pooling.
net = nn.Sequential(nn.Conv2d(1, 6, kernel_size=5, padding=2), nn.Sigmoid(),
                    nn.AvgPool2d(kernel_size=2, stride=2),
                    nn.Conv2d(6, 16, kernel_size=5), nn.Sigmoid(),
                    nn.AvgPool2d(kernel_size=2, stride=2),
                    nn.Flatten(),
                    nn.Linear(16 * 5 * 5, 120), nn.Sigmoid(),
                    nn.Linear(120, 84), nn.Sigmoid(),
                    nn.Linear(84, 10))
# A dummy input of shape (batch_size, channels, height, width)
X = torch.rand(size=(1, 1, 28, 28), dtype=torch.float32)
# Check the output shape after every layer
for layer in net:
    X = layer(X)
    # layer name: output shape
    print(layer.__class__.__name__, 'output shape: \t', X.shape)
Conv2d output shape: 	 torch.Size([1, 6, 28, 28])
Sigmoid output shape: 	 torch.Size([1, 6, 28, 28])
AvgPool2d output shape: 	 torch.Size([1, 6, 14, 14])
Conv2d output shape: 	 torch.Size([1, 16, 10, 10])
Sigmoid output shape: 	 torch.Size([1, 16, 10, 10])
AvgPool2d output shape: 	 torch.Size([1, 16, 5, 5])
Flatten output shape: 	 torch.Size([1, 400])
Linear output shape: 	 torch.Size([1, 120])
Sigmoid output shape: 	 torch.Size([1, 120])
Linear output shape: 	 torch.Size([1, 84])
Sigmoid output shape: 	 torch.Size([1, 84])
Linear output shape: 	 torch.Size([1, 10])
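
The printed shapes can be verified by hand with the usual output-size formula for convolution and pooling layers, n_out = floor((n_in + 2 * padding - kernel_size) / stride) + 1: the first convolution (kernel 5, padding 2, stride 1) keeps the 28×28 resolution, the 2×2 average pooling halves it to 14×14, the second convolution (kernel 5, no padding) gives 10×10, and the final pooling gives 5×5. That is why the first fully connected layer takes 16 * 5 * 5 = 400 inputs.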

Training

import d2l

# Load the data
batch_size = 256
train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size=batch_size)
# Train; a GPU is strongly recommended
lr, num_epochs = 0.01, 10
d2l.train_image_classifier(net, train_iter, test_iter, lr, num_epochs)
loss 0.343, train acc 0.870700, test acc  0.864500
(figure: training loss and accuracy curves)
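
The d2l helpers above wrap the data loading and the training loop. If they are not available, a minimal plain-PyTorch version looks roughly like the sketch below; the torchvision data loading, the SGD optimizer, the './data' path, and the helper name train_lenet are assumptions for illustration, not the exact d2l implementation.

import torch
from torch import nn
from torch.utils.data import DataLoader
import torchvision
from torchvision import transforms

def train_lenet(net, lr=0.01, num_epochs=10, batch_size=256):
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    # Fashion-MNIST, the same dataset used above
    trans = transforms.ToTensor()
    train_ds = torchvision.datasets.FashionMNIST('./data', train=True, transform=trans, download=True)
    test_ds = torchvision.datasets.FashionMNIST('./data', train=False, transform=trans, download=True)
    train_iter = DataLoader(train_ds, batch_size, shuffle=True)
    test_iter = DataLoader(test_ds, batch_size)

    net.to(device)
    loss_fn = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(net.parameters(), lr=lr)
    for epoch in range(num_epochs):
        net.train()
        for X, y in train_iter:
            X, y = X.to(device), y.to(device)
            optimizer.zero_grad()
            loss_fn(net(X), y).backward()
            optimizer.step()
        # Report test accuracy after each epoch
        net.eval()
        correct, total = 0, 0
        with torch.no_grad():
            for X, y in test_iter:
                X, y = X.to(device), y.to(device)
                correct += (net(X).argmax(dim=1) == y).sum().item()
                total += y.numel()
        print(f'epoch {epoch + 1}, test acc {correct / total:.3f}')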