The Keras Sequential API

Note

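In this section we use the Fashion-MNIST dataset to demonstrate the Keras Sequential API: defining a model, training it, evaluating it and making predictions, and saving and loading it.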

The Fashion-MNIST dataset

from tensorflow import keras


#@save
def load_fashion_mnist():
    """
    Load the Fashion-MNIST dataset.
    The training, validation, and test sets contain 55000, 5000, and 10000 samples respectively.
    """
    # Load the data with keras.datasets
    (X_train_val, y_train_val), (X_test, y_test) = keras.datasets.fashion_mnist.load_data()
    # Split into training and validation sets and scale the pixel values to [0, 1]
    X_val, X_train = X_train_val[:5000] / 255.0, X_train_val[5000:] / 255.0
    y_val, y_train = y_train_val[:5000], y_train_val[5000:]
    # Scale the test inputs the same way
    X_test = X_test / 255.0
    return (X_train, y_train), (X_val, y_val), (X_test, y_test)


(X_train, y_train), (X_val, y_val), (X_test, y_test) = load_fashion_mnist()
# Check the shapes
X_train.shape, y_train.shape
((55000, 28, 28), (55000,))
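
The split and scaling inside load_fashion_mnist can be tried without downloading anything. The following is a minimal sketch on a synthetic array; the tiny 60-sample, 2x2 "images" and the name X_full are assumptions made only for illustration, standing in for the 60000 real training images.

```python
import numpy as np

# Synthetic stand-in for the raw training data (assumption: 60 tiny 2x2 uint8 "images")
rng = np.random.default_rng(0)
X_full = rng.integers(0, 256, size=(60, 2, 2), dtype=np.uint8)
y_full = rng.integers(0, 10, size=60, dtype=np.uint8)

n_val = 5  # plays the role of the 5000 validation samples

# Same pattern as in load_fashion_mnist: first n_val samples for validation, the rest for training
X_val, X_train = X_full[:n_val] / 255.0, X_full[n_val:] / 255.0
y_val, y_train = y_full[:n_val], y_full[n_val:]

print(X_train.shape, X_val.shape)  # (55, 2, 2) (5, 2, 2)
# Dividing uint8 pixels by 255.0 yields floats in [0, 1]
print(X_train.dtype, X_train.min() >= 0.0, X_train.max() <= 1.0)  # float64 True True
```

Only the inputs are scaled; the labels stay as integer class indices.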

Tip

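Functions and classes that are reused later are marked with #@save at the top. They are saved in utils.py, so any other section that needs them only has to: import utils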

Defining the model

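# The Keras Sequential API
model = keras.models.Sequential()
# Add the layers one by one
# Flatten: reshapes each 28x28 input into a 1-D vector of 784 values; as the first layer it needs input_shape
model.add(keras.layers.Flatten(input_shape=[28, 28]))
# Dense: fully connected layer; the first argument is the number of output units
model.add(keras.layers.Dense(300, activation="relu"))
# The activation can be given as a string or as a layer instance; both have the same effect
model.add(keras.layers.Dense(100, activation=keras.layers.ReLU()))
model.add(keras.layers.Dense(10, activation="softmax"))
# Passing the layers as a list builds the same model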
model = keras.models.Sequential([
    keras.layers.Flatten(input_shape=[28, 28]),
    keras.layers.Dense(300, activation="relu"),
    keras.layers.Dense(100, activation="relu"),
    keras.layers.Dense(10, activation="softmax")
])
# Show the model's layers; None stands for the batch size, which is not fixed yet
model.summary()
Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
flatten_1 (Flatten)          (None, 784)               0         
_________________________________________________________________
dense_3 (Dense)              (None, 300)               235500    
_________________________________________________________________
dense_4 (Dense)              (None, 100)               30100     
_________________________________________________________________
dense_5 (Dense)              (None, 10)                1010      
=================================================================
Total params: 266,610
Trainable params: 266,610
Non-trainable params: 0
_________________________________________________________________
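
The parameter counts in the summary can be checked by hand: a Dense layer has one weight per input-output pair plus one bias per output unit, and Flatten has no parameters. A minimal sketch of that arithmetic follows; the helper name dense_params is made up for this example.

```python
# Layer widths taken from the model: 28*28 flattened inputs -> 300 -> 100 -> 10
layer_sizes = [28 * 28, 300, 100, 10]


def dense_params(n_in, n_out):
    """Weights (n_in * n_out) plus one bias per output unit."""
    return n_in * n_out + n_out


per_layer = [dense_params(n_in, n_out)
             for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
print(per_layer)       # [235500, 30100, 1010]
print(sum(per_layer))  # 266610
```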

Training

A Keras model must be compiled before it is trained. Compiling specifies the loss function, the optimizer, and the metrics to track during training.

If y_true holds class indices, use sparse_categorical_crossentropy.

If y_true is one-hot encoded, use categorical_crossentropy.
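
The two losses compute the same quantity and differ only in how the labels are encoded. The sketch below re-derives both forms in plain NumPy on made-up probabilities; it is an illustration of the formula, not Keras's own implementation, and the arrays y_prob, y_index, and y_onehot are assumptions for this example.

```python
import numpy as np

# Hypothetical predicted probabilities for 3 samples over 4 classes
y_prob = np.array([[0.7, 0.1, 0.1, 0.1],
                   [0.2, 0.5, 0.2, 0.1],
                   [0.1, 0.1, 0.2, 0.6]])
y_index = np.array([0, 1, 3])   # labels as class indices
y_onehot = np.eye(4)[y_index]   # the same labels, one-hot encoded

# "Sparse" form: pick the probability of the true class by index
sparse_ce = -np.log(y_prob[np.arange(len(y_index)), y_index]).mean()
# "Categorical" form: multiply each one-hot row with the log-probabilities and sum
categorical_ce = -(y_onehot * np.log(y_prob)).sum(axis=1).mean()

print(np.isclose(sparse_ce, categorical_ce))  # True
```

Since Fashion-MNIST labels come as integers from 0 to 9, the sparse form avoids building the one-hot matrix.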

# Compile
model.compile(loss="sparse_categorical_crossentropy",
              # SGD's default learning rate is 0.01
              optimizer="sgd",
              metrics=["accuracy"])
# Train
# history records the loss and metrics for each epoch
history = model.fit(X_train, y_train,
                    # batch_size is not given here, so the default of 32 is used
                    epochs=10,
                    validation_data=(X_val, y_val))
Epoch 1/10
1719/1719 [==============================] - 4s 2ms/step - loss: 0.7220 - accuracy: 0.7640 - val_loss: 0.5040 - val_accuracy: 0.8304
Epoch 2/10
1719/1719 [==============================] - 3s 2ms/step - loss: 0.4905 - accuracy: 0.8304 - val_loss: 0.4590 - val_accuracy: 0.8444
Epoch 3/10
1719/1719 [==============================] - 3s 2ms/step - loss: 0.4448 - accuracy: 0.8436 - val_loss: 0.4411 - val_accuracy: 0.8466
Epoch 4/10
1719/1719 [==============================] - 3s 2ms/step - loss: 0.4165 - accuracy: 0.8545 - val_loss: 0.4432 - val_accuracy: 0.8476
Epoch 5/10
1719/1719 [==============================] - 3s 2ms/step - loss: 0.3965 - accuracy: 0.8597 - val_loss: 0.3724 - val_accuracy: 0.8746
Epoch 6/10
1719/1719 [==============================] - 3s 2ms/step - loss: 0.3815 - accuracy: 0.8646 - val_loss: 0.3662 - val_accuracy: 0.8752
Epoch 7/10
1719/1719 [==============================] - 3s 2ms/step - loss: 0.3685 - accuracy: 0.8687 - val_loss: 0.3720 - val_accuracy: 0.8692
Epoch 8/10
1719/1719 [==============================] - 3s 2ms/step - loss: 0.3568 - accuracy: 0.8718 - val_loss: 0.3508 - val_accuracy: 0.8778
Epoch 9/10
1719/1719 [==============================] - 3s 2ms/step - loss: 0.3472 - accuracy: 0.8765 - val_loss: 0.3468 - val_accuracy: 0.8790
Epoch 10/10
1719/1719 [==============================] - 3s 2ms/step - loss: 0.3373 - accuracy: 0.8796 - val_loss: 0.3452 - val_accuracy: 0.8750
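
The 1719 in the progress bar is the number of batches per epoch: the 55000 training samples divided into batches of 32, with the last, smaller batch counted as well. A quick check of that arithmetic, which also accounts for the 313 steps that evaluate reports on the 10000 test samples:

```python
import math

n_train = 55000   # training samples after the validation split
n_test = 10000    # test samples
batch_size = 32   # Keras default when batch_size is not given

# ceil, because the final partial batch still counts as one step
print(math.ceil(n_train / batch_size))  # 1719
print(math.ceil(n_test / batch_size))   # 313
```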
import pandas as pd
import matplotlib.pyplot as plt

# Plot the history record with pandas and matplotlib
pd.DataFrame(history.history).plot(figsize=(8, 5))
plt.grid(True)
plt.gca().set_ylim(0, 1) # set the vertical range to [0-1]
plt.show()
[Figure: learning curves of loss, accuracy, val_loss, and val_accuracy over the 10 epochs]
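
history.history is a plain dict that maps each metric name to a list with one value per epoch, so it can also be inspected without plotting. The sketch below uses a hand-built dict named history_dict (a stand-in for history.history, filled with the validation numbers copied from the training log) to find the best epochs.

```python
# Validation metrics copied from the training log (one value per epoch)
history_dict = {
    "val_loss": [0.5040, 0.4590, 0.4411, 0.4432, 0.3724,
                 0.3662, 0.3720, 0.3508, 0.3468, 0.3452],
    "val_accuracy": [0.8304, 0.8444, 0.8466, 0.8476, 0.8746,
                     0.8752, 0.8692, 0.8778, 0.8790, 0.8750],
}

n_epochs = len(history_dict["val_loss"])
# Epochs are numbered from 1, list indices from 0
best_loss_epoch = min(range(n_epochs), key=lambda i: history_dict["val_loss"][i]) + 1
best_acc_epoch = max(range(n_epochs), key=lambda i: history_dict["val_accuracy"][i]) + 1
print(best_loss_epoch, best_acc_epoch)  # 10 9
```

In this run the validation loss is still at its lowest in the final epoch, while validation accuracy peaked one epoch earlier.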

Evaluation and prediction

# Evaluate on the test set
model.evaluate(X_test, y_test)
313/313 [==============================] - 0s 652us/step - loss: 0.3850 - accuracy: 0.8608
[0.385049432516098, 0.86080002784729]
# Suppose X_new is the data we want predictions for
X_new = X_test[:3]
# Predict the probability of each class
y_proba = model.predict(X_new)
y_proba.round(2)
array([[0.  , 0.  , 0.  , 0.  , 0.  , 0.04, 0.  , 0.12, 0.  , 0.84],
       [0.  , 0.  , 0.99, 0.  , 0.  , 0.  , 0.  , 0.  , 0.  , 0.  ],
       [0.  , 1.  , 0.  , 0.  , 0.  , 0.  , 0.  , 0.  , 0.  , 0.  ]],
      dtype=float32)
import numpy as np

# Get the predicted class: argmax across the columns (the last axis)
y_pred = np.argmax(y_proba, axis=-1)
# The predictions for X_new are all correct
y_pred, y_test[: 3]
(array([9, 2, 1]), array([9, 2, 1], dtype=uint8))
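
The predicted indices are easier to read as clothing categories. A minimal sketch, assuming the standard Fashion-MNIST label order; the list class_names is introduced here and does not come from the dataset loader, which returns only integer labels.

```python
import numpy as np

# Fashion-MNIST label names, in index order 0..9
class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

y_pred = np.array([9, 2, 1])  # the three predictions shown in the output
predicted_names = [class_names[i] for i in y_pred]
print(predicted_names)  # ['Ankle boot', 'Pullover', 'Trouser']
```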

Saving and loading

# Save the model
model.save("my_fashion_mnist_model")
INFO:tensorflow:Assets written to: my_fashion_mnist_model/assets
# Load the model
my_model = keras.models.load_model("my_fashion_mnist_model")
# The evaluation result matches that of the original model
my_model.evaluate(X_test, y_test)
313/313 [==============================] - 0s 713us/step - loss: 0.3850 - accuracy: 0.8608
[0.385049432516098, 0.86080002784729]