使用 Keras 和 Tensorflow

Evaluate 可以輕鬆整合到您的 Keras 和 Tensorflow 工作流中。我們將演示兩種將 Evaluate 融入模型訓練的方法，使用 Fashion MNIST 示例資料集。我們將訓練一個標準分類器來預測此資料集中的兩個類別，並展示如何在訓練期間使用指標作為回撥，或在訓練後用於評估。

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers
import evaluate

# We pull example code from Keras.io's guide on classifying with MNIST
# Located here: https://keras.io/examples/vision/mnist_convnet/

# Model / data parameters
input_shape = (28, 28, 1)

# Load the data and split it between train and test sets
(x_train, y_train), (x_test, y_test) = keras.datasets.fashion_mnist.load_data()


# Only select tshirts/tops and trousers, classes 0 and 1
def get_tshirts_tops_and_trouser(x_vals, y_vals):
    mask = np.where((y_vals == 0) | (y_vals == 1))
    return x_vals[mask], y_vals[mask]

x_train, y_train = get_tshirts_tops_and_trouser(x_train, y_train)
x_test, y_test = get_tshirts_tops_and_trouser(x_test, y_test)


# Scale images to the [0, 1] range
x_train = x_train.astype("float32") / 255
x_test = x_test.astype("float32") / 255

x_train = np.expand_dims(x_train, -1)
x_test = np.expand_dims(x_test, -1)


model = keras.Sequential(
    [
        keras.Input(shape=input_shape),
        layers.Conv2D(32, kernel_size=(3, 3), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Conv2D(64, kernel_size=(3, 3), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Flatten(),
        layers.Dropout(0.5),
        layers.Dense(1, activation="sigmoid"),
    ]
)

回撥 (Callbacks)

假設我們想在模型訓練時跟蹤模型指標。我們可以使用回撥 (Callback) 在每個 epoch 結束後計算該指標。

我們將在這裡定義一個回撥，它將接受一個指標名稱和我們的訓練資料，並讓它在 epoch 結束後計算指標。

class MetricsCallback(keras.callbacks.Callback):

    def __init__(self, metric_name, x_data, y_data) -> None:
        super(MetricsCallback, self).__init__()

        self.x_data = x_data
        self.y_data = y_data
        self.metric_name = metric_name
        self.metric = evaluate.load(metric_name)

    def on_epoch_end(self, epoch, logs=dict()):
        m = self.model 
        # Ensure we get labels of "1" or "0"
        training_preds = np.round(m.predict(self.x_data))
        training_labels = self.y_data

        # Compute score and save
        score = self.metric.compute(predictions = training_preds, references = training_labels)
        
        logs.update(score)

我們可以將這個類傳遞給 callbacks 關鍵字引數，以便在訓練期間使用它。

batch_size = 128
epochs = 2

model.compile(loss="binary_crossentropy", optimizer="adam")

model_history = model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs, validation_split=0.1, 
callbacks = [MetricsCallback(x_data = x_train, y_data = y_train, metric_name = "accuracy")])

使用 Evaluate 指標進行…評估！

我們也可以在模型訓練後使用相同的指標！在這裡，我們展示瞭如何在測試集上訓練模型後檢查其準確性。

acc = evaluate.load("accuracy")
# Round the predictions to turn them into "0" or "1" labels
test_preds = np.round(model.predict(x_test))
test_labels = y_test

print("Test accuracy is : ", acc.compute(predictions = test_preds, references = test_labels))
# Test accuracy is : 0.9855

< > 在 GitHub 上更新