Scikit-Learn
To run the scikit-learn examples, make sure you have installed the following library:
pip install -U scikit-learn
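The metrics in evaluate can be easily integrated with a Scikit-Learn estimator or pipeline. However, these metrics require that we first generate predictions from the model. The estimator's predictions and the labels can then be passed to evaluate metrics to compute the required values.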
import numpy as np
np.random.seed(0)
import evaluate
from sklearn.compose import ColumnTransformer
from sklearn.datasets import fetch_openml
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
Load the data from https://www.openml.org/d/40945:
X, y = fetch_openml("titanic", version=1, as_frame=True, return_X_y=True)
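Alternatively, X and y can be obtained directly from the frame attribute of the full dataset object:
titanic = fetch_openml("titanic", version=1, as_frame=True)
X = titanic.frame.drop('survived', axis=1)
y = titanic.frame['survived']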
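We create preprocessing pipelines for both the numeric and the categorical data. Note that pclass could be treated as either a categorical or a numeric feature.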
numeric_features = ["age", "fare"]
numeric_transformer = Pipeline(
    steps=[("imputer", SimpleImputer(strategy="median")), ("scaler", StandardScaler())]
)
categorical_features = ["embarked", "sex", "pclass"]
categorical_transformer = OneHotEncoder(handle_unknown="ignore")
preprocessor = ColumnTransformer(
    transformers=[
        ("num", numeric_transformer, numeric_features),
        ("cat", categorical_transformer, categorical_features),
    ]
)
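To make the effect of this preprocessing step concrete, here is a minimal sketch that applies the same kind of ColumnTransformer to a few made-up rows. The `toy` DataFrame and its values are hypothetical and only mimic the Titanic columns named above; the `toy_` names are introduced for illustration so they do not clash with the objects in the walkthrough.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy rows that mimic the Titanic columns used above.
toy = pd.DataFrame(
    {
        "age": [22.0, np.nan, 35.0, 58.0],  # one missing value for the imputer
        "fare": [7.25, 71.28, 8.05, 26.55],
        "embarked": ["S", "C", "S", "Q"],
        "sex": ["male", "female", "female", "male"],
        "pclass": [3, 1, 3, 2],
    }
)

# Numeric columns: fill missing values with the median, then standardize.
toy_numeric = Pipeline(
    steps=[("imputer", SimpleImputer(strategy="median")), ("scaler", StandardScaler())]
)

# Categorical columns: one-hot encode, ignoring categories unseen during fit.
toy_preprocessor = ColumnTransformer(
    transformers=[
        ("num", toy_numeric, ["age", "fare"]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["embarked", "sex", "pclass"]),
    ]
)

transformed = toy_preprocessor.fit_transform(toy)

# 2 scaled numeric columns + 3 (embarked) + 2 (sex) + 3 (pclass) one-hot columns
print(transformed.shape)  # (4, 10)
```

Treating pclass as categorical here adds one column per class; listing it among the numeric columns instead would keep it as a single scaled column.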
Append the classifier to the preprocessing pipeline. Now we have a full prediction pipeline.
clf = Pipeline(
    steps=[("preprocessor", preprocessor), ("classifier", LogisticRegression())]
)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
Because Evaluate metrics take lists as inputs for references and predictions, we need to convert them to Python lists.
# Evaluate metrics accept lists as inputs for values of references and predictions
y_test = y_test.tolist()
y_pred = y_pred.tolist()
# Accuracy
accuracy_metric = evaluate.load("accuracy")
accuracy = accuracy_metric.compute(references=y_test, predictions=y_pred)
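# compute() returns a dict such as {"accuracy": ...}, so index it to get the number
print("Accuracy:", accuracy["accuracy"])
# Accuracy: roughly 0.79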
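As a sanity check on the list conversion, the sketch below converts a pandas Series and a NumPy array to plain Python lists and scores them with scikit-learn's own `accuracy_score`. The labels and predictions are hypothetical values chosen for illustration, not Titanic results. For the same inputs, the accuracy metric from evaluate would be expected to report the same number.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import accuracy_score

# Hypothetical labels (as a pandas Series) and predictions (as a NumPy array).
y_true_series = pd.Series([0, 1, 1, 0, 1])
y_pred_array = np.array([0, 1, 0, 0, 1])

# .tolist() turns both containers into plain Python lists of Python ints.
references = y_true_series.tolist()
predictions = y_pred_array.tolist()

# 4 of the 5 predictions match the labels.
sk_accuracy = accuracy_score(references, predictions)
print(sk_accuracy)  # 0.8
```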
You can use any suitable evaluate metric with the estimators, as long as it is compatible with the task and the predictions.