Evaluating AI Models Efficiently with scikit-learn

Author: @小白创作中心

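In machine learning, model evaluation is a key step in checking how well a model generalizes and how accurate its predictions are. scikit-learn, a powerful and easy-to-use machine learning library for Python, provides a rich set of evaluation tools. This article walks through model evaluation with scikit-learn, covering cross-validation, confusion matrices, and scoring metrics, and closes with hyperparameter tuning.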

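Cross-Validation

Cross-validation is a technique for estimating model performance: the dataset is split into several subsets, and the model is trained and tested multiple times to obtain a more stable and reliable estimate. scikit-learn provides several cross-validation strategies, including K-fold, leave-one-out, and stratified K-fold cross-validation.

K-Fold Cross-Validation

K-fold cross-validation is the most common approach. It splits the dataset into K subsets (folds) of roughly equal size and then runs K rounds of training and testing. In each round, one fold serves as the test set and the remaining K-1 folds form the training set. The average of the K results is reported as the model's performance estimate.

The following example implements K-fold cross-validation with scikit-learn: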

from sklearn.model_selection import KFold
from sklearn.datasets import load_iris
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Load the dataset
iris = load_iris()
X, y = iris.data, iris.target

# Set up the K-fold splitter (5 shuffled folds, fixed seed for repeatability)
kf = KFold(n_splits=5, shuffle=True, random_state=1)

# Collect the accuracy of each fold
accuracies = []

# Run K-fold cross-validation
for train_index, test_index in kf.split(X):
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]

    # Create and train a fresh SVM model for this fold
    model = SVC()
    model.fit(X_train, y_train)

    # Predict on the test fold and compute accuracy
    predictions = model.predict(X_test)
    accuracy = accuracy_score(y_test, predictions)
    accuracies.append(accuracy)

# Compute the average accuracy across folds
average_accuracy = sum(accuracies) / len(accuracies)
print(f"Average accuracy: {average_accuracy}")
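The manual loop above can usually be condensed with `cross_val_score`, which accepts the same `KFold` splitter through its `cv` argument. A minimal sketch, assuming the same Iris data and a default `SVC` as in the example above:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import KFold, cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Same splitter as in the manual loop: 5 shuffled folds with a fixed seed
kf = KFold(n_splits=5, shuffle=True, random_state=1)

# cross_val_score clones the estimator, fits it on each training split,
# and returns one score per fold (accuracy by default for classifiers)
scores = cross_val_score(SVC(), X, y, cv=kf)

print(f"Per-fold accuracy: {scores}")
print(f"Mean accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```

Reporting the standard deviation alongside the mean gives a rough sense of how much the estimate varies from fold to fold.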

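Leave-One-Out Cross-Validation

Leave-one-out cross-validation (LOOCV) is the special case of K-fold cross-validation in which K equals the number of samples. Each round holds out a single sample as the test set and trains on the rest. Because every test set contains only one sample, the per-round accuracy is either 0 or 1, so the example below reports the mean over all rounds:

from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import LeaveOneOut
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
model = SVC()
loo = LeaveOneOut()
accuracies = []

# Run leave-one-out cross-validation: one round per sample
for train_index, test_index in loo.split(X):
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]

    model.fit(X_train, y_train)
    predictions = model.predict(X_test)

    # Each test set holds a single sample, so this accuracy is either 0 or 1
    accuracies.append(accuracy_score(y_test, predictions))

print(f"LOOCV accuracy: {sum(accuracies) / len(accuracies):.3f}")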

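Stratified Cross-Validation

Stratified cross-validation is particularly suited to classification problems: it ensures that the class distribution in each fold approximately matches that of the original dataset. This is especially useful for imbalanced datasets.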
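To illustrate the idea, here is a minimal sketch using `StratifiedKFold`. The imbalanced labels (90 negatives, 10 positives) and the placeholder feature column are made up for illustration and do not come from a real dataset:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Hypothetical imbalanced labels: 90 negatives and 10 positives
y = np.array([0] * 90 + [1] * 10)
X = np.arange(len(y)).reshape(-1, 1)  # placeholder features

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)

fold_counts = []
# Unlike KFold, split() needs the labels so it can balance classes per fold
for train_index, test_index in skf.split(X, y):
    # Count how many samples of each class land in this test fold
    counts = np.bincount(y[test_index], minlength=2)
    fold_counts.append(counts.tolist())
    print(f"Test fold class counts: {counts}")
```

Each test fold should keep roughly the 9:1 class ratio of the full label vector, whereas a plain `KFold` could leave some folds with very few positives or none at all.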

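Confusion Matrix

A confusion matrix is an important tool for evaluating classification models: it shows how the model's predictions are distributed across the true classes. scikit-learn provides the confusion_matrix function to compute it.

Binary Confusion Matrix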

from sklearn.metrics import confusion_matrix
import numpy as np

# Binary classification
y_true = np.array([0, 1, 0, 1, 0, 1, 1, 0, 0, 1])  # true labels
y_pred = np.array([0, 0, 0, 1, 0, 1, 1, 1, 0, 1])  # predicted labels

# Rows correspond to true classes, columns to predicted classes
cm = confusion_matrix(y_true, y_pred)

print("Confusion Matrix:")
print(cm)

# Output:
# Confusion Matrix:
# [[4 1]
#  [1 4]]
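For a binary problem with labels 0 and 1, the four cells can be unpacked directly into true negatives, false positives, false negatives, and true positives. A short sketch reusing the labels from the example above:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

y_true = np.array([0, 1, 0, 1, 0, 1, 1, 0, 0, 1])
y_pred = np.array([0, 0, 0, 1, 0, 1, 1, 1, 0, 1])

# Rows are true classes and columns are predictions, so flattening the
# 2x2 matrix row by row yields the counts in the order TN, FP, FN, TP
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"TN={tn}, FP={fp}, FN={fn}, TP={tp}")  # TN=4, FP=1, FN=1, TP=4
```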

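Multiclass Confusion Matrix

from sklearn.metrics import confusion_matrix
import numpy as np

# Multiclass classification
y_true = np.array([0, 1, 2, 1, 0, 2, 1, 0, 0, 1])  # true labels
y_pred = np.array([0, 2, 1, 1, 0, 2, 1, 0, 0, 1])  # predicted labels

cm = confusion_matrix(y_true, y_pred)

print("Confusion Matrix:")
print(cm)

# Output:
# Confusion Matrix:
# [[4 0 0]
#  [0 3 1]
#  [0 1 1]]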

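Scoring Metrics

scikit-learn provides a range of scoring metrics for evaluating model performance. Besides the familiar accuracy, these include precision, recall, and the F1 score. Which metric to use depends on the problem at hand.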
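As a small illustration, the sketch below computes these four metrics on the binary labels from the confusion-matrix example above, treating class 1 as the positive class (the default for these functions with binary labels):

```python
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

y_true = [0, 1, 0, 1, 0, 1, 1, 0, 0, 1]  # true labels
y_pred = [0, 0, 0, 1, 0, 1, 1, 1, 0, 1]  # predicted labels

accuracy = accuracy_score(y_true, y_pred)    # (TP + TN) / all samples
precision = precision_score(y_true, y_pred)  # TP / (TP + FP)
recall = recall_score(y_true, y_pred)        # TP / (TP + FN)
f1 = f1_score(y_true, y_pred)                # harmonic mean of precision and recall

print(f"accuracy={accuracy:.2f}, precision={precision:.2f}, "
      f"recall={recall:.2f}, f1={f1:.2f}")
```

With TP = 4, FP = 1, and FN = 1, precision and recall are both 4/5, so all four metrics happen to coincide at 0.80 here. On imbalanced data they typically diverge, which is when the choice of metric matters most.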

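Custom Scorers

Sometimes we need a scoring function tailored to a specific requirement. scikit-learn supports this through the make_scorer function, which wraps a plain metric function so that it can be passed to tools such as cross_val_score. The following example builds a custom scorer: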

import numpy as np
from sklearn.metrics import make_scorer
from sklearn.datasets import fetch_california_housing
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

# Define the custom scoring function (coefficient of determination, R^2)
def r_squared(y_true, y_pred):
    mean_y_true = np.mean(y_true)
    ss_res = np.sum((y_true - y_pred) ** 2)       # residual sum of squares
    ss_tot = np.sum((y_true - mean_y_true) ** 2)  # total sum of squares
    r2 = 1 - (ss_res / ss_tot)
    return r2

# Wrap it in a scorer object (named r2_scorer to avoid confusion with
# the built-in sklearn.metrics.r2_score function)
r2_scorer = make_scorer(r_squared)

# Load the dataset (downloaded on first use)
X, y = fetch_california_housing(return_X_y=True)

# Create the model
model = RandomForestRegressor()

# Cross-validate using the custom scorer
scores = cross_val_score(model, X, y, cv=5, scoring=r2_scorer)

# Print the result
print(f"R^2: {scores.mean():.2f} +/- {scores.std():.2f}")
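When the custom function is a loss rather than a score (lower is better), `make_scorer` takes `greater_is_better=False`. The sketch below is illustrative only: the mean-absolute-error helper, the synthetic data, and the use of `LinearRegression` (chosen so the example runs quickly) are assumptions, not part of the example above.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import make_scorer
from sklearn.model_selection import cross_val_score

def mean_abs_error(y_true, y_pred):
    """Mean absolute error: lower is better."""
    return np.mean(np.abs(y_true - y_pred))

# greater_is_better=False marks this as a loss, so the scorer returns the
# negated value and scikit-learn's "higher is better" convention still holds
mae_scorer = make_scorer(mean_abs_error, greater_is_better=False)

# Hypothetical synthetic data: y = 3x + Gaussian noise
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = 3 * X[:, 0] + rng.normal(0, 1, size=200)

scores = cross_val_score(LinearRegression(), X, y, cv=5, scoring=mae_scorer)

# Flip the sign back when reporting the error
print(f"MAE: {-scores.mean():.2f} +/- {scores.std():.2f}")
```

The returned scores are negative, which is why the sign is flipped before printing.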

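Hyperparameter Tuning

To improve model performance, we usually need to tune the model's hyperparameters. scikit-learn itself ships grid search and randomized search. Bayesian optimization and genetic algorithms are available through third-party libraries such as scikit-optimize (skopt) and DEAP, both of which are used below.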

Grid Search

Grid search exhaustively evaluates every combination in a predefined parameter grid and selects the combination with the best performance.

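from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# Load the data and hold out a test set
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1)

param_grid = {'C': [0.1, 1, 10], 'kernel': ['linear', 'rbf']}
grid_search = GridSearchCV(SVC(), param_grid, cv=5)
grid_search.fit(X_train, y_train)
print(grid_search.best_params_)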
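Beyond `best_params_`, a fitted search object exposes the cross-validated score of the winning combination and can be scored on held-out data. A minimal sketch, assuming the Iris data and a stratified 80/20 split (the split settings are illustrative choices):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=1
)

param_grid = {'C': [0.1, 1, 10], 'kernel': ['linear', 'rbf']}
grid_search = GridSearchCV(SVC(), param_grid, cv=5)
grid_search.fit(X_train, y_train)

# Mean cross-validated accuracy of the best parameter combination
print(f"Best CV accuracy: {grid_search.best_score_:.3f}")

# With refit=True (the default) the best model is retrained on all of
# X_train, so it can be scored directly on the held-out test set
test_accuracy = grid_search.score(X_test, y_test)
print(f"Test accuracy: {test_accuracy:.3f}")

# cv_results_ holds one entry per parameter combination (3 x 2 = 6 here)
n_candidates = len(grid_search.cv_results_['params'])
print(f"Combinations evaluated: {n_candidates}")
```

Scoring on data that played no part in the search gives a less optimistic estimate than `best_score_`, since the latter was used to pick the winner.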

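Random Search

Random search samples parameter combinations at random from a predefined parameter space. For a fixed evaluation budget it is usually faster than grid search, particularly when the parameter space is large.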

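from scipy.stats import uniform
from sklearn.datasets import load_iris
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1)

# uniform(loc, scale) samples from [loc, loc + scale], so this covers C in [0.1, 10]
param_dist = {'C': uniform(loc=0.1, scale=9.9), 'kernel': ['linear', 'rbf']}
random_search = RandomizedSearchCV(SVC(), param_dist, n_iter=10, cv=5, random_state=1)
random_search.fit(X_train, y_train)
print(random_search.best_params_)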
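For scale-like parameters such as C, sampling on a logarithmic scale is a common alternative to a uniform range. The sketch below uses `scipy.stats.loguniform`; the bounds 0.01 to 100 are an illustrative assumption, and the search is run on the full Iris data for brevity:

```python
from scipy.stats import loguniform
from sklearn.datasets import load_iris
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# loguniform(a, b) samples so that log(C) is uniform on [log(a), log(b)],
# giving small and large values of C a similar chance of being tried
param_dist = {'C': loguniform(1e-2, 1e2), 'kernel': ['linear', 'rbf']}

random_search = RandomizedSearchCV(
    SVC(), param_dist, n_iter=10, cv=5, random_state=1
)
random_search.fit(X, y)
print(random_search.best_params_)
```

Fixing `random_state` makes the sampled candidates repeatable between runs.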

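Bayesian Optimization

Bayesian optimization builds a surrogate model (such as a Gaussian process) to predict the performance of different parameter combinations, and picks the most promising combination to evaluate next.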

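# BayesSearchCV comes from the third-party scikit-optimize package (skopt)
from skopt import BayesSearchCV
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1)

# A (low, high) tuple defines a continuous range; a list defines categories
param_space = {'C': (0.1, 10.0), 'kernel': ['linear', 'rbf']}
bayes_search = BayesSearchCV(SVC(), param_space, n_iter=10, cv=5)
bayes_search.fit(X_train, y_train)
print(bayes_search.best_params_)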

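Genetic Algorithms

Genetic algorithms mimic natural selection and inheritance, searching the parameter space for good solutions through operations such as crossover and mutation.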

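# DEAP is a third-party evolutionary computation library
from deap import base, creator, tools, algorithms
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.svm import SVC
import random

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1)

def eval_params(individual):
    # An individual is a list [C, kernel]; DEAP expects the fitness as a tuple
    C, kernel = individual
    model = SVC(C=C, kernel=kernel)
    score = cross_val_score(model, X_train, y_train, cv=5).mean()
    return score,

# Maximize a single objective (mean cross-validated accuracy)
creator.create("FitnessMax", base.Fitness, weights=(1.0,))
creator.create("Individual", list, fitness=creator.FitnessMax)

toolbox = base.Toolbox()
toolbox.register("attr_C", random.uniform, 0.1, 10)
toolbox.register("attr_kernel", random.choice, ['linear', 'rbf'])
toolbox.register("individual", tools.initCycle, creator.Individual,
                 (toolbox.attr_C, toolbox.attr_kernel), n=1)
toolbox.register("population", tools.initRepeat, list, toolbox.individual)

# Define the remaining genetic operators and parameters
# ...

# Run the genetic algorithm
# ...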
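Since the DEAP example leaves the operators and the main loop elided, the following sketch shows the overall shape of such a search without DEAP, using only the standard library plus scikit-learn. It is a hand-rolled illustration, not the elided DEAP code: every function name, the population size, the number of generations, and the mutation rate are assumptions chosen to keep the run short.

```python
import random

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
KERNELS = ['linear', 'rbf']
rng = random.Random(0)  # fixed seed so runs are repeatable

def random_individual():
    """An individual is a (C, kernel) pair."""
    return (rng.uniform(0.1, 10), rng.choice(KERNELS))

def fitness(individual):
    """Mean 5-fold cross-validated accuracy of an SVC with these parameters."""
    C, kernel = individual
    return cross_val_score(SVC(C=C, kernel=kernel), X, y, cv=5).mean()

def crossover(parent_a, parent_b):
    """The child takes C from one parent and the kernel from the other."""
    return (parent_a[0], parent_b[1])

def mutate(individual, rate=0.3):
    """Independently with probability `rate`: rescale C (clipped to
    [0.1, 10]) and resample the kernel."""
    C, kernel = individual
    if rng.random() < rate:
        C = min(10.0, max(0.1, C * rng.uniform(0.5, 2.0)))
    if rng.random() < rate:
        kernel = rng.choice(KERNELS)
    return (C, kernel)

population = [random_individual() for _ in range(6)]
for generation in range(3):
    # Rank by fitness and keep the better half as parents (truncation selection)
    ranked = sorted(population, key=fitness, reverse=True)
    parents = ranked[:3]
    # Keep the parents (elitism) and refill the population with mutated children
    children = [mutate(crossover(rng.choice(parents), rng.choice(parents)))
                for _ in range(len(population) - len(parents))]
    population = parents + children

best = max(population, key=fitness)
print(f"Best parameters: C={best[0]:.3f}, kernel={best[1]}")
```

A library such as DEAP offers ready-made selection, crossover, and mutation operators plus statistics tracking. The loop here only makes the select, recombine, and mutate cycle explicit.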

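With the methods above, we can evaluate and optimize the performance of AI models effectively. scikit-learn's tooling makes model evaluation and tuning simpler and more efficient, and mastering these techniques will help us achieve better results in AI projects.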