Training and Optimizing a Hand Keypoint Detection Model Based on YOLOv8-pose

Author: @小白创作中心

Source: CSDN, https://blog.csdn.net/qq_40387714/article/details/141254361

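This article describes how to train a hand keypoint detection model based on YOLOv8-pose and analyzes the training results in detail, covering accuracy evaluation, loss analysis, and visualized results. Adjusting the training parameters and the optimization strategy can significantly improve the model's detection quality.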

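Preface

This post trains a YOLOv8-pose hand keypoint detection model and analyzes the training results in order to tune the training hyperparameters.

Hand keypoint detection dataset: Hand Keypoint Detection Based on YOLOv8-pose (1) - Obtaining the Hand Keypoint Dataset (dataset download, data cleaning, processing, and augmentation)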

1. Training parameter settings

1.1 data.yaml

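Same as in the hand-detection setup, with one addition: kpt_shape: [21, 2], that is, 21 keypoints with an x and a y coordinate each.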
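
One consequence of kpt_shape: [21, 2] is that every label row should carry a fixed number of values: one class id, four box values, and 21 (x, y) pairs. The sketch below checks this. It assumes the usual YOLO pose label layout (class, box, then keypoints, all normalized); the function names and the sample row are illustrative and do not come from the original post.

```python
# Minimal sketch: check that a YOLO pose label row matches kpt_shape.
# Assumed layout: class cx cy w h k1x k1y ... k21x k21y, normalized to [0, 1].
# Function names and the sample row are hypothetical, for illustration only.

KPT_SHAPE = (21, 2)  # 21 hand keypoints, 2 values (x, y) each


def expected_columns(kpt_shape):
    """Number of values per label row: 1 class id + 4 box values + keypoints."""
    num_kpts, dims = kpt_shape
    return 1 + 4 + num_kpts * dims


def is_valid_row(row, kpt_shape=KPT_SHAPE):
    """Check row length and coordinate range (range check assumes 2-D keypoints)."""
    values = row.split()
    if len(values) != expected_columns(kpt_shape):
        return False
    coords = [float(v) for v in values[1:]]
    return all(0.0 <= c <= 1.0 for c in coords)


# Hypothetical row: class 0, a centered box, and 21 keypoints all at (0.5, 0.5).
sample_row = "0 0.5 0.5 0.6 0.6 " + " ".join(["0.5 0.5"] * 21)
print(expected_columns(KPT_SHAPE))  # 47
print(is_valid_row(sample_row))     # True
```

Running a check like this over the label files before training can catch rows that were truncated during data cleaning.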

1.2 setting.yaml

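As in the hand-detection setup, the end-of-training mosaic shutdown must be disabled (close_mosaic: 0), and the mosaic probability must be set to 1, so that mosaic augmentation stays active for the entire run.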

# Train settings -------------------------------------------------------------------------------------------------------
task: pose              # (str) YOLO task, i.e. detect, segment, classify, pose
mode: train             # (str) YOLO mode, i.e. train, val, predict, export, track, benchmark
data: ./hand-pose.yaml  # (str, optional) path to data file, i.e. coco8.yaml
epochs: 500             # (int) number of epochs to train for
batch: 256              # (int) number of images per batch (-1 for AutoBatch)
imgsz: 480              # (int | list) input images size as int for train and val modes, or list[w,h] for predict and export modes
patience: 300           # (int) epochs to wait for no observable improvement for early stopping of training
device: 7               # (int | str | list, optional) device to run on, i.e. cuda device=0 or device=0,1,2,3 or device=cpu
project: ./             # (str, optional) project name
multi_scale: True       # (bool) Whether to use multiscale during training
close_mosaic: 0         # (int) disable mosaic augmentation for final epochs (0 to disable)
resume: True            # (bool) resume training from last checkpoint
# Hyperparameters ------------------------------------------------------------------------------------------------------
lr0: 0.01               # (float) initial learning rate (i.e. SGD=1E-2, Adam=1E-3)
lrf: 0.01               # (float) final learning rate (lr0 * lrf)
box: 8.0                # (float) box loss gain
cls: 0.5                # (float) cls loss gain (scale with pixels)
dfl: 1.5                # (float) dfl loss gain
pose: 14.0              # (float) pose loss gain
kobj: 0.0               # (float) keypoint obj loss gain
degrees: 0.0            # (float) image rotation (+/- deg)
translate: 0.1          # (float) image translation (+/- fraction)
scale: 0.5              # (float) image scale (+/- gain)
fliplr: 0.5             # (float) image flip left-right (probability)
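
The effect of close_mosaic: 0 in the settings above can be pictured with a small model. This is a simplified sketch, not the library's code: it assumes mosaic is switched off once close_mosaic epochs remain and that 0 means it is never switched off. The function name is hypothetical.

```python
# Simplified model of how close_mosaic selects the epochs that use mosaic.
# Assumption: mosaic is switched off once `close_mosaic` epochs remain,
# and a value of 0 means it is never switched off.


def mosaic_epochs(epochs, close_mosaic):
    """Return the 0-based epoch indices that still use mosaic augmentation."""
    if close_mosaic <= 0:
        return list(range(epochs))
    return list(range(max(epochs - close_mosaic, 0)))


# With the settings above (epochs=500, close_mosaic=0), every epoch keeps mosaic.
print(len(mosaic_epochs(500, 0)))   # 500
# With a hypothetical close_mosaic=10, the last 10 epochs would train without it.
print(len(mosaic_epochs(500, 10)))  # 490
```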

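As shown in the figure below, the bounding boxes in the handpose dataset are almost all close to [0.5, 0.5, 0.6, 0.6] in normalized (center x, center y, width, height) form. If mosaic augmentation is turned off, the network tends to learn to output these fixed values instead of locating the box boundaries from appearance features.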
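
A quick way to check for this position bias is to summarize the boxes in the label files: a mean near [0.5, 0.5, 0.6, 0.6] with a small standard deviation would confirm it. The sketch below uses only the Python standard library. The rows are made-up examples, not data from the handpose dataset, and the function name is hypothetical.

```python
# Minimal sketch: summarize bounding boxes from YOLO-format label rows.
# The rows below are made-up examples, not taken from the handpose dataset.
from statistics import mean, pstdev


def bbox_summary(rows):
    """Return per-field (mean, std) for cx, cy, w, h over all label rows."""
    boxes = [[float(v) for v in row.split()[1:5]] for row in rows]
    fields = ("cx", "cy", "w", "h")
    columns = zip(*boxes)  # transpose: one tuple of values per field
    return {name: (mean(col), pstdev(col)) for name, col in zip(fields, columns)}


# Hypothetical labels; only the class id and the box are shown, keypoints omitted.
rows = [
    "0 0.50 0.50 0.60 0.60",
    "0 0.52 0.49 0.58 0.62",
    "0 0.48 0.51 0.62 0.58",
]
for name, (avg, std) in bbox_summary(rows).items():
    print(f"{name}: mean={avg:.2f} std={std:.3f}")
```

In practice the rows would be read from the dataset's label files rather than hard-coded.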