Visualizing the Gradient Descent Process

Author: @小白创作中心

Source: CSDN

https://blog.csdn.net/weixin_43589323/article/details/137237632

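Gradient descent is a fundamental optimization algorithm in machine learning and deep learning. This post uses a custom visualization tool to show how gradient descent behaves under different learning rates, to help readers understand the core idea of the algorithm.

I recently read some material on gradient descent. Many tutorials and code samples visualize the process, but they did not feel intuitive enough to me, mainly because they lack arrows showing how each iterate moves to the next. So I wrote a small class that differentiates the objective automatically and visualizes the gradient descent process.

First, a look at some of the resulting plots: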
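To make "automatic differentiation" concrete, here is a minimal sketch of a single gradient descent step with PyTorch autograd. The objective f(x) = x^2, the start value 3.0 and the learning rate 0.1 are illustrative assumptions and are not taken from the plots in this post.

```python
import torch

# Illustrative values (assumptions): f(x) = x**2, start at 3.0, eta = 0.1.
x = torch.tensor([3.0], requires_grad=True)
eta = 0.1

y = x ** 2              # forward pass: evaluate the objective
y.backward()            # backward pass: autograd stores f'(x) = 2x in x.grad
grad = x.grad.item()    # derivative at the start point

with torch.no_grad():   # the update itself must not be tracked by autograd
    x -= eta * x.grad   # gradient descent update: x <- x - eta * f'(x)
x.grad.zero_()          # gradients accumulate, so reset before the next step

new_x = x.item()
print(grad, new_x)
```

Repeating these four steps (forward, backward, update, reset) is all the visualization class further down does while it records each position.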

Visualizing f(x) = x^2

Learning rate η = 0.1: slow convergence

Learning rate η = 0.5: good convergence

Learning rate η = 0.9: oscillating convergence
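The three behaviours above can be checked by hand. For f(x) = x^2 the derivative is 2x, so each update is x ← x − 2ηx = (1 − 2η)x. The factor is 0.8 for η = 0.1 (slow shrinkage), 0 for η = 0.5 (the minimum is reached in one step), and −0.8 for η = 0.9 (the sign flips on every step while the magnitude shrinks). The sketch below reproduces this in plain Python; the function name and the start value 1.0 are assumptions chosen for illustration.

```python
def gradient_descent_quadratic(x0: float, eta: float, steps: int) -> list[float]:
    """Run gradient descent on f(x) = x**2, whose derivative is 2*x."""
    xs = [x0]
    for _ in range(steps):
        xs.append(xs[-1] - eta * 2 * xs[-1])  # x <- x - eta * f'(x)
    return xs

# Start value 1.0 is an illustrative assumption.
for eta in (0.1, 0.5, 0.9):
    trace = gradient_descent_quadratic(x0=1.0, eta=eta, steps=3)
    print(eta, [round(x, 3) for x in trace])
```

More generally, any η between 0 and 1 converges for this objective because |1 − 2η| < 1, and values above 0.5 do so with alternating sign.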

Visualizing f(x) = x·cos(0.15πx)

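Learning rate η = 0.1, starting at 15: converges to a local minimum

Learning rate η = 0.2, starting at -8: converges to a local minimum

Learning rate η = 3, starting at 15: oscillates and ends up near the origin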
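The dependence on the starting point and the learning rate can also be explored without plotting. The following is a rough sketch that assumes the objective is f(x) = x·cos(cx) with c = 0.15π, as defined in the usage code at the end of this post, and uses a hand-derived derivative instead of autograd. The helper names `f_prime` and `descend` are hypothetical.

```python
import math

C = 0.15 * math.pi  # constant taken from the usage code in this post

def f(x: float) -> float:
    """Objective assumed here: f(x) = x * cos(C * x)."""
    return x * math.cos(C * x)

def f_prime(x: float) -> float:
    """Hand-derived derivative: cos(C*x) - C*x*sin(C*x)."""
    return math.cos(C * x) - C * x * math.sin(C * x)

def descend(x0: float, eta: float, steps: int = 10) -> list[float]:
    """Return the iterates x_0, ..., x_steps of plain gradient descent."""
    xs = [x0]
    for _ in range(steps):
        xs.append(xs[-1] - eta * f_prime(xs[-1]))
    return xs

# The three (learning rate, start) pairs from the captions above.
for eta, x0 in [(0.1, 15.0), (0.2, -8.0), (3.0, 15.0)]:
    trace = descend(x0, eta)
    print(f"eta={eta}, start={x0}, end={trace[-1]:.3f}, f(end)={f(trace[-1]):.3f}")
```

Printing the end points side by side makes it easy to see that different starts settle in different valleys of a non-convex objective.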

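Summary

How the objective function is minimized depends on both the starting value and the learning rate (the step size of each iteration). For convergence to be fast and accurate, the initial setup, meaning the starting point and the learning rate, deserves careful attention.

The source code for the plots follows.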

import torch
from torch import Tensor
from typing import Callable
import matplotlib.pyplot as plt
from matplotlib_inline import backend_inline

def use_svg_display():
    """Use the svg format to display a plot in Jupyter."""
    backend_inline.set_matplotlib_formats('svg')

def set_figsize(figsize=(4.5, 3.5)):
    """Set the figure size for matplotlib."""
    use_svg_display()
    plt.rcParams['figure.figsize'] = figsize

class GradientDesenctVisualization(object):
    def __init__(self, func: Callable, 
                 steps: int=10, 
                 eta: float=0.1, 
                 init_point: Tensor=None):
        """visualize gradient descent progress
        Args:
            func (Callable): objective function
            steps (int, optional): total steps to perform gradient descent. Defaults to 10.
            eta (float, optional): learning rate. Defaults to 0.1.
            init_point (Tensor, optional): start point of the progress. Defaults to None.
        """
        self.func = func
        self.steps = steps
        self.eta = eta
        self.init_point = init_point
        if self.init_point is None:     # if not given, initialize with normal distribution
            self.init_point = torch.randn(1, requires_grad=True)
        assert self.init_point.requires_grad is True
        
     
    def evoluation(self):
        """Run gradient descent and record every visited point in self.x_evoluation.
        """
        x = self.init_point
        eta = self.eta
        func = self.func
        steps = self.steps
        
        # record the trajectory of x, starting from the initial point
        x_evoluation = [x.item()]
        for _ in range(steps):
            y = func(x)         # forward pass: compute the objective
            y.backward()        # backward pass: autograd fills x.grad
            with torch.no_grad():
                x -= eta * x.grad   # gradient descent update, not tracked by autograd
            x.grad.zero_()      # clear the accumulated gradient
            x_evoluation.append(x.item()) # record the new position
        self.x_evoluation = x_evoluation
    
    def show_trace(self, bound: float=None):
        """plot gradient descent progress
        """
        x_evoluation = self.x_evoluation
        f = self.func
        
        # get bound of x_evoluation 
        if bound is None:
            bound = max(abs(min(x_evoluation)), abs(max(x_evoluation)))
        f_line = torch.arange(-bound, bound, 0.01)
        set_figsize()
        
        # plot graph of objective function
        plt.plot(f_line.tolist(), [float(f(x)) for x in f_line], '-', c='b')
        
        # plot the gradient descent progress; wrap each recorded float in a
        # tensor so that objectives built from torch functions also work
        points = [(x, float(f(torch.tensor(x)))) for x in x_evoluation]
        # use annotate to plot the arrow
        for i in range(len(points) - 1):
            plt.gca().annotate("", xy=points[i+1], xytext=points[i],
                    arrowprops=dict(arrowstyle="->", lw=1.0, fc="red", ec="red"))
            
        # plot the start position of the gradient descent progress
        ax1 = plt.plot(points[0][0], points[0][1], 'ro')
        
        # plot the final position of the gradient descent progress
        ax2 = plt.plot(points[-1][0], points[-1][1], 'go')
        
        plt.legend([ax1[0], ax2[0]], ['start', 'end'])
        
        plt.title(rf'$\eta$ = {self.eta:.3f}')  # raw string keeps \eta from being read as an escape
        plt.xlim(-bound, bound)
        plt.xlabel('x')
        plt.ylabel('func')
        plt.grid()
        plt.show()

Usage

import torch
c = torch.tensor(0.15 * torch.pi)
def func(x):  # objective function
    return x * torch.cos(c * x)
steps = 10  # number of iterations
eta = 0.2   # learning rate
init_point = torch.tensor([-8.0], requires_grad=True) # starting position
# create the visualizer
gdv = GradientDesenctVisualization(func, steps, eta, init_point)
# run gradient descent
gdv.evoluation()
# plot the process
gdv.show_trace()
