A Detailed Guide to Using the softmax Function in PyTorch

Author: @小白创作中心

Source

CSDN

https://blog.csdn.net/weixin_45646640/article/details/129696487

softmax

softmax is also known as the normalized exponential function. PyTorch provides it in two forms, a module class and a functional call:

torch.nn.Softmax(dim=None)

and

torch.nn.functional.softmax(input, dim=None, _stacklevel=3, dtype=None)
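To make the name "normalized exponential" concrete, here is a minimal sketch that computes softmax by hand and compares it with the built-in. The helper name manual_softmax is hypothetical and used only for illustration; subtracting the maximum is a common stability trick, assumed here rather than taken from the original post.

```python
import torch


def manual_softmax(x: torch.Tensor, dim: int) -> torch.Tensor:
    """Hypothetical helper: exponentiate, then divide by the sum along `dim`."""
    # Subtracting the max along `dim` avoids overflow and does not change the result
    shifted = x - x.max(dim=dim, keepdim=True).values
    exp = torch.exp(shifted)
    return exp / exp.sum(dim=dim, keepdim=True)


x = torch.tensor([[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]])
manual = manual_softmax(x, dim=1)
builtin = torch.softmax(x, dim=1)

# The two results should agree up to floating-point error
print(torch.allclose(manual, builtin))
```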

torch.nn.Softmax

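torch.nn.Softmax takes a single parameter, dim, which specifies the dimension along which the values are normalized. For a 2-D tensor, dim=0 normalizes down the rows, so each column sums to 1, while dim=1 normalizes across the columns, so each row sums to 1. A 1-D tensor has only dim=0.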

import torch
import torch.nn as nn

input_0 = torch.Tensor([1, 2, 3, 4])
input_1 = torch.Tensor([[1, 2, 3, 4], [5, 6, 7, 8]])

# Parameter --- dim
softmax_0 = nn.Softmax(dim=0)
softmax_1 = nn.Softmax(dim=1)

# Output tensors
output_0 = softmax_0(input_0)  # dim=0
output_1 = softmax_1(input_1)  # dim=1
output_2 = softmax_0(input_1)  # dim=0

# Print
print(output_0)
print(output_1)
print(output_2)

Output:

tensor([0.0321, 0.0871, 0.2369, 0.6439])
tensor([[0.0321, 0.0871, 0.2369, 0.6439],
        [0.0321, 0.0871, 0.2369, 0.6439]])
tensor([[0.0180, 0.0180, 0.0180, 0.0180],
        [0.9820, 0.9820, 0.9820, 0.9820]])
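One way to confirm which dimension is being normalized is to sum the output along that same dimension. The sketch below reuses the 2x4 input from the example above; the variable names are chosen for illustration only.

```python
import torch
import torch.nn as nn

x = torch.tensor([[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]])

out_dim0 = nn.Softmax(dim=0)(x)  # normalize along dim 0
out_dim1 = nn.Softmax(dim=1)(x)  # normalize along dim 1

# Summing along the normalized dimension should give values close to 1
col_sums = out_dim0.sum(dim=0)  # one sum per column, shape (4,)
row_sums = out_dim1.sum(dim=1)  # one sum per row, shape (2,)
print(col_sums)
print(row_sums)

# For a 2-D tensor, dim=-1 refers to the same dimension as dim=1
out_last = nn.Softmax(dim=-1)(x)
print(torch.equal(out_last, out_dim1))
```

With dim=0 the columns sum to 1, and with dim=1 the rows sum to 1, which matches the printed tensors above.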

torch.nn.functional.softmax

In addition to dim, torch.nn.functional.softmax takes an input parameter: the tensor to be normalized.

import torch
import torch.nn.functional as F

input_0 = torch.Tensor([1, 2, 3, 4])
input_1 = torch.Tensor([[1, 2, 3, 4], [5, 6, 7, 8]])

output_0 = F.softmax(input_0, dim=0)  # pass dim explicitly; omitting it is deprecated and triggers a warning
output_1 = F.softmax(input_1, dim=0)
output_2 = F.softmax(input_1, dim=1)

print(output_0)
print(output_1)
print(output_2)

Output:

tensor([0.0321, 0.0871, 0.2369, 0.6439])
tensor([[0.0180, 0.0180, 0.0180, 0.0180],
        [0.9820, 0.9820, 0.9820, 0.9820]])
tensor([[0.0321, 0.0871, 0.2369, 0.6439],
        [0.0321, 0.0871, 0.2369, 0.6439]])
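The module and functional forms compute the same thing, so the choice is mostly about style. The following sketch checks that on the same input, and also illustrates the optional dtype argument from the functional signature, which casts the input to the given type before the operation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.tensor([[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]])

module_out = nn.Softmax(dim=1)(x)     # layer object, handy inside nn.Sequential
functional_out = F.softmax(x, dim=1)  # plain function call, handy inside forward()
print(torch.allclose(module_out, functional_out))

# dtype casts the input before softmax is applied, so the result is float64 here
out_f64 = F.softmax(x, dim=1, dtype=torch.float64)
print(out_f64.dtype)
```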

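For example, in a simple ConvNet-based neural network, the last layer returns logits: raw scores that can take any value in (-∞, ∞). A softmax function or layer maps them to [0, 1] so that they represent the model's predicted probability for each class. The dim parameter indicates the dimension along which the values must sum to 1.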

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms

print("Pytorch Version: ", torch.__version__)
import numpy as np
import matplotlib.pyplot as plt

# First, define a simple ConvNet-based neural network
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 20, 5, 1)
        self.conv2 = nn.Conv2d(20, 50, 5, 1)
        self.fc1 = nn.Linear(4 * 4 * 50, 500)
        self.fc2 = nn.Linear(500, 10)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.max_pool2d(x, 2, 2)
        x = F.relu(self.conv2(x))
        x = F.max_pool2d(x, 2, 2)
        x = x.view(-1, 4 * 4 * 50)  # equivalent to reshape: flatten the feature maps into vectors
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)  # log-probabilities (the log of the softmax output), not probabilities

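# Run a dummy batch of two 1x28x28 inputs (28x28 is the size implied by fc1's 4 * 4 * 50 input)
model = Net()
log_probs = model(torch.randn(2, 1, 28, 28))

# forward() already applies log_softmax; softmax is unchanged by that per-row shift,
# so applying it over dim=1 still yields probabilities that sum to 1 per sample
softmax = nn.Softmax(dim=1)
pred_probab = softmax(log_probs)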
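Since the network returns log_softmax output while the text talks about probabilities, it may help to see how the two relate. The sketch below uses a hypothetical batch of random logits (3 samples, 10 classes) in place of a trained model's output.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(3, 10)  # hypothetical raw scores: 3 samples, 10 classes

probs = F.softmax(logits, dim=1)          # probabilities, each row sums to 1
log_probs = F.log_softmax(logits, dim=1)  # log of those probabilities

# Exponentiating the log-probabilities recovers the probabilities
print(torch.allclose(log_probs.exp(), probs, atol=1e-6))

# softmax preserves ordering, so the predicted class is the same either way
pred_from_probs = probs.argmax(dim=1)
pred_from_logits = logits.argmax(dim=1)
print(torch.equal(pred_from_probs, pred_from_logits))
```

In practice, log_softmax is usually paired with a negative log-likelihood loss such as nn.NLLLoss during training, and probabilities are computed only when they need to be inspected.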