Loss Function

Overview

Table of Contents

Classification

  • Cross-Entropy
  • NLL
  • Hinge
  • Kullback-Leibler

Regression

  • MAE (L1)
  • MSE (L2)
  • Huber

Metric Learning

  • Dice
  • Contrastive
  • N-pair
  • Triplet (Margin-based Loss)

Brief Description

A loss function measures the "distance" between the answer the model predicts and the true answer.

General

Cross Entropy Loss

Cross-entropy loss, or log loss, measures the performance of a classification model whose output is a probability value between 0 and 1.

Cross-entropy loss increases as the predicted probability diverges from the actual label. A perfect model would have a log loss of 0.

Formula ($K$ classes, $\mathbf{y}$ is a one-hot vector, $\log$ is the natural log):

$$ \operatorname{CE}(\mathbf{y}, \hat{\mathbf{y}}) = -\sum_{i=1}^K y_i \log(\hat{y}_i) $$
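
For example, with $K = 3$, $\mathbf{y} = (0, 1, 0)$, and $\hat{\mathbf{y}} = (0.2, 0.7, 0.1)$, only the true-class term survives:

$$ \operatorname{CE} = -\log(0.7) \approx 0.357 $$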

Binary Classification Problem

The final layer of the network should use a sigmoid activation.

(The binary special case of cross-entropy, with true label $y \in \{0, 1\}$ and predicted probability $\hat{y}$)

$$ \operatorname{CE}(y, \hat{y}) = -\sum_{i=1}^2 y_i \log(\hat{y}_i) = -\left[ y \log(\hat{y}) + (1 - y) \log(1 - \hat{y}) \right] $$
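
A minimal PyTorch sketch of the binary case (variable names are illustrative); `F.binary_cross_entropy_with_logits` fuses the sigmoid and the loss for numerical stability:

```python
import torch
import torch.nn.functional as F

logits = torch.randn(4)                   # raw model outputs for 4 examples
targets = torch.tensor([1., 0., 1., 1.])  # binary labels as floats

# explicit two-step version: sigmoid, then binary cross-entropy
probs = torch.sigmoid(logits)
loss_two_step = F.binary_cross_entropy(probs, targets)

# fused version, numerically more stable
loss_fused = F.binary_cross_entropy_with_logits(logits, targets)

print(loss_two_step, 'should be equal to', loss_fused)
```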

Multi-class Classification Problem

The final layer of the network should use a softmax activation.

Mean Square Error Loss
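
For $n$ predictions, the standard definition is:

$$ \operatorname{MSE}(\mathbf{y}, \hat{\mathbf{y}}) = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 $$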

Negative Log Likelihood Loss
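
NLL loss takes log-probabilities as input and returns the negative log-probability assigned to the true class: $\operatorname{NLL}(\log \hat{\mathbf{y}}, y) = -\log \hat{y}_y$. Applied after a softmax output, it is exactly cross-entropy loss, as the following comparison shows.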

NLL Loss vs. Cross Entropy Loss

```python
import torch
import torch.nn.functional as F

logits = torch.randn(3, 3)        # raw scores: 3 samples, 3 classes
target = torch.tensor([0, 2, 1])  # true class index for each sample

# NLL Loss: softmax -> log -> negative log likelihood
probs = F.softmax(logits, dim=1)
log_probs = torch.log(probs)
nll_loss = F.nll_loss(log_probs, target)

# CE Loss: performs softmax + log + NLL in a single call
ce_loss = F.cross_entropy(logits, target)

print(nll_loss, 'should be equal to', ce_loss)
```
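
Both values are identical. In practice, `F.log_softmax(x, dim=1)` is preferred over `torch.log(F.softmax(x, dim=1))` because it is more numerically stable.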

Metric Learning

Dice Loss

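A common soft formulation (often used for segmentation), over predicted probabilities $p_i$ and binary ground-truth values $g_i$:

$$ L_{\text{Dice}} = 1 - \frac{2 \sum_i p_i g_i}{\sum_i p_i + \sum_i g_i} $$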

Contrastive Loss
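
One common form, with pair label $y = 1$ for a similar pair, distance $d$ between the two embeddings, and margin $m$:

$$ L = y \, d^2 + (1 - y) \max(m - d, 0)^2 $$

Similar pairs are pulled together; dissimilar pairs are pushed apart until they are at least $m$ away.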

Multi-class N-pair loss
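
For an anchor embedding $f$, one positive $f^+$, and $N - 1$ negatives $f_i$, the N-pair loss generalizes the triplet loss to many negatives at once:

$$ L = \log\left(1 + \sum_{i=1}^{N-1} \exp(f^\top f_i - f^\top f^+)\right) $$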

Triplet Loss

$$ L = \max(d(a, b) - d(a, c) + \mathit{margin}, 0) $$

where $a$ is the anchor, $b$ a positive sample, and $c$ a negative sample.

  • Common choices for the distance $d(\cdot, \cdot)$:
    • $\operatorname{sum}(|A-B|)$ (L1, Manhattan)
    • $(\operatorname{sum}((A-B)^2))^{1/2}$ (L2, Euclidean)
    • $(\operatorname{sum}(|A-B|^3))^{1/3}$ (L3)
    • $\operatorname{Jaccard}(A, B)$
    • $\operatorname{Cosine}(A, B)$
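
A minimal PyTorch sketch using the built-in `nn.TripletMarginLoss` (batch size, embedding dimension, and margin here are illustrative; `p=2` picks the Euclidean distance from the list above):

```python
import torch
import torch.nn as nn

triplet_loss = nn.TripletMarginLoss(margin=1.0, p=2)

anchor = torch.randn(8, 128, requires_grad=True)    # a: anchor embeddings
positive = torch.randn(8, 128, requires_grad=True)  # b: positive samples
negative = torch.randn(8, 128, requires_grad=True)  # c: negative samples

# max(d(a, b) - d(a, c) + margin, 0), averaged over the batch
loss = triplet_loss(anchor, positive, negative)
loss.backward()
```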

Margin-based Loss

  • $A$: document
  • $B$: positive sample document
  • $C$: negative sample document
  • the model predicts which of $B$ or $C$ matches $A$, scored on their representations $r_A$, $r_B$, $r_C$

$$ L = \max(0, M - \cos(r_A, r_B) + \cos(r_A, r_C)) $$

This requires the similarity between $A$ and the positive sample to be at least $M$ larger than the similarity to the negative sample.
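
A minimal PyTorch sketch of this loss (the function name, margin value, and shapes are illustrative assumptions, not from the source):

```python
import torch
import torch.nn.functional as F

def margin_based_loss(r_a, r_b, r_c, margin=0.5):
    """max(0, M - cos(r_A, r_B) + cos(r_A, r_C)), averaged over the batch."""
    sim_pos = F.cosine_similarity(r_a, r_b, dim=-1)  # similarity to positive
    sim_neg = F.cosine_similarity(r_a, r_c, dim=-1)  # similarity to negative
    return torch.clamp(margin - sim_pos + sim_neg, min=0).mean()

r_a, r_b, r_c = (torch.randn(8, 128) for _ in range(3))
print(margin_based_loss(r_a, r_b, r_c))
```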
