Siamese Architecture

An energy-based model (EBM) to learn a similarity metric from data.

Nov 01, 2022

Introduction

The Siamese architecture is a discriminative training method for applications where the number of categories is very large, the number of samples per category is small, and only a subset of the categories is known at training time, e.g., face recognition and face verification.

The Framework

Given a family of functions G_W(X) parameterized by W, we seek a value of W such that the scalar energy function (following Chopra et al., 2005):

$$E_W(X_1, X_2) = \lVert G_W(X_1) - G_W(X_2) \rVert$$

is small if X_1 and X_2 belong to the same category, and large if they belong to different categories.

For G_W(X), we can choose an architecture that extracts robust representations, such as a convolutional network.
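As a rough illustration of the framework, the sketch below applies one shared mapping to both inputs and measures the distance between the two outputs. The tiny linear-plus-tanh `embed` function is a hypothetical stand-in for G_W (not the convolutional network suggested above), the Euclidean norm is an assumed choice of distance, and the weights and inputs are made up.

```python
import math

def embed(weights, x):
    """Hypothetical stand-in for G_W: one linear layer followed by tanh."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]

def energy(weights, x1, x2):
    """E_W(X_1, X_2): Euclidean distance between the two embeddings.

    Both inputs pass through the same function with the same weights,
    which is what makes the architecture "Siamese".
    """
    g1 = embed(weights, x1)
    g2 = embed(weights, x2)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(g1, g2)))

# Illustrative weights and inputs (made up for this example).
W = [[0.5, -0.2, 0.1],
     [0.3, 0.8, -0.5]]
x_a = [1.0, 0.0, 2.0]
x_b = [1.0, 0.1, 2.0]   # close to x_a
x_c = [-2.0, 3.0, 0.5]  # far from x_a

print(energy(W, x_a, x_b))  # small value
print(energy(W, x_a, x_c))  # larger value
```

With untrained weights the energy only reflects how close the raw inputs are; training W is what should make it track category membership.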

Contrastive Loss Function

The loss function needs a contrastive term to ensure not only that the energy for a pair of inputs from the same category is low, but also that the energy for a pair from different categories is large.
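To see why the contrastive term matters, consider a toy case: if the loss only penalized the energy of same-category pairs, a mapping that sends every input to the same point would reach zero loss while being useless. The constant `collapsed_embed` and the squared-energy penalty below are illustrative assumptions, not the exact formulation of the cited papers.

```python
import math

def collapsed_embed(x):
    """A degenerate mapping: every input lands on the same point."""
    return [0.0, 0.0]

def energy(embed, x1, x2):
    """Euclidean distance between the embeddings of x1 and x2."""
    return math.dist(embed(x1), embed(x2))

def similar_only_loss(embed, pairs):
    """Sum of squared energies over same-category pairs only.

    There is deliberately no term for different-category pairs.
    """
    return sum(energy(embed, x1, x2) ** 2 for x1, x2, y in pairs if y == 1)

pairs = [
    ([1.0, 2.0], [1.1, 2.1], 1),   # same category
    ([1.0, 2.0], [9.0, -4.0], 0),  # different categories
]

# The loss looks perfect...
print(similar_only_loss(collapsed_embed, pairs))          # 0.0
# ...yet inputs from different categories are indistinguishable.
print(energy(collapsed_embed, pairs[1][0], pairs[1][1]))  # 0.0
```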

The total loss function over a data set of N labeled pairs is the sum of the instance losses:

$$L(W) = \sum_{i=1}^{N} L^{(i)}(W)$$

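The instance loss function is composed of terms for the similar (y = 1) case (L+) and the dissimilar (y = 0) case (L−):

$$L^{(i)}(W) = y^{(i)} \, L_{+}(X_1^{(i)}, X_2^{(i)}) + (1 - y^{(i)}) \, L_{-}(X_1^{(i)}, X_2^{(i)})$$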

The loss functions for the similar and dissimilar cases are given by:
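The exact forms of L+ and L− depend on the paper and on how the energy is defined. As a minimal sketch, assuming a distance-like energy E ≥ 0 and a margin m (an assumed hyperparameter), one widely used margin-based choice is L+ = E²/2 and L− = max(0, m − E)²/2. The function names below are hypothetical, and the energies are made-up numbers rather than outputs of a trained network.

```python
def loss_similar(e):
    """L+: penalizes any non-zero energy for a same-category pair."""
    return 0.5 * e ** 2

def loss_dissimilar(e, margin=1.0):
    """L-: penalizes a different-category pair only while its
    energy is below the margin."""
    return 0.5 * max(0.0, margin - e) ** 2

def instance_loss(e, y, margin=1.0):
    """y = 1 selects L+, y = 0 selects L-."""
    return y * loss_similar(e) + (1 - y) * loss_dissimilar(e, margin)

def total_loss(energies, labels, margin=1.0):
    """Sum of the instance losses over the data set."""
    return sum(instance_loss(e, y, margin) for e, y in zip(energies, labels))

# Made-up energies standing in for E_W(X_1, X_2) on four pairs.
energies = [0.1, 0.9, 0.2, 1.5]
labels = [1, 1, 0, 0]  # 1 = same category, 0 = different categories
print(total_loss(energies, labels))
```

In this form a dissimilar pair whose energy already exceeds the margin (the last pair above) contributes nothing, so training effort goes to pairs that are still too close.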

Sources:

Neculoiu, Paul, Maarten Versteegh, and Mihai Rotaru. "Learning text similarity with siamese recurrent networks." Proceedings of the 1st Workshop on Representation Learning for NLP. 2016.

Chopra, Sumit, Raia Hadsell, and Yann LeCun. "Learning a similarity metric discriminatively, with application to face verification." 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05). Vol. 1. IEEE, 2005.