collectors

Collectors have two main functions: synthesizing (or collecting) samples, and computing the metric matrix, which is then passed to selectors and losses.

All methods are listed below:

method	description
BaseCollector	Base class.
DefaultCollector	Do nothing.
ProxyCollector	Maintain a set of proxies
MoCoCollector	paper: Momentum Contrast for Unsupervised Visual Representation Learning
SimSiamCollector	paper: Exploring Simple Siamese Representation Learning
HDMLCollector	paper: Hardness-Aware Deep Metric Learning
DAMLCollector	paper: Deep Adversarial Metric Learning
DVMLCollector	paper: Deep Variational Metric Learning

Notes

Embedders differ significantly from collectors. Embedders are also responsible for generating the embeddings that are used to compute metrics.

Class

DefaultCollector

class gedml.core.collectors.iteration_collectors.default_collector.DefaultCollector(*args, **kwargs)

Bases: gedml.core.collectors.base_collector.BaseCollector

This is the default collector, which computes the metric matrix directly from the embeddings.

forward(data, embeddings, labels) → tuple: Perform no collection. The embeddings are copied as proxies and the labels as proxy labels.
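As a rough illustration of the behaviour described above, the sketch below copies a batch as its own proxies and computes a cosine metric matrix. It is written in plain Python rather than PyTorch, and the helper names (l2_normalize, cosine_matrix, default_collect) and the four-element return value are assumptions made for this example, not the gedml implementation.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


def cosine_matrix(rows, cols):
    """Return the len(rows) x len(cols) matrix of cosine similarities."""
    rows_n = [l2_normalize(r) for r in rows]
    cols_n = [l2_normalize(c) for c in cols]
    return [[sum(a * b for a, b in zip(r, c)) for c in cols_n] for r in rows_n]


def default_collect(embeddings, labels):
    """Hypothetical stand-in for DefaultCollector.forward (not the gedml code)."""
    # "Do nothing": the batch itself serves as the set of proxies.
    proxies, proxy_labels = list(embeddings), list(labels)
    matrix = cosine_matrix(embeddings, proxies)
    # The final flag mirrors is_from_same_source: rows and columns share data.
    return matrix, list(labels), proxy_labels, True


embeddings = [[1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
labels = [0, 1, 0]
matrix, row_labels, col_labels, is_from_same_source = default_collect(embeddings, labels)
print(len(matrix), len(matrix[0]))  # 3 3
```

Because rows and columns come from the same batch, the diagonal of the matrix holds each sample's similarity to itself.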

ProxyCollector

class gedml.core.collectors.iteration_collectors.proxy_collector.ProxyCollector(num_classes=100, embeddings_dim=128, centers_per_class=1, regularize_func='softtriple', regularize_weight=0, *args, **kwargs)

Bases: gedml.core.collectors.base_collector.BaseCollector

Maintain proxy parameters to support proxy-based metric learning methods.

Parameters

num_classes (int) – Number of classes. default: 100.
embeddings_dim (int) – Dimension of embeddings. default: 128.
centers_per_class (int) – Number of centers per class. default: 1

forward(data, embeddings, labels) → tuple: Compute the similarity (or distance) matrix between embeddings and proxies.

initiate_params(): Initialize proxies.
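To make the shapes concrete, here is a minimal sketch of proxy initialization and the embedding-to-proxy similarity matrix. It assumes Gaussian-initialized proxies and cosine similarity. The functions init_proxies, cosine and proxy_collect are hypothetical names for this example, and gedml's actual initialization and metric may differ.

```python
import math
import random


def init_proxies(num_classes, embeddings_dim, centers_per_class, seed=0):
    """Create num_classes * centers_per_class random proxies with class labels."""
    rng = random.Random(seed)
    proxies, proxy_labels = [], []
    for cls in range(num_classes):
        for _ in range(centers_per_class):
            proxies.append([rng.gauss(0.0, 1.0) for _ in range(embeddings_dim)])
            proxy_labels.append(cls)
    return proxies, proxy_labels


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def proxy_collect(embeddings, proxies):
    """Return the B x (num_classes * centers_per_class) similarity matrix."""
    return [[cosine(e, p) for p in proxies] for e in embeddings]


proxies, proxy_labels = init_proxies(num_classes=3, embeddings_dim=4, centers_per_class=2)
batch = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 1.0, 0.0]]
matrix = proxy_collect(batch, proxies)
print(len(matrix), len(matrix[0]))  # 2 6
print(proxy_labels)                 # [0, 0, 1, 1, 2, 2]
```

The column labels of the matrix are the proxy labels, so each class owns centers_per_class adjacent columns in this layout.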

MoCoCollector

class gedml.core.collectors.iteration_collectors.moco_collector.MoCoCollector(query_trunk, query_embedder, embeddings_dim=128, bank_size=65536, m=0.999, T=0.07, *args, **kwargs)

Bases: gedml.core.collectors.base_collector.BaseCollector

Paper: Momentum Contrast for Unsupervised Visual Representation Learning

Use Momentum Contrast (MoCo) for unsupervised visual representation learning. This code is modified from: https://github.com/facebookresearch/moco. The paper builds a dynamic dictionary with a queue and a moving-averaged encoder.

Parameters

query_trunk (torch.nn.Module) – default: ResNet50
query_embedder (torch.nn.Module) – multi-layer perceptron
embeddings_dim (int) – dimension of embeddings. default: 128
bank_size (int) – size of the memory bank. default: 65536
m (float) – momentum coefficient of the moving average. default: 0.999
T (float) – softmax temperature. default: 0.07

forward(data, embeddings) → tuple

Maintain a large memory bank to boost unsupervised learning performance.

Parameters

data (torch.Tensor) – A batch of key images. size: \(B \times C \times H \times W\)
embeddings (torch.Tensor) – A batch of query embeddings. size: \(B \times dim\)

initiate_params(): Disable gradients for key_trunk and key_embedder.

update(trainer): Update the key encoder using a moving average.
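The moving-average update and the fixed-size memory bank can be sketched in a few lines. This is a simplified illustration with parameters as flat lists of floats. The function momentum_update and the toy bank size are assumptions for this example, not the gedml API.

```python
from collections import deque


def momentum_update(key_params, query_params, m=0.999):
    """Moving average: key <- m * key + (1 - m) * query, element by element."""
    return [m * k + (1.0 - m) * q for k, q in zip(key_params, query_params)]


# Toy "encoder weights": the key encoder slowly follows the query encoder.
key_params = [0.0, 1.0]
query_params = [1.0, 1.0]
key_params = momentum_update(key_params, query_params, m=0.9)

# A memory bank behaves like a FIFO queue: once it is full, appending a new
# key embedding discards the oldest one.
bank = deque(maxlen=4)
for key_embedding_id in range(6):
    bank.append(key_embedding_id)
print(list(bank))  # [2, 3, 4, 5]
```

With m close to 1 (the default is 0.999), the key encoder changes slowly, which keeps the keys stored in the bank comparable across iterations.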

SimSiamCollector

class gedml.core.collectors.iteration_collectors.simsiam_collector.SimSiamCollector(*args, **kwargs)

Bases: gedml.core.collectors.base_collector.BaseCollector

Paper: Exploring Simple Siamese Representation Learning

This method uses none of the following to learn meaningful representations:

negative sample pairs;
large batches;
momentum encoders.

A stop-gradient operation plays an essential role in preventing collapse.

forward(data, embeddings, labels) → tuple

For simplicity, the two data streams are concatenated and passed through the embeddings parameter. In function collect, they are split again: the first half belongs to the first stream and the second half to the second stream.

Parameters

data (torch.Tensor) – A batch of key images (not used). size: \(B \times C \times H \times W\)
embeddings (torch.Tensor) – A batch of query embeddings. size: \(2B \times dim\)
labels (torch.Tensor) – Labels of the input. size: \(2B \times 1\)
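The split of the combined \(2B\)-row batch into two streams can be sketched as follows. This is plain Python on nested lists, and split_streams is a hypothetical helper for this example. The real collector operates on tensors.

```python
def split_streams(embeddings, labels):
    """Split a concatenated 2B batch into (first stream, second stream)."""
    if len(embeddings) % 2 != 0 or len(embeddings) != len(labels):
        raise ValueError("expected 2B embeddings with matching labels")
    half = len(embeddings) // 2
    first = (embeddings[:half], labels[:half])
    second = (embeddings[half:], labels[half:])
    return first, second


# B = 2: rows 0-1 are the first view, rows 2-3 the second view of the same images.
embeddings = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]
labels = [7, 9, 7, 9]
(emb_a, lab_a), (emb_b, lab_b) = split_streams(embeddings, labels)
print(lab_a, lab_b)  # [7, 9] [7, 9]
```

Row i of the first stream and row i of the second stream then form a positive pair, since both come from the same input.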

HDMLCollector

class gedml.core.collectors.iteration_collectors.hdml_collector.HDMLCollector(generator, embedder, classifier, alpha=90.0, beta=10000.0, coef_lambda=0.5, soft_weight=10000.0, d_plus_scheme='positive_distance', d_plus=0.5, *args, **kwargs)

Bases: gedml.core.collectors.base_collector.BaseCollector

Use variational autoencoder to decompose intra-class invariance and intra-class variance.

Paper: Hardness-Aware Deep Metric Learning

Four types of loss: (loss_avg = loss_m, loss_gen = loss_recon + loss_soft)

loss_recon
loss_soft
loss_syn
loss_m

Parameters

generator (torch.nn.Module) – multi-layer perceptron
embedder (torch.nn.Module) – multi-layer perceptron
classifier (torch.nn.Module) – multi-layer perceptron
alpha (float) – 90.0 (NPairLoss) or 7.0 (TripletLoss)
beta (float) – 1.0e4
coef_lambda (float) – 0.5
soft_weight (float) – 1.0e4
d_plus_scheme (str) – default: positive_distance
d_plus (float) – Constant or positive pair distance. default: 0.5

forward(data, embeddings, features, labels) → tuple

Define four kinds of losses.

\(loss_{total} = w_{recon} \times loss_{recon} + w_{soft} \times loss_{soft} + w_m \times loss_m + w_{syn} \times loss_{syn}\)

\(loss_{recon} = mean(|f_{pos} - f_{pos-recon}|^2_2)\)

\(loss_{soft} = CrossEntropy(Prob_{recon}, Labels_{recon})\)

\(loss_m = loss_{metric}(matrix_{m})\)

\(loss_{syn} = loss_{metric}(matrix_{syn})\)

update(trainer)

The HDML paper proposes an adaptive weighting method. Therefore, loss_avg and loss_gen must be updated from the external trainer before each epoch:

\(loss_{avg} = loss_m\)

\(loss_{gen} = loss_{recon} + loss_{soft}\)

DAMLCollector

class gedml.core.collectors.iteration_collectors.daml_collector.DAMLCollector(embedder, generator, lambda_0=1, lambda_1=1, lambda_2=50, alpha=1, *args, **kwargs)

Bases: gedml.core.collectors.base_collector.BaseCollector

NOTE: only triplet loss is supported.

Paper: Deep Adversarial Metric Learning

Training steps:

pretrain the deep metric learning model without the hard negative generator;
initialize the generator adversarial to the pre-trained metric;
jointly optimize both networks during each iteration end-to-end

Three losses for hard negative generation:

the synthetic samples should be close to the anchor in the original feature space;
the synthetic samples should preserve the annotation information;
the synthetic samples should be misclassified by the learned metric

Default backbone structure:

trunk: GoogLeNet
embedder: one-layer perceptron
generator: three-layer perceptron

Parameters

embedder (torch.nn.Module) – embedder model (default: one-layer perceptron)
generator (torch.nn.Module) – generator model (default: three-layer perceptron)
lambda_0 (int) – default: 1
lambda_1 (int) – default: 1
lambda_2 (int) – default: 50
alpha (int) – default: 1

forward(data, embeddings, features, labels) → tuple

Four losses are computed in the collect function. All of them are computed here, i.e. they are NOT passed on to the selectors or losses modules.

\(loss_{total} = \lambda_0 \times loss_m + \lambda_1 \times loss_{reg} + \lambda_2 \times loss_{adv} + loss_{hard}\)

\(loss_m = mean(ReLU(D_{ap emb} - D_{an emb} - \alpha))\)

\(loss_{adv} = mean(ReLU(D_{an feat} - D_{ap feat} - \alpha))\)

\(loss_{reg} = mean(|f_{syn} - f_{neg}|^2_2)\)

\(loss_{hard} = mean(|f_{syn} - f_{anchor}|^2_2)\)
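The formulas above can be transcribed almost literally. The sketch below does so in plain Python under several assumptions that the text does not state: distances are squared Euclidean, the negative in \(D_{an}\) is the synthetic negative, and the signs of \(\alpha\) are taken exactly as written above. The function daml_losses and its argument names are hypothetical and are not the gedml implementation.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two vectors (an assumed metric)."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def relu(x):
    return max(x, 0.0)


def daml_losses(f_anchor, f_pos, f_neg, f_syn, e_anchor, e_pos, e_syn,
                lambda_0=1.0, lambda_1=1.0, lambda_2=50.0, alpha=1.0):
    """f_* are features, e_* are embeddings; *_syn is the synthetic negative."""
    idx = range(len(f_anchor))
    loss_m = mean(relu(squared_distance(e_anchor[i], e_pos[i])
                       - squared_distance(e_anchor[i], e_syn[i]) - alpha) for i in idx)
    loss_adv = mean(relu(squared_distance(f_anchor[i], f_syn[i])
                         - squared_distance(f_anchor[i], f_pos[i]) - alpha) for i in idx)
    loss_reg = mean(squared_distance(f_syn[i], f_neg[i]) for i in idx)
    loss_hard = mean(squared_distance(f_syn[i], f_anchor[i]) for i in idx)
    total = lambda_0 * loss_m + lambda_1 * loss_reg + lambda_2 * loss_adv + loss_hard
    return {"m": loss_m, "adv": loss_adv, "reg": loss_reg, "hard": loss_hard, "total": total}


# One triplet; the embeddings equal the features here (identity embedder, for illustration).
f_anchor, f_pos, f_neg, f_syn = [[0.0, 0.0]], [[1.0, 0.0]], [[3.0, 0.0]], [[2.0, 0.0]]
losses = daml_losses(f_anchor, f_pos, f_neg, f_syn, f_anchor, f_pos, f_syn)
print(losses["total"])  # 105.0
```

In this toy case the total is dominated by the adversarial term, because lambda_2 defaults to 50.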

update(trainer): Interface through which the collector can update itself from externally supplied information (does nothing by default).

DVMLCollector

class gedml.core.collectors.iteration_collectors.dvml_collector.DVMLCollector(embedder_mean, embedder_std, decoder, T=20, phase=1, lambda_1=None, lambda_2=None, lambda_3=None, lambda_4=None, *args, **kwargs)

Bases: gedml.core.collectors.base_collector.BaseCollector

Paper: Deep Variational Metric Learning

Four losses:

loss_kl: KL divergence between learned distribution and isotropic multivariate Gaussian
loss_recon: reconstruction loss of original images and images generated by the decoder
metric learning loss of learned intra-class invariance
metric learning loss of the combination of sampled intra-class variance and learned intra-class invariance

Default parameters recommended in the paper:

lr = 0.0001
T = 20 (for sample generation)
batch_size = 128 (pair-based) or 120 (triplet-based)

There are two phases during training:

1. first phase: cut off the back-propagation of gradients from the decoder network, with lambda_1 = 1, lambda_2 = 1, lambda_3 = 0.1, lambda_4 = 1.

2. second phase: release the constraint, with lambda_1 = 0.8, lambda_2 = 1, lambda_3 = 0.2, lambda_4 = 0.8.

Parameters

embedder_mean (torch.nn.Module) – multi-layer perceptron
embedder_std (torch.nn.Module) – multi-layer perceptron
decoder (torch.nn.Module) – multi-layer perceptron
T (int) – default: 20
phase (int) – 1 for first phase and 2 for second phase
lambda_1 (float) – first phase: 1; second phase: 0.8
lambda_2 (float) – first phase: 1; second phase: 1
lambda_3 (float) – first phase: 0.1; second phase: 0.2
lambda_4 (float) – first phase: 1; second phase: 0.8

forward(data, embeddings, features, labels) → tuple

Four losses should be computed in function collect:

\(loss_{total} = \lambda_1 \times loss_{kl} + \lambda_2 \times loss_{recon} + \lambda_3 \times loss_{syn} + \lambda_4 \times loss_{invariant}\)

\(loss_{kl} = KL(p_{dist}, q_{dist})\)

\(loss_{recon} = mean(|f_{decode} - f_{ori}|^2_2)\)

\(loss_{syn} = loss_{metric}(matrix_{syn}, labels)\)

\(loss_{invariant} = loss_{metric}(matrix_{inv}, labels)\)
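As a small worked example of the weighting above, the sketch below combines four already-computed loss values using the per-phase weights listed earlier. It also shows the standard closed form of the KL divergence from a diagonal Gaussian to the standard normal. The names PHASE_WEIGHTS, kl_to_standard_normal and dvml_total are hypothetical, and the direction of the KL term is an assumption. gedml may compute it differently.

```python
import math

# (lambda_1, lambda_2, lambda_3, lambda_4) for each training phase, as listed above.
PHASE_WEIGHTS = {
    1: (1.0, 1.0, 0.1, 1.0),
    2: (0.8, 1.0, 0.2, 0.8),
}


def kl_to_standard_normal(mean, std):
    """KL( N(mean, diag(std^2)) || N(0, I) ) in closed form."""
    return 0.5 * sum(s * s + m * m - 1.0 - math.log(s * s) for m, s in zip(mean, std))


def dvml_total(loss_kl, loss_recon, loss_syn, loss_invariant, phase=1):
    """Weighted sum of the four losses for the given training phase."""
    lambda_1, lambda_2, lambda_3, lambda_4 = PHASE_WEIGHTS[phase]
    return (lambda_1 * loss_kl + lambda_2 * loss_recon
            + lambda_3 * loss_syn + lambda_4 * loss_invariant)


# A distribution that already equals N(0, I) has zero divergence.
loss_kl = kl_to_standard_normal(mean=[0.0, 0.0], std=[1.0, 1.0])
total_phase_1 = dvml_total(loss_kl, loss_recon=0.5, loss_syn=2.0, loss_invariant=1.0, phase=1)
total_phase_2 = dvml_total(loss_kl, loss_recon=0.5, loss_syn=2.0, loss_invariant=1.0, phase=2)
```

Switching from phase 1 to phase 2 doubles the weight of the synthetic-sample loss and slightly reduces the weights of the KL and invariant terms.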

update(trainer): Interface through which the collector can update itself from externally supplied information (does nothing by default).

GlobalProxyCollector

class gedml.core.collectors.epoch_collectors.global_proxy_collector.GlobalProxyCollector(optimizer_name='Adam', optimizer_param={'lr': 0.001}, dataloader_param={'batch_size': 120, 'drop_last': False, 'num_workers': 8, 'shuffle': True}, max_iter=50000, error_bound=0.001, total_patience=10, auth_weight=1.0, repre_weight=1.0, disc_weight=1.0, *args, **kwargs)

Bases: gedml.core.collectors.iteration_collectors.proxy_collector.ProxyCollector, gedml.core.collectors.epoch_collectors._default_global_collector._DefaultGlobalCollector

Compute the global proxies before updating other parameters.

_DefaultGlobalCollector

class gedml.core.collectors.epoch_collectors._default_global_collector._DefaultGlobalCollector(dataloader_param={'batch_size': 120, 'drop_last': False, 'num_workers': 8, 'shuffle': True}, *args, **kwargs)

Bases: object

BaseCollector

class gedml.core.collectors.base_collector.BaseCollector(metric, **kwargs)

Bases: gedml.core.modules.with_recorder.WithRecorder

Base class of the collector module. It defines the main collector methods in the functions collect and update, and the default parameters in the functions output_list, input_list and _default_next_module.

Parameters: metric (metric instance) – metric to compute matrix (e.g. euclidean or cosine)

Example

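>>> import torch
>>> # The MetricFactory import path is an assumption; adjust it to your gedml version.
>>> from gedml.core.metrics import MetricFactory
>>> from gedml.core.collectors.iteration_collectors.default_collector import DefaultCollector
>>> metric = MetricFactory(is_normalize=True, metric_name="cosine")
>>> data = torch.randn(10, 3, 227, 227)
>>> embeddings = torch.randn(10, 128)
>>> labels = torch.randint(0, 3, size=(10,))
>>> collector = DefaultCollector(metric=metric)
>>> # collector forward
>>> output_dict = collector(data, embeddings, labels)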

forward(data, embeddings, labels) → tuple

The collect function may perform three kinds of operations:

maintain sets of parameters about collecting (or synthesizing) samples
compute metric matrix and pass to next module
compute some regularization term using embeddings

Parameters

data (torch.Tensor) – Images with RGB channels. size: \(B \times C \times H \times W\)
embeddings (torch.Tensor) – Embedding. size: \(B \times dim\)
labels (torch.Tensor) – Ground truth of dataset. size: \(B \times 1\)

Returns

Includes the metric matrix, labels, etc., according to function output_list.

Let \(B_{row}\) be the number of rows and \(B_{col}\) be the number of columns. Typical outputs are listed below:

metric matrix (torch.Tensor): size: \(B_{row} \times B_{col}\)
labels of rows (torch.Tensor): size: \(B_{row} \times 1\) or \(B_{row} \times B_{col}\)
labels of columns (torch.Tensor): size: \(1 \times B_{col}\) or \(B_{row} \times B_{col}\)
is_from_same_source (bool): indicate whether row vectors and column vectors are from the same data

Return type

tuple
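To show how the row and column labels relate to the metric matrix, the sketch below builds a \(B_{row} \times B_{col}\) mask marking which entries compare samples of the same class. It uses plain Python lists instead of tensors, and positive_mask is a hypothetical helper. In gedml this kind of comparison belongs to the downstream selectors and losses.

```python
def positive_mask(row_labels, col_labels):
    """mask[i][j] is True when row i and column j carry the same class label."""
    return [[r == c for c in col_labels] for r in row_labels]


# B_row = 3 embeddings compared against B_col = 4 proxies (or other embeddings).
row_labels = [0, 1, 0]
col_labels = [0, 0, 1, 1]
mask = positive_mask(row_labels, col_labels)
for row in mask:
    print(row)
# [True, True, False, False]
# [False, False, True, True]
# [True, True, False, False]
```

When is_from_same_source is true, the diagonal of such a mask pairs each sample with itself, which downstream modules typically need to exclude.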

update(*args, **kwargs): Interface through which the collector can update itself from externally supplied information (does nothing by default).