collectors
Collectors have two main functions: synthesizing (or collecting) samples and computing the metric matrix, which is passed to selectors and losses.
All methods are listed below:
| method | description |
|---|---|
| BaseCollector | Base class. |
| DefaultCollector | Do nothing. |
| ProxyCollector | Maintain a set of proxies. |
| MoCoCollector | paper: Momentum Contrast for Unsupervised Visual Representation Learning |
| SimSiamCollector | paper: Exploring Simple Siamese Representation Learning |
| HDMLCollector | paper: Hardness-Aware Deep Metric Learning |
| DAMLCollector | paper: Deep Adversarial Metric Learning |
| DVMLCollector | paper: Deep Variational Metric Learning |
Notes
embedders differ significantly from collectors. embedders also take charge of generating the embeddings that are used to compute metrics.
Class
DefaultCollector
- class gedml.core.collectors.iteration_collectors.default_collector.DefaultCollector(*args, **kwargs)
Bases: gedml.core.collectors.base_collector.BaseCollector
The default collector, which computes the metric matrix directly from the embeddings.
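As a rough illustration of what "computing the metric matrix directly from the embeddings" can look like, here is a minimal sketch of a cosine-similarity matrix. Plain Python lists stand in for torch tensors, and `cosine_metric_matrix` is a hypothetical helper, not part of gedml; it assumes no embedding is the zero vector.

```python
import math

def cosine_metric_matrix(embeddings):
    """Return the B x B cosine-similarity matrix of a list of embedding vectors.

    Hypothetical helper for illustration only; assumes non-zero vectors.
    """
    normalized = []
    for vec in embeddings:
        norm = math.sqrt(sum(x * x for x in vec))
        normalized.append([x / norm for x in vec])
    # Entry (i, j) is the dot product of normalized row i and normalized row j.
    return [
        [sum(a * b for a, b in zip(row, col)) for col in normalized]
        for row in normalized
    ]

embeddings = [[1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
matrix = cosine_metric_matrix(embeddings)
```

Because rows and columns come from the same batch here, the matrix is square and symmetric.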
ProxyCollector
- class gedml.core.collectors.iteration_collectors.proxy_collector.ProxyCollector(num_classes=100, embeddings_dim=128, centers_per_class=1, regularize_func='softtriple', regularize_weight=0, *args, **kwargs)
Bases: gedml.core.collectors.base_collector.BaseCollector
Maintain proxy parameters to support proxy-based metric learning methods.
- Parameters
num_classes (int) – Number of classes. default: 100.
embeddings_dim (int) – Dimension of embeddings. default: 128.
centers_per_class (int) – Number of centers per class. default: 1
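To make the role of the three parameters concrete, the following sketch shows one plausible way a proxy set could be laid out: `num_classes * centers_per_class` vectors of size `embeddings_dim`, each tagged with its class. The helpers `init_proxies` and `dot_metric_matrix`, the Gaussian initialization and the dot-product metric are all assumptions made for illustration, not the gedml implementation.

```python
import random

def init_proxies(num_classes, embeddings_dim, centers_per_class=1, seed=0):
    """Create randomly initialized proxies and their class labels (illustrative)."""
    rng = random.Random(seed)
    num_proxies = num_classes * centers_per_class
    proxies = [
        [rng.gauss(0.0, 1.0) for _ in range(embeddings_dim)]
        for _ in range(num_proxies)
    ]
    # Proxies of the same class are stored next to each other.
    proxy_labels = [c for c in range(num_classes) for _ in range(centers_per_class)]
    return proxies, proxy_labels

def dot_metric_matrix(embeddings, proxies):
    """Rows are batch embeddings, columns are proxies."""
    return [
        [sum(e * p for e, p in zip(emb, proxy)) for proxy in proxies]
        for emb in embeddings
    ]

proxies, proxy_labels = init_proxies(num_classes=3, embeddings_dim=4, centers_per_class=2)
batch = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
matrix = dot_metric_matrix(batch, proxies)
```

In this layout the metric matrix is rectangular: its rows come from the batch and its columns from the proxies.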
MoCoCollector
- class gedml.core.collectors.iteration_collectors.moco_collector.MoCoCollector(query_trunk, query_embedder, embeddings_dim=128, bank_size=65536, m=0.999, T=0.07, *args, **kwargs)
Bases: gedml.core.collectors.base_collector.BaseCollector
Paper: Momentum Contrast for Unsupervised Visual Representation Learning
Use Momentum Contrast (MoCo) for unsupervised visual representation learning. This code is modified from: https://github.com/facebookresearch/moco. The paper builds a dynamic dictionary with a queue and a moving-averaged encoder.
- Parameters
query_trunk (torch.nn.Module) – default: ResNet50
query_embedder (torch.nn.Module) – multi-layer perceptron
embeddings_dim (int) – dimension of embeddings. default: 128
bank_size (int) – size of the memory bank. default: 65536
m (float) – weight of moving-average. default: 0.999
T (float) – temperature of the softmax. default: 0.07
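The two mechanisms behind `m` and `bank_size` can be sketched in a few lines. This is a simplified illustration in plain Python, not the gedml code: `momentum_update` and `MemoryBank` are hypothetical names, parameters are flat lists of floats, and a `deque` stands in for the fixed-size tensor queue.

```python
from collections import deque

def momentum_update(key_params, query_params, m=0.999):
    """Moving-average update of the key encoder: k <- m * k + (1 - m) * q."""
    return [m * k + (1.0 - m) * q for k, q in zip(key_params, query_params)]

class MemoryBank:
    """Fixed-size FIFO queue of key embeddings (illustrative stand-in)."""

    def __init__(self, bank_size):
        self.queue = deque(maxlen=bank_size)

    def enqueue(self, key_embeddings):
        # When the queue is full, the oldest entries are dropped automatically.
        self.queue.extend(key_embeddings)

updated = momentum_update([1.0, 2.0], [0.0, 0.0], m=0.9)

bank = MemoryBank(bank_size=3)
bank.enqueue([[1.0], [2.0]])
bank.enqueue([[3.0], [4.0]])
```

With `m` close to 1 the key encoder changes slowly, which keeps the embeddings stored in the queue consistent with each other across iterations.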
SimSiamCollector
- class gedml.core.collectors.iteration_collectors.simsiam_collector.SimSiamCollector(*args, **kwargs)
Bases: gedml.core.collectors.base_collector.BaseCollector
Paper: Exploring Simple Siamese Representation Learning
This method uses none of the following to learn meaningful representations:
negative sample pairs;
large batches;
momentum encoders.
A stop-gradient operation plays an essential role in preventing collapse.
- forward(data, embeddings, labels) → tuple
For simplicity, the two data streams are concatenated and passed through the embeddings parameter. In function collect, the two streams are split again (first half for the first stream; second half for the second stream).
- Parameters
data (torch.Tensor) – A batch of key images (not used). size: \(B \times C \times H \times W\)
embeddings (torch.Tensor) – A batch of query embeddings. size: \(2B \times dim\)
labels (torch.Tensor) – Labels of the input. size: \(2B \times 1\)
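The split described for `collect` amounts to cutting the stacked batch in half. A minimal sketch, with a plain list standing in for the \(2B \times dim\) tensor and `split_streams` being a hypothetical helper name:

```python
def split_streams(embeddings):
    """Split a stacked batch of 2B embeddings into two streams of B each."""
    if len(embeddings) % 2 != 0:
        raise ValueError("expected an even number of embeddings (2B)")
    half = len(embeddings) // 2
    # First half belongs to the first stream, second half to the second stream.
    return embeddings[:half], embeddings[half:]

stacked = [[0.1], [0.2], [0.3], [0.4]]
first_stream, second_stream = split_streams(stacked)
```

Row `i` of the first stream and row `i` of the second stream then correspond to two views of the same sample.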
HDMLCollector
- class gedml.core.collectors.iteration_collectors.hdml_collector.HDMLCollector(generator, embedder, classifier, alpha=90.0, beta=10000.0, coef_lambda=0.5, soft_weight=10000.0, d_plus_scheme='positive_distance', d_plus=0.5, *args, **kwargs)
Bases: gedml.core.collectors.base_collector.BaseCollector
Use variational autoencoder to decompose intra-class invariance and intra-class variance.
Paper: Hardness-Aware Deep Metric Learning
Four types of loss: (loss_avg = loss_m, loss_gen = loss_recon + loss_soft)
loss_recon
loss_soft
loss_syn
loss_m
- Parameters
generator (torch.nn.Module) – multi-layer perceptron
embedder (torch.nn.Module) – multi-layer perceptron
classifier (torch.nn.Module) – multi-layer perceptron
alpha (float) – 90.0 (NPairLoss) or 7.0 (TripletLoss). default: 90.0
beta (float) – default: 1.0e4
coef_lambda (float) – default: 0.5
soft_weight (float) – default: 1.0e4
d_plus_scheme (str) – default: positive_distance
d_plus (float) – Constant or positive pair distance. default: 0.5
- forward(data, embeddings, features, labels) → tuple
Define four kinds of losses.
\(loss_{total} = w_{recon} \times loss_{recon} + w_{soft} \times loss_{soft} + w_m \times loss_m + w_{syn} \times loss_{syn}\)
\(loss_{recon} = mean(|f_{pos} - f_{pos-recon}|^2_2)\)
\(loss_{soft} = CrossEntropy(Prob_{recon}, Labels_{recon})\)
\(loss_m = loss_{metric}(matrix_{m})\)
\(loss_{syn} = loss_{metric}(matrix_{syn})\)
DAMLCollector
- class gedml.core.collectors.iteration_collectors.daml_collector.DAMLCollector(embedder, generator, lambda_0=1, lambda_1=1, lambda_2=50, alpha=1, *args, **kwargs)
Bases: gedml.core.collectors.base_collector.BaseCollector
NOTE: only supports triplet loss.
Paper: Deep Adversarial Metric Learning
Training steps:
pretrain the deep metric learning model without the hard negative generator;
initialize the generator adversarial to the pre-trained metric;
jointly optimize both networks during each iteration end-to-end
Three losses for hard negative generation:
the synthetic samples should be close to the anchor in the original feature space;
the synthetic samples should preserve the annotation information;
the synthetic samples should be misclassified by the learned metric
Default backbone structure:
trunk: GoogLeNet
embedder: one-layer perceptron
generator: three-layer perceptron
- Parameters
embedder (torch.nn.Module) – embedder model (default: one-layer perceptron)
generator (torch.nn.Module) – generator model (default: three-layer perceptron)
lambda_0 (int) – default: 1
lambda_1 (int) – default: 1
lambda_2 (int) – default: 50
alpha (int) – default: 1
- forward(data, embeddings, features, labels) → tuple
Four losses are computed in the collect function. All of them are computed here, i.e. they are not passed to the selectors or losses modules.
\(loss_{total} = \lambda_0 \times loss_m + \lambda_1 \times loss_{reg} + \lambda_2 \times loss_{adv} + loss_{hard}\)
\(loss_m = mean(ReLU(D_{ap emb} - D_{an emb} - \alpha))\)
\(loss_{adv} = mean(ReLU(D_{an feat} - D_{ap feat} - \alpha))\)
\(loss_{reg} = mean(|f_{syn} - f_{neg}|^2_2)\)
\(loss_{hard} = mean(|f_{syn} - f_{anchor}|^2_2)\)
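The two squared-distance terms and the weighted sum above can be sketched as follows. This is only an illustration under assumptions: `mean` is interpreted as the average over the batch of the squared L2 distance between paired rows, plain lists stand in for tensors, the two hinge terms are passed in as already-computed scalars, and both function names are hypothetical.

```python
def mean_squared_l2(a_rows, b_rows):
    """Batch average of the squared L2 distance between paired rows."""
    total = 0.0
    for a, b in zip(a_rows, b_rows):
        total += sum((x - y) ** 2 for x, y in zip(a, b))
    return total / len(a_rows)

def daml_total_loss(loss_m, loss_reg, loss_adv, loss_hard,
                    lambda_0=1, lambda_1=1, lambda_2=50):
    """Weighted sum following the loss_total formula (defaults from the signature)."""
    return lambda_0 * loss_m + lambda_1 * loss_reg + lambda_2 * loss_adv + loss_hard

f_syn = [[1.0, 0.0], [0.0, 1.0]]
f_neg = [[0.0, 0.0], [0.0, 3.0]]
f_anchor = [[1.0, 1.0], [0.0, 1.0]]

loss_reg = mean_squared_l2(f_syn, f_neg)      # synthetic vs. original negatives
loss_hard = mean_squared_l2(f_syn, f_anchor)  # synthetic vs. anchors
total = daml_total_loss(loss_m=0.2, loss_reg=loss_reg, loss_adv=0.1, loss_hard=loss_hard)
```

With the default weights, the adversarial term dominates the sum because `lambda_2` is 50 while the other weights are 1.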
DVMLCollector
- class gedml.core.collectors.iteration_collectors.dvml_collector.DVMLCollector(embedder_mean, embedder_std, decoder, T=20, phase=1, lambda_1=None, lambda_2=None, lambda_3=None, lambda_4=None, *args, **kwargs)
Bases: gedml.core.collectors.base_collector.BaseCollector
Paper: Deep Variational Metric Learning
Four losses:
loss_kl: KL divergence between learned distribution and isotropic multivariate Gaussian
loss_recon: reconstruction loss of original images and images generated by the decoder
metric learning loss of learned intra-class invariance
metric learning loss of the combination of sampled intra-class variance and learned intra-class invariance
Default parameters recommended in the paper:
lr = 0.0001
T = 20 (for sample generation)
batch_size = 128 (pair-based) or 120 (triplet-based)
There are two phases during training:
1. first phase: cut off the back-propagation of the gradients from the decoder network: lambda_1 = 1, lambda_2 = 1, lambda_3 = 0.1, lambda_4 = 1.
2. second phase: release the constraint: lambda_1 = 0.8, lambda_2 = 1, lambda_3 = 0.2, lambda_4 = 0.8.
- Parameters
embedder_mean (torch.nn.Module) – multi-layer perceptron
embedder_std (torch.nn.Module) – multi-layer perceptron
decoder (torch.nn.Module) – multi-layer perceptron
T (int) – default: 20
phase (int) – 1 for the first phase and 2 for the second phase
lambda_1 (float) – first phase: 1; second phase: 0.8
lambda_2 (float) – first phase: 1; second phase: 1
lambda_3 (float) – first phase: 0.1; second phase: 0.2
lambda_4 (float) – first phase: 1; second phase: 0.8
- forward(data, embeddings, features, labels) → tuple
Four losses are computed in function collect:
\(loss_{total} = \lambda_1 \times loss_{kl} + \lambda_2 \times loss_{recon} + \lambda_3 \times loss_{syn} + \lambda_4 \times loss_{invariant}\)
\(loss_{kl} = KL(p_{dist}, q_{dist})\)
\(loss_{recon} = mean(|f_{decode} - f_{ori}|^2_2)\)
\(loss_{syn} = loss_{metric}(matrix_{syn}, labels)\)
\(loss_{invariant} = loss_{metric}(matrix_{inv}, labels)\)
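The phase-dependent weights listed for this class can be combined with the total-loss formula as in the sketch below. The weight values are the ones documented for the two phases; the lookup table, the function name and the idea of selecting weights by `phase` are illustrative assumptions, since the class itself takes the lambdas as explicit arguments.

```python
# Weights per training phase, as documented for DVMLCollector.
PHASE_WEIGHTS = {
    1: {"lambda_1": 1.0, "lambda_2": 1.0, "lambda_3": 0.1, "lambda_4": 1.0},
    2: {"lambda_1": 0.8, "lambda_2": 1.0, "lambda_3": 0.2, "lambda_4": 0.8},
}

def dvml_total_loss(loss_kl, loss_recon, loss_syn, loss_invariant, phase=1):
    """Weighted sum of the four DVML losses for the given phase (illustrative)."""
    w = PHASE_WEIGHTS[phase]
    return (
        w["lambda_1"] * loss_kl
        + w["lambda_2"] * loss_recon
        + w["lambda_3"] * loss_syn
        + w["lambda_4"] * loss_invariant
    )

phase_1_total = dvml_total_loss(1.0, 1.0, 1.0, 1.0, phase=1)
phase_2_total = dvml_total_loss(1.0, 1.0, 1.0, 1.0, phase=2)
```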
GlobalProxyCollector
- class gedml.core.collectors.epoch_collectors.global_proxy_collector.GlobalProxyCollector(optimizer_name='Adam', optimizer_param={'lr': 0.001}, dataloader_param={'batch_size': 120, 'drop_last': False, 'num_workers': 8, 'shuffle': True}, max_iter=50000, error_bound=0.001, total_patience=10, auth_weight=1.0, repre_weight=1.0, disc_weight=1.0, *args, **kwargs)
Bases: gedml.core.collectors.iteration_collectors.proxy_collector.ProxyCollector, gedml.core.collectors.epoch_collectors._default_global_collector._DefaultGlobalCollector
Compute the global proxies before updating other parameters.
_DefaultGlobalCollector
BaseCollector
- class gedml.core.collectors.base_collector.BaseCollector(metric, **kwargs)
Bases: gedml.core.modules.with_recorder.WithRecorder
Base class of the collector module. It defines the main collector methods in functions collect and update, and defines the default parameters in functions output_list, input_list and _default_next_module.
- Parameters
metric (metric instance) – metric to compute matrix (e.g. euclidean or cosine)
Example
>>> import torch
>>> # MetricFactory and DefaultCollector are assumed to be imported from gedml
>>> metric = MetricFactory(is_normalize=True, metric_name="cosine")
>>> data = torch.randn(10, 3, 227, 227)
>>> embeddings = torch.randn(10, 128)
>>> labels = torch.randint(0, 3, size=(10,))
>>> collector = DefaultCollector(metric=metric)
>>> # collector forward
>>> output_dict = collector(data, embeddings, labels)
- forward(data, embeddings, labels) → tuple
In the collect function, three kinds of operation may be done:
maintain sets of parameters about collecting (or synthesizing) samples
compute the metric matrix and pass it to the next module
compute some regularization term using embeddings
- Parameters
data (torch.Tensor) – Images with RGB channels. size: \(B \times C \times H \times W\)
embeddings (torch.Tensor) – Embedding. size: \(B \times dim\)
labels (torch.Tensor) – Ground truth of dataset. size: \(B \times 1\)
- Returns
The metric matrix, labels, etc., according to function output_list.
Let \(B_{row}\) be the number of rows and \(B_{col}\) be the number of columns. The typical output is listed below:
metric matrix (torch.Tensor): size: \(B_{row} \times B_{col}\)
labels of rows (torch.Tensor): size: \(B_{row} \times 1\) or \(B_{row} \times B_{col}\)
labels of columns (torch.Tensor): size: \(1 \times B_{col}\) or \(B_{row} \times B_{col}\)
is_from_same_source (bool): indicate whether row vectors and column vectors are from the same data
- Return type
tuple
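One way to read the row and column label shapes is that a \(B_{row} \times 1\) column and a \(1 \times B_{col}\) row can be compared against each other to obtain a same-class mask aligned with the metric matrix. The sketch below illustrates this with nested lists for the case where rows and columns come from the same batch; both helper names are hypothetical and the layout is an assumption, not the gedml implementation.

```python
def build_collector_output(metric_matrix, labels):
    """Assemble a (matrix, row labels, column labels, flag) tuple (illustrative)."""
    row_labels = [[label] for label in labels]  # shape: B_row x 1
    col_labels = [list(labels)]                 # shape: 1 x B_col
    is_from_same_source = True                  # rows and columns share one batch
    return metric_matrix, row_labels, col_labels, is_from_same_source

def same_class_mask(row_labels, col_labels):
    """Entry (i, j) is True when row i and column j carry the same label."""
    return [[row[0] == col for col in col_labels[0]] for row in row_labels]

labels = [0, 1, 0]
metric_matrix = [[1.0, 0.2, 0.9], [0.2, 1.0, 0.1], [0.9, 0.1, 1.0]]
output = build_collector_output(metric_matrix, labels)
mask = same_class_mask(output[1], output[2])
```

A downstream module could use such a mask to pick positive and negative pairs from the metric matrix.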