transforms

There are two kinds of transforms:

  1. Augmentation methods, which define how a single image is augmented.

  2. Wrapping methods, which define how to combine multiple streams (see the sketch below).
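
A minimal sketch of how the two kinds fit together, built from the classes documented on this page; the import paths follow the class signatures below, and the Multiplier constant is a hypothetical choice:

>>> from torchvision import transforms
>>> from gedml.core.transforms.img_transforms import ConvertToBGR, Multiplier
>>> from gedml.core.transforms.wrapper_transforms import TwoCropsTransformWrapper
>>> # 1. Augmentation: define how a single image is augmented.
>>> base_transform = transforms.Compose([
>>>     ConvertToBGR(),
>>>     transforms.RandomResizedCrop(224),
>>>     transforms.ToTensor(),
>>>     Multiplier(255.0),  # hypothetical constant
>>> ])
>>> # 2. Wrapping: define how to combine multiple streams (here, two crops).
>>> two_crops_transform = TwoCropsTransformWrapper(base_transform)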

Class

ConvertToBGR

class gedml.core.transforms.img_transforms.ConvertToBGR[source]

Bases: object

Converts a PIL image from RGB to BGR.
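
A minimal sketch of the assumed behavior, reordering the channels of a PIL RGB image:

>>> from PIL import Image
>>> class ConvertToBGR:
>>>     def __call__(self, img):
>>>         # split the RGB bands and merge them back in BGR order
>>>         r, g, b = img.split()
>>>         return Image.merge("RGB", (b, g, r))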

Multiplier

class gedml.core.transforms.img_transforms.Multiplier(multiple)[source]

Bases: object

Multiply the pixel value by a constant.
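
A minimal sketch of the assumed behavior, scaling every pixel of an image by the given constant:

>>> class Multiplier:
>>>     def __init__(self, multiple):
>>>         self.multiple = multiple
>>>     def __call__(self, img):
>>>         return img * self.multiple  # elementwise scaling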

TwoCropsTransformWrapper

class gedml.core.transforms.wrapper_transforms.TwoCropsTransformWrapper(base_transform)[source]

Bases: object

Take two random crops of one image as the query and key. Modified from: https://github.com/facebookresearch/moco
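
The referenced MoCo code applies the base transform twice and returns both views; a sketch along those lines (the exact class body here is an assumption):

>>> class TwoCropsTransformWrapper:
>>>     def __init__(self, base_transform):
>>>         self.base_transform = base_transform
>>>     def __call__(self, x):
>>>         q = self.base_transform(x)  # query view
>>>         k = self.base_transform(x)  # key view
>>>         return [q, k]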

DefaultTransformWrapper

class gedml.core.transforms.wrapper_transforms.DefaultTransformWrapper(base_transform)[source]

Bases: object

Default wrapper.

Resize

class torchvision.transforms.transforms.Resize(size, interpolation=2)[source]

Bases: torch.nn.modules.module.Module

Resize the input image to the given size. The image can be a PIL Image or a torch Tensor, in which case it is expected to have […, H, W] shape, where … means an arbitrary number of leading dimensions.

Parameters
  • size (sequence or int) – Desired output size. If size is a sequence like (h, w), the output size will be matched to it. If size is an int, the smaller edge of the image will be matched to this number, i.e., if height > width, the image will be rescaled to (size * height / width, size). In torchscript mode, size as a single int is not supported; use a tuple or list of length 1: [size, ].

  • interpolation (int, optional) – Desired interpolation enum defined by the PIL.Image filters. Default is PIL.Image.BILINEAR. If input is Tensor, only PIL.Image.NEAREST, PIL.Image.BILINEAR and PIL.Image.BICUBIC are supported.

forward(img)[source]
Parameters

img (PIL Image or Tensor) – Image to be scaled.

Returns

Rescaled image.

Return type

PIL Image or Tensor
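
Typical usage (the file name is a placeholder):

>>> from PIL import Image
>>> from torchvision import transforms
>>> img = Image.open("example.jpg")             # placeholder path
>>> out = transforms.Resize(256)(img)           # smaller edge becomes 256
>>> fixed = transforms.Resize((224, 224))(img)  # exact (h, w) output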

RandomHorizontalFlip

class torchvision.transforms.transforms.RandomHorizontalFlip(p=0.5)[source]

Bases: torch.nn.modules.module.Module

Horizontally flip the given image randomly with a given probability. The image can be a PIL Image or a torch Tensor, in which case it is expected to have […, H, W] shape, where … means an arbitrary number of leading dimensions.

Parameters

p (float) – probability of the image being flipped. Default value is 0.5

forward(img)[source]
Parameters

img (PIL Image or Tensor) – Image to be flipped.

Returns

Randomly flipped image.

Return type

PIL Image or Tensor
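
Typical usage on a tensor image:

>>> import torch
>>> from torchvision import transforms
>>> img = torch.rand(3, 224, 224)   # tensor image with [..., H, W] shape
>>> flip = transforms.RandomHorizontalFlip(p=0.5)
>>> out = flip(img)                 # flipped with probability 0.5, else unchanged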

RandomResizedCrop

class torchvision.transforms.transforms.RandomResizedCrop(size, scale=(0.08, 1.0), ratio=(0.75, 1.3333333333333333), interpolation=2)[source]

Bases: torch.nn.modules.module.Module

Crop the given image to random size and aspect ratio. The image can be a PIL Image or a Tensor, in which case it is expected to have […, H, W] shape, where … means an arbitrary number of leading dimensions.

A crop of random size (default: 0.08 to 1.0 of the original size) and a random aspect ratio (default: 3/4 to 4/3 of the original aspect ratio) is made. This crop is finally resized to the given size. This transform is popularly used to train the Inception networks.

Parameters
  • size (int or sequence) – expected output size of each edge. If size is an int instead of sequence like (h, w), a square output size (size, size) is made. If provided a tuple or list of length 1, it will be interpreted as (size[0], size[0]).

  • scale (tuple of float) – lower and upper bounds of the crop size, as a fraction of the original size.

  • ratio (tuple of float) – lower and upper bounds of the crop's aspect ratio.

  • interpolation (int) – Desired interpolation enum defined by the PIL.Image filters. Default is PIL.Image.BILINEAR. If input is Tensor, only PIL.Image.NEAREST, PIL.Image.BILINEAR and PIL.Image.BICUBIC are supported.

forward(img)[source]
Parameters

img (PIL Image or Tensor) – Image to be cropped and resized.

Returns

Randomly cropped and resized image.

Return type

PIL Image or Tensor

static get_params(img: torch.Tensor, scale: List[float], ratio: List[float]) → Tuple[int, int, int, int][source]

Get parameters for crop for a random sized crop.

Parameters
  • img (PIL Image or Tensor) – Input image.

  • scale (list) – lower and upper bounds of the crop size, as a fraction of the original size.

  • ratio (list) – lower and upper bounds of the crop's aspect ratio.

Returns

params (i, j, h, w) to be passed to crop for a random sized crop.

Return type

tuple
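
Typical usage, including sampling a crop box with get_params without applying it:

>>> import torch
>>> from torchvision import transforms
>>> img = torch.rand(3, 256, 256)
>>> rrc = transforms.RandomResizedCrop(224, scale=(0.2, 1.0))
>>> out = rrc(img)                  # shape (3, 224, 224)
>>> i, j, h, w = transforms.RandomResizedCrop.get_params(
>>>     img, scale=[0.2, 1.0], ratio=[3 / 4, 4 / 3])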

RandomGrayscale

class torchvision.transforms.transforms.RandomGrayscale(p=0.1)[source]

Bases: torch.nn.modules.module.Module

Randomly convert image to grayscale with a probability of p (default 0.1). The image can be a PIL Image or a Tensor, in which case it is expected to have […, 3, H, W] shape, where … means an arbitrary number of leading dimensions.

Parameters

p (float) – probability that image should be converted to grayscale.

Returns

Grayscale version of the input image with probability p and unchanged with probability (1-p).

  • If input image is 1 channel: grayscale version is 1 channel

  • If input image is 3 channel: grayscale version is 3 channel with r == g == b

Return type

PIL Image or Tensor

forward(img)[source]
Parameters

img (PIL Image or Tensor) – Image to be converted to grayscale.

Returns

Randomly grayscaled image.

Return type

PIL Image or Tensor
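
Typical usage on a tensor image:

>>> import torch
>>> from torchvision import transforms
>>> img = torch.rand(3, 224, 224)                 # [..., 3, H, W]
>>> out = transforms.RandomGrayscale(p=0.2)(img)  # r == g == b with probability 0.2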

ColorJitter

class torchvision.transforms.transforms.ColorJitter(brightness=0, contrast=0, saturation=0, hue=0)[source]

Bases: torch.nn.modules.module.Module

Randomly change the brightness, contrast and saturation of an image.

Parameters
  • brightness (float or tuple of float (min, max)) – How much to jitter brightness. brightness_factor is chosen uniformly from [max(0, 1 - brightness), 1 + brightness] or the given [min, max]. Should be non negative numbers.

  • contrast (float or tuple of float (min, max)) – How much to jitter contrast. contrast_factor is chosen uniformly from [max(0, 1 - contrast), 1 + contrast] or the given [min, max]. Should be non negative numbers.

  • saturation (float or tuple of float (min, max)) – How much to jitter saturation. saturation_factor is chosen uniformly from [max(0, 1 - saturation), 1 + saturation] or the given [min, max]. Should be non negative numbers.

  • hue (float or tuple of float (min, max)) – How much to jitter hue. hue_factor is chosen uniformly from [-hue, hue] or the given [min, max]. Should have 0 <= hue <= 0.5 or -0.5 <= min <= max <= 0.5.

forward(img)[source]
Parameters

img (PIL Image or Tensor) – Input image.

Returns

Color jittered image.

Return type

PIL Image or Tensor

static get_params(brightness, contrast, saturation, hue)[source]

Get a randomized transform to be applied on image.

Arguments are the same as those of __init__.

Returns

Transform which randomly adjusts brightness, contrast and saturation in a random order.
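
Typical usage; passing explicit (min, max) tuples to get_params is an assumption consistent with the factor ranges described above:

>>> from PIL import Image
>>> from torchvision import transforms
>>> img = Image.new("RGB", (224, 224), (120, 80, 60))  # toy image
>>> out = transforms.ColorJitter(0.4, 0.4, 0.4, 0.1)(img)
>>> # get_params samples the factors once and returns a reusable transform
>>> sampled = transforms.ColorJitter.get_params(
>>>     brightness=(0.6, 1.4), contrast=(0.6, 1.4),
>>>     saturation=(0.6, 1.4), hue=(-0.1, 0.1))
>>> out_again = sampled(img)        # applies the same sampled factors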

CenterCrop

class torchvision.transforms.transforms.CenterCrop(size)[source]

Bases: torch.nn.modules.module.Module

Crops the given image at the center. The image can be a PIL Image or a torch Tensor, in which case it is expected to have […, H, W] shape, where … means an arbitrary number of leading dimensions.

Parameters

size (sequence or int) – Desired output size of the crop. If size is an int instead of sequence like (h, w), a square crop (size, size) is made. If provided a tuple or list of length 1, it will be interpreted as (size[0], size[0]).

forward(img)[source]
Parameters

img (PIL Image or Tensor) – Image to be cropped.

Returns

Cropped image.

Return type

PIL Image or Tensor
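
Typical usage:

>>> import torch
>>> from torchvision import transforms
>>> img = torch.rand(3, 256, 320)
>>> out = transforms.CenterCrop(224)(img)  # shape (3, 224, 224)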

ToTensor

class torchvision.transforms.transforms.ToTensor[source]

Bases: object

Convert a PIL Image or numpy.ndarray to tensor. This transform does not support torchscript.

Converts a PIL Image or numpy.ndarray (H x W x C) in the range [0, 255] to a torch.FloatTensor of shape (C x H x W) in the range [0.0, 1.0] if the PIL Image belongs to one of the modes (L, LA, P, I, F, RGB, YCbCr, RGBA, CMYK, 1) or if the numpy.ndarray has dtype = np.uint8.

In the other cases, tensors are returned without scaling.

Note

Because the input image is scaled to [0.0, 1.0], this transformation should not be used when transforming target image masks. See the references for implementing the transforms for image masks.
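
Typical usage on a uint8 array:

>>> import numpy as np
>>> from torchvision import transforms
>>> arr = np.random.randint(0, 256, (224, 224, 3), dtype=np.uint8)  # H x W x C
>>> t = transforms.ToTensor()(arr)  # shape (3, 224, 224), values in [0.0, 1.0]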

Normalize

class torchvision.transforms.transforms.Normalize(mean, std, inplace=False)[source]

Bases: torch.nn.modules.module.Module

Normalize a tensor image with mean and standard deviation. Given mean: (mean[1],...,mean[n]) and std: (std[1],...,std[n]) for n channels, this transform will normalize each channel of the input torch.*Tensor, i.e., output[channel] = (input[channel] - mean[channel]) / std[channel].

Note

This transform acts out of place, i.e., it does not mutate the input tensor.

Parameters
  • mean (sequence) – Sequence of means for each channel.

  • std (sequence) – Sequence of standard deviations for each channel.

  • inplace (bool, optional) – whether to perform this operation in-place.

forward(tensor: torch.Tensor) → torch.Tensor[source]
Parameters

tensor (Tensor) – Tensor image to be normalized.

Returns

Normalized Tensor image.

Return type

Tensor
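
Typical usage, with the ImageNet statistics that also appear in the Compose example below:

>>> import torch
>>> from torchvision import transforms
>>> t = torch.rand(3, 224, 224)     # tensor image in [0, 1]
>>> normalize = transforms.Normalize((0.485, 0.456, 0.406),
>>>                                  (0.229, 0.224, 0.225))
>>> out = normalize(t)              # (x - mean) / std per channel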

Compose

class torchvision.transforms.transforms.Compose(transforms)[source]

Bases: object

Composes several transforms together. This transform does not support torchscript. Please, see the note below.

Parameters

transforms (list of Transform objects) – list of transforms to compose.

Example

>>> transforms.Compose([
>>>     transforms.CenterCrop(10),
>>>     transforms.ToTensor(),
>>> ])

Note

In order to script the transformations, please use torch.nn.Sequential as below.

>>> transforms = torch.nn.Sequential(
>>>     transforms.CenterCrop(10),
>>>     transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225)),
>>> )
>>> scripted_transforms = torch.jit.script(transforms)

Make sure to use only scriptable transformations, i.e., those that work with torch.Tensor and do not require lambda functions or PIL.Image.