mxnet.gluon.contrib.data.vision.dataloader

Contrib Vision DataLoaders.

Functions

create_bbox_augment(data_shape[, rand_crop, ...])

Create augmenters for bbox/object detection.

create_image_augment(data_shape[, resize, ...])

Creates an augmenter block.

Classes

BboxLabelTransform([coord_normalized])

Transform that converts a 1-D bbox label into a 2-D array of shape Nx5.

ImageBboxDataLoader(batch_size, data_shape)

Image iterator with a large number of augmentation choices for detection.

ImageDataLoader(batch_size, data_shape[, ...])

Image data loader with a large number of augmentation choices.

class mxnet.gluon.contrib.data.vision.dataloader.ImageBboxDataLoader(batch_size, data_shape, path_imgrec=None, path_imglist=None, path_root='.', part_index=0, num_parts=1, aug_list=None, imglist=None, coord_normalized=True, dtype='float32', shuffle=False, sampler=None, last_batch=None, batch_sampler=None, batchify_fn=None, num_workers=0, pin_memory=False, pin_device_id=0, prefetch=None, thread_pool=False, timeout=120, try_nopython=None, **kwargs)

Bases: object

Image iterator with a large number of augmentation choices for detection.

Parameters:
  • batch_size (int) – Number of examples per batch.

  • data_shape (tuple) – Data shape in (channels, height, width) format. Currently, only 3-channel RGB images are supported.

  • path_imgrec (str) – Path to image record file (.rec). Created with tools/im2rec.py or bin/im2rec.

  • path_imglist (str) – Path to image list (.lst). Created with tools/im2rec.py or with a custom script. Format: tab-separated record of index, one or more labels, and relative_path_from_root.

  • imglist (list) – A list of images with their label(s). Each item is a list of the form [label, image_path], where the label is a float or a list of floats.

  • path_root (str) – Root folder of image files.

  • shuffle (bool) – Whether to shuffle all images at the start of each iteration. Can be slow on HDDs.

  • aug_list (list or None) – Augmenter list for generating distorted images.

  • part_index (int) – Partition index.

  • num_parts (int) – Total number of partitions.

  • last_batch ({'keep', 'discard', 'rollover'}) –

    How to handle the last batch if batch_size does not evenly divide len(dataset).

    keep - A batch with fewer samples than previous batches is returned.

    discard - The last batch is discarded if it is incomplete.

    rollover - The remaining samples are rolled over to the next epoch.

  • kwargs – More arguments for creating the augmenter. See mxnet.gluon.contrib.data.vision.dataloader.create_bbox_augment.
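The three last_batch modes described above can be illustrated with a minimal pure-Python sketch. This is not the loader's implementation: the helper split_batches and its carry argument are hypothetical names introduced here, and the assumption that rolled-over samples are placed at the start of the next epoch is ours.

```python
def split_batches(indices, batch_size, last_batch="keep", carry=()):
    """Group indices into batches and return (batches, leftover).

    `carry` holds samples rolled over from the previous epoch (hypothetical
    argument; assumed to be consumed first in the next epoch).
    """
    if last_batch not in ("keep", "discard", "rollover"):
        raise ValueError(f"unknown last_batch mode: {last_batch!r}")
    pool = list(carry) + list(indices)
    # Index where the last complete batch ends.
    full_end = len(pool) - len(pool) % batch_size
    batches = [pool[i:i + batch_size] for i in range(0, full_end, batch_size)]
    rest = pool[full_end:]
    if last_batch == "keep" and rest:
        batches.append(rest)  # final batch is smaller than the others
    # Only 'rollover' hands the incomplete remainder to the next epoch.
    leftover = rest if last_batch == "rollover" else []
    return batches, leftover


# Ten samples, batch size four: two samples do not fill a batch.
kept, _ = split_batches(range(10), 4, "keep")
dropped, _ = split_batches(range(10), 4, "discard")
epoch1, leftover = split_batches(range(10), 4, "rollover")
epoch2, _ = split_batches(range(10), 4, "rollover", carry=leftover)
```

With 'keep' the epoch ends on a two-sample batch, with 'discard' those two samples are skipped for that epoch, and with 'rollover' they are combined with the next epoch's samples.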

class mxnet.gluon.contrib.data.vision.dataloader.ImageDataLoader(batch_size, data_shape, path_imgrec=None, path_imglist=None, path_root='.', part_index=0, num_parts=1, aug_list=None, imglist=None, dtype='float32', shuffle=False, sampler=None, last_batch=None, batch_sampler=None, batchify_fn=None, num_workers=0, pin_memory=False, pin_device_id=0, prefetch=None, thread_pool=False, timeout=120, try_nopython=None, **kwargs)

Bases: object

Image data loader with a large number of augmentation choices. This loader supports reading from both .rec files and raw image files.

To load input images from .rec files, use the path_imgrec parameter. To load from raw image files, use the path_imglist and path_root parameters.

To partition the data (for distributed training), set the part_index and num_parts parameters. To shuffle, set the shuffle parameter.
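As a rough illustration of what partitioning with part_index and num_parts means, the sketch below splits sample indices into contiguous, nearly equal parts. The helper shard_indices is hypothetical, and the contiguous layout is an assumption made for illustration; the loader's actual partitioning scheme is not specified in this reference.

```python
def shard_indices(num_samples, num_parts, part_index):
    """Return the sample indices assigned to one partition.

    Assumes contiguous shards whose sizes differ by at most one
    (an illustrative choice, not confirmed by the documentation).
    """
    if not 0 <= part_index < num_parts:
        raise ValueError("part_index must satisfy 0 <= part_index < num_parts")
    base, extra = divmod(num_samples, num_parts)
    # The first `extra` parts each receive one additional sample.
    start = part_index * base + min(part_index, extra)
    end = start + base + (1 if part_index < extra else 0)
    return list(range(start, end))


# Three workers sharing ten samples.
parts = [shard_indices(10, 3, i) for i in range(3)]
```

Each worker in a distributed job would pass its own part_index with the same num_parts, so that together the workers cover every sample exactly once.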

Parameters:
  • batch_size (int) – Number of examples per batch.

  • data_shape (tuple) – Data shape in (channels, height, width) format. Currently, only 3-channel RGB images are supported.

  • path_imgrec (str) – Path to image record file (.rec). Created with tools/im2rec.py or bin/im2rec.

  • path_imglist (str) – Path to image list (.lst). Created with tools/im2rec.py or with a custom script. Format: tab-separated record of index, one or more labels, and relative_path_from_root.

  • imglist (list) – A list of images with their label(s). Each item is a list of the form [label, image_path], where the label is a float or a list of floats.

  • path_root (str) – Root folder of image files.

  • shuffle (bool) – Whether to shuffle all images at the start of each iteration. Can be slow on HDDs.

  • part_index (int) – Partition index.

  • num_parts (int) – Total number of partitions.

  • dtype (str) – Label data type. Default: float32. Other options: int32, int64, float64.

  • last_batch ({'keep', 'discard', 'rollover'}) –

    How to handle the last batch if batch_size does not evenly divide len(dataset).

    keep - A batch with fewer samples than previous batches is returned.

    discard - The last batch is discarded if it is incomplete.

    rollover - The remaining samples are rolled over to the next epoch.

  • kwargs – More arguments for creating the augmenter. See mxnet.gluon.contrib.data.vision.dataloader.create_image_augment.
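The .lst record format described for path_imglist (tab-separated index, one or more labels, then the path relative to path_root) can be sketched as follows. The helpers format_lst_line and parse_lst_line are hypothetical names for illustration, and writing labels as plain floats is an assumption; tools/im2rec.py may format numbers differently.

```python
def format_lst_line(index, labels, rel_path):
    """Build one .lst record: index, label(s), relative path, tab-separated."""
    label_fields = [str(float(label)) for label in labels]
    return "\t".join([str(index)] + label_fields + [rel_path])


def parse_lst_line(line):
    """Split one .lst record back into (index, labels, rel_path)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3:
        raise ValueError("expected index, at least one label, and a path")
    index = int(fields[0])
    labels = [float(value) for value in fields[1:-1]]
    return index, labels, fields[-1]


# A single-label record and a two-label record.
single = format_lst_line(0, [1], "cats/a.jpg")
multi = format_lst_line(7, [0, 2.5], "x/y.png")
parsed = parse_lst_line(multi)
```

The same information can be supplied in memory through imglist, where each item pairs the label (or list of labels) with the image path.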

mxnet.gluon.contrib.data.vision.dataloader.create_image_augment(data_shape, resize=0, rand_crop=False, rand_resize=False, rand_mirror=False, mean=None, std=None, brightness=0, contrast=0, saturation=0, hue=0, pca_noise=0, rand_gray=0, inter_method=2, dtype='float32')

Creates an augmenter block.

Parameters:
  • data_shape (tuple of int) – Shape for output data

  • resize (int) – If larger than 0, resize the shorter edge to this size at the beginning

  • rand_crop (bool) – Whether to use random cropping instead of center cropping

  • rand_resize (bool) – Whether to enable random sized cropping; requires rand_crop to be enabled

  • rand_gray (float) – Probability, in [0, 1], of converting the image to grayscale. All channels are converted; the number of channels is not reduced to 1

  • rand_mirror (bool) – Whether to apply horizontal flip to image with probability 0.5

  • mean (np.ndarray or None) – Mean pixel values for [r, g, b]

  • std (np.ndarray or None) – Standard deviations for [r, g, b]

  • brightness (float) – Brightness jittering range (percent)

  • contrast (float) – Contrast jittering range (percent)

  • saturation (float) – Saturation jittering range (percent)

  • hue (float) – Hue jittering range (percent)

  • pca_noise (float) – PCA noise level (percent)

  • inter_method (int, default=2 (Bicubic)) –

    Interpolation method for all resizing operations.

    Possible values:

    0: Nearest Neighbors interpolation.

    1: Bilinear interpolation.

    2: Bicubic interpolation over 4x4 pixel neighborhood (used by default).

    3: Area-based (resampling using pixel area relation). It may be a preferred method for image decimation, as it gives moire-free results. But when the image is zoomed, it is similar to the Nearest Neighbors method.

    4: Lanczos interpolation over 8x8 pixel neighborhood.

    10: Randomly select one of the interpolation methods mentioned above.

    Note: When shrinking an image, it will generally look best with Area-based interpolation, whereas when enlarging an image, it will generally look best with Bicubic (slow) or Bilinear (faster but still looks OK).
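The inter_method codes listed above can be summarized in a small lookup. The sketch below is illustrative only: INTERPOLATION_NAMES and resolve_inter_method are hypothetical names, and treating code 10 as a uniform random choice among codes 0 to 4 is an assumption about how "random select" behaves.

```python
import random

# Interpolation codes as listed in the parameter description.
INTERPOLATION_NAMES = {
    0: "nearest",
    1: "bilinear",
    2: "bicubic",
    3: "area",
    4: "lanczos",
}


def resolve_inter_method(inter_method, rng=random):
    """Map an inter_method value to a concrete interpolation code.

    Code 10 picks one of the fixed methods at random (assumed uniform).
    """
    if inter_method == 10:
        return rng.choice(sorted(INTERPOLATION_NAMES))
    if inter_method not in INTERPOLATION_NAMES:
        raise ValueError(f"unsupported inter_method: {inter_method}")
    return inter_method


# A fixed code resolves to itself; code 10 resolves to one of 0..4.
fixed = resolve_inter_method(3)
picked = resolve_inter_method(10, random.Random(0))
```

Passing a seeded random.Random instance keeps the random choice reproducible, which can help when debugging augmentation pipelines.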

Examples

>>> # An example of creating multiple augmenters
>>> import mxnet as mx
>>> augs = mx.gluon.contrib.data.vision.dataloader.create_image_augment(
...    data_shape=(3, 300, 300), rand_mirror=True,
...    mean=True, brightness=0.125, contrast=0.125, rand_gray=0.05,
...    saturation=0.125, pca_noise=0.05, inter_method=10)