mxnet.gluon.contrib.data.vision.dataloader
Contrib Vision DataLoaders.
Functions
| create_bbox_augment | Create augmenters for bbox/object detection. |
| create_image_augment | Creates an augmenter block. |
Classes
| BboxLabelTransform | Transform to convert 1-D bbox label to 2-D as in shape Nx5. |
| ImageBboxDataLoader | Image iterator with a large number of augmentation choices for detection. |
| ImageDataLoader | Image data loader with a large number of augmentation choices. |
- class mxnet.gluon.contrib.data.vision.dataloader.ImageBboxDataLoader(batch_size, data_shape, path_imgrec=None, path_imglist=None, path_root='.', part_index=0, num_parts=1, aug_list=None, imglist=None, coord_normalized=True, dtype='float32', shuffle=False, sampler=None, last_batch=None, batch_sampler=None, batchify_fn=None, num_workers=0, pin_memory=False, pin_device_id=0, prefetch=None, thread_pool=False, timeout=120, try_nopython=None, **kwargs)
Bases: object
Image iterator with a large number of augmentation choices for detection.
- Parameters:
batch_size (int) – Number of examples per batch.
data_shape (tuple) – Data shape in (channels, height, width) format. For now, only RGB images with 3 channels are supported.
path_imgrec (str) – Path to image record file (.rec). Created with tools/im2rec.py or bin/im2rec.
path_imglist (str) – Path to image list (.lst). Created with tools/im2rec.py or with custom script. Format: Tab separated record of index, one or more labels and relative_path_from_root.
imglist (list) – A list of images with the label(s). Each item is a list [imagelabel: float or list of float, imgpath].
path_root (str) – Root folder of image files.
shuffle (bool) – Whether to shuffle all images at the start of each iteration or not. Can be slow for HDD.
aug_list (list or None) – Augmenter list for generating distorted images.
part_index (int) – Partition index.
num_parts (int) – Total number of partitions.
last_batch ({'keep', 'discard', 'rollover'}) –
How to handle the last batch if batch_size does not evenly divide len(dataset).
keep - A batch with fewer samples than previous batches is returned.
discard - The last batch is discarded if it is incomplete.
rollover - The remaining samples are rolled over to the next epoch.
kwargs – More arguments for creating the augmenter. See mxnet.gluon.contrib.data.vision.dataloader.create_bbox_augment.
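The module summary mentions converting a 1-D bbox label into a 2-D array of shape Nx5. As a rough illustration of that idea, the sketch below assumes a flat label that starts with a two-value header (header width, then values per object) followed by the object rows, and that each row holds a class id plus four box coordinates. This layout and the helper name `flat_label_to_rows` are assumptions made for illustration, not the library's implementation.

```python
def flat_label_to_rows(label, obj_width=5):
    """Reshape a flat detection label into rows of `obj_width` values.

    Assumed (hypothetical) layout of `label`:
    [header_width, obj_width, <optional extra header>, obj0..., obj1..., ...]
    """
    header_width = int(label[0])
    declared_width = int(label[1])
    if declared_width != obj_width:
        raise ValueError("expected %d values per object, got %d"
                         % (obj_width, declared_width))
    # Everything after the header is object data.
    body = list(label[header_width:])
    if len(body) % obj_width != 0:
        raise ValueError("label body length is not a multiple of obj_width")
    return [body[i:i + obj_width] for i in range(0, len(body), obj_width)]


# Two objects, each assumed to be [class_id, xmin, ymin, xmax, ymax].
flat = [2, 5, 0, 0.1, 0.2, 0.5, 0.6, 1, 0.3, 0.3, 0.9, 0.8]
rows = flat_label_to_rows(flat)
```

With `coord_normalized=True`, the box coordinates in such rows would be fractions of the image size rather than pixel values.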
- class mxnet.gluon.contrib.data.vision.dataloader.ImageDataLoader(batch_size, data_shape, path_imgrec=None, path_imglist=None, path_root='.', part_index=0, num_parts=1, aug_list=None, imglist=None, dtype='float32', shuffle=False, sampler=None, last_batch=None, batch_sampler=None, batchify_fn=None, num_workers=0, pin_memory=False, pin_device_id=0, prefetch=None, thread_pool=False, timeout=120, try_nopython=None, **kwargs)
Bases: object
Image data loader with a large number of augmentation choices. This loader supports reading from both .rec files and raw image files.
To load input images from .rec files, use path_imgrec parameter and to load from raw image files, use path_imglist and path_root parameters.
To use data partition (for distributed training) or shuffling, specify path_imgidx parameter.
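To make the part_index and num_parts parameters more concrete, here is a minimal sketch of one common way to split a dataset into contiguous, equally sized partitions, one per worker. The helper `partition_range` is hypothetical, and the loader's actual partitioning scheme may differ (for example in how it treats leftover samples).

```python
def partition_range(num_samples, part_index, num_parts):
    """Return the sample indices owned by partition `part_index`.

    Illustrative scheme only: equal contiguous slices; any remainder
    at the end of the dataset is not assigned to a partition.
    """
    if num_parts < 1 or not 0 <= part_index < num_parts:
        raise ValueError("part_index must satisfy 0 <= part_index < num_parts")
    per_part = num_samples // num_parts
    start = part_index * per_part
    return range(start, start + per_part)


# 10 samples split across 3 workers: each worker sees 3 samples.
parts = [list(partition_range(10, i, 3)) for i in range(3)]
```

In a distributed job, each worker would pass its own rank as part_index and the total worker count as num_parts.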
- Parameters:
batch_size (int) – Number of examples per batch.
data_shape (tuple) – Data shape in (channels, height, width) format. For now, only RGB images with 3 channels are supported.
path_imgrec (str) – Path to image record file (.rec). Created with tools/im2rec.py or bin/im2rec.
path_imglist (str) – Path to image list (.lst). Created with tools/im2rec.py or with custom script. Format: Tab separated record of index, one or more labels and relative_path_from_root.
imglist (list) – A list of images with the label(s). Each item is a list [imagelabel: float or list of float, imgpath].
path_root (str) – Root folder of image files.
shuffle (bool) – Whether to shuffle all images at the start of each iteration or not. Can be slow for HDD.
part_index (int) – Partition index.
num_parts (int) – Total number of partitions.
dtype (str) – Label data type. Default: float32. Other options: int32, int64, float64
last_batch ({'keep', 'discard', 'rollover'}) –
How to handle the last batch if batch_size does not evenly divide len(dataset).
keep - A batch with fewer samples than previous batches is returned.
discard - The last batch is discarded if it is incomplete.
rollover - The remaining samples are rolled over to the next epoch.
kwargs – More arguments for creating the augmenter. See mxnet.gluon.contrib.data.vision.dataloader.create_image_augment.
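The three last_batch modes are easiest to compare side by side. The following pure-Python sketch mimics the described behaviour on a list of sample indices; the function `make_batches` and its `carried` argument are hypothetical names for illustration and do not reflect how the loader is implemented internally.

```python
def make_batches(samples, batch_size, last_batch="keep", carried=()):
    """Split samples into batches and return (batches, leftover).

    `carried` holds samples rolled over from a previous epoch;
    `leftover` is non-empty only for last_batch='rollover'.
    """
    if last_batch not in ("keep", "discard", "rollover"):
        raise ValueError("unknown last_batch mode: %r" % (last_batch,))
    pool = list(carried) + list(samples)
    full = len(pool) - len(pool) % batch_size
    batches = [pool[i:i + batch_size] for i in range(0, full, batch_size)]
    tail = pool[full:]
    if tail and last_batch == "keep":
        batches.append(tail)   # return the short batch as-is
        tail = []
    elif last_batch == "discard":
        tail = []              # drop the incomplete batch
    return batches, tail       # for 'rollover', tail feeds the next epoch


data = list(range(10))
keep_batches, _ = make_batches(data, 4, "keep")
discard_batches, _ = make_batches(data, 4, "discard")
roll_batches, leftover = make_batches(data, 4, "rollover")
next_epoch, _ = make_batches(data, 4, "rollover", carried=leftover)
```

With 10 samples and a batch size of 4, keep yields three batches (the last with 2 samples), discard yields two, and rollover yields two while carrying the remaining 2 samples into the next epoch.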
- mxnet.gluon.contrib.data.vision.dataloader.create_image_augment(data_shape, resize=0, rand_crop=False, rand_resize=False, rand_mirror=False, mean=None, std=None, brightness=0, contrast=0, saturation=0, hue=0, pca_noise=0, rand_gray=0, inter_method=2, dtype='float32')
Creates an augmenter block.
- Parameters:
resize (int) – If larger than 0, resize the shorter edge at the beginning.
rand_crop (bool) – Whether to enable random cropping instead of center cropping.
rand_resize (bool) – Whether to enable random sized cropping; requires rand_crop to be enabled.
rand_gray (float) – Probability in [0, 1] of converting the image to grayscale across all channels; the number of channels is not reduced to 1.
rand_mirror (bool) – Whether to apply horizontal flip to image with probability 0.5
mean (np.ndarray or None) – Mean pixel values for [r, g, b]
std (np.ndarray or None) – Standard deviations for [r, g, b]
brightness (float) – Brightness jittering range (percent)
contrast (float) – Contrast jittering range (percent)
saturation (float) – Saturation jittering range (percent)
hue (float) – Hue jittering range (percent)
pca_noise (float) – PCA noise level (percent)
inter_method (int, default=2) –
Interpolation method for all resizing operations.
Possible values:
0: Nearest Neighbors Interpolation.
1: Bilinear interpolation.
2: Bicubic interpolation over 4x4 pixel neighborhood.
3: Area-based (resampling using pixel area relation). It may be a preferred method for image decimation, as it gives moire-free results. But when the image is zoomed, it is similar to the Nearest Neighbors method.
4: Lanczos interpolation over 8x8 pixel neighborhood.
10: Randomly select one of the interpolation methods mentioned above.
Note: When shrinking an image, it will generally look best with Area-based interpolation, whereas when enlarging an image, it will generally look best with Bicubic (slow) or Bilinear (faster but still looks OK).
Examples
>>> # An example of creating multiple augmenters
>>> import mxnet as mx
>>> augs = mx.gluon.contrib.data.vision.dataloader.create_image_augment(
...     data_shape=(3, 300, 300), rand_mirror=True,
...     mean=True, brightness=0.125, contrast=0.125, rand_gray=0.05,
...     saturation=0.125, pca_noise=0.05, inter_method=10)
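The note on inter_method suggests different interpolation codes for shrinking and enlarging. A minimal sketch of that rule of thumb, using the numeric codes listed for inter_method, might look like the following. The constant names and the helper `suggest_inter_method` are hypothetical and not part of the MXNet API; "shrinking" is taken here to mean that the target has fewer pixels than the source, which is itself a simplifying assumption.

```python
# Numeric codes as listed for inter_method; the names are illustrative only.
INTER_NEAREST = 0
INTER_BILINEAR = 1
INTER_BICUBIC = 2
INTER_AREA = 3
INTER_LANCZOS = 4


def suggest_inter_method(src_hw, dst_hw, prefer_speed=False):
    """Pick an interpolation code from source and target (height, width)."""
    src_h, src_w = src_hw
    dst_h, dst_w = dst_hw
    if dst_h * dst_w < src_h * src_w:
        # Shrinking: area-based resampling generally looks best.
        return INTER_AREA
    # Enlarging (or same size): bicubic for quality, bilinear for speed.
    return INTER_BILINEAR if prefer_speed else INTER_BICUBIC


shrink_code = suggest_inter_method((600, 800), (300, 300))
enlarge_code = suggest_inter_method((100, 100), (300, 300))
fast_enlarge_code = suggest_inter_method((100, 100), (300, 300), prefer_speed=True)
```

The returned value could then be passed as inter_method when the typical input size is known in advance; passing 10 instead leaves the choice to random selection.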