mxnet.image.detection
Read images and perform augmentations for object detection.
Functions
CreateDetAugmenter | Create augmenters for detection. |
CreateMultiRandCropAugmenter | Helper function to create multiple random crop augmenters. |
Classes
DetAugmenter | Detection base augmenter. |
DetBorrowAug | Borrow standard augmenter from image classification. |
DetHorizontalFlipAug | Random horizontal flipping. |
DetRandomCropAug | Random cropping with constraints. |
DetRandomPadAug | Random padding augmenter. |
DetRandomSelectAug | Randomly select one augmenter to apply, with chance to skip all. |
ImageDetIter | Image iterator with a large number of augmentation choices for detection. |
- mxnet.image.detection.CreateDetAugmenter(data_shape, resize=0, rand_crop=0, rand_pad=0, rand_gray=0, rand_mirror=False, mean=None, std=None, brightness=0, contrast=0, saturation=0, pca_noise=0, hue=0, inter_method=2, min_object_covered=0.1, aspect_ratio_range=(0.75, 1.33), area_range=(0.05, 3.0), min_eject_coverage=0.3, max_attempts=50, pad_val=(127, 127, 127))
Create augmenters for detection.
- Parameters:
resize (int) – If larger than 0, resize the shorter edge to this size at the beginning
rand_crop (float) – [0, 1], probability to apply random cropping
rand_pad (float) – [0, 1], probability to apply random padding
rand_gray (float) – [0, 1], probability to convert to grayscale for all channels
rand_mirror (bool) – Whether to apply horizontal flip to image with probability 0.5
mean (np.ndarray or None) – Mean pixel values for [r, g, b]
std (np.ndarray or None) – Standard deviations for [r, g, b]
brightness (float) – Brightness jittering range (percent)
contrast (float) – Contrast jittering range (percent)
saturation (float) – Saturation jittering range (percent)
hue (float) – Hue jittering range (percent)
pca_noise (float) – PCA noise level (percent)
inter_method (int, default=2 (area-based)) – Interpolation method for all resizing operations. Possible values:
0: Nearest-neighbor interpolation.
1: Bilinear interpolation.
2: Area-based interpolation (resampling using pixel area relation). Often preferred for image decimation because it gives moiré-free results; when the image is enlarged, it behaves like nearest-neighbor interpolation. Used by default.
3: Bicubic interpolation over a 4x4 pixel neighborhood.
4: Lanczos interpolation over an 8x8 pixel neighborhood.
9: Bicubic for enlarging, area-based for shrinking, bilinear otherwise.
10: Randomly select one of the interpolation methods mentioned above.
Note: Shrinking an image generally looks best with area-based interpolation, whereas enlarging an image generally looks best with bicubic (slow) or bilinear (faster, still acceptable) interpolation.
min_object_covered (float) – The cropped area of the image must contain at least this fraction of any bounding box supplied. The value of this parameter should be non-negative. In the case of 0, the cropped area does not need to overlap any of the bounding boxes supplied.
min_eject_coverage (float) – The minimum coverage of a cropped object with respect to its original size. With this constraint, objects left with only marginal area after the crop are discarded.
aspect_ratio_range (tuple of floats) – The cropped area of the image must have an aspect ratio = width / height within this range.
area_range (tuple of floats) – The cropped area of the image must contain a fraction of the supplied image within this range.
max_attempts (int) – Number of attempts at generating a cropped/padded region of the image that satisfies the specified constraints. After max_attempts failures, the original image is returned.
pad_val (float or tuple of float) – Pixel value to be filled when padding is enabled. If applicable, mean is automatically subtracted from pad_val and the result is divided by std.
Examples
>>> import mxnet as mx
>>> # An example of creating multiple augmenters
>>> augs = mx.image.CreateDetAugmenter(data_shape=(3, 300, 300), rand_crop=0.5,
...     rand_pad=0.5, rand_mirror=True, mean=True, brightness=0.125, contrast=0.125,
...     saturation=0.125, pca_noise=0.05, inter_method=10, min_object_covered=[0.3, 0.5, 0.9],
...     area_range=(0.3, 3.0))
>>> # dump the details
>>> for aug in augs:
...     aug.dumps()
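The inter_method codes 9 and 10 described above pick a concrete interpolation method per resize. The following is a minimal pure-Python sketch of that selection, written only from the wording of the parameter description; the helper name pick_interp and its sizes argument are hypothetical and are not part of the documented API, and the library's internal behaviour may differ.

```python
import random


def pick_interp(inter_method, sizes=None, rng=random):
    """Resolve an inter_method code to a concrete code in 0..4.

    sizes is a hypothetical (old_h, old_w, new_h, new_w) tuple.
    """
    if inter_method == 9:
        if sizes is None:
            return 2  # assumption: fall back to area-based when sizes are unknown
        old_h, old_w, new_h, new_w = sizes
        if new_h > old_h and new_w > old_w:
            return 3  # bicubic when enlarging
        if new_h < old_h and new_w < old_w:
            return 2  # area-based when shrinking
        return 1  # bilinear otherwise
    if inter_method == 10:
        # random choice among the concrete methods listed in the description
        return rng.choice([0, 1, 2, 3, 4])
    return inter_method  # already a concrete method


enlarge = pick_interp(9, sizes=(100, 100, 200, 200))
shrink = pick_interp(9, sizes=(200, 200, 100, 100))
mixed = pick_interp(9, sizes=(100, 200, 200, 100))
```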
- mxnet.image.detection.CreateMultiRandCropAugmenter(min_object_covered=0.1, aspect_ratio_range=(0.75, 1.33), area_range=(0.05, 1.0), min_eject_coverage=0.3, max_attempts=50, skip_prob=0)
Helper function to create multiple random crop augmenters.
- Parameters:
min_object_covered (float or list of float, default=0.1) – The cropped area of the image must contain at least this fraction of any bounding box supplied. The value of this parameter should be non-negative. In the case of 0, the cropped area does not need to overlap any of the bounding boxes supplied.
min_eject_coverage (float or list of float, default=0.3) – The minimum coverage of a cropped object with respect to its original size. With this constraint, objects left with only marginal area after the crop are discarded.
aspect_ratio_range (tuple of floats or list of tuple of floats, default=(0.75, 1.33)) – The cropped area of the image must have an aspect ratio = width / height within this range.
area_range (tuple of floats or list of tuple of floats, default=(0.05, 1.0)) – The cropped area of the image must contain a fraction of the supplied image within this range.
max_attempts (int or list of int, default=50) – Number of attempts at generating a cropped/padded region of the image that satisfies the specified constraints. After max_attempts failures, the original image is returned.
Examples
>>> import mxnet as mx
>>> # An example of creating multiple random crop augmenters
>>> min_object_covered = [0.1, 0.3, 0.5, 0.7, 0.9]  # use 5 augmenters
>>> aspect_ratio_range = (0.75, 1.33)  # use same range for all augmenters
>>> area_range = [(0.1, 1.0), (0.2, 1.0), (0.2, 1.0), (0.3, 0.9), (0.5, 1.0)]
>>> min_eject_coverage = 0.3
>>> max_attempts = 50
>>> aug = mx.image.det.CreateMultiRandCropAugmenter(min_object_covered=min_object_covered,
...     aspect_ratio_range=aspect_ratio_range, area_range=area_range,
...     min_eject_coverage=min_eject_coverage, max_attempts=max_attempts, skip_prob=0)
>>> aug.dumps()  # show some details
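The example mixes list-valued and single-valued settings, which suggests that single values are shared by every crop augmenter. A minimal sketch of how such broadcasting could work is shown here; align_parameters is a hypothetical helper written for illustration, and it is an assumption that the library aligns its parameters in the same way.

```python
def align_parameters(params):
    """Broadcast single settings so every parameter has one value per augmenter."""
    # Anything that is not a list (a float, an int, a tuple) counts as one shared value.
    as_lists = [p if isinstance(p, list) else [p] for p in params]
    num = max(len(p) for p in as_lists)
    aligned = []
    for p in as_lists:
        if len(p) == 1:
            p = p * num  # repeat the shared value for each augmenter
        elif len(p) != num:
            raise ValueError("list parameters must all have the same length")
        aligned.append(p)
    return aligned


covered, ratios, areas = align_parameters([
    [0.1, 0.3, 0.5],                        # one value per augmenter
    (0.75, 1.33),                           # a single tuple shared by all augmenters
    [(0.1, 1.0), (0.2, 1.0), (0.3, 0.9)],   # one range per augmenter
])
# Each zipped row is the configuration of one crop augmenter.
configs = list(zip(covered, ratios, areas))
```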
- class mxnet.image.detection.DetBorrowAug(augmenter)
Bases: DetAugmenter
Borrow a standard augmenter from image classification. This is appropriate when the label is known to be unaffected by the augmenter.
- Parameters:
augmenter (mx.image.Augmenter) – The borrowed standard augmenter which has no effect on label
- class mxnet.image.detection.DetHorizontalFlipAug(p)
Bases: DetAugmenter
Random horizontal flipping.
- Parameters:
p (float) – Probability in [0, 1] of flipping the image
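Unlike a classification flip, a detection flip has to mirror the bounding boxes together with the image. A minimal sketch of the idea follows; it assumes each label row is [class_id, xmin, ymin, xmax, ymax] with coordinates normalized to [0, 1], which is an assumption about the label layout rather than something this page states. The helper names are hypothetical.

```python
import random


def flip_boxes(boxes):
    """Mirror normalized boxes [cls, xmin, ymin, xmax, ymax] about the vertical axis."""
    # The old right edge becomes the new left edge, and vice versa.
    return [[c, 1.0 - x2, y1, 1.0 - x1, y2] for c, x1, y1, x2, y2 in boxes]


def maybe_flip(image_rows, boxes, p, rng=random):
    """Flip the image rows and the boxes together with probability p."""
    if rng.random() < p:
        return [row[::-1] for row in image_rows], flip_boxes(boxes)
    return image_rows, boxes


image = [[1, 2, 3], [4, 5, 6]]          # a tiny 2x3 single-channel image
boxes = [[0, 0.1, 0.2, 0.4, 0.8]]
flipped_image, flipped_boxes = maybe_flip(image, boxes, p=1.0)
```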
- class mxnet.image.detection.DetRandomCropAug(min_object_covered=0.1, aspect_ratio_range=(0.75, 1.33), area_range=(0.05, 1.0), min_eject_coverage=0.3, max_attempts=50)
Bases: DetAugmenter
Random cropping with constraints.
- Parameters:
min_object_covered (float, default=0.1) – The cropped area of the image must contain at least this fraction of any bounding box supplied. The value of this parameter should be non-negative. In the case of 0, the cropped area does not need to overlap any of the bounding boxes supplied.
min_eject_coverage (float, default=0.3) – The minimum coverage of a cropped object with respect to its original size. With this constraint, objects left with only marginal area after the crop are discarded.
aspect_ratio_range (tuple of floats, default=(0.75, 1.33)) – The cropped area of the image must have an aspect ratio = width / height within this range.
area_range (tuple of floats, default=(0.05, 1.0)) – The cropped area of the image must contain a fraction of the supplied image within this range.
max_attempts (int, default=50) – Number of attempts at generating a cropped/padded region of the image that satisfies the specified constraints. After max_attempts failures, the original image is returned.
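The constraints above can be read as a validity test applied to each candidate crop. The sketch below is one possible reading, not the library's implementation: it assumes boxes and crops are (xmin, ymin, xmax, ymax) in normalized coordinates, interprets "any bounding box" as "at least one box", and drops boxes whose remaining fraction falls below min_eject_coverage. The helper names are hypothetical.

```python
def box_area(box):
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def covered_fraction(box, crop):
    """Fraction of the box's area that lies inside the crop window."""
    inter = (max(box[0], crop[0]), max(box[1], crop[1]),
             min(box[2], crop[2]), min(box[3], crop[3]))
    area = box_area(box)
    return box_area(inter) / area if area > 0 else 0.0


def check_crop(boxes, crop, min_object_covered=0.1, aspect_ratio_range=(0.75, 1.33),
               area_range=(0.05, 1.0), min_eject_coverage=0.3):
    """Return the boxes kept by the crop, or None if the crop violates a constraint."""
    width, height = crop[2] - crop[0], crop[3] - crop[1]
    if width <= 0 or height <= 0:
        return None
    if not aspect_ratio_range[0] <= width / height <= aspect_ratio_range[1]:
        return None
    if not area_range[0] <= width * height <= area_range[1]:
        return None
    fractions = [covered_fraction(b, crop) for b in boxes]
    # Assumption: at least one box must be covered to the requested fraction.
    if min_object_covered > 0 and not any(f >= min_object_covered for f in fractions):
        return None
    # Eject objects that keep only a marginal part of their original area.
    return [b for b, f in zip(boxes, fractions) if f >= min_eject_coverage]


boxes = [(0.1, 0.1, 0.5, 0.5), (0.7, 0.7, 0.9, 0.9)]
kept = check_crop(boxes, crop=(0.0, 0.0, 0.6, 0.6))
rejected = check_crop(boxes, crop=(0.0, 0.0, 0.9, 0.3))  # aspect ratio 3.0 is out of range
```

A sampler built on this would draw up to max_attempts candidate crops and fall back to the original image when none of them passes.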
- class mxnet.image.detection.DetRandomPadAug(aspect_ratio_range=(0.75, 1.33), area_range=(1.0, 3.0), max_attempts=50, pad_val=(128, 128, 128))
Bases: DetAugmenter
Random padding augmenter.
- Parameters:
aspect_ratio_range (tuple of floats, default=(0.75, 1.33)) – The padded area of the image must have an aspect ratio = width / height within this range.
area_range (tuple of floats, default=(1.0, 3.0)) – The padded area of the image must be larger than the original area
max_attempts (int, default=50) – Number of attempts at generating a padded region of the image that satisfies the specified constraints. After max_attempts failures, the original image is returned.
pad_val (float or tuple of float, default=(128, 128, 128)) – Pixel value to be filled when padding is enabled.
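Padding places the original image on a larger canvas, so normalized box coordinates have to be rescaled to the new size. A minimal sketch of that bookkeeping follows, assuming boxes are normalized (xmin, ymin, xmax, ymax) tuples; the helper pad_boxes and its arguments are hypothetical and only illustrate the arithmetic.

```python
def pad_boxes(boxes, src_size, dst_size, offset):
    """Re-normalize boxes after pasting a (w, h) image onto a larger canvas.

    offset is the pixel position of the original image's top-left corner
    on the padded canvas.
    """
    (w, h), (new_w, new_h), (off_x, off_y) = src_size, dst_size, offset
    return [((x1 * w + off_x) / new_w, (y1 * h + off_y) / new_h,
             (x2 * w + off_x) / new_w, (y2 * h + off_y) / new_h)
            for x1, y1, x2, y2 in boxes]


# A 100x100 image placed at (30, 10) on a 160x125 canvas:
# area ratio 2.0 and aspect ratio 1.28, both inside the default ranges.
padded = pad_boxes([(0.0, 0.0, 1.0, 1.0)], src_size=(100, 100),
                   dst_size=(160, 125), offset=(30, 10))
```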
- class mxnet.image.detection.DetRandomSelectAug(aug_list, skip_prob=0)
Bases: DetAugmenter
Randomly select one augmenter to apply, with chance to skip all.
- Parameters:
aug_list (list of DetAugmenter) – The random selection will be applied to one of the augmenters
skip_prob (float) – The probability to skip all augmenters and return input directly
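The selection logic described here can be sketched in a few lines. In this illustration augmenters are plain callables that take (image, label) and return (image, label); that calling convention and the name random_select are assumptions made for the sketch.

```python
import random


def random_select(aug_list, skip_prob=0.0, rng=random):
    """Build a callable that applies at most one augmenter from aug_list."""
    def apply(image, label):
        if not aug_list or rng.random() < skip_prob:
            return image, label  # skip every augmenter and return the input
        return rng.choice(aug_list)(image, label)
    return apply


def add_one(image, label):
    return image + 1, label


def double(image, label):
    return image * 2, label


never_skip = random_select([add_one, double], skip_prob=0.0)
always_skip = random_select([add_one, double], skip_prob=1.0)
```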
- class mxnet.image.detection.ImageDetIter(batch_size, data_shape, path_imgrec=None, path_imglist=None, path_root=None, path_imgidx=None, shuffle=False, part_index=0, num_parts=1, aug_list=None, imglist=None, data_name='data', label_name='label', last_batch_handle='pad', **kwargs)
Bases: ImageIter
Image iterator with a large number of augmentation choices for detection.
- Parameters:
aug_list (list or None) – Augmenter list for generating distorted images
batch_size (int) – Number of examples per batch.
data_shape (tuple) – Data shape in (channels, height, width) format. For now, only RGB image with 3 channels is supported.
path_imgrec (str) – Path to image record file (.rec). Created with tools/im2rec.py or bin/im2rec.
path_imglist (str) – Path to image list (.lst). Created with tools/im2rec.py or with custom script. Format: Tab separated record of index, one or more labels and relative_path_from_root.
imglist (list) – A list of images with the label(s). Each item is a list [imagelabel: float or list of float, imgpath].
path_root (str) – Root folder of image files.
path_imgidx (str) – Path to image index file. Needed for partition and shuffling when using .rec source.
shuffle (bool) – Whether to shuffle all images at the start of each iteration or not. Can be slow for HDD.
part_index (int) – Partition index.
num_parts (int) – Total number of partitions.
data_name (str) – Data name for provided symbols.
label_name (str) – Name for detection labels.
last_batch_handle (str, optional) – How to handle the last batch. This parameter can be ‘pad’ (default), ‘discard’ or ‘roll_over’. If ‘pad’, the last batch is padded with data starting from the beginning. If ‘discard’, the last batch is discarded. If ‘roll_over’, the remaining elements are rolled over to the next iteration.
kwargs – More arguments for creating augmenter. See mx.image.CreateDetAugmenter.
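The three last_batch_handle modes differ only in what happens to an incomplete final batch. The sketch below illustrates them on plain item indices; batch_indices is a hypothetical helper and does not reproduce how the iterator is implemented.

```python
def batch_indices(num_items, batch_size, last_batch_handle="pad"):
    """Return (batches, leftover) of item indices for one pass over the data."""
    full = num_items - num_items % batch_size
    batches = [list(range(i, i + batch_size)) for i in range(0, full, batch_size)]
    tail = list(range(full, num_items))
    if not tail:
        return batches, []
    if last_batch_handle == "pad":
        # Fill the short batch with items taken from the beginning.
        needed = batch_size - len(tail)
        batches.append(tail + [i % num_items for i in range(needed)])
        return batches, []
    if last_batch_handle == "discard":
        return batches, []
    if last_batch_handle == "roll_over":
        return batches, tail  # the tail is carried into the next pass
    raise ValueError("unknown last_batch_handle: %r" % (last_batch_handle,))


padded = batch_indices(5, 2, "pad")
discarded = batch_indices(5, 2, "discard")
rolled = batch_indices(5, 2, "roll_over")
```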
- augmentation_transform(data, label)
Overrides the base-class method. Transforms input data with the specified augmentations.
- draw_next(color=None, thickness=2, mean=None, std=None, clip=True, waitKey=None, window_name='draw_next', id2labels=None)
Display next image with bounding boxes drawn.
- Parameters:
color (tuple) – Bounding box color in RGB, use None for random color
thickness (int) – Bounding box border thickness
mean (True or numpy.ndarray) – Compensate for the mean to have better visual effect
std (True or numpy.ndarray) – Revert standard deviations
clip (bool) – If true, clip to [0, 255] for better visual effect
waitKey (None or int) – Hold the window for waitKey milliseconds if set; skip plotting if None
window_name (str) – Plot window name if waitKey is set.
id2labels (dict) – Mapping of labels id to labels name.
- Returns:
numpy.ndarray
Examples
>>> # use draw_next to get images with bounding boxes drawn
>>> iterator = mx.image.ImageDetIter(1, (3, 600, 600), path_imgrec='train.rec')
>>> for image in iterator.draw_next(waitKey=None):
...     pass  # display image
>>> # or let draw_next display using cv2 module
>>> for image in iterator.draw_next(waitKey=0, window_name='disp'):
...     pass
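The mean, std and clip parameters undo the normalization applied during augmentation so that the image looks natural when displayed. A minimal sketch of that reversal on a list of RGB pixels follows; denormalize is a hypothetical helper, and the sketch assumes the forward normalization was (x - mean) / std per channel.

```python
def denormalize(pixels, mean=None, std=None, clip=True):
    """Undo (x - mean) / std per channel and optionally clip to [0, 255]."""
    out = []
    for pixel in pixels:
        restored = []
        for channel, value in enumerate(pixel):
            if std is not None:
                value = value * std[channel]   # revert the standard deviation
            if mean is not None:
                value = value + mean[channel]  # compensate for the mean
            if clip:
                value = min(255.0, max(0.0, value))
            restored.append(value)
        out.append(tuple(restored))
    return out


normalized = [(0.0, 1.0, -3.0)]
visible = denormalize(normalized, mean=(100, 100, 100), std=(50, 50, 50))
unclipped = denormalize(normalized, mean=(100, 100, 100), std=(50, 50, 50), clip=False)
```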
- sync_label_shape(it, verbose=False)
Synchronize label shape with the input iterator. This is useful when train/validation iterators have different label padding.
- Parameters:
it (ImageDetIter) – The other iterator to synchronize
verbose (bool) – Print verbose log if true
- Returns:
The synchronized other iterator, the internal label shape is updated as well.
- Return type:
ImageDetIter
Examples
>>> train_iter = mx.image.ImageDetIter(32, (3, 300, 300), path_imgrec='train.rec')
>>> val_iter = mx.image.ImageDetIter(32, (3, 300, 300), path_imgrec='val.rec')
>>> train_iter.label_shape
(30, 6)
>>> val_iter.label_shape
(25, 6)
>>> val_iter = train_iter.sync_label_shape(val_iter, verbose=False)
>>> train_iter.label_shape
(30, 6)
>>> val_iter.label_shape
(30, 6)
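The effect shown in the example, where both iterators end up with the larger label shape, can be sketched as an element-wise maximum followed by padding. The helpers below are hypothetical, and the filler value of -1 is an assumption about how unused label rows are marked.

```python
def sync_shapes(shape_a, shape_b):
    """Return the smallest label shape that fits both iterators."""
    return tuple(max(a, b) for a, b in zip(shape_a, shape_b))


def pad_label(rows, shape, fill=-1.0):
    """Pad a list of label rows to `shape` with a filler value (assumed -1)."""
    num_rows, width = shape
    padded = [list(row) + [fill] * (width - len(row)) for row in rows]
    padded += [[fill] * width for _ in range(num_rows - len(padded))]
    return padded


common = sync_shapes((30, 6), (25, 6))
label = pad_label([[0, 0.1, 0.2, 0.3, 0.4, 0]], shape=(3, 6))
```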