mxnet.image.image
Read individual image files and perform augmentations.
Functions
| CreateAugmenter | Creates an augmenter list. |
| center_crop | Crops the image src to the given size by trimming on all four sides and preserving the center of the image. |
| color_normalize | Normalize src with mean and std. |
| copyMakeBorder | Pad image border with OpenCV. |
| fixed_crop | Crop src at fixed location, and (optionally) resize it to size. |
| imdecode | Decode an image to an NDArray. |
| imread | Read and decode an image to an NDArray. |
| imresize | Resize image with OpenCV. |
| imrotate | Rotates the input image(s) by a specific angle in degrees. |
| random_crop | Randomly crop src with size (width, height). |
| random_rotate | Randomly rotates src by an angle within the given angle limits. |
| random_size_crop | Randomly crop src with size. |
| resize_short | Resizes shorter edge to size. |
| scale_down | Scales down crop size if it's larger than image size. |
Classes
| Augmenter | Image Augmenter base class. |
| BrightnessJitterAug | Random brightness jitter augmentation. |
| CastAug | Cast to float32. |
| CenterCropAug | Make center crop augmenter. |
| ColorJitterAug | Apply random brightness, contrast and saturation jitter in random order. |
| ColorNormalizeAug | Mean and std normalization. |
| ContrastJitterAug | Random contrast jitter augmentation. |
| ForceResizeAug | Force resize to size regardless of aspect ratio. |
| HorizontalFlipAug | Random horizontal flip. |
| HueJitterAug | Random hue jitter augmentation. |
| ImageIter | Image data iterator with a large number of augmentation choices. |
| LightingAug | Add PCA based noise. |
| RandomCropAug | Make random crop augmenter. |
| RandomGrayAug | Randomly convert to gray image. |
| RandomOrderAug | Apply list of augmenters in random order. |
| RandomSizedCropAug | Make random crop with random resizing and random aspect ratio jitter augmenter. |
| ResizeAug | Make resize shorter edge to size augmenter. |
| SaturationJitterAug | Random saturation jitter augmentation. |
| SequentialAug | Composing a sequential augmenter list. |
- class mxnet.image.image.BrightnessJitterAug(brightness)
Bases: Augmenter
Random brightness jitter augmentation.
- Parameters:
brightness (float) – The brightness jitter ratio range, [0, 1]
- class mxnet.image.image.CenterCropAug(size, interp=2)
Bases: Augmenter
Make center crop augmenter.
- class mxnet.image.image.ColorJitterAug(brightness, contrast, saturation)
Bases: RandomOrderAug
Apply random brightness, contrast and saturation jitter in random order.
- class mxnet.image.image.ColorNormalizeAug(mean, std)
Bases: Augmenter
Mean and std normalization.
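As a rough illustration of mean and std normalization, the sketch below applies (value - mean) / std per channel to a single RGB pixel in plain Python. The helper name normalize_pixel and the sample statistics are hypothetical; they only stand in for what such an augmenter would do to every pixel of an image.

```python
def normalize_pixel(pixel, mean, std):
    """Return (value - mean) / std for each channel of one pixel."""
    return [(v - m) / s for v, m, s in zip(pixel, mean, std)]


# Hypothetical per-channel statistics, chosen only for illustration.
mean = [100.0, 110.0, 120.0]
std = [50.0, 55.0, 60.0]
print(normalize_pixel([150.0, 110.0, 60.0], mean, std))  # [1.0, 0.0, -1.0]
```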
- class mxnet.image.image.ContrastJitterAug(contrast)
Bases: Augmenter
Random contrast jitter augmentation.
- Parameters:
contrast (float) – The contrast jitter ratio range, [0, 1]
- mxnet.image.image.CreateAugmenter(data_shape, resize=0, rand_crop=False, rand_resize=False, rand_mirror=False, mean=None, std=None, brightness=0, contrast=0, saturation=0, hue=0, pca_noise=0, rand_gray=0, inter_method=2)
Creates an augmenter list.
- Parameters:
resize (int) – Resize shorter edge if larger than 0 at the beginning
rand_crop (bool) – Whether to enable random cropping other than center crop
rand_resize (bool) – Whether to enable random sized cropping, require rand_crop to be enabled
rand_gray (float) – [0, 1], probability to convert to grayscale for all channels, the number of channels will not be reduced to 1
rand_mirror (bool) – Whether to apply horizontal flip to image with probability 0.5
mean (np.ndarray or None) – Mean pixel values for [r, g, b]
std (np.ndarray or None) – Standard deviations for [r, g, b]
brightness (float) – Brightness jittering range (percent)
contrast (float) – Contrast jittering range (percent)
saturation (float) – Saturation jittering range (percent)
hue (float) – Hue jittering range (percent)
pca_noise (float) – Pca noise level (percent)
inter_method (int, default=2 (Bicubic)) –
Interpolation method for all resizing operations.
Possible values:
0: Nearest-neighbor interpolation.
1: Bilinear interpolation.
2: Bicubic interpolation over a 4x4 pixel neighborhood (used by default).
3: Area-based interpolation (resampling using pixel area relation). It may be preferred for image decimation, as it gives moire-free results, but when the image is zoomed it behaves like the nearest-neighbor method.
4: Lanczos interpolation over an 8x8 pixel neighborhood.
9: Bicubic for enlarging, area-based for shrinking, bilinear otherwise.
10: Randomly select one of the interpolation methods listed above.
Note: When shrinking an image, area-based interpolation generally looks best; when enlarging an image, bicubic (slow) or bilinear (faster, but still acceptable) generally looks best.
Examples
>>> # An example of creating multiple augmenters
>>> augs = mx.image.CreateAugmenter(data_shape=(3, 300, 300), rand_mirror=True,
...     mean=True, brightness=0.125, contrast=0.125, rand_gray=0.05,
...     saturation=0.125, pca_noise=0.05, inter_method=10)
>>> # dump the details
>>> for aug in augs:
...     aug.dumps()
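The composite interpolation codes 9 and 10 can be pictured as a small dispatch step that resolves to one of the concrete methods 0 to 4. The following is only a sketch of that rule as worded in the parameter description; the function name pick_interp, the sizes argument, and the fallback when no sizes are given are assumptions, not the library's implementation.

```python
import random


def pick_interp(interp, sizes=None, rng=random):
    """Resolve interp codes 9 and 10 to a concrete method in 0..4.

    sizes is a hypothetical (old_h, old_w, new_h, new_w) tuple.
    """
    if interp == 9:
        if sizes is None:
            return 2  # assumption: fall back to bicubic without size info
        old_h, old_w, new_h, new_w = sizes
        if new_h > old_h and new_w > old_w:
            return 2  # enlarging in both dimensions: bicubic
        if new_h < old_h and new_w < old_w:
            return 3  # shrinking in both dimensions: area-based
        return 1  # mixed or unchanged: bilinear
    if interp == 10:
        return rng.randint(0, 4)  # random choice among methods 0..4
    return interp


print(pick_interp(9, (100, 100, 200, 200)))  # 2
print(pick_interp(9, (200, 200, 100, 100)))  # 3
```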
- class mxnet.image.image.ForceResizeAug(size, interp=2)
Bases: Augmenter
Force resize to size regardless of aspect ratio.
- class mxnet.image.image.HorizontalFlipAug(p)
Bases: Augmenter
Random horizontal flip.
- Parameters:
p (float) – Probability to flip image horizontally
- class mxnet.image.image.HueJitterAug(hue)
Bases: Augmenter
Random hue jitter augmentation.
- Parameters:
hue (float) – The hue jitter ratio range, [0, 1]
- class mxnet.image.image.ImageIter(batch_size, data_shape, label_width=1, path_imgrec=None, path_imglist=None, path_root=None, path_imgidx=None, shuffle=False, part_index=0, num_parts=1, aug_list=None, imglist=None, data_name='data', label_name='softmax_label', dtype='float32', last_batch_handle='pad', **kwargs)
Bases: DataIter
Image data iterator with a large number of augmentation choices. This iterator supports reading from both .rec files and raw image files.
To load input images from .rec files, use the path_imgrec parameter; to load from raw image files, use the path_imglist and path_root parameters.
To use data partitioning (for distributed training) or shuffling, specify the path_imgidx parameter.
- Parameters:
batch_size (int) – Number of examples per batch.
data_shape (tuple) – Data shape in (channels, height, width) format. For now, only RGB image with 3 channels is supported.
label_width (int, optional) – Number of labels per example. The default label width is 1.
path_imgrec (str) – Path to image record file (.rec). Created with tools/im2rec.py or bin/im2rec.
path_imglist (str) – Path to image list (.lst). Created with tools/im2rec.py or with custom script. Format: Tab separated record of index, one or more labels and relative_path_from_root.
imglist (list) – A list of images with the label(s). Each item is a list [imagelabel: float or list of float, imgpath].
path_root (str) – Root folder of image files.
path_imgidx (str) – Path to image index file. Needed for partition and shuffling when using .rec source.
shuffle (bool) – Whether to shuffle all images at the start of each iteration or not. Can be slow for HDD.
part_index (int) – Partition index.
num_parts (int) – Total number of partitions.
data_name (str) – Data name for provided symbols.
label_name (str) – Label name for provided symbols.
dtype (str) – Label data type. Default: float32. Other options: int32, int64, float64
last_batch_handle (str, optional) – How to handle the last batch. This parameter can be ‘pad’ (default), ‘discard’ or ‘roll_over’. If ‘pad’, the last batch will be padded with data starting from the beginning. If ‘discard’, the last batch will be discarded. If ‘roll_over’, the remaining elements will be rolled over to the next iteration.
kwargs – More arguments for creating augmenter. See mx.image.CreateAugmenter.
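The three last_batch_handle modes can be illustrated on a plain Python list. This is a minimal sketch of the behaviour described for that parameter, not the iterator's code; the helper make_batches is hypothetical and assumes the data set holds at least batch_size items.

```python
def make_batches(items, batch_size, last_batch_handle="pad"):
    """Split items into batches and return (batches, leftover).

    leftover is non-empty only for 'roll_over', where the caller would
    prepend it to the next pass over the data.
    """
    full = len(items) // batch_size
    batches = [items[i * batch_size:(i + 1) * batch_size] for i in range(full)]
    leftover = items[full * batch_size:]
    if not leftover or last_batch_handle == "discard":
        return batches, []
    if last_batch_handle == "pad":
        # Fill the short batch with items taken from the beginning.
        need = batch_size - len(leftover)
        batches.append(leftover + items[:need])
        return batches, []
    if last_batch_handle == "roll_over":
        return batches, leftover
    raise ValueError("unknown last_batch_handle: %r" % (last_batch_handle,))


data = list(range(7))
print(make_batches(data, 3, "pad"))        # ([[0, 1, 2], [3, 4, 5], [6, 0, 1]], [])
print(make_batches(data, 3, "discard"))    # ([[0, 1, 2], [3, 4, 5]], [])
print(make_batches(data, 3, "roll_over"))  # ([[0, 1, 2], [3, 4, 5]], [6])
```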
- imdecode(s)
Decodes a string or byte string to an NDArray. See mx.img.imdecode for more details.
- class mxnet.image.image.LightingAug(alphastd, eigval, eigvec)
Bases: Augmenter
Add PCA based noise.
- Parameters:
alphastd (float) – Noise level
eigval (3x1 np.array) – Eigenvalues
eigvec (3x3 np.array) – Eigenvectors
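PCA based lighting noise is commonly formulated as a per-channel offset built from the eigenvectors, scaled by the eigenvalues and by random Gaussian coefficients. The sketch below follows that common formulation in plain Python; whether the class computes exactly this is an assumption, and the helper names pca_shift and sample_alpha are hypothetical.

```python
import random


def sample_alpha(alphastd, rng=random):
    """Draw one Gaussian coefficient per principal component."""
    return [rng.gauss(0.0, alphastd) for _ in range(3)]


def pca_shift(alpha, eigval, eigvec):
    """Per-channel offset: sum over j of eigvec[i][j] * alpha[j] * eigval[j]."""
    return [
        sum(eigvec[i][j] * alpha[j] * eigval[j] for j in range(3))
        for i in range(3)
    ]


# With identity eigenvectors the offset is simply alpha[j] * eigval[j].
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
print(pca_shift([1.0, 2.0, 0.5], [2.0, 3.0, 4.0], identity))  # [2.0, 6.0, 2.0]
```

The resulting offset would be added to every pixel of the image, one value per colour channel.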
- class mxnet.image.image.RandomCropAug(size, interp=2)
Bases: Augmenter
Make random crop augmenter.
- class mxnet.image.image.RandomGrayAug(p)
Bases: Augmenter
Randomly convert to gray image.
- Parameters:
p (float) – Probability to convert to grayscale
- class mxnet.image.image.RandomOrderAug(ts)
Bases: Augmenter
Apply list of augmenters in random order.
- Parameters:
ts (list of augmenters) – A series of augmenters to be applied in random order
- class mxnet.image.image.RandomSizedCropAug(size, area, ratio, interp=2, **kwargs)
Bases: Augmenter
Make random crop with random resizing and random aspect ratio jitter augmenter.
- Parameters:
size (tuple of (int, int)) – Size of the crop formatted as (width, height).
area (float in (0, 1] or tuple of (float, float)) – If tuple, minimum area and maximum area to be maintained after cropping. If float, minimum area to be maintained after cropping; maximum area is set to 1.0.
ratio (tuple of (float, float)) – Aspect ratio range as (min_aspect_ratio, max_aspect_ratio)
interp (int, optional, default=2) – Interpolation method. See resize_short for details.
- class mxnet.image.image.ResizeAug(size, interp=2)
Bases: Augmenter
Make resize shorter edge to size augmenter.
- class mxnet.image.image.SaturationJitterAug(saturation)
Bases: Augmenter
Random saturation jitter augmentation.
- Parameters:
saturation (float) – The saturation jitter ratio range, [0, 1]
- class mxnet.image.image.SequentialAug(ts)
Bases: Augmenter
Composing a sequential augmenter list.
- Parameters:
ts (list of augmenters) – A series of augmenters to be applied in sequential order.
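The difference between sequential and random-order composition can be shown with ordinary callables standing in for augmenters. This is a sketch of the idea only; the helpers sequential and random_order are hypothetical and do not reproduce the classes above.

```python
import random


def sequential(ts):
    """Compose callables so they run in list order."""
    def apply(x):
        for t in ts:
            x = t(x)
        return x
    return apply


def random_order(ts, rng=random):
    """Compose callables so each call runs them in a freshly shuffled order."""
    def apply(x):
        order = list(ts)
        rng.shuffle(order)
        for t in order:
            x = t(x)
        return x
    return apply


def double(x):
    return x * 2


def add_one(x):
    return x + 1


print(sequential([double, add_one])(3))  # 7
# random_order yields 7 (double first) or 8 (add_one first).
print(random_order([double, add_one])(3))
```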
- mxnet.image.image.center_crop(src, size, interp=2)
Crops the image src to the given size by trimming on all four sides and preserving the center of the image. Upsamples if src is smaller than size.
Note
This requires MXNet to be compiled with USE_OPENCV.
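- Parameters:
src (NDArray) – Source image.
size (tuple of (int, int)) – Size of the crop formatted as (width, height).
interp (int, optional, default=2) – Interpolation method. See resize_short for details.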
- Returns:
NDArray – The cropped image.
Tuple – (x, y, width, height) where x, y are the positions of the crop in the original image and width, height the dimensions of the crop.
Example
>>> with open("flower.jpg", 'rb') as fp:
...     str_image = fp.read()
...
>>> image = mx.image.imdecode(str_image)
>>> image
<NDArray 2321x3482x3 @cpu(0)>
>>> cropped_image, (x, y, width, height) = mx.image.center_crop(image, (1000, 500))
>>> cropped_image
<NDArray 500x1000x3 @cpu(0)>
>>> x, y, width, height
(1241, 910, 1000, 500)
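The crop rectangle for a centered crop can be derived by halving the leftover margin on each axis. The sketch below assumes integer division for the offsets and a crop that already fits inside the image (the upsampling case is not handled); center_crop_rect is a hypothetical helper, not part of the module.

```python
def center_crop_rect(src_size, size):
    """Return (x, y, width, height) of a centered crop.

    src_size and size are (width, height) tuples.
    """
    src_w, src_h = src_size
    new_w, new_h = size
    x0 = (src_w - new_w) // 2
    y0 = (src_h - new_h) // 2
    return x0, y0, new_w, new_h


# A 3482x2321 (width x height) image cropped to 1000x500.
print(center_crop_rect((3482, 2321), (1000, 500)))  # (1241, 910, 1000, 500)
```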
- mxnet.image.image.copyMakeBorder(src, top, bot, left, right, *args, **kwargs)
Pad image border with OpenCV.
- Parameters:
src (NDArray) – source image
top (int, required) – Top margin.
bot (int, required) – Bottom margin.
left (int, required) – Left margin.
right (int, required) – Right margin.
type (int, optional, default='0') – Filling type (default=cv2.BORDER_CONSTANT).
0 - cv2.BORDER_CONSTANT - Adds a constant colored border.
1 - cv2.BORDER_REFLECT - Border will be a mirror reflection of the border elements, like this: fedcba|abcdefgh|hgfedcb
2 - cv2.BORDER_REFLECT_101 or cv2.BORDER_DEFAULT - Same as above, but with a slight change, like this: gfedcb|abcdefgh|gfedcba
3 - cv2.BORDER_REPLICATE - Last element is replicated throughout, like this: aaaaaa|abcdefgh|hhhhhhh
4 - cv2.BORDER_WRAP - The image is wrapped around, like this: cdefgh|abcdefgh|abcdefg
value (double, optional, default=0) – (Deprecated! Use values instead.) Fill with single value.
values (tuple of <double>, optional, default=[]) – Fill with value (RGB[A] or gray), up to 4 channels.
out (NDArray, optional) – The output NDArray to hold the result.
- Returns:
out – The output of this function.
- Return type:
Example
>>> with open("flower.jpeg", 'rb') as fp:
...     str_image = fp.read()
...
>>> image = mx.img.imdecode(str_image)
>>> image
<NDArray 2321x3482x3 @cpu(0)>
>>> new_image = mx.image.copyMakeBorder(image, 1, 2, 3, 4, type=0)
>>> new_image
<NDArray 2324x3489x3 @cpu(0)>
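The border patterns listed for the type parameter can be reproduced on a one-dimensional sequence with a small index-mapping function. This is an illustrative sketch in plain Python, not the OpenCV routine; the mode names are spelled out as strings rather than numeric codes, the constant-fill mode is omitted, and border_index and pad_1d are hypothetical helpers that assume a sequence of at least two elements.

```python
def border_index(i, n, mode):
    """Map a possibly out-of-range index i onto [0, n) for a length-n signal."""
    if 0 <= i < n:
        return i
    if mode == "replicate":
        return min(max(i, 0), n - 1)
    if mode == "wrap":
        return i % n
    if mode == "reflect":
        period = 2 * n
        j = i % period
        return j if j < n else period - 1 - j
    if mode == "reflect_101":
        period = 2 * n - 2
        j = i % period
        return j if j < n else period - j
    raise ValueError("unknown mode: %r" % (mode,))


def pad_1d(seq, left, right, mode):
    """Pad seq with `left` and `right` border elements using the given mode."""
    n = len(seq)
    return [seq[border_index(i, n, mode)] for i in range(-left, n + right)]


for mode in ("reflect", "reflect_101", "replicate", "wrap"):
    print(mode, "".join(pad_1d("abcdefgh", 6, 7, mode)))
```

For a two-dimensional image, the same mapping would be applied independently to row and column indices.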
- mxnet.image.image.fixed_crop(src, x0, y0, w, h, size=None, interp=2)
Crop src at fixed location, and (optionally) resize it to size.
- Parameters:
src (NDArray) – Input image
x0 (int) – Left boundary of the cropping area
y0 (int) – Top boundary of the cropping area
w (int) – Width of the cropping area
h (int) – Height of the cropping area
size (tuple of (w, h)) – Optional, resize to new size after cropping
interp (int, optional, default=2) – Interpolation method. See resize_short for details.
- Returns:
An NDArray containing the cropped image.
- Return type:
NDArray
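The cropping window itself is a plain slice: rows y0 to y0 + h and columns x0 to x0 + w. The sketch below shows this on a row-major list of lists and leaves out the optional resize step; fixed_crop_rows is a hypothetical helper used only for illustration.

```python
def fixed_crop_rows(image, x0, y0, w, h):
    """Crop a row-major image (list of rows) to the w x h window at (x0, y0)."""
    return [row[x0:x0 + w] for row in image[y0:y0 + h]]


# A 4-row, 5-column test image whose values encode (row, column) as 10*r + c.
image = [[10 * r + c for c in range(5)] for r in range(4)]
print(fixed_crop_rows(image, 1, 2, 3, 2))  # [[21, 22, 23], [31, 32, 33]]
```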
- mxnet.image.image.imdecode(buf, *args, **kwargs)
Decode an image to an NDArray.
Note
imdecode uses OpenCV (not the CV2 Python library). MXNet must have been built with USE_OPENCV=1 for imdecode to work.
- Parameters:
buf (str/bytes/bytearray or numpy.ndarray) – Binary image data as string or numpy ndarray.
flag (int, optional, default=1) – 1 for three channel color output. 0 for grayscale output.
to_rgb (int, optional, default=1) – 1 for RGB formatted output (MXNet default). 0 for BGR formatted output (OpenCV default).
out (NDArray, optional) – Output buffer. Use None for automatic allocation.
- Returns:
An NDArray containing the image.
- Return type:
NDArray
Example
>>> with open("flower.jpg", 'rb') as fp:
...     str_image = fp.read()
...
>>> image = mx.img.imdecode(str_image)
>>> image
<NDArray 224x224x3 @cpu(0)>
Set flag parameter to 0 to get grayscale output
>>> with open("flower.jpg", 'rb') as fp:
...     str_image = fp.read()
...
>>> image = mx.img.imdecode(str_image, flag=0)
>>> image
<NDArray 224x224x1 @cpu(0)>
Set to_rgb parameter to 0 to get output in OpenCV format (BGR)
>>> with open("flower.jpg", 'rb') as fp:
...     str_image = fp.read()
...
>>> image = mx.img.imdecode(str_image, to_rgb=0)
>>> image
<NDArray 224x224x3 @cpu(0)>
- mxnet.image.image.imread(filename, *args, **kwargs)
Read and decode an image to an NDArray.
Note
imread uses OpenCV (not the CV2 Python library). MXNet must have been built with USE_OPENCV=1 for imread to work.
- Parameters:
filename (str) – Name of the image file to be loaded.
flag ({0, 1}, default 1) – 1 for three channel color output. 0 for grayscale output.
to_rgb (bool, default True) – True for RGB formatted output (MXNet default). False for BGR formatted output (OpenCV default).
out (NDArray, optional) – Output buffer. Use None for automatic allocation.
- Returns:
An NDArray containing the image.
- Return type:
NDArray
Example
>>> mx.img.imread("flower.jpg")
<NDArray 224x224x3 @cpu(0)>
Set flag parameter to 0 to get grayscale output
>>> mx.img.imread("flower.jpg", flag=0)
<NDArray 224x224x1 @cpu(0)>
Set to_rgb parameter to 0 to get output in OpenCV format (BGR)
>>> mx.img.imread("flower.jpg", to_rgb=0)
<NDArray 224x224x3 @cpu(0)>
- mxnet.image.image.imresize(src, w, h, *args, **kwargs)
Resize image with OpenCV.
Note
imresize uses OpenCV (not the CV2 Python library). MXNet must have been built with USE_OPENCV=1 for imresize to work.
- Parameters:
src (NDArray) – source image
w (int, required) – Width of resized image.
h (int, required) – Height of resized image.
interp (int, optional, default=1) – Interpolation method (default=cv2.INTER_LINEAR).
Possible values:
0: Nearest-neighbor interpolation.
1: Bilinear interpolation (used by default).
2: Bicubic interpolation over a 4x4 pixel neighborhood.
3: Area-based interpolation (resampling using pixel area relation). It may be preferred for image decimation, as it gives moire-free results, but when the image is zoomed it behaves like the nearest-neighbor method.
4: Lanczos interpolation over an 8x8 pixel neighborhood.
9: Bicubic for enlarging, area-based for shrinking, bilinear otherwise.
10: Randomly select one of the interpolation methods listed above.
Note: When shrinking an image, area-based interpolation generally looks best; when enlarging an image, bicubic (slow) or bilinear (faster, but still acceptable) generally looks best. More details can be found in the OpenCV documentation at http://docs.opencv.org/master/da/d54/group__imgproc__transform.html.
out (NDArray, optional) – The output NDArray to hold the result.
- Returns:
out – The output of this function.
- Return type:
Example
>>> with open("flower.jpeg", 'rb') as fp:
...     str_image = fp.read()
...
>>> image = mx.img.imdecode(str_image)
>>> image
<NDArray 2321x3482x3 @cpu(0)>
>>> new_image = mx.img.imresize(image, 240, 360)
>>> new_image
<NDArray 360x240x3 @cpu(0)>
- mxnet.image.image.imrotate(src, rotation_degrees, zoom_in=False, zoom_out=False)
Rotates the input image(s) by a specific angle in degrees.
- Parameters:
src (NDArray) – Input image (format CHW) or batch of images (format NCHW). In both cases a float32 data type is required.
rotation_degrees (scalar or NDArray) – Desired rotation in degrees. If src is a single image, a scalar is required; otherwise, either a one-dimensional vector of angles or a scalar.
zoom_in (bool) – If True input image(s) will be zoomed in a way so that no padding will be shown in the output result.
zoom_out (bool) – If True input image(s) will be zoomed in a way so that the whole original image will be contained in the output result.
- Returns:
An NDArray containing the rotated image(s).
- Return type:
NDArray
- mxnet.image.image.random_crop(src, size, interp=2)
Randomly crop src with size (width, height). Upsample result if src is smaller than size.
- Parameters:
src (NDArray) – Source image.
size (tuple of (int, int)) – Size of the crop formatted as (width, height). If the size is larger than the image, then the source image is upsampled to size and returned.
interp (int, optional, default=2) – Interpolation method. See resize_short for details.
- Returns:
NDArray – An NDArray containing the cropped image.
Tuple – A tuple (x, y, width, height) where (x, y) is top-left position of the crop in the original image and (width, height) are the dimensions of the cropped image.
Example
>>> import cv2
>>> im = mx.nd.array(cv2.imread("flower.jpg"))
>>> cropped_im, rect = mx.image.random_crop(im, (100, 100))
>>> print(cropped_im)
<NDArray 100x100x3 @cpu(0)>
>>> print(rect)
(20, 21, 100, 100)
- mxnet.image.image.random_rotate(src, angle_limits, zoom_in=False, zoom_out=False)
Randomly rotates src by an angle within the given angle limits.
- Parameters:
src (NDArray) – Input image (format CHW) or batch of images (format NCHW). In both cases a float32 data type is required.
angle_limits (tuple) – Tuple of 2 elements containing the upper and lower limit for rotation angles in degrees.
zoom_in (bool) – If True input image(s) will be zoomed in a way so that no padding will be shown in the output result.
zoom_out (bool) – If True input image(s) will be zoomed in a way so that the whole original image will be contained in the output result.
- Returns:
An NDArray containing the rotated image(s).
- Return type:
NDArray
- mxnet.image.image.random_size_crop(src, size, area, ratio, interp=2, **kwargs)
Randomly crop src with size. Randomize area and aspect ratio.
- Parameters:
src (NDArray) – Input image
size (tuple of (int, int)) – Size of the crop formatted as (width, height).
area (float in (0, 1] or tuple of (float, float)) – If tuple, minimum area and maximum area to be maintained after cropping. If float, minimum area to be maintained after cropping; maximum area is set to 1.0.
ratio (tuple of (float, float)) – Aspect ratio range as (min_aspect_ratio, max_aspect_ratio)
interp (int, optional, default=2) – Interpolation method. See resize_short for details.
- Returns:
NDArray – An NDArray containing the cropped image.
Tuple – A tuple (x, y, width, height) where (x, y) is top-left position of the crop in the original image and (width, height) are the dimensions of the cropped image.
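One plausible way to randomize area and aspect ratio is to sample a target area fraction and a log-uniform aspect ratio, derive a width and height from them, and retry until the rectangle fits. The sketch below follows that scheme in plain Python; the number of attempts, the log-space sampling and the centered fallback are assumptions of this sketch, and random_size_crop_rect is a hypothetical helper that returns only the rectangle, not the cropped image.

```python
import math
import random


def random_size_crop_rect(src_size, area, ratio, rng=random, attempts=10):
    """Sample (x, y, width, height) for a crop of random area and aspect ratio.

    src_size is (width, height); area is a float (minimum) or a
    (minimum, maximum) tuple of area fractions; ratio is (min, max).
    """
    src_w, src_h = src_size
    min_area, max_area = (area, 1.0) if isinstance(area, float) else area
    for _ in range(attempts):
        target_area = rng.uniform(min_area, max_area) * src_w * src_h
        # Sample the aspect ratio uniformly in log space.
        aspect = math.exp(rng.uniform(math.log(ratio[0]), math.log(ratio[1])))
        new_w = int(round(math.sqrt(target_area * aspect)))
        new_h = int(round(math.sqrt(target_area / aspect)))
        if 0 < new_w <= src_w and 0 < new_h <= src_h:
            x0 = rng.randint(0, src_w - new_w)
            y0 = rng.randint(0, src_h - new_h)
            return x0, y0, new_w, new_h
    # Fallback (assumption): the largest centered square that fits.
    side = min(src_w, src_h)
    return (src_w - side) // 2, (src_h - side) // 2, side, side


rng = random.Random(0)
x, y, w, h = random_size_crop_rect((640, 480), (0.08, 1.0), (3 / 4, 4 / 3), rng)
print(x, y, w, h)
```

The returned rectangle would then be cropped out of the source and resized to size.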
- mxnet.image.image.resize_short(src, size, interp=2)
Resizes shorter edge to size.
Note
resize_short uses OpenCV (not the CV2 Python library). MXNet must have been built with OpenCV for resize_short to work.
Resizes the original image by setting the shorter edge to size and setting the longer edge accordingly. Resizing function is called from OpenCV.
- Parameters:
src (NDArray) – The original image.
size (int) – The length to be set for the shorter edge.
interp (int, optional, default=2) – Interpolation method used for resizing the image.
Possible values:
0: Nearest-neighbor interpolation.
1: Bilinear interpolation.
2: Bicubic interpolation over a 4x4 pixel neighborhood (used by default).
3: Area-based interpolation (resampling using pixel area relation). It may be preferred for image decimation, as it gives moire-free results, but when the image is zoomed it behaves like the nearest-neighbor method.
4: Lanczos interpolation over an 8x8 pixel neighborhood.
9: Bicubic for enlarging, area-based for shrinking, bilinear otherwise.
10: Randomly select one of the interpolation methods listed above.
Note: When shrinking an image, area-based interpolation generally looks best; when enlarging an image, bicubic (slow) or bilinear (faster, but still acceptable) generally looks best. More details can be found in the OpenCV documentation at http://docs.opencv.org/master/da/d54/group__imgproc__transform.html.
- Returns:
An ‘NDArray’ containing the resized image.
- Return type:
NDArray
Example
>>> with open("flower.jpeg", 'rb') as fp:
...     str_image = fp.read()
...
>>> image = mx.img.imdecode(str_image)
>>> image
<NDArray 2321x3482x3 @cpu(0)>
>>> size = 640
>>> new_image = mx.img.resize_short(image, size)
>>> new_image
<NDArray 640x960x3 @cpu(0)>
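The target shape for a shorter-edge resize follows from keeping the aspect ratio while pinning the shorter edge to size. The sketch below assumes the longer edge is rounded down with integer division; resize_short_shape is a hypothetical helper that computes only the output shape.

```python
def resize_short_shape(height, width, size):
    """Return (new_height, new_width) with the shorter edge set to size."""
    if height > width:
        return size * height // width, size
    return size, size * width // height


# A 2321x3482 (height x width) image with the shorter edge set to 640.
print(resize_short_shape(2321, 3482, 640))  # (640, 960)
```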
- mxnet.image.image.scale_down(src_size, size)
Scales down crop size if it’s larger than image size.
If width/height of the crop is larger than the width/height of the image, sets the width/height to the width/height of the image.
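- Parameters:
src_size (tuple of (int, int)) – Size of the image in (width, height) format.
size (tuple of (int, int)) – Size of the crop in (width, height) format.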
- Returns:
A tuple containing the scaled crop size in (width, height) format.
- Return type:
tuple
Example
>>> src_size = (640,480)
>>> size = (720,120)
>>> new_size = mx.img.scale_down(src_size, size)
>>> new_size
(640, 106)
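The scaling rule can be sketched as clamping whichever crop dimension overflows and shrinking the other one proportionally. Preserving the aspect ratio and truncating to integers are assumptions inferred from the example output; scale_down_sketch is a hypothetical stand-in, not the library function.

```python
def scale_down_sketch(src_size, size):
    """Shrink a (width, height) crop so it fits inside a (width, height) image."""
    w, h = size
    src_w, src_h = src_size
    if src_h < h:
        # Crop is too tall: clamp the height, scale the width to match.
        w, h = w * src_h / h, src_h
    if src_w < w:
        # Crop is too wide: clamp the width, scale the height to match.
        w, h = src_w, h * src_w / w
    return int(w), int(h)


print(scale_down_sketch((640, 480), (720, 120)))  # (640, 106)
```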