mxnet.gluon.data.dataset
Dataset container.
Classes
ArrayDataset | A dataset that combines multiple dataset-like objects, e.g. Datasets, lists, arrays, etc.
Dataset | Abstract dataset class.
RecordFileDataset | A dataset wrapping over a RecordIO (.rec) file.
SimpleDataset | Simple Dataset wrapper for lists and arrays.
- class mxnet.gluon.data.dataset.ArrayDataset(*args)
Bases: Dataset
A dataset that combines multiple dataset-like objects, e.g. Datasets, lists, arrays, etc.
The i-th sample is defined as (x1[i], x2[i], …).
- Parameters:
*args (one or more dataset-like objects) – The data arrays.
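The indexing rule above can be sketched in plain Python. The class name ZippedDataset is hypothetical and only mimics the behaviour described here; it is not MXNet's implementation, and the equal-length check is an assumption.

```python
class ZippedDataset:
    """Hypothetical stand-in that mimics the indexing rule described above."""

    def __init__(self, *args):
        assert len(args) > 0, "needs at least one dataset-like object"
        self._length = len(args[0])
        # Assumption: every input must have the same length.
        for data in args:
            assert len(data) == self._length, "all inputs must have equal length"
        self._data = args

    def __getitem__(self, idx):
        # The i-th sample is the tuple (x1[i], x2[i], ...).
        return tuple(data[idx] for data in self._data)

    def __len__(self):
        return self._length


features = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
labels = [0, 1, 0]
dataset = ZippedDataset(features, labels)
sample = dataset[1]  # pairs the second feature row with the second label
```

With MXNet installed, the corresponding call would presumably be ArrayDataset(features, labels).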
- class mxnet.gluon.data.dataset.Dataset
Bases: object
Abstract dataset class. All datasets should have this interface.
Subclasses need to override __getitem__, which returns the i-th element, and __len__, which returns the total number of elements.
Note
An mxnet or numpy array can be directly used as a dataset.
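As a rough illustration of this interface, the sketch below defines a hypothetical SquaresDataset with only the two required methods. It is a plain Python class so that it runs without MXNet; in real use the class would derive from Dataset.

```python
class SquaresDataset:
    """Hypothetical dataset exposing the two methods described above."""

    def __init__(self, size):
        self._size = size

    def __getitem__(self, idx):
        # Return the idx-th element; reject out-of-range indices.
        if not 0 <= idx < self._size:
            raise IndexError(idx)
        return idx * idx

    def __len__(self):
        # Total number of elements.
        return self._size


squares = SquaresDataset(4)
items = [squares[i] for i in range(len(squares))]
```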
- filter(fn)
Returns a new dataset with samples filtered by the filter function fn.
Note that if the Dataset is the result of a lazy transformation, i.e. transform(lazy=True), the filter is applied eagerly to the transformed samples without materializing the transformed result. That is, the transformation is applied again whenever a sample is retrieved after filter().
- Parameters:
fn (callable) – A filter function that takes a sample as input and returns a boolean. Samples that return False are discarded.
- Returns:
The filtered dataset.
- Return type:
Dataset
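The interaction between filtering and on-demand transformation can be sketched without MXNet. The helper lazy_transform_then_filter and the call counter are hypothetical; the sketch only illustrates that the predicate is evaluated eagerly while the transformation runs again on each retrieval.

```python
calls = {"transform": 0}


def double(x):
    # Count invocations so the repeated work is visible.
    calls["transform"] += 1
    return 2 * x


def lazy_transform_then_filter(data, transform_fn, filter_fn):
    """Hypothetical helper: keep indices whose transformed sample passes filter_fn."""
    # The filter runs eagerly over transformed samples, but only the
    # surviving indices are stored, not the transformed values.
    kept = [i for i in range(len(data)) if filter_fn(transform_fn(data[i]))]

    def get(position):
        # The transformation is applied again on every retrieval.
        return transform_fn(data[kept[position]])

    return kept, get


data = [1, 2, 3, 4, 5]
kept, get = lazy_transform_then_filter(data, double, lambda x: x > 4)
calls_after_filter = calls["transform"]  # one call per original sample
first = get(0)                           # triggers one more call
```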
- shard(num_shards, index)
Returns a new dataset that includes only 1/num_shards of this dataset.
For distributed training, be sure to shard before you randomize the dataset (for example, by shuffling) if you want each worker to receive a unique subset.
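One way to picture sharding is to split the index range into contiguous, non-overlapping pieces. The helper shard_indices below is hypothetical, and the contiguous split with the remainder spread over the first shards is an assumption; the exact partitioning used by shard() is not stated here.

```python
def shard_indices(length, num_shards, index):
    """Hypothetical helper: index range of shard `index` out of `num_shards`."""
    assert 0 <= index < num_shards
    base, rest = divmod(length, num_shards)
    # The first `rest` shards get one extra sample so every sample is used once.
    start = base * index + min(index, rest)
    end = start + base + (1 if index < rest else 0)
    return range(start, end)


data = list(range(10))
shards = [[data[i] for i in shard_indices(len(data), 3, k)] for k in range(3)]
```

Because every worker derives its shard from the same unshuffled order, the subsets are disjoint. Shuffling independently on each worker before sharding would not give that guarantee.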
- take(count)
Returns a new dataset with at most count number of samples in it.
- Parameters:
count (int or None) – An integer giving the number of elements of this dataset to take for the new dataset. If count is None, or if count is greater than the size of this dataset, the new dataset will contain all elements of this dataset.
- Returns:
The result dataset.
- Return type:
Dataset
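The rules for count can be mirrored by a small list-based helper. The function take below is a hypothetical stand-in and not the library method; it assumes the first count elements are the ones kept and does not cover negative values.

```python
def take(data, count):
    """Hypothetical helper mirroring the documented take(count) semantics."""
    # None, or a count larger than the dataset, means "take everything".
    if count is None or count > len(data):
        count = len(data)
    return [data[i] for i in range(count)]


data = ["a", "b", "c"]
first_two = take(data, 2)
everything = take(data, None)
clipped = take(data, 10)
```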
- transform(fn, lazy=True)
Returns a new dataset with each sample transformed by the transformer function fn.
- Parameters:
fn (callable) – A transformer function that takes a sample as input and returns the transformed sample.
lazy (bool, default True) – If False, transforms all samples at once. Otherwise, transforms each sample on demand. Note that if fn is stochastic, you must set lazy to True or you will get the same result on all epochs.
- Returns:
The transformed dataset.
- Return type:
Dataset
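The note about stochastic transformers can be illustrated with plain lists. The sketch below is an analogy rather than MXNet code: the eager list stands in for lazy=False and the per-epoch recomputation stands in for lazy=True. The function add_noise is hypothetical.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible


def add_noise(x):
    # Stochastic transformer: returns a different value on each call.
    return x + random.random()


data = [0, 10, 20]

# Eager (lazy=False analogue): transform once, reuse the stored results.
eager = [add_noise(x) for x in data]
eager_epochs = [list(eager) for _ in range(2)]

# Lazy (lazy=True analogue): transform again on every access.
lazy_epochs = [[add_noise(x) for x in data] for _ in range(2)]
```

The eager variant repeats the same values in every epoch, which is why random augmentation needs the on-demand mode.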
- transform_first(fn, lazy=True)
Returns a new dataset with the first element of each sample transformed by the transformer function fn.
This is mostly applicable when each sample contains two components - features and label, i.e., (X, y), and you only want to transform the first element X (i.e., the features) while keeping the label y unchanged.
- Parameters:
fn (callable) – A transformer function that takes the first element of a sample as input and returns the transformed element.
lazy (bool, default True) – If False, transforms all samples at once. Otherwise, transforms each sample on demand. Note that if fn is stochastic, you must set lazy to True or you will get the same result on all epochs.
- Returns:
The transformed dataset.
- Return type:
Dataset
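For (X, y) samples, the effect described above can be sketched as follows. The helper transform_first is a hypothetical list-based stand-in that applies fn eagerly; it only shows which part of each sample changes.

```python
def transform_first(samples, fn):
    """Hypothetical helper: apply fn to the first element of each sample."""
    # Rebuild each sample with a transformed first element; the rest is untouched.
    return [(fn(sample[0]),) + tuple(sample[1:]) for sample in samples]


samples = [(1.0, "cat"), (2.0, "dog")]
scaled = transform_first(samples, lambda x: x * 10)
```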