mxnet.gluon.nn.basic_layers¶
Basic neural network layers.
Classes
BatchNorm | Batch normalization layer (Ioffe and Szegedy, 2014).
Concatenate | Lays Blocks concurrently.
Dense | Just your regular densely-connected NN layer.
Dropout | Applies Dropout to the input.
Embedding | Turns non-negative integers (indexes/tokens) into dense vectors of fixed size.
Flatten | Flattens the input to two dimensions.
GroupNorm | Applies group normalization to the n-dimensional input array.
HybridConcatenate | Lays HybridBlocks concurrently.
HybridLambda | Wraps an operator or an expression as a HybridBlock object.
HybridSequential | Stacks HybridBlocks sequentially.
Identity | Block that passes through the input directly.
InstanceNorm | Applies instance normalization to the n-dimensional input array.
Lambda | Wraps an operator or an expression as a Block object.
LayerNorm | Applies layer normalization to the n-dimensional input array.
Sequential | Stacks Blocks sequentially.
SyncBatchNorm | Cross-GPU Synchronized Batch normalization (SyncBN)
- class mxnet.gluon.nn.basic_layers.BatchNorm(axis=1, momentum=0.9, epsilon=1e-05, center=True, scale=True, use_global_stats=False, beta_initializer='zeros', gamma_initializer='ones', running_mean_initializer='zeros', running_variance_initializer='ones', in_channels=0, **kwargs)[source]¶
Bases: _BatchNorm
Batch normalization layer (Ioffe and Szegedy, 2014). Normalizes the input at each batch, i.e. applies a transformation that maintains the mean activation close to 0 and the activation standard deviation close to 1.
- Parameters:
axis (int, default 1) – The axis that should be normalized. This is typically the channels (C) axis. For instance, after a Conv2D layer with layout=’NCHW’, set axis=1 in BatchNorm. If layout=’NHWC’, then set axis=3.
momentum (float, default 0.9) – Momentum for the moving average.
epsilon (float, default 1e-5) – Small float added to variance to avoid dividing by zero.
center (bool, default True) – If True, add offset of beta to normalized tensor. If False, beta is ignored.
scale (bool, default True) – If True, multiply by gamma. If False, gamma is not used. When the next layer is linear (also e.g. nn.relu), this can be disabled since the scaling will be done by the next layer.
use_global_stats (bool, default False) – If True, use the global moving statistics instead of the statistics of the current batch, which turns batch normalization into a fixed scale-and-shift operator. If False, use the statistics of the current batch.
beta_initializer (str or Initializer, default ‘zeros’) – Initializer for the beta weight.
gamma_initializer (str or Initializer, default ‘ones’) – Initializer for the gamma weight.
running_mean_initializer (str or Initializer, default ‘zeros’) – Initializer for the running mean.
running_variance_initializer (str or Initializer, default ‘ones’) – Initializer for the running variance.
in_channels (int, default 0) – Number of channels (feature maps) in input data. If not specified, initialization will be deferred to the first time forward is called and in_channels will be inferred from the shape of input data.
- Inputs:
data: input tensor with arbitrary shape.
- Outputs:
out: output tensor with the same shape as data.
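The normalization described above can be sketched in plain NumPy. This is an illustrative sketch of training-mode batch normalization, not the library's implementation: the helper name batch_norm_train is hypothetical, and the exponential-moving-average update for the running statistics is an assumption about how momentum is applied.

```python
import numpy as np

def batch_norm_train(x, gamma, beta, running_mean, running_var,
                     axis=1, momentum=0.9, epsilon=1e-5):
    """Normalize x over every axis except `axis` (hypothetical helper)."""
    axis = axis % x.ndim  # allow negative axes such as -1
    reduce_axes = tuple(i for i in range(x.ndim) if i != axis)
    # Shape that broadcasts per-channel vectors against x, e.g. (1, C, 1, 1).
    bshape = [1] * x.ndim
    bshape[axis] = x.shape[axis]

    mean = x.mean(axis=reduce_axes)
    var = x.var(axis=reduce_axes)
    x_hat = (x - mean.reshape(bshape)) / np.sqrt(var.reshape(bshape) + epsilon)
    out = gamma.reshape(bshape) * x_hat + beta.reshape(bshape)

    # Assumed update rule: exponential moving average weighted by momentum.
    new_mean = momentum * running_mean + (1 - momentum) * mean
    new_var = momentum * running_var + (1 - momentum) * var
    return out, new_mean, new_var

rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=(8, 4, 5, 5))  # NCHW layout, axis=1
C = x.shape[1]
out, rm, rv = batch_norm_train(x, np.ones(C), np.zeros(C), np.zeros(C), np.ones(C))
```

With gamma set to ones and beta to zeros, each channel of out has mean close to 0 and standard deviation close to 1, matching the description of the layer.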
- class mxnet.gluon.nn.basic_layers.Concatenate(axis=-1)[source]¶
Bases: Sequential
Lays Blocks concurrently.
This block feeds its input to all children blocks, and produces the output by concatenating all the children blocks’ outputs on the specified axis.
Example:
net = Concatenate()
net.add(nn.Dense(10, activation='relu'))
net.add(nn.Dense(20))
net.add(Identity())
- Parameters:
axis (int, default -1) – The axis on which to concatenate the outputs.
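As a rough illustration of the behaviour described above, the following NumPy sketch feeds one input to several callables and concatenates their outputs. The function concatenate_children and the lambda children are hypothetical stand-ins for child blocks, not part of the library.

```python
import numpy as np

def concatenate_children(x, children, axis=-1):
    """Apply every child to the same input, then join the outputs on `axis`."""
    outputs = [child(x) for child in children]
    return np.concatenate(outputs, axis=axis)

x = np.arange(6.0).reshape(2, 3)
children = [
    lambda t: t * 2,                        # stands in for a layer with 3 outputs
    lambda t: t.sum(axis=1, keepdims=True),  # stands in for a layer with 1 output
    lambda t: t,                            # plays the role of Identity
]
y = concatenate_children(x, children)
```

All children must produce outputs whose shapes agree on every axis except the concatenation axis, otherwise the concatenation fails.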
- add(*blocks)¶
Adds block on top of the stack.
- apply(fn)¶
Applies fn recursively to every child block as well as self.
- Parameters:
fn (callable) – Function to be applied to each submodule, of form fn(block).
- Return type:
this block
- cast(dtype)¶
Cast this Block to use another data type.
- Parameters:
dtype (str or numpy.dtype) – The new data type.
- collect_params(select=None)¶
Returns a Dict containing this Block’s and all of its children’s Parameters (default). It can also return only the Parameters whose names match a given regular expression.
For example, collect the specified parameters in [‘conv1.weight’, ‘conv1.bias’, ‘fc.weight’, ‘fc.bias’]:
model.collect_params('conv1.weight|conv1.bias|fc.weight|fc.bias')
or use a regular expression to collect all parameters whose names end with ‘weight’ or ‘bias’:
model.collect_params('.*weight|.*bias')
- Parameters:
select (str) – regular expressions
- Return type:
The selected Dict
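The selection by regular expression can be sketched with the standard re module. This is a minimal sketch that assumes names are tested with re.match (anchored at the start of the name only); the helper select_params and the parameter names are illustrative.

```python
import re

def select_params(params, select=None):
    """Return all entries, or only those whose name matches `select`."""
    if select is None:
        return dict(params)
    pattern = re.compile(select)
    # Assumption: matching is anchored at the start of the name (re.match).
    return {name: value for name, value in params.items() if pattern.match(name)}

params = {
    "conv1.weight": "W1",
    "conv1.bias": "b1",
    "bn1.gamma": "g1",
    "fc.weight": "W2",
    "fc.bias": "b2",
}
by_name = select_params(params, "conv1.weight|fc.bias")
by_suffix = select_params(params, ".*weight|.*bias")
```

Under this assumption a pattern such as 'conv1' would select every parameter whose name starts with conv1, so patterns should be as specific as the selection requires.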
- forward(x)[source]¶
Override to implement forward computation using NDArray. Only accepts positional arguments.
- hybridize(active=True, **kwargs)¶
Activates or deactivates HybridBlocks recursively. Has no effect on non-hybrid children.
- Parameters:
active (bool, default True) – Whether to turn hybrid on or off.
**kwargs (string) – Additional flags for hybridized operator.
- initialize(init=<mxnet.initializer.Uniform object>, device=None, verbose=False, force_reinit=False)¶
Initializes Parameters of this Block and its children.
- Parameters:
init (Initializer) – Global default Initializer to be used when Parameter.init() is None. Otherwise, Parameter.init() takes precedence.
device (Device or list of Device) – Keeps a copy of Parameters on one or many device(s).
verbose (bool, default False) – Whether to verbosely print out details on initialization.
force_reinit (bool, default False) – Whether to force re-initialization if parameter is already initialized.
- load(prefix)¶
Loads a model saved using the save API.
Reconfigures a model using the saved configuration. This function does not regenerate the model architecture. It resets each Block’s parameter UUIDs as they were when saved in order to match the names of the saved parameters.
This function assumes the Blocks in the model were created in the same order as when the model was saved. This is because each Block is uniquely identified by its class name and a unique ID, and children are traversed in order (since they are stored in an OrderedDict), with the unique ID denoting that specific Block.
Assumes that the model is created in an identical order every time. If the model cannot be recreated deterministically, do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph (Symbol & inputs) and settings are restored if it had been hybridized before saving.
- Parameters:
prefix (str) – The prefix to use in filenames for loading this model: <prefix>-model.json and <prefix>-model.params
- load_dict(param_dict, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')¶
Load parameters from dict
- Parameters:
param_dict (dict) – Dictionary containing model parameters
device (Device, optional) – Device context on which the memory is allocated. Default is mxnet.device.current_device().
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in the file.
ignore_extra (bool, default False) – Whether to silently ignore parameters from the file that are not present in this dict.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter, if any.
dtype_source (str, default 'current') – Must be in {‘current’, ‘saved’}. Only valid if cast_dtype=True. Specifies the source of the dtype for casting the parameters.
- load_parameters(filename, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')¶
Load parameters from file previously saved by save_parameters.
- Parameters:
filename (str) – Path to parameter file.
device (Device or list of Device, default cpu()) – Device(s) to initialize loaded parameters on.
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in the file.
ignore_extra (bool, default False) – Whether to silently ignore parameters from the file that are not present in this Block.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter if any.
dtype_source (str, default 'current') – Must be in {‘current’, ‘saved’}. Only valid if cast_dtype=True. Specifies the source of the dtype for casting the parameters.
- property params¶
Returns this Block’s parameter dictionary (does not include its children’s parameters).
- register_child(block, name=None)¶
Registers block as a child of self.
Blocks assigned to self as attributes will be registered automatically.
- register_forward_hook(hook)¶
Registers a forward hook on the block.
The hook function is called immediately after forward(). It should not modify the input or output.
- Parameters:
hook (callable) – The forward hook function of form hook(block, input, output) -> None.
- Return type:
mxnet.gluon.utils.HookHandle
- register_forward_pre_hook(hook)¶
Registers a forward pre-hook on the block.
The hook function is called immediately before forward(). It should not modify the input or output.
- Parameters:
hook (callable) – The forward pre-hook function of form hook(block, input) -> None.
- Return type:
mxnet.gluon.utils.HookHandle
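The ordering of the two kinds of hooks can be illustrated with a small pure-Python stand-in. TinyBlock below is hypothetical and only models the call order; it assumes hooks receive the positional inputs as a tuple, and it omits the HookHandle that the real methods return.

```python
class TinyBlock:
    """Hypothetical stand-in for a Block; only the hook ordering is modelled."""

    def __init__(self):
        self._pre_hooks = []
        self._hooks = []

    def register_forward_pre_hook(self, hook):
        self._pre_hooks.append(hook)

    def register_forward_hook(self, hook):
        self._hooks.append(hook)

    def forward(self, x):
        return x * 2

    def __call__(self, *args):
        for hook in self._pre_hooks:   # run immediately before forward()
            hook(self, args)
        out = self.forward(*args)
        for hook in self._hooks:       # run immediately after forward()
            hook(self, args, out)
        return out

events = []
block = TinyBlock()
block.register_forward_pre_hook(lambda blk, inp: events.append(("pre", inp)))
block.register_forward_hook(lambda blk, inp, out: events.append(("post", inp, out)))
result = block(3)
```

Both hooks only record what they observe, in line with the requirement that hooks should not modify the input or output.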
- register_op_hook(callback, monitor_all=False)¶
Install callback monitor.
- Parameters:
callback (function) – Function called to inspect the values of the intermediate outputs of blocks after hybridization. It takes 3 parameters: the name of the tensor being inspected (str), the name of the operator producing or consuming that tensor (str), and the tensor being inspected (NDArray).
monitor_all (bool, default False) – If True, monitor both input and output, otherwise monitor output only.
- reset_ctx(ctx)¶
This function has been deprecated. Please refer to Block.reset_device.
- reset_device(device)¶
Re-assign all Parameters to other devices.
- Parameters:
device (Device or list of Device, default device.current_device()) – Assign Parameter to given device. If device is a list of Device, a copy will be made for each device.
- save(prefix)¶
Saves the model architecture and parameters so they can be loaded again later.
Saves the model architecture as a nested dictionary where each Block in the model is a dictionary and its children are sub-dictionaries.
Each Block is uniquely identified by Block class name and a unique ID. We save each Block’s parameter UUID to restore later in order to match the saved parameters.
Recursively traverses a Block’s children in order (since they are stored in an OrderedDict) and uses the unique ID to denote that specific Block.
Assumes that the model is created in an identical order every time. If the model cannot be recreated deterministically, do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph is saved (Symbol & inputs) if it has already been hybridized.
- Parameters:
prefix (str) – The prefix to use in filenames for saving this model: <prefix>-model.json and <prefix>-model.params
- save_parameters(filename, deduplicate=False)¶
Save parameters to file.
Saved parameters can only be loaded with load_parameters. Note that this method only saves parameters, not model structure. If you want to save model structures, please use HybridBlock.export().
- setattr(name, value)¶
Set an attribute to a new value for all Parameters.
For example, set grad_req to null if you don’t need gradient w.r.t a model’s Parameters:
model.setattr('grad_req', 'null')
or change the learning rate multiplier:
model.setattr('lr_mult', 0.5)
- Parameters:
name (str) – Name of the attribute.
value (valid type for attribute name) – The new value for the attribute.
- share_parameters(shared)¶
Share parameters recursively inside the model.
For example, if you want dense1 to share dense0’s weights, you can do:
dense0 = nn.Dense(20)
dense1 = nn.Dense(20)
dense1.share_parameters(dense0.collect_params())
which is equivalent to:
dense1.weight = dense0.weight
dense1.bias = dense0.bias
Note that unlike the load_parameters or load_dict functions, share_parameters results in the Parameter object being shared (or tied) between the models, whereas load_parameters or load_dict only set the value of the data dictionary of a model. If you call load_parameters or load_dict after share_parameters, the loaded value will be reflected in all networks that use the shared (or tied) Parameter object.
- Parameters:
shared (Dict) – Dict of the shared parameters.
- Return type:
this block
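The difference between sharing a Parameter object and loading a value can be sketched in pure Python. Param and TinyLayer are hypothetical stand-ins written for this illustration; they are not the library's classes and model only object identity versus value copying.

```python
class Param:
    """Hypothetical parameter holder with a single value."""
    def __init__(self, value):
        self.value = value

class TinyLayer:
    """Hypothetical layer with one parameter named 'weight'."""
    def __init__(self, weight):
        self.params = {"weight": Param(weight)}

    def share_parameters(self, shared):
        # Tie: both layers now reference the very same Param object.
        for name in self.params:
            if name in shared:
                self.params[name] = shared[name]
        return self

    def load_dict(self, values):
        # Copy: only the stored value is overwritten.
        for name, value in values.items():
            self.params[name].value = value

a, b, c = TinyLayer(1.0), TinyLayer(2.0), TinyLayer(3.0)
b.share_parameters(a.params)                        # b is tied to a
c.load_dict({"weight": a.params["weight"].value})   # c copies a's current value
a.load_dict({"weight": 10.0})                       # later load into a
```

After the last line, b sees the new value because it holds the same object as a, while c keeps the value it copied earlier.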
- summary(*inputs)¶
Print the summary of the model’s output and parameters.
The network must have been initialized, and must not have been hybridized.
- Parameters:
inputs (object) – Any input that the model supports. For any tensor in the input, only mxnet.ndarray.NDArray is supported.
- zero_grad()¶
Sets all Parameters’ gradient buffer to 0.
- class mxnet.gluon.nn.basic_layers.Dense(units, activation=None, use_bias=True, flatten=True, dtype='float32', weight_initializer=None, bias_initializer='zeros', in_units=0, **kwargs)[source]¶
Bases: HybridBlock
Just your regular densely-connected NN layer.
Dense implements the operation: output = activation(dot(input, weight.T) + bias) where activation is the element-wise activation function passed as the activation argument, weight is a weights matrix created by the layer, and bias is a bias vector created by the layer (only applicable if use_bias is True).
- Parameters:
units (int) – Dimensionality of the output space.
activation (str) – Activation function to use. See help on Activation layer. If you don’t specify anything, no activation is applied (i.e. “linear” activation: a(x) = x).
use_bias (bool, default True) – Whether the layer uses a bias vector.
flatten (bool, default True) – Whether the input tensor should be flattened. If true, all but the first axis of input data are collapsed together. If false, all but the last axis of input data are kept the same, and the transformation applies on the last axis.
dtype (str or np.dtype, default 'float32') – Data type of output embeddings.
weight_initializer (str or Initializer) – Initializer for the kernel weights matrix.
bias_initializer (str or Initializer) – Initializer for the bias vector.
in_units (int, optional) – Size of the input data. If not specified, initialization will be deferred to the first time forward is called and in_units will be inferred from the shape of input data.
- Inputs:
data: if flatten is True, data should be a tensor with shape (batch_size, x1, x2, …, xn), where x1 * x2 * … * xn is equal to in_units. If flatten is False, data should have shape (x1, x2, …, xn, in_units).
- Outputs:
out: if flatten is True, out will be a tensor with shape (batch_size, units). If flatten is False, out will have shape (x1, x2, …, xn, units).
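The operation output = activation(dot(input, weight.T) + bias) and the effect of flatten on shapes can be sketched in NumPy. The function dense below is an illustrative helper, assuming a weight matrix of shape (units, in_units) as implied by the weight.T in the formula.

```python
import numpy as np

def dense(x, weight, bias=None, activation=None, flatten=True):
    """Sketch of the Dense computation; weight has shape (units, in_units)."""
    if flatten:
        # Collapse all but the first axis: (batch, x1, ..., xn) -> (batch, in_units).
        x = x.reshape(x.shape[0], -1)
    out = x @ weight.T            # applies to the last axis of x
    if bias is not None:
        out = out + bias
    if activation is not None:
        out = activation(out)
    return out

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 3, 4))
relu = lambda z: np.maximum(z, 0.0)

# flatten=True: in_units = 3 * 4 = 12, output shape (batch_size, units).
out_flat = dense(x, rng.normal(size=(5, 12)), np.zeros(5), relu, flatten=True)
# flatten=False: in_units = 4, output shape (2, 3, units).
out_keep = dense(x, rng.normal(size=(5, 4)), np.zeros(5), None, flatten=False)
```

With flatten=False the leading axes are left untouched and the same weight matrix is applied at every position along them.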
- apply(fn)¶
Applies fn recursively to every child block as well as self.
- Parameters:
fn (callable) – Function to be applied to each submodule, of form fn(block).
- Return type:
this block
- cast(dtype)¶
Cast this Block to use another data type.
- Parameters:
dtype (str or numpy.dtype) – The new data type.
- collect_params(select=None)¶
Returns a Dict containing this Block’s and all of its children’s Parameters (default). It can also return only the Parameters whose names match a given regular expression.
For example, collect the specified parameters in [‘conv1.weight’, ‘conv1.bias’, ‘fc.weight’, ‘fc.bias’]:
model.collect_params('conv1.weight|conv1.bias|fc.weight|fc.bias')
or use a regular expression to collect all parameters whose names end with ‘weight’ or ‘bias’:
model.collect_params('.*weight|.*bias')
- Parameters:
select (str) – regular expressions
- Return type:
The selected Dict
- export(path, epoch=0, remove_amp_cast=True)¶
Export HybridBlock to json format that can be loaded by gluon.SymbolBlock.imports or the C++ interface.
Note
When there is only one input, it will be named data. When there is more than one input, they will be named data0, data1, etc.
- Parameters:
path (str or None) – Path to save model. Two files path-symbol.json and path-xxxx.params will be created, where xxxx is the 4-digit epoch number. If None, do not export to file but return the Python Symbol object and the corresponding dictionary of parameters.
epoch (int) – Epoch number of saved model.
remove_amp_cast (bool, optional) – Whether to remove the amp_cast and amp_multicast operators, before saving the model.
- Returns:
symbol_filename (str) – Filename to which model symbols were saved, including path prefix.
params_filename (str) – Filename to which model parameters were saved, including path prefix.
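The file naming scheme described for export can be sketched as follows. The helper export_filenames is hypothetical and only reproduces the naming pattern stated above (path-symbol.json and path-xxxx.params with a zero-padded epoch); it does not write any files.

```python
def export_filenames(path, epoch=0):
    """Build the two filenames that follow the documented naming pattern."""
    symbol_filename = f"{path}-symbol.json"
    params_filename = f"{path}-{epoch:04d}.params"  # epoch padded to 4 digits
    return symbol_filename, params_filename

symbol_file, params_file = export_filenames("model", epoch=7)
```

This can be handy for locating the exported files later, for example when passing them to a loader that expects both names.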
- forward(x)[source]¶
Overrides the forward computation. Arguments must be mxnet.numpy.ndarray.
- hybridize(active=True, partition_if_dynamic=True, static_alloc=False, static_shape=False, inline_limit=2, forward_bulk_size=None, backward_bulk_size=None)¶
Activates or deactivates HybridBlocks recursively. Has no effect on non-hybrid children.
- Parameters:
active (bool, default True) – Whether to turn hybrid on or off.
partition_if_dynamic (bool, default True) – Whether to partition the graph when a dynamic shape op exists.
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
inline_limit (optional int, default 2) – Maximum number of operators that can be inlined.
forward_bulk_size (optional int, default None) – Segment size of bulk execution during forward pass.
backward_bulk_size (optional int, default None) – Segment size of bulk execution during backward pass.
- infer_type(*args)¶
Infers data type of Parameters from inputs.
- initialize(init=<mxnet.initializer.Uniform object>, device=None, verbose=False, force_reinit=False)¶
Initializes Parameters of this Block and its children.
- Parameters:
init (Initializer) – Global default Initializer to be used when Parameter.init() is None. Otherwise, Parameter.init() takes precedence.
device (Device or list of Device) – Keeps a copy of Parameters on one or many device(s).
verbose (bool, default False) – Whether to verbosely print out details on initialization.
force_reinit (bool, default False) – Whether to force re-initialization if parameter is already initialized.
- load(prefix)¶
Loads a model saved using the save API.
Reconfigures a model using the saved configuration. This function does not regenerate the model architecture. It resets each Block’s parameter UUIDs as they were when saved in order to match the names of the saved parameters.
This function assumes the Blocks in the model were created in the same order as when the model was saved. This is because each Block is uniquely identified by its class name and a unique ID, and children are traversed in order (since they are stored in an OrderedDict), with the unique ID denoting that specific Block.
Assumes that the model is created in an identical order every time. If the model cannot be recreated deterministically, do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph (Symbol & inputs) and settings are restored if it had been hybridized before saving.
- Parameters:
prefix (str) – The prefix to use in filenames for loading this model: <prefix>-model.json and <prefix>-model.params
- load_dict(param_dict, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')¶
Load parameters from dict
- Parameters:
param_dict (dict) – Dictionary containing model parameters
device (Device, optional) – Device context on which the memory is allocated. Default is mxnet.device.current_device().
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in the file.
ignore_extra (bool, default False) – Whether to silently ignore parameters from the file that are not present in this dict.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter, if any.
dtype_source (str, default 'current') – Must be in {‘current’, ‘saved’}. Only valid if cast_dtype=True. Specifies the source of the dtype for casting the parameters.
- load_parameters(filename, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')¶
Load parameters from file previously saved by save_parameters.
- Parameters:
filename (str) – Path to parameter file.
device (Device or list of Device, default cpu()) – Device(s) to initialize loaded parameters on.
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in the file.
ignore_extra (bool, default False) – Whether to silently ignore parameters from the file that are not present in this Block.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter if any.
dtype_source (str, default 'current') – Must be in {‘current’, ‘saved’}. Only valid if cast_dtype=True. Specifies the source of the dtype for casting the parameters.
- optimize_for(x, *args, backend=None, clear=False, partition_if_dynamic=True, static_alloc=False, static_shape=False, inline_limit=2, forward_bulk_size=None, backward_bulk_size=None, **kwargs)¶
Partitions the current HybridBlock and optimizes it for a given backend without executing a forward pass. Modifies the HybridBlock in-place.
Immediately partitions a HybridBlock using the specified backend. Combines the work done in the hybridize API with part of the work done in the forward pass without calling the CachedOp. Can be used in place of hybridize; afterwards, export can be called or inference can be run. See README.md in example/extensions/lib_subgraph/README.md for more details.
Examples
# partition and then export to file
block.optimize_for(x, backend='myPart')
block.export('partitioned')
# partition and then run inference
block.optimize_for(x, backend='myPart')
block(x)
- Parameters:
x (NDArray) – first input to model
*args (NDArray) – other inputs to model
backend (str) – The name of backend, as registered in SubgraphBackendRegistry, default None
backend_opts (dict of user-specified options to pass to the backend for partitioning, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
clear (bool, default False) – clears any previous optimizations
partition_if_dynamic (bool, default True) – Whether to partition the graph when a dynamic shape op exists.
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
inline_limit (optional int, default 2) – Maximum number of operators that can be inlined.
forward_bulk_size (optional int, default None) – Segment size of bulk execution during forward pass.
backward_bulk_size (optional int, default None) – Segment size of bulk execution during backward pass.
**kwargs (The backend options, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
- property params¶
Returns this Block’s parameter dictionary (does not include its children’s parameters).
- register_child(block, name=None)¶
Registers block as a child of self.
Blocks assigned to self as attributes will be registered automatically.
- register_forward_hook(hook)¶
Registers a forward hook on the block.
The hook function is called immediately after forward(). It should not modify the input or output.
- Parameters:
hook (callable) – The forward hook function of form hook(block, input, output) -> None.
- Return type:
mxnet.gluon.utils.HookHandle
- register_forward_pre_hook(hook)¶
Registers a forward pre-hook on the block.
The hook function is called immediately before forward(). It should not modify the input or output.
- Parameters:
hook (callable) – The forward pre-hook function of form hook(block, input) -> None.
- Return type:
mxnet.gluon.utils.HookHandle
- register_op_hook(callback, monitor_all=False)¶
Install op hook for block recursively.
- Parameters:
callback (function) – Function called to inspect the values of the intermediate outputs of blocks after hybridization. It takes 3 parameters: the name of the tensor being inspected (str), the name of the operator producing or consuming that tensor (str), and the tensor being inspected (NDArray).
monitor_all (bool, default False) – If True, monitor both input and output, otherwise monitor output only.
- reset_ctx(ctx)¶
This function has been deprecated. Please refer to HybridBlock.reset_device.
- reset_device(device)¶
Re-assign all Parameters to other devices. If the Block is hybridized, it will reset the _cached_op_args.
- Parameters:
device (Device or list of Device, default device.current_device()) – Assign Parameter to given device. If device is a list of Device, a copy will be made for each device.
- save(prefix)¶
Saves the model architecture and parameters so they can be loaded again later.
Saves the model architecture as a nested dictionary where each Block in the model is a dictionary and its children are sub-dictionaries.
Each Block is uniquely identified by Block class name and a unique ID. We save each Block’s parameter UUID to restore later in order to match the saved parameters.
Recursively traverses a Block’s children in order (since they are stored in an OrderedDict) and uses the unique ID to denote that specific Block.
Assumes that the model is created in an identical order every time. If the model cannot be recreated deterministically, do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph is saved (Symbol & inputs) if it has already been hybridized.
- Parameters:
prefix (str) – The prefix to use in filenames for saving this model: <prefix>-model.json and <prefix>-model.params
- save_parameters(filename, deduplicate=False)¶
Save parameters to file.
Saved parameters can only be loaded with load_parameters. Note that this method only saves parameters, not model structure. If you want to save model structures, please use HybridBlock.export().
- setattr(name, value)¶
Set an attribute to a new value for all Parameters.
For example, set grad_req to null if you don’t need gradient w.r.t a model’s Parameters:
model.setattr('grad_req', 'null')
or change the learning rate multiplier:
model.setattr('lr_mult', 0.5)
- Parameters:
name (str) – Name of the attribute.
value (valid type for attribute name) – The new value for the attribute.
- share_parameters(shared)¶
Share parameters recursively inside the model.
For example, if you want dense1 to share dense0’s weights, you can do:
dense0 = nn.Dense(20)
dense1 = nn.Dense(20)
dense1.share_parameters(dense0.collect_params())
which is equivalent to:
dense1.weight = dense0.weight
dense1.bias = dense0.bias
Note that unlike the load_parameters or load_dict functions, share_parameters results in the Parameter object being shared (or tied) between the models, whereas load_parameters or load_dict only set the value of the data dictionary of a model. If you call load_parameters or load_dict after share_parameters, the loaded value will be reflected in all networks that use the shared (or tied) Parameter object.
- Parameters:
shared (Dict) – Dict of the shared parameters.
- Return type:
this block
- summary(*inputs)¶
Print the summary of the model’s output and parameters.
The network must have been initialized, and must not have been hybridized.
- Parameters:
inputs (object) – Any input that the model supports. For any tensor in the input, only mxnet.ndarray.NDArray is supported.
- zero_grad()¶
Sets all Parameters’ gradient buffer to 0.
- class mxnet.gluon.nn.basic_layers.Dropout(rate, axes=(), **kwargs)[source]¶
Bases: HybridBlock
Applies Dropout to the input.
Dropout consists of randomly setting a fraction rate of input units to 0 at each update during training time, which helps prevent overfitting.
- Parameters:
- Inputs:
data: input tensor with arbitrary shape.
- Outputs:
out: output tensor with the same shape as data.
References
Dropout: A Simple Way to Prevent Neural Networks from Overfitting
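A minimal NumPy sketch of the idea follows. It assumes the common "inverted dropout" formulation, in which surviving units are scaled by 1 / (1 - rate) so the expected activation is unchanged; that scaling is an assumption here and is not stated above. The helper name dropout is illustrative, and the axes argument is not modelled.

```python
import numpy as np

def dropout(x, rate, rng, training=True):
    """Zero a fraction `rate` of units during training (inverted dropout sketch)."""
    if not training or rate == 0.0:
        return x  # pass-through outside of training
    keep = 1.0 - rate
    mask = rng.random(x.shape) < keep  # True for units that survive
    # Assumed rescaling so that the expected value of each unit is preserved.
    return np.where(mask, x / keep, 0.0)

rng = np.random.default_rng(0)
x = np.ones((1000, 100))
y_train = dropout(x, rate=0.5, rng=rng)
y_eval = dropout(x, rate=0.5, rng=rng, training=False)
```

In this sketch roughly half of the entries of y_train are zero and the rest are 2.0, while y_eval is the input unchanged.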
- apply(fn)¶
Applies fn recursively to every child block as well as self.
- Parameters:
fn (callable) – Function to be applied to each submodule, of form fn(block).
- Return type:
this block
- cast(dtype)¶
Cast this Block to use another data type.
- Parameters:
dtype (str or numpy.dtype) – The new data type.
- collect_params(select=None)¶
Returns a Dict containing this Block’s and all of its children’s Parameters (default). It can also return only the Parameters whose names match a given regular expression.
For example, collect the specified parameters in [‘conv1.weight’, ‘conv1.bias’, ‘fc.weight’, ‘fc.bias’]:
model.collect_params('conv1.weight|conv1.bias|fc.weight|fc.bias')
or use a regular expression to collect all parameters whose names end with ‘weight’ or ‘bias’:
model.collect_params('.*weight|.*bias')
- Parameters:
select (str) – regular expressions
- Return type:
The selected Dict
- export(path, epoch=0, remove_amp_cast=True)¶
Export HybridBlock to json format that can be loaded by gluon.SymbolBlock.imports or the C++ interface.
Note
When there is only one input, it will be named data. When there is more than one input, they will be named data0, data1, etc.
- Parameters:
path (str or None) – Path to save model. Two files path-symbol.json and path-xxxx.params will be created, where xxxx is the 4-digit epoch number. If None, do not export to file but return the Python Symbol object and the corresponding dictionary of parameters.
epoch (int) – Epoch number of saved model.
remove_amp_cast (bool, optional) – Whether to remove the amp_cast and amp_multicast operators, before saving the model.
- Returns:
symbol_filename (str) – Filename to which model symbols were saved, including path prefix.
params_filename (str) – Filename to which model parameters were saved, including path prefix.
- forward(x)[source]¶
Overrides the forward computation. Arguments must be mxnet.numpy.ndarray.
- hybridize(active=True, partition_if_dynamic=True, static_alloc=False, static_shape=False, inline_limit=2, forward_bulk_size=None, backward_bulk_size=None)¶
Activates or deactivates HybridBlocks recursively. Has no effect on non-hybrid children.
- Parameters:
active (bool, default True) – Whether to turn hybrid on or off.
partition_if_dynamic (bool, default True) – Whether to partition the graph when a dynamic shape op exists.
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
inline_limit (optional int, default 2) – Maximum number of operators that can be inlined.
forward_bulk_size (optional int, default None) – Segment size of bulk execution during forward pass.
backward_bulk_size (optional int, default None) – Segment size of bulk execution during backward pass.
- infer_shape(*args)¶
Infers shape of Parameters from inputs.
- infer_type(*args)¶
Infers data type of Parameters from inputs.
- initialize(init=<mxnet.initializer.Uniform object>, device=None, verbose=False, force_reinit=False)¶
Initializes Parameters of this Block and its children.
- Parameters:
init (Initializer) – Global default Initializer to be used when Parameter.init() is None. Otherwise, Parameter.init() takes precedence.
device (Device or list of Device) – Keeps a copy of Parameters on one or many device(s).
verbose (bool, default False) – Whether to verbosely print out details on initialization.
force_reinit (bool, default False) – Whether to force re-initialization if parameter is already initialized.
- load(prefix)¶
Loads a model saved using the save API.
Reconfigures a model using the saved configuration. This function does not regenerate the model architecture. It resets each Block’s parameter UUIDs as they were when saved in order to match the names of the saved parameters.
This function assumes the Blocks in the model were created in the same order as when the model was saved. This is because each Block is uniquely identified by its class name and a unique ID, and children are traversed in order (since they are stored in an OrderedDict), with the unique ID denoting that specific Block.
Assumes that the model is created in an identical order every time. If the model cannot be recreated deterministically, do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph (Symbol & inputs) and settings are restored if it had been hybridized before saving.
- Parameters:
prefix (str) – The prefix to use in filenames for loading this model: <prefix>-model.json and <prefix>-model.params
- load_dict(param_dict, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')¶
Load parameters from dict
- Parameters:
param_dict (dict) – Dictionary containing model parameters
device (Device, optional) – Device context on which the memory is allocated. Default is mxnet.device.current_device().
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in the file.
ignore_extra (bool, default False) – Whether to silently ignore parameters from the file that are not present in this dict.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter if any
dtype_source (str, default 'current') – must be in {‘current’, ‘saved’} Only valid if cast_dtype=True, specify the source of the dtype for casting the parameters
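The interaction of allow_missing and ignore_extra described above can be sketched as a simple name check. This is a simplified pure-Python illustration, not MXNet's implementation; the helper name check_param_names and the choice of KeyError are assumptions made for the example.

```python
# Illustrative sketch only: how allow_missing / ignore_extra could gate loading.
# `check_param_names` is a hypothetical helper, not part of MXNet.

def check_param_names(block_names, loaded_names,
                      allow_missing=False, ignore_extra=False):
    """Return the names that would be loaded, or raise if the sets disagree."""
    block_names, loaded_names = set(block_names), set(loaded_names)
    missing = block_names - loaded_names   # expected by the block, absent from the source
    extra = loaded_names - block_names     # present in the source, unknown to the block
    if missing and not allow_missing:
        raise KeyError("missing parameters: %s" % sorted(missing))
    if extra and not ignore_extra:
        raise KeyError("unexpected parameters: %s" % sorted(extra))
    return sorted(block_names & loaded_names)

print(check_param_names(['weight', 'bias'], ['weight'], allow_missing=True))
```

With both flags left at False, any mismatch between the two name sets raises in this sketch.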
- load_parameters(filename, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')¶
Load parameters from file previously saved by save_parameters.
- Parameters:
filename (str) – Path to parameter file.
device (Device or list of Device, default cpu()) – Device(s) to initialize loaded parameters on.
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in the file.
ignore_extra (bool, default False) – Whether to silently ignore parameters from the file that are not present in this Block.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter if any.
dtype_source (str, default 'current') – must be in {‘current’, ‘saved’} Only valid if cast_dtype=True, specify the source of the dtype for casting the parameters
- optimize_for(x, *args, backend=None, clear=False, partition_if_dynamic=True, static_alloc=False, static_shape=False, inline_limit=2, forward_bulk_size=None, backward_bulk_size=None, **kwargs)¶
Partitions the current HybridBlock and optimizes it for a given backend without executing a forward pass. Modifies the HybridBlock in-place.
Immediately partitions a HybridBlock using the specified backend. Combines the work done in the hybridize API with part of the work done in the forward pass without calling the CachedOp. Can be used in place of hybridize; afterwards, export can be called or inference can be run. See example/extensions/lib_subgraph/README.md for more details.
Examples
# partition and then export to file
block.optimize_for(x, backend='myPart')
block.export('partitioned')
# partition and then run inference
block.optimize_for(x, backend='myPart')
block(x)
- Parameters:
x (NDArray) – first input to model
*args (NDArray) – other inputs to model
backend (str) – The name of backend, as registered in SubgraphBackendRegistry, default None
backend_opts (dict of user-specified options to pass to the backend for partitioning, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
clear (bool, default False) – clears any previous optimizations
partition_if_dynamic (bool, default True) – Whether to partition the graph when a dynamic-shape operator exists.
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
inline_limit (optional int, default 2) – Maximum number of operators that can be inlined.
forward_bulk_size (optional int, default None) – Segment size of bulk execution during forward pass.
backward_bulk_size (optional int, default None) – Segment size of bulk execution during backward pass.
**kwargs (The backend options, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
- property params¶
Returns this Block’s parameter dictionary (does not include its children’s parameters).
- register_child(block, name=None)¶
Registers block as a child of self.
Blocks assigned to self as attributes will be registered automatically.
- register_forward_hook(hook)¶
Registers a forward hook on the block.
The hook function is called immediately after forward(). It should not modify the input or output.
- Parameters:
hook (callable) – The forward hook function of form hook(block, input, output) -> None.
- Return type:
mxnet.gluon.utils.HookHandle
- register_forward_pre_hook(hook)¶
Registers a forward pre-hook on the block.
The hook function is called immediately before forward(). It should not modify the input or output.
- Parameters:
hook (callable) – The forward pre-hook function of form hook(block, input) -> None.
- Return type:
mxnet.gluon.utils.HookHandle
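The call order of the two hook kinds described above can be sketched in pure Python. TinyBlock is a hypothetical stand-in written for this illustration, not an MXNet class; unlike the real methods it does not return a HookHandle, and it passes a single input directly.

```python
# Illustrative sketch only: pre-hooks run before forward(), hooks run after.

class TinyBlock:
    """Hypothetical block that only demonstrates hook ordering."""

    def __init__(self):
        self._pre_hooks = []
        self._hooks = []

    def register_forward_pre_hook(self, hook):
        self._pre_hooks.append(hook)

    def register_forward_hook(self, hook):
        self._hooks.append(hook)

    def forward(self, x):
        return x * 2

    def __call__(self, x):
        for hook in self._pre_hooks:
            hook(self, x)            # form: hook(block, input)
        out = self.forward(x)
        for hook in self._hooks:
            hook(self, x, out)       # form: hook(block, input, output)
        return out

events = []
block = TinyBlock()
block.register_forward_pre_hook(lambda blk, inp: events.append(('pre', inp)))
block.register_forward_hook(lambda blk, inp, out: events.append(('post', inp, out)))
result = block(3)
print(events)  # [('pre', 3), ('post', 3, 6)]
```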
- register_op_hook(callback, monitor_all=False)¶
Install op hook for block recursively.
- Parameters:
callback (function) – Function called to inspect the values of the intermediate outputs of blocks after hybridization. It takes 3 parameters: the name of the tensor being inspected (str), the name of the operator producing or consuming that tensor (str), and the tensor being inspected (NDArray).
monitor_all (bool, default False) – If True, monitor both input and output, otherwise monitor output only.
- reset_ctx(ctx)¶
This function has been deprecated. Please refer to HybridBlock.reset_device.
- reset_device(device)¶
Re-assign all Parameters to other devices. If the Block is hybridized, it will reset the _cached_op_args.
- Parameters:
device (Device or list of Device, default device.current_device()) – Assign Parameter to given device. If device is a list of Device, a copy will be made for each device.
- save(prefix)¶
Save the model architecture and parameters to load again later
Saves the model architecture as a nested dictionary where each Block in the model is a dictionary and its children are sub-dictionaries.
Each Block is uniquely identified by Block class name and a unique ID. We save each Block’s parameter UUID to restore later in order to match the saved parameters.
Recursively traverses a Block’s children in order (since it’s an OrderedDict) and uses the unique ID to denote that specific Block.
Assumes that the model is created in an identical order every time. If the model cannot be recreated deterministically, do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph is saved (Symbol & inputs) if it has already been hybridized.
- Parameters:
prefix (str) – The prefix to use in filenames for saving this model: <prefix>-model.json and <prefix>-model.params
- save_parameters(filename, deduplicate=False)¶
Save parameters to file.
Saved parameters can only be loaded with load_parameters. Note that this method only saves parameters, not model structure. If you want to save model structures, please use HybridBlock.export().
- setattr(name, value)¶
Set an attribute to a new value for all Parameters.
For example, set grad_req to null if you don’t need gradient w.r.t a model’s Parameters:
model.setattr('grad_req', 'null')
or change the learning rate multiplier:
model.setattr('lr_mult', 0.5)
- Parameters:
name (str) – Name of the attribute.
value (valid type for attribute name) – The new value for the attribute.
- share_parameters(shared)¶
Share parameters recursively inside the model.
For example, if you want dense1 to share dense0’s weights, you can do:
dense0 = nn.Dense(20)
dense1 = nn.Dense(20)
dense1.share_parameters(dense0.collect_params())
which is equivalent to:
dense1.weight = dense0.weight
dense1.bias = dense0.bias
Note that unlike the load_parameters or load_dict functions, share_parameters results in the Parameter object being shared (or tied) between the models, whereas load_parameters or load_dict only set the value of the data dictionary of a model. If you call load_parameters or load_dict after share_parameters, the loaded value will be reflected in all networks that use the shared (or tied) Parameter object.
- Parameters:
shared (Dict) – Dict of the shared parameters.
- Return type:
this block
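The difference between sharing a Parameter object and loading values into it can be sketched without MXNet. The Param class and the dictionaries below are hypothetical stand-ins chosen for illustration; they only mimic the object-identity behaviour described above.

```python
# Illustrative sketch only: tying an object versus overwriting its data.

class Param:
    """Hypothetical stand-in for a Parameter: a holder of data."""

    def __init__(self, data):
        self.data = data

# Two "layers", each starting with its own weight object.
dense0 = {'weight': Param([1.0, 2.0])}
dense1 = {'weight': Param([0.0, 0.0])}

# Sharing ties the object itself: both layers now hold the same Param.
dense1['weight'] = dense0['weight']
assert dense1['weight'] is dense0['weight']

# Loading only overwrites the data held by the Param...
dense1['weight'].data = [5.0, 6.0]

# ...so the loaded value is visible through every layer that shares it.
print(dense0['weight'].data)  # [5.0, 6.0]
```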
- summary(*inputs)¶
Print the summary of the model’s output and parameters.
The network must have been initialized, and must not have been hybridized.
- Parameters:
inputs (object) – Any input that the model supports. For any tensor in the input, only mxnet.ndarray.NDArray is supported.
- zero_grad()¶
Sets all Parameters’ gradient buffer to 0.
- class mxnet.gluon.nn.basic_layers.Embedding(input_dim, output_dim, dtype='float32', weight_initializer=None, sparse_grad=False, **kwargs)[source]¶
Bases: HybridBlock
Turns non-negative integers (indexes/tokens) into dense vectors of fixed size, e.g. [4, 20] -> [[0.25, 0.1], [0.6, -0.2]]
Note
If sparse_grad is set to True, the gradient w.r.t. weight will be sparse. Only a subset of optimizers support sparse gradients, including SGD, AdaGrad and Adam. By default, lazy updates are turned on, which may perform differently from standard updates. For more details, please check the Optimization API at: https://mxnet.apache.org/versions/master/api/python/docs/api/optimizer/index.html
- Parameters:
input_dim (int) – Size of the vocabulary, i.e. maximum integer index + 1.
output_dim (int) – Dimension of the dense embedding.
dtype (str or np.dtype, default 'float32') – Data type of output embeddings.
weight_initializer (Initializer) – Initializer for the embeddings matrix.
sparse_grad (bool) – If True, gradient w.r.t. weight will be a ‘row_sparse’ NDArray.
Inputs –
data: (N-1)-D tensor with shape: (x1, x2, …, xN-1).
Output –
out: N-D tensor with shape: (x1, x2, …, xN-1, output_dim).
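The shape rule above (an input of shape (x1, …, xN-1) yields an output of shape (x1, …, xN-1, output_dim)) can be illustrated with a pure-Python lookup table. This is a sketch of the idea, not the library's implementation; the function embed and the small weight table are hypothetical names chosen for the example.

```python
# Illustrative sketch only: an embedding is a row lookup into a weight table.
# `weight` plays the role of the (input_dim, output_dim) embedding matrix.

def embed(indices, weight):
    """Replace every integer index in a nested list with its row of `weight`."""
    if isinstance(indices, int):
        if not 0 <= indices < len(weight):
            raise IndexError("index must lie in [0, input_dim)")
        return list(weight[indices])
    return [embed(item, weight) for item in indices]

# Hypothetical table with input_dim=3 and output_dim=2.
weight = [[0.25, 0.1], [0.6, -0.2], [0.0, 1.0]]

# Input of shape (2, 2) gives an output of shape (2, 2, 2).
out = embed([[0, 1], [2, 0]], weight)
print(out)
```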
- apply(fn)¶
Applies fn recursively to every child block as well as self.
- Parameters:
fn (callable) – Function to be applied to each submodule, of form fn(block).
- Return type:
this block
- cast(dtype)¶
Cast this Block to use another data type.
- Parameters:
dtype (str or numpy.dtype) – The new data type.
- collect_params(select=None)¶
Returns a Dict containing this Block’s and all of its children’s Parameters (default). It can also return only the Parameters whose names match a given regular expression.
For example, collect the specified parameters in [‘conv1.weight’, ‘conv1.bias’, ‘fc.weight’, ‘fc.bias’]:
model.collect_params('conv1.weight|conv1.bias|fc.weight|fc.bias')
or collect all parameters whose names end with ‘weight’ or ‘bias’ using a regular expression:
model.collect_params('.*weight|.*bias')
- Parameters:
select (str) – regular expressions
- Return type:
The selected Dict
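How a select pattern narrows a set of parameter names can be sketched with the standard re module. This is an illustration under the assumption that each name is tested from its start with re.match; the helper select_names and the sample names are hypothetical.

```python
import re

# Illustrative sketch only: filter parameter names with a regular expression.
# Assumes names are matched from the start of the string (re.match).

def select_names(names, select):
    """Return the names matched by the regular expression `select`."""
    pattern = re.compile(select)
    return [name for name in names if pattern.match(name)]

names = ['conv1.weight', 'conv1.bias', 'fc.weight', 'fc.bias', 'bn.gamma']

print(select_names(names, 'conv1.weight|conv1.bias|fc.weight|fc.bias'))
print(select_names(names, '.*weight|.*bias'))
```

In this sample both patterns select the four weight and bias names and leave out 'bn.gamma'.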
- export(path, epoch=0, remove_amp_cast=True)¶
Export HybridBlock to json format that can be loaded by gluon.SymbolBlock.imports or the C++ interface.
Note
When there is only one input, it will be named data. When there is more than one input, they will be named data0, data1, etc.
- Parameters:
path (str or None) – Path to save model. Two files, path-symbol.json and path-xxxx.params, will be created, where xxxx is the 4-digit epoch number. If None, do not export to file but return the Python Symbol object and the corresponding dictionary of parameters.
epoch (int) – Epoch number of saved model.
remove_amp_cast (bool, optional) – Whether to remove the amp_cast and amp_multicast operators, before saving the model.
- Returns:
symbol_filename (str) – Filename to which model symbols were saved, including path prefix.
params_filename (str) – Filename to which model parameters were saved, including path prefix.
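The naming scheme for the two exported files can be sketched as follows. This is an illustration that assumes the epoch is zero-padded to four digits, as the path description states; export_filenames is a hypothetical helper and writes nothing to disk.

```python
# Illustrative sketch only: derive the two filenames from a path and an epoch.

def export_filenames(path, epoch=0):
    """Return (symbol_filename, params_filename) for a given path prefix."""
    symbol_filename = '%s-symbol.json' % path
    params_filename = '%s-%04d.params' % (path, epoch)  # epoch padded to 4 digits
    return symbol_filename, params_filename

print(export_filenames('model', epoch=7))  # ('model-symbol.json', 'model-0007.params')
```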
- forward(x)[source]¶
Overrides the forward computation. Arguments must be mxnet.numpy.ndarray.
- hybridize(active=True, partition_if_dynamic=True, static_alloc=False, static_shape=False, inline_limit=2, forward_bulk_size=None, backward_bulk_size=None)¶
Activates or deactivates HybridBlocks recursively. Has no effect on non-hybrid children.
- Parameters:
active (bool, default True) – Whether to turn hybrid on or off.
partition_if_dynamic (bool, default True) – Whether to partition the graph when a dynamic-shape operator exists.
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
inline_limit (optional int, default 2) – Maximum number of operators that can be inlined.
forward_bulk_size (optional int, default None) – Segment size of bulk execution during forward pass.
backward_bulk_size (optional int, default None) – Segment size of bulk execution during backward pass.
- infer_shape(*args)¶
Infers shape of Parameters from inputs.
- infer_type(*args)¶
Infers data type of Parameters from inputs.
- initialize(init=<mxnet.initializer.Uniform object>, device=None, verbose=False, force_reinit=False)¶
Initializes Parameters of this Block and its children.
- Parameters:
init (Initializer) – Global default Initializer to be used when Parameter.init() is None. Otherwise, Parameter.init() takes precedence.
device (Device or list of Device) – Keeps a copy of Parameters on one or many device(s).
verbose (bool, default False) – Whether to verbosely print out details on initialization.
force_reinit (bool, default False) – Whether to force re-initialization if parameter is already initialized.
- load(prefix)¶
Load a model saved using the save API
Reconfigures a model using the saved configuration. This function does not regenerate the model architecture. It resets each Block’s parameter UUIDs as they were when saved in order to match the names of the saved parameters.
This function assumes the Blocks in the model were created in the same order they were when the model was saved. This is because each Block is uniquely identified by its Block class name and a unique ID assigned in order (since it’s an OrderedDict), and the unique ID is used to denote that specific Block.
Assumes that the model is created in an identical order every time. If the model cannot be recreated deterministically, do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph (Symbol & inputs) and settings are restored if it had been hybridized before saving.
- Parameters:
prefix (str) – The prefix to use in filenames for loading this model: <prefix>-model.json and <prefix>-model.params
- load_dict(param_dict, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')¶
Load parameters from dict
- Parameters:
param_dict (dict) – Dictionary containing model parameters
device (Device, optional) – Device context on which the memory is allocated. Default is mxnet.device.current_device().
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in the file.
ignore_extra (bool, default False) – Whether to silently ignore parameters from the file that are not present in this dict.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter if any
dtype_source (str, default 'current') – must be in {‘current’, ‘saved’} Only valid if cast_dtype=True, specify the source of the dtype for casting the parameters
- load_parameters(filename, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')¶
Load parameters from file previously saved by save_parameters.
- Parameters:
filename (str) – Path to parameter file.
device (Device or list of Device, default cpu()) – Device(s) to initialize loaded parameters on.
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in the file.
ignore_extra (bool, default False) – Whether to silently ignore parameters from the file that are not present in this Block.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter if any.
dtype_source (str, default 'current') – must be in {‘current’, ‘saved’} Only valid if cast_dtype=True, specify the source of the dtype for casting the parameters
- optimize_for(x, *args, backend=None, clear=False, partition_if_dynamic=True, static_alloc=False, static_shape=False, inline_limit=2, forward_bulk_size=None, backward_bulk_size=None, **kwargs)¶
Partitions the current HybridBlock and optimizes it for a given backend without executing a forward pass. Modifies the HybridBlock in-place.
Immediately partitions a HybridBlock using the specified backend. Combines the work done in the hybridize API with part of the work done in the forward pass without calling the CachedOp. Can be used in place of hybridize; afterwards, export can be called or inference can be run. See example/extensions/lib_subgraph/README.md for more details.
Examples
# partition and then export to file
block.optimize_for(x, backend='myPart')
block.export('partitioned')
# partition and then run inference
block.optimize_for(x, backend='myPart')
block(x)
- Parameters:
x (NDArray) – first input to model
*args (NDArray) – other inputs to model
backend (str) – The name of backend, as registered in SubgraphBackendRegistry, default None
backend_opts (dict of user-specified options to pass to the backend for partitioning, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
clear (bool, default False) – clears any previous optimizations
partition_if_dynamic (bool, default True) – Whether to partition the graph when a dynamic-shape operator exists.
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
inline_limit (optional int, default 2) – Maximum number of operators that can be inlined.
forward_bulk_size (optional int, default None) – Segment size of bulk execution during forward pass.
backward_bulk_size (optional int, default None) – Segment size of bulk execution during backward pass.
**kwargs (The backend options, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
- property params¶
Returns this Block’s parameter dictionary (does not include its children’s parameters).
- register_child(block, name=None)¶
Registers block as a child of self.
Blocks assigned to self as attributes will be registered automatically.
- register_forward_hook(hook)¶
Registers a forward hook on the block.
The hook function is called immediately after forward(). It should not modify the input or output.
- Parameters:
hook (callable) – The forward hook function of form hook(block, input, output) -> None.
- Return type:
mxnet.gluon.utils.HookHandle
- register_forward_pre_hook(hook)¶
Registers a forward pre-hook on the block.
The hook function is called immediately before forward(). It should not modify the input or output.
- Parameters:
hook (callable) – The forward pre-hook function of form hook(block, input) -> None.
- Return type:
mxnet.gluon.utils.HookHandle
- register_op_hook(callback, monitor_all=False)¶
Install op hook for block recursively.
- Parameters:
callback (function) – Function called to inspect the values of the intermediate outputs of blocks after hybridization. It takes 3 parameters: the name of the tensor being inspected (str), the name of the operator producing or consuming that tensor (str), and the tensor being inspected (NDArray).
monitor_all (bool, default False) – If True, monitor both input and output, otherwise monitor output only.
- reset_ctx(ctx)¶
This function has been deprecated. Please refer to HybridBlock.reset_device.
- reset_device(device)¶
Re-assign all Parameters to other devices. If the Block is hybridized, it will reset the _cached_op_args.
- Parameters:
device (Device or list of Device, default device.current_device()) – Assign Parameter to given device. If device is a list of Device, a copy will be made for each device.
- save(prefix)¶
Save the model architecture and parameters to load again later
Saves the model architecture as a nested dictionary where each Block in the model is a dictionary and its children are sub-dictionaries.
Each Block is uniquely identified by Block class name and a unique ID. We save each Block’s parameter UUID to restore later in order to match the saved parameters.
Recursively traverses a Block’s children in order (since it’s an OrderedDict) and uses the unique ID to denote that specific Block.
Assumes that the model is created in an identical order every time. If the model cannot be recreated deterministically, do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph is saved (Symbol & inputs) if it has already been hybridized.
- Parameters:
prefix (str) – The prefix to use in filenames for saving this model: <prefix>-model.json and <prefix>-model.params
- save_parameters(filename, deduplicate=False)¶
Save parameters to file.
Saved parameters can only be loaded with load_parameters. Note that this method only saves parameters, not model structure. If you want to save model structures, please use HybridBlock.export().
- setattr(name, value)¶
Set an attribute to a new value for all Parameters.
For example, set grad_req to null if you don’t need gradient w.r.t a model’s Parameters:
model.setattr('grad_req', 'null')
or change the learning rate multiplier:
model.setattr('lr_mult', 0.5)
- Parameters:
name (str) – Name of the attribute.
value (valid type for attribute name) – The new value for the attribute.
- share_parameters(shared)¶
Share parameters recursively inside the model.
For example, if you want dense1 to share dense0’s weights, you can do:
dense0 = nn.Dense(20)
dense1 = nn.Dense(20)
dense1.share_parameters(dense0.collect_params())
which is equivalent to:
dense1.weight = dense0.weight
dense1.bias = dense0.bias
Note that unlike the load_parameters or load_dict functions, share_parameters results in the Parameter object being shared (or tied) between the models, whereas load_parameters or load_dict only set the value of the data dictionary of a model. If you call load_parameters or load_dict after share_parameters, the loaded value will be reflected in all networks that use the shared (or tied) Parameter object.
- Parameters:
shared (Dict) – Dict of the shared parameters.
- Return type:
this block
- summary(*inputs)¶
Print the summary of the model’s output and parameters.
The network must have been initialized, and must not have been hybridized.
- Parameters:
inputs (object) – Any input that the model supports. For any tensor in the input, only mxnet.ndarray.NDArray is supported.
- zero_grad()¶
Sets all Parameters’ gradient buffer to 0.
- class mxnet.gluon.nn.basic_layers.Flatten(**kwargs)[source]¶
Bases: HybridBlock
Flattens the input to two dimensions.
- Inputs:
data: input tensor with arbitrary shape (N, x1, x2, …, xn)
- Output:
out: 2D tensor with shape: (N, x1 · x2 · … · xn)
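The output shape rule above keeps the first (batch) axis and multiplies the remaining axes together. A minimal pure-Python sketch of that arithmetic follows; flattened_shape is a hypothetical helper that computes only the shape, not the data layout.

```python
import math

# Illustrative sketch only: compute the 2-D shape produced by flattening.

def flattened_shape(shape):
    """Collapse every axis after the first into a single axis."""
    batch, *rest = shape
    return (batch, math.prod(rest))

print(flattened_shape((8, 3, 32, 32)))  # (8, 3072)
```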
- apply(fn)¶
Applies fn recursively to every child block as well as self.
- Parameters:
fn (callable) – Function to be applied to each submodule, of form fn(block).
- Return type:
this block
- cast(dtype)¶
Cast this Block to use another data type.
- Parameters:
dtype (str or numpy.dtype) – The new data type.
- collect_params(select=None)¶
Returns a Dict containing this Block’s and all of its children’s Parameters (default). It can also return only the Parameters whose names match a given regular expression.
For example, collect the specified parameters in [‘conv1.weight’, ‘conv1.bias’, ‘fc.weight’, ‘fc.bias’]:
model.collect_params('conv1.weight|conv1.bias|fc.weight|fc.bias')
or collect all parameters whose names end with ‘weight’ or ‘bias’ using a regular expression:
model.collect_params('.*weight|.*bias')
- Parameters:
select (str) – regular expressions
- Return type:
The selected Dict
- export(path, epoch=0, remove_amp_cast=True)¶
Export HybridBlock to json format that can be loaded by gluon.SymbolBlock.imports or the C++ interface.
Note
When there is only one input, it will be named data. When there is more than one input, they will be named data0, data1, etc.
- Parameters:
path (str or None) – Path to save model. Two files, path-symbol.json and path-xxxx.params, will be created, where xxxx is the 4-digit epoch number. If None, do not export to file but return the Python Symbol object and the corresponding dictionary of parameters.
epoch (int) – Epoch number of saved model.
remove_amp_cast (bool, optional) – Whether to remove the amp_cast and amp_multicast operators, before saving the model.
- Returns:
symbol_filename (str) – Filename to which model symbols were saved, including path prefix.
params_filename (str) – Filename to which model parameters were saved, including path prefix.
- forward(x)[source]¶
Overrides the forward computation. Arguments must be mxnet.numpy.ndarray.
- hybridize(active=True, partition_if_dynamic=True, static_alloc=False, static_shape=False, inline_limit=2, forward_bulk_size=None, backward_bulk_size=None)¶
Activates or deactivates HybridBlocks recursively. Has no effect on non-hybrid children.
- Parameters:
active (bool, default True) – Whether to turn hybrid on or off.
partition_if_dynamic (bool, default True) – Whether to partition the graph when a dynamic-shape operator exists.
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
inline_limit (optional int, default 2) – Maximum number of operators that can be inlined.
forward_bulk_size (optional int, default None) – Segment size of bulk execution during forward pass.
backward_bulk_size (optional int, default None) – Segment size of bulk execution during backward pass.
- infer_shape(*args)¶
Infers shape of Parameters from inputs.
- infer_type(*args)¶
Infers data type of Parameters from inputs.
- initialize(init=<mxnet.initializer.Uniform object>, device=None, verbose=False, force_reinit=False)¶
Initializes Parameters of this Block and its children.
- Parameters:
init (Initializer) – Global default Initializer to be used when Parameter.init() is None. Otherwise, Parameter.init() takes precedence.
device (Device or list of Device) – Keeps a copy of Parameters on one or many device(s).
verbose (bool, default False) – Whether to verbosely print out details on initialization.
force_reinit (bool, default False) – Whether to force re-initialization if parameter is already initialized.
- load(prefix)¶
Load a model saved using the save API
Reconfigures a model using the saved configuration. This function does not regenerate the model architecture. It resets each Block’s parameter UUIDs as they were when saved in order to match the names of the saved parameters.
This function assumes the Blocks in the model were created in the same order they were when the model was saved. This is because each Block is uniquely identified by its Block class name and a unique ID assigned in order (since it’s an OrderedDict), and the unique ID is used to denote that specific Block.
Assumes that the model is created in an identical order every time. If the model cannot be recreated deterministically, do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph (Symbol & inputs) and settings are restored if it had been hybridized before saving.
- Parameters:
prefix (str) – The prefix to use in filenames for loading this model: <prefix>-model.json and <prefix>-model.params
- load_dict(param_dict, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')¶
Load parameters from dict
- Parameters:
param_dict (dict) – Dictionary containing model parameters
device (Device, optional) – Device context on which the memory is allocated. Default is mxnet.device.current_device().
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in the file.
ignore_extra (bool, default False) – Whether to silently ignore parameters from the file that are not present in this dict.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter if any
dtype_source (str, default 'current') – must be in {‘current’, ‘saved’} Only valid if cast_dtype=True, specify the source of the dtype for casting the parameters
- load_parameters(filename, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')¶
Load parameters from file previously saved by save_parameters.
- Parameters:
filename (str) – Path to parameter file.
device (Device or list of Device, default cpu()) – Device(s) to initialize loaded parameters on.
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in the file.
ignore_extra (bool, default False) – Whether to silently ignore parameters from the file that are not present in this Block.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter if any.
dtype_source (str, default 'current') – must be in {‘current’, ‘saved’} Only valid if cast_dtype=True, specify the source of the dtype for casting the parameters
- optimize_for(x, *args, backend=None, clear=False, partition_if_dynamic=True, static_alloc=False, static_shape=False, inline_limit=2, forward_bulk_size=None, backward_bulk_size=None, **kwargs)¶
Partitions the current HybridBlock and optimizes it for a given backend without executing a forward pass. Modifies the HybridBlock in-place.
Immediately partitions a HybridBlock using the specified backend. Combines the work done in the hybridize API with part of the work done in the forward pass without calling the CachedOp. Can be used in place of hybridize; afterwards, export can be called or inference can be run. See example/extensions/lib_subgraph/README.md for more details.
Examples
# partition and then export to file
block.optimize_for(x, backend='myPart')
block.export('partitioned')
# partition and then run inference
block.optimize_for(x, backend='myPart')
block(x)
- Parameters:
x (NDArray) – first input to model
*args (NDArray) – other inputs to model
backend (str) – The name of backend, as registered in SubgraphBackendRegistry, default None
backend_opts (dict of user-specified options to pass to the backend for partitioning, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
clear (bool, default False) – clears any previous optimizations
partition_if_dynamic (bool, default True) – Whether to partition the graph when a dynamic-shape operator exists.
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
inline_limit (optional int, default 2) – Maximum number of operators that can be inlined.
forward_bulk_size (optional int, default None) – Segment size of bulk execution during forward pass.
backward_bulk_size (optional int, default None) – Segment size of bulk execution during backward pass.
**kwargs (The backend options, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
- property params¶
Returns this Block’s parameter dictionary (does not include its children’s parameters).
- register_child(block, name=None)¶
Registers block as a child of self.
Blocks assigned to self as attributes will be registered automatically.
- register_forward_hook(hook)¶
Registers a forward hook on the block.
The hook function is called immediately after forward(). It should not modify the input or output.
- Parameters:
hook (callable) – The forward hook function of form hook(block, input, output) -> None.
- Return type:
mxnet.gluon.utils.HookHandle
- register_forward_pre_hook(hook)¶
Registers a forward pre-hook on the block.
The hook function is called immediately before forward(). It should not modify the input or output.
- Parameters:
hook (callable) – The forward pre-hook function of form hook(block, input) -> None.
- Return type:
mxnet.gluon.utils.HookHandle
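The call order described for forward pre-hooks and forward hooks can be illustrated with a minimal pure-Python sketch. TinyBlock below is a hypothetical stand-in written for this illustration, not an MXNet class; it only mimics the documented hook forms hook(block, input) and hook(block, input, output), and it omits the HookHandle that the real methods return.

```python
class TinyBlock:
    """Hypothetical stand-in for a Gluon block (not part of MXNet)."""

    def __init__(self):
        self._pre_hooks = []
        self._hooks = []

    def register_forward_pre_hook(self, hook):
        self._pre_hooks.append(hook)

    def register_forward_hook(self, hook):
        self._hooks.append(hook)

    def forward(self, x):
        return [2 * v for v in x]

    def __call__(self, *args):
        for hook in self._pre_hooks:
            hook(self, args)        # called immediately before forward()
        out = self.forward(*args)
        for hook in self._hooks:
            hook(self, args, out)   # called immediately after forward()
        return out


events = []
block = TinyBlock()
block.register_forward_pre_hook(lambda blk, inputs: events.append(("pre", inputs)))
block.register_forward_hook(lambda blk, inputs, output: events.append(("post", output)))
result = block([1, 2, 3])
print(events)
```

Both hooks only observe their arguments here, in line with the note that a hook should not modify the input or output.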
- register_op_hook(callback, monitor_all=False)¶
Install op hook for block recursively.
- Parameters:
callback (function) – Function called to inspect the values of the intermediate outputs of blocks after hybridization. It takes 3 parameters: the name of the tensor being inspected (str), the name of the operator producing or consuming that tensor (str), and the tensor being inspected (NDArray).
monitor_all (bool, default False) – If True, monitor both input and output, otherwise monitor output only.
- reset_ctx(ctx)¶
This function has been deprecated. Please refer to HybridBlock.reset_device.
- reset_device(device)¶
Re-assign all Parameters to other devices. If the Block is hybridized, it will reset the _cached_op_args.
- Parameters:
device (Device or list of Device, default device.current_device()) – Assign Parameter to given device. If device is a list of Device, a copy will be made for each device.
- save(prefix)¶
Save the model architecture and parameters to load again later
Saves the model architecture as a nested dictionary where each Block in the model is a dictionary and its children are sub-dictionaries.
Each Block is uniquely identified by Block class name and a unique ID. We save each Block’s parameter UUID to restore later in order to match the saved parameters.
Recursively traverses a Block’s children in order (since it’s an OrderedDict) and uses the unique ID to denote that specific Block.
Assumes that the model is created in an identical order every time. If the model cannot be recreated deterministically, do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph is saved (Symbol & inputs) if it has already been hybridized.
- Parameters:
prefix (str) – The prefix to use in filenames for saving this model: <prefix>-model.json and <prefix>-model.params
- save_parameters(filename, deduplicate=False)¶
Save parameters to file.
Saved parameters can only be loaded with load_parameters. Note that this method only saves parameters, not model structure. If you want to save model structures, please use HybridBlock.export().
- setattr(name, value)¶
Set an attribute to a new value for all Parameters.
For example, set grad_req to null if you don’t need gradient w.r.t a model’s Parameters:
model.setattr('grad_req', 'null')
or change the learning rate multiplier:
model.setattr('lr_mult', 0.5)
- Parameters:
name (str) – Name of the attribute.
value (valid type for attribute name) – The new value for the attribute.
- share_parameters(shared)¶
Share parameters recursively inside the model.
For example, if you want dense1 to share dense0’s weights, you can do:
dense0 = nn.Dense(20)
dense1 = nn.Dense(20)
dense1.share_parameters(dense0.collect_params())
which is equivalent to:
dense1.weight = dense0.weight
dense1.bias = dense0.bias
Note that unlike the load_parameters or load_dict functions, share_parameters results in the Parameter object being shared (or tied) between the models, whereas load_parameters or load_dict only set the value of the data dictionary of a model. If you call load_parameters or load_dict after share_parameters, the loaded value will be reflected in all networks that use the shared (or tied) Parameter object.
- Parameters:
shared (Dict) – Dict of the shared parameters.
- Return type:
this block
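The difference between tying a Parameter and copying its value can be shown with a small pure-Python sketch. The Param and Layer classes are hypothetical stand-ins introduced only for this illustration; they are not MXNet classes and model just the object-sharing behaviour described above.

```python
class Param:
    """Hypothetical minimal parameter that holds a single value."""

    def __init__(self, value):
        self.value = value


class Layer:
    """Hypothetical layer with one parameter named 'weight'."""

    def __init__(self, weight):
        self.params = {"weight": Param(weight)}

    def share_parameters(self, shared):
        # Tie: reuse the very same Param objects.
        for name, param in shared.items():
            if name in self.params:
                self.params[name] = param
        return self

    def load_dict(self, values):
        # Copy: write new values into the existing Param objects.
        for name, value in values.items():
            self.params[name].value = value


dense0 = Layer(1.0)
dense1 = Layer(2.0)
dense2 = Layer(3.0)

dense1.share_parameters(dense0.params)                       # tied to dense0
dense2.load_dict({"weight": dense0.params["weight"].value})  # value copied once

dense0.load_dict({"weight": 9.0})  # a later load is visible through the tie only
print(dense1.params["weight"].value, dense2.params["weight"].value)
```

After the final load, dense1 sees the new value because it holds the same object, while dense2 keeps the value it copied earlier.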
- summary(*inputs)¶
Print the summary of the model’s output and parameters.
The network must have been initialized, and must not have been hybridized.
- Parameters:
inputs (object) – Any input that the model supports. For any tensor in the input, only mxnet.ndarray.NDArray is supported.
- zero_grad()¶
Sets all Parameters’ gradient buffer to 0.
- class mxnet.gluon.nn.basic_layers.GroupNorm(num_groups=1, epsilon=1e-05, center=True, scale=True, beta_initializer='zeros', gamma_initializer='ones', in_channels=0)[source]¶
Bases: HybridBlock
Applies group normalization to the n-dimensional input array. This operator takes an n-dimensional input array where the leftmost 2 axes are batch and channel respectively:
\[\begin{split}x &= \text{x.reshape}((N, \text{num\_groups}, C // \text{num\_groups}, \ldots)) \\ \text{axis} &= (2, \ldots) \\ \text{out} &= \frac{x - \text{mean}[x, \text{axis}]}{\sqrt{\text{Var}[x, \text{axis}] + \epsilon}} * \text{gamma} + \text{beta}\end{split}\]
- Parameters:
num_groups (int, default 1) – Number of groups to separate the channel axis into.
epsilon (float, default 1e-5) – Small float added to variance to avoid dividing by zero.
center (bool, default True) – If True, add offset of beta to normalized tensor. If False, beta is ignored.
scale (bool, default True) – If True, multiply by gamma. If False, gamma is not used.
beta_initializer (str or Initializer, default ‘zeros’) – Initializer for the beta weight.
gamma_initializer (str or Initializer, default ‘ones’) – Initializer for the gamma weight.
- Inputs:
data: input tensor with shape (N, C, …).
- Outputs:
out: output tensor with the same shape as data.
Examples
>>> # Input of shape (2, 3, 4)
>>> x = mx.np.array([[[ 0, 1, 2, 3], [ 4, 5, 6, 7], [ 8, 9, 10, 11]],
...                  [[12, 13, 14, 15], [16, 17, 18, 19], [20, 21, 22, 23]]])
>>> # Group normalization is calculated with the above formula
>>> layer = GroupNorm()
>>> layer.initialize(device=mx.cpu(0))
>>> layer(x)
[[[-1.5932543 -1.3035717 -1.0138891 -0.7242065]
  [-0.4345239 -0.1448413  0.1448413  0.4345239]
  [ 0.7242065  1.0138891  1.3035717  1.5932543]]
 [[-1.5932543 -1.3035717 -1.0138891 -0.7242065]
  [-0.4345239 -0.1448413  0.1448413  0.4345239]
  [ 0.7242065  1.0138891  1.3035717  1.5932543]]]
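The arithmetic of the formula can be cross-checked with a short NumPy sketch. The helper group_norm is a hypothetical name introduced here, and it simplifies gamma and beta to scalars (the layer itself holds learnable parameters); it is meant only to reproduce the computation for the example input, not the layer's implementation.

```python
import numpy as np


def group_norm(x, num_groups=1, gamma=1.0, beta=0.0, epsilon=1e-5):
    """Hypothetical helper: normalize each group of channels per sample."""
    n, c = x.shape[:2]
    assert c % num_groups == 0, "C must be divisible by num_groups"
    # Flatten every group (its channels and trailing axes) into one axis.
    grouped = x.reshape((n, num_groups, -1))
    mean = grouped.mean(axis=2, keepdims=True)
    var = grouped.var(axis=2, keepdims=True)
    normalized = (grouped - mean) / np.sqrt(var + epsilon)
    return normalized.reshape(x.shape) * gamma + beta


x = np.arange(24, dtype=np.float32).reshape(2, 3, 4)
out = group_norm(x)  # num_groups=1: statistics over all 12 values per sample
print(out[0, 0])
```

With num_groups=1 the mean and variance are taken over all channels of a sample, which is why both samples in the example normalize to the same values.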
- apply(fn)¶
Applies fn recursively to every child block as well as self.
- Parameters:
fn (callable) – Function to be applied to each submodule, of form fn(block).
- Return type:
this block
- cast(dtype)¶
Cast this Block to use another data type.
- Parameters:
dtype (str or numpy.dtype) – The new data type.
- collect_params(select=None)¶
Returns a Dict containing this Block’s and all of its children’s Parameters (default). It can also return the selected Dict of Parameters whose names match the given regular expressions.
For example, collect the specified parameters in [‘conv1.weight’, ‘conv1.bias’, ‘fc.weight’, ‘fc.bias’]:
model.collect_params('conv1.weight|conv1.bias|fc.weight|fc.bias')
or collect all parameters whose names end with ‘weight’ or ‘bias’, using regular expressions:
model.collect_params('.*weight|.*bias')
- Parameters:
select (str) – regular expressions
- Return type:
The selected Dict
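How a select pattern narrows down a set of parameter names can be sketched with the standard re module. The helper select_params and the sample names are hypothetical, and the sketch assumes the pattern is matched from the start of each name (as re.match does); treat it as an illustration of the regular expressions shown above rather than the library's code.

```python
import re


def select_params(params, select=None):
    """Hypothetical helper: keep entries whose names match `select`."""
    if select is None:
        return dict(params)
    pattern = re.compile(select)
    return {name: value for name, value in params.items() if pattern.match(name)}


params = {
    "conv1.weight": 0,
    "conv1.bias": 1,
    "bn1.gamma": 2,
    "fc.weight": 3,
    "fc.bias": 4,
}

explicit = select_params(params, "conv1.weight|conv1.bias|fc.weight|fc.bias")
by_suffix = select_params(params, ".*weight|.*bias")
print(sorted(explicit), sorted(by_suffix))
```

Note that an unescaped dot in these patterns matches any character, so 'conv1.weight' is slightly more permissive than a literal name comparison.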
- export(path, epoch=0, remove_amp_cast=True)¶
Export HybridBlock to json format that can be loaded by gluon.SymbolBlock.imports or the C++ interface.
Note
When there is only one input, it will be named data. When there is more than one input, they will be named data0, data1, etc.
- Parameters:
path (str or None) – Path to save model. Two files, path-symbol.json and path-xxxx.params, will be created, where xxxx is the 4-digit epoch number. If None, do not export to file but return the Python Symbol object and the corresponding dictionary of parameters.
epoch (int) – Epoch number of saved model.
remove_amp_cast (bool, optional) – Whether to remove the amp_cast and amp_multicast operators, before saving the model.
- Returns:
symbol_filename (str) – Filename to which model symbols were saved, including path prefix.
params_filename (str) – Filename to which model parameters were saved, including path prefix.
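The naming rule for the two exported files can be written out as a tiny sketch. export_filenames is a hypothetical helper that only mirrors the path-symbol.json and path-xxxx.params pattern described above; it does not call MXNet or write any files.

```python
def export_filenames(path, epoch=0):
    """Hypothetical helper: build the two file names described for export."""
    symbol_filename = f"{path}-symbol.json"
    params_filename = f"{path}-{epoch:04d}.params"  # zero-padded 4-digit epoch
    return symbol_filename, params_filename


symbol_file, params_file = export_filenames("model/resnet", epoch=7)
print(symbol_file, params_file)
```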
- forward(data)[source]¶
Overrides the forward computation. Arguments must be mxnet.numpy.ndarray.
- hybridize(active=True, partition_if_dynamic=True, static_alloc=False, static_shape=False, inline_limit=2, forward_bulk_size=None, backward_bulk_size=None)¶
Activates or deactivates HybridBlocks recursively. Has no effect on non-hybrid children.
- Parameters:
active (bool, default True) – Whether to turn hybrid on or off.
partition_if_dynamic (bool, default True) – Whether to partition the graph when a dynamic-shape operator exists.
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
inline_limit (optional int, default 2) – Maximum number of operators that can be inlined.
forward_bulk_size (optional int, default None) – Segment size of bulk execution during forward pass.
backward_bulk_size (optional int, default None) – Segment size of bulk execution during backward pass.
- infer_type(*args)¶
Infers data type of Parameters from inputs.
- initialize(init=<mxnet.initializer.Uniform object>, device=None, verbose=False, force_reinit=False)¶
Initializes Parameters of this Block and its children.
- Parameters:
init (Initializer) – Global default Initializer to be used when Parameter.init() is None. Otherwise, Parameter.init() takes precedence.
device (Device or list of Device) – Keeps a copy of Parameters on one or many device(s).
verbose (bool, default False) – Whether to verbosely print out details on initialization.
force_reinit (bool, default False) – Whether to force re-initialization if parameter is already initialized.
- load(prefix)¶
Load a model saved using the save API
Reconfigures a model using the saved configuration. This function does not regenerate the model architecture. It resets each Block’s parameter UUIDs as they were when saved in order to match the names of the saved parameters.
This function assumes the Blocks in the model were created in the same order they were when the model was saved. This is because each Block is uniquely identified by Block class name and a unique ID in order (since it’s an OrderedDict) and uses the unique ID to denote that specific Block.
Assumes that the model is created in an identical order every time. If the model cannot be recreated deterministically, do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph (Symbol & inputs) and settings are restored if it had been hybridized before saving.
- Parameters:
prefix (str) – The prefix to use in filenames for loading this model: <prefix>-model.json and <prefix>-model.params
- load_dict(param_dict, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')¶
Load parameters from dict
- Parameters:
param_dict (dict) – Dictionary containing model parameters
device (Device, optional) – Device context on which the memory is allocated. Default is mxnet.device.current_device().
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in the file.
ignore_extra (bool, default False) – Whether to silently ignore parameters from the file that are not present in this dict.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter if any
dtype_source (str, default 'current') – Must be one of {‘current’, ‘saved’}. Only valid if cast_dtype=True; specifies the source of the dtype for casting the parameters.
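The effect of allow_missing and ignore_extra can be sketched as a name check between the block's parameters and the loaded dictionary. check_param_names is a hypothetical helper, and the use of KeyError is an assumption made for this illustration; the library may report mismatches differently.

```python
def check_param_names(block_params, loaded, allow_missing=False, ignore_extra=False):
    """Hypothetical helper: decide which parameter names can be loaded."""
    missing = sorted(set(block_params) - set(loaded))
    extra = sorted(set(loaded) - set(block_params))
    if missing and not allow_missing:
        raise KeyError(f"missing parameters: {missing}")   # assumed error type
    if extra and not ignore_extra:
        raise KeyError(f"unexpected parameters: {extra}")  # assumed error type
    return [name for name in block_params if name in loaded]


block_params = ["weight", "bias"]

# 'bias' is absent from the loaded dict, so it is skipped only when allowed.
loadable = check_param_names(block_params, {"weight": 1}, allow_missing=True)

# 'gamma' is not a parameter of the block, so it is ignored only when allowed.
loadable_extra = check_param_names(
    block_params, {"weight": 1, "bias": 2, "gamma": 3}, ignore_extra=True
)
print(loadable, loadable_extra)
```

With both flags left at False, either kind of mismatch raises in this sketch, which corresponds to the strict default described above.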
- load_parameters(filename, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')¶
Load parameters from file previously saved by save_parameters.
- Parameters:
filename (str) – Path to parameter file.
device (Device or list of Device, default cpu()) – Device(s) to initialize loaded parameters on.
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in the file.
ignore_extra (bool, default False) – Whether to silently ignore parameters from the file that are not present in this Block.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter if any.
dtype_source (str, default 'current') – Must be one of {‘current’, ‘saved’}. Only valid if cast_dtype=True; specifies the source of the dtype for casting the parameters.
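One way to read cast_dtype and dtype_source is sketched below with NumPy arrays standing in for loaded data. resolve_dtype is a hypothetical helper, and the behaviour it encodes is an assumption based on the parameter descriptions: with 'current' the loaded data is cast to the Parameter's dtype, and with 'saved' the Parameter adopts the dtype found in the file.

```python
import numpy as np


def resolve_dtype(param_dtype, loaded, cast_dtype=False, dtype_source="current"):
    """Hypothetical helper: return (data, dtype the parameter ends up with)."""
    param_dtype = np.dtype(param_dtype)
    if loaded.dtype == param_dtype:
        return loaded, param_dtype
    if not cast_dtype:
        # Assumed behaviour: a dtype mismatch is an error unless casting is on.
        raise TypeError(f"dtype mismatch: {loaded.dtype} vs {param_dtype}")
    if dtype_source == "current":
        return loaded.astype(param_dtype), param_dtype  # cast data to the parameter
    if dtype_source == "saved":
        return loaded, loaded.dtype                     # parameter follows the file
    raise ValueError("dtype_source must be 'current' or 'saved'")


saved = np.array([1.5, 2.5], dtype=np.float16)
data_cur, dtype_cur = resolve_dtype(np.float32, saved, cast_dtype=True, dtype_source="current")
data_sav, dtype_sav = resolve_dtype(np.float32, saved, cast_dtype=True, dtype_source="saved")
print(data_cur.dtype, data_sav.dtype)
```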
- optimize_for(x, *args, backend=None, clear=False, partition_if_dynamic=True, static_alloc=False, static_shape=False, inline_limit=2, forward_bulk_size=None, backward_bulk_size=None, **kwargs)¶
Partitions the current HybridBlock and optimizes it for a given backend without executing a forward pass. Modifies the HybridBlock in-place.
Immediately partitions a HybridBlock using the specified backend. Combines the work done in the hybridize API with part of the work done in the forward pass without calling the CachedOp. It can be used in place of hybridize; afterwards, export can be called or inference can be run. See example/extensions/lib_subgraph/README.md for more details.
Examples
# partition and then export to file
block.optimize_for(x, backend='myPart')
block.export('partitioned')

# partition and then run inference
block.optimize_for(x, backend='myPart')
block(x)
- Parameters:
x (NDArray) – first input to model
*args (NDArray) – other inputs to model
backend (str) – The name of backend, as registered in SubgraphBackendRegistry, default None
backend_opts (dict of user-specified options to pass to the backend for partitioning, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
clear (bool, default False) – Clears any previous optimizations.
partition_if_dynamic (bool, default True) – Whether to partition the graph when a dynamic-shape operator exists.
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
inline_limit (optional int, default 2) – Maximum number of operators that can be inlined.
forward_bulk_size (optional int, default None) – Segment size of bulk execution during forward pass.
backward_bulk_size (optional int, default None) – Segment size of bulk execution during backward pass.
**kwargs (The backend options, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
- property params¶
Returns this Block’s parameter dictionary (does not include its children’s parameters).
- register_child(block, name=None)¶
Registers block as a child of self.
Blocks assigned to self as attributes will be registered automatically.
- register_forward_hook(hook)¶
Registers a forward hook on the block.
The hook function is called immediately after forward(). It should not modify the input or output.
- Parameters:
hook (callable) – The forward hook function of form hook(block, input, output) -> None.
- Return type:
mxnet.gluon.utils.HookHandle
- register_forward_pre_hook(hook)¶
Registers a forward pre-hook on the block.
The hook function is called immediately before forward(). It should not modify the input or output.
- Parameters:
hook (callable) – The forward pre-hook function of form hook(block, input) -> None.
- Return type:
mxnet.gluon.utils.HookHandle
- register_op_hook(callback, monitor_all=False)¶
Install op hook for block recursively.
- Parameters:
callback (function) – Function called to inspect the values of the intermediate outputs of blocks after hybridization. It takes 3 parameters: the name of the tensor being inspected (str), the name of the operator producing or consuming that tensor (str), and the tensor being inspected (NDArray).
monitor_all (bool, default False) – If True, monitor both input and output, otherwise monitor output only.
- reset_ctx(ctx)¶
This function has been deprecated. Please refer to HybridBlock.reset_device.
- reset_device(device)¶
Re-assign all Parameters to other devices. If the Block is hybridized, it will reset the _cached_op_args.
- Parameters:
device (Device or list of Device, default device.current_device()) – Assign Parameter to given device. If device is a list of Device, a copy will be made for each device.
- save(prefix)¶
Save the model architecture and parameters to load again later
Saves the model architecture as a nested dictionary where each Block in the model is a dictionary and its children are sub-dictionaries.
Each Block is uniquely identified by Block class name and a unique ID. We save each Block’s parameter UUID to restore later in order to match the saved parameters.
Recursively traverses a Block’s children in order (since it’s an OrderedDict) and uses the unique ID to denote that specific Block.
Assumes that the model is created in an identical order every time. If the model cannot be recreated deterministically, do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph is saved (Symbol & inputs) if it has already been hybridized.
- Parameters:
prefix (str) – The prefix to use in filenames for saving this model: <prefix>-model.json and <prefix>-model.params
- save_parameters(filename, deduplicate=False)¶
Save parameters to file.
Saved parameters can only be loaded with load_parameters. Note that this method only saves parameters, not model structure. If you want to save model structures, please use HybridBlock.export().
- setattr(name, value)¶
Set an attribute to a new value for all Parameters.
For example, set grad_req to null if you don’t need gradient w.r.t a model’s Parameters:
model.setattr('grad_req', 'null')
or change the learning rate multiplier:
model.setattr('lr_mult', 0.5)
- Parameters:
name (str) – Name of the attribute.
value (valid type for attribute name) – The new value for the attribute.
- share_parameters(shared)¶
Share parameters recursively inside the model.
For example, if you want dense1 to share dense0’s weights, you can do:
dense0 = nn.Dense(20)
dense1 = nn.Dense(20)
dense1.share_parameters(dense0.collect_params())
which is equivalent to:
dense1.weight = dense0.weight
dense1.bias = dense0.bias
Note that unlike the load_parameters or load_dict functions, share_parameters results in the Parameter object being shared (or tied) between the models, whereas load_parameters or load_dict only set the value of the data dictionary of a model. If you call load_parameters or load_dict after share_parameters, the loaded value will be reflected in all networks that use the shared (or tied) Parameter object.
- Parameters:
shared (Dict) – Dict of the shared parameters.
- Return type:
this block
- summary(*inputs)¶
Print the summary of the model’s output and parameters.
The network must have been initialized, and must not have been hybridized.
- Parameters:
inputs (object) – Any input that the model supports. For any tensor in the input, only mxnet.ndarray.NDArray is supported.
- zero_grad()¶
Sets all Parameters’ gradient buffer to 0.
- class mxnet.gluon.nn.basic_layers.HybridConcatenate(axis=-1)[source]¶
Bases: HybridSequential
Lays HybridBlocks concurrently.
This block feeds its input to all child blocks and produces the output by concatenating the outputs of all child blocks along the specified axis.
Example:
net = HybridConcatenate()
net.add(nn.Dense(10, activation='relu'))
net.add(nn.Dense(20))
net.add(Identity())
- Parameters:
axis (int, default -1) – The axis on which to concatenate the outputs.
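The data flow of this block (same input to every child, outputs joined along axis) can be sketched with plain NumPy functions standing in for the child blocks. The helpers dense, identity and concatenate_children are hypothetical names for this illustration and do not use MXNet; the input width of 5 is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(0)


def dense(units, in_units):
    """Hypothetical stand-in for a dense layer: a fixed random projection."""
    weight = rng.standard_normal((in_units, units))
    return lambda x: x @ weight


def identity(x):
    return x


def concatenate_children(children, x, axis=-1):
    # Every child receives the same input; outputs are joined along `axis`.
    return np.concatenate([child(x) for child in children], axis=axis)


x = np.ones((4, 5))
children = [dense(10, 5), dense(20, 5), identity]
out = concatenate_children(children, x)
print(out.shape)
```

With three children producing widths 10, 20 and 5, the concatenated output has width 35 on the last axis, and the final 5 columns are the unchanged input from the identity child.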
- add(*blocks)¶
Adds block on top of the stack.
- apply(fn)¶
Applies fn recursively to every child block as well as self.
- Parameters:
fn (callable) – Function to be applied to each submodule, of form fn(block).
- Return type:
this block
- cast(dtype)¶
Cast this Block to use another data type.
- Parameters:
dtype (str or numpy.dtype) – The new data type.
- collect_params(select=None)¶
Returns a Dict containing this Block’s and all of its children’s Parameters (default). It can also return the selected Dict of Parameters whose names match the given regular expressions.
For example, collect the specified parameters in [‘conv1.weight’, ‘conv1.bias’, ‘fc.weight’, ‘fc.bias’]:
model.collect_params('conv1.weight|conv1.bias|fc.weight|fc.bias')
or collect all parameters whose names end with ‘weight’ or ‘bias’, using regular expressions:
model.collect_params('.*weight|.*bias')
- Parameters:
select (str) – regular expressions
- Return type:
The selected Dict
- export(path, epoch=0, remove_amp_cast=True)¶
Export HybridBlock to json format that can be loaded by gluon.SymbolBlock.imports or the C++ interface.
Note
When there is only one input, it will be named data. When there is more than one input, they will be named data0, data1, etc.
- Parameters:
path (str or None) – Path to save model. Two files, path-symbol.json and path-xxxx.params, will be created, where xxxx is the 4-digit epoch number. If None, do not export to file but return the Python Symbol object and the corresponding dictionary of parameters.
epoch (int) – Epoch number of saved model.
remove_amp_cast (bool, optional) – Whether to remove the amp_cast and amp_multicast operators, before saving the model.
- Returns:
symbol_filename (str) – Filename to which model symbols were saved, including path prefix.
params_filename (str) – Filename to which model parameters were saved, including path prefix.
- forward(x)[source]¶
Overrides the forward computation. Arguments must be mxnet.numpy.ndarray.
- hybridize(active=True, partition_if_dynamic=True, static_alloc=False, static_shape=False, inline_limit=2, forward_bulk_size=None, backward_bulk_size=None)¶
Activates or deactivates HybridBlocks recursively. Has no effect on non-hybrid children.
- Parameters:
active (bool, default True) – Whether to turn hybrid on or off.
partition_if_dynamic (bool, default True) – Whether to partition the graph when a dynamic-shape operator exists.
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
inline_limit (optional int, default 2) – Maximum number of operators that can be inlined.
forward_bulk_size (optional int, default None) – Segment size of bulk execution during forward pass.
backward_bulk_size (optional int, default None) – Segment size of bulk execution during backward pass.
- infer_shape(*args)¶
Infers shape of Parameters from inputs.
- infer_type(*args)¶
Infers data type of Parameters from inputs.
- initialize(init=<mxnet.initializer.Uniform object>, device=None, verbose=False, force_reinit=False)¶
Initializes Parameters of this Block and its children.
- Parameters:
init (Initializer) – Global default Initializer to be used when Parameter.init() is None. Otherwise, Parameter.init() takes precedence.
device (Device or list of Device) – Keeps a copy of Parameters on one or many device(s).
verbose (bool, default False) – Whether to verbosely print out details on initialization.
force_reinit (bool, default False) – Whether to force re-initialization if parameter is already initialized.
- load(prefix)¶
Load a model saved using the save API
Reconfigures a model using the saved configuration. This function does not regenerate the model architecture. It resets each Block’s parameter UUIDs as they were when saved in order to match the names of the saved parameters.
This function assumes the Blocks in the model were created in the same order they were when the model was saved. This is because each Block is uniquely identified by Block class name and a unique ID in order (since it’s an OrderedDict) and uses the unique ID to denote that specific Block.
Assumes that the model is created in an identical order every time. If the model cannot be recreated deterministically, do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph (Symbol & inputs) and settings are restored if it had been hybridized before saving.
- Parameters:
prefix (str) – The prefix to use in filenames for loading this model: <prefix>-model.json and <prefix>-model.params
- load_dict(param_dict, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')¶
Load parameters from dict
- Parameters:
param_dict (dict) – Dictionary containing model parameters
device (Device, optional) – Device context on which the memory is allocated. Default is mxnet.device.current_device().
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in the file.
ignore_extra (bool, default False) – Whether to silently ignore parameters from the file that are not present in this dict.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter if any
dtype_source (str, default 'current') – Must be one of {‘current’, ‘saved’}. Only valid if cast_dtype=True; specifies the source of the dtype for casting the parameters.
- load_parameters(filename, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')¶
Load parameters from file previously saved by save_parameters.
- Parameters:
filename (str) – Path to parameter file.
device (Device or list of Device, default cpu()) – Device(s) to initialize loaded parameters on.
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in the file.
ignore_extra (bool, default False) – Whether to silently ignore parameters from the file that are not present in this Block.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter if any.
dtype_source (str, default 'current') – Must be one of {‘current’, ‘saved’}. Only valid if cast_dtype=True; specifies the source of the dtype for casting the parameters.
- optimize_for(x, *args, backend=None, clear=False, partition_if_dynamic=True, static_alloc=False, static_shape=False, inline_limit=2, forward_bulk_size=None, backward_bulk_size=None, **kwargs)¶
Partitions the current HybridBlock and optimizes it for a given backend without executing a forward pass. Modifies the HybridBlock in-place.
Immediately partitions a HybridBlock using the specified backend. Combines the work done in the hybridize API with part of the work done in the forward pass without calling the CachedOp. It can be used in place of hybridize; afterwards, export can be called or inference can be run. See example/extensions/lib_subgraph/README.md for more details.
Examples
# partition and then export to file
block.optimize_for(x, backend='myPart')
block.export('partitioned')

# partition and then run inference
block.optimize_for(x, backend='myPart')
block(x)
- Parameters:
x (NDArray) – first input to model
*args (NDArray) – other inputs to model
backend (str) – The name of backend, as registered in SubgraphBackendRegistry, default None
backend_opts (dict of user-specified options to pass to the backend for partitioning, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
clear (bool, default False) – Clears any previous optimizations.
partition_if_dynamic (bool, default True) – Whether to partition the graph when a dynamic-shape operator exists.
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
inline_limit (optional int, default 2) – Maximum number of operators that can be inlined.
forward_bulk_size (optional int, default None) – Segment size of bulk execution during forward pass.
backward_bulk_size (optional int, default None) – Segment size of bulk execution during backward pass.
**kwargs (The backend options, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
- property params¶
Returns this Block’s parameter dictionary (does not include its children’s parameters).
- register_child(block, name=None)¶
Registers block as a child of self.
Blocks assigned to self as attributes will be registered automatically.
- register_forward_hook(hook)¶
Registers a forward hook on the block.
The hook function is called immediately after forward(). It should not modify the input or output.
- Parameters:
hook (callable) – The forward hook function of form hook(block, input, output) -> None.
- Return type:
mxnet.gluon.utils.HookHandle
- register_forward_pre_hook(hook)¶
Registers a forward pre-hook on the block.
The hook function is called immediately before forward(). It should not modify the input or output.
- Parameters:
hook (callable) – The forward pre-hook function of form hook(block, input) -> None.
- Return type:
mxnet.gluon.utils.HookHandle
- register_op_hook(callback, monitor_all=False)¶
Install op hook for block recursively.
- Parameters:
callback (function) – Function called to inspect the values of the intermediate outputs of blocks after hybridization. It takes 3 parameters: the name of the tensor being inspected (str), the name of the operator producing or consuming that tensor (str), and the tensor being inspected (NDArray).
monitor_all (bool, default False) – If True, monitor both input and output, otherwise monitor output only.
- reset_ctx(ctx)¶
This function has been deprecated. Please refer to HybridBlock.reset_device.
- reset_device(device)¶
Re-assign all Parameters to other devices. If the Block is hybridized, it will reset the _cached_op_args.
- Parameters:
device (Device or list of Device, default device.current_device()) – Assign Parameter to given device. If device is a list of Device, a copy will be made for each device.
- save(prefix)¶
Save the model architecture and parameters to load again later
Saves the model architecture as a nested dictionary where each Block in the model is a dictionary and its children are sub-dictionaries.
Each Block is uniquely identified by Block class name and a unique ID. We save each Block’s parameter UUID to restore later in order to match the saved parameters.
Recursively traverses a Block’s children in order (since it’s an OrderedDict) and uses the unique ID to denote that specific Block.
Assumes that the model is created in an identical order every time. If the model cannot be recreated deterministically, do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph is saved (Symbol & inputs) if it has already been hybridized.
- Parameters:
prefix (str) – The prefix to use in filenames for saving this model: <prefix>-model.json and <prefix>-model.params
- save_parameters(filename, deduplicate=False)¶
Save parameters to file.
Saved parameters can only be loaded with load_parameters. Note that this method only saves parameters, not model structure. If you want to save model structures, please use HybridBlock.export().
- setattr(name, value)¶
Set an attribute to a new value for all Parameters.
For example, set grad_req to null if you don’t need gradient w.r.t a model’s Parameters:
model.setattr('grad_req', 'null')
or change the learning rate multiplier:
model.setattr('lr_mult', 0.5)
- Parameters:
name (str) – Name of the attribute.
value (valid type for attribute name) – The new value for the attribute.
- share_parameters(shared)¶
Share parameters recursively inside the model.
For example, if you want dense1 to share dense0’s weights, you can do:
dense0 = nn.Dense(20)
dense1 = nn.Dense(20)
dense1.share_parameters(dense0.collect_params())
which is equivalent to:
dense1.weight = dense0.weight
dense1.bias = dense0.bias
Note that unlike the load_parameters or load_dict functions, share_parameters results in the Parameter object being shared (or tied) between the models, whereas load_parameters or load_dict only set the value of the data dictionary of a model. If you call load_parameters or load_dict after share_parameters, the loaded value will be reflected in all networks that use the shared (or tied) Parameter object.
- Parameters:
shared (Dict) – Dict of the shared parameters.
- Return type:
this block
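The difference between tying a Parameter and copying its value can be illustrated without MXNet. The following is a minimal plain-Python sketch; ToyParam and ToyLayer are hypothetical stand-ins, not Gluon classes, and they only mimic the behaviour described in the note above.

```python
class ToyParam:
    """Hypothetical stand-in for a Parameter: a box holding a value."""
    def __init__(self, value):
        self.value = value


class ToyLayer:
    """Hypothetical stand-in for a Block with a single 'weight' parameter."""
    def __init__(self, value):
        self.params = {"weight": ToyParam(value)}

    def share_parameters(self, shared):
        # Tie: reuse the very same ToyParam objects.
        for name, param in shared.items():
            if name in self.params:
                self.params[name] = param
        return self

    def load_dict(self, values):
        # Copy: only overwrite the stored value of each existing ToyParam.
        for name, value in values.items():
            self.params[name].value = value


dense0 = ToyLayer(1.0)
dense1 = ToyLayer(2.0).share_parameters(dense0.params)
tied = dense1.params["weight"] is dense0.params["weight"]

# Loading after sharing is visible through every layer using the tied object.
dense1.load_dict({"weight": 5.0})
seen_by_dense0 = dense0.params["weight"].value

# A layer that only loaded values keeps its own, independent object.
dense2 = ToyLayer(0.0)
dense2.load_dict({"weight": dense0.params["weight"].value})
independent = dense2.params["weight"] is not dense0.params["weight"]
```

In this sketch, dense0 and dense1 hold one object, so a later load through either layer changes both, while dense2 merely received a copy of the value.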
- summary(*inputs)¶
Print the summary of the model’s output and parameters.
The network must have been initialized, and must not have been hybridized.
- Parameters:
inputs (object) – Any input that the model supports. For any tensor in the input, only mxnet.ndarray.NDArray is supported.
- zero_grad()¶
Sets all Parameters’ gradient buffer to 0.
- class mxnet.gluon.nn.basic_layers.HybridLambda(function)[source]¶
Bases: HybridBlock
Wraps an operator or an expression as a HybridBlock object.
- Parameters:
function (str or function) –
Function used in lambda must be one of the following:
1) The name of an operator that is available in both symbol and ndarray. For example:
block = HybridLambda('tanh')
2) A function that conforms to def function(F, data, *args). For example:
block = HybridLambda(lambda F, x: F.LeakyReLU(x, slope=0.1))
Inputs:
- *args: one or more input data. First argument must be symbol or ndarray. Their shapes depend on the function.
Output:
- *outputs: one or more output data. Their shapes depend on the function.
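The two accepted forms of function can be sketched in plain Python. This is an illustration only: ToyLambda is a hypothetical class, and the standard math module stands in for the F namespace, so nothing here is the actual HybridLambda implementation.

```python
import math


class ToyLambda:
    """Hypothetical stand-in for the wrapping logic; `math` plays the role of F."""
    def __init__(self, function):
        if isinstance(function, str):
            # Form 1: an operator name, looked up on the namespace at call time.
            self._func = lambda F, *args: getattr(F, function)(*args)
        elif callable(function):
            # Form 2: a function of the form function(F, data, *args).
            self._func = function
        else:
            raise ValueError("function must be a str or a callable")

    def __call__(self, x, *args):
        return self._func(math, x, *args)


by_name = ToyLambda("tanh")
by_function = ToyLambda(lambda F, x: F.tanh(x))
same = by_name(0.5) == by_function(0.5)
```

Both wrappers end up calling the same operator, which is why the string form is a convenient shorthand when no extra arguments are needed.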
- apply(fn)¶
Applies fn recursively to every child block as well as self.
- Parameters:
fn (callable) – Function to be applied to each submodule, of form fn(block).
- Return type:
this block
- cast(dtype)¶
Cast this Block to use another data type.
- Parameters:
dtype (str or numpy.dtype) – The new data type.
- collect_params(select=None)¶
Returns a Dict containing this Block’s and all of its children’s Parameters (default). It can also return a selected Dict containing only the Parameters that match the given regular expressions.
For example, collect the specified parameters in [‘conv1.weight’, ‘conv1.bias’, ‘fc.weight’, ‘fc.bias’]:
model.collect_params('conv1.weight|conv1.bias|fc.weight|fc.bias')
or collect all parameters whose names end with ‘weight’ or ‘bias’, using a regular expression:
model.collect_params('.*weight|.*bias')
- Parameters:
select (str) – regular expressions
- Return type:
The selected Dict
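To see which names the two patterns above would keep, the selection can be reproduced with the standard re module. This is a sketch under an assumption: select_params is a hypothetical helper, and it assumes the pattern is matched from the start of each parameter name, which may differ from the library's exact matching rule.

```python
import re


def select_params(names, select=None):
    """Return the names kept by a selection pattern (all names if select is None)."""
    if select is None:
        return list(names)
    pattern = re.compile(select)
    # Assumption: the pattern is matched from the start of each name.
    return [name for name in names if pattern.match(name)]


names = ["conv1.weight", "conv1.bias", "bn1.gamma", "fc.weight", "fc.bias"]
explicit = select_params(names, "conv1.weight|conv1.bias|fc.weight|fc.bias")
by_suffix = select_params(names, ".*weight|.*bias")
```

Note that an unescaped dot matches any character, so a stricter pattern would write the separator as an escaped dot.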
- export(path, epoch=0, remove_amp_cast=True)¶
Export HybridBlock to json format that can be loaded by gluon.SymbolBlock.imports or the C++ interface.
Note
When there is only one input, it will be named data. When there is more than one input, they will be named data0, data1, etc.
- Parameters:
path (str or None) – Path to save model. Two files path-symbol.json and path-xxxx.params will be created, where xxxx is the 4-digit epoch number. If None, do not export to file but return Python Symbol object and corresponding dictionary of parameters.
epoch (int) – Epoch number of saved model.
remove_amp_cast (bool, optional) – Whether to remove the amp_cast and amp_multicast operators, before saving the model.
- Returns:
symbol_filename (str) – Filename to which model symbols were saved, including path prefix.
params_filename (str) – Filename to which model parameters were saved, including path prefix.
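The naming scheme for the two exported files can be sketched as follows. export_filenames is a hypothetical helper written for illustration; it only reproduces the path-symbol.json and path-xxxx.params pattern described above and does not call MXNet.

```python
def export_filenames(path, epoch=0):
    """Hypothetical helper: build the two file names described for export()."""
    symbol_filename = f"{path}-symbol.json"
    # The epoch number is zero-padded to 4 digits.
    params_filename = f"{path}-{epoch:04d}.params"
    return symbol_filename, params_filename


symbol_file, params_file = export_filenames("model", epoch=7)
```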
- forward(x, *args)[source]¶
Overrides the forward computation. Arguments must be mxnet.numpy.ndarray.
- hybridize(active=True, partition_if_dynamic=True, static_alloc=False, static_shape=False, inline_limit=2, forward_bulk_size=None, backward_bulk_size=None)¶
Activates or deactivates HybridBlocks recursively. Has no effect on non-hybrid children.
- Parameters:
active (bool, default True) – Whether to turn hybrid on or off.
partition_if_dynamic (bool, default True) – Whether to partition the graph when a dynamic shape op exists.
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
inline_limit (optional int, default 2) – Maximum number of operators that can be inlined.
forward_bulk_size (optional int, default None) – Segment size of bulk execution during forward pass.
backward_bulk_size (optional int, default None) – Segment size of bulk execution during backward pass.
- infer_shape(*args)¶
Infers shape of Parameters from inputs.
- infer_type(*args)¶
Infers data type of Parameters from inputs.
- initialize(init=<mxnet.initializer.Uniform object>, device=None, verbose=False, force_reinit=False)¶
Initializes Parameters of this Block and its children.
- Parameters:
init (Initializer) – Global default Initializer to be used when Parameter.init() is None. Otherwise, Parameter.init() takes precedence.
device (Device or list of Device) – Keeps a copy of Parameters on one or many device(s).
verbose (bool, default False) – Whether to verbosely print out details on initialization.
force_reinit (bool, default False) – Whether to force re-initialization if parameter is already initialized.
- load(prefix)¶
Load a model saved using the save API
Reconfigures a model using the saved configuration. This function does not regenerate the model architecture. It resets each Block’s parameter UUIDs as they were when saved in order to match the names of the saved parameters.
This function assumes the Blocks in the model were created in the same order they were when the model was saved. This is because each Block is uniquely identified by Block class name and a unique ID, and the children are traversed in order (since it’s an OrderedDict), using the unique ID to denote each specific Block.
Assumes that the model is created in an identical order every time. If the model cannot be recreated deterministically, do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph (Symbol & inputs) and settings are restored if it had been hybridized before saving.
- Parameters:
prefix (str) – The prefix to use in filenames for loading this model: <prefix>-model.json and <prefix>-model.params
- load_dict(param_dict, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')¶
Load parameters from dict
- Parameters:
param_dict (dict) – Dictionary containing model parameters
device (Device, optional) – Device context on which the memory is allocated. Default is mxnet.device.current_device().
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in the file.
ignore_extra (bool, default False) – Whether to silently ignore parameters from the file that are not present in this dict.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter if any.
dtype_source (str, default 'current') – Must be in {‘current’, ‘saved’}. Only valid if cast_dtype=True. Specifies the source of the dtype for casting the parameters.
- load_parameters(filename, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')¶
Load parameters from file previously saved by save_parameters.
- Parameters:
filename (str) – Path to parameter file.
device (Device or list of Device, default cpu()) – Device(s) to initialize loaded parameters on.
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in the file.
ignore_extra (bool, default False) – Whether to silently ignore parameters from the file that are not present in this Block.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter if any.
dtype_source (str, default 'current') – Must be in {‘current’, ‘saved’}. Only valid if cast_dtype=True. Specifies the source of the dtype for casting the parameters.
References
- optimize_for(x, *args, backend=None, clear=False, partition_if_dynamic=True, static_alloc=False, static_shape=False, inline_limit=2, forward_bulk_size=None, backward_bulk_size=None, **kwargs)¶
Partitions the current HybridBlock and optimizes it for a given backend without executing a forward pass. Modifies the HybridBlock in-place.
Immediately partitions a HybridBlock using the specified backend. Combines the work done in the hybridize API with part of the work done in the forward pass without calling the CachedOp. Can be used in place of hybridize; afterwards, export can be called or inference can be run. See README.md in example/extensions/lib_subgraph/README.md for more details.
Examples
# partition and then export to file
block.optimize_for(x, backend='myPart')
block.export('partitioned')
# partition and then run inference
block.optimize_for(x, backend='myPart')
block(x)
- Parameters:
x (NDArray) – first input to model
*args (NDArray) – other inputs to model
backend (str) – The name of backend, as registered in SubgraphBackendRegistry, default None
backend_opts (dict of user-specified options to pass to the backend for partitioning, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
clear (bool, default False) – clears any previous optimizations
partition_if_dynamic (bool, default True) – Whether to partition the graph when a dynamic shape op exists.
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
inline_limit (optional int, default 2) – Maximum number of operators that can be inlined.
forward_bulk_size (optional int, default None) – Segment size of bulk execution during forward pass.
backward_bulk_size (optional int, default None) – Segment size of bulk execution during backward pass.
**kwargs (The backend options, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
- property params¶
Returns this Block’s parameter dictionary (does not include its children’s parameters).
- register_child(block, name=None)¶
Registers block as a child of self.
Blocks assigned to self as attributes will be registered automatically.
- register_forward_hook(hook)¶
Registers a forward hook on the block.
The hook function is called immediately after forward(). It should not modify the input or output.
- Parameters:
hook (callable) – The forward hook function of form hook(block, input, output) -> None.
- Return type:
mxnet.gluon.utils.HookHandle
- register_forward_pre_hook(hook)¶
Registers a forward pre-hook on the block.
The hook function is called immediately before forward(). It should not modify the input or output.
- Parameters:
hook (callable) – The forward pre-hook function of form hook(block, input) -> None.
- Return type:
mxnet.gluon.utils.HookHandle
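The calling order of the two hook kinds can be sketched with a toy class. ToyBlock is hypothetical and written only to show the hook(block, input) and hook(block, input, output) forms; unlike the real methods it returns no HookHandle, and passing the inputs as a tuple is an assumption of this sketch.

```python
class ToyBlock:
    """Hypothetical stand-in for a Block that supports forward hooks."""
    def __init__(self):
        self._pre_hooks = []
        self._hooks = []

    def register_forward_pre_hook(self, hook):
        self._pre_hooks.append(hook)

    def register_forward_hook(self, hook):
        self._hooks.append(hook)

    def forward(self, x):
        return x * 2

    def __call__(self, x):
        for hook in self._pre_hooks:
            hook(self, (x,))        # runs immediately before forward()
        out = self.forward(x)
        for hook in self._hooks:
            hook(self, (x,), out)   # runs immediately after forward()
        return out


events = []
block = ToyBlock()
block.register_forward_pre_hook(lambda blk, inputs: events.append(("pre", inputs)))
block.register_forward_hook(lambda blk, inputs, output: events.append(("post", output)))
result = block(3)
```

The hooks here only record what they see, in line with the guidance that a hook should not modify the input or output.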
- register_op_hook(callback, monitor_all=False)¶
Install op hook for block recursively.
- Parameters:
callback (function) – Function called to inspect the values of the intermediate outputs of blocks after hybridization. It takes 3 parameters: the name of the tensor being inspected (str), the name of the operator producing or consuming that tensor (str), and the tensor being inspected (NDArray).
monitor_all (bool, default False) – If True, monitor both input and output, otherwise monitor output only.
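A callback of the three-parameter form described above might look like the following sketch. make_recorder is a hypothetical helper, and the tensor and operator names in the stand-in call are invented for illustration; in real use the framework, not user code, would invoke the callback.

```python
def make_recorder():
    """Build a callback with the three-argument form described above."""
    records = []

    def callback(tensor_name, op_name, tensor):
        # Keep only lightweight information about each inspected tensor.
        records.append((tensor_name, op_name, type(tensor).__name__))

    return callback, records


callback, records = make_recorder()
# Stand-in call with made-up names and a plain list instead of an NDArray.
callback("dense0_output", "FullyConnected", [0.1, 0.2])
```

Such a callback would then be passed as the first argument of register_op_hook, with monitor_all=True if inputs should be inspected as well.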
- reset_ctx(ctx)¶
This function has been deprecated. Please refer to HybridBlock.reset_device.
- reset_device(device)¶
Re-assign all Parameters to other devices. If the Block is hybridized, it will reset the _cached_op_args.
- Parameters:
device (Device or list of Device, default device.current_device()) – Assign Parameter to given device. If device is a list of Device, a copy will be made for each device.
- save(prefix)¶
Save the model architecture and parameters to load again later
Saves the model architecture as a nested dictionary where each Block in the model is a dictionary and its children are sub-dictionaries.
Each Block is uniquely identified by Block class name and a unique ID. We save each Block’s parameter UUID to restore later in order to match the saved parameters.
Recursively traverses a Block’s children in order (since it’s an OrderedDict) and uses the unique ID to denote that specific Block.
Assumes that the model is created in an identical order every time. If the model cannot be recreated deterministically, do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph is saved (Symbol & inputs) if it has already been hybridized.
- Parameters:
prefix (str) – The prefix to use in filenames for saving this model: <prefix>-model.json and <prefix>-model.params
- save_parameters(filename, deduplicate=False)¶
Save parameters to file.
Saved parameters can only be loaded with load_parameters. Note that this method only saves parameters, not model structure. If you want to save model structures, please use HybridBlock.export().
- Parameters:
References
- setattr(name, value)¶
Set an attribute to a new value for all Parameters.
For example, set grad_req to null if you don’t need gradient w.r.t a model’s Parameters:
model.setattr('grad_req', 'null')
or change the learning rate multiplier:
model.setattr('lr_mult', 0.5)
- Parameters:
name (str) – Name of the attribute.
value (valid type for attribute name) – The new value for the attribute.
- share_parameters(shared)¶
Share parameters recursively inside the model.
For example, if you want dense1 to share dense0’s weights, you can do:
dense0 = nn.Dense(20)
dense1 = nn.Dense(20)
dense1.share_parameters(dense0.collect_params())
which is equivalent to:
dense1.weight = dense0.weight
dense1.bias = dense0.bias
Note that unlike the load_parameters or load_dict functions, share_parameters results in the Parameter object being shared (or tied) between the models, whereas load_parameters or load_dict only set the value of the data dictionary of a model. If you call load_parameters or load_dict after share_parameters, the loaded value will be reflected in all networks that use the shared (or tied) Parameter object.
- Parameters:
shared (Dict) – Dict of the shared parameters.
- Return type:
this block
- summary(*inputs)¶
Print the summary of the model’s output and parameters.
The network must have been initialized, and must not have been hybridized.
- Parameters:
inputs (object) – Any input that the model supports. For any tensor in the input, only mxnet.ndarray.NDArray is supported.
- zero_grad()¶
Sets all Parameters’ gradient buffer to 0.
- class mxnet.gluon.nn.basic_layers.HybridSequential[source]¶
Bases: HybridBlock
Stacks HybridBlocks sequentially.
Example:
net = nn.HybridSequential()
net.add(nn.Dense(10, activation='relu'))
net.add(nn.Dense(20))
net.hybridize()
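What stacking sequentially means can be shown with a framework-free sketch. ToySequential is a hypothetical class that chains plain callables; it does not hybridize anything and is not the Gluon implementation.

```python
class ToySequential:
    """Hypothetical stand-in: applies added callables one after another."""
    def __init__(self):
        self._children = []

    def add(self, *blocks):
        self._children.extend(blocks)

    def __call__(self, x):
        for block in self._children:
            x = block(x)   # the output of one child is the input of the next
        return x


net = ToySequential()
net.add(lambda x: x + 1)
net.add(lambda x: x * 10)
out = net(2)
```

The order of the add calls determines the order of computation, so swapping the two children above would give a different result.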
- apply(fn)¶
Applies fn recursively to every child block as well as self.
- Parameters:
fn (callable) – Function to be applied to each submodule, of form fn(block).
- Return type:
this block
- cast(dtype)¶
Cast this Block to use another data type.
- Parameters:
dtype (str or numpy.dtype) – The new data type.
- collect_params(select=None)¶
Returns a Dict containing this Block’s and all of its children’s Parameters (default). It can also return a selected Dict containing only the Parameters that match the given regular expressions.
For example, collect the specified parameters in [‘conv1.weight’, ‘conv1.bias’, ‘fc.weight’, ‘fc.bias’]:
model.collect_params('conv1.weight|conv1.bias|fc.weight|fc.bias')
or collect all parameters whose names end with ‘weight’ or ‘bias’, using a regular expression:
model.collect_params('.*weight|.*bias')
- Parameters:
select (str) – regular expressions
- Return type:
The selected Dict
- export(path, epoch=0, remove_amp_cast=True)¶
Export HybridBlock to json format that can be loaded by gluon.SymbolBlock.imports or the C++ interface.
Note
When there is only one input, it will be named data. When there is more than one input, they will be named data0, data1, etc.
- Parameters:
path (str or None) – Path to save model. Two files path-symbol.json and path-xxxx.params will be created, where xxxx is the 4-digit epoch number. If None, do not export to file but return Python Symbol object and corresponding dictionary of parameters.
epoch (int) – Epoch number of saved model.
remove_amp_cast (bool, optional) – Whether to remove the amp_cast and amp_multicast operators, before saving the model.
- Returns:
symbol_filename (str) – Filename to which model symbols were saved, including path prefix.
params_filename (str) – Filename to which model parameters were saved, including path prefix.
- forward(x, *args)[source]¶
Overrides the forward computation. Arguments must be mxnet.numpy.ndarray.
- hybridize(active=True, partition_if_dynamic=True, static_alloc=False, static_shape=False, inline_limit=2, forward_bulk_size=None, backward_bulk_size=None)¶
Activates or deactivates HybridBlocks recursively. Has no effect on non-hybrid children.
- Parameters:
active (bool, default True) – Whether to turn hybrid on or off.
partition_if_dynamic (bool, default True) – Whether to partition the graph when a dynamic shape op exists.
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
inline_limit (optional int, default 2) – Maximum number of operators that can be inlined.
forward_bulk_size (optional int, default None) – Segment size of bulk execution during forward pass.
backward_bulk_size (optional int, default None) – Segment size of bulk execution during backward pass.
- infer_shape(*args)¶
Infers shape of Parameters from inputs.
- infer_type(*args)¶
Infers data type of Parameters from inputs.
- initialize(init=<mxnet.initializer.Uniform object>, device=None, verbose=False, force_reinit=False)¶
Initializes Parameters of this Block and its children.
- Parameters:
init (Initializer) – Global default Initializer to be used when Parameter.init() is None. Otherwise, Parameter.init() takes precedence.
device (Device or list of Device) – Keeps a copy of Parameters on one or many device(s).
verbose (bool, default False) – Whether to verbosely print out details on initialization.
force_reinit (bool, default False) – Whether to force re-initialization if parameter is already initialized.
- load(prefix)¶
Load a model saved using the save API
Reconfigures a model using the saved configuration. This function does not regenerate the model architecture. It resets each Block’s parameter UUIDs as they were when saved in order to match the names of the saved parameters.
This function assumes the Blocks in the model were created in the same order they were when the model was saved. This is because each Block is uniquely identified by Block class name and a unique ID, and the children are traversed in order (since it’s an OrderedDict), using the unique ID to denote each specific Block.
Assumes that the model is created in an identical order every time. If the model cannot be recreated deterministically, do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph (Symbol & inputs) and settings are restored if it had been hybridized before saving.
- Parameters:
prefix (str) – The prefix to use in filenames for loading this model: <prefix>-model.json and <prefix>-model.params
- load_dict(param_dict, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')¶
Load parameters from dict
- Parameters:
param_dict (dict) – Dictionary containing model parameters
device (Device, optional) – Device context on which the memory is allocated. Default is mxnet.device.current_device().
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in the file.
ignore_extra (bool, default False) – Whether to silently ignore parameters from the file that are not present in this dict.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter if any.
dtype_source (str, default 'current') – Must be in {‘current’, ‘saved’}. Only valid if cast_dtype=True. Specifies the source of the dtype for casting the parameters.
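The effect of allow_missing and ignore_extra can be sketched as a check on parameter names. check_keys is a hypothetical helper, and the use of KeyError is an assumption of this sketch; it only mirrors the two flags as described above and does not load any data.

```python
def check_keys(model_names, loaded_names, allow_missing=False, ignore_extra=False):
    """Hypothetical helper mirroring the allow_missing / ignore_extra flags."""
    model_names, loaded_names = set(model_names), set(loaded_names)
    missing = sorted(model_names - loaded_names)   # in the model, not supplied
    extra = sorted(loaded_names - model_names)     # supplied, not in the model
    if missing and not allow_missing:
        raise KeyError(f"missing parameters: {missing}")
    if extra and not ignore_extra:
        raise KeyError(f"unexpected parameters: {extra}")
    # Only names present on both sides would actually be loaded.
    return sorted(model_names & loaded_names)


loadable = check_keys(
    ["fc.weight", "fc.bias"],
    ["fc.weight", "old.bias"],
    allow_missing=True,
    ignore_extra=True,
)
```

With both flags left at False, the same call would fail on the first mismatch, which is the safer default when the checkpoint is expected to match the model exactly.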
- load_parameters(filename, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')¶
Load parameters from file previously saved by save_parameters.
- Parameters:
filename (str) – Path to parameter file.
device (Device or list of Device, default cpu()) – Device(s) to initialize loaded parameters on.
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in the file.
ignore_extra (bool, default False) – Whether to silently ignore parameters from the file that are not present in this Block.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter if any.
dtype_source (str, default 'current') – Must be in {‘current’, ‘saved’}. Only valid if cast_dtype=True. Specifies the source of the dtype for casting the parameters.
References
- optimize_for(x, *args, backend=None, clear=False, partition_if_dynamic=True, static_alloc=False, static_shape=False, inline_limit=2, forward_bulk_size=None, backward_bulk_size=None, **kwargs)¶
Partitions the current HybridBlock and optimizes it for a given backend without executing a forward pass. Modifies the HybridBlock in-place.
Immediately partitions a HybridBlock using the specified backend. Combines the work done in the hybridize API with part of the work done in the forward pass without calling the CachedOp. Can be used in place of hybridize; afterwards, export can be called or inference can be run. See README.md in example/extensions/lib_subgraph/README.md for more details.
Examples
# partition and then export to file
block.optimize_for(x, backend='myPart')
block.export('partitioned')
# partition and then run inference
block.optimize_for(x, backend='myPart')
block(x)
- Parameters:
x (NDArray) – first input to model
*args (NDArray) – other inputs to model
backend (str) – The name of backend, as registered in SubgraphBackendRegistry, default None
backend_opts (dict of user-specified options to pass to the backend for partitioning, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
clear (bool, default False) – clears any previous optimizations
partition_if_dynamic (bool, default True) – Whether to partition the graph when a dynamic shape op exists.
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
inline_limit (optional int, default 2) – Maximum number of operators that can be inlined.
forward_bulk_size (optional int, default None) – Segment size of bulk execution during forward pass.
backward_bulk_size (optional int, default None) – Segment size of bulk execution during backward pass.
**kwargs (The backend options, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
- property params¶
Returns this Block’s parameter dictionary (does not include its children’s parameters).
- register_child(block, name=None)¶
Registers block as a child of self.
Blocks assigned to self as attributes will be registered automatically.
- register_forward_hook(hook)¶
Registers a forward hook on the block.
The hook function is called immediately after forward(). It should not modify the input or output.
- Parameters:
hook (callable) – The forward hook function of form hook(block, input, output) -> None.
- Return type:
mxnet.gluon.utils.HookHandle
- register_forward_pre_hook(hook)¶
Registers a forward pre-hook on the block.
The hook function is called immediately before forward(). It should not modify the input or output.
- Parameters:
hook (callable) – The forward pre-hook function of form hook(block, input) -> None.
- Return type:
mxnet.gluon.utils.HookHandle
- register_op_hook(callback, monitor_all=False)¶
Install op hook for block recursively.
- Parameters:
callback (function) – Function called to inspect the values of the intermediate outputs of blocks after hybridization. It takes 3 parameters: the name of the tensor being inspected (str), the name of the operator producing or consuming that tensor (str), and the tensor being inspected (NDArray).
monitor_all (bool, default False) – If True, monitor both input and output, otherwise monitor output only.
- reset_ctx(ctx)¶
This function has been deprecated. Please refer to HybridBlock.reset_device.
- reset_device(device)¶
Re-assign all Parameters to other devices. If the Block is hybridized, it will reset the _cached_op_args.
- Parameters:
device (Device or list of Device, default device.current_device()) – Assign Parameter to given device. If device is a list of Device, a copy will be made for each device.
- save(prefix)¶
Save the model architecture and parameters to load again later
Saves the model architecture as a nested dictionary where each Block in the model is a dictionary and its children are sub-dictionaries.
Each Block is uniquely identified by Block class name and a unique ID. We save each Block’s parameter UUID to restore later in order to match the saved parameters.
Recursively traverses a Block’s children in order (since it’s an OrderedDict) and uses the unique ID to denote that specific Block.
Assumes that the model is created in an identical order every time. If the model cannot be recreated deterministically, do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph is saved (Symbol & inputs) if it has already been hybridized.
- Parameters:
prefix (str) – The prefix to use in filenames for saving this model: <prefix>-model.json and <prefix>-model.params
- save_parameters(filename, deduplicate=False)¶
Save parameters to file.
Saved parameters can only be loaded with load_parameters. Note that this method only saves parameters, not model structure. If you want to save model structures, please use HybridBlock.export().
- Parameters:
References
- setattr(name, value)¶
Set an attribute to a new value for all Parameters.
For example, set grad_req to null if you don’t need gradient w.r.t a model’s Parameters:
model.setattr('grad_req', 'null')
or change the learning rate multiplier:
model.setattr('lr_mult', 0.5)
- Parameters:
name (str) – Name of the attribute.
value (valid type for attribute name) – The new value for the attribute.
- share_parameters(shared)¶
Share parameters recursively inside the model.
For example, if you want dense1 to share dense0’s weights, you can do:
dense0 = nn.Dense(20)
dense1 = nn.Dense(20)
dense1.share_parameters(dense0.collect_params())
which is equivalent to:
dense1.weight = dense0.weight
dense1.bias = dense0.bias
Note that unlike the load_parameters or load_dict functions, share_parameters results in the Parameter object being shared (or tied) between the models, whereas load_parameters or load_dict only set the value of the data dictionary of a model. If you call load_parameters or load_dict after share_parameters, the loaded value will be reflected in all networks that use the shared (or tied) Parameter object.
- Parameters:
shared (Dict) – Dict of the shared parameters.
- Return type:
this block
- summary(*inputs)¶
Print the summary of the model’s output and parameters.
The network must have been initialized, and must not have been hybridized.
- Parameters:
inputs (object) – Any input that the model supports. For any tensor in the input, only mxnet.ndarray.NDArray is supported.
- zero_grad()¶
Sets all Parameters’ gradient buffer to 0.
- class mxnet.gluon.nn.basic_layers.Identity[source]¶
Bases: HybridBlock
Block that passes through the input directly.
This block can be used in conjunction with HybridConcatenate block for residual connection.
Example:
net = HybridConcatenate()
net.add(nn.Dense(10, activation='relu'))
net.add(nn.Dense(20))
net.add(Identity())
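The role of a pass-through branch next to transforming branches can be sketched without MXNet. ToyConcatenate and identity are hypothetical stand-ins operating on flat Python lists; the sketch assumes that a concurrent container feeds the same input to every child and concatenates their outputs.

```python
class ToyConcatenate:
    """Hypothetical stand-in: feeds the same input to every child and concatenates."""
    def __init__(self):
        self._children = []

    def add(self, *blocks):
        self._children.extend(blocks)

    def __call__(self, x):
        out = []
        for block in self._children:
            out.extend(block(x))   # concatenate along the only axis of a flat list
        return out


def identity(x):
    # Passes the input through unchanged.
    return x


net = ToyConcatenate()
net.add(lambda x: [2 * v for v in x])
net.add(identity)
out = net([1, 2])
```

The identity branch carries the untouched input alongside the transformed values, which is the skip path that the description above refers to.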
- apply(fn)¶
Applies fn recursively to every child block as well as self.
- Parameters:
fn (callable) – Function to be applied to each submodule, of form fn(block).
- Return type:
this block
- cast(dtype)¶
Cast this Block to use another data type.
- Parameters:
dtype (str or numpy.dtype) – The new data type.
- collect_params(select=None)¶
Returns a Dict containing this Block’s and all of its children’s Parameters (default). It can also return a selected Dict containing only the Parameters that match the given regular expressions.
For example, collect the specified parameters in [‘conv1.weight’, ‘conv1.bias’, ‘fc.weight’, ‘fc.bias’]:
model.collect_params('conv1.weight|conv1.bias|fc.weight|fc.bias')
or collect all parameters whose names end with ‘weight’ or ‘bias’, using a regular expression:
model.collect_params('.*weight|.*bias')
- Parameters:
select (str) – regular expressions
- Return type:
The selected Dict
- export(path, epoch=0, remove_amp_cast=True)¶
Export HybridBlock to json format that can be loaded by gluon.SymbolBlock.imports or the C++ interface.
Note
When there is only one input, it will be named data. When there is more than one input, they will be named data0, data1, etc.
- Parameters:
path (str or None) – Path to save model. Two files path-symbol.json and path-xxxx.params will be created, where xxxx is the 4-digit epoch number. If None, do not export to file but return Python Symbol object and corresponding dictionary of parameters.
epoch (int) – Epoch number of saved model.
remove_amp_cast (bool, optional) – Whether to remove the amp_cast and amp_multicast operators, before saving the model.
- Returns:
symbol_filename (str) – Filename to which model symbols were saved, including path prefix.
params_filename (str) – Filename to which model parameters were saved, including path prefix.
- forward(x)[source]¶
Overrides the forward computation. Arguments must be mxnet.numpy.ndarray.
- hybridize(active=True, partition_if_dynamic=True, static_alloc=False, static_shape=False, inline_limit=2, forward_bulk_size=None, backward_bulk_size=None)¶
Activates or deactivates HybridBlocks recursively. Has no effect on non-hybrid children.
- Parameters:
active (bool, default True) – Whether to turn hybrid on or off.
partition_if_dynamic (bool, default True) – Whether to partition the graph when a dynamic shape op exists.
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
inline_limit (optional int, default 2) – Maximum number of operators that can be inlined.
forward_bulk_size (optional int, default None) – Segment size of bulk execution during forward pass.
backward_bulk_size (optional int, default None) – Segment size of bulk execution during backward pass.
- infer_shape(*args)¶
Infers shape of Parameters from inputs.
- infer_type(*args)¶
Infers data type of Parameters from inputs.
- initialize(init=<mxnet.initializer.Uniform object>, device=None, verbose=False, force_reinit=False)¶
Initializes Parameters of this Block and its children.
- Parameters:
init (Initializer) – Global default Initializer to be used when Parameter.init() is None. Otherwise, Parameter.init() takes precedence.
device (Device or list of Device) – Keeps a copy of Parameters on one or many device(s).
verbose (bool, default False) – Whether to verbosely print out details on initialization.
force_reinit (bool, default False) – Whether to force re-initialization if parameter is already initialized.
- load(prefix)¶
Load a model saved using the save API
Reconfigures a model using the saved configuration. This function does not regenerate the model architecture. It resets each Block’s parameter UUIDs as they were when saved in order to match the names of the saved parameters.
This function assumes the Blocks in the model were created in the same order they were when the model was saved. This is because each Block is uniquely identified by Block class name and a unique ID, and the children are traversed in order (since it’s an OrderedDict), using the unique ID to denote each specific Block.
Assumes that the model is created in an identical order every time. If the model cannot be recreated deterministically, do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph (Symbol & inputs) and settings are restored if it had been hybridized before saving.
- Parameters:
prefix (str) – The prefix to use in filenames for loading this model: <prefix>-model.json and <prefix>-model.params
- load_dict(param_dict, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')¶
Load parameters from dict
- Parameters:
param_dict (dict) – Dictionary containing model parameters
device (Device, optional) – Device context on which the memory is allocated. Default is mxnet.device.current_device().
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in the file.
ignore_extra (bool, default False) – Whether to silently ignore parameters from the file that are not present in this dict.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter if any
dtype_source (str, default 'current') – Must be one of {'current', 'saved'}. Only valid if cast_dtype=True. Specifies the source of the dtype for casting the parameters.
- load_parameters(filename, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')¶
Load parameters from file previously saved by save_parameters.
- Parameters:
filename (str) – Path to parameter file.
device (Device or list of Device, default cpu()) – Device(s) to initialize loaded parameters on.
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in the file.
ignore_extra (bool, default False) – Whether to silently ignore parameters from the file that are not present in this Block.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter if any.
dtype_source (str, default 'current') – Must be one of {'current', 'saved'}. Only valid if cast_dtype=True. Specifies the source of the dtype for casting the parameters.
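The effect of allow_missing and ignore_extra can be sketched in plain Python, without MXNet. This is only an illustration of the name checks described above: the helper `check_loadable` and the use of `KeyError` are assumptions made for this sketch, not the library's implementation or its actual exception types.

```python
def check_loadable(block_names, file_names, allow_missing=False, ignore_extra=False):
    """Return the names that would be loaded, or raise if a check fails.

    Hypothetical helper: it mirrors the documented flags only.
    """
    missing = sorted(set(block_names) - set(file_names))
    extra = sorted(set(file_names) - set(block_names))
    if missing and not allow_missing:
        raise KeyError(f"parameters missing from file: {missing}")
    if extra and not ignore_extra:
        raise KeyError(f"parameters in file but not in Block: {extra}")
    # Only names present on both sides are actually loaded.
    return sorted(set(block_names) & set(file_names))


# The Block has a bias that the file lacks; allow_missing lets the load proceed.
loaded = check_loadable(["weight", "bias"], ["weight"], allow_missing=True)
print(loaded)
```

With both flags left at False, any mismatch between the Block's parameter names and the file's names is treated as an error in this sketch.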
- optimize_for(x, *args, backend=None, clear=False, partition_if_dynamic=True, static_alloc=False, static_shape=False, inline_limit=2, forward_bulk_size=None, backward_bulk_size=None, **kwargs)¶
Partitions the current HybridBlock and optimizes it for a given backend without executing a forward pass. Modifies the HybridBlock in-place.
Immediately partitions a HybridBlock using the specified backend. Combines the work done in the hybridize API with part of the work done in the forward pass without calling the CachedOp. Can be used in place of hybridize, afterwards export can be called or inference can be run. See README.md in example/extensions/lib_subgraph/README.md for more details.
Examples
# partition and then export to file
block.optimize_for(x, backend='myPart')
block.export('partitioned')
# partition and then run inference
block.optimize_for(x, backend='myPart')
block(x)
- Parameters:
x (NDArray) – first input to model
*args (NDArray) – other inputs to model
backend (str) – The name of backend, as registered in SubgraphBackendRegistry, default None
backend_opts (dict of user-specified options to pass to the backend for partitioning, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
clear (bool, default False) – clears any previous optimizations
partition_if_dynamic (bool, default True) – Whether to partition the graph when a dynamic shape op exists.
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
inline_limit (optional int, default 2) – Maximum number of operators that can be inlined.
forward_bulk_size (optional int, default None) – Segment size of bulk execution during forward pass.
backward_bulk_size (optional int, default None) – Segment size of bulk execution during backward pass.
**kwargs (The backend options, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
- property params¶
Returns this Block’s parameter dictionary (does not include its children’s parameters).
- register_child(block, name=None)¶
Registers block as a child of self.
Blocks assigned to self as attributes will be registered automatically.
- register_forward_hook(hook)¶
Registers a forward hook on the block.
The hook function is called immediately after forward(). It should not modify the input or output.
- Parameters:
hook (callable) – The forward hook function of form hook(block, input, output) -> None.
- Return type:
mxnet.gluon.utils.HookHandle
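The register/call/detach life cycle of a forward hook can be sketched in plain Python. The classes `SketchHandle` and `HookedBlock` below are hypothetical stand-ins written for this illustration; they are not MXNet's Block or HookHandle, and only the hook form hook(block, input, output) is taken from the text above.

```python
class SketchHandle:
    """Hypothetical handle that removes one hook from its owner when detached."""

    def __init__(self, hooks, key):
        self._hooks = hooks
        self._key = key

    def detach(self):
        self._hooks.pop(self._key, None)


class HookedBlock:
    """Hypothetical block that doubles its input and notifies forward hooks."""

    def __init__(self):
        self._forward_hooks = {}
        self._next_key = 0

    def register_forward_hook(self, hook):
        key = self._next_key
        self._next_key += 1
        self._forward_hooks[key] = hook
        return SketchHandle(self._forward_hooks, key)

    def forward(self, x):
        return x * 2

    def __call__(self, x):
        out = self.forward(x)
        # Hooks run right after forward() and only observe the values.
        for hook in list(self._forward_hooks.values()):
            hook(self, (x,), out)
        return out


seen = []
block = HookedBlock()
handle = block.register_forward_hook(
    lambda blk, inputs, output: seen.append((inputs, output))
)
block(3)         # hook fires
handle.detach()  # hook removed
block(4)         # hook no longer fires
print(seen)
```

Returning a handle from registration lets the caller remove exactly one hook later without tracking the block's internal bookkeeping.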
- register_forward_pre_hook(hook)¶
Registers a forward pre-hook on the block.
The hook function is called immediately before forward(). It should not modify the input or output.
- Parameters:
hook (callable) – The forward hook function of form hook(block, input) -> None.
- Return type:
mxnet.gluon.utils.HookHandle
- register_op_hook(callback, monitor_all=False)¶
Install op hook for block recursively.
- Parameters:
callback (function) – Function called to inspect the values of the intermediate outputs of blocks after hybridization. It takes 3 parameters: the name of the tensor being inspected (str), the name of the operator producing or consuming that tensor (str), and the tensor being inspected (NDArray).
monitor_all (bool, default False) – If True, monitor both input and output, otherwise monitor output only.
- reset_ctx(ctx)¶
This function has been deprecated. Please refer to HybridBlock.reset_device.
- reset_device(device)¶
Re-assign all Parameters to other devices. If the Block is hybridized, it will reset the _cached_op_args.
- Parameters:
device (Device or list of Device, default device.current_device()) – Assign Parameter to given device. If device is a list of Device, a copy will be made for each device.
- save(prefix)¶
Save the model architecture and parameters to load again later
Saves the model architecture as a nested dictionary where each Block in the model is a dictionary and its children are sub-dictionaries.
Each Block is uniquely identified by Block class name and a unique ID. We save each Block’s parameter UUID to restore later in order to match the saved parameters.
Recursively traverses a Block’s children in order (they are stored in an OrderedDict) and uses the unique ID to denote each specific Block.
Assumes that the model is created in an identical order every time. If the model cannot be recreated deterministically, do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph is saved (Symbol & inputs) if it has already been hybridized.
- Parameters:
prefix (str) – The prefix to use in filenames for saving this model: <prefix>-model.json and <prefix>-model.params
- save_parameters(filename, deduplicate=False)¶
Save parameters to file.
Saved parameters can only be loaded with load_parameters. Note that this method only saves parameters, not model structure. If you want to save model structures, please use HybridBlock.export().
- setattr(name, value)¶
Set an attribute to a new value for all Parameters.
For example, set grad_req to null if you don’t need gradient w.r.t a model’s Parameters:
model.setattr('grad_req', 'null')
or change the learning rate multiplier:
model.setattr('lr_mult', 0.5)
- Parameters:
name (str) – Name of the attribute.
value (valid type for attribute name) – The new value for the attribute.
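The behaviour described for setattr, one call updating an attribute on every Parameter of the model, can be sketched in plain Python. `SketchParam` and `TinyModel` are hypothetical classes made up for this illustration; only the attribute names grad_req and lr_mult come from the text above.

```python
class SketchParam:
    """Hypothetical parameter holding the two attributes used in the text."""

    def __init__(self, name):
        self.name = name
        self.grad_req = "write"
        self.lr_mult = 1.0


class TinyModel:
    """Hypothetical block with its own parameters and optional children."""

    def __init__(self, params, children=()):
        self.params = list(params)
        self.children = list(children)

    def collect_params(self):
        # This block's parameters first, then those of every descendant.
        collected = list(self.params)
        for child in self.children:
            collected.extend(child.collect_params())
        return collected

    def setattr(self, name, value):
        # The built-in setattr is applied to each collected parameter.
        for param in self.collect_params():
            setattr(param, name, value)


head = TinyModel([SketchParam("fc.weight"), SketchParam("fc.bias")])
model = TinyModel([SketchParam("conv.weight")], children=[head])
model.setattr("grad_req", "null")
model.setattr("lr_mult", 0.5)
print([(p.name, p.grad_req, p.lr_mult) for p in model.collect_params()])
```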
- share_parameters(shared)¶
Share parameters recursively inside the model.
For example, if you want dense1 to share dense0’s weights, you can do:
dense0 = nn.Dense(20)
dense1 = nn.Dense(20)
dense1.share_parameters(dense0.collect_params())
which is equivalent to:
dense1.weight = dense0.weight
dense1.bias = dense0.bias
Note that unlike the load_parameters or load_dict functions, share_parameters results in the Parameter object being shared (or tied) between the models, whereas load_parameters or load_dict only set the value of the data dictionary of a model. If you call load_parameters or load_dict after share_parameters, the loaded value will be reflected in all networks that use the shared (or tied) Parameter object.
- Parameters:
shared (Dict) – Dict of the shared parameters.
- Return type:
this block
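The difference between sharing (tying) a Parameter object and loading values into it can be sketched in plain Python. `TiedParam`, `share`, and `load_values` are hypothetical names for this illustration, and the dictionaries stand in for two blocks' parameter dictionaries.

```python
class TiedParam:
    """Hypothetical parameter object wrapping a single value."""

    def __init__(self, value):
        self.value = value


def share(target, shared):
    """Tie parameters: target ends up holding the same objects as shared."""
    for name, param in shared.items():
        if name in target:
            target[name] = param


def load_values(target, values):
    """Load: overwrite the stored values, keeping the parameter objects."""
    for name, value in values.items():
        target[name].value = value


dense0 = {"weight": TiedParam(1.0), "bias": TiedParam(0.0)}
dense1 = {"weight": TiedParam(5.0), "bias": TiedParam(5.0)}

share(dense1, dense0)                               # dense1 now reuses dense0's objects
load_values(dense1, {"weight": 9.0, "bias": 3.0})   # visible through dense0 as well
print(dense0["weight"].value, dense0["bias"].value)
```

Because both dictionaries refer to the same objects after sharing, a later load into either one is visible through the other, which is the behaviour the note above describes.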
- summary(*inputs)¶
Print the summary of the model’s output and parameters.
The network must have been initialized, and must not have been hybridized.
- Parameters:
inputs (object) – Any input that the model supports. For any tensor in the input, only mxnet.ndarray.NDArray is supported.
- zero_grad()¶
Sets all Parameters’ gradient buffer to 0.
- class mxnet.gluon.nn.basic_layers.InstanceNorm(axis=1, epsilon=1e-05, center=True, scale=False, beta_initializer='zeros', gamma_initializer='ones', in_channels=0, **kwargs)[source]¶
Bases: HybridBlock
Applies instance normalization to the n-dimensional input array. This operator takes an n-dimensional input array where (n>2) and normalizes the input using the following formula:
\[ \begin{align}\begin{aligned}\bar{C} = \{i \mid i \neq 0, i \neq axis\}\\out = \frac{x - mean[data, \bar{C}]}{ \sqrt{Var[data, \bar{C}] + \epsilon}} * gamma + beta\end{aligned}\end{align} \]
- Parameters:
axis (int, default 1) – The axis that will be excluded in the normalization process. This is typically the channels (C) axis. For instance, after a Conv2D layer with layout=’NCHW’, set axis=1 in InstanceNorm. If layout=’NHWC’, then set axis=3. Data will be normalized along axes excluding the first axis and the axis given.
epsilon (float, default 1e-5) – Small float added to variance to avoid dividing by zero.
center (bool, default True) – If True, add offset of beta to normalized tensor. If False, beta is ignored.
scale (bool, default False) – If True, multiply by gamma. If False, gamma is not used. When the next layer is linear (also e.g. nn.relu), this can be disabled since the scaling will be done by the next layer.
beta_initializer (str or Initializer, default ‘zeros’) – Initializer for the beta weight.
gamma_initializer (str or Initializer, default ‘ones’) – Initializer for the gamma weight.
in_channels (int, default 0) – Number of channels (feature maps) in input data. If not specified, initialization will be deferred to the first time forward is called and in_channels will be inferred from the shape of input data.
- Inputs:
data: input tensor with arbitrary shape.
- Outputs:
out: output tensor with the same shape as data.
References
Instance Normalization: The Missing Ingredient for Fast Stylization
Examples
>>> # Input of shape (2,1,2)
>>> x = mx.np.array([[[ 1.1, 2.2]],
...                  [[ 3.3, 4.4]]])
>>> # Instance normalization is calculated with the above formula
>>> layer = InstanceNorm()
>>> layer.initialize(device=mx.cpu(0))
>>> layer(x)
[[[-0.99998355 0.99998331]]
 [[-0.99998319 0.99998361]]]
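The numbers in this example can be checked by hand with a short pure-Python sketch. It is not MXNet's implementation: it assumes gamma = 1, beta = 0, a nested-list input shaped (N, C, W) with axis=1, and that epsilon is added to the variance before the square root, which is the reading consistent with the output shown above. The helper name `instance_norm_1d` is hypothetical.

```python
import math


def instance_norm_1d(x, epsilon=1e-5, gamma=1.0, beta=0.0):
    """Normalize each (sample, channel) row of a nested list shaped (N, C, W)."""
    out = []
    for sample in x:
        normalized_sample = []
        for channel in sample:
            mean = sum(channel) / len(channel)
            # Population variance over the non-batch, non-channel positions.
            var = sum((v - mean) ** 2 for v in channel) / len(channel)
            denom = math.sqrt(var + epsilon)
            normalized_sample.append(
                [(v - mean) / denom * gamma + beta for v in channel]
            )
        out.append(normalized_sample)
    return out


x = [[[1.1, 2.2]], [[3.3, 4.4]]]
result = instance_norm_1d(x)
print(result)
```

Each sample and channel is normalized independently of the rest of the batch, which is what distinguishes this from batch normalization.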
- apply(fn)¶
Applies fn recursively to every child block as well as self.
- Parameters:
fn (callable) – Function to be applied to each submodule, of form fn(block).
- Return type:
this block
- cast(dtype)¶
Cast this Block to use another data type.
- Parameters:
dtype (str or numpy.dtype) – The new data type.
- collect_params(select=None)¶
Returns a Dict containing this Block’s and all of its children’s Parameters (default). It can also return only the Parameters whose names match a given regular expression.
For example, collect the specified parameters in [‘conv1.weight’, ‘conv1.bias’, ‘fc.weight’, ‘fc.bias’]:
model.collect_params('conv1.weight|conv1.bias|fc.weight|fc.bias')
or collect all parameters whose names end with ‘weight’ or ‘bias’, this can be done using regular expressions:
model.collect_params('.*weight|.*bias')
- Parameters:
select (str) – regular expressions
- Return type:
The selected
Dict
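How a select pattern narrows the returned dictionary can be sketched with the standard re module. The helper `select_params` is hypothetical, and the sketch assumes the pattern is matched from the start of each parameter name; the parameter names and integer values are placeholders.

```python
import re


def select_params(params, select=None):
    """Return the entries of params whose names match the select pattern."""
    if select is None:
        return dict(params)
    pattern = re.compile(select)
    # Assumption: the pattern is matched from the start of each name.
    return {name: value for name, value in params.items() if pattern.match(name)}


params = {
    "conv1.weight": 1,
    "conv1.bias": 2,
    "fc.weight": 3,
    "fc.bias": 4,
    "bn.gamma": 5,
}
by_name = select_params(params, "conv1.weight|fc.bias")
by_suffix = select_params(params, ".*weight|.*bias")
print(sorted(by_name), sorted(by_suffix))
```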
- export(path, epoch=0, remove_amp_cast=True)¶
Export HybridBlock to json format that can be loaded by gluon.SymbolBlock.imports or the C++ interface.
Note
When there is only one input, it will be named data. When there is more than one input, they will be named data0, data1, etc.
- Parameters:
path (str or None) – Path to save model. Two files path-symbol.json and path-xxxx.params will be created, where xxxx is the 4 digits epoch number. If None, do not export to file but return Python Symbol object and corresponding dictionary of parameters.
epoch (int) – Epoch number of saved model.
remove_amp_cast (bool, optional) – Whether to remove the amp_cast and amp_multicast operators, before saving the model.
- Returns:
symbol_filename (str) – Filename to which model symbols were saved, including path prefix.
params_filename (str) – Filename to which model parameters were saved, including path prefix.
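The two file names follow directly from path and epoch as described above. A minimal sketch of that naming rule, using a hypothetical helper `export_filenames` that builds the names only and writes nothing to disk:

```python
def export_filenames(path, epoch=0):
    """Build the names described above: path-symbol.json and path-xxxx.params."""
    symbol_filename = f"{path}-symbol.json"
    # The epoch is zero-padded to four digits.
    params_filename = f"{path}-{epoch:04d}.params"
    return symbol_filename, params_filename


names = export_filenames("model/resnet", epoch=7)
print(names)
```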
- forward(x)[source]¶
Overrides the forward computation. Arguments must be
mxnet.numpy.ndarray.
- hybridize(active=True, partition_if_dynamic=True, static_alloc=False, static_shape=False, inline_limit=2, forward_bulk_size=None, backward_bulk_size=None)¶
Activates or deactivates HybridBlocks recursively. Has no effect on non-hybrid children.
- Parameters:
active (bool, default True) – Whether to turn hybrid on or off.
partition_if_dynamic (bool, default True) – Whether to partition the graph when a dynamic shape op exists.
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
inline_limit (optional int, default 2) – Maximum number of operators that can be inlined.
forward_bulk_size (optional int, default None) – Segment size of bulk execution during forward pass.
backward_bulk_size (optional int, default None) – Segment size of bulk execution during backward pass.
- infer_type(*args)¶
Infers data type of Parameters from inputs.
- initialize(init=<mxnet.initializer.Uniform object>, device=None, verbose=False, force_reinit=False)¶
Initializes Parameters of this Block and its children.
- Parameters:
init (Initializer) – Global default Initializer to be used when Parameter.init() is None. Otherwise, Parameter.init() takes precedence.
device (Device or list of Device) – Keeps a copy of Parameters on one or many device(s).
verbose (bool, default False) – Whether to verbosely print out details on initialization.
force_reinit (bool, default False) – Whether to force re-initialization if parameter is already initialized.
- load(prefix)¶
Load a model saved using the save API
Reconfigures a model using the saved configuration. This function does not regenerate the model architecture. It resets each Block’s parameter UUIDs as they were when saved in order to match the names of the saved parameters.
This function assumes the Blocks in the model were created in the same order as when the model was saved. This is because each Block is uniquely identified by its class name and a unique ID, and a Block’s children are traversed in order (they are stored in an OrderedDict), with the unique ID denoting each specific Block.
Assumes that the model is created in an identical order every time. If the model cannot be recreated deterministically, do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph (Symbol & inputs) and settings are restored if it had been hybridized before saving.
- Parameters:
prefix (str) – The prefix to use in filenames for loading this model: <prefix>-model.json and <prefix>-model.params
- load_dict(param_dict, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')¶
Load parameters from dict
- Parameters:
param_dict (dict) – Dictionary containing model parameters
device (Device, optional) – Device context on which the memory is allocated. Default is mxnet.device.current_device().
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in the file.
ignore_extra (bool, default False) – Whether to silently ignore parameters from the file that are not present in this dict.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter if any
dtype_source (str, default 'current') – Must be one of {'current', 'saved'}. Only valid if cast_dtype=True. Specifies the source of the dtype for casting the parameters.
- load_parameters(filename, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')¶
Load parameters from file previously saved by save_parameters.
- Parameters:
filename (str) – Path to parameter file.
device (Device or list of Device, default cpu()) – Device(s) to initialize loaded parameters on.
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in the file.
ignore_extra (bool, default False) – Whether to silently ignore parameters from the file that are not present in this Block.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter if any.
dtype_source (str, default 'current') – Must be one of {'current', 'saved'}. Only valid if cast_dtype=True. Specifies the source of the dtype for casting the parameters.
- optimize_for(x, *args, backend=None, clear=False, partition_if_dynamic=True, static_alloc=False, static_shape=False, inline_limit=2, forward_bulk_size=None, backward_bulk_size=None, **kwargs)¶
Partitions the current HybridBlock and optimizes it for a given backend without executing a forward pass. Modifies the HybridBlock in-place.
Immediately partitions a HybridBlock using the specified backend. Combines the work done in the hybridize API with part of the work done in the forward pass without calling the CachedOp. Can be used in place of hybridize, afterwards export can be called or inference can be run. See README.md in example/extensions/lib_subgraph/README.md for more details.
Examples
# partition and then export to file
block.optimize_for(x, backend='myPart')
block.export('partitioned')
# partition and then run inference
block.optimize_for(x, backend='myPart')
block(x)
- Parameters:
x (NDArray) – first input to model
*args (NDArray) – other inputs to model
backend (str) – The name of backend, as registered in SubgraphBackendRegistry, default None
backend_opts (dict of user-specified options to pass to the backend for partitioning, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
clear (bool, default False) – clears any previous optimizations
partition_if_dynamic (bool, default True) – Whether to partition the graph when a dynamic shape op exists.
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
inline_limit (optional int, default 2) – Maximum number of operators that can be inlined.
forward_bulk_size (optional int, default None) – Segment size of bulk execution during forward pass.
backward_bulk_size (optional int, default None) – Segment size of bulk execution during backward pass.
**kwargs (The backend options, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
- property params¶
Returns this Block’s parameter dictionary (does not include its children’s parameters).
- register_child(block, name=None)¶
Registers block as a child of self.
Blocks assigned to self as attributes will be registered automatically.
- register_forward_hook(hook)¶
Registers a forward hook on the block.
The hook function is called immediately after forward(). It should not modify the input or output.
- Parameters:
hook (callable) – The forward hook function of form hook(block, input, output) -> None.
- Return type:
mxnet.gluon.utils.HookHandle
- register_forward_pre_hook(hook)¶
Registers a forward pre-hook on the block.
The hook function is called immediately before forward(). It should not modify the input or output.
- Parameters:
hook (callable) – The forward hook function of form hook(block, input) -> None.
- Return type:
mxnet.gluon.utils.HookHandle
- register_op_hook(callback, monitor_all=False)¶
Install op hook for block recursively.
- Parameters:
callback (function) – Function called to inspect the values of the intermediate outputs of blocks after hybridization. It takes 3 parameters: the name of the tensor being inspected (str), the name of the operator producing or consuming that tensor (str), and the tensor being inspected (NDArray).
monitor_all (bool, default False) – If True, monitor both input and output, otherwise monitor output only.
- reset_ctx(ctx)¶
This function has been deprecated. Please refer to HybridBlock.reset_device.
- reset_device(device)¶
Re-assign all Parameters to other devices. If the Block is hybridized, it will reset the _cached_op_args.
- Parameters:
device (Device or list of Device, default device.current_device()) – Assign Parameter to given device. If device is a list of Device, a copy will be made for each device.
- save(prefix)¶
Save the model architecture and parameters to load again later
Saves the model architecture as a nested dictionary where each Block in the model is a dictionary and its children are sub-dictionaries.
Each Block is uniquely identified by Block class name and a unique ID. We save each Block’s parameter UUID to restore later in order to match the saved parameters.
Recursively traverses a Block’s children in order (they are stored in an OrderedDict) and uses the unique ID to denote each specific Block.
Assumes that the model is created in an identical order every time. If the model cannot be recreated deterministically, do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph is saved (Symbol & inputs) if it has already been hybridized.
- Parameters:
prefix (str) – The prefix to use in filenames for saving this model: <prefix>-model.json and <prefix>-model.params
- save_parameters(filename, deduplicate=False)¶
Save parameters to file.
Saved parameters can only be loaded with load_parameters. Note that this method only saves parameters, not model structure. If you want to save model structures, please use HybridBlock.export().
- setattr(name, value)¶
Set an attribute to a new value for all Parameters.
For example, set grad_req to null if you don’t need gradient w.r.t a model’s Parameters:
model.setattr('grad_req', 'null')
or change the learning rate multiplier:
model.setattr('lr_mult', 0.5)
- Parameters:
name (str) – Name of the attribute.
value (valid type for attribute name) – The new value for the attribute.
- share_parameters(shared)¶
Share parameters recursively inside the model.
For example, if you want dense1 to share dense0’s weights, you can do:
dense0 = nn.Dense(20)
dense1 = nn.Dense(20)
dense1.share_parameters(dense0.collect_params())
which is equivalent to:
dense1.weight = dense0.weight
dense1.bias = dense0.bias
Note that unlike the load_parameters or load_dict functions, share_parameters results in the Parameter object being shared (or tied) between the models, whereas load_parameters or load_dict only set the value of the data dictionary of a model. If you call load_parameters or load_dict after share_parameters, the loaded value will be reflected in all networks that use the shared (or tied) Parameter object.
- Parameters:
shared (Dict) – Dict of the shared parameters.
- Return type:
this block
- summary(*inputs)¶
Print the summary of the model’s output and parameters.
The network must have been initialized, and must not have been hybridized.
- Parameters:
inputs (object) – Any input that the model supports. For any tensor in the input, only mxnet.ndarray.NDArray is supported.
- zero_grad()¶
Sets all Parameters’ gradient buffer to 0.
- class mxnet.gluon.nn.basic_layers.Lambda(function)[source]¶
Bases: Block
Wraps an operator or an expression as a Block object.
- Parameters:
function (str or function) –
Function used in lambda must be one of the following:
1) the name of an operator that is available in ndarray. For example:
block = Lambda('tanh')
2) a function that conforms to def function(*args). For example:
block = Lambda(lambda x: npx.leaky_relu(x, slope=0.1))
- Inputs:
*args: one or more input data. Their shapes depend on the function.
- Outputs:
*outputs: one or more output data. Their shapes depend on the function.
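The two accepted forms of function, an operator name or a callable, can be sketched in plain Python. `TinyLambda` is a hypothetical stand-in, and the standard math module plays the role of the operator namespace purely for illustration; MXNet's Lambda looks names up in its own array namespace instead.

```python
import math


class TinyLambda:
    """Hypothetical stand-in for the wrapping behaviour described above."""

    def __init__(self, function, namespace=math):
        if isinstance(function, str):
            # A string is looked up as an operator name in the namespace.
            if not hasattr(namespace, function):
                raise ValueError(f"unknown operator: {function}")
            self._func = getattr(namespace, function)
        elif callable(function):
            self._func = function
        else:
            raise ValueError("function must be a str or a callable")

    def __call__(self, *args):
        return self._func(*args)


by_name = TinyLambda("tanh")
# A scalar leaky ReLU with slope 0.1, written as an ordinary Python lambda.
by_callable = TinyLambda(lambda x: x if x > 0 else 0.1 * x)
print(by_name(0.0), by_callable(-2.0))
```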
- class mxnet.gluon.nn.basic_layers.LayerNorm(axis=-1, epsilon=1e-05, center=True, scale=True, beta_initializer='zeros', gamma_initializer='ones', in_channels=0)[source]¶
Bases: HybridBlock
Applies layer normalization to the n-dimensional input array. This operator takes an n-dimensional input array and normalizes the input using the given axis:
\[out = \frac{x - mean[data, axis]}{ \sqrt{Var[data, axis] + \epsilon}} * gamma + beta\]
- Parameters:
axis (int, default -1) – The axis that should be normalized. This is typically the axis of the channels.
epsilon (float, default 1e-5) – Small float added to variance to avoid dividing by zero.
center (bool, default True) – If True, add offset of beta to normalized tensor. If False, beta is ignored.
scale (bool, default True) – If True, multiply by gamma. If False, gamma is not used.
beta_initializer (str or Initializer, default ‘zeros’) – Initializer for the beta weight.
gamma_initializer (str or Initializer, default ‘ones’) – Initializer for the gamma weight.
in_channels (int, default 0) – Number of channels (feature maps) in input data. If not specified, initialization will be deferred to the first time forward is called and in_channels will be inferred from the shape of input data.
- Inputs:
data: input tensor with arbitrary shape.
- Outputs:
out: output tensor with the same shape as data.
Examples
>>> # Input of shape (2, 5)
>>> x = mx.np.array([[1, 2, 3, 4, 5], [1, 1, 2, 2, 2]])
>>> # Layer normalization is calculated with the above formula
>>> layer = LayerNorm()
>>> layer.initialize(device=mx.cpu(0))
>>> layer(x)
[[-1.41421 -0.707105 0. 0.707105 1.41421 ]
 [-1.2247195 -1.2247195 0.81647956 0.81647956 0.81647956]]
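The example output can be reproduced from the formula with a pure-Python sketch. It is not MXNet's implementation: it assumes gamma = 1, beta = 0, axis=-1, and a two-dimensional nested-list input. The helper name `layer_norm_last_axis` is hypothetical.

```python
import math


def layer_norm_last_axis(rows, epsilon=1e-5, gamma=1.0, beta=0.0):
    """Normalize each row of a 2-D nested list over its last axis."""
    out = []
    for row in rows:
        mean = sum(row) / len(row)
        # Population variance of the row, with epsilon added under the root.
        var = sum((v - mean) ** 2 for v in row) / len(row)
        denom = math.sqrt(var + epsilon)
        out.append([(v - mean) / denom * gamma + beta for v in row])
    return out


normalized = layer_norm_last_axis([[1, 2, 3, 4, 5], [1, 1, 2, 2, 2]])
print(normalized)
```

For the first row the mean is 3 and the variance is 2, so the first entry is (1 - 3) / sqrt(2 + 1e-5), roughly -1.41421, in line with the output above.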
- apply(fn)¶
Applies fn recursively to every child block as well as self.
- Parameters:
fn (callable) – Function to be applied to each submodule, of form fn(block).
- Return type:
this block
- cast(dtype)¶
Cast this Block to use another data type.
- Parameters:
dtype (str or numpy.dtype) – The new data type.
- collect_params(select=None)¶
Returns a Dict containing this Block’s and all of its children’s Parameters (default). It can also return only the Parameters whose names match a given regular expression.
For example, collect the specified parameters in [‘conv1.weight’, ‘conv1.bias’, ‘fc.weight’, ‘fc.bias’]:
model.collect_params('conv1.weight|conv1.bias|fc.weight|fc.bias')
or collect all parameters whose names end with ‘weight’ or ‘bias’, this can be done using regular expressions:
model.collect_params('.*weight|.*bias')
- Parameters:
select (str) – regular expressions
- Return type:
The selected
Dict
- export(path, epoch=0, remove_amp_cast=True)¶
Export HybridBlock to json format that can be loaded by gluon.SymbolBlock.imports or the C++ interface.
Note
When there is only one input, it will be named data. When there is more than one input, they will be named data0, data1, etc.
- Parameters:
path (str or None) – Path to save model. Two files path-symbol.json and path-xxxx.params will be created, where xxxx is the 4 digits epoch number. If None, do not export to file but return Python Symbol object and corresponding dictionary of parameters.
epoch (int) – Epoch number of saved model.
remove_amp_cast (bool, optional) – Whether to remove the amp_cast and amp_multicast operators, before saving the model.
- Returns:
symbol_filename (str) – Filename to which model symbols were saved, including path prefix.
params_filename (str) – Filename to which model parameters were saved, including path prefix.
- forward(data)[source]¶
Overrides the forward computation. Arguments must be
mxnet.numpy.ndarray.
- hybridize(active=True, partition_if_dynamic=True, static_alloc=False, static_shape=False, inline_limit=2, forward_bulk_size=None, backward_bulk_size=None)¶
Activates or deactivates HybridBlocks recursively. Has no effect on non-hybrid children.
- Parameters:
active (bool, default True) – Whether to turn hybrid on or off.
partition_if_dynamic (bool, default True) – Whether to partition the graph when a dynamic shape op exists.
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
inline_limit (optional int, default 2) – Maximum number of operators that can be inlined.
forward_bulk_size (optional int, default None) – Segment size of bulk execution during forward pass.
backward_bulk_size (optional int, default None) – Segment size of bulk execution during backward pass.
- infer_type(*args)¶
Infers data type of Parameters from inputs.
- initialize(init=<mxnet.initializer.Uniform object>, device=None, verbose=False, force_reinit=False)¶
Initializes Parameters of this Block and its children.
- Parameters:
init (Initializer) – Global default Initializer to be used when Parameter.init() is None. Otherwise, Parameter.init() takes precedence.
device (Device or list of Device) – Keeps a copy of Parameters on one or many device(s).
verbose (bool, default False) – Whether to verbosely print out details on initialization.
force_reinit (bool, default False) – Whether to force re-initialization if parameter is already initialized.
- load(prefix)¶
Load a model saved using the save API
Reconfigures a model using the saved configuration. This function does not regenerate the model architecture. It resets each Block’s parameter UUIDs as they were when saved in order to match the names of the saved parameters.
This function assumes the Blocks in the model were created in the same order as when the model was saved. This is because each Block is uniquely identified by its class name and a unique ID, and a Block’s children are traversed in order (they are stored in an OrderedDict), with the unique ID denoting each specific Block.
Assumes that the model is created in an identical order every time. If the model cannot be recreated deterministically, do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph (Symbol & inputs) and settings are restored if it had been hybridized before saving.
- Parameters:
prefix (str) – The prefix to use in filenames for loading this model: <prefix>-model.json and <prefix>-model.params
- load_dict(param_dict, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')¶
Load parameters from dict
- Parameters:
param_dict (dict) – Dictionary containing model parameters
device (Device, optional) – Device context on which the memory is allocated. Default is mxnet.device.current_device().
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in the file.
ignore_extra (bool, default False) – Whether to silently ignore parameters from the file that are not present in this dict.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter if any
dtype_source (str, default 'current') – Must be one of {'current', 'saved'}. Only valid if cast_dtype=True. Specifies the source of the dtype for casting the parameters.
- load_parameters(filename, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')¶
Load parameters from file previously saved by save_parameters.
- Parameters:
filename (str) – Path to parameter file.
device (Device or list of Device, default cpu()) – Device(s) to initialize loaded parameters on.
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in the file.
ignore_extra (bool, default False) – Whether to silently ignore parameters from the file that are not present in this Block.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter if any.
dtype_source (str, default 'current') – Must be one of {'current', 'saved'}. Only valid if cast_dtype=True. Specifies the source of the dtype for casting the parameters.
References
- optimize_for(x, *args, backend=None, clear=False, partition_if_dynamic=True, static_alloc=False, static_shape=False, inline_limit=2, forward_bulk_size=None, backward_bulk_size=None, **kwargs)¶
Partitions the current HybridBlock and optimizes it for a given backend without executing a forward pass. Modifies the HybridBlock in-place.
Immediately partitions a HybridBlock using the specified backend. Combines the work done in the hybridize API with part of the work done in the forward pass without calling the CachedOp. Can be used in place of hybridize, afterwards export can be called or inference can be run. See README.md in example/extensions/lib_subgraph/README.md for more details.
Examples
# partition and then export to file
block.optimize_for(x, backend='myPart')
block.export('partitioned')
# partition and then run inference
block.optimize_for(x, backend='myPart')
block(x)
- Parameters:
x (NDArray) – first input to model
*args (NDArray) – other inputs to model
backend (str) – The name of backend, as registered in SubgraphBackendRegistry, default None
backend_opts (dict of user-specified options to pass to the backend for partitioning, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
clear (bool, default False) – clears any previous optimizations
partition_if_dynamic (bool, default True) – Whether to partition the graph when a dynamic-shape op exists.
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
inline_limit (optional int, default 2) – Maximum number of operators that can be inlined.
forward_bulk_size (optional int, default None) – Segment size of bulk execution during forward pass.
backward_bulk_size (optional int, default None) – Segment size of bulk execution during backward pass.
**kwargs (The backend options, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
- property params¶
Returns this Block’s parameter dictionary (does not include its children’s parameters).
- register_child(block, name=None)¶
Registers block as a child of self.
Blocks assigned to self as attributes will be registered automatically.
- register_forward_hook(hook)¶
Registers a forward hook on the block.
The hook function is called immediately after forward(). It should not modify the input or output.
- Parameters:
hook (callable) – The forward hook function of form hook(block, input, output) -> None.
- Return type:
mxnet.gluon.utils.HookHandle
- register_forward_pre_hook(hook)¶
Registers a forward pre-hook on the block.
The hook function is called immediately before forward(). It should not modify the input or output.
- Parameters:
hook (callable) – The forward pre-hook function of form hook(block, input) -> None.
- Return type:
mxnet.gluon.utils.HookHandle
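The call order of the two hook types can be illustrated with a small pure-Python sketch. ToyBlock below is a hypothetical stand-in and not MXNet's implementation: it omits the HookHandle return value, and passing the inputs to the hooks as a tuple is an assumption of this sketch.

```python
class ToyBlock:
    """Hypothetical stand-in for a Block; not MXNet's implementation."""

    def __init__(self):
        self._pre_hooks = []
        self._hooks = []

    def register_forward_pre_hook(self, hook):
        # A real Block returns a HookHandle here; the sketch returns nothing.
        self._pre_hooks.append(hook)

    def register_forward_hook(self, hook):
        self._hooks.append(hook)

    def forward(self, x):
        return x * 2

    def __call__(self, x):
        for hook in self._pre_hooks:
            hook(self, (x,))        # called immediately before forward()
        out = self.forward(x)
        for hook in self._hooks:
            hook(self, (x,), out)   # called immediately after forward()
        return out


events = []
block = ToyBlock()
block.register_forward_pre_hook(
    lambda blk, inputs: events.append(("pre", inputs)))
block.register_forward_hook(
    lambda blk, inputs, output: events.append(("post", inputs, output)))
result = block(3)
```

Both hooks only record what they see, in line with the guidance that a hook should not modify the input or output.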
- register_op_hook(callback, monitor_all=False)¶
Install op hook for block recursively.
- Parameters:
callback (function) – Function called to inspect the values of the intermediate outputs of blocks after hybridization. It takes 3 parameters: the name of the tensor being inspected (str), the name of the operator producing or consuming that tensor (str), and the tensor being inspected (NDArray).
monitor_all (bool, default False) – If True, monitor both input and output, otherwise monitor output only.
- reset_ctx(ctx)¶
This function has been deprecated. Please refer to HybridBlock.reset_device.
- reset_device(device)¶
Re-assign all Parameters to other devices. If the Block is hybridized, it will reset the _cached_op_args.
- Parameters:
device (Device or list of Device, default device.current_device()) – Assign Parameter to given device. If device is a list of Device, a copy will be made for each device.
- save(prefix)¶
Save the model architecture and parameters to load again later
Saves the model architecture as a nested dictionary where each Block in the model is a dictionary and its children are sub-dictionaries.
Each Block is uniquely identified by Block class name and a unique ID. We save each Block’s parameter UUID to restore later in order to match the saved parameters.
Recursively traverses a Block’s children in order (they are stored in an OrderedDict) and uses the unique ID to denote that specific Block.
Assumes that the model is created in an identical order every time. If the model is not able to be recreated deterministically do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph is saved (Symbol & inputs) if it has already been hybridized.
- Parameters:
prefix (str) – The prefix to use in filenames for saving this model: <prefix>-model.json and <prefix>-model.params
- save_parameters(filename, deduplicate=False)¶
Save parameters to file.
Saved parameters can only be loaded with load_parameters. Note that this method only saves parameters, not model structure. If you want to save model structures, please use HybridBlock.export().
- Parameters:
References
- setattr(name, value)¶
Set an attribute to a new value for all Parameters.
For example, set grad_req to null if you don’t need gradient w.r.t a model’s Parameters:
model.setattr('grad_req', 'null')
or change the learning rate multiplier:
model.setattr('lr_mult', 0.5)
- Parameters:
name (str) – Name of the attribute.
value (valid type for attribute name) – The new value for the attribute.
- share_parameters(shared)¶
Share parameters recursively inside the model.
For example, if you want dense1 to share dense0’s weights, you can do:
dense0 = nn.Dense(20)
dense1 = nn.Dense(20)
dense1.share_parameters(dense0.collect_params())
which is equivalent to
dense1.weight = dense0.weight
dense1.bias = dense0.bias
Note that unlike the load_parameters or load_dict functions, share_parameters results in the Parameter object being shared (or tied) between the models, whereas load_parameters or load_dict only set the value of the data dictionary of a model. If you call load_parameters or load_dict after share_parameters, the loaded value will be reflected in all networks that use the shared (or tied) Parameter object.
- Parameters:
shared (Dict) – Dict of the shared parameters.
- Return type:
this block
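The difference between loading a value and sharing (tying) a Parameter object can be sketched in plain Python. ToyParameter and the two dictionaries are hypothetical stand-ins for illustration only; they do not use MXNet's Parameter class.

```python
class ToyParameter:
    """Hypothetical stand-in for a Parameter: an object holding one value."""

    def __init__(self, value):
        self.value = value


# Two toy "models", each mapping a parameter name to a parameter object.
model_a = {"weight": ToyParameter(1.0)}
model_b = {"weight": ToyParameter(2.0)}

# "Loading" copies a value into model_b's own parameter object.
model_b["weight"].value = model_a["weight"].value
separate = model_b["weight"] is not model_a["weight"]

# "Sharing" makes both models refer to the same parameter object.
model_b["weight"] = model_a["weight"]
tied = model_b["weight"] is model_a["weight"]

# After sharing, a value loaded through one model is visible in the other.
model_b["weight"].value = 5.0
seen_by_a = model_a["weight"].value
```

After the loading step the two models still own distinct objects. Once the object is shared, any later load through either model changes the value seen by both.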
- summary(*inputs)¶
Print the summary of the model’s output and parameters.
The network must have been initialized, and must not have been hybridized.
- Parameters:
inputs (object) – Any input that the model supports. For any tensor in the input, only mxnet.ndarray.NDArray is supported.
- zero_grad()¶
Sets all Parameters’ gradient buffer to 0.
- class mxnet.gluon.nn.basic_layers.Sequential[source]¶
Bases: Block
Stacks Blocks sequentially.
Example:
net = nn.Sequential()
net.add(nn.Dense(10, activation='relu'))
net.add(nn.Dense(20))
- class mxnet.gluon.nn.basic_layers.SyncBatchNorm(in_channels=0, num_devices=None, momentum=0.9, epsilon=1e-05, center=True, scale=True, use_global_stats=False, beta_initializer='zeros', gamma_initializer='ones', running_mean_initializer='zeros', running_variance_initializer='ones', **kwargs)[source]¶
Bases: BatchNorm
Cross-GPU Synchronized Batch normalization (SyncBN)
Standard BN [1] implementations normalize the data only within each device. SyncBN normalizes the input within the whole mini-batch. We follow the implementation described in the paper [2].
Note: Current implementation of SyncBN does not support FP16 training. For FP16 inference, use standard nn.BatchNorm instead of SyncBN.
- Parameters:
in_channels (int, default 0) – Number of channels (feature maps) in input data. If not specified, initialization will be deferred to the first time forward is called and in_channels will be inferred from the shape of input data.
num_devices (int, default number of visible GPUs)
momentum (float, default 0.9) – Momentum for the moving average.
epsilon (float, default 1e-5) – Small float added to variance to avoid dividing by zero.
center (bool, default True) – If True, add offset of beta to normalized tensor. If False, beta is ignored.
scale (bool, default True) – If True, multiply by gamma. If False, gamma is not used. When the next layer is linear (also e.g. nn.relu), this can be disabled since the scaling will be done by the next layer.
use_global_stats (bool, default False) – If True, use global moving statistics instead of local batch-norm. This will force change batch-norm into a scale shift operator. If False, use local batch-norm.
beta_initializer (str or Initializer, default ‘zeros’) – Initializer for the beta weight.
gamma_initializer (str or Initializer, default ‘ones’) – Initializer for the gamma weight.
running_mean_initializer (str or Initializer, default ‘zeros’) – Initializer for the running mean.
running_variance_initializer (str or Initializer, default ‘ones’) – Initializer for the running variance.
- Inputs:
data: input tensor with arbitrary shape.
- Outputs:
out: output tensor with the same shape as data.
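The effect of normalizing within each device versus within the whole mini-batch can be seen in a small pure-Python sketch. It is an illustration under simplifying assumptions (a single channel, two hypothetical device shards, no gamma/beta, no running statistics) and does not reflect how SyncBatchNorm is implemented or how devices communicate.

```python
def mean_var(values):
    """Return the mean and the biased (population) variance of a list."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var


def normalize(values, mean, var, eps=1e-5):
    return [(v - mean) / (var + eps) ** 0.5 for v in values]


# Hypothetical mini-batch of one channel, split across two devices.
shards = [[1.0, 2.0, 3.0], [7.0, 8.0, 9.0]]

# Standard BN: each device normalizes with its own statistics.
per_device = [normalize(s, *mean_var(s)) for s in shards]

# SyncBN: every device normalizes with statistics of the whole mini-batch.
whole_batch = [v for s in shards for v in s]
batch_mean, batch_var = mean_var(whole_batch)
synced = [normalize(s, batch_mean, batch_var) for s in shards]
```

With per-device statistics both shards produce identical outputs, so the offset between them is lost. With whole-batch statistics the first shard maps to negative values and the second to positive values.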
- Reference:
- apply(fn)¶
Applies fn recursively to every child block as well as self.
- Parameters:
fn (callable) – Function to be applied to each submodule, of form fn(block).
- Return type:
this block
- cast(dtype)¶
Cast this Block to use another data type.
- Parameters:
dtype (str or numpy.dtype) – The new data type.
- collect_params(select=None)¶
Returns a Dict containing this Block’s and all of its children’s Parameters (default). It can also return the subset of the Dict whose names match a given regular expression.
For example, collect the specified parameters in ['conv1.weight', 'conv1.bias', 'fc.weight', 'fc.bias']:
model.collect_params('conv1.weight|conv1.bias|fc.weight|fc.bias')
or collect all parameters whose names end with ‘weight’ or ‘bias’, this can be done using regular expressions:
model.collect_params('.*weight|.*bias')
- Parameters:
select (str) – regular expressions
- Return type:
The selected Dict
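How the two select patterns above pick out names can be checked with Python's re module alone. The parameter names and the select_names helper are hypothetical, and the sketch assumes names are tested from the start of the string, as re.match does.

```python
import re

# Hypothetical parameter names of a small model.
names = ["conv1.weight", "conv1.bias", "bn1.gamma", "fc.weight", "fc.bias"]


def select_names(names, select):
    """Keep the names that the regular expression matches from the start."""
    pattern = re.compile(select)
    return [name for name in names if pattern.match(name)]


explicit = select_names(names, "conv1.weight|conv1.bias|fc.weight|fc.bias")
by_suffix = select_names(names, ".*weight|.*bias")
```

Both patterns select the four weight and bias entries and skip bn1.gamma. Note that an unescaped `.` in a pattern matches any character, not only a literal dot.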
- export(path, epoch=0, remove_amp_cast=True)¶
Export HybridBlock to json format that can be loaded by gluon.SymbolBlock.imports or the C++ interface.
Note
When there is only one input, it will be named data. When there is more than one input, they will be named data0, data1, etc.
- Parameters:
path (str or None) – Path to save model. Two files path-symbol.json and path-xxxx.params will be created, where xxxx is the 4 digits epoch number. If None, do not export to file but return Python Symbol object and corresponding dictionary of parameters.
epoch (int) – Epoch number of saved model.
remove_amp_cast (bool, optional) – Whether to remove the amp_cast and amp_multicast operators, before saving the model.
- Returns:
symbol_filename (str) – Filename to which model symbols were saved, including path prefix.
params_filename (str) – Filename to which model parameters were saved, including path prefix.
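The file naming described for path and epoch can be sketched as follows. The export_filenames helper is hypothetical and only mirrors the naming rule stated above (path-symbol.json and path-xxxx.params with a 4-digit epoch). It does not call MXNet or write any files.

```python
def export_filenames(path, epoch=0):
    """Build the two filenames from the naming rule in the description."""
    symbol_filename = f"{path}-symbol.json"
    params_filename = f"{path}-{epoch:04d}.params"  # epoch zero-padded to 4 digits
    return symbol_filename, params_filename


symbol_file, params_file = export_filenames("my_model", epoch=7)
```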
- forward(x)[source]¶
Overrides the forward computation. Arguments must be mxnet.numpy.ndarray.
- hybridize(active=True, partition_if_dynamic=True, static_alloc=False, static_shape=False, inline_limit=2, forward_bulk_size=None, backward_bulk_size=None)¶
Activates or deactivates HybridBlocks recursively. Has no effect on non-hybrid children.
- Parameters:
active (bool, default True) – Whether to turn hybrid on or off.
partition_if_dynamic (bool, default True) – Whether to partition the graph when a dynamic-shape op exists.
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
inline_limit (optional int, default 2) – Maximum number of operators that can be inlined.
forward_bulk_size (optional int, default None) – Segment size of bulk execution during forward pass.
backward_bulk_size (optional int, default None) – Segment size of bulk execution during backward pass.
- infer_shape(x, *args)¶
Infers shape of Parameters from inputs.
- infer_type(*args)¶
Infers data type of Parameters from inputs.
- initialize(init=<mxnet.initializer.Uniform object>, device=None, verbose=False, force_reinit=False)¶
Initializes Parameters of this Block and its children.
- Parameters:
init (Initializer) – Global default Initializer to be used when Parameter.init() is None. Otherwise, Parameter.init() takes precedence.
device (Device or list of Device) – Keeps a copy of Parameters on one or many device(s).
verbose (bool, default False) – Whether to verbosely print out details on initialization.
force_reinit (bool, default False) – Whether to force re-initialization if parameter is already initialized.
- load(prefix)¶
Load a model saved using the save API
Reconfigures a model using the saved configuration. This function does not regenerate the model architecture. It resets each Block’s parameter UUIDs as they were when saved in order to match the names of the saved parameters.
This function assumes the Blocks in the model were created in the same order as when the model was saved. Each Block is uniquely identified by its class name and a unique ID, and a Block’s children are traversed in order (they are stored in an OrderedDict), so the unique ID denotes that specific Block.
Assumes that the model is created in an identical order every time. If the model is not able to be recreated deterministically do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph (Symbol & inputs) and settings are restored if it had been hybridized before saving.
- Parameters:
prefix (str) – The prefix to use in filenames for loading this model: <prefix>-model.json and <prefix>-model.params
- load_dict(param_dict, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')¶
Load parameters from dict
- Parameters:
param_dict (dict) – Dictionary containing model parameters
device (Device, optional) – Device context on which the memory is allocated. Default is mxnet.device.current_device().
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in the file.
ignore_extra (bool, default False) – Whether to silently ignore parameters from the file that are not present in this dict.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter if any
dtype_source (str, default 'current') – Must be one of {'current', 'saved'}. Only valid if cast_dtype=True. Specifies the source of the dtype for casting the parameters.
- load_parameters(filename, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')¶
Load parameters from file previously saved by save_parameters.
- Parameters:
filename (str) – Path to parameter file.
device (Device or list of Device, default cpu()) – Device(s) to initialize loaded parameters on.
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in the file.
ignore_extra (bool, default False) – Whether to silently ignore parameters from the file that are not present in this Block.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter if any.
dtype_source (str, default 'current') – Must be one of {'current', 'saved'}. Only valid if cast_dtype=True. Specifies the source of the dtype for casting the parameters.
References
- optimize_for(x, *args, backend=None, clear=False, partition_if_dynamic=True, static_alloc=False, static_shape=False, inline_limit=2, forward_bulk_size=None, backward_bulk_size=None, **kwargs)¶
Partitions the current HybridBlock and optimizes it for a given backend without executing a forward pass. Modifies the HybridBlock in-place.
Immediately partitions a HybridBlock using the specified backend. Combines the work done in the hybridize API with part of the work done in the forward pass without calling the CachedOp. Can be used in place of hybridize, afterwards export can be called or inference can be run. See README.md in example/extensions/lib_subgraph/README.md for more details.
Examples
# partition and then export to file
block.optimize_for(x, backend='myPart')
block.export('partitioned')
# partition and then run inference
block.optimize_for(x, backend='myPart')
block(x)
- Parameters:
x (NDArray) – first input to model
*args (NDArray) – other inputs to model
backend (str) – The name of backend, as registered in SubgraphBackendRegistry, default None
backend_opts (dict of user-specified options to pass to the backend for partitioning, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
clear (bool, default False) – clears any previous optimizations
partition_if_dynamic (bool, default True) – Whether to partition the graph when a dynamic-shape op exists.
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
inline_limit (optional int, default 2) – Maximum number of operators that can be inlined.
forward_bulk_size (optional int, default None) – Segment size of bulk execution during forward pass.
backward_bulk_size (optional int, default None) – Segment size of bulk execution during backward pass.
**kwargs (The backend options, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
- property params¶
Returns this Block’s parameter dictionary (does not include its children’s parameters).
- register_child(block, name=None)¶
Registers block as a child of self.
Blocks assigned to self as attributes will be registered automatically.
- register_forward_hook(hook)¶
Registers a forward hook on the block.
The hook function is called immediately after forward(). It should not modify the input or output.
- Parameters:
hook (callable) – The forward hook function of form hook(block, input, output) -> None.
- Return type:
mxnet.gluon.utils.HookHandle
- register_forward_pre_hook(hook)¶
Registers a forward pre-hook on the block.
The hook function is called immediately before forward(). It should not modify the input or output.
- Parameters:
hook (callable) – The forward pre-hook function of form hook(block, input) -> None.
- Return type:
mxnet.gluon.utils.HookHandle
- register_op_hook(callback, monitor_all=False)¶
Install op hook for block recursively.
- Parameters:
callback (function) – Function called to inspect the values of the intermediate outputs of blocks after hybridization. It takes 3 parameters: the name of the tensor being inspected (str), the name of the operator producing or consuming that tensor (str), and the tensor being inspected (NDArray).
monitor_all (bool, default False) – If True, monitor both input and output, otherwise monitor output only.
- reset_ctx(ctx)¶
This function has been deprecated. Please refer to HybridBlock.reset_device.
- reset_device(device)¶
Re-assign all Parameters to other devices. If the Block is hybridized, it will reset the _cached_op_args.
- Parameters:
device (Device or list of Device, default device.current_device()) – Assign Parameter to given device. If device is a list of Device, a copy will be made for each device.
- save(prefix)¶
Save the model architecture and parameters to load again later
Saves the model architecture as a nested dictionary where each Block in the model is a dictionary and its children are sub-dictionaries.
Each Block is uniquely identified by Block class name and a unique ID. We save each Block’s parameter UUID to restore later in order to match the saved parameters.
Recursively traverses a Block’s children in order (they are stored in an OrderedDict) and uses the unique ID to denote that specific Block.
Assumes that the model is created in an identical order every time. If the model is not able to be recreated deterministically do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph is saved (Symbol & inputs) if it has already been hybridized.
- Parameters:
prefix (str) – The prefix to use in filenames for saving this model: <prefix>-model.json and <prefix>-model.params
- save_parameters(filename, deduplicate=False)¶
Save parameters to file.
Saved parameters can only be loaded with load_parameters. Note that this method only saves parameters, not model structure. If you want to save model structures, please use HybridBlock.export().
- Parameters:
References
- setattr(name, value)¶
Set an attribute to a new value for all Parameters.
For example, set grad_req to null if you don’t need gradient w.r.t a model’s Parameters:
model.setattr('grad_req', 'null')
or change the learning rate multiplier:
model.setattr('lr_mult', 0.5)
- Parameters:
name (str) – Name of the attribute.
value (valid type for attribute name) – The new value for the attribute.
- share_parameters(shared)¶
Share parameters recursively inside the model.
For example, if you want dense1 to share dense0’s weights, you can do:
dense0 = nn.Dense(20)
dense1 = nn.Dense(20)
dense1.share_parameters(dense0.collect_params())
which is equivalent to
dense1.weight = dense0.weight
dense1.bias = dense0.bias
Note that unlike the load_parameters or load_dict functions, share_parameters results in the Parameter object being shared (or tied) between the models, whereas load_parameters or load_dict only set the value of the data dictionary of a model. If you call load_parameters or load_dict after share_parameters, the loaded value will be reflected in all networks that use the shared (or tied) Parameter object.
- Parameters:
shared (Dict) – Dict of the shared parameters.
- Return type:
this block
- summary(*inputs)¶
Print the summary of the model’s output and parameters.
The network must have been initialized, and must not have been hybridized.
- Parameters:
inputs (object) – Any input that the model supports. For any tensor in the input, only mxnet.ndarray.NDArray is supported.
- zero_grad()¶
Sets all Parameters’ gradient buffer to 0.