Conversion Metafile
When converting a model, a YAML metafile can be provided to customize the generated model. For example, it can specify:
- The data representation in memory (nhwc or nchw)
- Model quantization options
- Output dequantization
- Input preprocessing options
- Delegate to be used for inference (npu, gpu, cpu)
Example:
$ synap convert --model mobilenet_v1_quant.tflite --meta mobilenet.yaml --target VS680 --out-dir mnv1
This metafile is mandatory when converting a TensorFlow .pb model. It can be completely omitted when converting a quantized .tflite model.
The best way to understand the content of a metafile is probably to first look at an example. Here is one for a typical mobilenet_v1 model, followed by a detailed description of each field. Most of the fields are optional; mandatory fields are explicitly marked.
delegate: npu
data_layout: nhwc
security:
  secure: true
  file: ../security.yaml
inputs:
  - name: input
    shape: [1, 224, 224, 3]
    means: [128, 128, 128]
    scale: 128
    format: rgb
    security: any
    preprocess:
      type: nv21
      size: [1920, 1080]
      crop: true
outputs:
  - name: MobilenetV1/Predictions/Reshape_1
    dequantize: false
    format: confidence_array
quantization:
  data_type: uint8
  scheme: default
  mode: standard
  algorithm: standard
  options:
  dataset:
    - ../../sample/*_224x224.jpg
Fields
delegate
Select the delegate to use for inference. Available delegates are:
- default (default: automatically select the delegate according to the target HW)
- npu
- gpu
- cpu
If not specified, the default delegate for the target hardware is used. It is also possible to specify the delegate on a layer-by-layer basis. See section heterogeneous_inference.
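For example, inference can be forced onto a specific delegate regardless of the target's default (a minimal sketch; whether the gpu delegate is actually available depends on the target hardware):

```yaml
# Run the whole model on the GPU instead of the target's default delegate
delegate: gpu
```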
data_layout
The data layout in memory. Allowed values are: default, nchw, and nhwc.
For TensorFlow and TensorFlow Lite models, the default is nhwc. Forcing the converted model to be nchw might provide some performance advantage when the input data is already in this format since no additional data reorganization is needed.
For Caffe and ONNX models, the default is nchw. In this case, it is not possible to force to nhwc.
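As a sketch, a TensorFlow Lite model whose input data is already channel-first could be converted with:

```yaml
# Force NCHW layout to avoid a data reorganization at runtime
# (only meaningful for TensorFlow/TensorFlow Lite models, whose default is nhwc)
data_layout: nchw
```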
input_format
Format of the input tensors. This is an optional string that will be attached as an attribute to all the network input tensors for which a "format" field has not been specified.
output_format
Format of the output tensors. This is an optional string that will be attached as an attribute to all the network output tensors for which a "format" field has not been specified.
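As an illustration, a default format can be applied to all tensors at once instead of per-tensor (the attribute values below are illustrative):

```yaml
# Attached to every input/output tensor that has no explicit "format" field
input_format: rgb keep_proportions=0
output_format: confidence_array
```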
security
This section contains security configuration for the model. If this section is not present, security is disabled. Security is only supported with the npu delegate.
- secure: If true, enable security for the model. For secure models, it is also possible to specify the security policy for each input and output. A secure model is encrypted and signed at conversion time so that its structure and weights are not accessible and its authenticity can be verified. This is done with a set of key and certificate files whose paths are contained in a security file.
- file: Path to the security file. This is a yaml file with the following fields:
  encryption_key: path-to-encryption-key-file
  signature_key: path-to-signature-key-file
  model_certificate: path-to-model-certificate-file
  vendor_certificate: path-to-vendor-certificate-file
Both relative and absolute paths can be used. Relative paths are considered relative to the location of the security file itself. The same fields can also be specified directly in the model metafile in place of the 'file' field. For detailed information on the security policies and how to generate and authenticate a secure model, please refer to SyNAP_SyKURE.pdf.
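For example, the key and certificate paths can be written directly in the metafile instead of referencing a separate security file (the paths below are hypothetical placeholders):

```yaml
security:
  secure: true
  encryption_key: keys/encryption.key     # hypothetical path
  signature_key: keys/signature.key       # hypothetical path
  model_certificate: certs/model.cert     # hypothetical path
  vendor_certificate: certs/vendor.cert   # hypothetical path
```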
inputs (pb)
Must contain one entry for each input of the network. Each entry has the following fields:
- name (pb): Name of the input in the network graph. For tflite and onnx models, this field is not required but can still be used to specify a different input layer than the default input of the network. This feature allows converting just a subset of a network without having to manually edit the source model. For .pb models or when name is not specified, the inputs must be in the same order as they appear in the model. When this field is specified, the shape field is mandatory.
- shape (pb): Shape of the input tensor. This is a list of dimensions; the order is given by the layout of the input tensor in the model (even if a different layout is selected for the compiled model). By convention the first dimension must represent the number of samples N (also known as "batch size"); it is ignored in the generated model, which always works with a batch size of 1. When this field is specified, the name field is mandatory.
- means: Used to normalize the range of input values. A list of mean values, one for each channel of the corresponding input. If a single value is specified instead of a list, it is used for all the channels. If not specified, a mean of 0 is assumed. The i-th channel of each input is normalized as: norm = (in - means[i]) / scale.
- scale: Used to normalize the range of input values. The scale is a single value for all the channels of the corresponding input. If not specified, a scale of 1 is assumed.
- format: Information about the type and organization of the data in the tensor. The content and meaning of this string are custom-defined. However, the SyNAP Toolkit and SyNAP Preprocessor recognize by convention an initial format type optionally followed by one or more named attributes: format-type [key=value].... Recognized types are:
  - rgb (default): 8-bit RGB, RGBA, or grayscale image
  - bgr: 8-bit BGR, BGRA, or grayscale image
  Recognized attributes are:
  - keep_proportions=1 (default): preserve the aspect ratio when resizing an image with Preprocessor or during quantization
  - keep_proportions=0: don't preserve the aspect ratio when resizing an image with Preprocessor or during quantization
  Any additional attribute, if present, is ignored by SyNAP.
- preprocess: Input preprocessing options for this input tensor. It can contain the following fields:
  - type: format of the input data (e.g. rgb, nv12)
  - size: size of the input image as a list [H, W]
  - crop: enable runtime cropping of the input image
- security: Security policy for this input tensor. This field is only considered for secure models and can have the following values:
  - any (default): the input can be in either secure or non-secure memory
  - secure: the input must be in secure memory
  - non-secure: the input must be in non-secure memory
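As an illustration of the name/shape pair, the following sketch converts only the part of a network starting from an internal layer (the layer name and shape below are hypothetical):

```yaml
inputs:
  # Start conversion from an internal layer instead of the model's
  # default input; name and shape must be specified together
  - name: MobilenetV1/MobilenetV1/Conv2d_1_pointwise/Relu6   # hypothetical layer name
    shape: [1, 112, 112, 64]                                 # hypothetical shape
```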
outputs (pb)
Must contain one entry for each output of the network. Each entry has the following fields:
- name (pb): Name of the output in the network graph. For tflite and onnx models, this field is not required but can still be used to specify a different output layer than the default output of the network. This feature allows converting just a subset of a network without having to manually edit the source model. For .pb and .onnx models or when name is not specified, the outputs must be in the same order as they appear in the model.
- dequantize: The output of the network is internally dequantized and converted to float. This is more efficient than performing the conversion in software.
- format: Information about the type and organization of the data in the tensor. The content and meaning of this string are custom-defined. However, the SyNAP Classifier and Detector postprocessors recognize by convention an initial format type optionally followed by one or more named attributes: format-type [key=value].... All fields are separated by one or more spaces. No spaces are allowed between the key and the value. Example: confidence_array class_index_base=0. See the Classifier and Detector classes for a description of the specific attributes supported.
- security: Security policy for this output tensor. This field is only considered for secure models and can have the following values:
  - secure-if-input-secure (default): the output buffer must be in secure memory if at least one input is in secure memory
  - any: the output can be in either secure or non-secure memory
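Putting these fields together, an output entry with built-in dequantization and a format string for the Classifier postprocessor could look like this (a sketch based on the mobilenet example above):

```yaml
outputs:
  - name: MobilenetV1/Predictions/Reshape_1
    dequantize: true                             # convert the output to float internally
    format: confidence_array class_index_base=0  # attribute consumed by the Classifier
    security: any
```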
quantization (q)
Quantization options are required when quantizing a model during conversion. They are not needed when importing a model that is already quantized. Quantization is only supported with the npu delegate.
- data_type: Data type used to quantize the network. The same data type is used for both activation data and weights. Available data types are:
  - uint8 (default)
  - int8
  - int16
  - float16
- scheme: Select the quantization scheme. Available schemes are:
  - default (default)
  - asymmetric_affine
  - dynamic_fixed_point
  - perchannel_symmetric_affine
- mode: Select the quantization mode. Available modes are:
  - standard (default)
  - full
- algorithm: Select the quantization algorithm. Available algorithms are:
  - standard (default)
  - kl_divergence
  - moving_average
- options: Special options for fine-tuning the quantization in specific cases. Normally not needed.
- dataset (q): Quantization dataset(s), that is, the set of input files to be used to quantize the model. In the case of multi-input networks, it is necessary to specify one dataset per input. Each dataset consists of the sample files to be applied to the corresponding input during quantization. A sample file can be provided in one of two forms:
  - as an image file (.jpg or .png)
  - as a NumPy file (.npy)
Image files are suitable when the network inputs are images, that is 4-dimensional tensors (NCHW or NHWC). In this case, the means and scale values specified for the corresponding input are applied to each input image before it is used to quantize the model. Furthermore, each image is resized to fit the input tensor.
NumPy files can be used for all kinds of network inputs. A NumPy file shall contain an array of data with the same shape as the corresponding network input. In this case, it is not possible to specify a means and scale for the input; any preprocessing if needed has to be done when the NumPy file is generated.
To avoid having to manually list the files in the quantization dataset for each input, the quantization dataset is instead specified with a list of glob expressions, one glob expression for each input. This makes it easy to use the entire content of a directory, or a subset of it, as the quantization dataset for an input. For example, all the jpeg files in directory samples can be indicated with:
samples/*.jpg
Both relative and absolute paths can be used. Relative paths are considered relative to the location of the metafile itself. It is not possible to specify a mix of image and .npy files for the same input. For more information on the glob specification syntax, please refer to the Python documentation: https://docs.python.org/3/library/glob.html.
If the special keyword random is specified, a random data file will be automatically generated for this input. This option is useful for preliminary timing tests, but not for actual quantization.
If this field is not specified, quantization is disabled.
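For a hypothetical two-input network, the dataset field could combine image and NumPy samples, one glob expression per input (the paths below are illustrative):

```yaml
quantization:
  data_type: uint8
  dataset:
    - ../samples/images/*.jpg     # first input: image files, resized and normalized
    - ../samples/features/*.npy   # second input: raw arrays, used as-is
```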
Notes
- The fields marked (pb) are mandatory when converting .pb models.
- The fields marked (q) are mandatory when quantizing models.
The metafile also supports limited variable expansion: ${ENV:name} anywhere in the metafile is replaced with the content of the environment variable name (or with the empty string if the variable doesn't exist). ${FILE:name} in a format string is replaced with the content of the corresponding file (the file path is relative to that of the conversion metafile itself). This feature should be used sparingly as it makes the metafile not self-contained.
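For instance, the security file location could be taken from the environment (SYNAP_KEYS_DIR is a hypothetical variable name):

```yaml
security:
  secure: true
  file: ${ENV:SYNAP_KEYS_DIR}/security.yaml   # expanded from the environment at conversion time
```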