Benchmark YOLO models on Astra

This tutorial guides you through benchmarking pre-trained YOLO models on Synaptics Astra™ Machina™ boards to evaluate their inference speed. Our SyNAP toolkit supports releases up to YOLOv11, so you can benchmark and deploy models with minimal effort, whether they are pre-trained Ultralytics models or your own fine-tuned ones. Since Ultralytics has been at the forefront of creating state-of-the-art YOLO models, let's use their YOLOv8 Nano as an example to show how you can benchmark its performance on the SL1680's NPU.

Benchmarking applications

The Machina board ships with a few command-line applications that make it easy to benchmark or run inference on compiled .synap models. You can learn more about these applications here: Inference

Using the synap_cli application, you can benchmark any model on the Astra NPU with synthetic data. For other tasks, such as object detection, image classification, or image processing on a real image, you can use the corresponding task-specific application.

Ultralytics pre-trained YOLO models

First, on your host machine, download the Ultralytics YOLOv8 Nano model and export it in TFLite Float32 format.

!pip install ultralytics

from ultralytics import YOLO

model = YOLO("yolov8n.pt")  # load the pre-trained YOLOv8 Nano model
model.export(format="tflite", imgsz=(640, 640), opset=12)  # export to TFLite Float32

This produces a TFLite Float32 model. Starting from a Float32 model is recommended, since the SyNAP toolkit can then quantize it to 8-bit integers during compilation.
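The asymmetric affine uint8 scheme used in the compilation step below maps each float value to an 8-bit code via a scale and a zero point. A minimal sketch of that mapping, with illustrative scale/zero-point values (in practice the toolkit derives them from the calibration dataset, and its exact rounding may differ):

```python
# Illustrative asymmetric-affine uint8 quantization (not the SyNAP
# implementation): q = round(x / scale) + zero_point, clipped to [0, 255].
def quantize(values, scale, zero_point):
    return [max(0, min(255, round(x / scale) + zero_point)) for x in values]

def dequantize(codes, scale, zero_point):
    return [(q - zero_point) * scale for q in codes]

# Example: float activations in [-1, 1] mapped onto uint8 codes.
scale, zero_point = 2.0 / 255, 128  # assumed values for illustration
codes = quantize([-1.0, 0.0, 1.0], scale, zero_point)
print(codes)
print(dequantize(codes, scale, zero_point))
```

Dequantizing the codes shows the small reconstruction error that quantization introduces.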

Compile for Astra NPU

Now compile the exported model, say yolov8_float32.tflite, for the Astra SL1680 using the meta file synap.yaml below and bus.jpg (or any representative image) as the quantization dataset.

Meta file: synap.yaml

inputs:
  - scale: 255
    format: rgb keep_proportions=1
    name: inputs_0
    shape: [1, 640, 640, 3]
outputs:
  - format: yolov8 w_scale=640 h_scale=640 bb_normalized=1
quantization:
  data_type: uint8
  scheme: asymmetric_affine
  dataset:
    - ./bus.jpg
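In the input section above, scale: 255 divides pixel values by 255, and keep_proportions=1 preserves the aspect ratio (letterboxing the image to 640x640 rather than stretching it). A rough sketch of that resize arithmetic, assuming an 810x1080 source image (the size of the standard Ultralytics bus.jpg sample):

```python
# Sketch of the resize implied by keep_proportions=1 (illustrative,
# not the SyNAP toolkit's actual preprocessing code).
def letterbox_size(src_w, src_h, dst=640):
    # Scale so the image fits inside dst x dst, preserving aspect ratio.
    s = min(dst / src_w, dst / src_h)
    new_w, new_h = round(src_w * s), round(src_h * s)
    pad_w, pad_h = dst - new_w, dst - new_h  # padding needed to reach dst x dst
    return (new_w, new_h), (pad_w, pad_h)

# Assumed 810x1080 input: resized content plus horizontal padding.
size, pad = letterbox_size(810, 1080)
print(size, pad)

# scale: 255 maps a uint8 pixel value into [0, 1]:
print(200 / 255)
```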

Then use the SyNAP toolkit to compile it:

synap convert --model yolov8_float32.tflite --target SL1680 --meta synap.yaml --out-dir yolov8n

This generates model.synap, which is uint8-quantized and optimized for the Astra SL1680's NPU.

Now copy this model.synap to the Astra Machina board and run it there:

For Benchmarking:

synap_cli -m model.synap -r 5 random

For Image Inference benchmarking and object detection:

synap_cli_od -m model.synap bus.jpg

Benchmarking Results

SL1680          Benchmarking (ms)    Image Inference on bus.jpg (ms)
YOLOv8 Nano     33.48                39.90
YOLOv8 Small    58.27                64.38
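To put those per-frame latencies in context, throughput in frames per second is simply 1000 / latency_ms. A quick conversion of the synthetic-data benchmark numbers above:

```python
# Convert the measured per-frame latencies (ms) into throughput (FPS).
latencies_ms = {"YOLOv8 Nano": 33.48, "YOLOv8 Small": 58.27}

for model, ms in latencies_ms.items():
    fps = 1000 / ms
    print(f"{model}: {fps:.1f} FPS")
```

At these latencies the Nano model sustains roughly 30 FPS on the SL1680 NPU, comfortably above typical real-time video rates.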

Conclusion

With just a few commands, you can run benchmarks and evaluate model performance effortlessly. Happy benchmarking!