Image Classification

In this Quick guide, we start with image classification. It is a way for computers to recognize and label images based on what they contain. For example, a model can look at a photo and tell if it’s a cat, dog, or car.

note

If you haven't yet set up your board and installed the examples, please refer to the quick start.

info

This Quick guide is compatible with all Machina SL16xx boards running the OOBE image, which has pip and Python pre-installed. Inference is optimized for:

NPU for SL1680 and SL1640

GPU for SL1620

MobileNet models

We'll be using MobileNet, a vision model developed by Google in 2017 specifically for mobile and embedded devices.

The MobileNet v2 model is pre-installed on your Astra board. It has been pretrained on the ImageNet dataset, then quantized and compiled for the NPU on Synaptics Astra for even higher performance.

Input: A 224 x 224 image
Output: Confidence scores for 1000 ImageNet classes - we will select the top class as our classification result
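The top-class selection described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the code the SynapRT example runs: `top_k` is a hypothetical helper, and the short `scores` list stands in for the model's 1000-element output.

```python
def top_k(scores, k=5):
    """Return the k highest-scoring (class_id, score) pairs, best first."""
    # enumerate() pairs each score with its index, which plays the role
    # of the ImageNet class ID in this sketch.
    ranked = sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Toy stand-in for the model output: one confidence score per class ID.
# A real MobileNet v2 output would have 1000 entries.
scores = [0.18, 0.41, 0.02, 0.09, 0.05]

best_id, best_score = top_k(scores, k=1)[0]
print(f"Top class: {best_id} (score {best_score:.2f})")
```

The class with the highest score becomes the classification result; the remaining entries of `top_k` give the runner-up classes.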

Image classification on the edge

Let's run image classification on Synaptics Astra using the SynapRT example below:

python -m vision.image_class './samples/fish.jpg'

You should get the following result:

Class  Score  Label
-----------------------------
    1   0.41  goldfish, Carassius auratus
    0   0.18  tench, Tinca tinca
  392   0.09  rock beauty, Holocanthus tricolor
  395   0.05  gar, garfish, garpike, billfish, Lepisosteus osseus
   29   0.04  axolotl, mud puppy, Ambystoma mexicanum

Inference time: 14.79ms

You can look up a class ID in the ImageNet classes list. The model should have recognized the goldfish!
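If you want to work with the results in a script, each result row can be split into its class ID, score, and label. The sketch below assumes every row has the form "class score label" as in the output above; `parse_row` is a hypothetical helper and is not part of the SynapRT examples.

```python
def parse_row(line):
    """Split one result row into (class_id, score, label)."""
    # The label may itself contain spaces and commas, so split only
    # on the first two runs of whitespace.
    class_id, score, label = line.split(maxsplit=2)
    return int(class_id), float(score), label.strip()


# Two rows copied from the result above.
rows = [
    "1 0.41 goldfish, Carassius auratus",
    "0 0.18 tench, Tinca tinca",
]

parsed = [parse_row(row) for row in rows]
best = max(parsed, key=lambda row: row[1])
print(best)
```

The first element of `best` is the class ID you would look up in the ImageNet classes list.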

Classify another image

We can change a command line parameter to classify another image. Running:

python -m vision.image_class './samples/mountains.jpg'

Should output the result:

Class  Score  Label
-----------------------------
  980   0.34  volcano
  970   0.22  alp
   34   0.14  leatherback turtle, leatherback, leathery turtle, Dermochelys coriacea
  807   0.07  solar dish, solar collector, solar furnace
  915   0.06  yurt

Inference time: 14.75ms
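Because only the image path changes between runs, the same command can be repeated over several images from a small script. The following is a rough sketch, assuming it runs on the board from the same directory and environment as the commands above; `build_command` and `classify_all` are hypothetical helper names, not part of the examples.

```python
import subprocess


def build_command(image_path):
    """Build the classification command from this guide for one image."""
    return ["python", "-m", "vision.image_class", image_path]


def classify_all(image_paths, runner=subprocess.run):
    """Run the example once per image and return its text output per path.

    `runner` defaults to subprocess.run; it is a parameter only so the
    function can be exercised without the board.
    """
    outputs = {}
    for path in image_paths:
        result = runner(
            build_command(path),
            capture_output=True,  # collect stdout instead of printing it
            text=True,            # decode the output as str
            check=True,           # raise if the command fails
        )
        outputs[path] = result.stdout
    return outputs
```

On the board, `classify_all(["./samples/fish.jpg", "./samples/mountains.jpg"])` would run the example for both sample images and return the printed results keyed by path.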

The model correctly identifies this as a volcano. (It's Ojos del Salado in Chile, the highest volcano in the world.)

Input sources

So far, we've been running image classification on single image examples. SynapRT pipelines can also accept other input sources: *.jpg images, *.mp4 videos, a /dev/video camera feed, or even an RTSP IP camera.
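To make the four kinds of input concrete, here is a small sketch that labels a source string by its form. It is purely illustrative: `source_kind` is a hypothetical helper, and it does not reflect how SynapRT itself decides how to open an input.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def source_kind(source):
    """Guess which kind of input a source string refers to."""
    if urlparse(source).scheme == "rtsp":
        return "rtsp"      # e.g. an IP camera stream
    if source.startswith("/dev/video"):
        return "camera"    # e.g. a local camera device node
    suffix = PurePosixPath(source).suffix.lower()
    if suffix in {".jpg", ".jpeg"}:
        return "image"
    if suffix == ".mp4":
        return "video"
    return "unknown"


# Placeholder sources; the video path, camera index, and RTSP address
# are made up for illustration.
for src in ["./samples/fish.jpg", "clip.mp4", "/dev/video0", "rtsp://192.0.2.1/stream"]:
    print(src, "->", source_kind(src))
```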

Image classification is not designed to handle multiple subjects or identify where they are in an image. For that we need Object Detection. In the next Quick guide, you will be doing Object Detection.
