Rosella       Machine Intelligence & Data Mining

Computer Vision Machine Learning for Raspberry Pi SBC

Computer vision based on machine learning is a very promising application area for Raspberry Pi single-board computers (SBCs). Intelligent surveillance and monitoring functions can be embedded into Raspberry Pi applications, bringing inference to the edge. To see what applications you can develop, you first need to understand what computer vision models can do. Current computer vision technology has limitations, so only certain kinds of applications are practical. Common computer vision model types include:

  • Image classification: Given an image, this returns the probability of each object class. For example, bird/35%, car/15%, horse/5%, etc. The class with the highest probability is taken as the object class/type of the image.
  • Image regression: Given an image, this returns one or more numerical output values. For example, probability of being cancerous, temperature, left/right moves, etc.
  • Object detection: Given an image, this returns bounding box information for each detected object, such as probability, X/Y coordinates, width, and height. It can also provide the class/type of each detected object.
  • Similarity regression of two images: Given two images, this returns a similarity probability for the pair. One application is face recognition, deciding whether two images show the same person.
  • Stereoscopic regression, such as distance measurement.
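
As a minimal sketch of how a classification result is typically consumed, the function below picks the most probable class from a list of per-class probabilities. The function name `topClass` and the use of `std::vector<float>` are assumptions for illustration, not part of any CMSR API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Returns the index of the most probable class (the "top-1" prediction).
// `probabilities` holds one value per class, in class-index order.
// Hypothetical helper for illustration only.
std::size_t topClass(const std::vector<float>& probabilities) {
    assert(!probabilities.empty());
    auto best = std::max_element(probabilities.begin(), probabilities.end());
    return static_cast<std::size_t>(std::distance(probabilities.begin(), best));
}
```

With the example figures above (bird 35%, car 15%, horse 5%), calling `topClass({0.35f, 0.15f, 0.05f})` selects index 0, the bird class.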

Vehicle/Car Detection Convolutional Neural Network

Person Detection Convolutional Neural Network

Computer Vision Modeling

Convolutional Neural Networks (CNNs) are used to model computer vision tasks. To do computer vision modeling, a fairly good understanding of CNNs is essential. Computer vision model development involves the following stages;

Computer vision machine learning is a very complex process. You need machine learning software that is powerful yet easy to learn and use. CMSR Machine Learning Studio is a fully GUI-based machine learning model development platform. Models are trained without any coding. You write code only when you embed CMSR-generated model code into your application projects, and even then you just call a model function from your main program. It provides the following types of computer vision modeling tools:

  • CNN: Convolutional neural network for image classification and class probability.
  • FCN: Fully convolutional network for image classification and class probability. Same as CNN.
  • M-CNN: Multi-value output CNN. It's a regression modeling algorithm.
  • OD-CNN: Object detection CNN. It detects objects and provides bounding box information. This is very similar to YOLO. You can develop your own YOLO models with this.
  • T-CNN: Twin CNN for similarity prediction, for tasks such as face recognition.
  • S-CNN: (Experimental) Measures distance from stereoscopic images.

Free Codingless Computer Vision Development / Modeling Software Download

A free download of CMSR Machine Learning Studio is available for computer vision developers, with free (limited) technical support.

For free downloads, please visit CMSR Download/Install.

A Computer with a Powerful GPU and Large RAM is Essential!

Computer vision is extremely compute-intensive. Training in particular takes a huge amount of computing time, even on powerful computers. You cannot do it on a Raspberry Pi SBC! A computer with a powerful GPU and large RAM is essential, and high-performance gaming computers are ideal for machine learning. This doesn't prevent you from developing computer vision models on lesser hardware, though, since you can develop small models. More GPU shader cores are better, as CMSR Studio can take advantage of larger numbers of shader cores such as Nvidia CUDA cores. CMSR employs fine-grained data parallelism. On 896 Nvidia CUDA cores, we observed model training 165 times faster than on a single CPU core. With more CUDA cores, it can be over 1,000 times faster.

Model training is done with images in randomized order. Otherwise, models become skewed towards the later images. To read images in random order, all training images must be held in main memory; otherwise training will be extremely slow. You can estimate the total RAM needed, in bytes, with the following formula:

   total image dataset size = ((image width) * (image height) * 3 + 3) * (number of images)

This should be your maximum image dataset size. Your computer's RAM should be about twice this size, as the OS will also use RAM. If you don't have large memory, you will have to be content with small training datasets. Note that this is CPU RAM. GPU VRAM is a separate matter: it can be much smaller, as GPU VRAM stores only model parameters and some extras.
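
The estimate can be worked through in a few lines of C++. This is a sketch of the formula and the "about twice" rule of thumb stated above; the function names `datasetBytes` and `suggestedRamBytes` are made up for this example.

```cpp
#include <cassert>
#include <cstdint>

// Bytes needed to hold the whole training set in memory, following the
// formula in the text: ((width * height * 3) + 3) * number of images.
std::uint64_t datasetBytes(std::uint64_t width, std::uint64_t height,
                           std::uint64_t imageCount) {
    assert(width > 0 && height > 0);
    return (width * height * 3 + 3) * imageCount;
}

// Rule of thumb from the text: the computer should have about twice the
// dataset size in RAM, because the OS needs memory as well.
std::uint64_t suggestedRamBytes(std::uint64_t width, std::uint64_t height,
                                std::uint64_t imageCount) {
    return 2 * datasetBytes(width, height, imageCount);
}
```

For example, 100,000 images of 64 x 64 pixels need (64 * 64 * 3 + 3) * 100,000 = 1,229,100,000 bytes, or roughly 1.2 GB, so about 2.5 GB of RAM would be a reasonable target.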

Model Code Generation for Embedded Applications

Forget about ChatGPT! CMSR ML Studio can generate highly efficient AI/ML code.

CMSR ML Studio makes embedding into applications easy. Just generate the program code, compile it with your project code, and call a function from your main code. CMSR can generate the following types of code:

  • Single CPU thread: Java, C, Swift.
  • Multicore CPU: C++, Java.
  • GPU: OpenCL (Java, C++), Cuda (Java, C++), OpenGL ES3 (C++), Metal (Swift, Objective-C).

Generated programs incorporate pre- and post-processing as follows:

  • Color inversion.
  • Color transformation and value encoding.
  • Histogram equalization.
  • Object filtering and duplicate removal.
  • Sorting.
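
To give a feel for what the object filtering, duplicate removal, and sorting steps involve, here is a minimal sketch of one common approach, greedy non-maximum suppression based on intersection over union (IoU). It is not taken from CMSR-generated code; the `Detection` struct, the function names, and the thresholds are all assumptions for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical detection record: top-left corner, size, score, and class.
struct Detection {
    float x, y, width, height;
    float probability;
    int classIndex;
};

// Intersection over union of two boxes, in the range [0, 1].
float iou(const Detection& a, const Detection& b) {
    float left = std::max(a.x, b.x);
    float top = std::max(a.y, b.y);
    float right = std::min(a.x + a.width, b.x + b.width);
    float bottom = std::min(a.y + a.height, b.y + b.height);
    float overlap = std::max(0.0f, right - left) * std::max(0.0f, bottom - top);
    float unionArea = a.width * a.height + b.width * b.height - overlap;
    return unionArea > 0.0f ? overlap / unionArea : 0.0f;
}

// Drops weak detections, sorts the rest by descending probability, and
// removes boxes that overlap a stronger box of the same class too much.
std::vector<Detection> filterDetections(std::vector<Detection> detections,
                                        float minProbability, float maxIou) {
    assert(maxIou >= 0.0f && maxIou <= 1.0f);
    detections.erase(
        std::remove_if(detections.begin(), detections.end(),
                       [minProbability](const Detection& d) {
                           return d.probability < minProbability;
                       }),
        detections.end());
    std::sort(detections.begin(), detections.end(),
              [](const Detection& a, const Detection& b) {
                  return a.probability > b.probability;
              });
    std::vector<Detection> kept;
    for (const Detection& candidate : detections) {
        bool duplicate = false;
        for (const Detection& stronger : kept) {
            if (stronger.classIndex == candidate.classIndex &&
                iou(stronger, candidate) > maxIou) {
                duplicate = true;
                break;
            }
        }
        if (!duplicate) kept.push_back(candidate);
    }
    return kept;
}
```

A call such as `filterDetections(rawDetections, 0.5f, 0.5f)` keeps only detections with at least 50% probability and treats same-class boxes with an IoU above 0.5 as duplicates; both thresholds would need tuning for a real application.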

For a generated program example, please see CMSR Generated Program Example: C++ OpenCL.

Python developers will need to use Python-C++ bindings. Pure Python code would be 80 times slower than C++ code, so developing pure Python code generation is pointless. If OpenCL is supported, C++ OpenCL is recommended. Otherwise the multicore version should be used.

Generated Code Performance on Raspberry Pi 4

For the Raspberry Pi 4B, the best option is C++ with 4 CPU threads, which gives the maximum speed. We tested performance on a Raspberry Pi 4B with 4 GB of RAM. With a single CPU thread, a 25-million-parameter object detection deep neural network takes 87 seconds to complete. With 4 CPU threads, it takes 27 seconds. That is roughly a 70% reduction in elapsed time, about 5 percentage points short of the ideal 75% reduction. If you use small neural networks with fewer than 1 million parameters, evaluation takes only about a second or less, which is not bad performance. So keeping the neural network small is the way to go. Note that elapsed time is roughly proportional to the number of trainable parameters.
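
The arithmetic behind these figures can be checked with a short sketch. The helper names below are invented for this example, and the linear scaling in `estimateSeconds` simply encodes the rough proportionality mentioned above, so treat its output as a ballpark figure only.

```cpp
#include <cassert>

// Summary of a single-thread versus multi-thread timing comparison.
struct ParallelStats {
    double speedup;               // single-thread time / multi-thread time
    double timeReductionPercent;  // how much shorter the multi-thread run is
    double efficiencyPercent;     // speedup relative to the thread count
};

ParallelStats parallelStats(double singleThreadSeconds,
                            double multiThreadSeconds, int threads) {
    assert(singleThreadSeconds > 0 && multiThreadSeconds > 0 && threads > 0);
    ParallelStats s{};
    s.speedup = singleThreadSeconds / multiThreadSeconds;
    s.timeReductionPercent =
        100.0 * (1.0 - multiThreadSeconds / singleThreadSeconds);
    s.efficiencyPercent = 100.0 * s.speedup / threads;
    return s;
}

// Scales a measured time linearly with the number of trainable parameters.
double estimateSeconds(double measuredSeconds, double measuredParameters,
                       double targetParameters) {
    assert(measuredParameters > 0);
    return measuredSeconds * targetParameters / measuredParameters;
}
```

With the measurements quoted above, `parallelStats(87, 27, 4)` gives a speedup of about 3.2, a time reduction of about 69%, and a parallel efficiency of about 81%. Under the linear assumption, `estimateSeconds(27, 25e6, 1e6)` comes to about 1.1 seconds for a 1-million-parameter network on 4 threads, which is in line with the "about a second" figure.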

The major drawback of Raspberry Pi SBCs is the lack of GPU compute support such as OpenCL and OpenGL ES3. Because of this, performance of computer vision tasks will suffer. So we have to rely on multiple CPU cores.

Embedding Models into Applications

CMSR-generated code is highly efficient, and in edge computing, efficiency and speed are among the most important factors. All you need to write is the code that actually uses the model. The following code shows how to use a CNN classification model in C++. You create a couple of arrays to receive the results, then call the main "evaluate" function with parameters. You can call "evaluate" as many times as you need in your application. Of course, your application should get its image data from the onboard camera!

#include <iostream>
#include "CMSRModel.hpp"
using namespace std;

int main(void) {
	char filename[] = "data/modelfile.cnn";
	char imagefile[] = "data/cnnimages.rgb";

	int IMAGEARRAY[64*64*3];
	int outLabelCount = 4;
	int outLabelIndices[5];
	float outLabelProbabilities[5];
	int blackandwhite = 0;
	int r0g1b2 = 1;

	CMSRModel *model = new CMSRModel();
	model->verbose = true;

	// initialize model;
	model->initializeModel(4, filename);

	// The following steps can be repeated many times;
	model->populateImageArray((int*)IMAGEARRAY, imagefile, 64*64*3); // you can get data from camera!
	model->evaluate (
		outLabelCount, /* result label count */
		outLabelIndices, /* ordered result output label indices */
		outLabelProbabilities, /* ordered result output label relative probabilities */
		blackandwhite, /* 1 if black and white, otherwise 0 */
		r0g1b2,        /* 1 if IMAGEARRAY[][][0] is red, otherwise 0 */
		IMAGEARRAY   /* [row/height][column/width][colors] */
	);

	cout << "Results:\n";
	for (int i=0; i < outLabelCount; i++) {
		cout << i << ": " << outLabelIndices[i] << " / " << outLabelProbabilities[i] << "\n";
	}

	// release memory resources;
	delete model;

	cout << "End.\n";

	return 0;
}
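
To feed camera frames instead of a file, the frame has to be copied into an int array with the [row][column][color] layout noted in the listing above. The helper below is a minimal sketch of that step. It assumes the camera delivers interleaved 8-bit RGB data already scaled to the model's input size, and that the model expects values from 0 to 255; neither assumption is confirmed by the listing, and `toImageArray` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Copies an interleaved 8-bit RGB frame (R, G, B, R, G, B, ...) stored row
// by row into an int array laid out as [row][column][color].
// Assumes the frame already has the model's input size, e.g. 64 x 64.
std::vector<int> toImageArray(const std::uint8_t* rgbBytes, int width, int height) {
    assert(rgbBytes != nullptr && width > 0 && height > 0);
    const std::size_t count = static_cast<std::size_t>(width) * height * 3;
    std::vector<int> image(count);
    for (std::size_t i = 0; i < count; ++i) {
        image[i] = rgbBytes[i];  // widen each 0..255 byte to int
    }
    return image;
}
```

The resulting `image.data()` pointer could then be passed in place of IMAGEARRAY, with r0g1b2 set to 1 because red comes first in this layout.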