Gesture Recognition with Edge Impulse and Arduino

How to develop a TinyML model on Arduino Giga R1 and display the result on the Giga Display with LVGL

Leonardo Cavagnis
9 min read · Apr 3, 2024

In this tutorial, you’ll be guided through the process of creating a gesture recognition system capable of identifying two distinct movements. The gestures are performed by physically moving the device, and the result is immediately shown on its display.

We’ll classify a vertical waving motion from top to bottom (updown gesture) and a horizontal waving motion from right to left (rightleft gesture). These gestures will simulate the action of cracking open an egg displayed on the screen.

If the recognized gesture is updown, the eggshell will crack open upwards. If the recognized gesture is rightleft, the egg will crack open towards the right.

Gestures and display. Image by author.

Project components

The project will be implemented using an Arduino Giga R1 WiFi, a powerful board designed for advanced embedded applications. It features a high-performance Arm Cortex-M7 processor, suitable for Machine Learning applications.

Additionally, the Giga R1 will be equipped with a Giga Display Shield, which offers a display along with additional features such as a digital microphone, and a 6-axis IMU.

The machine learning model will be created using Edge Impulse, a platform for building and deploying TinyML models on embedded devices.

Furthermore, display management will be carried out using the LVGL library, an open-source graphics library designed for embedded systems, providing an easy and versatile framework for creating GUIs on microcontrollers.

Arduino GIGA R1 WiFi and GIGA Display Shield. Image by Arduino Official Store.

Project steps

To create a machine learning model for Arduino, the process will involve the following steps:

  1. Data collection: Collect relevant data samples that represent the problem.
  2. Data preprocessing: Clean, preprocess, and format the raw data for training.
  3. Model selection: Choose a suitable ML algorithm, model architecture, and settings.
  4. Model training: Train the selected model using the preprocessed data.
  5. Deployment: Deploy the trained model on the Arduino for inference.
Project steps. Image by author.

Data collection

The first phase involves collecting data related to the two movements (updown and rightleft) to be recognized. For each gesture, acceleration and gyroscope data are acquired from the IMU sensor embedded in the display shield.
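The sketch snippets below read the sensor through an imu object whose initialization is not shown. As an assumption (not part of the original snippets), a minimal setup using the Arduino_BMI270_BMM150 library for the BMI270 on the Display Shield could look like this:

#include "Arduino_BMI270_BMM150.h"

// Assumption: the shield's IMU is reachable on Wire1; adjust if your setup differs.
BoschSensorClass imu(Wire1);

void setup() {
  Serial.begin(115200);
  while (!Serial) ;

  if (!imu.begin()) {
    Serial.println("Failed to initialize IMU!");
    while (1) ;
  }
}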

An Arduino sketch has been developed to capture each movement for 1 second at a sampling frequency of 100 Hz: every 10 ms it reads 3 acceleration values and 3 gyroscope values, resulting in 100 samples (600 values) per movement.

#define SAMPLING_FREQ_HZ   100
#define SAMPLING_PERIOD_MS (1000 / SAMPLING_FREQ_HZ)
#define NUM_SAMPLES        100

//...
for (int i = 0; i < NUM_SAMPLES; i++) {
  timestamp = millis();
  imu.readAcceleration(acc_x, acc_y, acc_z);
  imu.readGyroscope(gyr_x, gyr_y, gyr_z);
  //...
  // Wait until the next sampling period elapses
  while (millis() < timestamp + SAMPLING_PERIOD_MS) ;
}

To trigger movement acquisition, we will use an acceleration threshold. When the acceleration goes beyond a certain threshold (which has been empirically calibrated during testing), the acquisition will start.

void loop() {
  //...
  if (imu.accelerationAvailable()) {
    imu.readAcceleration(acc_x, acc_y, acc_z);

    // Sum of the absolute acceleration on the three axes
    float aSum = fabs(acc_x) + fabs(acc_y) + fabs(acc_z);
    if (aSum >= ACC_THRESHOLD_G) {
      moveDetect = true;
    } else {
      moveDetect = false;
    }
  }
  //...
}

The output is printed to the serial monitor in CSV format and copied into a text file. Each CSV file represents a single movement.

Serial.println("timestamp,accX,accY,accZ,gyrX,gyrY,gyrZ");

for (int i = 0; i < NUM_SAMPLES; i++) {
  //...
  Serial.print(timestamp - start_timestamp);
  Serial.print(",");
  Serial.print(acc_x);
  Serial.print(",");
  Serial.print(acc_y);
  Serial.print(",");
  Serial.print(acc_z);
  Serial.print(",");
  Serial.print(gyr_x);
  Serial.print(",");
  Serial.print(gyr_y);
  Serial.print(",");
  Serial.println(gyr_z);
  //...
}
Serial monitor output. Image by author.

Each movement is repeated many times to build a sufficiently large dataset for training: the more samples acquired, the more accurate the resulting predictive system will be.

CSV files. Image by author.

Upload data on Edge Impulse

To create your first project on Edge Impulse, visit the website https://studio.edgeimpulse.com/, create an account, and click the “+ Create new project” button.

Edge Impulse: Create project. Image by author.
Edge Impulse: Project Dashboard. Image by author.

Go to the “Data acquisition” section → “+ Add data” → “Upload data” to load your CSV files.

Upload data settings. Image by author.

Leave the default settings so that Edge Impulse automatically infers the movement label from the file name (remember to name each CSV file as “label.progressivenumber”), and let it randomly split the samples into training (80%) and testing (20%) sets.
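For reference, with the two gestures of this project the uploaded files would be named as sketched below (sample numbers are only an illustration); each file contains the header printed by the acquisition sketch followed by the 100 captured rows:

updown.1.csv
updown.2.csv
rightleft.1.csv
rightleft.2.csv
...

timestamp,accX,accY,accZ,gyrX,gyrY,gyrZ
...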

Data acquisition dashboard. Image by author.

Training data is used to teach the model during the training process, while testing data is used to evaluate the model’s performance.

Model training

To train the model, go to the “Impulse design” section.

Impulse design section. Image by author.

Model selection

In this context, the tool automatically recognizes that the data are time series and extracts the sampling frequency and window size (to facilitate this, we included the acquisition timestamp in the dataset).

After determining the nature of the data, it is important to perform feature extraction: a process where raw data is transformed into a set of features. These features capture relevant characteristics of the data, making it easier for ML algorithms to learn and make accurate predictions.

Extracting meaningful features from the data is crucial, and in Edge Impulse this is done through processing blocks.
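As a toy illustration of the idea (not Edge Impulse’s actual implementation), a single feature could be the RMS of one accelerometer axis over the 1-second window, condensing 100 raw samples into one number:

#include <math.h>

// Toy feature: root mean square of one axis over the capture window.
// The Spectral Analysis block computes its own, richer set of statistical
// and frequency-domain features; this only shows what "a feature" means.
float rms(const float *samples, size_t n) {
  float sumSquares = 0.0f;
  for (size_t i = 0; i < n; i++) {
    sumSquares += samples[i] * samples[i];
  }
  return sqrtf(sumSquares / n);
}

// e.g. float accX_rms = rms(accX_window, NUM_SAMPLES);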

Processing blocks. Image by author.

Based on its description, the most suitable processing block for this project is Spectral Analysis. This block is particularly useful for extracting frequency information from signals that vary over time.

After extracting features from the raw signal, you can train your model using a learning block.
A learning block is a neural network that learns from your data. In this context, the most suitable is Classification, which is specifically designed to assign input data to predefined categories (such as updown and rightleft) based on the extracted features.

Learning blocks. Image by author.
Create impulse dashboard. Image by author.

Feature extraction

After configuring the model through the “Create impulse” section, it’s time to run the first processing step: Spectral Analysis.
Navigate to the “Spectral features” section, keep the default settings, and click “Save parameters”, followed by “Generate features”.

Feature extraction. Image by author.

Model creation

Now that the feature extraction phase is complete, it’s time to train the model. Navigate to the “Classifier” section, keep the default settings, and press “Start training”.

The effectiveness of a model is evaluated using the accuracy metric: the fraction of predictions the model gets right (correct predictions divided by total predictions). At the end of the training phase, take a look at the accuracy and check whether it meets your requirements (a good model typically achieves 90% or higher).

Training: first attempt. Image by author.

At the first training iteration, we achieved an accuracy of 75%, which is not yet sufficient, so we make a second attempt. To improve accuracy, we can increase the number of neurons in the dense layers of the neural network (e.g., 30 → 35) and/or the number of training cycles (e.g., 30 → 40). Note that increasing the number of neurons results in a larger model in terms of memory footprint, while increasing the training cycles leads to longer training times.

Training: second attempt. Image by author.

At the second attempt, we achieved an accuracy of 100%, which is satisfactory.

Deployment

Now, you’re ready to create the model for exporting to Arduino.

Go to the “Deployment” section, select “Arduino library” as the deployment option, and click “Build”.

Deployment dashboard. Image by author.

At the end of the building process, an Arduino library named “Egg-breaker_inferencing” will be created and downloaded.

The library consists of the model translated into .c and .h files (located in the /src folder) and a series of examples (located in the /examples folder). The examples target the different boards supported by Edge Impulse. Currently, there isn’t a specific example for the Arduino Giga R1, but you can start by adapting the Arduino Nano 33 BLE Sense examples.

Edge Impulse Arduino library structure. Image by author.

To use the library, move it into the Arduino/libraries folder on your computer and add #include "Egg-breaker_inferencing.h" to your sketch.
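As a quick sanity check that the library is installed correctly, a minimal sketch (assuming the library generated above) can include the header and print a few of the constants Edge Impulse generates for the model:

#include "Egg-breaker_inferencing.h"

void setup() {
  Serial.begin(115200);
  while (!Serial) ;

  // Constants generated by Edge Impulse for the trained impulse
  Serial.print("Input frame size: ");
  Serial.println(EI_CLASSIFIER_DSP_INPUT_FRAME_SIZE);
  Serial.print("Sampling interval (ms): ");
  Serial.println(EI_CLASSIFIER_INTERVAL_MS);
  Serial.print("Number of labels: ");
  Serial.println(EI_CLASSIFIER_LABEL_COUNT);
}

void loop() { }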

The prediction sketch

The sketch is structured similarly to the one used for data acquisition. When a movement is detected via the accelerometer threshold, acceleration and gyroscope data are acquired for 1 second and then fed to the prediction model, which returns a result indicating whether the movement is updown, rightleft, or neither of the two.

  • Sensor data reading and buffer filling
if (moveDetect) {
  // Allocate a buffer here for the values we'll read from the IMU
  float buffer[EI_CLASSIFIER_DSP_INPUT_FRAME_SIZE] = { 0 };

  for (size_t ix = 0; ix < EI_CLASSIFIER_DSP_INPUT_FRAME_SIZE; ix += 6) {
    // Determine the next tick (and then sleep later)
    uint64_t next_tick = micros() + (EI_CLASSIFIER_INTERVAL_MS * 1000);

    imu.readAcceleration(buffer[ix + 0], buffer[ix + 1], buffer[ix + 2]);
    imu.readGyroscope(buffer[ix + 3], buffer[ix + 4], buffer[ix + 5]);
    // ...

    delayMicroseconds(next_tick - micros());
  }
  • Extract features from raw data
  // Turn the raw buffer into a signal which we can then classify
  signal_t signal;
  int err = numpy::signal_from_buffer(buffer, EI_CLASSIFIER_DSP_INPUT_FRAME_SIZE, &signal);
  if (err != 0) {
    ei_printf("Failed to create signal from buffer (%d)\n", err);
    return;
  }
  • Run prediction model
  // Run the classifier
  ei_impulse_result_t result = { 0 };

  err = run_classifier(&signal, &result, false);
  if (err != EI_IMPULSE_OK) {
    ei_printf("ERR: Failed to run classifier (%d)\n", err);
    return;
  }
  • Print the prediction results
  // Print the predictions
  ei_printf("Predictions ");
  ei_printf("(DSP: %d ms., Classification: %d ms., Anomaly: %d ms.)",
            result.timing.dsp, result.timing.classification, result.timing.anomaly);

  for (size_t ix = 0; ix < EI_CLASSIFIER_LABEL_COUNT; ix++) {
    ei_printf(" %s: %.5f\n", result.classification[ix].label,
              result.classification[ix].value);
  }

The prediction model assigns each category a probability between 0 and 1: the higher the value, the more likely it is that the corresponding gesture was performed.

Predictions output. Image by author.
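A common way to turn these probabilities into a single decision is to pick the highest-scoring label and accept it only above a confidence threshold. The helper below is a hypothetical sketch of that logic (not part of the original code), using the result structure returned by run_classifier:

// Hypothetical helper: index of the most probable label,
// or -1 if no label reaches the confidence threshold.
int topPrediction(const ei_impulse_result_t &result, float threshold) {
  int best = -1;
  float bestValue = threshold;
  for (size_t ix = 0; ix < EI_CLASSIFIER_LABEL_COUNT; ix++) {
    if (result.classification[ix].value >= bestValue) {
      bestValue = result.classification[ix].value;
      best = (int)ix;
    }
  }
  return best;
}

// Usage:
// int idx = topPrediction(result, 0.9f);
// if (idx >= 0) {
//   ei_printf("Detected gesture: %s\n", result.classification[idx].label);
// }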

Print result on the display

To manage the Arduino Giga Display Shield, we will use the Arduino_H7_Video library (included in the Mbed OS GIGA Board Package) and LVGL.

#include "Arduino_H7_Video.h"
#include "Arduino_GigaDisplayTouch.h"

#include "lvgl.h"

Arduino_H7_Video Display(800, 480, GigaDisplayShield);
Arduino_GigaDisplayTouch TouchDetector;

void setup() {
//...
Display.begin();
TouchDetector.begin();
//...
}

In LVGL, an image can be handled either as a variable in internal memory or as a file on external storage (e.g., an SD card).
Since we only need a few small images, we use the variable approach: the images are stored in the project as C files.
To convert an image into a C file, use the LVGL online image converter.

Image C files. Image by author.
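The converter outputs a C file containing the pixel data plus an image descriptor that the sketch references by name. As an assumption (field names and color-format constants differ between LVGL versions, and the dimensions below are only placeholders), the generated descriptor looks roughly like this for LVGL v8:

#include "lvgl.h"

// Pixel data generated by the converter (contents omitted here)
static const uint8_t img_egg_map[] = {
  0x00 /* ...pixel bytes generated by the converter... */
};

// Image descriptor referenced from the sketch via LV_IMG_DECLARE(img_egg)
const lv_img_dsc_t img_egg = {
  .header.cf = LV_IMG_CF_TRUE_COLOR_ALPHA,
  .header.w = 200,   // placeholder width
  .header.h = 200,   // placeholder height
  .data_size = sizeof(img_egg_map),
  .data = img_egg_map,
};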

The images are shown on the display according to the gesture detected by the model: the “open up egg” image if the movement is the updown gesture, the “open right egg” image if it is the rightleft gesture, and the closed-egg image if no movement is recognized.

A gesture is considered valid only if its probability exceeds 0.9.

lv_obj_t * img1;

void setup() {
  //...
  img1 = lv_img_create(lv_screen_active());

  LV_IMG_DECLARE(img_egg);
  lv_img_set_src(img1, &img_egg);
  lv_obj_align(img1, LV_ALIGN_CENTER, 0, 0);
  lv_obj_set_size(img1, 200, 200);
}

void loop() {
  //...
  if (result.classification[ix].value > 0.9) {
    if (strcmp(result.classification[ix].label, "updown") == 0) {
      LV_IMG_DECLARE(img_egg_openup);
      lv_img_set_src(img1, &img_egg_openup);
      lv_obj_align(img1, LV_ALIGN_CENTER, 0, 0);
      lv_obj_set_size(img1, 200, 200);
    } else if (strcmp(result.classification[ix].label, "rightleft") == 0) {
      LV_IMG_DECLARE(img_egg_openright);
      lv_img_set_src(img1, &img_egg_openright);
      lv_obj_align(img1, LV_ALIGN_CENTER, 0, 0);
      lv_obj_set_size(img1, 200, 200);
    }
  }
  //...
}
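One last detail: LVGL needs its timer handler called periodically to refresh the screen. Assuming that applies to this setup as well (the call is presumably part of the elided code in loop() above), it would look like this:

void loop() {
  //...

  // Let LVGL process its internal timers and redraw the display
  lv_timer_handler();
  delay(5);
}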

Code
