Machine Learning: Basics

A beginner’s guide to the science of intelligent machines

Leonardo Cavagnis
7 min read · Apr 7, 2021

Intelligence is the ability to adapt to change.

Intelligent machines are everywhere.
We are living in an era where cars move autonomously around our cities, intelligent home speakers help us with housework, vacuum robots clean our rooms, and so on.
But what do all these machines have in common?
They emulate human thought and actions.

What is Machine Learning?

Machine Learning (ML) is a discipline that studies how to build computer systems that can make predictions based on experience.

Suppose you want to build a vacuum cleaner robot that cleans your home and can recognize your pet, so that it does not approach or scare it.

A simple approach could be to give the robot a photo of your dog.
As the robot moves around your home, it compares everything its camera captures with that photo and, if there is a match, it keeps its distance from the dog.

However, this approach does not work in every case.
For example, if you lend your robot to a friend who also owns a dog, the photo stored in the robot’s memory has to be replaced.

This is an example of a traditional algorithm that carries out a series of predefined steps: acquiring the image framed by the camera and comparing it with a photo stored in memory.
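
The fixed-photo approach can be sketched in a few lines. This is only an illustration, not a real vision pipeline: images are reduced to tiny lists of grayscale pixel values, and the names `STORED_DOG_PHOTO`, `pixel_difference`, `is_my_dog` and the threshold value are all assumptions made for the example.

```python
# Hypothetical 2x2 grayscale "photos", flattened to lists of 0-255 intensities.
STORED_DOG_PHOTO = [200, 180, 60, 40]

def pixel_difference(image_a, image_b):
    """Mean absolute difference between two equally sized images."""
    return sum(abs(a - b) for a, b in zip(image_a, image_b)) / len(image_a)

def is_my_dog(camera_frame, threshold=20):
    """Predefined rule: the frame matches if it is close to the stored photo."""
    return pixel_difference(camera_frame, STORED_DOG_PHOTO) <= threshold

same_dog_frame = [198, 185, 55, 45]      # almost identical to the stored photo
friends_dog_frame = [90, 100, 150, 160]  # a different dog looks nothing like it

print(is_my_dog(same_dog_frame))     # True
print(is_my_dog(friends_dog_frame))  # False
```

The rule never changes: the robot recognizes only what resembles the one stored photo, which is why the friend’s dog goes undetected.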

But if you want to build a smarter robot, you will have to “train” it on “how to recognize a dog” and not just make a mere comparison. But how do you achieve this?

For instance, you can give the robot a series of photos of different dogs and let it learn the main characteristics of a dog (e.g., coat color and length, nose size).

This way, the robot will no longer perform a simple comparison between two photos, but it will check whether or not there is a dog in front of it based on what it has learned.
What has been described here is a machine that implements a Machine Learning algorithm: a system able to learn and improve itself by using data.
The data plays the role of the “experience”.
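
A minimal sketch of the learning approach, under stated assumptions: each photo is assumed to be already reduced to two hypothetical numeric features (coat length and nose size, in cm), and the learning method is a simple nearest-centroid classifier chosen here for brevity. The article does not prescribe a specific algorithm, and the data values are made up.

```python
# Hypothetical training data: (coat_length_cm, nose_size_cm) -> label.
TRAINING_DATA = [
    ((5.0, 4.0), "dog"),
    ((7.0, 5.0), "dog"),
    ((6.0, 4.5), "dog"),
    ((0.0, 0.0), "no dog"),  # e.g., a chair leg
    ((1.0, 0.5), "no dog"),  # e.g., a shoe
    ((0.5, 1.0), "no dog"),
]

def learn_centroids(examples):
    """'Training': average the feature vectors of each label."""
    sums, counts = {}, {}
    for features, label in examples:
        total = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            total[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}

def predict(centroids, features):
    """Return the label whose learned centroid is closest to the input."""
    def squared_distance(centroid):
        return sum((c - f) ** 2 for c, f in zip(centroid, features))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))

model = learn_centroids(TRAINING_DATA)
print(predict(model, (6.5, 4.8)))  # an unseen dog -> "dog"
print(predict(model, (0.8, 0.2)))  # an unseen object -> "no dog"
```

Unlike the fixed-photo rule, nothing about a particular dog is hard-coded: giving the function different examples yields a different model.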

Machine Learning is one of the many ways to build an “intelligent machine”: it is the field of Artificial Intelligence that overlaps with Data Science.

Machine Learning = Algorithm + Data

Artificial Intelligence (AI) is a branch of computer science concerned with building computer systems able to perform tasks that typically require human intelligence.

Data Science is a field of study that aims to use a scientific approach to extract meaning from data.

Learn from data

How can you instruct your machine to “learn from data”?

  • Input data
    ML algorithms learn from data, so you need a set of input data called training data (e.g., pictures of dogs of different breeds).
    The success of your intelligent machine depends on the quality and quantity of your training data.
  • Process data
    Before you start creating the ML model, you have to process your training data.
    Data processing aims to improve the quality of the data so that useful information can be extracted from it
    (e.g., removing the background from pictures).
  • Extract features
    This useful information is called features.
    A feature is an individual measurable property of a phenomenon being observed.
    (e.g., in image recognition a common feature is the RGB color decomposition)
  • Build model
    An ML algorithm is a procedure that is run on training data to create a model. A model represents what has been learned by an ML algorithm and is made up of the rules and mechanisms needed to make predictions.
    These algorithms are predictive and are derived from the data science field. Predictive algorithms use statistical techniques to analyze current data and make predictions about the future.
  • Evaluate model
    Now that your model is ready, you have to check how good it is.
    To do this, you use test data to evaluate the performance of the model against some metrics.
    The test dataset is different from the training dataset; it is given as input to the built model to check the correctness of its predictions.
    If the predictions are not accurate, you have to rebuild your model using a different prediction algorithm and/or different features. This procedure is repeated until your model is “good” enough.
  • Deploy model
    When the model has reached an appropriate level of accuracy, you are ready to deploy it.
    Deploying a model means translating it into a “computer program” and installing it on a machine.
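
The six steps above can be sketched end to end as follows. Everything here is an assumption made for illustration: the “images” are two-pixel lists of made-up RGB values, the features are the mean R, G and B values, the model is a nearest-centroid classifier, and the metric is plain accuracy. Real projects would typically use a dedicated ML library.

```python
# 1. Input data: hypothetical labeled "images", each a list of (R, G, B) pixels.
RAW_DATA = [
    ([(150, 100, 50), (160, 110, 60)], "dog"),
    ([(140, 90, 40), (150, 100, 50)], "dog"),
    ([(20, 20, 200), (30, 30, 210)], "no dog"),
    ([(10, 40, 190), (20, 30, 220)], "no dog"),
    ([], "dog"),  # a corrupted image with no pixels
    ([(155, 105, 55), (145, 95, 45)], "dog"),
    ([(25, 25, 205), (15, 35, 195)], "no dog"),
]

# 2. Process data: drop samples that cannot be used.
def process(samples):
    return [(pixels, label) for pixels, label in samples if pixels]

# 3. Extract features: mean R, G and B value of each image.
def extract_features(pixels):
    return [sum(p[channel] for p in pixels) / len(pixels) for channel in range(3)]

# 4. Build model: learn one average feature vector (centroid) per label.
def build_model(examples):
    grouped = {}
    for features, label in examples:
        grouped.setdefault(label, []).append(features)
    return {
        label: [sum(column) / len(column) for column in zip(*rows)]
        for label, rows in grouped.items()
    }

def predict(model, features):
    def squared_distance(centroid):
        return sum((c - f) ** 2 for c, f in zip(centroid, features))
    return min(model, key=lambda label: squared_distance(model[label]))

# 5. Evaluate model: fraction of correct predictions on unseen test data.
def accuracy(model, examples):
    correct = sum(predict(model, features) == label for features, label in examples)
    return correct / len(examples)

clean = [(extract_features(pixels), label) for pixels, label in process(RAW_DATA)]
train_set, test_set = clean[:4], clean[4:]  # test data is kept out of training
model = build_model(train_set)
test_accuracy = accuracy(model, test_set)
print(f"accuracy on test data: {test_accuracy:.2f}")

# 6. Deploy model: wrap it in a function the robot's program can call.
def robot_sees_dog(pixels):
    return predict(model, extract_features(pixels)) == "dog"
```

If the accuracy in step 5 were too low, steps 3 and 4 are the ones to revisit: different features, a different algorithm, or both.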

Identifying a dog in a picture is what, in a machine learning context, is called image recognition.
Image recognition is a supervised learning problem, which is one of the categories of machine learning problems.

Machine Learning problems

In the ML field, problems are divided into three categories:

  • Supervised learning problem
  • Unsupervised learning problem
  • Reinforcement learning problem

Supervised Learning

In a supervised learning problem, you train the machine using labeled data: for each input, you know the correct answer (e.g., pictures tagged “dog” or “no dog”). It is called “supervised” because learning takes place in the presence of a “supervisor”, the known answers, against which the predictions of the ML algorithm are repeatedly checked.

Supervised problems are divided into two classes:

  • Classification: the prediction is a category, as in an image recognition task (e.g., identify “dog” or “no dog” in a photo).
  • Regression: the prediction is a numeric value, such as an amount in euros or a weight (e.g., the euro-dollar exchange rate for the following day).
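
The difference between the two classes shows up in the type of the prediction. The sketch below is illustrative only: the data is made up, classification is done with a nearest-neighbor rule and regression with a least-squares line, both chosen for brevity rather than taken from the article.

```python
def classify_nearest(examples, x):
    """Classification: return the label of the closest labeled example."""
    _, label = min(examples, key=lambda example: abs(example[0] - x))
    return label

def fit_line(points):
    """Regression: least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in points)
    variance = sum((x - mean_x) ** 2 for x, _ in points)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x

# Hypothetical labeled data: animal weight in kg -> category.
labeled_weights = [(4.0, "cat"), (5.0, "cat"), (25.0, "dog"), (30.0, "dog")]
print(classify_nearest(labeled_weights, 27.0))  # the prediction is a category: dog

# Hypothetical labeled data: puppy age in months -> weight in kg.
age_to_weight = [(1, 2.0), (2, 4.0), (3, 6.0), (4, 8.0)]
slope, intercept = fit_line(age_to_weight)
print(slope * 5 + intercept)  # the prediction is a number: 10.0
```

In both cases the training data carries the correct answer for every input, which is what makes the problem supervised.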

Unsupervised Learning

In an unsupervised learning problem, you do not need to “supervise” the model because you do not know the correct answer. The input data is unlabeled, and the model works to discover patterns and information in it.
Unsupervised learning is generally harder than supervised learning, because there are no known answers to check the predictions against.

An example of an unsupervised problem is the recommender system of a video streaming platform.
Netflix does not know a viewer’s exact preferences, but it tries to predict which TV series and films the viewer might be interested in, based on what they have watched before.

The most important class of unsupervised problems is:

  • Clustering: dividing input data into groups such that objects in the same group (aka cluster) are similar to each other (e.g., Netflix groups its audience into categories: thriller TV series lovers, action film lovers, etc.).
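
Clustering can be sketched with k-means, one common clustering algorithm (an assumption here; the article does not name one, and this says nothing about how Netflix actually works). The viewers and their weekly viewing hours are made up, and the starting centroids are simply the first `k` points to keep the example deterministic.

```python
def k_means(points, k, iterations=10):
    """Plain k-means: alternate between assigning points and moving centroids."""
    centroids = [list(p) for p in points[:k]]  # simplistic, deterministic start
    assignments = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignments, centroids

# Hypothetical viewers: (thriller hours per week, action hours per week). No labels.
viewers = [(9, 1), (1, 8), (8, 2), (2, 9), (10, 0), (0, 10)]
assignments, centroids = k_means(viewers, k=2)
print(assignments)  # [0, 1, 0, 1, 0, 1]
print(centroids)    # [[9.0, 1.0], [1.0, 9.0]]
```

The algorithm is never told which viewer is a “thriller lover”; the two groups emerge from the similarity between the data points alone.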

Reinforcement Learning

In a Reinforcement Learning (RL) problem, there is no training on a prepared dataset (as in supervised and unsupervised problems); instead, the machine learns from its own mistakes.
It is based on the concept of rewards, with a goal to be achieved.
RL algorithms use an agent (i.e., a system) that explores an environment by taking actions in it. Every action modifies the state of the environment and is evaluated with a reward.
The agent’s mission is to reach the goal while maximizing the reward.

An example of an RL problem is the trajectory planning of a robot. Imagine a robot vacuum cleaner (the agent) that, at the end of its cleaning cycle in your home (the environment), has to return to the charging station (the goal).
The goal is to reach the charging station by taking the shortest path and avoiding obstacles.
At each step the robot takes, the RL algorithm (the observer) calculates the reward of the step (e.g., based on the distance to the target and the presence of obstacles) and updates the position on the map (the environment state).
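
A heavily simplified sketch of this idea uses tabular Q-learning, one well-known RL algorithm (chosen here as an assumption; the article does not name one). The home is reduced to a hypothetical corridor of five cells with the charging station at the far end, obstacles are left out, and the reward values and learning parameters are arbitrary.

```python
import random

N_CELLS = 5         # corridor cells 0..4 (hypothetical environment)
DOCK = 4            # position of the charging station (the goal)
ACTIONS = (-1, +1)  # move one cell left or one cell right

def step(position, action):
    """Environment: apply the action, return (new_position, reward, done)."""
    new_position = min(max(position + action, 0), N_CELLS - 1)
    if new_position == DOCK:
        return new_position, 10.0, True  # reaching the dock is rewarded
    return new_position, -1.0, False     # every other step has a small cost

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a table of action values by trial and error."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_CELLS) for a in ACTIONS}
    for _episode in range(episodes):
        position = 0
        for _step in range(100):  # cap on episode length
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                action = max(ACTIONS, key=lambda a: q[(position, a)])  # exploit
            new_position, reward, done = step(position, action)
            best_next = 0.0 if done else max(q[(new_position, a)] for a in ACTIONS)
            # Q-learning update: move the estimate toward reward + discounted future value.
            q[(position, action)] += alpha * (reward + gamma * best_next - q[(position, action)])
            position = new_position
            if done:
                break
    return q

q_table = train()
policy = [max(ACTIONS, key=lambda a: q_table[(s, a)]) for s in range(DOCK)]
print(policy)  # best learned action for cells 0..3
```

Because each step costs a little and only the dock pays off, the learned policy should be to move right (+1) from every cell, which is the shortest path to the charging station.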

Deep Learning

Last, but not least…
There is another important area of ML: Deep Learning.
It is a subfield of ML whose algorithms, called artificial neural networks, are inspired by the structure of the human brain.
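
To give an idea of what an artificial neural network computes, here is a minimal sketch of a two-layer network. The weights are hand-picked for illustration so that the network outputs 1 when exactly one of its two inputs is 1; in real deep learning they would be learned from data, and that training step is omitted here. All names are assumptions made for the example.

```python
def relu(x):
    """A common activation function: negative values become zero."""
    return max(0.0, x)

def dense_layer(inputs, weights, biases, activation):
    """One layer: each neuron computes activation(weighted sum of inputs + bias)."""
    return [
        activation(sum(w * x for w, x in zip(neuron_weights, inputs)) + bias)
        for neuron_weights, bias in zip(weights, biases)
    ]

# Hand-picked (not learned) parameters: 2 inputs -> 2 hidden neurons -> 1 output.
HIDDEN_WEIGHTS = [[1.0, 1.0], [1.0, 1.0]]
HIDDEN_BIASES = [0.0, -1.0]
OUTPUT_WEIGHTS = [[1.0, -2.0]]
OUTPUT_BIASES = [0.0]

def network(inputs):
    """Forward pass: the hidden layer builds intermediate values for the output layer."""
    hidden = dense_layer(inputs, HIDDEN_WEIGHTS, HIDDEN_BIASES, relu)
    return dense_layer(hidden, OUTPUT_WEIGHTS, OUTPUT_BIASES, relu)[0]

for pair in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(pair, network(pair))  # 0.0, 1.0, 1.0, 0.0
```

Real networks stack many such layers with thousands or millions of weights, which is where the “deep” in Deep Learning comes from.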

This technique is based on self-learning: features are not prepared by a human but are extracted from the input data by the model itself.

Compared to traditional ML, Deep Learning typically requires much more data and greater computing power, because its models are more complex.

Autonomous driving is one of the many applications of deep learning.
To drive properly, an autonomous car needs human-like experience and expertise. A large amount of real-world data is required for it to understand roads, pedestrians, road signs and so on.

Let’s play

Do you want to play with Machine Learning? Use TensorFlow!

TensorFlow is an open-source software library for machine learning.
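A lightweight version, TensorFlow Lite, is also available, including a variant for microcontrollers that runs on Arduino and other low-power boards. Running ML on such small devices is commonly known as TinyML.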

Using Arduino and TinyML, you can develop intelligent machines able to perform human-like activities. For instance, you can build your own speech recognition device (like Alexa) that performs a specific action (e.g., turning on a light).

Not enough?

If you want to better understand the basic concepts and find out more about ML, I recommend the free online MathWorks course: Machine Learning Onramp.

Leonardo Cavagnis

Passionate Embedded Software Engineer, IOT Enthusiast and Open source addicted. Proudly FW Dev @ Arduino