# Machine learning for interaction designers

## Introduction

### What will we learn in this workshop?

As the name of the workshop suggests, we'll talk about machine learning through the prism of media installations.

We won't go into much detail about the math behind it. This is actually the whole point of this workshop : there are tools out there that help us leverage the capabilities of machine learning without having to know too much about what's under the hood.

Through a series of practical examples (and a tiny bit of theory!), we'll try to understand what machine learning is, and how using it can help us in our everyday patching life, and even allow us to do things that would be very tedious to do without it.

### Making the machine learn : the $Q Recognizer

As you may have guessed, machine learning is about empowering your computer to learn things from examples. Let's start right away with a simple task : how can we teach our computer to recognize simple gestures?

- This is our first mission : we want to recognize 2D gestures. There's a nuget out there that can help us : `VL.2d.DollarQRecognizer`. Let's install it with the following command :

  ```
  nuget install VL.2d.DollarRecognizer -pre
  ```

- $Q Recognizer is an algorithm that's able to recognize 2D gestures. The idea is simple : for each gesture that you want your computer to recognize, you give it several examples so it can learn the concept of that specific gesture. Later on, once the computer has enough examples, it should be able to recognize a new input, even one it has never seen before, and tell which gesture it is.

  The help patch `Learn and recognize gestures` that comes with the library is the solution to this little exercise.


Think of it as a kid learning to recognize animals with a picture book. By looking at different pictures of cats (with different colors and sizes), a child will be able to recognize a cat in the street, even if it's not the exact same size and color as those seen in the book. This is because the child was able to **generalize** the concept of a cat, and can now recognize one, no matter what it looks like. On the other hand, when seeing a dog, the child will also know that this is not a cat, because it clearly does not have the same **features** a cat has.

Now let's go back to our gesture recognition thing, and ask ourselves a question : how would we have done this without a gesture recognizer? Well, if you wanted to recognize when someone draws the letter A, you could say that it's a shape made of three lines : two that touch each other at the top, and a third one that crosses them near the middle. But what if someone draws an A that's not sharp at the top, but rather rounded? What if the horizontal bar is very close to the bottom of the letter?

<img src="doc/a_shape.png" style="zoom:50%;" />

You can see where this is going : we can recognize all the above shapes as the letter A, but it's really hard to write a program that would take all those cases into account, and also recognize any other shape that should be an A but that's not drawn above.

And this is exactly where machine learning comes to the rescue : **rather than telling the computer how to find the solution to our problem, we give it the solution and let it figure out by itself what's necessary to solve the problem.** Here, the computer figures out by itself, thanks to the examples we gave it, what the concept of an A is, and generalizes it so that it can later recognize an A that's not part of its **training set**, just like kids do with pictures of animals.

In the end, the algorithm will produce a **model** that you can see as a black box that's tailored to recognize what we taught it.

Here's a simple representation of what's going on under the hood in a classic machine learning pipeline :

<img src="doc/simple_ml_pipeline.png" style="zoom:50%;" />
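To make this pipeline a bit more concrete, here is a minimal Python sketch of the same idea applied to gestures : labelled examples go in, a model comes out, and the model is then asked about strokes it has never seen. This is a deliberately simplified nearest-template recognizer, not the actual $Q algorithm nor the API of the vvvv library. The function names (`resample`, `normalize`, `train`, `recognize`) and the example strokes are made up for illustration.

```python
from math import dist


def resample(points, n=16):
    """Return n points evenly spaced along the drawn path (needs >= 2 points)."""
    cum = [0.0]  # cumulative path length at each input point
    for a, b in zip(points, points[1:]):
        cum.append(cum[-1] + dist(a, b))
    total = cum[-1]
    out, seg = [], 0
    for k in range(n):
        target = total * k / (n - 1)
        # Move to the segment that contains the target length
        while seg < len(points) - 2 and cum[seg + 1] < target:
            seg += 1
        span = cum[seg + 1] - cum[seg]
        t = 0.0 if span == 0 else (target - cum[seg]) / span
        (x0, y0), (x1, y1) = points[seg], points[seg + 1]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out


def normalize(points, n=16):
    """Resample, center on the centroid and scale to a unit-sized box."""
    pts = resample(points, n)
    cx = sum(x for x, _ in pts) / n
    cy = sum(y for _, y in pts) / n
    xs, ys = [x for x, _ in pts], [y for _, y in pts]
    size = max(max(xs) - min(xs), max(ys) - min(ys)) or 1.0
    return [((x - cx) / size, (y - cy) / size) for x, y in pts]


def train(examples):
    """'Training' here just stores the normalized, labelled examples."""
    return [(label, normalize(stroke)) for label, stroke in examples]


def recognize(model, stroke):
    """Return the label of the stored example closest to the stroke."""
    candidate = normalize(stroke)

    def score(template):
        return sum(dist(p, q) for p, q in zip(candidate, template)) / len(candidate)

    label, _ = min(model, key=lambda entry: score(entry[1]))
    return label


# Training set (made-up strokes) : two examples per gesture
training_set = [
    ("line", [(0, 0), (10, 0)]),
    ("line", [(0, 5), (20, 6)]),
    ("vee", [(0, 10), (5, 0), (10, 10)]),
    ("vee", [(0, 20), (12, 0), (20, 20)]),
]
model = train(training_set)

# Strokes that are not part of the training set
print(recognize(model, [(100, 50), (130, 0), (160, 55)]))
print(recognize(model, [(3, 3), (50, 4)]))
```

Note that the "model" here is nothing more than the list of normalized examples, and that the sketch assumes every gesture is drawn in the same direction. A real recognizer has to be more careful about both points.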

## A bit of theory

The history of machine learning and artificial intelligence takes us back to 1943, when Warren McCulloch and Walter Pitts wrote a paper called _A logical calculus of the ideas immanent in nervous activity_. This paper proposes an overly simplified model of the human brain, in which very basic elements, artificial neurons (the ancestors of what would later be called perceptrons), are connected to each other to perform highly complex tasks. The idea here is that back in the 40's, scientists were already trying to figure out how to mimic the behavior of the human brain, and how to achieve some kind of artificial intelligence.

#### Machine learning? Deep learning?

But how do you find your way around today's buzzwords? We hear a lot about artificial intelligence, machine learning, deep learning and neural networks. How do all those terms relate to each other?

<img src="doc/terms.png" style="zoom:50%;" />

If you look for the answer to this question yourself, you might stumble upon different arguments, or different ways of explaining things. As far as we're concerned, we'll use the following distinction :

- **Artificial intelligence** is the whole set of techniques that allow a machine to simulate human intelligence. In that regard, a simple decision tree could be considered artificial intelligence.

- As the schema points out, **machine learning** is a subset of the techniques that make up artificial intelligence. It can be defined as "the field of study that gives computers the ability to learn without being explicitly programmed" (Arthur Samuel, 1959). We can distinguish three main categories of machine learning :
  - **Supervised machine learning** : here, we provide the computer with labelled examples. Think of what we did earlier with the gesture recognition : we gave the computer examples of a gesture, and for each example a label, telling it what this gesture represented. We call it _supervised_ because we have to carefully curate a dataset and label it so that the machine can learn what we want. This workshop is going to focus on this area of machine learning.
  - **Unsupervised machine learning**, on the other hand, just requires data and lets the computer interpret it. Think of the DBSCAN algorithm, which is able to find clusters of points. We don't give the computer labels or anything other than raw data : it figures out the categories by itself.
  - We'll also just mention **reinforcement learning**, which we won't cover in this workshop. In this type of learning, the computer tries to do things and gets rewards if it does the right thing. You've probably seen this type of ML when watching videos of AIs learning to play video games. At first, the AI does nothing, but soon it finds out that if it pushes the right button, it can go forward. It then learns that it should avoid holes and enemies to stay alive, and you end up with [an AI that finishes Super Mario Bros](https://www.youtube.com/watch?v=CI3FRsSAa_U) in the blink of an eye.
  
  > Also, note that vvvv user motzi made an excellent introduction to machine learning in the vveekend-vvorkshop series, in which he goes into the details of the machine learning training process, describing different machine learning algorithms and the steps you should take to optimize your dataset before training. I definitely recommend checking it out. Part one is [here](https://www.youtube.com/watch?v=17JjwXB6tDk), and part two [there](https://www.youtube.com/watch?v=XtYRMjJcrwA).
  
- Finally, **deep learning** is a subset of machine learning. We briefly mentioned perceptrons as an oversimplified model of our brain's neurons. When connected to each other, those perceptrons make up a neural network that loosely mimics the behavior of our brains to perform complex tasks. We talk about deep learning when the neural network is made of many layers, sometimes hundreds of them :

![](doc/deep_vs_shallow.png)

<p align=center><i>Credits : itility.nl</i></p>

The purpose of this workshop is not to discuss how those deep neural networks work. For that matter, I recommend checking [this incredible video](https://youtu.be/aircAruvnKk) from the YouTube channel 3Blue1Brown that explains very clearly how a deep neural network can recognize handwritten digits.
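Coming back to the supervised / unsupervised distinction made above, here is a small sketch of what "no labels" means in practice : a bare-bones DBSCAN written in plain Python. It only receives raw points and finds the groups by itself. The sample points and the `eps` and `min_pts` values are arbitrary choices made for this illustration, and the code favors readability over speed.

```python
from math import dist

NOISE = -1  # label given to points that belong to no cluster


def neighbors_of(points, i, eps):
    """Indices of all points within eps of point i (including i itself)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one cluster index per point, or NOISE for isolated points."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = neighbors_of(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may still become a border point later
            continue
        # Point i is a core point : start a new cluster and grow it
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point : joins, but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = neighbors_of(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point too
    return labels


# Raw, unlabelled data : two tight groups and one isolated point
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)
```

Nobody told the algorithm how many groups to look for or what they mean : it only returns a group index for each point, and it is up to us to interpret those groups.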

#### Uses of machine learning today

- **Natural language processing :** Siri uses neural networks to process voice commands and synthesize voice. See these two articles on Apple's Machine Learning Research website :
  - [Personalized Hey Siri](https://machinelearning.apple.com/research/personalized-hey-siri)
  - [Deep Learning for Siri's voice](https://machinelearning.apple.com/research/siri-voices)
- **Tailored recommendations :** Netflix uses machine learning to give you custom recommendations based on your watching habits. Fun fact : the thumbnails you see for movies and series are also influenced by your behavior! Again, two articles from Netflix's tech blog :
  - [Netflix Recommendations : Beyond the 5 Stars (Part 2)](https://netflixtechblog.com/netflix-recommendations-beyond-the-5-stars-part-2-d9b96aa399f5) 
  - [Artwork Personalization at Netflix](https://netflixtechblog.com/artwork-personalization-c589f074ad76)
- **Spam filters :** popular mail services such as Gmail use machine learning to recognize spam and fraudulent emails. See [this article](https://www.sciencedirect.com/science/article/pii/S2405844018353404) on ScienceDirect for more information.
- **Image recognition :** an obvious use of machine learning for us. We can mention the [YOLO](https://pjreddie.com/darknet/yolo/) detection system that uses deep learning to recognize objects in an image.
- **Style transfer :** another popular use of machine learning. It lets you take an input image of your choice and a reference image (say a famous painting) and apply the "style" of the reference image to the input image. We'll do this in vvvv using RunwayML in a few minutes!

#### A few terms to remember

Before jumping back into vvvv, let's take note of a few terms that are useful to remember :

- **Supervised learning :** we tell the computer what is what, it learns it and can tell us afterwards what it sees. This implies that we need to provide curated, labelled samples first.
- **Classification :** is about telling whether an arbitrary input falls into category A or B.
- **Regression :** is about predicting a numerical value. Think for instance of a model that would predict house prices based on their proximity to the sea, their size, etc.
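The difference between these two terms is easier to see side by side. The sketch below uses plain Python and made-up numbers : a nearest-neighbour classifier that answers with a category, and a least-squares line fit that answers with a number. The names `classify`, `fit_line` and `predict_price`, as well as all the data, are assumptions made for this example only.

```python
from math import dist

# --- Classification : the answer is a category -------------------------
# Made-up labelled samples : (size in m², distance to the sea in km) -> label
labelled_homes = [
    ((30, 1.0), "flat"),
    ((45, 2.0), "flat"),
    ((120, 8.0), "house"),
    ((150, 5.0), "house"),
]


def classify(sample, training_set):
    """Return the label of the closest training sample (1-nearest neighbour)."""
    _, label = min(training_set, key=lambda example: dist(example[0], sample))
    return label


# --- Regression : the answer is a number --------------------------------
# Made-up data : size in m² -> price in thousands
sizes = [50, 80, 100, 120]
prices = [150, 240, 300, 360]


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(sizes, prices)


def predict_price(size):
    """Predict a price for a size that was not in the training data."""
    return slope * size + intercept


print(classify((40, 1.5), labelled_homes))   # a category
print(predict_price(90))                     # a number
```

In both cases the training data is labelled, so both are supervised learning : only the nature of the answer changes.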

## Tools of the trade

Over the past few years, tools allowing us to easily leverage machine learning techniques have started popping up. Here are the tools that we covered in the workshop :

- **RunwayML :** a cloud service running pre-trained ML models

- **Wekinator :** a tool that performs supervised machine learning, accessible inside vvvv via OSC commands

- **Microsoft Lobe :** image recognition made easy
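Since Wekinator is driven through OSC, it can help to see what such a message looks like on the wire. The sketch below builds an OSC message by hand with Python's standard library and sends it over UDP. The address `/wek/inputs` and the port `6448` are assumed to be Wekinator's defaults, so check them against your own Wekinator settings. The helper names (`osc_pad`, `osc_message`, `send_inputs`) are made up for this example. In vvvv you would rely on its OSC support instead : this is only meant to demystify the protocol.

```python
import socket
import struct


def osc_pad(data: bytes) -> bytes:
    """Null-terminate an OSC string and pad it to a multiple of 4 bytes."""
    data += b"\x00"
    return data + b"\x00" * (-len(data) % 4)


def osc_message(address: str, values) -> bytes:
    """Encode an OSC message carrying only 32-bit float arguments."""
    type_tags = "," + "f" * len(values)
    payload = struct.pack(f">{len(values)}f", *values)  # big-endian floats
    return osc_pad(address.encode("ascii")) + osc_pad(type_tags.encode("ascii")) + payload


def send_inputs(values, host="127.0.0.1", port=6448, address="/wek/inputs"):
    """Send one frame of input features over UDP (assumed Wekinator defaults)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(osc_message(address, values), (host, port))


# Example frame : two input features, e.g. a normalized mouse position
message = osc_message("/wek/inputs", [0.5, 1.0])
print(len(message), "bytes")
```

Calling `send_inputs([0.5, 1.0])` while Wekinator is listening would feed it one frame with two input values, which it can either record as a training example or use to run its trained model.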

## Cool links

To finish, here are some cool links that I encourage you to check out if you want to read more about what we just saw and dig deeper into all things machine learning!

- You can watch [a recording of the RunwayML workshop from NODE20](https://thenodeinstitute.org/courses/node20-vvvv-workshop-bundle/) in which we go into more detail about the RunwayML nuget. Gene Kogan, one of Runway's contributors, also gives a lot of information about the feature that allows you to train your own models.
- In his VL learning vlog, vvvv user Takuma Nakata has two videos where he goes through VL.RunwayML's help patches and uses a few models ([part1](https://youtu.be/QEjuEjVzX_E), [part2](https://youtu.be/r-bCVuBqa6M))
- As previously mentioned, vvvv user motzi made two videos for the vveekend vvorkshop series in which he gives much more detail about data preparation and the training process, using his vvvv beta pack for machine learning. Part one is [here](https://youtu.be/17JjwXB6tDk), part two [there](https://youtu.be/XtYRMjJcrwA).
- Also mentioned earlier, a video by the awesome YouTube channel [3Blue1Brown](https://www.3blue1brown.com/) (definitely check this out!) : [What is a Neural Network?](https://youtu.be/aircAruvnKk)

- [ml4a](https://ml4a.github.io/ml4a/) : a website dedicated to machine learning for artists. You'll find articles explaining what neural networks are and how they work.
- [Machine Learning for Musicians and Artists](http://www.wekinator.org/kadenze/) online course by Rebecca Fiebrink, who created Wekinator.
- [An introduction course](https://www.excella.com/resource/building-deep-neural-networks-in-ml-net) about different types of ML and a short demo with ML.NET, Microsoft's C# library for all things machine learning
- IBM has a cool series of articles about machine learning :
  - [Machine learning](https://www.ibm.com/analytics/machine-learning)
  - [What is supervised learning?](https://www.ibm.com/cloud/learn/supervised-learning)
  - [What is unsupervised learning?](https://www.ibm.com/cloud/learn/unsupervised-learning)
- OVH also has a fun series about deep learning :
  - [Deep learning explained to my 8-year-old daughter](https://www.ovh.com/blog/deep-learning-explained-to-my-8-year-old-daughter/)
  - [What does training neural networks mean](https://www.ovh.com/blog/what-does-training-neural-networks-mean/)
  - [Understanding the anatomy of GPUs using Pokémon](https://www.ovh.com/blog/understanding-the-anatomy-of-gpus-using-pokemon/)
