
Free Preview: Image Recognition on iOS With Core ML

Introduction


Machine learning is one of the hottest topics in the tech world right now. It's being used more and more widely for applications such as image, speech and gesture recognition, as well as for natural language processing. With recent advances, it's even possible to run machine learning algorithms on your mobile device.

In this course, Markus Mühlberger will show you how to put machine learning to work in iOS 11 with Apple's new Core ML library. You'll get an overview of the key machine learning algorithms along with examples of where each one can be applied. You'll learn how to import and convert publicly available models for use with Core ML, and you'll learn how to build an app that applies these models for image recognition. And as a bonus, you'll learn how to build an app that does natural language processing!

We've built a complete guide to help you learn Swift, whether you're just getting started with the basics or you want to explore more advanced topics: Learn Swift.

1. Introduction

1.1 Introduction

Machine learning has become a prominent topic lately. With the advances in computer science and technology, it is now possible to do it on your home computer or, as of iOS 11, on your mobile Apple device. Hello, and welcome to Image Recognition on iOS With Core ML. I'm Markus Mühlberger, and in this course I'll give you an introduction to Apple's machine learning framework, Core ML. We are going to talk about the science behind it, like neural networks and support vector machines, how to do image recognition in your iOS application, and how to use live video to do the same thing. As a bonus, I've added a lesson about natural language processing with iOS and Core ML. Let's get started with machine learning on iOS.

1.2 What Are Machine Learning and Core ML?

Hi, and welcome back to Image Recognition on iOS With Core ML. In this lesson, we are going to cover the basics, and I will give you an overview of the technologies used in this course. Machine learning is a field of computer science that allows computers to learn without explicit programming. This means that software programs can make decisions on their own, based on factors they determine on their own as well. Some of the simplest examples of machine learning software are spam filters. With every incoming email and the feedback the user gives them, they learn more and more about what a spam message looks like. As in the broader field of AI, there are different approaches to machine learning; some of them are neural networks, decision trees, and support vector machines. Let's have a quick overview of each.

Artificial neural networks are inspired by human or animal brains. Like in the real world, they consist of a collection of connected nodes, called neurons, and each connection, called a synapse, can transmit signals to other neurons. Each receiving neuron can process the signal and send it downstream to the next ones, and so on. Those signals are generally represented as values between 0 and 1. Neurons and synapses can also have weights that change as the system learns. Typically, neural networks are organized in layers. Different layers perform different kinds of transformations on their inputs. The first layer is called the input layer, the last the output layer, and everything in between is a hidden layer. So how do you interpret the outputs, and what do you use as inputs? An input is a feature you have identified. For instance, to determine whether a human is sick or healthy, you would use temperature, weight, height, and other properties as individual features. The neural network then does its calculations and propagates everything through to the output nodes. The value an output node receives can also be called a confidence score. If you only have a binary interpretation, healthy or sick, you can make do with a single output node.
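The forward pass just described can be sketched in a few lines of Python. This is a minimal illustration, not Core ML code: the feature values, weights, and bias below are invented for the hypothetical healthy-or-sick example, not taken from any trained model.

```python
import math

def sigmoid(x):
    """Squash a raw signal into the 0..1 range used for confidence scores."""
    return 1 / (1 + math.exp(-x))

def forward(features, weights, bias):
    """A single output node: weighted sum of the inputs, then an activation."""
    total = sum(f * w for f, w in zip(features, weights)) + bias
    return sigmoid(total)

# Hypothetical features (normalized temperature, weight, height) with
# made-up weights -- a real network would learn these during training.
confidence = forward([0.4, 0.6, 0.7], [0.8, -0.3, 0.5], 0.1)
print(round(confidence, 3))  # a single confidence score between 0 and 1
```

In a full network, every hidden node performs the same weighted-sum-and-activation step, layer by layer, until the signal reaches the output nodes.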
A higher value, for instance, would indicate a healthier human being. If you have multiple classes, like determining whether an input image shows a bus, a plane, or a bike, you'd have to have three nodes. The value for each node describes the confidence the system has that the input matches the mode of transportation the node represents. This process is called forward propagation. To make the network improve, you need to train it, and this is a slow process, a very slow one. During training, you compare the output produced by the neural network to the expected output for your training data. You then tell the network how big the error was, and it will go backwards through the layers, propagating the error values by determining the individual differences for each node. This process is called back propagation. It allows each node to adjust its parameters and weights to improve the results. After a lot of training, it will eventually average out and produce correct results. Artificial neural networks can be used for decision-making AI systems, for instance for a chess or poker game, pattern recognition, automated trading systems, or medical diagnosis.

Let's look at decision trees now. As a programmer, you can imagine one as a set of if-else statements that form a tree structure and eventually arrive at an end node that normally contains a classification. Each decision asks a question about a feature. It then directs the flow to the next node, and the process is repeated until an end node is reached. When training the algorithm, it will change the questions asked by adding or removing nodes or changing the decision values; this is done by so-called learners. Decision trees are an approach that is fast and easy, but not always as accurate as more complex algorithms. Sometimes learners can create overly complex decision trees that don't generalize well enough, which is called overfitting. On to the final algorithm I want to explain.
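Before moving on, the "set of if-else statements" view of a decision tree can be made concrete with a short sketch. The features and threshold values here are invented for illustration; in practice, a learner would pick them from training data.

```python
def classify(temperature, resting_heart_rate):
    """A tiny hand-written decision tree; a learner would tune these
    questions and thresholds automatically during training."""
    if temperature > 38.0:              # root node: does the patient have a fever?
        if resting_heart_rate > 100:    # second decision node
            return "sick"               # leaf (end node) with a classification
        return "possibly sick"
    return "healthy"

print(classify(39.2, 110))  # -> sick
print(classify(36.5, 70))   # -> healthy
```

Training adds, removes, or re-thresholds these branches; overfitting is what happens when the learner keeps adding branches until the tree memorizes the training data.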
The principle behind support vector machines is also not very complex. An SVM decides between two classes based on a dividing line in feature space that separates them. There are techniques to classify between multiple categories, but they mostly build on the binary case. So if you have two features, for instance the height and weight of people, you can plot the data points on a graph. As you can see, group A is on the top left and group B is on the bottom right. We can draw a line between them: every data point that is above it belongs to A, the others to B. To determine the best separator, we look at the data points closest to the other group, calculate a vector, and then the margin it has to each group. The goal here is to maximize the margin. The line in the middle is called a hyperplane, and you will see why in a minute. The vectors along the margins are called support vectors, and they give the approach its name. Now, what do you do if you can't just draw a line between two sets, like here, where group A is in the middle? Well, we can push the data into a higher dimension by using a transformation function, or kernel method. In this case, for instance, we can use a parabola to transform the one-dimensional values into two-dimensional space. And as you can see, we can now draw a line between them. Since this line is in a higher dimension than the original data, and since the data normally has more features than just one or two, the divider between the classes is called a hyperplane. Support vector machines work well for image classification and have applications in biology and other sciences, for instance the classification of proteins. So why am I telling you all this? Well, because Core ML uses these technologies, and a few others, for classification. In its current version, it can be used for classification but not for training, so you have to use external tools to create a Core ML model.
Those are Caffe and Keras for neural networks, scikit-learn and XGBoost for tree-based classification, and scikit-learn and libSVM for support vector machines. scikit-learn can also be used for a few other algorithms. I have linked to Apple's documentation about it, which includes a table of all supported tools and types, in the lesson notes. Although Core ML doesn't actually do the training part, that's okay, since you don't want to train your model on every device separately anyway. To use Core ML on iOS, you have to use additional frameworks provided by Apple for the task you want to achieve. Core ML acts as a common base layer for the Vision framework for image processing, Foundation for natural language processing, and GameplayKit for decision trees used in things like pathfinding and AI behavior.

To recap: machine learning has many approaches, like neural networks, decision trees, or support vector machines. A neural network consists of input and output layers, with hidden layers in between that transform the data received from previous layers. A decision tree asks yes-or-no questions about the data and traverses the tree until it reaches a leaf. Support vector machines classify between two groups using a hyperplane as a separator. If a classification can't be done in the current dimension, the data is transformed to a higher dimension using a transformation method. Core ML can't train the algorithms itself, but uses trained models from other tools. In the next lesson, we are going to start with the course project, talk more about the Core ML model itself, and try it out. See you there!
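The higher-dimension trick from the recap can be shown in a few lines. Below, group A sits between the two halves of group B on the number line, so no single threshold on x separates them; lifting each value with a parabola, x → (x, x²), makes a simple horizontal line a valid separator. The sample values and the threshold are invented for illustration.

```python
def lift(x):
    """Transform a one-dimensional value into two dimensions via a parabola."""
    return (x, x * x)

group_a = [-1.0, 0.0, 1.0]   # in the middle of the number line
group_b = [-3.0, 2.5, 3.0]   # on the outside, on both ends

def side(point, threshold=4.0):
    """Classify by which side of the line y = threshold the point falls on.
    In the lifted space, this line plays the role of the hyperplane."""
    _, y = point
    return "A" if y < threshold else "B"

print([side(lift(x)) for x in group_a])  # -> ['A', 'A', 'A']
print([side(lift(x)) for x in group_b])  # -> ['B', 'B', 'B']
```

A real SVM would place the separator to maximize the margin to both groups rather than using a hand-picked threshold, but the geometric idea is the same.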