3.1 Natural Language Processing With Core ML

In this bonus lesson, we'll use Core ML to work with text instead of images. You'll see how natural language processing has become much more powerful in iOS 11.

Hi, and welcome back to Image Recognition iOS 11 With Core ML. In this bonus lesson, we are going to look at natural language processing, NLP for short, to analyze text. Natural language processing is a very common problem, and at least the basics are not very complicated. It gets more difficult when it comes to detecting sarcasm, but otherwise detecting a language, identifying keywords, or determining whether some text is more positive or negative isn't that bad. iOS implements these technologies in NSLinguisticTagger, which is part of the Foundation library and has been around since iOS 5. Of course, not all of it was available back then; iOS 11 and Core ML added a lot of features regarding language detection and so on. The tagger is going to tokenize the words in the provided text, making it ready for processing. To use this API you don't have to add custom Core ML models; it's already built in.
Let's jump right into a playground I prepared. It contains the introductory text of the Wikipedia article about natural language processing in English, French, Japanese, and Chinese. To work with the linguistic tagger, you have to pass in the tag schemes you want to support. In our case, these are nameTypeOrLexicalClass and lemma. The options I can leave blank. Then we can add some text to the tagger. I'm using the English paragraph here since it's easier to validate the results.
Next, we need a range to analyze. I'm just going to take the complete text, starting at location zero and going to the end, using tagger.string.utf16.count to get the correct length for the string. Finally, we can define some options our tagger should use as well. I'm going to ignore punctuation and whitespace. Now we can enumerate over the tags that the tagger found. I'm going to pass the range value, and then I have to specify a unit. This can be word, sentence, or paragraph, depending on what you want to analyze. We want to look at individual words.
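
The setup described so far might look roughly like the sketch below. It is not the playground from the video: the sample sentence and the variable names are assumptions made for illustration, and the length is taken from the text itself because `tagger.string` is an optional.

```swift
import Foundation

// Assumed stand-in text; the lesson uses a Wikipedia paragraph instead.
let text = "Natural language processing is a field of computer science."

// Create the tagger with the two schemes named in the lesson; options stay blank (0).
let tagger = NSLinguisticTagger(tagSchemes: [.nameTypeOrLexicalClass, .lemma], options: 0)
tagger.string = text

// NSRange counts UTF-16 code units, so the length comes from the utf16 view,
// not from the number of characters.
let range = NSRange(location: 0, length: text.utf16.count)

// Skip punctuation and whitespace tokens during enumeration.
let options: NSLinguisticTagger.Options = [.omitPunctuation, .omitWhitespace]

// Why utf16.count matters: this emoji is 1 character but 2 UTF-16 code units.
let thumbsUp = "👍"
print(thumbsUp.count, thumbsUp.utf16.count)
```

Using the character count instead of the UTF-16 count would produce a range that is too short for any text containing such characters, which matters for the Japanese and Chinese samples.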
Then we need the scheme that is interesting to us. Let's start with nameTypeOrLexicalClass and also provide our options. The last parameter is a closure that will be called for every tag it finds. It provides the result, a token range, and a stop argument that is out-only, so you can tell the enumerator to stop early.
So, let's see what we can extract from the tagger. First, let's get the word from the text. To do that, I have to convert the text to an NSString and use substring on it with the token range parameter. I'm also only interested in the lowercased word. Now, let's print the value of the result and the word. I'm adding a default value since the result can be nil. What we receive is a classification: noun, verb, adjective, and so on. We could add a guard clause, for instance, that filters out just the nouns.
Let's also have a look at using lemma as the scheme. I'm going to comment out the print statement and the guard clause and replace them with a different one. I want to guard against empty results, and against the ones where the lemma doesn't differ from the word itself, to remove the noise. Let's also print it. As you can see, it retrieves the base form of a word. For instance, programming has the lemma program. This also works for French, but it doesn't for Japanese or Chinese, as those languages work differently.
Finally, before I wrap up, I want to show you a very neat property on the tagger, which can also be accessed statically: dominantLanguage. It returns the language that is detected first. It works for all the languages I used here. For Chinese, it even detects that it is traditional Chinese. When I combine French and English, however, it only recognizes the language that is present first.
To recap: natural language processing is an important topic in computer science nowadays. You can use it in iOS with the Foundation framework and the NSLinguisticTagger class. It can perform various kinds of text analysis on multiple languages.
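
The enumeration steps from this lesson can be sketched as follows. This is a minimal illustration rather than the code shown in the video: the sample sentence, the `word(at:)` helper, and the `nouns` and `lemmas` collections are assumptions, and the tags and lemmas you actually get depend on the models that ship with the OS.

```swift
import Foundation

// Assumed sample sentence; the lesson analyzes a Wikipedia paragraph.
let text = "Programming computers to process large amounts of language data is challenging."

let tagger = NSLinguisticTagger(tagSchemes: [.nameTypeOrLexicalClass, .lemma], options: 0)
tagger.string = text
let range = NSRange(location: 0, length: text.utf16.count)
let options: NSLinguisticTagger.Options = [.omitPunctuation, .omitWhitespace]

// Hypothetical helper: extracts the lowercased token via NSString.substring(with:).
func word(at tokenRange: NSRange) -> String {
    return (text as NSString).substring(with: tokenRange).lowercased()
}

// Pass 1: lexical class per word, keeping only the nouns.
var nouns: [String] = []
tagger.enumerateTags(in: range, unit: .word, scheme: .nameTypeOrLexicalClass,
                     options: options) { tag, tokenRange, _ in
    let token = word(at: tokenRange)
    // The tag can be nil, so print a default value in that case.
    print(tag?.rawValue ?? "unknown", token)
    guard tag == .noun else { return }
    nouns.append(token)
}

// Pass 2: lemmas, skipping empty results and lemmas identical to the word itself.
var lemmas: [String: String] = [:]
tagger.enumerateTags(in: range, unit: .word, scheme: .lemma,
                     options: options) { tag, tokenRange, _ in
    let token = word(at: tokenRange)
    guard let lemma = tag?.rawValue, !lemma.isEmpty, lemma != token else { return }
    lemmas[token] = lemma
    print(token, "->", lemma)
}

// Static access to language detection; returns a language code or nil.
let language = NSLinguisticTagger.dominantLanguage(for: text)
print(language ?? "undetermined")
```

The third closure argument, ignored here with `_`, is the stop pointer; setting `stop.pointee = true` inside the closure would end the enumeration early.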
I hope you enjoyed this bonus lesson, and I'll see you in the conclusion.
