4.4 Building a Sentence Analyzer
Using these data structures and tools for iterating through them, let's create a new application. This time we will analyze the frequency of letters in a sentence input by the user.
Now that we have some new data structures and some looping techniques in our tool belt, it's time to start using them to write an application. The application that we're gonna focus on in this lesson is fairly simple, but I think once you go through it, you'll start to think of some other very interesting use cases where you could use a technique like this. So, here's the basic premise of our application. We're going to start by asking the user for some input, just like we've done in the past. This time we're going to ask for a string, or a sentence, if you will: a series of words, or even just a set of characters. And what our application is going to do is analyze that sentence and count the instances of each letter within it. Now, you could make it look for certain words, if you want, or certain sentences. But the basic idea here is that it goes through and counts how many times it runs into each character that it comes across within the sentence. It sounds fairly simple, but there are some fairly interesting issues that we're going to work through. So, let's start from the beginning. What we want to do first is get the input from the user, so let's go ahead and ask for a sentence. Let's create a variable, sentence, and we're gonna set that equal to input. We're gonna prompt the user, and we're simply gonna say, please enter a sentence, or whatever you might want to ask the user for. After the input function runs and gets this input, we will have a string in our sentence variable, and ultimately we need to break that string down. Now, there are a couple of different ways you can go about this, but I want to show you a couple of interesting functions along the way that you'll probably want to use later on as you start to advance in your Python development.
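The prompt step described above could be sketched roughly like this. The function name ask_for_sentence and its ask parameter are assumptions made for illustration (they are not from the lesson); the parameter only exists so the sketch can be tried without typing at a keyboard.

```python
def ask_for_sentence(ask=input):
    """Prompt for a sentence and return the typed text as a string.

    `ask` is a hypothetical parameter; by default it is the built-in
    input function, which always returns a string.
    """
    sentence = ask("Please enter a sentence: ")
    return sentence
```

In a script you would simply call ask_for_sentence() and type a reply at the prompt.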
So, the first thing that I want to do is break the sentence up into a list of words. How can I do that? Well, I could definitely just loop through all of the characters in the sentence that was input. That would work, but that's a technique from a few sections ago. What I'd ultimately like to do here is use a built-in function in Python that's gonna allow me to split this sentence up into a series of words. And it just so happens that something like that exists. So, I'm going to create a variable here called words, and I'm gonna set that equal to the result of calling a function on sentence. Since the input function returns a string, we know that sentence is going to be a string. So we have all of the functions available on a string, and one of those functions that's gonna become very handy is called split. What split does is break this sentence up into a list of strings, based on whatever character or string we give it to split on. And it just so happens that all we want to split on is the space. So basically, we're going to say wherever you find a space in this sentence, split it on that. Then split is going to return a list of the different strings that result from this operation. You'll see an example of that in just a few moments. Now, the basic logic that I want to follow is this. I want to loop through all of these words. I want to grab the characters out of those words. And then I want to use some sort of data structure that's going to make it easy for me to count the appearances of each of those characters throughout the sentence. It might be obvious at this point, since we just started talking about dictionaries, that we'll use one to do this counting. I'll show you how in just a moment.
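As a small illustration of the split step just described, here is a sketch that uses a fixed string in place of the user's input (the sample sentence is the one used later in the lesson):

```python
# Stand-in for the string that input() would return.
sentence = "I love python"

# Split on a single space; the result is a list of strings.
words = sentence.split(" ")
print(words)  # ['I', 'love', 'python']
```

One thing to be aware of: if the user types two spaces in a row, splitting on " " produces an empty string in the list. Calling split() with no argument splits on any run of whitespace instead.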
So, let's go ahead and loop through these words. We'll say for word in words, so we're gonna loop through all of the words found in that list. Then we want to get the list of characters that makes up that word. If you recall, a few lessons ago we talked about different ways that we could take a string input and cast it into another data type. When we first started using this input function, we were getting strings, but we wanted to deal with numeric values. We would use things like the int function or the float function to coerce or cast that string value into our desired type. And it just so happens that there's another function called list. List will take in a string and spit out a list of characters, so that's exactly what we're going to use here. We're gonna say characters is equal to list, and we'll put word in there as the input. So, word is gonna be broken up into a list of characters and put into our characters variable. Now we're going to loop through characters, so we'll say for character in characters. Now, we need to check something, but we're gonna start dealing with a dictionary now, and before we can actually start dealing with it, we need to create it. So, let's create it up here above the loop. We'll call this char_count and we'll set that equal to an empty dictionary. Remember, we use the curly braces here. So now, as we come across each of these characters, before we can increment a counter, which is going to be the value in each of the key-value pairs within our dictionary, we need to see if that key exists first. Because remember, if we try to access a value at a key that doesn't exist, we're gonna get an error, an exception, and we don't want that.
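Two pieces of the paragraph above can be tried in isolation: turning a word into a list of characters, and what happens when you touch a key that isn't in the dictionary yet. This is only a sketch; the sample word is an assumption chosen for illustration.

```python
word = "love"

# list() breaks a string into its individual characters.
characters = list(word)
print(characters)  # ['l', 'o', 'v', 'e']

# An empty dictionary, written with curly braces.
char_count = {}

# Incrementing a key that does not exist yet raises a KeyError.
try:
    char_count["l"] += 1
except KeyError as error:
    print("missing key:", error)  # missing key: 'l'
```

The dictionary is still empty after the failed increment, which is why the lesson checks for the key first.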
So, we want to do some defensive programming here and make sure that key exists first. If you remember, I taught you a little tip for checking whether a key is there. We can say if character in char_count. What this in construct does with a dictionary is check to see if the value in character is a key in my character count dictionary. If it is, this will return true and fire off this block of code here, and then we'll do an else to handle the rest. So, what do we wanna do if the character does exist? Well, all we wanna do is access char_count for the key character. At this point, we'll assume that there's already a value associated with it, which is going to be an integer that we put in there. So, we're just going to increment it with += 1. Now, what do we do if the character is not among the keys found in character count? Then we'll have an else block. Within that else block, all we want to do is add this key-value pair into our dictionary. So, we'll say char_count[character], and then all we have to do is set it equal to 1. So, the first time we come across a character, we will insert that character as a key into our dictionary, and we'll set its value equal to 1. That will be the initial counter. Then, every time we run into that character later on, we'll go ahead and increment the value that's stored for that particular character key. That's gonna handle most of the logic. We'll go through all the characters in all the words, and that's basically gonna get us where we wanna go. So now, all we have to do is a little bit of outputting of our data. How do we wanna do that? Well, we learned a nice little way to get back the key-value pairs from our dictionary in a loop construct.
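Putting the steps described so far together, the counting logic might look like the sketch below. Wrapping it in a function named count_characters is an assumption made here so the example can run on a fixed sentence; the lesson types the same loops directly into the script.

```python
def count_characters(sentence):
    """Count how often each character appears in the words of a sentence."""
    char_count = {}
    words = sentence.split(" ")
    for word in words:
        characters = list(word)
        for character in characters:
            if character in char_count:
                # Key already exists, so bump its counter.
                char_count[character] += 1
            else:
                # First time we see this character.
                char_count[character] = 1
    return char_count


counts = count_characters("I love python")
print(counts["o"])  # 2
```

Because the sentence is split on spaces first, the spaces themselves never become keys in the dictionary.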
So we can say for, with the key represented by k and the value represented by v, in char_count.items(). Now, we're gonna get back all of the items found within our character count dictionary, which, remember, are our key-value pairs. And then we just want to print something out. The key is gonna be the character, so we'll say Found k. Then we want to say how many times we found it, and that's gonna be the value stored in v. And then we'll just say times. So, something like that. Go ahead and save that. That's really all it's gonna take. We've taken some input and split it up into words. We've created our dictionary, looped through all the words, and grabbed all the characters for each word. We've looped through the characters, checked whether or not each character is a key within the dictionary, and then either set or incremented the value as appropriate. And then, at the very end, we just loop through them all and show how many times we ran into a particular character. So, let's see what this looks like. I wanna make sure that I save this, and I have this saved as char_count.py. We'll just go ahead and run this, so we'll say python char_count.py. There we go. Please enter a sentence. We'll say, I love python, and hit Enter. And there you have it. We see that it goes through all of the letters that it found and counts all of them, and we only ran into a single character that came through twice, and that was the o. If we look through there, I see an o here, and I also see an o here. So, that's pretty nice, but I have to say I'm not overly impressed by the output here, as it's kind of confusing. Yes, it shows all of these letters, but they're kind of hard to look at because they're not alphabetized.
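The output loop described above can be sketched on its own with a small hand-written dictionary. The sample counts are assumptions for illustration, not the full output of the program.

```python
# A few counts of the kind the analyzer would build up.
char_count = {"I": 1, "l": 1, "o": 2}

# items() yields each key-value pair; k is the character, v its count.
for k, v in char_count.items():
    print("Found", k, v, "times")
# Found I 1 times
# Found l 1 times
# Found o 2 times
```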
So, maybe we can come back and make a little bit of an improvement to make that better. How could we do something like that? Well, first we would need a way to sort this dictionary based on its keys. The downside is that you can't actually sort a dictionary in place within Python. A dictionary isn't kept in order by key or by value. It is what it is, and you look things up by those keys. Now, there's another data type called an ordered dictionary, an ordered dict. I'm not going to go into using that. You can look that up on your own time, if you would like. But there is a way we can get around this. Yes, we can't necessarily sort a dictionary, but we can sort a list. So, how would that work? Well, once again, there is a nice little built-in function. You'll find that for most things that you're trying to accomplish within Python, if you look around hard enough, there probably already exists a built-in function to handle it. And it just so happens that sorting is one of those operations that is already covered, at least for the most part. We can use a function called sorted that takes as input some sort of list that you want to work with and gives you back a sorted list. So, let's say that I wanted to sort the keys found within my dictionary. I could say sorted, and pass it char_count.keys(). Remember, I mentioned that there is a keys function that gives back all of the keys that are found within that dictionary. So now, sorted is going to return a list of those keys in sorted order. I'm gonna store that in a variable, so sorted_keys is equal to that call to sorted. Now we can modify our for loop down here a little bit. I'm gonna replace my little multi-variable for loop with a single-variable one, just to make this a little bit simpler.
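Here is a minimal sketch of the sorting step, again using a small hand-written dictionary as an assumption in place of the real counts:

```python
char_count = {"l": 1, "o": 2, "I": 1, "e": 1}

# sorted() returns a new list; the dictionary itself is left unchanged.
sorted_keys = sorted(char_count.keys())
print(sorted_keys)  # ['I', 'e', 'l', 'o']
```

Notice that the uppercase I comes before the lowercase letters, because uppercase characters compare as smaller than lowercase ones.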
So, ultimately, all I really wanna do here is loop through all of these sorted keys and grab the values for them as we print them out. Let's go ahead and create a for loop to do just that. I can say for key in sorted_keys, because I already know that the sorted keys list is sorted. Now I can go through and grab the value associated with that key. So, I can say value is going to be equal to char_count[key]. That will give me the value stored for that key. Then I can go ahead and reuse this print from down below, so we'll paste that here. This time, we'll say Found key value times. Then we'll just get rid of the old for loop, and we'll save that. So now, if we've typed everything in correctly here, hopefully we'll be in good shape. It says, Please enter a sentence. We'll say I love python, and hit Enter. And now we should see this in sorted order. I see I, e, h, l, n, o, p, and so on. So, it definitely looks like it worked, and it's a little easier to look at. But there's one thing that I want you to take note of when we are doing this kind of operation. If you take a look at comparison operators within any sort of language, you will quickly find out that uppercase and lowercase letters are considered different things. So, let's see how that could happen. I could run my application again, and I could say something as simple as my name, typed as D-e-r capital E-k. Now, this is a little strange and you obviously wouldn't do this, but this is just to prove a very simple point. If I hit Enter, you're gonna see that nothing shows up two times. That's because when we did our comparison here, where we said if character is in character count as a key, it's actually doing a very simple comparison. And like I said, when you do comparisons on characters, uppercase and lowercase have different values.
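The sorted output loop and the uppercase/lowercase behavior described above can be seen together in a short sketch. The dictionary below is written out by hand as the counts the analyzer would produce for the input DerEk.

```python
# Counts for the input "DerEk": 'e' and 'E' are separate keys.
char_count = {"D": 1, "e": 1, "r": 1, "E": 1, "k": 1}

sorted_keys = sorted(char_count.keys())
for key in sorted_keys:
    value = char_count[key]
    print("Found", key, value, "times")

# The comparison behind the `in` check treats the two cases as different.
print("e" == "E")  # False
```

The keys come out as D, E, e, k, r, each with a count of 1, so no letter is reported twice.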
So, I'm going to leave you with a little bit of a homework assignment, and see if you can come up with a solution. What I'd like you to try to find is a function out there that would allow you to make this work to your advantage, so that the application treats everything as if it were given in lowercase, or in uppercase, or something along those lines. Because I would want this example with uppercase and lowercase letters to work the same way as an all-lowercase version, where the letters get combined and grouped together, and we would see two e's for both this example and this example. I bet if you look hard enough, you'll find some nice little built-in functions out there that might help you do things like, I don't know, send everything to lowercase or to uppercase when you're dealing with strings. But regardless, there you have it. We've now built a very simple sentence analyzing application that uses a couple of nested for loops as well as a dictionary. Along the way we sorted some information, split some strings, and output some data we stored in a dictionary. So, that pretty much wraps up our section on data structures. In the final section of this course, I would like to work on some very interesting concepts that we come across within Python, having to do with things like being able to structure your code in a very logical fashion. So, stay tuned for that in the next section.
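If you want to check your answer to the homework, one possible approach (an assumption here, since the lesson leaves the solution open) is to lowercase the whole sentence with the string's built-in lower function before counting. The function name count_characters_ignoring_case is made up for this sketch.

```python
def count_characters_ignoring_case(sentence):
    """Count characters after converting the sentence to lowercase."""
    char_count = {}
    # lower() returns a new string with every letter in lowercase.
    for word in sentence.lower().split(" "):
        for character in list(word):
            if character in char_count:
                char_count[character] += 1
            else:
                char_count[character] = 1
    return char_count


counts = count_characters_ignoring_case("DerEk")
print(counts["e"])  # 2
```

Using upper instead would work the same way, with the keys stored as uppercase letters.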