3.6 Reading and Writing Complex Objects
To this point, we have really only been dealing with simple objects like strings within files. Well, guess what. Files are good for other types of information as well. In this lesson, I will show you how to write a complex object to a file. Then I will show you how to read that data back into its original data structure.
We've gone through a lot of cool things in this course so far. We've dealt with files: how to read them, write them, and move around in them. And we've worked with collections and lists: modifying them, ordering them, and picking things out of them. Now I wanna start to bring those two worlds together just a little bit.

So let's use a somewhat real-world example. I have some important information, maybe about a person, and I wanna be able to save it to a file. Maybe that's for reporting purposes, or maybe it's just so that I can retrieve it later and do some processing on it. Either way, it's time to start working with data that's a little more interesting than numbers, strings, or sentences in a file.

Let's get down to the nitty-gritty. Let's say I'm dealing with a system where I need to keep track of personal information about some people. It could be multiple people or a single person; it doesn't really matter. The point is that I need to take that information and persist it somewhere, and in this case we're gonna persist it to a file.

So let's start by creating a simple person. I'm gonna create a dictionary called person that contains some information about this person. You could define any number of properties that you want, but I'm gonna keep things rather simple. Feel free to expand on it however you like. We're gonna start off with a first_name, and this person's first_name is going to be Derek, and this person's last_name is going to be Jensen.
And this person's age is going to be 100, and maybe this person's eye_color is going to be green, or whatever you like. You can continue to add properties here. They could be strings or lists, so you can do whatever you want. Maybe you wanna keep track of the programming languages this person is proficient in, so you could add programming_languages, and it could hold things like Python, C#, and maybe Swift.

Now I have this object, and you could have several of these objects for a lot of different people. You could be creating a census of the people in your organization and their skill sets, and all sorts of things like that. I have all that information and I wanna save it.

Well, you could save it to a database. But in the vein of what we've been working on, we've been talking about files and different ways to read and write them. And everything we've done to this point has been reading and writing files in a very plain-text sort of way. The text that's in those files is exactly the text that's there; those are the raw strings we've been dealing with.

But there's another way to read and write data in files, and that's in more of a binary format. This comes in very helpful when you're dealing with objects, or when you're trying to do what's known as serializing data to a file so that you can retrieve it later in its exact form. That tends to be a problem when I'm dealing with a complex type like this one. I have a dictionary that's got some strings, an integer, and a list. How do I persist that data and its structure, and then retrieve it later?
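To make the problem concrete, here is a minimal sketch of the person dictionary described above, along with what happens if it is simply turned into text. The key names and values follow the lesson; the variable `as_text` is a hypothetical name used only for this illustration.

```python
# The person dictionary as described in the lesson.
person = {
    "first_name": "Derek",
    "last_name": "Jensen",
    "age": 100,
    "eye_color": "green",
    "programming_languages": ["Python", "C#", "Swift"],
}

# Converting the dictionary to text (what a plain-text write would store)
# keeps the characters but loses the structure.
as_text = str(person)  # hypothetical name, for illustration only

print(type(person["age"]))  # <class 'int'>
print(type(as_text))        # <class 'str'>

# The text form is just one long string, so it cannot be indexed by key.
try:
    as_text["age"]
except TypeError:
    print("the text form cannot be looked up by key")
```

Getting from that string back to a dictionary with an integer and a list inside it is the logic we would otherwise have to write by hand.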
That could take some very complex logic that I would have to build myself. Thankfully, in Python we can use some things we've already learned about accessing files, and bring in a little library to help us, and that's gonna make this entire process much, much easier.

So I have this person, and I want to serialize this object and save it to a file. Like we've talked about before, let's go ahead and open a file. We'll say open, and let's just call the file person.ser. You could give it whatever name and extension you want; it doesn't really matter. Then we're gonna give it some sort of access mode, and we're gonna refer to it as person_file. That's the start of it, so let's talk about a couple of things that are going to change here.

I do wanna write my person to this file, so we're going to use write access, but a slightly different write access. So far we've talked about modes like write and write plus. This time I wanna do a binary write, wb. That b says I want to write this in a binary format, so there's gonna be some extra stuff going on behind the scenes that I'll show you in just a few moments.

Now, there's a little more to it than that. You have to be able to format this person object in such a way that it's easy to write to the file. Luckily for us, there's a little trick here, a library called pickle. I know it's an interesting name, but it's incredibly powerful and very useful. So we're going to import that into our script. Then, within our with statement, now that we have the file open for binary writing as person_file, I want to use the dump function on pickle.
And I need to specify what it is I want to dump, which is gonna be my person object. Remember, this could be a dictionary, a string, a list of dictionaries, a dictionary of lists, or whatever you want. You can build very, very complex data structures and it will work out just fine. Then we need to specify where we wanna write it, and we wanna write it to our person_file.

So go ahead and save that, and we'll go back to our terminal and execute it. We'll say python3 and run our serialize script. Well, it doesn't look like anything happened, but a new file showed up here: person.ser. So let's take a look at that. There's some crazy stuff going on in here, but I also see some things that look familiar. I see first_name and Derek, so there is data in there. But what's all this other stuff? That's the binary representation of our person object.

Why did we do it that way? We did it so that it would preserve the structure of our person dictionary. And why does that matter? It matters because later on I can come back, reload that particular piece of information, and do more processing with it.

So how do we do that? It's quite simple. I'm gonna do a similar operation and open my file again. We're gonna open person.ser, and this time we're going to do a read binary, rb. We'll refer to it as person_file again just to be consistent, but you can name it something else if you want. In this case we're gonna get back some data, so I'm gonna say new_person is equal to, and then I can use pickle again. This time we're going to call load, and we need to specify where to load from, which is our person_file.
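The dump and load steps described above can be sketched roughly like this. The file name person.ser and the names person_file and new_person come from the lesson; the exact layout of the script is an assumption, since the original source file is not shown.

```python
import pickle

person = {
    "first_name": "Derek",
    "last_name": "Jensen",
    "age": 100,
    "eye_color": "green",
    "programming_languages": ["Python", "C#", "Swift"],
}

# "wb" opens the file for binary writing; dump serializes the object into it.
with open("person.ser", "wb") as person_file:
    pickle.dump(person, person_file)

# "rb" opens the same file for binary reading; load rebuilds the object.
with open("person.ser", "rb") as person_file:
    new_person = pickle.load(person_file)

print(new_person)
print(new_person == person)  # True
```

One thing worth knowing before using this beyond your own scripts: the Python documentation warns that pickle.load should only be used on data you trust, because loading a maliciously crafted pickle can run arbitrary code.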
Once I've retrieved that from my file, let's just print the result so we can see what it looks like. So we're gonna print new_person. Remember, I'm rerunning this entire script, and because I'm opening the file for write, it's going to overwrite the existing file. The contents will look the same, but it's gonna rewrite the file, then read it and load that serialized data with pickle into new_person. Then I print it out, and I would expect it to look exactly the same as the original person. Let's save that and run it.

As you see, I get back the representation that I serialized to disk. I have my programming languages, age, and eye color, and it's maintaining all the different types that were put into that dictionary. So I have a list of strings, an integer, and other strings. That's incredibly powerful once you start to get out and process this data. You can load up large data structures, save them to files, and then read them later for updating, pre-processing, and all that good stuff.

You could use more complex persistence mechanisms like databases if you'd like. But I would say that early on, as you're first getting going, file storage is not a terrible thing. You can do incredibly powerful things with the tools built into Python's standard library: reading files, writing files, and serializing complex data structures. That gives you a nice little system where you can do a lot of complex operations with very simple tools.
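As a rough illustration of that save, reload, and update cycle, here is a small sketch that stores a list of people instead of a single dictionary. The file name people.ser, the temporary directory, and the added language are all hypothetical choices made for this example and are not part of the lesson.

```python
import os
import pickle
import tempfile

# Hypothetical scratch location so the sketch does not touch real files.
path = os.path.join(tempfile.mkdtemp(), "people.ser")

# A list of dictionaries; only one person here, but more could be added.
people = [
    {
        "first_name": "Derek",
        "last_name": "Jensen",
        "age": 100,
        "eye_color": "green",
        "programming_languages": ["Python", "C#", "Swift"],
    },
]

# Save the whole structure.
with open(path, "wb") as people_file:
    pickle.dump(people, people_file)

# Load it back; the nested types (list, dict, int, str) are preserved.
with open(path, "rb") as people_file:
    loaded = pickle.load(people_file)

print(type(loaded[0]["age"]))  # <class 'int'>

# Update the loaded copy (hypothetical change) and write it back out.
loaded[0]["programming_languages"].append("Go")
with open(path, "wb") as people_file:
    pickle.dump(loaded, people_file)

# Reading once more shows the update was persisted.
with open(path, "rb") as people_file:
    updated = pickle.load(people_file)

print(updated[0]["programming_languages"])
```

Because the file is opened with "wb" each time, every save replaces the previous contents, which matches the overwrite behavior mentioned earlier in this lesson.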