Free | Lessons: 24 | Length: 3.1 hours

3.1 Git Concepts

Now that we have a little bit of practical Git knowledge under our belts, I want to take a break from the practical stuff and talk a little bit about Git theory. And I'll mention right now that if you want to skip these two theory videos that I'm gonna be doing, and just move on to the practical stuff, you can do that and come back to these later. But I do recommend that you come back to these later if you're gonna skip them, because I think it's good to understand just a little bit of what Git is doing behind the scenes. And this is true even if you're used to using other version control systems. I say this because Git is actually pretty different from most other version control systems. It has some unique paradigms and ideas that we're going to be discussing in these two videos, and so even if you're used to using other systems, I recommend you stick around for this.
But before we actually get into the Git theory, I want to mention some of these other version control systems, cuz you might be wondering, what are the alternatives to Git? A few of the popular ones are Subversion (also called SVN), Mercurial, and Bazaar, and there are other ones too, as well as some proprietary ones. I can imagine that you might be thinking, well, why should I learn Git? Why invest all this time to learn Git instead of learning one of these other ones? Well, the About page on the Git website lists things like branching and merging, the staging area, and the fact that it's a distributed version control system as things that make Git better than the others. However, really, all you have to do is Google something like "why Mercurial is better than Git" and you'll get a lot of responses telling you that Mercurial is better than Git and why. So this is probably going to lead you to the question, why should I actually spend the time to learn Git? Is Git really the perfect tool?
Well, in my opinion, there isn't really a perfect tool for any developer task, mainly because we can't all agree on what the tool should do. So I'm not going to try to convince you here that Git is the best tool. If you're not convinced, go and do some research on your own and decide for yourself. That said, I think Git is a really great tool, and if its conventions resonate with you, then there's really no reason why you shouldn't use it. So, we're going to talk about a couple of these conventions in this video and a little bit in the next video.
And the first one that we're going to talk about is that Git is distributed. Git is a distributed version control system. And this really just means that when you clone a Git repository, you get the full history of it and not just the latest code. We haven't talked about cloning a repository yet, and we will in a future video, but for now, know that cloning a repository is really very simple. It's just making a local copy of that repository on your own personal computer. When you do that clone, you don't just get the latest version of the code. You will have the ability to look at any of the history of that code and see where the project was yesterday, where it was a month ago, where it was a year ago. A lot of version control systems just give you the latest code, or, at least, that's their default behavior. You can then use flags or special commands to make sure you get the whole history of the project. But Git defaults to giving you that whole history.
Of course, this also means that you don't need a network to make commits. Some version control systems actually have a central repository on a server, and different developers who are working on that repository will actually have to be connected to the network, and of course connected to that server, if they want to push a change or pull a change from that server. And we'll talk more about pushing and pulling in future videos, of course.
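The claim that a clone carries the whole history, and that commits need no network, can be tried out with a small sketch. Everything here is an assumption made for illustration: the scratch directory, the file name notes.txt, the commit messages, and the placeholder identity. A local path stands in for what would normally be a remote server.

```shell
set -eu                      # stop on the first error
work=$(mktemp -d)            # scratch area; nothing here touches a real project

# Build a small "origin" repository with three commits.
git init -q "$work/origin"
cd "$work/origin"
git config user.name "Example"                 # placeholder identity
git config user.email "example@example.com"
for n in 1 2 3; do
  echo "version $n" > notes.txt
  git add notes.txt
  git commit -q -m "Update notes to version $n"
done

# Clone it. The clone gets its own .git folder with every commit in it.
git clone -q "$work/origin" "$work/copy"
cd "$work/copy"
git log --oneline            # lists all three commits, not just the latest

# Commit in the clone without contacting the origin at all.
git config user.name "Example"
git config user.email "example@example.com"
echo "written while offline" >> notes.txt
git add notes.txt
git commit -q -m "Add a note while offline"

clone_commits=$(git rev-list --count HEAD)
origin_commits=$(git -C "$work/origin" rev-list --count HEAD)
echo "commits in clone:  $clone_commits"
echo "commits in origin: $origin_commits"
```

The clone ends up one commit ahead of the origin, because the new commit exists only locally until it is pushed.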
However, with Git it's all kept on your local computer, because it's distributed. It's distributed among the different computers that have copies of the repository. And of course we can still do the pushing and pulling of changes, as you'll see later. But Git allows you to actually make commits when you're not on a network, because all that history is stored locally. Of course, since it's all stored locally as well, every time you clone a repository you're actually making a backup of it. So this distributed system means that if you have a development project with lots of people working on it, then everybody actually has a complete history of it, which means that they all have a backup of the code. So you're not depending on a central server that has the one copy, where if it dies, well then, hopefully you had a backup for that. In this case, everybody has their own backup.
Next we're going to talk about the staging area, and how it compares to the working directory and to the Git repository itself. Now, I know we talked about this a little bit before, but I really wanna make sure you have a very solid understanding of how it works. You really need to fully [INAUDIBLE] the staging area, not only because it's one of Git's central features that you're gonna be using every day, all the time, but also because the staging area is actually a feature that is pretty much unique to Git. Most other version control systems don't have a staging area. So when we talked about the staging area before, we used the factory and shipping analogy. So if you look at this diagram here, you can see that these are the three places where Git stores content. And the first one is the working directory, and I said this working directory is kinda like a factory, because this is where the changes are actually made. But that's actually only partially true. Take a look at this output here. This is what we would get if we ran git status on our project right after we had made a commit of all the changes.
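The on-screen output is not captured in this transcript. As a minimal sketch, assuming a throwaway repository created just for this purpose (the file name, commit message, and identity are made up for illustration), something like the following should produce a similar result:

```shell
set -eu                      # stop on the first error
repo=$(mktemp -d)            # throwaway repository
cd "$repo"
git init -q
git config user.name "Example"                 # placeholder identity
git config user.email "example@example.com"

echo "hello" > readme.txt
git add readme.txt
git commit -q -m "Add readme"

# Everything has been committed, so status reports nothing to commit.
git status

# Script-friendly form: empty output means the working directory
# matches the latest commit.
changes=$(git status --porcelain)
echo "pending changes: '${changes}'"
```

The exact wording depends on the Git version: newer releases say "working tree clean" where older ones said "working directory clean".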
You can see that it says that there is nothing to commit and, most importantly, that the working directory is clean. This is what you could call the main state of the working directory. It is meant to mirror the latest commit in your project. And the difference here is that the actual commit is a compressed view of the data (and we'll talk about that more in the next video), while the working directory holds the uncompressed and complete version of the files. And this is actually called a checkout. We've made a checkout of the latest commit, and that is the working directory. It's interesting to note that if you haven't made any changes after committing, as you can see is the case with this output, the working directory is actually unnecessary and redundant, in the sense that it doesn't hold any data which isn't already safely stored in the Git repository.
In fact, you can actually make a copy of the repository so that the copy does not have a working directory. This is called a bare clone, and it's actually what GitHub uses. Repositories that are up on GitHub don't have a working directory because no one's actually working on those repositories. They're only being pushed to and pulled from by local computers. So those repositories are called bare clones because they don't have a working directory. So if you think about it that way, the working directory is where those uncompressed, complete files are, and that is what you're modifying. So if the repository doesn't need to be modified, it doesn't have a working directory. So in that way, the working directory is like a factory, because that's where things are created; that's where any changes are going to happen.
And the next area on our diagram is the staging area. As I mentioned earlier, this is kind of like the loading dock. After you're finished making file modifications, you stage them using git add. As you know, this is basically marking them for commit.
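Going back to the bare clone for a moment: a quick way to see a repository with no working directory is to make one locally. This is only a sketch; the directory names, file name, and identity are assumptions chosen for illustration, and a local path stands in for a hosted repository.

```shell
set -eu                      # stop on the first error
work=$(mktemp -d)            # scratch area

# An ordinary repository with one committed file.
git init -q "$work/project"
cd "$work/project"
git config user.name "Example"                 # placeholder identity
git config user.email "example@example.com"
echo "hello" > readme.txt
git add readme.txt
git commit -q -m "Add readme"

# --bare copies the repository data only; no files are checked out.
git clone -q --bare "$work/project" "$work/project.git"

is_bare=$(git -C "$work/project.git" rev-parse --is-bare-repository)
echo "bare repository: $is_bare"

# The bare clone holds Git's internal files, but readme.txt is not
# checked out anywhere inside it.
ls "$work/project.git"

# The commit itself is still there.
git -C "$work/project.git" log --oneline
```

The bare copy contains what an ordinary repository keeps inside its .git folder, which is why it can be pushed to and pulled from even though nobody edits files in it.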
Actually, the staging area is just a single file in that .git folder within your Git repository, a file called index. As you can see, I've got the file highlighted in red here. And that index file is what keeps track of what changes we have staged and what changes we have not staged. And actually, index is another name for the staging area. Sometimes it's just called the index.
Now, you might be thinking, well, what's the whole point of having a staging area anyway? If I've made changes in the working directory, why would I not want to commit them? Why would I not want to just take everything in the working directory and commit it? And then make changes and commit, and make changes and commit. Why do I have to go through this intermediate step of actually staging the files? Well, when using version control, it's really important that your commits make sense. Here are a couple of examples of good commit messages that I pulled out of the jQuery project on GitHub. Each of these commits actually does a single thing. Now, I don't know about you, but when I'm writing code, it's not always this organized. I'm not always working on a single thing at a time. Maybe I'll be working on one feature and I'll find a bug, and so I'll just pause that feature, jump over here, and just make the changes to fix that bug. And then I'm about to commit that bug fix, and I say, oh, wait a second. I've actually half created this other feature, and if I just commit everything, I'm going to commit the bug change and I'm going to commit that half-finished feature. That's going to make you come up with some bad commit messages. You're going to end up with things like this, where you just say, oh, I changed a bunch of things. Or I added this feature and I also moved this, or I changed that. Or you're going to have commits where you think you fixed something, and then you do some more testing and, oh no, you didn't fix it. So you're going to try again. And, oh, you almost got it.
And this kind of thing really makes a version control system almost useless, because your commits don't actually show a step-by-step growth of the project. So, you want the good kind of commit messages, and this is where that staging area comes into play. The staging area allows you to pick and choose what you want to commit, and commit just that. So if I have started a feature, I find a bug, I go fix the bug, and I want to commit the bug fix before I actually continue working on the feature, I can add just those bug fixes to the staging area and make a commit. And the part of the feature that I've already created, that just stays there in the working directory. I commit the bug fix to the Git repository. That's been committed, so now I can go back to working on the feature. And that's how I can make these nice commit messages, where I created this feature, I added this thing. And it's all very encapsulated bits, and that's what you're aiming for when you're making a commit: you want it to be encapsulated and appropriate. You want an intelligent commit to put into your Git repository.
And that Git repository is the third area where content is stored in your Git project. And you might just think that the whole project directory is a Git repository, isn't it? Actually, it's just that .git folder inside of your project. We saw that .git folder in an earlier video, and that .git folder is actually where all the magic happens. That is what Git is concerned with, and that is the Git repository. All the content and all the changes you've ever made to your project are actually stored in some form inside of that single little folder inside your project folder. And so, to come full circle, after you make a commit, assuming you have committed all the changes you made, the latest content in the Git repository, the latest commit that you've made, is actually identical to the working directory. And then we just continue that cycle where you make changes, you stage them, and they're committed.
You make changes, you stage them, and they are committed. Well, that's all we're gonna discuss in this video. In our next theory lesson, we're gonna look inside of that .git folder, find out how Git is actually storing our content, what it does to actually track the changes that we make, and how you can reference some of the commits you have made.
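The bug-fix-in-the-middle-of-a-feature scenario from this lesson can be sketched as a short script. The file names, contents, commit messages, and identity below are hypothetical, chosen only to keep the two kinds of change apart; in this sketch the fix and the unfinished feature live in separate files.

```shell
set -eu                      # stop on the first error
repo=$(mktemp -d)            # throwaway repository
cd "$repo"
git init -q
git config user.name "Example"                 # placeholder identity
git config user.email "example@example.com"

echo "stable code" > app.txt
git add app.txt
git commit -q -m "Add app"

# Two unrelated changes now sit in the working directory.
echo "bug fix" >> app.txt                  # the fix, ready to commit
echo "half-built feature" > feature.txt    # unfinished work

# Stage only the fix. The staging area is stored in the .git/index file.
git add app.txt
test -f .git/index && echo "index file exists"
git status --short           # app.txt is staged, feature.txt is not

# The commit contains only what was staged.
git commit -q -m "Fix bug in app"

committed=$(git diff-tree --no-commit-id --name-only -r HEAD)
leftover=$(git status --porcelain)
echo "files in the commit: $committed"
echo "still uncommitted:   $leftover"
```

The feature file stays in the working directory untouched, so the commit message can describe one thing, the bug fix, and nothing else.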
