Lessons: 21, Length: 2.6 hours

1.3 Git Concepts

Git has a number of different fundamental concepts that we'll use frequently in the course, and that you'll come to rely on to understand what Git is doing. Let's take a look at some of these concepts now so that they'll be familiar as we come across them in the course.

Hi, folks. In this lesson, we're going to cover some basic Git concepts to give you an overview of what we'll be working with throughout the course, and a good idea of what is happening behind the scenes as we run various Git commands. A lot of the things we'll mention might not make perfect sense right now, but as we progress through the course and run different commands, hopefully you'll think back to this lesson and everything will fall into place.
So first of all, let's talk about the repository itself. What is a repository? The repository contains everything related to our project. It contains a series of objects that represent the contents of the files in our working directory. It contains the index and the HEAD, as well as all of the history for our project in the form of commits. The repository is the .git folder inside our working directory, which we'll see when we initialize our first repository in the next section of the course.
Now let's look at the working directory itself. When we initialize a repository, as long as we aren't creating a bare repository, the directory that contains the repository becomes the working directory. This is where the files that Git tracks reside, and where we go to edit those files and work on them. These are the uncompressed versions of the files, rather than the compressed blobs in the .git directory. If we don't have any uncommitted changes to the files in our working directory, the working directory is clean.
Git manages the working directory as one of three different trees. A tree is how Git stores directories. Another tree is the staging area, or index. We'll come back and look at staging files in a lot more detail in the next section of the course. For now, just understand that, like the working directory, the staging area is stored as a tree of content.
The last tree maintained by Git is the HEAD, which points to the last commit that was made on the current branch. When our working directory is clean, its contents match the HEAD.
Git stores things like a simple file system, where directory structures are stored as trees, and files, or rather the contents of files, are stored as blobs. Trees and blobs are different types of objects used by Git. Commits, which are snapshots of the repository at a given point in time, are also a type of Git object. All of these objects can be referenced using a unique 40-character SHA-1 hash. So we can get any object in the repository or in the history of the project, whether it's a blob representing the content of a file, a tree containing additional trees and blobs, or an entire commit, using this hash. And very often we can use just the seven-character shortened version of the hash to reference the specific piece of content that we want.
Let's look at these types of objects in a little more detail. A blob contains the content of a file. When we add a file to Git, Git doesn't store a plain copy of the file itself. Instead, it compresses the content using zlib and stores this compressed version as a file inside a directory. The name of the directory is the first two characters of the hash, and the name of the file is the remaining 38 characters of the hash. So when we give Git a hash and tell it to get that piece of content, it knows which folder and file to open.
Trees also have hashes, and they contain content, but this content is actually other trees or these blob objects. Each item in a tree is represented by four pieces of data: the permissions for the object; the type of object, whether it's another tree or a blob; the hash of the object; and the name of the file if it's a blob, or of the directory if it's a tree.
A commit is also an object, and again it contains content, but this content is different from that of tree objects or blob objects.
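Before turning to commits in detail, here is a minimal Python sketch of the blob storage scheme described above. It is an illustration rather than Git's own code, and the function names blob_object and object_path are made up for the example. It assumes the usual loose-object layout, in which Git hashes a short header (the word "blob", the content length, and a NUL byte) together with the content, and compresses that same byte string with zlib.

```python
import hashlib
import zlib
from typing import Tuple


def blob_object(content: bytes) -> Tuple[str, bytes]:
    """Return (sha1_hex, compressed_bytes) for a blob holding `content`."""
    # Assumption: the hash covers a "blob <size>\0" header plus the content.
    store = b"blob " + str(len(content)).encode() + b"\0" + content
    sha1_hex = hashlib.sha1(store).hexdigest()
    return sha1_hex, zlib.compress(store)


def object_path(sha1_hex: str) -> str:
    """Directory = first 2 hash characters, file name = remaining 38."""
    return f".git/objects/{sha1_hex[:2]}/{sha1_hex[2:]}"


if __name__ == "__main__":
    sha, compressed = blob_object(b"hello world\n")
    print(sha)               # full 40-character hash
    print(sha[:7])           # seven-character shortened form
    print(object_path(sha))  # where the compressed blob would live
```

Note that the file name on disk never appears in the blob: two files with identical content produce the same hash and therefore share one stored object.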
A commit object contains information about who made the commit, the commit message that was added when the commit was made, the hash of the parent commit, and the hash of the tree that points to the trees and blobs that make up the commit. Remember that the commit doesn't actually contain any of the content of our files. It points to a tree, which points to other trees and to blobs, and it is the blobs that actually contain this content. So the commit references trees and blobs, along with a bit of metadata about the commit, and interestingly, it also contains the hash of the parent commit. All commits, except the very first one, have at least one parent commit, which is just the commit that came immediately before this one. Sometimes commits can even have more than one parent commit, such as when two branches are being merged together.
When we make changes to files in our working directory and commit them, Git only needs to store new objects for the content that has changed. If the content of a file hasn't changed in this commit, Git can just reference the existing piece of content by its hash. It doesn't need to store another copy of the same file. And this is an important part of what makes Git so fast and lightweight.
So in this lesson, we looked at some of the fundamental concepts of Git that we should really get our heads around in order to understand exactly what is happening when we run some of the common commands. We looked at the different trees that Git maintains: the working directory, the index or staging area, and the HEAD. We also looked at the different types of object that Git uses, including trees for directories and blobs for file contents. We also looked a bit at the commit object. You should now have a rough idea of how Git works and how it tracks changes to files in your projects. Thanks for watching.
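As a rough companion to the description of trees and commits above, the following Python sketch builds two tiny commits in memory and shows that an unchanged file keeps the same blob hash across both. It is a simplified illustration, not Git's implementation: the helper names, file names, and author identity are hypothetical, entries are sorted by plain name (Git's real ordering has extra rules for directories), and the entry type is implied by the mode rather than stored as a separate field.

```python
import hashlib
from typing import List, Tuple


def hash_object(kind: str, body: bytes) -> str:
    """SHA-1 over a '<kind> <size>' header, a NUL byte, and the body."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def tree_body(entries: List[Tuple[str, str, str]]) -> bytes:
    """Serialize (mode, name, sha1_hex) entries into a tree body.

    The mode carries the permissions and also implies the entry type
    (for example, 040000 marks a subtree, 100644 a regular file).
    """
    body = b""
    for mode, name, sha1_hex in sorted(entries, key=lambda e: e[1]):
        body += f"{mode} {name}".encode() + b"\0" + bytes.fromhex(sha1_hex)
    return body


def commit_body(tree: str, parents: List[str], author: str, message: str) -> bytes:
    """Build the text of a commit: tree, parent(s), identity, message."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]  # none for the first commit
    lines += [f"author {author}", f"committer {author}"]
    return ("\n".join(lines) + "\n\n" + message + "\n").encode()


AUTHOR = "Ada Example <ada@example.com> 0 +0000"  # hypothetical identity

# README never changes; notes.txt changes between the two commits.
readme = hash_object("blob", b"hello\n")
notes_v1 = hash_object("blob", b"draft\n")
notes_v2 = hash_object("blob", b"final\n")

tree_1 = hash_object("tree", tree_body([("100644", "README", readme),
                                        ("100644", "notes.txt", notes_v1)]))
tree_2 = hash_object("tree", tree_body([("100644", "README", readme),
                                        ("100644", "notes.txt", notes_v2)]))

commit_1 = hash_object("commit", commit_body(tree_1, [], AUTHOR, "First commit"))
commit_2 = hash_object("commit", commit_body(tree_2, [commit_1], AUTHOR, "Update notes"))

print("README blob (shared by both trees):", readme)
print("first commit :", commit_1)
print("second commit:", commit_2)
```

Both trees reference the same README hash, so only the new notes.txt blob, the new tree, and the new commit would need to be stored for the second commit. A merge commit would simply pass two hashes in the parents list.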
