March 21, 2022

Git 101

From time to time I get to the same place, telling some people about git, what it solves and some basic usage.

Since I’ve done it a lot recenly I wanted to write down a post and enjoy it.

What is git?

Git is a gift from the gods for the following use cases:

  • My laptop is broke! I need the data there is a whole month of work there!

  • The stupid google drive override my code and now my project doesn’t start!

  • I tried to change some lines of code, then it complicated and now I can’t go back, I’m thinking of deleting everything and start over.

Basically is an historic of our files stored as a graph. You take a picture of a folder and store it. You change some stuff in the folder and decide that you like these changes, then you take another picture.

If we try to diagram it we can think of it as looking inside a bag:

At some point my bag contained 1,2 and 3.
_ _ _ _

|1 3 |

| 2 |

|_ _ _ _| Thank in git is identified as a **commit** a picture of my bag.

I talked with the sales team and they offered me a good exchange my 2 for their 4.

So my bag became:

_ _ _ _

|1 3 |

| 4 |

|_ _ _ _| I could store only the changes as example in this:

So my history could be seen as:

First commit (1,2,3) -> Second commit (2 -> 4)

But this will be really slow to regenerate the state at some point since all operations are dependent. So git stores all the data in the snapshot but it performs and underlying trick, if the file did not change it will share the previous reference.

By storing all the data in each commit I can go to any point really really fast.

Does it work automatically? No.

The git “flow” is usually described as “zones”. Let’s go bit by bit.

How do I start with git?

Just install it following the official docs and also do the intiial config https://git-scm.com/book/en/v2/Getting-Started-Installing-Git

From the second link we just need to set our name and email: https://git-scm.com/book/en/v2/Getting-Started-First-Time-Git-Setup

$ git config --global user.name "John Doe"
$ git config --global user.email johndoe@example.com

First, how we do create a git repository?

Just get into a folder (that may already contain some files) and type in the console “git init”. That will create the magic “.git” directory (the point before git indicates that it is a hidden directory).

Second what can I do now?

Now all the files you create are in the working area that is your local file system.

If you want to persist some changes, you need to put them into the staging area in order to make so, you need to add then, just type git add <file_name> or git add . for adding all the changes to the staging area.

Note: this area do not exists by itself, internally git has a file called index where it tracks all the changes so you basically are writing changes into that file. That means that you could store some changes on a file and also delete that file for the working area! Commit the changes and still do not see the file

Now we have 2 options: ¡2! People don’t know a lot this, but both are really useful:

  • Commit: If we want to store the changes in a graph we need to commit the changes. If you type git commit you will enter into a text editor for writing a message. Git messages should clarify the changes. (If it is vim and you are stuck, pres ESC and then :wq (it will appear below) and type enter, you can avoid vim by type git commit -m "my message".

  • Stash: If we want to discard the changes, we have another area called stash let’s consider it a bin where we can save some changes for the future. Just type git stash and all the changes you made will dissapear! (They can be recovered with git stash pop)

Lets summarize the areas:

working area  staging area commit area stash area
where the untracked changes reside my “bare” filesystem it holds the changes I decided to add the “stored” changes applied to my files the trsh bin

When we take the first paths there are 3 possibilities:

  1. Everything is ok

  2. We did the commit in the wrong place (branch)

  3. We committed some stuff that we shouldn’t have committed.

  4. We messed it up, I want to undo everything.

Let’s start by the case 3 and 4:

A commit can be removed by two commands: git revert (which creates a new commit undoing the changes) and git reset.

If we added sensitive data like credentials, we really want it out, as anyone can check out git history if we publish it. So we are going to apply git reset.

There are two options for this: we can keep the changes or lose everything.

If we want to keep the changes we may use git reset --soft HEAD~N. (Where is the number of commits to undo).

That will add all our changes to the staging area and undo the commit.

We can remove the files from the staging area using git rm <file_name> and then commit them back.

If we want to lose all the changes, git reset --hard HEAD~N if enough. (Where n is the number of commits to undo.) What is this? What is HEAD~N?

Well, we need to introduce a new concept: the branches

Think of the graph mentioned before:

commit1 --> commit2 --> commit3 --> commit4

In order to have quick references to that list of commits we can use branches with would be a movable pointer that keeps pointing to the last commit.

            [main]      [dev]       [feature/impl]
commit1 --> commit2 --> commit3 --> commit4 [HEAD]

Let’s think of this diagram, we have 3 commits, the main branch is pointing to the commit2, while the dev (that also has the commit2) is pointing to a new commit, the commit3. The commits are the ones relating between them the branches are just pointers to those commits.

And then we have a feature/branch that comes from dev (as it has the same commit as dev) and has the HEAD tag, why?

Because you can just be in a branch at a time, the current branch in its last state it is called the HEAD.

That means when you’re undoing / reversing changes you want to gos from the current branch on the last commit N commits back.

Obviously nothing to add to case 1 let’s keep digging into branches and then we will be able to solve case 2! :)

What are branches for?

That’s a good question, first, a branch permits to start developing stuff in paralell while maintaining isolation the current working code.

For example in the following diagram

commit1 --> commit2 --> commit3 --> commit4 [HEAD main]

|

| --> commitX [feature/1]

|

| --> commitY [feature/2]

How do I work with branches?

First you need to create a branch with git branch <branch name>. Then you must change to that branch, checking it. git checkout <branch name>.

This operation is sooo common that there is a shorcut: git checkout -b <branch name>.

So for case 2 (did you forget it?), we need to generate a new branch with the previous command it will contain the current commits. Then we can reset the branch were we add those changes so everything is ok and nobody will ever know. (How would they know?)

Ok, so now I have several changes on my branch feature/X and I want them on master, what do I need to do?

On a team you will do a pull request, and that operations is automating the following commands:

git checkout master

git diff feature/X

So all the changes are displayed and another team member can review your code. If the pull request is approved it will perform:

git merge feature/X.

Notice how I went to master and merge feature/X into master. That means I’m applying those changes into my branch.

That will generate a new commit introducing all the changes from the feature/X branch into master.

Well ok… Too much stuff, let’s say I get. But… if my pc brokes down I’m in the same problem, am I?

No! Git can be synchronized with remote repositories. You can register into one of the several git providers (github, gitlab, bitbucket, azuredevops… or whatever your company is using) and you can mirror your changes into a remote repository doing what’s calling push & pull.

For an existing repository you can just add its match just by using this command:

git remote add origin <repo url>.

An obvioulsy you can remove a remote repository using git remote rm origin and then adding a new one.

Most providers support two ways of authenticating: https and ssh. The latest is my preferred and here is a guide for setting up your ssh keys:

https://docs.github.com/en/authentication/connecting-to-github-with-ssh/generating-a-new-ssh-key-and-adding-it-to-the-ssh-agent

Once you have your remote repo linked you can “push” your changes using the command git push

You may get an error saying that there is no ‘upstream’ branch. Usually when there is no remote branch it throws this error and it also answer you with the proper command to use.

Also if somebody published their changes you can bring your changes into your computer by “pulling them” that means using git pull.

And just for explaining something from before. Let’s say we publish a branch to the remote repo, but the last commit contained some credentials. I removed the commit from my local history but it exists on the remote branch. I can overwrite the remote branch by forcing the changes with git push -f.

And if a remote repository exists?

Well that’s even easier! You just need to clone the remote repository. Go to the directory where you want to repo to be put in and then type git clone <repo_url>. That will create a new directory with the repo name and will have the latest changes from master.

This may have been a bit intense for a 101, but for sure you’re more than ready for the typical git workflow! :) Most people only use this during several years.

Good luck! I promise to iterate over these tutorial.

2017-2024 Adrián Abreu powered by Hugo and Kiss Theme