ELI5: GitHub/GitKraken basics

I signed up for a college class thinking I’d be writing storylines for video games, but it is NOT that. So, I’m doing GitHub stuff and I am so confused.

I did a computer science fundamentals class last semester thinking it would be fun; it wasn’t. Technology is like magic mumbo jumbo and I cannot get a handle on it. Anyway, there are no other classes for summer I’m interested in, and I want to keep the credit hours I’ve signed up for.

What is a repository?
Commit?
Staging?
How does any of this work with coding? Or creating something?

And yes, I’ve watched the tutorials, but I just don’t understand what these basics mean, and the videos just act like I should.

repository is where the project lives: that includes all code and setup for the project

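commit: a saved record of a code change you made: for example, after adding a line of text somewhere, that change needs to be committed

staging: picking which of your changes go into the next commit (not to be confused with a “staging” or QA environment, which is a place to test code and has nothing to do with git staging)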

Think of it as saving a project at different points in its lifetime. A canvas turning into a painting, and every “commit” is a time you save the painting so you could go back to that point and do something different to the painting.

If you do change something, you can save it as a new “branch” and switch between branches and continue editing before switching back

All of these things are part of a VCS, or Version Control System. It turns out software is quite complicated, and having a history of what changes (commits) you make to a set of files (repository) is useful.

The repository serves as a storage place for your files and configurations and it allows other people to obtain those files with their history, make changes to them, and give them back (merge requests, pull requests) so their changes can go in with yours.

This can also serve to deconflict multiple people changing the same files. If you’re working on the same files that someone else is, it’s possible you could be conflicting. The VCS can help identify these conflicts and possibly even resolve them.

Of course you can develop software, or anything for that matter, without version control, but you may lose file history, struggle with access control, find collaboration difficult or impossible, and miss out on the other things these tools offer.

So explaining this to a 5 year old might actually be impossible, but the 2 most basic commands are:

1. Pull: This is basically just a glorified name for download. It downloads the latest version of the program code from GitHub onto your computer. (A related command, checkout, switches the files on your computer to a specific version.)
2. Push: This is the opposite of a pull. It uploads the versions you saved on your computer to GitHub.

Now for all the words you listed, here’s the simplest explanation I can come up with:

Repository: A collection of ALL the versions of code created for this project.

Commit: The act of saving a new version of the code into the repository on your computer. Pushing then uploads your commits to GitHub. (A pull request is a more advanced step on top of that: it asks your teammates to review your commits before they are merged in.)

Staging: This is the step right before a commit. It lets you pick which of your changes go into the next commit. This is useful if you are doing several different things at the same time and only want to save some of them.

How does any of this work with coding? Or creating something?

It is an amazingly powerful tool for software development, and it is most useful in a team setting.

For teams, its most basic function is to mix together people’s code without them having to coordinate stuff like: “I’ll pass you this snippet of code to add to line 253”. It makes working together a lot smoother and is generally loved by everyone.

As a summer class it seems like a great way to make yourself WAY more hireable as a software developer.

“I did a computer science fundamentals class last semester thinking it would be fun; it wasn’t. Technology is like magic mumbo jumbo and I cannot get a handle on it.” This is most people in IT. If you are looking for a relatable subreddit for this, r/ProgrammerHumor has hundreds of us all in your shoes.

This post is quite long, and it may all seem quite arcane and complex. I suppose it is. But after a while the basics will really feel relatively natural.

# Basics

Git is a version control system. A repository for files. Basically “a folder with timestamped versions of its content” – you can save folder state like you would save in a video game. Git was mostly made to work with plain text files (because code is mostly stored in plain text files), but it works with all kinds of files.

Why is this good? Because programming is hard, and programmers make mistakes, and it’s good to be able to come back to an earlier save. And for teamwork, but more on that later.

GitKraken is a git client – a piece of software for using git. There are many. Git by itself is used from the command line, which I wouldn’t recommend.

GitHub is a git hoster. It’s like a webmail website, but for git repositories instead of mail. There are others, but GitHub is the biggest and best-known one. Why does one need a git hoster? If you work alone, it’s mostly a backup, or a way to show or offer your code to others. But if you work in a team, the hosted (we say *remote*) git repository becomes the central repo you and your colleagues work on.

# In Practice (alone)

OK, so how does this work?

On your computer, you have a folder, and you create a git repository in that folder. Nothing changes; it’s still just a folder (technically, a hidden folder named “.git” is created in that folder).

Now when you add, delete, or change files in that folder, your git client will show you these changes. Once you are at a point that you would like to save, you mark all the changes you want to save – this can be added, removed, and renamed files, and changes within files. This is called *staging*.

Once you’ve staged all your changes, you make a *commit*. That’s essentially a savegame of what is currently in your folder. You give it a comment (“made the login button pink”), and you click *commit*. Confusingly, both the savegame and the action of creating it are called *commit*. Or, put another way, you commit files to the repository, and this set of changes is then also called a commit. A commit is identified by a long cryptic combination of numbers and letters, its SHA1 hash. You don’t need to know what that is, just that this is basically the commit’s identity – like your social security number.
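If it helps to see that as code, here is a rough toy model in Python. To be clear, this is not how Git actually stores things internally, and `ToyRepo`, `stage`, and `commit` are names made up for illustration. It only shows the idea: staged changes go into a snapshot, and the snapshot gets its identity by hashing its contents with SHA-1.

```python
import hashlib
import json


class ToyRepo:
    """Toy model of staging and committing (hypothetical, not Git's real internals)."""

    def __init__(self):
        self.staged = {}   # filename -> content, marked for the next commit
        self.commits = []  # list of (commit_id, message, snapshot), oldest first

    def stage(self, filename, content):
        """Mark one changed file to be included in the next commit."""
        self.staged[filename] = content

    def commit(self, message):
        """Save a snapshot: the previous snapshot plus everything staged."""
        previous = self.commits[-1][2] if self.commits else {}
        snapshot = {**previous, **self.staged}
        # The ID is a SHA-1 hash of the commit's contents, as 40 hex characters.
        payload = json.dumps({"message": message, "files": snapshot}, sort_keys=True)
        commit_id = hashlib.sha1(payload.encode("utf-8")).hexdigest()
        self.commits.append((commit_id, message, snapshot))
        self.staged = {}  # the stage is empty again after committing
        return commit_id


repo = ToyRepo()
repo.stage("login.css", "button { color: pink; }")
commit_id = repo.commit("made the login button pink")
print(commit_id, len(repo.commits))
```

The printed ID is the "long cryptic combination of numbers and letters": change even one character of the file or the comment and you get a completely different ID.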

Then you continue working, and if you made a big mess, you can go back to any earlier commit. Or if you did something good, you make a new one.

# Teamwork

When you work in a team, you also *push* your commit to the *remote* repository (the copy of the repository that lies on github), so your colleagues also see all your changes.

And if a colleague made a change, you *pull* their changes down onto your computer, into your local git repository.

If your colleague made a change while you were also making changes, you will have to *merge* their changes into your repository. Git does this automatically in most cases: It takes your changed files and the colleague-changed files, and it automatically creates a *commit* which it helpfully calls a *merge commit*. This resulting commit is nothing special, it’s just a savegame that contains both your and their changes.

Sometimes, you and your colleague changed the same part of the same file. You will get a conflict, and whichever of you was last will have to *resolve* that conflict by looking at both versions of the file and creating one that makes sense (often choosing one version or the other).

In plain text files, git can understand when you worked on different lines of the file, and it will not generate a conflict, but merge the files together with your changes in the lines you changed and your colleagues’ in the lines they changed. This works surprisingly well.
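For the curious, here is a minimal sketch of that line-by-line idea in Python. It is heavily simplified and not Git's actual merge algorithm: the function name `merge_lines` is invented, and it assumes all three versions of the file have the same number of lines (lines were only edited, never added or removed). The key ingredient is the common starting version (`base`), which lets the merge tell who changed what.

```python
def merge_lines(base, mine, theirs):
    """Merge two edited copies of a file, line by line, against their common ancestor.

    Simplifying assumption: all three versions have the same number of lines.
    Returns (merged_lines, conflict_line_numbers).
    """
    merged, conflicts = [], []
    for number, (b, m, t) in enumerate(zip(base, mine, theirs), start=1):
        if m == t:      # both sides agree (or neither changed the line)
            merged.append(m)
        elif m == b:    # only the colleague changed this line
            merged.append(t)
        elif t == b:    # only I changed this line
            merged.append(m)
        else:           # both changed it differently: a human must decide
            merged.append(m)  # keep my line as a placeholder and flag it
            conflicts.append(number)
    return merged, conflicts


base = ["Title: Draft", "Font size: 10", "Hello"]
mine = ["Title: Final", "Font size: 10", "Hello"]
theirs = ["Title: Draft", "Font size: 12", "Hello"]
merged, conflicts = merge_lines(base, mine, theirs)
print(merged, conflicts)
```

Here I changed line 1 and my colleague changed line 2, so both changes end up in the result and the conflict list stays empty. If we had both rewritten the title, line 1 would show up in `conflicts`.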

In other types of files, images for instance, git can’t do that, and if both you and your colleague changed the file, you will have to pick which version gets to live.

When you’re done with the *merge*, you should *push* the merge commit, so your colleague can *pull* it.

# Branches

Branches are cool, but they’re not basic. I won’t get into them here.

Imagine you and a friend are working on a project that involves a lot of files. This project has text documents, spreadsheets, images, all kinds of stuff. The two of you need to be able to make changes to this complicated mess while also making sure you’re not overwriting the work the other person is doing. So you agree on the following system:

There will be a master version of all the files, separate from the versions either of you have on your computers.

The two of you, and anyone else you might bring on board, can go get a copy of this master version.

Now that you have a copy, you can make changes to the files. When you’re satisfied with your changes, you can bundle them up, and maybe write a little note about why these changes were made.

Then you can send your bundle of changes to the master version.

If you and your friend both send bundles, and the files you were working on were different from the ones your friend was working on, no problem. The master version can apply your changes easily.

But if you both made changes to the same file, uh oh! The master version will accept the first bundle submitted, and then reject the second. It will tell that person “Hey, you need to get a new copy of the master version, look at the other person’s change, and resolve the discrepancy.” Hopefully it’s a non-problem; maybe you edited a document’s title, and your friend edited the font size. Those changes don’t conflict. But maybe you both edited the title. Now, the rejected person needs to do some work to resolve the problem. They need to make a decision about what the real change to the title should be.

Once they fix the conflict, they can resubmit the bundle, and the master version can be confident that this change takes the other person’s work into account.

If that makes sense, that’s git.
The master version is a repository.
Getting a copy of the master version is called cloning.
A bundle of changes is a commit. If you’ve made changes to many files, but don’t want to commit all of them, you can stage just the ones you’re ready to send.
Sending the bundle to the master is a push.
Having your change collide with one that someone else already made to the same part of a file is a merge conflict.
Getting the changes that are in the master but not in your local copy is called a pull.
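Here is the same story as a tiny Python sketch, in case seeing it run helps. It is only a toy: `ToyRemote` and its methods are invented names, each "commit" is just a text note, and real Git tracks far more than this. What it does show is the rule from above: the master version accepts a bundle only if it was built on top of everything the master already has.

```python
class ToyRemote:
    """Toy 'master version' that only accepts pushes built on its latest state."""

    def __init__(self):
        self.history = []  # commit notes, oldest first

    def clone(self):
        """Hand out a local copy, including the full history."""
        return list(self.history)

    def push(self, local_history):
        # Accept only if the local copy starts with everything the remote has.
        if local_history[:len(self.history)] != self.history:
            return False  # rejected: get the newer changes first
        self.history = list(local_history)
        return True


remote = ToyRemote()
remote.push(["initial files"])

you = remote.clone()
friend = remote.clone()
you.append("edit the title")
friend.append("change the font size")

first = remote.push(you)      # accepted: built on the latest master
second = remote.push(friend)  # rejected: friend is missing your commit

# Friend gets the new master, puts their change on top, and tries again.
friend = remote.clone() + ["change the font size"]
third = remote.push(friend)
print(first, second, third)
```

The second push is refused because the friend's copy no longer matches what the master holds. Once they fetch your change and redo theirs on top of it, the push goes through.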

GitKraken is just a UI so you can do all this without typing out commands. It also tries to help you visualize all the changes and contributors to a project, but honestly I don’t find the graph very helpful

There’s a ton more, but don’t sweat it. If you can understand git as a record of all the changes made by all the people working on a project, you’re 90% of the way there

Have you ever worked on a draft for a paper, or something, and saved a copy of your work? Then you add a few parts, not really sure you wanna keep them, so you save the file again, but you click “Save As…” instead of “Save” and keep it as a separate copy from the original? Then, at some point, you decide you don’t really like what you did, and you want to start over fresh from that original copy?

Maybe you end up doing this a lot. Each file has to have a unique name, so the place you keep your project starts piling up with files named something like `thing (3) (3) (final) (3) (actually final for real this time).docx`. Navigating this can be kind of hell if you take it too far. God forbid you have more than one file! It would be really nice if there was a kind of program that would do things like hiding all of the backup copies somewhere and only keeping the most up-to-date one around, but still allow you to jump back to any of your snapshots at any time…

That’s basically what **Git** does. Not GitHUB, or GitKRAKEN, just “Git”. Sorry if that’s confusing. I’ll get to that in a minute.

Git is, as lots of other commenters already pointed out, what they call a “version control” software. Its primary purpose is to, in a manner of speaking, take backup snapshots of your project from time to time, and keep the snapshots in a little filing cabinet for you. You tell it to “watch” a specific folder. Then, every time you tell it to save a snapshot, every single file in that folder (or only some of them, depending on how you set it up) will get saved in the snapshot. Even files within folders within folders all the way down, if you want. Then, as you work, if you ever want to “roll back” to one of your snapshots, you can do that. You just ask Git to open the filing cabinet and pull out the version of the project from the specific moment in history you ask it to. When it does this, all the files in the watched folder get replaced by the versions of the files that were there when the snapshot was taken. It doesn’t matter how many files changed, or how much they changed, or even if you added or deleted any files between snapshots. The folder will simply revert to whatever point in history you tell it to, like magic.
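If you like seeing ideas as code, here is a very rough Python sketch of the filing cabinet. The names (`ToyFilingCabinet`, `save_snapshot`, `roll_back`) are invented for this example, and a plain dictionary stands in for the watched folder. Real Git is much smarter about storage, but the behavior is the same in spirit.

```python
import copy


class ToyFilingCabinet:
    """Toy model of snapshots and rollback (hypothetical names, not Git's internals)."""

    def __init__(self):
        self.snapshots = []  # each snapshot is a full copy of the folder

    def save_snapshot(self, folder):
        self.snapshots.append(copy.deepcopy(folder))
        return len(self.snapshots) - 1  # position of this snapshot in the cabinet

    def roll_back(self, folder, index):
        """Replace the folder's contents with the chosen snapshot."""
        folder.clear()
        folder.update(copy.deepcopy(self.snapshots[index]))


folder = {"essay.txt": "first draft"}
cabinet = ToyFilingCabinet()
first = cabinet.save_snapshot(folder)

folder["essay.txt"] = "a rewrite I regret"
folder["notes.txt"] = "scratch ideas"
cabinet.save_snapshot(folder)

cabinet.roll_back(folder, first)
print(folder)
```

After the rollback, the edited file is back to its first version and the file added later is gone from the folder, yet the second snapshot is still sitting in the cabinet if you want it back.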

A **repository** in this analogy is the filing cabinet. Every project has a single filing cabinet where Git keeps all of that project’s snapshots. (It’s actually the folder called `.git` that gets created in the same folder as your project files. That’s where the snapshots live, along with some data Git uses to remember things like when they were taken and what order they come in.) A **commit**, when used as a noun, is the fancy word for a snapshot. Using the word “commit” as a verb refers to the action of taking a snapshot.

Normally you would do this and all of the following actions in the command line. But if you’re not proficient or comfortable in the command line, there are several graphical user interface tools that make interacting with Git easier, with clickable buttons and such, like the programs you’re probably used to. **GitKraken** is one of these tools, and there are many others. It’s apparently the one you happened to stumble across first, either by your own research or because the people you are working with made you use it. It’s neither a strictly good nor bad thing, it’s just a different way to use Git.

Back to Git. Let’s say you edited your file a bit, and you’re ready to take a snapshot of it. Or, to use the proper terms, you’re ready to commit your changes to the repository. The first thing you have to do is stage your changes. What does that mean? Well, it’s mostly only useful in situations where your project has many files, not just one. Sometimes, after you’ve changed several files since your last snapshot, you don’t want to save all of the changes in all of those files. You only want to keep some of them; you’re still on the fence about the others. That’s what the stage is for. When you **stage** a file, you are telling Git “next time I take a snapshot, only include these ones”. It’d be like… you’re at a family gathering, taking obligatory family photos, and someone suggests “okay, let’s do one of just the kids”. So you put only the kids in front of the camera. There are more people in the room than just the kids, but the kids are the only ones on the “stage”, so when the camera takes its snapshot, only the kids are in it.

So, this is all well and good. You have a fancy rollback tool now, and it can handle as many files as you want. Super. What else can it do?

Well, if you’ve ever worked on a long, complex project, you’ll know that sometimes it can start going down several different paths simultaneously. Only the good ones will win in the end, but you might not know which one is the best one until you pursue a couple paths for a ways. You could say the project has “branched” out into multiple versions.

Git is designed with this feature in mind, too. Its snapshot history isn’t a straight timeline. You can branch that timeline like a tree. Say you make a commit at some point in your project, then start making some changes, and committing those changes. Then, you can roll back, start over, and go a different direction with new changes and new commits. You now have two branches of the project living in Git simultaneously, each one descended from that branching-off point. In fact, a **branch** is exactly what Git calls these. You can make as many branches as you want and Git will remember all of them, and you’ll be able to jump around to any of them at any point just like you would with the rest of your history.
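A small Python sketch of that tree idea, with the usual caveats: `ToyHistory` and its methods are invented names, and commits here are just labels like "A" and "B". The one real idea it borrows is that each commit remembers its parent, and a branch is simply a name pointing at the latest commit on that path.

```python
class ToyHistory:
    """Toy model of branches: each commit remembers its parent."""

    def __init__(self):
        self.parent = {}    # commit name -> parent commit name (None for the first)
        self.branches = {}  # branch name -> latest commit on that branch

    def commit(self, branch, name):
        """Add a commit on top of whatever the branch currently points at."""
        self.parent[name] = self.branches.get(branch)
        self.branches[branch] = name

    def new_branch(self, branch, start_commit):
        """Start a new branch from an existing commit."""
        self.branches[branch] = start_commit

    def log(self, branch):
        """Walk back from the branch tip to the very first commit."""
        names, current = [], self.branches[branch]
        while current is not None:
            names.append(current)
            current = self.parent[current]
        return names


history = ToyHistory()
history.commit("main", "A")
history.commit("main", "B")
history.new_branch("experiment", "A")  # roll back to A and go another way
history.commit("experiment", "C")
print(history.log("main"), history.log("experiment"))
```

Both branches share commit "A" as their common ancestor, which is the branching-off point described above.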

The most important part, though, is that Git will let you take two different branches and **merge** them back together, incorporating changes from both branches into a single unified version. If the two branches changed totally unrelated parts of the project, Git can do this merging automatically. But if the same part of the project was modified in different ways on both branches, Git will dig its heels in and go, “Okay, woah, hold up. I’m not made of magic. You’re gonna have to tell me which of these conflicting parts you want to keep.” They call this a “merge conflict” in the biz. These are kind of a pain if you don’t expect them, and they tend to frighten new Git users, but keep in mind what Git is really doing here: it auto-merges files together, but it marks the places where it needs an actual human to go in and referee things so it doesn’t clobber something it wasn’t supposed to.
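Those marked places look scary the first time you open a file and see them, so here is a small Python sketch that builds one. The marker lines (`<<<<<<< HEAD`, `=======`, `>>>>>>>` plus a branch name) are what Git writes into a conflicted text file; the function name `mark_conflict`, the example titles, and the branch name `friend-branch` are made up for illustration.

```python
def mark_conflict(ours, theirs, their_branch):
    """Return the lines Git-style conflict markers would wrap around two versions."""
    return [
        "<<<<<<< HEAD",             # start of your version
        *ours,
        "=======",                  # divider between the two versions
        *theirs,
        f">>>>>>> {their_branch}",  # end of the other branch's version
    ]


block = mark_conflict(["Title: My Great Essay"], ["Title: Our Essay"], "friend-branch")
print("\n".join(block))
```

Refereeing the conflict means editing that block down to the one title you actually want, deleting the three marker lines, and then committing the result.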

Continued in part 2 below, sorry for length.