Git is a version control system (VCS) for tracking changes to projects. Version control systems are also called revision control systems or source code management (SCM) systems. These projects can be large-scale programs like the Linux kernel, but they can also be smaller-scale projects like your own R development, homework assignments, papers, or thesis. There are many other VCSs available (Subversion and Mercurial are currently used extensively in open source projects), but Git is one of the easier ones to set up. It is also well supported by the GitHub and GitLab ecosystems, and UI has recently set up its own GitLab service.
Git is available on the CLAS-managed Linux workstations.
You can also install Git on your own Linux system. Git is available for installation as a package within most Linux distributions and can be installed that way (e.g. using yum
or dnf
on Red Hat systems or apt-get
on Debian systems, or their graphical interfaces). You can also install from source code.
Instructions for installing on Windows or macOS are available at https://happygitwithr.com/
There is a graphical interface git-gui
that may be useful.
A number of programming editors and Integrated Development Environments (IDEs) include Git support. Several Emacs Git modes are available. RStudio has integrated Git support.
Before starting to use Git you should first tell Git who you are so that you can be identified when you make contributions to projects. This can be done by specifying your name and email address.
git config --global user.name "Your Name Comes Here"
git config --global user.email your_email@yourdomain.example.com
These two commands only need to be executed once. The information you provide is included in a .gitconfig
file in your home directory and is used to label your actions in the Git log.
If you have spaces in the name you provide then be sure to enclose the name in quotation marks.
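To see what these commands write, here is a minimal sketch that runs them against a throwaway home directory, so your real .gitconfig is not modified. The scratch directory and the variable names (scratch_home, configured_name) are assumptions made for illustration only.

```shell
# Throwaway home directory so this sketch does not touch your real ~/.gitconfig.
scratch_home=$(mktemp -d)

# Run the configuration commands in a subshell with HOME pointing at the
# scratch directory, then read the stored name back.
configured_name=$(
  export HOME="$scratch_home"
  unset XDG_CONFIG_HOME
  git config --global user.name "Your Name Comes Here"
  git config --global user.email your_email@yourdomain.example.com
  git config --global --get user.name
)

echo "user.name is: $configured_name"
cat "$scratch_home/.gitconfig"   # the file Git created to hold these settings
```

On your own account you would run only the two git config --global commands; git config --global --get user.name is a quick way to confirm what is stored.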
You can create a directory for your new project using the mkdir
command.
mkdir myproject
or you can work with a project directory that already exists. You can enter that directory with
cd myproject
Once you are in the project directory, you need to initialize your Git repository with the command git init
so that Git knows that it needs to track changes here.
luke@nokomis ~/myproject% git init
Initialized empty Git repository in /home/luke/myproject/.git/
If you type ls -a
you will see a directory named .git
has been added. This directory stores all of the history information and other configuration data. Don’t touch this directory.
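As a quick check, the following sketch initializes a repository in a scratch directory and confirms where Git keeps its data. The scratch directory created with mktemp -d stands in for ~/myproject and is an assumption of this example.

```shell
# Scratch directory standing in for ~/myproject (an assumption for this sketch).
project=$(mktemp -d)
cd "$project"

git init -q   # -q suppresses the "Initialized empty Git repository" message

# git rev-parse --git-dir prints the location of the repository data.
git_dir=$(git rev-parse --git-dir)
echo "repository data lives in: $git_dir"

ls -a         # the listing now includes the .git directory
```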
Now try editing a file. For example, let’s create a file named code.R
containing some R code. For example, you might add the lines
x <- rnorm(100)
hist(x)
Once you are finished editing the file, you can return to Git and type git status
:
luke@nokomis ~/myproject% git status
# On branch master
#
# Initial commit
#
# Untracked files:
# (use "git add <file>..." to include in what will be committed)
#
# code.R
nothing added to commit but untracked files present (use "git add" to track)
Notice that the code.R
file is listed under Untracked files
. This is because you need to tell Git that you want to track the changes in this file. This can be done using git add code.R
:
luke@nokomis ~/myproject% git add code.R
Running git status
again after this command produces
luke@nokomis ~/myproject% git status
# On branch master
#
# Initial commit
#
# Changes to be committed:
# (use "git rm --cached <file>..." to unstage)
#
# new file: code.R
#
Now code.R
is listed under Changes to be committed
and is considered a new file
.
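The same transition from untracked to staged can be seen more compactly with git status --porcelain, a short output format intended for scripts. This is a minimal sketch in a scratch directory (an assumption); the two-character codes at the start of each line are the part to watch.

```shell
# Scratch project (an assumption for this sketch).
project=$(mktemp -d)
cd "$project"
git init -q
printf 'x <- rnorm(100)\nhist(x)\n' > code.R

before=$(git status --porcelain)   # "??" marks an untracked file
git add code.R
after=$(git status --porcelain)    # "A " marks a new file staged for commit

echo "before git add: $before"
echo "after git add:  $after"
```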
Now that you have added your new code file you can commit the change using git commit
. The git commit
command requires that you provide a short message about what the changes are and this can be done using the -m
switch. If you do not use this switch git
will open an editor session for you to enter a message. Make sure you write something informative (but concise) or else you won’t have any idea what you did when you look at it later.
luke@nokomis ~/myproject% git commit -m "Initial code for plotting histogram"
Created initial commit 7425153: Initial code for plotting histogram
1 files changed, 2 insertions(+), 0 deletions(-)
create mode 100644 code.R
After committing the change you can run git status
. It should say something like
luke@nokomis ~/myproject% git status
# On branch master
nothing to commit (working directory clean)
Notice that the first line says On branch master
. The master
branch is considered the main line of development. It is possible to have other branches of development but we will skip that for now.
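The add and commit steps can be replayed end to end in a scratch repository. In this sketch the identity "Example User" and the scratch directory are placeholders, set only for that repository so the example runs even where no global identity is configured.

```shell
# Scratch project with a repository-local placeholder identity.
project=$(mktemp -d)
cd "$project"
git init -q
git config user.name "Example User"
git config user.email user@example.com

printf 'x <- rnorm(100)\nhist(x)\n' > code.R
git add code.R
git commit -q -m "Initial code for plotting histogram"

subject=$(git log -1 --format=%s)     # subject line of the latest commit
remaining=$(git status --porcelain)   # empty when nothing is left to commit

echo "last commit: $subject"
echo "uncommitted changes: ${remaining:-none}"
```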
Suppose you want to change the code in your code.R
file and print a summary of the data instead of plotting a histogram. You can load the file in your favorite editor and add the line
summary(x)
and delete the line
hist(x)
Once you are done editing the file, you can save/close it and run git diff
to see a summary of the changes.
luke@nokomis ~/myproject% git diff
diff --git a/code.R b/code.R
index de54497..32a52f4 100644
--- a/code.R
+++ b/code.R
@@ -1,2 +1,2 @@
x <- rnorm(100)
-hist(x)
+summary(x)
The output shown here is the Unix diff
format and it shows what lines were added, deleted, or changed. If a line has a -
in front of it, that line was removed. If a line has a +
in front of it, that line was added. Here you have removed the hist(x)
line and added the summary(x)
line.
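For a numeric summary of the same change, git diff --numstat prints the number of added and removed lines per file. The sketch below recreates the edit in a scratch repository; the directory and the placeholder identity are assumptions of the example.

```shell
# Scratch project with one committed version of code.R.
project=$(mktemp -d)
cd "$project"
git init -q
git config user.name "Example User"
git config user.email user@example.com
printf 'x <- rnorm(100)\nhist(x)\n' > code.R
git add code.R
git commit -q -m "Initial code for plotting histogram"

# Replace the hist(x) line with summary(x), as in the text.
printf 'x <- rnorm(100)\nsummary(x)\n' > code.R

git diff                         # full diff: one "-" line and one "+" line
numstat=$(git diff --numstat)    # tab-separated: added, removed, file name
echo "$numstat"
```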
In order to commit this change to the history, you need to add the file using git add
again and then commit it with a commit message.
luke@nokomis ~/myproject% git add code.R
luke@nokomis ~/myproject% git commit -m "Do summary instead of histogram"
Created commit 6d8dafe: Do summary instead of histogram
1 files changed, 1 insertions(+), 1 deletions(-)
Since the combination of git add
followed by git commit
is so common there is a shortcut: git commit -a
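. This stages every modified or deleted file that Git already tracks before committing; new untracked files still need git add first.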
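One point worth checking for yourself is that git commit -a only picks up files Git already tracks. In this sketch (scratch directory, placeholder identity, and the file notes.txt are all assumptions) a tracked file is modified and a new file is created; only the tracked file ends up in the commit.

```shell
# Scratch project with one committed version of code.R.
project=$(mktemp -d)
cd "$project"
git init -q
git config user.name "Example User"
git config user.email user@example.com
printf 'x <- rnorm(100)\nhist(x)\n' > code.R
git add code.R
git commit -q -m "Initial code for plotting histogram"

printf 'x <- rnorm(100)\nsummary(x)\n' > code.R   # change a tracked file
echo "scratch notes" > notes.txt                   # create an untracked file

git commit -q -a -m "Do summary instead of histogram"

leftover=$(git status --porcelain)    # notes.txt is still untracked
commits=$(git rev-list --count HEAD)  # number of commits on this branch
echo "still uncommitted: $leftover"
echo "commits so far: $commits"
```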
Now there are two revisions in your project history. You can see the complete project history by using the git log
command:
luke@nokomis ~/myproject% git log
commit 6d8dafe72a198ed63d11be8592c39bcd14179a6b
Author: Luke Tierney <luke@stat.uiowa.edu>
Date: Wed Jan 21 07:15:18 2009 -0600
Do summary instead of histogram
commit 7425153349ff276d89731ac8092597e7dc67520d
Author: Luke Tierney <luke@stat.uiowa.edu>
Date: Wed Jan 21 07:07:22 2009 -0600
Initial code for plotting histogram
The git log
command gives you the commit identifier, the author of the commit, the date of the commit, and the short message that you provided with each commit.
Reverting a Commit
Now suppose you decide that the summary is not that useful and that you’d rather do the histogram like you had it before. You can resolve this situation with the git revert
command. Notice in the Git log that the most recent commit has identifier 6d8dafe72a198ed63d11be8592c39bcd14179a6b
(NOTE: this identifier string may be different on your computer!). If you want to reverse the change that this commit introduced you can run
git revert --no-edit 6d8dafe
and the code.R
file will be reverted back to the version just before that commit. Note that you do not have to type in the entire identifier string at the command line; any unique prefix of it will suffice. Usually, using the first 7 characters is more than enough. Now when you run git log
you get
luke@nokomis ~/myproject% git log
commit bd21cc08168a38a86931416b701b3c248d0c36f7
Author: Luke Tierney <luke@stat.uiowa.edu>
Date: Wed Jan 21 07:21:34 2009 -0600
Revert "Do summary instead of histogram"
This reverts commit 6d8dafe72a198ed63d11be8592c39bcd14179a6b.
commit 6d8dafe72a198ed63d11be8592c39bcd14179a6b
Author: Luke Tierney <luke@stat.uiowa.edu>
Date: Wed Jan 21 07:15:18 2009 -0600
Do summary instead of histogram
commit 7425153349ff276d89731ac8092597e7dc67520d
Author: Luke Tierney <luke@stat.uiowa.edu>
Date: Wed Jan 21 07:07:22 2009 -0600
Initial code for plotting histogram
Notice that history is not erased, but the revert is officially in the log.
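The revert workflow can be rehearsed safely in a scratch repository. This sketch rebuilds the two commits from the text, reverts the second one, and checks that the file content is restored while the history grows to three commits. The scratch directory, the placeholder identity, and the variable names are assumptions.

```shell
# Scratch project with the two commits from the text.
project=$(mktemp -d)
cd "$project"
git init -q
git config user.name "Example User"
git config user.email user@example.com
printf 'x <- rnorm(100)\nhist(x)\n' > code.R
git add code.R
git commit -q -m "Initial code for plotting histogram"
printf 'x <- rnorm(100)\nsummary(x)\n' > code.R
git commit -q -a -m "Do summary instead of histogram"

# Look up the abbreviated identifier of the commit to undo, then revert it.
to_undo=$(git rev-parse --short HEAD)
git revert --no-edit "$to_undo"

commits=$(git rev-list --count HEAD)   # the revert is recorded as a new commit
last_line=$(tail -n 1 code.R)          # the file is back to the histogram call
subject=$(git log -1 --format=%s)
git log --oneline
```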
For collaboration or backup it is useful to have your local repository linked to a remote repository, for example one on a GitHub server.
It is easiest to start with a repository created on the GitHub server. Cloning this repository creates a new local repository that is linked to the remote one you cloned.
Using a sample repository in https://research-git.uiowa.edu/STAT4580-Spring-2020/STAT4580.git you can clone and enter the repository with
git clone https://research-git.uiowa.edu/STAT4580-Spring-2020/STAT4580.git STAT4580
cd STAT4580
You can check the remote repository you are connected to with
git remote -v
Now you can edit and add files and commit your changes. Once you are done, you can push your changes with
git push
Before you continue working you should pull any changes you might have committed from another computer or that might have been committed by a collaborator:
git pull
It is also possible to connect a repository you created locally with a remote repository, but starting with the remote one is simpler.
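If you want to practice the clone, push, and pull cycle without a server, a bare repository on the local disk can stand in for the remote. This is only a sketch: the directory names seed, remote.git, laptop, and desktop are hypothetical, and a real setup would clone from a GitHub or GitLab URL instead.

```shell
work=$(mktemp -d)
cd "$work"

# A seed repository with one commit, and a bare copy that plays the server.
git init -q seed
git -C seed config user.name "Example User"
git -C seed config user.email user@example.com
echo "x <- rnorm(100)" > seed/code.R
git -C seed add code.R
git -C seed commit -q -m "Initial commit"
git clone -q --bare seed remote.git

# Two working clones, for example a laptop and a desktop machine.
git clone -q remote.git laptop
git clone -q remote.git desktop

# Commit on the laptop and push to the shared repository.
git -C laptop config user.name "Example User"
git -C laptop config user.email user@example.com
echo "hist(x)" >> laptop/code.R
git -C laptop commit -q -a -m "Plot a histogram"
git -C laptop push -q

# Pull on the desktop to pick up the change.
git -C desktop pull -q --ff-only
git -C desktop remote -v
pulled=$(tail -n 1 desktop/code.R)
echo "desktop now ends with: $pulled"
```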
You can read the help page for a specific command by calling git help
. For example, if you wanted to read the help page for git status
, you could call
git help status
Inserting help in between git
and the command name will retrieve the help page for that command.
- git status: check status and see what has changed
- git add: add a changed file or a new file to be committed
- git diff: see the changes between the current version of a file and the version of the file most recently committed
- git commit: commit changes to the history
- git log: show the history for a project
- git revert: undo a change introduced by a specific commit
- git checkout: switch branches or move within a branch
- git clone: clone a remote repository
- git pull: pull changes from a remote repository
- git push: push changes to a remote repository
Some ways for collaborating using Git:
- Team members use git pull and git push to share their changes.
- Have one team member maintain a Git repository on a web page. Other team members clone this repository and pull changes from it. Changes made by others can be contributed to the repository maintainer as email patches.
- Use shared disk space for the remote repository (physical space such as a USB drive or a sharing service like Dropbox).
With Git you can create different branches
of development in addition to the master
branch that is created by default when you call git init
. These side branches can be used to test out new ideas or to explore past history.
For example, suppose you wanted to rewind the state of the repository to a previous commit, but you don’t want to revert all of the commits after that point to get to that previous commit (in other words, you want to preserve the project history as is). The simplest thing to do is to create a new branch at the point in the history that you’re interested in exploring.
Creating a new branch can be done with the git checkout
command. First take a look at the git log to see where in the history you want to create a new branch.
luke@nokomis ~/myproject% git log
commit bd21cc08168a38a86931416b701b3c248d0c36f7
Author: Luke Tierney <luke@stat.uiowa.edu>
Date: Wed Jan 21 07:21:34 2009 -0600
Revert "Do summary instead of histogram"
This reverts commit 6d8dafe72a198ed63d11be8592c39bcd14179a6b.
commit 6d8dafe72a198ed63d11be8592c39bcd14179a6b
Author: Luke Tierney <luke@stat.uiowa.edu>
Date: Wed Jan 21 07:15:18 2009 -0600
Do summary instead of histogram
commit 7425153349ff276d89731ac8092597e7dc67520d
Author: Luke Tierney <luke@stat.uiowa.edu>
Date: Wed Jan 21 07:07:22 2009 -0600
Initial code for plotting histogram
Suppose we want to go back and see the state of the project after the second commit. We can create a new branch starting at the commit with identifier 6d8dafe...
:
luke@nokomis ~/myproject% git checkout -b test 6d8dafe
Switched to a new branch "test"
Note again the short version of the identifier. Now you are on the test
branch. Running git log
gives
luke@nokomis ~/myproject% git log
commit 6d8dafe72a198ed63d11be8592c39bcd14179a6b
Author: Luke Tierney <luke@stat.uiowa.edu>
Date: Wed Jan 21 07:15:18 2009 -0600
Do summary instead of histogram
commit 7425153349ff276d89731ac8092597e7dc67520d
Author: Luke Tierney <luke@stat.uiowa.edu>
Date: Wed Jan 21 07:07:22 2009 -0600
Initial code for plotting histogram
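The effect of branching from an earlier commit can be checked by counting commits on each branch. The sketch below rebuilds the three-commit history in a scratch repository (directory, placeholder identity, and variable names are assumptions) and looks up the default branch name rather than assuming it is master, since newer Git installations may be configured to use a different name.

```shell
# Scratch project with the three commits from the text.
project=$(mktemp -d)
cd "$project"
git init -q
git config user.name "Example User"
git config user.email user@example.com
printf 'x <- rnorm(100)\nhist(x)\n' > code.R
git add code.R
git commit -q -m "Initial code for plotting histogram"
printf 'x <- rnorm(100)\nsummary(x)\n' > code.R
git commit -q -a -m "Do summary instead of histogram"
git revert --no-edit HEAD

main_branch=$(git symbolic-ref --short HEAD)   # name of the default branch
second=$(git rev-parse --short HEAD~1)         # the "Do summary" commit

# Create and switch to a branch that starts at the second commit.
git checkout -q -b test "$second"

current=$(git symbolic-ref --short HEAD)
on_test=$(git rev-list --count HEAD)             # commits visible from test
on_main=$(git rev-list --count "$main_branch")   # commits on the original branch
echo "on $current: $on_test commits; on $main_branch: $on_main commits"
```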
Now that you are on this branch you can do some work. Create a text file doc.txt
containing
This file documents the code for this project
After creating the doc.txt
file git status
gives
luke@nokomis ~/myproject% git status
# On branch test
# Untracked files:
# (use "git add <file>..." to include in what will be committed)
#
# doc.txt
nothing added to commit but untracked files present (use "git add" to track)
You can add the file using git add
and then commit it to the project using git commit
as before:
luke@nokomis ~/myproject% git add doc.txt
luke@nokomis ~/myproject% git commit -m "Add documentation file"
Created commit e9dd8b1: Add documentation file
1 files changed, 1 insertions(+), 0 deletions(-)
create mode 100644 doc.txt
Running git log now gives
luke@nokomis ~/myproject% git log
commit e9dd8b1dd106e514088e53c1ae922fe59bd995b2
Author: Luke Tierney <luke@stat.uiowa.edu>
Date: Wed Jan 21 07:53:26 2009 -0600
Add documentation file
commit 6d8dafe72a198ed63d11be8592c39bcd14179a6b
Author: Luke Tierney <luke@stat.uiowa.edu>
Date: Wed Jan 21 07:15:18 2009 -0600
Do summary instead of histogram
commit 7425153349ff276d89731ac8092597e7dc67520d
Author: Luke Tierney <luke@stat.uiowa.edu>
Date: Wed Jan 21 07:07:22 2009 -0600
Initial code for plotting histogram
But wait! What about the history that you have on the “master” branch? It’s still there, but you cannot see it while you are on the “test” branch. To switch back to the “master” branch you can call git checkout
.
luke@nokomis ~/myproject% git checkout master
Switched to branch "master"
and run git log
again.
A useful tool for viewing the branch structure of a Git repository is gitk
. Running gitk
with the --all
switch indicates that gitk
should show all branches.
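When no graphical display is available, git log with the --graph, --oneline, and --all options draws a similar picture in the terminal. This is a sketch on a small scratch repository with two diverging branches; the directory, identity, and commit messages are assumptions.

```shell
# Scratch project with one commit on the default branch.
project=$(mktemp -d)
cd "$project"
git init -q
git config user.name "Example User"
git config user.email user@example.com
echo "x <- rnorm(100)" > code.R
git add code.R
git commit -q -m "Initial code"
main_branch=$(git symbolic-ref --short HEAD)

# One commit on a side branch.
git checkout -q -b test
echo "This file documents the code for this project" > doc.txt
git add doc.txt
git commit -q -m "Add documentation file"

# One more commit on the original branch, so the two branches diverge.
git checkout -q "$main_branch"
echo "hist(x)" >> code.R
git commit -q -a -m "Plot a histogram"

git log --graph --oneline --all            # text rendering of all branches

all_commits=$(git rev-list --all --count)  # commits reachable from any branch
here_commits=$(git rev-list --count HEAD)  # commits on the current branch only
```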
Suppose you feel the work done on the test
branch is in fact useful and you want to merge that into your master
branch. You can call git merge
to merge two branches together. You need to do this from a checkout of the master
branch; you can check the branch you are on with git branch
:
luke@nokomis ~/myproject% git branch
* master
test
The asterisk indicates you are on the master
branch, so you can do the merge:
luke@nokomis ~/myproject% git merge test
Merge made by recursive.
doc.txt | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
create mode 100644 doc.txt
Running git log
gives us the history of the two branches merged together with a merge commit
.
luke@nokomis ~/myproject% git log
commit 8d4705f0a90d710c8f5361114ae83273724ad8bd
Merge: bd21cc0... e9dd8b1...
Author: Luke Tierney <luke@stat.uiowa.edu>
Date: Wed Jan 21 08:06:57 2009 -0600
Merge branch 'test'
commit e9dd8b1dd106e514088e53c1ae922fe59bd995b2
Author: Luke Tierney <luke@stat.uiowa.edu>
Date: Wed Jan 21 07:53:26 2009 -0600
Add documentation file
commit bd21cc08168a38a86931416b701b3c248d0c36f7
Author: Luke Tierney <luke@stat.uiowa.edu>
Date: Wed Jan 21 07:21:34 2009 -0600
Revert "Do summary instead of histogram"
This reverts commit 6d8dafe72a198ed63d11be8592c39bcd14179a6b.
commit 6d8dafe72a198ed63d11be8592c39bcd14179a6b
Author: Luke Tierney <luke@stat.uiowa.edu>
Date: Wed Jan 21 07:15:18 2009 -0600
Do summary instead of histogram
commit 7425153349ff276d89731ac8092597e7dc67520d
Author: Luke Tierney <luke@stat.uiowa.edu>
Date: Wed Jan 21 07:07:22 2009 -0600
Initial code for plotting histogram
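A merge of two diverged branches can be tried out in a scratch repository. In this sketch the --no-edit option accepts the default merge message so that no editor opens; the directory, identity, and commit messages are assumptions.

```shell
# Scratch project with one commit on the default branch.
project=$(mktemp -d)
cd "$project"
git init -q
git config user.name "Example User"
git config user.email user@example.com
echo "x <- rnorm(100)" > code.R
git add code.R
git commit -q -m "Initial code"
main_branch=$(git symbolic-ref --short HEAD)

# Work on the side branch.
git checkout -q -b test
echo "This file documents the code for this project" > doc.txt
git add doc.txt
git commit -q -m "Add documentation file"

# Separate work on the original branch, so a merge commit is needed.
git checkout -q "$main_branch"
echo "hist(x)" >> code.R
git commit -q -a -m "Plot a histogram"

git merge -q --no-edit test

merges=$(git rev-list --count --merges HEAD)   # number of merge commits
git log --graph --oneline
```

After the merge, doc.txt from the test branch is present alongside the updated code.R.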
Because we merged the test
branch into the master
branch we no longer need it. A branch can be deleted with the -d
switch to git branch.
luke@nokomis ~/myproject% git branch -d test
Deleted branch test.
luke@nokomis ~/myproject% git branch
* master
Be careful when deleting branches. The -d switch refuses to delete a branch that has not been merged; if you force the deletion with the -D switch instead, you will lose the history associated with that branch.
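The safety check built into git branch -d can be seen in a scratch repository: deleting an unmerged branch is refused, while the capital -D switch forces it. The directory, identity, and variable names here are assumptions of the sketch.

```shell
# Scratch project with an unmerged side branch.
project=$(mktemp -d)
cd "$project"
git init -q
git config user.name "Example User"
git config user.email user@example.com
echo "x <- rnorm(100)" > code.R
git add code.R
git commit -q -m "Initial code"
main_branch=$(git symbolic-ref --short HEAD)

git checkout -q -b test
echo "This file documents the code for this project" > doc.txt
git add doc.txt
git commit -q -m "Add documentation file"
git checkout -q "$main_branch"

# -d refuses because the commit on test has not been merged.
if git branch -d test 2>/dev/null; then
  safe_delete="deleted"
else
  safe_delete="refused"
fi
echo "git branch -d: $safe_delete"

# -D deletes the branch anyway; its commit is no longer on any branch.
git branch -D test
remaining=$(git branch --list test)   # empty once the branch is gone
```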
This introduction is adapted from Roger Peng’s Bare Bones Git Intro for Biostat 776.