QUICK AND DIRTY

If you just want to set up git to use calliope's development branch, and don't want to think about it too much, do this:

  1. Download and install (if you need to) zlib-devel, openssl-devel, and possibly the curl library.
  2. Download the latest "git-core" tarball from this directory, unpack it in /usr/local, then run "make" and "make install HOME=/usr" in the unpacked directory.
  3. Download the latest "cogito" tarball from this directory, unpack it in /usr/local, then run "make" and "make install HOME=/usr" in the unpacked directory.
  4. cd /usr/local
  5. cg-clone http://neil.verplank.org/opensource/calliope/calliope-git/
  6. ln -s /usr/local/calliope-git/ /usr/local/calliope
  7. mkdir /usr/local/calliope/var
  8. mkdir /usr/local/calliope/var/log (you may also want to create var/burn and var/log/burn with correct permissions)
  9. Edit the calliope/etc/conf file as appropriate (you may want to create /etc/calliope, edit Util/Conf.pm, and make this the default conf file; then changes to the development conf file in calliope/etc/conf won't overwrite your conf file).
  10. cd installer
  11. ./install

This should create /usr/local/calliope-git, with the current calliope development branch. Read the READ_ME and the INSTALL files, edit the conf file, run the installer, and away you go!

To contribute, clone my branch. The first time, you'll need to create var and var/log and, if you want to burn, var/burn and var/log/burn (you only have to do this once). Permissions and ownership on all of these are important (see INSTALL and the conf file itself). You probably want to set up a second calliope conf file as /etc/calliope, and alter Util/Conf.pm to make this your default conf file (so that new conf changes from me go into calliope/etc/conf, and you merge those changes into your "real" /etc/calliope conf file yourself).
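
The one-time directory setup described above can be sketched as a small shell function. This is only an illustration: the name make_calliope_dirs is made up for this howto, and the sketch deliberately leaves ownership and permissions alone, since the correct values are the ones given in INSTALL and the conf file.

```shell
# Hypothetical helper: create the var tree inside a calliope checkout.
# $1 is the checkout directory, for example /usr/local/calliope
make_calliope_dirs() {
    # var/log is always needed; var/burn and var/log/burn only if you burn.
    mkdir -p "$1/var/log" "$1/var/burn" "$1/var/log/burn"
}
```

You would call it once, as in "make_calliope_dirs /usr/local/calliope", and then fix up ownership and permissions by hand.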

Once you have a working calliope installation, you then make whatever changes you wish within your calliope-git folder. Use cg-commit to commit the changes, and cg-add to add new files to your repository (don't add the var directory structure). To post your branch, you copy the contents of the .git folder to a web accessible directory, and tell me where to find it. All I have to do is add your branch, then cg-pull to get your changes.

 

ABOUT GIT / COGITO

Git is an SCM (Software Configuration Manager), a set of tools used to manage the software development process across multiple developers, and generally used for version control. In general, SCMs can be divided into centralized and distributed; CVS and now Subversion are popular centralized version control systems. Git is a distributed manager, and requires no central server.

Git was developed by Linus Torvalds as a tool for managing kernel development, but the toolset has since attracted more widespread interest. Git and Cogito are still in early stages of development; while they do work, and are being used, they can and do break. If you are looking for tried-and-true version management for widescale deployment, may we humbly suggest Subversion.

To be quite technical, git is probably more accurately described as a specialized file system. Cogito is a wrapper for git. In other words, git does the behind the scenes work (often referred to as "the plumbing" in various git documentation), and cogito provides the actual commands you would use for software version management. So while cogito and git are separate, you need them both to form a useful whole.

This distinction can be a little confusing, because it's possible to *merely* use git, which is what Linus does, but for the rest of us mortals, the more generalized Cogito is the way to go.

Note that there are a number of related links at the very end of this document that you may also find useful.

 

GETTING GIT / COGITO

You can download the current version of Cogito and Git-core from kernel.org:

http://www.kernel.org/pub/software/scm/

You need the git-core tarball, found as git-core-0.99.xx, which you will find in the above scm directory.

tar -zxvf git-core-0.99.x.tar.gz
cd git-core-0.99.x
make
make install HOME=/usr

Note that if you don't specify HOME=/usr, git and cogito put themselves in /root/bin, which may or may not make them available to you on the command line.
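
A quick way to tell whether an install directory is reachable from the command line is to look for it in PATH. The function below is a sketch written for this howto (the name dir_on_path is invented); it only inspects the PATH variable and does not check that the executables are actually there.

```shell
# Succeeds if directory $1 is listed in PATH, fails otherwise.
dir_on_path() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}
```

For example, "dir_on_path /root/bin || echo '/root/bin is not in PATH'" tells you whether a default install would be usable without typing full paths.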

Also download the latest cogito tarball someplace you like to put software (perhaps /usr/local/ ?), expand it and compile it:

tar -zxvf cogito-x.x.tar.gz
cd cogito-x.x
make
make install HOME=/usr

which will put the cogito executables in the /usr/bin directory. Cogito does not have many dependencies, but you may need openssl-devel, zlib-devel, as well as the curl library:

http://curl.haxx.se/download.html

 

REPOSITORIES

There are two ways to create a git repository. The first is to clone someone else's, and the second is to create your own if you're working on a brand new project. In either case, a folder will be created that contains the code for your branch, and an "invisible" .git folder. The .git folder is git's "database," where it tracks changes in your code, as well as (if you ask it to) various additional branches from other developers.

If you follow the instructions in "keeping cogito up to date," you'll create a repository for the cogito/git codebase. Alternatively, skip to "starting your own project" to create your own code repository and make it available to other developers.

Of course, you can use cogito to manage any number of projects, including cogito itself. For each project, you'll have a folder with the codebase, and inside that folder, a ".git" folder. All of cogito's commands are dependent on your location in the filesystem, i.e. upon which project you're currently working.

Some of the available documentation doesn't make this clear, so it's worth re-iterating - every project that you manage with Cogito will have its own repository. Thus, it's very important to keep track of precisely which directory you're in when you issue commands in this howto, and of course while working on your own project.
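
If you lose track of which project a directory belongs to, you can look for the nearest ".git" folder yourself. The helper below is a sketch for this howto, not part of cogito or git; the name find_repo_root is made up, and it simply walks up the directory tree until it finds a ".git" folder.

```shell
# Print the nearest directory at or above $1 (default: current directory)
# that contains a ".git" folder; fail if there is none.
find_repo_root() {
    dir=$(cd "${1:-.}" && pwd) || return 1
    while :; do
        if [ -d "$dir/.git" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
        # Stop once the filesystem root has been checked.
        [ "$dir" = "/" ] && return 1
        dir=$(dirname "$dir")
    done
}
```

Running "find_repo_root" before a cg- command is a cheap sanity check that you are inside the project you think you are in.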

 

A NOTE ON BRANCHES

The raison d'etre of Git is branches. Centralized repositories, such as CVS and Subversion, serve to integrate the work of many people into a single codebase. Distributed SCMs often struggle with the problem of how to provide a "central" or "core" codebase across a distributed development network. Git actually embraces these differences.

When many people use git, each person will have their own unique branch. If everyone continuously pulls and merges changes from other people's branches, everyone will have the same codebase. But Git actually facilitates the developer's ability to work on their particular change set, compare their work with others', and merge the work from many developers into their own current branch. Ultimately, putting out a new revision of a given codebase will require someone to merge changes from various branches, and post this "master" version as a new revision. While this may at first seem like extra work, it in fact provides a way for many developers to work in different ways on the same problem, compare and contrast their work, and gives the project manager the ability to draw on the best work from many people in order to build the most current and stable version of a given product.

 

KEEPING COGITO UP TO DATE

Being very new software, cogito and the underlying git codebase change frequently. One nice thing about Cogito is that you can use it on itself to keep it up to date.

After first installing Cogito, execute the following command. Note that this will create a new repository that is a clone of the remote repository within the directory you are currently in. If you've been following these instructions, you're probably in /usr/local/cogito-x.x, and the repository will be inside that directory. You may also wish to put the repository in /usr/local, and simply delete the original codebase (since you'll now be using cogito to keep itself up to date).

[cd to some appropriate directory as desired]
cg-clone rsync://rsync.kernel.org/pub/scm/cogito/cogito.git

This will set up a cogito / git repository of the cogito codebase in a subdirectory of your current directory, named either cogito.git, or simply cogito. Now you can add Linus' git repository if you like (not a bad idea):

cd cogito.git (or possibly just "cd cogito")
cg-branch-add linus_git rsync://rsync.kernel.org/pub/scm/linux/kernel/git/torvalds/git.git
cg-pull linus_git

If you've followed these directions explicitly, you'll have the original source code in /usr/local/cogito-x.x, and the git repository inside of that, /usr/local/cogito-x.x/cogito. To keep cogito up to date, change into the repository directory if you're not there already, and issue the following commands:

[ cd /usr/local/cogito-x.x/cogito ]
cg-update
make
make install HOME=/usr

 

USING COGITO TO WORK ON AN EXISTING PROJECT

If you're getting involved in someone else's project, you'll want to start with their code. Links to people's codebases may come in various forms, including rsync, http, and ftp, depending on how they've chosen to provide access. Your branch will ultimately need to be accessible by one of these three methods, but need not be the same as other people's branches. In other words, Linus may offer an http: branch, Pasky might use rsync:, and you might choose ftp:.

If you already set up the cogito repository above ("keeping cogito up to date"), then you've already set up a code repository using Cogito. The process is the same for any software development project. Start by creating a folder that will serve as your code repository, clone a remote branch, and you're off and running.

mkdir projectFoobar
cd projectFoobar
cg-clone <link to remote branch>

This will create your initial repository. It's important to note that as there may be many branches of a given project, whichever branch you initially clone will be your "origin". If you know this origin is the most current and up to date branch, you can now begin development on the code in the repository.

You can now make changes in your repository by editing the code. Once you're satisfied, you check in any changes with:

cg-commit

If you add new files, tell cogito about them with:

cg-add
cg-commit

You can also add additional branches from other developers:

cg-branch-add <name> <link to remote branch>
cg-pull <name>

Visually, you'll still have a single codebase representing your origin's branch (assuming you haven't changed it :-). The additional branches are stored using git's magic as "objects" inside the .git/objects folder. Note that <name> is the name you give that branch; what they call their branch may differ. In other words, <name> is a mnemonic for your convenience, and not defined by their actual branch.

At any later point, you can then get recent changes from <name> with:

cg-pull <name>

Note that this brings in changes to that branch but does not commit them to the branch. In other words, if there have been changes, you will now have two distinct versions of that branch. You can see the changes by:

cg-diff
cg-diff -r origin: [ > outputfilename.diff ]

You can accept the changes with:

cg-merge
cg-merge -b origin

You can also combine the above three steps, and simply update a branch to its current remote state with:

cg-update

In order to see the changes relative to the original branch that you cloned, you would:

cg-pull origin
cg-diff origin
cg-merge origin

Or simply:

cg-update origin

It's important to note that at this point, "origin" is a branch. Issuing the "cg-update origin" command will update the origin branch to its current state, relative to the state it was cloned in (or last updated to). Your branch is still your branch, and may now differ from its origin. To see those differences:

cg-diff

 

GETTING HELP

Print out a list of current commands using:

cg-help

To learn more about a specific command, use:

cg-help <command>

For instance, to learn more about 'cg-merge', type:

cg-help merge

You can omit the 'cg-' when getting help on a cg command.

 

STARTING YOUR OWN PROJECT

To start your very own project using cogito, simply change into your project directory (or create it), and issue the command:

[ mkdir foobar ]
cd foobar
cg-init

Now, create or copy files into your project directory, and add them using:

cg-add

Finally, commit these changes using:

cg-commit

 

SHARING YOUR WORK

To share your repository with other developers, make your project's .git folder available via HTTP. For instance:

ln -s /usr/local/<project>/.git /home/httpd/html/git

You can provide git via http, rsync, or ftp; which you use is up to you. Every developer can make a different choice. Git doesn't care, so long as you provide the appropriate link to anyone who wants to pull from your branch.

 

ADVANCED

Ok, this is where there would be a thoroughgoing explanation of all of cogito's commands, as well as a discussion of how cogito actually manages revisions. Definitely something on trees and branches and the distinction, and finally, a discussion of why git does not (and does not need to) track file renames, but instead allows code to be tracked through time, even as it moves from file to file.

Instead, I present this excellent tutorial from the author of Git:

 

A GIT TUTORIAL FROM LINUS

As mentioned above, this document really deals with cogito, an SCM wrapper around git. Git is the plumbing that provides Cogito its functionality. It's almost certain that other SCMs will be developed around git (and there are already SCMs that are now wrapping themselves around git).

What follows is a tutorial on using git, the plumbing, directly. Using cogito makes project management cleaner, and more "palatable" in Linus' words, but the following will be of use to anyone wishing to see under the hood, and may prove useful to anyone using Cogito.

A short git tutorial
====================
May 2005

Introduction
------------

This is trying to be a short tutorial on setting up and using a git archive, mainly because being hands-on and using explicit examples is often the best way of explaining what is going on.

In normal life, most people wouldn't use the "core" git programs directly, but rather script around them to make them more palatable. Understanding the core git stuff may help some people get those scripts done, though, and it may also be instructive in helping people understand what it is that the higher-level helper scripts are actually doing.

The core git is often called "plumbing", with the prettier user interfaces on top of it called "porcelain". You may want to know what the plumbing does for when the porcelain isn't flushing...

Creating a git archive
----------------------

Creating a new git archive couldn't be easier: all git archives start out empty, and the only thing you need to do is find yourself a subdirectory that you want to use as a working tree - either an empty one for a totally new project, or an existing working tree that you want to import into git.

For our first example, we're going to start a totally new archive from scratch, with no pre-existing files, and we'll call it "git-tutorial". To start up, create a subdirectory for it, change into that subdirectory, and initialize the git infrastructure with "git-init-db":

mkdir git-tutorial
cd git-tutorial
git-init-db

to which git will reply

defaulting to local storage area

which is just git's way of saying that you haven't been doing anything strange, and that it will have created a local .git directory setup for your new project. You will now have a ".git" directory, and you can inspect that with "ls". For your new empty project, ls should show you three entries:

- a symlink called HEAD, pointing to "refs/heads/master"

Don't worry about the fact that the file that the HEAD link points to doesn't even exist yet - you haven't created the commit that will start your HEAD development branch yet.

- a subdirectory called "objects", which will contain all the git SHA1 objects of your project. You should never have any real reason to look at the objects directly, but you might want to know that these objects are what contains all the real _data_ in your repository.

- a subdirectory called "refs", which contains references to objects.

In particular, the "refs" subdirectory will contain two other subdirectories, named "heads" and "tags" respectively. They do exactly what their names imply: they contain references to any number of different "heads" of development (aka "branches"), and to any "tags" that you have created to name specific versions of your repository.

One note: the special "master" head is the default branch, which is why the .git/HEAD file was created as a symlink to it even if it doesn't yet exist. Basically, the HEAD link is supposed to always point to the branch you are working on right now, and you always start out expecting to work on the "master" branch.

However, this is only a convention, and you can name your branches anything you want, and don't have to ever even _have_ a "master" branch. A number of the git tools will assume that .git/HEAD is valid, though.
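
The layout just described can be mimicked with ordinary shell commands, which makes the dangling HEAD link easy to see. This sketch builds a mock directory by hand purely for illustration; it is not what git-init-db runs internally, and the directory name mock.git is made up.

```shell
# Build a mock of the freshly initialized layout described above.
mock=$(mktemp -d)/mock.git
mkdir -p "$mock/objects" "$mock/refs/heads" "$mock/refs/tags"
ln -s refs/heads/master "$mock/HEAD"

ls "$mock"              # lists the three entries: HEAD, objects, refs
readlink "$mock/HEAD"   # prints refs/heads/master

# HEAD is a symlink (-L) whose target does not exist yet (! -e).
if [ -L "$mock/HEAD" ] && [ ! -e "$mock/HEAD" ]; then
    echo "HEAD points at a branch with no commit yet"
fi
```

Once a first commit writes refs/heads/master, the same "-e" test would start to succeed.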

[ Implementation note: an "object" is identified by its 160-bit SHA1 hash, aka "name", and a reference to an object is always the 40-byte hex representation of that SHA1 name. The files in the "refs" subdirectory are expected to contain these hex references (usually with a final '\n' at the end), and you should thus expect to see a number of 41-byte files containing these references in the refs subdirectories when you actually start populating your tree ]
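
The arithmetic in that note is easy to check by hand. The sketch below assumes the sha1sum utility is installed (it ships with GNU coreutils; other systems may offer shasum or openssl instead) and hashes empty input only to obtain a sample name; the temporary file stands in for a file under refs.

```shell
# SHA1 of empty input, used only to obtain a sample 160-bit name.
name=$(printf '' | sha1sum | cut -d' ' -f1)
echo "${#name}"                 # 40 hex digits: 160 bits at 4 bits per digit

# A ref file holds that name plus a trailing newline: 41 bytes.
ref=$(mktemp)
printf '%s\n' "$name" > "$ref"
wc -c < "$ref"                  # 41
```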

You have now created your first git archive. Of course, since it's empty, that's not very useful, so let's start populating it with data.

Populating a git archive
------------------------

We'll keep this simple and stupid, so we'll start off with populating a few trivial files just to get a feel for it.

Start off with just creating any random files that you want to maintain in your git archive. We'll start off with a few bad examples, just to get a feel for how this works:

echo "Hello World" > a
echo "Silly example" > b

you have now created two files in your working directory, but to actually check in your hard work, you will have to go through two steps:

- fill in the "cache" aka "index" file with the information about your working directory state

- commit that index file as an object.

The first step is trivial: when you want to tell git about any changes to your working directory, you use the "git-update-cache" program. That program normally just takes a list of filenames you want to update, but to avoid trivial mistakes, it refuses to add new entries to the cache (or remove existing ones) unless you explicitly tell it that you're adding a new entry with the "--add" flag (or removing an entry with the "--remove" flag).

So to populate the index with the two files you just created, you can do

git-update-cache --add a b

and you have now told git to track those two files.

In fact, as you did that, if you now look into your object directory, you'll notice that git will have added two new objects to the object store. If you did exactly the steps above, you should now be able to do

ls .git/objects/??/*

and see two files:

.git/objects/55/7db03de997c86a4a028e1ebd3a1ceb225be238
.git/objects/f2/4c74a2e500f5ee1332c86b94199f52b1d1d962

which correspond to the objects with SHA1 names of 557db... and f24c7... respectively.
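
The relationship between an object name and its file is mechanical: the first two hex digits become the directory and the remaining 38 the filename. A small sketch of that mapping, using a function name (object_path) invented for this example:

```shell
# Map a 40-character object name to its path under .git/objects:
# the first two hex digits name the directory, the other 38 the file.
object_path() {
    dir=$(printf '%s' "$1" | cut -c1-2)
    file=$(printf '%s' "$1" | cut -c3-)
    printf '.git/objects/%s/%s\n' "$dir" "$file"
}

object_path 557db03de997c86a4a028e1ebd3a1ceb225be238
# .git/objects/55/7db03de997c86a4a028e1ebd3a1ceb225be238
```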

If you want to, you can use "git-cat-file" to look at those objects, but you'll have to use the object name, not the filename of the object:

git-cat-file -t 557db03de997c86a4a028e1ebd3a1ceb225be238

where the "-t" tells git-cat-file to tell you what the "type" of the object is. Git will tell you that you have a "blob" object (ie just a regular file), and you can see the contents with

git-cat-file "blob" 557db03de997c86a4a028e1ebd3a1ceb225be238

which will print out "Hello World". The object 557db... is nothing more than the contents of your file "a".

[ Digression: don't confuse that object with the file "a" itself. The object is literally just those specific _contents_ of the file, and however much you later change the contents in file "a", the object we just looked at will never change. Objects are immutable. ]
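
The object name depends only on the contents, which you can verify without git. In the object format git uses today, a blob's name is the SHA1 of a short header ("blob", a space, the size in bytes, and a NUL byte) followed by the file contents, and that rule reproduces the two names shown above. The sketch below assumes sha1sum is installed (GNU coreutils); blob_name is a name made up for this example.

```shell
# Compute the blob name for the contents of file $1 without using git.
blob_name() {
    size=$(wc -c < "$1" | tr -d ' ')
    # Header "blob <size>" plus a NUL byte, then the raw file contents.
    { printf 'blob %s\0' "$size"; cat "$1"; } | sha1sum | cut -d' ' -f1
}

tmp=$(mktemp -d)
echo "Hello World" > "$tmp/a"
echo "Silly example" > "$tmp/b"
blob_name "$tmp/a"   # 557db03de997c86a4a028e1ebd3a1ceb225be238
blob_name "$tmp/b"   # f24c74a2e500f5ee1332c86b94199f52b1d1d962
```

Appending a line to "a" and running blob_name again gives a different name, while the old object stays untouched, which is the immutability the digression describes.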

Anyway, as we mentioned previously, you normally never actually take a look at the objects themselves, and typing long 40-character hex SHA1 names is not something you'd normally want to do. The above digression was just to show that "git-update-cache" did something magical, and actually saved away the contents of your files into the git content store.

Updating the cache did something else too: it created a ".git/index" file. This is the index that describes your current working tree, and something you should be very aware of. Again, you normally never worry about the index file itself, but you should be aware of the fact that you have not actually really "checked in" your files into git so far, you've only _told_ git about them.

However, since git knows about them, you can now start using some of the most basic git commands to manipulate the files or look at their status.

In particular, let's not even check in the two files into git yet, we'll start off by adding another line to "a" first:

echo "It's a new day for git" >> a

and you can now, since you told git about the previous state of "a", ask git what has changed in the tree compared to your old index, using the "git-diff-files" command:

git-diff-files

oops. That wasn't very readable. It just spit out its own internal version of a "diff", but that internal version really just tells you that it has noticed that "a" has been modified, and that the old object contents it had have been replaced with something else.

To make it readable, we can tell git-diff-files to output the differences as a patch, using the "-p" flag:

git-diff-files -p

which will spit out

diff --git a/a b/a
--- a/a
+++ b/a
@@ -1 +1,2 @@
 Hello World
+It's a new day for git

ie the diff of the change we caused by adding another line to "a".

In other words, git-diff-files always shows us the difference between what is recorded in the index, and what is currently in the working tree. That's very useful.

Committing git state
--------------------

Now, we want to go to the next stage in git, which is to take the files that git knows about in the index, and commit them as a real tree. We do that in two phases: creating a "tree" object, and committing that "tree" object as a "commit" object together with an explanation of what the tree was all about, along with information of how we came to that state.

Creating a tree object is trivial, and is done with "git-write-tree". There are no options or other input: git-write-tree will take the current index state, and write an object that describes that whole index. In other words, we're now tying together all the different filenames with their contents (and their permissions), and we're creating the equivalent of a git "directory" object:

git-write-tree

and this will just output the name of the resulting tree, in this case (if you have done exactly as I've described) it should be

3ede4ed7e895432c0a247f09d71a76db53bd0fa4

which is another incomprehensible object name. Again, if you want to, you can use "git-cat-file -t 3ede4.." to see that this time the object is not a "blob" object, but a "tree" object (you can also use git-cat-file to actually output the raw object contents, but you'll see mainly a binary mess, so that's less interesting).

However - normally you'd never use "git-write-tree" on its own, because normally you always commit a tree into a commit object using the "git-commit-tree" command. In fact, it's easier to not actually use git-write-tree on its own at all, but to just pass its result in as an argument to "git-commit-tree".

"git-commit-tree" normally takes several arguments - it wants to know what the _parent_ of a commit was, but since this is the first commit ever in this new archive, and it has no parents, we only need to pass in the tree ID. However, git-commit-tree also wants to get a commit message on its standard input, and it will write out the resulting ID for the commit to its standard output.

And this is where we start using the .git/HEAD file. The HEAD file is supposed to contain the reference to the top-of-tree, and since that's exactly what git-commit-tree spits out, we can do this all with a simple shell pipeline:

echo "Initial commit" | git-commit-tree $(git-write-tree) > .git/HEAD

which will say:

Committing initial tree 3ede4ed7e895432c0a247f09d71a76db53bd0fa4

just to warn you about the fact that it created a totally new commit that is not related to anything else. Normally you do this only _once_ for a project ever, and all later commits will be parented on top of an earlier commit, and you'll never see this "Committing initial tree" message ever again.

Making a change
---------------

Remember how we did the "git-update-cache" on file "a" and then we changed "a" afterwards, and could compare the new state of "a" with the state we saved in the index file?

Further, remember how I said that "git-write-tree" writes the contents of the _index_ file to the tree, and thus what we just committed was in fact the _original_ contents of the file "a", not the new ones. We did that on purpose, to show the difference between the index state, and the state in the working directory, and how they don't have to match, even when we commit things.

As before, if we do "git-diff-files -p" in our git-tutorial project, we'll still see the same difference we saw last time: the index file hasn't changed by the act of committing anything. However, now that we have committed something, we can also learn to use a new command: "git-diff-cache".

Unlike "git-diff-files", which showed the difference between the index file and the working directory, "git-diff-cache" shows the differences between a committed _tree_ and the index file. In other words, git-diff-cache wants a tree to be diffed against, and before we did the commit, we couldn't do that, because we didn't have anything to diff against.

But now we can do

git-diff-cache -p HEAD

(where "-p" has the same meaning as it did in git-diff-files), and it will show us the same difference, but for a totally different reason. Now we're not comparing against the index file, we're comparing against the tree we just wrote. It just so happens that those two are obviously the same.

"git-diff-cache" also has a specific flag "--cached", which is used to tell it to show the differences purely with the index file, and ignore the current working directory state entirely. Since we just wrote the index file to HEAD, doing "git-diff-cache --cached -p HEAD" should thus return an empty set of differences, and that's exactly what it does.

However, our next step is to commit the _change_ we did, and again, to understand what's going on, keep in mind the difference between "working directory contents", "index file" and "committed tree". We have changes in the working directory that we want to commit, and we always have to work through the index file, so the first thing we need to do is to update the index cache:

git-update-cache a

(note how we didn't need the "--add" flag this time, since git knew about the file already).

Note what happens to the different git-diff-xxx versions here. After we've updated "a" in the index, "git-diff-files -p" now shows no differences, but "git-diff-cache -p HEAD" still _does_ show that the current state is different from the state we committed. In fact, now "git-diff-cache" shows the same difference whether we use the "--cached" flag or not, since now the index is coherent with the working directory.
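
To keep the three states apart, it can help to model them with three plain directories. The sketch below is a toy model only: git does not store anything this way, the directory names and the compare function are invented for this example, and each comparison merely mirrors what the corresponding git-diff-xxx command looks at for file "a".

```shell
# Toy model: three plain directories stand in for git's three states.
model=$(mktemp -d)
mkdir "$model/committed" "$model/index" "$model/work"

# State right after the initial commit: "a" was edited after being indexed.
echo "Hello World" > "$model/committed/a"
cp "$model/committed/a" "$model/index/a"
cp "$model/committed/a" "$model/work/a"
echo "It's a new day for git" >> "$model/work/a"

# Report "same" or "differs" for file "a" in two of the states.
compare() {
    if cmp -s "$model/$1/a" "$model/$2/a"; then echo same; else echo differs; fi
}

# Order: git-diff-files, git-diff-cache --cached HEAD, git-diff-cache HEAD
before="$(compare index work) $(compare committed index) $(compare committed work)"

# Stand-in for "git-update-cache a": copy the working file into the index.
cp "$model/work/a" "$model/index/a"
after="$(compare index work) $(compare committed index) $(compare committed work)"

echo "before update: $before"   # differs same differs
echo "after update:  $after"    # same differs differs
```

After the update, the last two comparisons agree, which matches the observation that "git-diff-cache" shows the same difference with or without "--cached".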

Now, since we've updated "a" in the index, we can commit the new version. We could do it by writing the tree by hand, and committing the tree (this time we'd have to use the "-p HEAD" flag to tell commit that the HEAD was the _parent_ of the new commit, and that this wasn't an initial commit any more), but the fact is, git has a simple helper script for doing all of the non-initial commits that does all of this for you, and starts up an editor to let you write your commit message yourself, so let's just use that:

git-commit-script

Write whatever message you want, and all the lines that start with '#' will be pruned out, and the rest will be used as the commit message for the change. If you decide you don't want to commit anything after all at this point (you can continue to edit things and update the cache), you can just leave an empty message. Otherwise git-commit-script will commit the change for you.
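
The pruning rule is simple enough to sketch in a few lines. The code below is not taken from git-commit-script; it is a hypothetical illustration of the behaviour described above, with an invented function name (strip_comments) and a temporary file standing in for the message you typed in the editor.

```shell
# Drop every line that starts with '#'; what is left is the message.
strip_comments() {
    # grep exits non-zero when nothing is left, so mask that status.
    grep -v '^#' "$1" || true
}

draft=$(mktemp)
cat > "$draft" <<'EOF'
# Lines like this one are hints for the author and are pruned.
Add a second line to a
EOF

message=$(strip_comments "$draft")
if [ -z "$message" ]; then
    echo "empty message, nothing committed"
else
    echo "commit message: $message"   # commit message: Add a second line to a
fi
```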

(Btw, current versions of git will consider the change in question to be so big that it's considered a whole new file, since the diff is actually bigger than the file. So the helpful comments that git-commit-script tells you for this example will say that you deleted and re-created the file "a". For a less contrived example, these things are usually more obvious).

You've now made your first real git commit. And if you're interested in looking at what git-commit-script really does, feel free to investigate: it's a few very simple shell scripts to generate the helpful (?) commit message headers, and a few one-liners that actually do the commit itself.

Checking it out
---------------

While creating changes is useful, it's even more useful if you can tell later what changed. The most useful command for this is another of the "diff" family, namely "git-diff-tree".

git-diff-tree can be given two arbitrary trees, and it will tell you the differences between them. Perhaps even more commonly, though, you can give it just a single commit object, and it will figure out the parent of that commit itself, and show the difference directly. Thus, to get the same diff that we've already seen several times, we can now do

git-diff-tree -p HEAD

(again, "-p" means to show the difference as a human-readable patch), and it will show what the last commit (in HEAD) actually changed.

More interestingly, you can also give git-diff-tree the "-v" flag, which tells it to also show the commit message and author and date of the commit, and you can tell it to show a whole series of diffs. Alternatively, you can tell it to be "silent", and not show the diffs at all, but just show the actual commit message.

In fact, together with the "git-rev-list" program (which generates a list of revisions), git-diff-tree ends up being a veritable fount of changes. A trivial (but very useful) script called "git-whatchanged" is included with git which does exactly this, and shows a log of recent activity.

To see the whole history of our pitiful little git-tutorial project, we can do

git-whatchanged -p --root HEAD

(the "--root" flag is a flag to git-diff-tree to tell it to show the initial aka "root" commit as a diff too), and you will see exactly what has changed in the repository over its short history.

With that, you should now be having some inkling of what git does, and can explore on your own.

[ to be continued.. cvs2git, tagging versions, branches, merging.. ]

 

RELATED LINKS

The Cogito README has more detailed instructions on cogito's various commands:

http://www.kernel.org/pub/software/scm/cogito/README

A brief git tutorial:

http://ksit.dynalias.com/articles.php?art_id=41&s_id=46

Some background on git:

http://kerneltrap.org/node/4982
http://computerworld.com.sg/ShowPage.aspx?pagetype=2&articleid=1176&pubid=3&issueid=49