
Inspection and Comparison

So now you have a bunch of branches that you are using for short-lived topics, long-lived features and whatnot. How do you keep track of them? Git has a couple of tools to help you figure out where work was done, what the differences between two branches are, and more.

In a nutshell you can use git log to find specific commits in your project history - by author, date, content or history. You can use git diff to compare two different points in your history - generally to see how two branches differ or what has changed from one version of your software to another.

git log filter your commit history

We've already seen how to use git log to compare branches, by looking at the commits on one branch that are not reachable from another. (If you don't remember, it looks like this: git log branchA ^branchB.) However, you can also use git log to look for specific commits. Here we'll be looking at some of the more commonly used git log options, but there are many more. Take a look at the official docs for the whole list.
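As a quick refresher, here is a minimal sketch of that branch comparison, run in a throwaway repository. The branch names main and topic and the commit messages are made up for the demo; main..topic is simply another spelling of topic ^main.

```shell
#!/bin/sh
set -e

# Throwaway repository; every name and message here is invented for the demo.
repo=$(mktemp -d)
trap 'cd /; rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # pin the branch name to "main"
git config user.name "Demo User"
git config user.email "demo@example.com"

git commit -q --allow-empty -m "base"
git checkout -q -b topic
git commit -q --allow-empty -m "topic work 1"
git commit -q --allow-empty -m "topic work 2"
git checkout -q main
git commit -q --allow-empty -m "main work"

# Commits reachable from topic but not from main, written two equivalent ways.
caret=$(git log --format=%s topic ^main)
dots=$(git log --format=%s main..topic)
printf '%s\n' "$caret"
```

Both spellings list only the two commits made on topic, newest first; "base" and "main work" are left out because they are reachable from main.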

git log --author look for only commits from a specific author

To filter your commit history down to only the commits made by a specific author, you can use the --author option. For example, let's say we're looking for the commits in the Git source code done by Linus. We would type something like git log --author=Linus. The search is case sensitive and also matches against the email address. I'll do the example using the -[number] option, which limits the results to the last [number] commits.

$ git log --author=Linus --oneline -5
81b50f3 Move 'builtin-*' into a 'builtin/' subdirectory
3bb7256 make "index-pack" a built-in
377d027 make "git pack-redundant" a built-in
b532581 make "git unpack-file" a built-in
112dd51 make "mktag" a built-in
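If you want to try this without cloning the Git source, here is a small sketch in a scratch repository. The authors Alice Liddell and Bob Sample, their addresses and the commit messages are all hypothetical; the -i flag (ignore case) is the only option used here that the text above doesn't mention.

```shell
#!/bin/sh
set -e

# Throwaway repository; authors and messages are invented for the demo.
repo=$(mktemp -d)
trap 'cd /; rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

git commit -q --allow-empty --author="Alice Liddell <al@example.com>" -m "add parser"
git commit -q --allow-empty --author="Bob Sample <bob@corp.example>" -m "fix parser"
git commit -q --allow-empty --author="Alice Liddell <al@example.com>" -m "document parser"

by_name=$(git log --author=Alice --format=%s)          # matches the author name
by_email=$(git log --author=corp.example --format=%s)  # matches the email address
wrong_case=$(git log --author=alice --format=%s)       # case sensitive: no match
any_case=$(git log -i --author=alice --format=%s)      # -i ignores case
latest=$(git log --author=Alice --format=%s -1)        # -1 keeps only the newest hit

printf 'by name:\n%s\n' "$by_name"
printf 'by email:\n%s\n' "$by_email"
```

Note that the lowercase search finds nothing here only because neither Alice's name nor her address contains "alice" in lowercase.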

git log --since --before filter commits by date committed

If you want to filter your commits down to a date range that you're interested in, you can use a number of options: --since and --after are synonyms, as are --until and --before. These options compare against the date each commit was committed, which can differ from the date it was authored (after a rebase, for example). For example, if I wanted to see all the commits in the Git project before 3 weeks ago but after April 18th, I could run this (I'm also going to use --no-merges to remove merge commits):

$ git log --oneline --before={3.weeks.ago} --after={2010-04-18} --no-merges
5469e2d Git 1.7.1-rc2
d43427d Documentation/remote-helpers: Fix typos and improve language
272a36b Fixup: Second argument may be any arbitrary string
b6c8d2d Documentation/remote-helpers: Add invocation section
5ce4f4e Documentation/urls: Rewrite to accomodate transport::address
00b84e9 Documentation/remote-helpers: Rewrite description
03aa87e Documentation: Describe other situations where -z affects git diff
77bc694 rebase-interactive: silence warning when no commits rewritten
636db2c t3301: add tests to use --format="%N"
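Relative dates like 3.weeks.ago make the output above hard to reproduce, so here is a sketch with fixed dates in a scratch repository. The commit_at helper, the dates and the messages are all made up; the last commit is deliberately authored inside the range but committed outside it, to show which of the two dates the filter looks at.

```shell
#!/bin/sh
set -e

# Throwaway repository; dates and messages are invented for the demo.
repo=$(mktemp -d)
trap 'cd /; rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

# Hypothetical helper: commit_at <author-date> <committer-date> <message>
commit_at() {
  GIT_AUTHOR_DATE="$1" GIT_COMMITTER_DATE="$2" \
    git commit -q --allow-empty -m "$3"
}

commit_at "2010-04-10 12:00:00 +0000" "2010-04-10 12:00:00 +0000" "april 10"
commit_at "2010-04-20 12:00:00 +0000" "2010-04-20 12:00:00 +0000" "april 20"
commit_at "2010-05-01 12:00:00 +0000" "2010-05-01 12:00:00 +0000" "may 1"
commit_at "2010-05-20 12:00:00 +0000" "2010-05-20 12:00:00 +0000" "may 20"
# Authored inside the range but committed outside it (think of a late rebase).
commit_at "2010-04-25 12:00:00 +0000" "2010-06-01 12:00:00 +0000" "authored april 25, committed june 1"

in_range=$(git log --format=%s --after=2010-04-18 --before=2010-05-10)
printf '%s\n' "$in_range"
```

Only "may 1" and "april 20" come back. The commit authored on April 25 is skipped because its commit date falls in June.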

git log --grep filter commits by commit message

You may also want to look for commits with a certain phrase in the commit message. You can use --grep for that. Let's say I knew there was a commit that dealt with using the P4EDITOR environment variable and I wanted to remember what that change looked like - I could find the commit with --grep.

$ git log --grep=P4EDITOR --no-merges
commit 82cea9ffb1c4677155e3e2996d76542502611370
Author: Shawn Bohrer 
Date:   Wed Mar 12 19:03:24 2008 -0500

    git-p4: Use P4EDITOR environment variable when set
    
    Perforce allows you to set the P4EDITOR environment variable to your
    preferred editor for use in perforce.  Since we are displaying a
    perforce changelog to the user we should use it when it is defined.
    
    Signed-off-by: Shawn Bohrer 
    Signed-off-by: Simon Hausmann 

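In the version of Git used for these examples, Git logically ORs all --grep and --author arguments. If you want to use --grep and --author to see commits that were authored by someone AND have specific message content, you have to add the --all-match option. (Newer versions of Git already treat --author combined with --grep as AND, so there --all-match mainly matters when you pass several --grep patterns.) In these examples, I'm going to use the --format option so we can see who the author of each commit was.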

If I look for the commit messages with 'p4 depo' in them, I get these three commits:

$ git log --grep="p4 depo" --format="%h %an %s"
ee4fd1a Junio C Hamano Merge branch 'master' of git://repo.or.cz/git/fastimport
da4a660 Benjamin Sergeant git-p4 fails when cloning a p4 depo.
1cd5738 Simon Hausmann Make incremental imports easier to use by storing the p4 d

If I add a --author=Hausmann argument, instead of further filtering it down to the one commit by Simon, it instead will show me all commits by Simon OR commits with "p4 depo" in the message:

$ git log --grep="p4 depo" --format="%h %an %s" --author="Hausmann"
cdc7e38 Simon Hausmann Make it possible to abort the submission of a change to Pe
f5f7e4a Simon Hausmann Clean up the git-p4 documentation
30b5940 Simon Hausmann git-p4: Fix import of changesets with file deletions
4c750c0 Simon Hausmann git-p4: git-p4 submit cleanups.
0e36f2d Simon Hausmann git-p4: Removed git-p4 submit --direct.
edae1e2 Simon Hausmann git-p4: Clean up git-p4 submit's log message handling.
4b61b5c Simon Hausmann git-p4: Remove --log-substitutions feature.
36ee4ee Simon Hausmann git-p4: Ensure the working directory and the index are cle
e96e400 Simon Hausmann git-p4: Fix submit user-interface.
38f9f5e Simon Hausmann git-p4: Fix direct import from perforce after fetching cha
2094714 Simon Hausmann git-p4: When skipping a patch as part of "git-p4 submit" m
1ca3d71 Simon Hausmann git-p4: Added support for automatically importing newly ap
...

However, if I add a --all-match, I get the results I'm looking for:

$ git log --grep="p4 depo" --format="%h %an %s" --author="Hausmann" --all-match
1cd5738 Simon Hausmann Make incremental imports easier to use by storing the p4 d
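Here is a small sketch of --all-match in a scratch repository, with hypothetical authors and messages. It sticks to the combinations I'd expect to behave the same regardless of Git version: several --grep patterns are ORed by default, and adding --all-match requires every pattern (and the author, if given) to match.

```shell
#!/bin/sh
set -e

# Throwaway repository; authors and messages are invented for the demo.
repo=$(mktemp -d)
trap 'cd /; rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

git commit -q --allow-empty --author="Alice Liddell <al@example.com>" -m "import: fix depot path"
git commit -q --allow-empty --author="Bob Sample <bob@corp.example>" -m "import: add tests"
git commit -q --allow-empty --author="Bob Sample <bob@corp.example>" -m "docs: describe depot layout"

# Message mentions "import" OR "depot".
either=$(git log --format=%s --grep=import --grep=depot)
# Message mentions "import" AND "depot".
both=$(git log --format=%s --grep=import --grep=depot --all-match)
# Written by Bob AND mentions "depot".
bob_depot=$(git log --format=%s --author=Bob --grep=depot --all-match)

printf 'either:\n%s\n' "$either"
printf 'both:\n%s\n' "$both"
printf 'bob + depot:\n%s\n' "$bob_depot"
```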

git log -S filter by introduced diff

What if you write really horrible commit messages? Or what if you are looking for when a function was introduced, or where a variable started to be used? You can also tell Git to look through the diff of each commit for a string. For example, if we wanted to find which commits added or removed occurrences of the function name 'userformat_find_requirements', we would run this (note that there is no '=' between the '-S' and what you are searching for):

$ git log -Suserformat_find_requirements
commit 5b16360330822527eac1fa84131d185ff784c9fb
Author: Johannes Gilger 
Date:   Tue Apr 13 22:31:12 2010 +0200

    pretty: Initialize notes if %N is used
    
    When using git log --pretty='%N' without an explicit --show-notes, git
    would segfault. This patches fixes this behaviour by loading the needed
    notes datastructures if --pretty is used and the format contains %N.
    When --pretty='%N' is used together with --no-notes, %N won't be
    expanded.
    
    This is an extension to a proposed patch by Jeff King.
    
    Signed-off-by: Johannes Gilger 
    Signed-off-by: Junio C Hamano 
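To see what -S does and doesn't pick up, here is a sketch in a scratch repository. The file lib.txt and the name helper_fn are hypothetical. The middle commit touches the same file but leaves the number of helper_fn occurrences unchanged, so it should not be reported.

```shell
#!/bin/sh
set -e

# Throwaway repository; file names and messages are invented for the demo.
repo=$(mktemp -d)
trap 'cd /; rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

printf 'call helper_fn\n' > lib.txt
git add lib.txt
git commit -q -m "introduce helper_fn"

printf 'unrelated line\n' >> lib.txt      # same file, helper_fn untouched
git commit -q -am "add unrelated line"

printf 'unrelated line\n' > lib.txt       # the helper_fn line is gone
git commit -q -am "drop helper_fn"

hits=$(git log -Shelper_fn --format=%s)
printf '%s\n' "$hits"
```

The search reports the commit that added the string and the one that removed it, which is usually what you need to trace where a function came from or went.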

git log -p show patch introduced at each commit

Each commit is a snapshot of the project, but since each commit records the snapshot it was based off of, Git can always calculate the difference and show it to you as a patch. That means for any commit you can get the patch that commit introduced to the project. You can either do this by running git show [SHA] with a specific commit SHA, or you can run git log -p, which tells Git to put the patch after each commit. It is a great way to summarize what has happened on a branch or between commits.

$ git log -p --no-merges -2
commit 594f90bdee4faf063ad07a4a6f503fdead3ef606
Author: Scott Chacon <schacon@gmail.com>
Date:   Fri Jun 4 15:46:55 2010 +0200

    reverted to old class name

diff --git a/ruby.rb b/ruby.rb
index bb86f00..192151c 100644
--- a/ruby.rb
+++ b/ruby.rb
@@ -1,7 +1,7 @@
-class HiWorld
+class HelloWorld
   def self.hello
     puts "Hello World from Ruby"
   end
 end
 
-HiWorld.hello
+HelloWorld.hello

commit 3cbb6aae5c0cbd711c098e113ae436801371c95e
Author: Scott Chacon <schacon@gmail.com>
Date:   Fri Jun 4 12:58:53 2010 +0200

    fixed readme title differently

diff --git a/README b/README
index d053cc8..9103e27 100644
--- a/README
+++ b/README
@@ -1,4 +1,4 @@
-Hello World Examples
+Many Hello World Examples
 ======================
 
 This project has examples of hello world in

This is a really nice way of summarizing changes or reviewing a series of commits before merging them or releasing something.

git log --stat show diffstat of changes introduced at each commit

If the -p option is too verbose for you, you can summarize the changes with --stat instead. Here is the same log output with --stat instead of -p:

$ git log --stat --no-merges -2
commit 594f90bdee4faf063ad07a4a6f503fdead3ef606
Author: Scott Chacon <schacon@gmail.com>
Date:   Fri Jun 4 15:46:55 2010 +0200

    reverted to old class name

 ruby.rb |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

commit 3cbb6aae5c0cbd711c098e113ae436801371c95e
Author: Scott Chacon <schacon@gmail.com>
Date:   Fri Jun 4 12:58:53 2010 +0200

    fixed readme title differently

 README |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

Same basic information, but a little more compact - it still lets you see relative changes and which files were modified.
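If you'd like to compare the two views side by side, this sketch builds a two-commit scratch repository and captures git log -p, git show and git log --stat for the newest commit. The README contents mirror the example above, but the repository itself is made up.

```shell
#!/bin/sh
set -e

# Throwaway repository; contents and messages are invented for the demo.
repo=$(mktemp -d)
trap 'cd /; rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

printf 'Hello World Examples\n' > README
git add README
git commit -q -m "add readme"
printf 'Many Hello World Examples\n' > README
git commit -q -am "retitle readme"

patch=$(git log -p -1)      # full patch for the newest commit
shown=$(git show HEAD)      # the same commit, addressed directly
stat=$(git log --stat -1)   # per-file summary only

printf '%s\n' "$stat"
```

The patch and show outputs both carry the changed lines themselves, while the --stat output only names README and counts the lines that changed.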

git diff

Finally, to see the absolute changes between any two commit snapshots, you can use the git diff command. This is largely used in two main situations - seeing how two branches differ from one another and seeing what has changed since a release or some other older point in history. Let's look at both of these situations.

To see what has changed since the last release, you can simply run git diff [version] (or whatever you tagged the release). For example, if we want to see what has changed in our project since the v0.9 release, we can run git diff v0.9.

$ git diff v0.9
diff --git a/README b/README
index d053cc8..d4173d5 100644
--- a/README
+++ b/README
@@ -1,4 +1,4 @@
-Hello World Examples
+Many Hello World Lang Examples
 ======================
 
 This project has examples of hello world in
diff --git a/ruby.rb b/ruby.rb
index bb86f00..192151c 100644
--- a/ruby.rb
+++ b/ruby.rb
@@ -1,7 +1,7 @@
-class HiWorld
+class HelloWorld
   def self.hello
     puts "Hello World from Ruby"
   end
 end
 
-HiWorld.hello
+HelloWorld.hello

Just like git log, you can use the --stat option with it.

$ git diff v0.9 --stat
 README  |    2 +-
 ruby.rb |    4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)
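One detail worth knowing here: git diff v0.9 with a single argument compares the tag against your working directory, so uncommitted edits to tracked files show up too, whereas git diff v0.9 HEAD compares only committed snapshots. A minimal sketch in a scratch repository (the files a.txt and b.txt are hypothetical):

```shell
#!/bin/sh
set -e

# Throwaway repository; file names and messages are invented for the demo.
repo=$(mktemp -d)
trap 'cd /; rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

printf 'one\n' > a.txt
printf 'one\n' > b.txt
git add a.txt b.txt
git commit -q -m "release contents"
git tag v0.9

printf 'one and a half\n' > a.txt
git commit -q -am "change a after release"
printf 'one and a half\n' > b.txt   # edited but not committed

vs_worktree=$(git diff --name-only v0.9)       # tag vs. working directory
vs_head=$(git diff --name-only v0.9 HEAD)      # tag vs. last commit

printf 'since v0.9 (working directory):\n%s\n' "$vs_worktree"
printf 'since v0.9 (committed only):\n%s\n' "$vs_head"
```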

To compare two divergent branches, however, you can run something like git diff branchA branchB, but the problem is that it will do exactly what you are asking: it will give you a patch that would turn the snapshot at the tip of branchA into the snapshot at the tip of branchB. This means that if the two branches have diverged (gone in different directions), the patch will remove all the work that was introduced on branchA and then add everything that was introduced on branchB. This is probably not what you want. You want the changes added to branchB that are not in branchA, so you really want the difference between the point where the two branches diverged and the tip of branchB. So, if our history looks like this:

$ git log --graph --oneline --decorate --all
* 594f90b (HEAD, tag: v1.0, master) reverted to old class name
| * 1834130 (erlang) added haskell
| * ab5ab4c added erlang
|/  
*   8d585ea Merge branch 'fix_readme'
...

And we want to see what is on the "erlang" branch compared to the "master" branch, running git diff master erlang will give us the wrong thing.

$ git diff --stat master erlang
 erlang_hw.erl |    5 +++++
 haskell.hs    |    4 ++++
 ruby.rb       |    4 ++--
 3 files changed, 11 insertions(+), 2 deletions(-)

You see that it adds the erlang and haskell files, which is what we did in that branch, but then the output also reverts the changes to the ruby file that we did in the master branch. What we really want to see is just the changes that happened in the "erlang" branch (adding the two files). We can get the desired result by doing the diff from the common commit they diverged from:

$ git diff --stat 8d585ea erlang
 erlang_hw.erl |    5 +++++
 haskell.hs    |    4 ++++
 2 files changed, 9 insertions(+), 0 deletions(-)

That's what we're looking for, but we don't want to have to figure out what commit the two branches diverged from every time. Luckily, Git has a shortcut for this. If you run git diff master...erlang (with three dots in between the branch names), Git will automatically figure out what the common commit (otherwise known as the "merge base") of the two commits is and do the diff off of that.

$ git diff --stat master erlang
 erlang_hw.erl |    5 +++++
 haskell.hs    |    4 ++++
 ruby.rb       |    4 ++--
 3 files changed, 11 insertions(+), 2 deletions(-)
$ git diff --stat master...erlang
 erlang_hw.erl |    5 +++++
 haskell.hs    |    4 ++++
 2 files changed, 9 insertions(+), 0 deletions(-)

Nearly every time you want to compare two branches, you'll want to use the triple-dot syntax, because it will almost always give you what you want.

As a bit of an aside, you can also have Git calculate what the merge base (the first common ancestor commit) of any two commits would be with the git merge-base command:

$ git merge-base master erlang
8d585ea6faf99facd39b55d6f6a3b3f481ad0d3d

So, you can do the equivalent of git diff master...erlang by running this:

$ git diff --stat $(git merge-base master erlang) erlang
 erlang_hw.erl |    5 +++++
 haskell.hs    |    4 ++++
 2 files changed, 9 insertions(+), 0 deletions(-)

I would of course recommend using the easier syntax, though.
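To convince yourself that the two forms really agree, here is a sketch that rebuilds a small diverged history in a scratch repository and compares all three diffs. The branch names main and feature and the file contents are made up; only the file names echo the example above.

```shell
#!/bin/sh
set -e

# Throwaway repository; branches, files and messages are invented for the demo.
repo=$(mktemp -d)
trap 'cd /; rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # pin the branch name to "main"
git config user.name "Demo User"
git config user.email "demo@example.com"

printf 'puts 1\n' > ruby.rb
git add ruby.rb
git commit -q -m "base"

git checkout -q -b feature
printf 'main = return ()\n' > haskell.hs
git add haskell.hs
git commit -q -m "add haskell"

git checkout -q main
printf 'puts 2 # changed on main\n' > ruby.rb
git commit -q -am "change ruby on main"

two_arg=$(git diff --name-only main feature)      # tip vs. tip
three_dot=$(git diff --name-only main...feature)  # merge base vs. feature
via_base=$(git diff --name-only "$(git merge-base main feature)" feature)

printf 'main feature:\n%s\n' "$two_arg"
printf 'main...feature:\n%s\n' "$three_dot"
```

The tip-to-tip diff drags in ruby.rb, which was only changed on main, while the three-dot form and the explicit merge-base form both report just the file added on feature.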

In a nutshell you can use git diff to see how a project has changed since a known point in the past or to see what unique work is in one branch since it diverged from another. Always use git diff branchA...branchB to inspect branchB relative to branchA to make things easier.

And that's it! For more information, try reading the Pro Git book.