Friday, June 28, 2013

Git Branches Are Not Branches

Git branches have confused me (someone who uses mercurial a lot and git a little) for a while, and I have finally realized why. The problem is that "branch" is a poorly chosen name for what a git branch really is. You see, all the changeset history in git is stored as a Directed Acyclic Graph (DAG). The code history might be simple and linear, which gives the DAG a simple path like so (o's are nodes in the graph, called changesets; -'s are references from one node to another, with time progressing from left to right):

o-o-o-o-o

Or the code history and corresponding DAG could be more complicated:

                 o-o-o
                /     
    o-o-o     o-o-o-o-o
   /     \   /         \
o-o-o-o-o-o-o-o-o-o-----o-o

Most English speakers would agree that those parts of the DAG (code history) where a node has two children (representing two parallel lines of development) are called branches. The above example has four branches in the history, four branches in the DAG, right? The confusion with git branches, however, is that the above diagram may actually represent a git repository with only one git branch, and the linear history in the diagram before it could represent a git repository with any number of git branches. A git branch is not a branch in the DAG representation of the changeset history.
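To make the "branch in the DAG" idea concrete, here is a minimal Python sketch that counts fork points (nodes with more than one child) in a small history graph. The graphs and the names `count_forks`, `linear`, and `forked` are made up for illustration; this is not how git stores or reports anything.

```python
from collections import defaultdict

def count_forks(parents):
    """Count nodes that have more than one child in a history DAG.

    `parents` maps each changeset id to the list of its parent ids.
    """
    children = defaultdict(set)
    for node, node_parents in parents.items():
        for parent in node_parents:
            children[parent].add(node)
    return sum(1 for kids in children.values() if len(kids) > 1)

# Hypothetical linear history: a <- b <- c
linear = {"a": [], "b": ["a"], "c": ["b"]}

# Hypothetical history that forks at "a" and merges again at "d"
forked = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}

print(count_forks(linear))  # 0
print(count_forks(forked))  # 1
```

Nothing in this count depends on how many git branches exist; it only looks at the shape of the graph.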

This is possible because a git branch is actually just a label attached to a changeset. It's just a name associated with a node in the DAG, and you can add labels to any node you want. You can also delete these labels any time you want. I believe the git developers chose to use the term branch for these labels because the labels are primarily used to keep track of DAG branches, but in practice the overloading of the term causes a lot of confusion. When a git user says he's deleting a branch, he's really just deleting the label on the branch in the DAG. When a git user shows you a linear history like in the first diagram and then starts talking about the branches contained in that history, he's really just talking about the different labels applied to various changesets in that history.
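Here is a rough Python sketch of that mental model, under the assumption that a branch is nothing more than a name mapped to a changeset id. The dictionaries and the helpers `commit` and `delete_branch` are hypothetical and only model the idea; they are not git's actual data structures or commands.

```python
# Hypothetical linear history of five changesets: c1 <- c2 <- c3 <- c4 <- c5
parents = {"c1": None, "c2": "c1", "c3": "c2", "c4": "c3", "c5": "c4"}

# "Branches" modeled as a plain mapping from label name to changeset id.
# Two labels can point at the same changeset in a perfectly linear history.
branches = {"master": "c5", "feature": "c3", "experiment": "c3"}

def commit(branch, new_id):
    """Add a changeset on top of a label's target and move the label to it."""
    parents[new_id] = branches[branch]
    branches[branch] = new_id

def delete_branch(name):
    """Remove only the label; the changesets stay in `parents`."""
    del branches[name]

history_before = dict(parents)
delete_branch("experiment")
assert parents == history_before  # the history itself is untouched

commit("feature", "c6")  # only now does the DAG actually fork, at c3
print(branches)          # {'master': 'c5', 'feature': 'c6'}
```

In this model, three labels coexist on a linear history, and the graph only gains a real fork once a new changeset is added on top of an older node.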

Labels such as these are very common in computer programs, and there are a number of common English terms that convey a much clearer picture of their function and purpose: label, tag, pointer, and bookmark come to mind. There are pages and pages on the internet that try to explain and clarify what git branches are and what you can and can't do with them when, I believe, a better name would remove the need for most of that. Personally, I now just say label or tag or bookmark in my head whenever I read branch in a git context, and things are much less confusing.

I hope that helps someone besides me who is learning git. Next week I'll talk about how the git index is nothing like an index :-)

(By the way, if you have a choice of which to use, mercurial works about the same as git and has better names for things.)

Tuesday, June 11, 2013

SystemVerilog Constraint Gotcha

I found another one (I guess I still need to order that book). In using the UVM, I have some sequences randomizing other sub-sequences. I really want it to work like this simplified, non-UVM example:

class Foo;
   rand int bar;
   function void display();
      $display("bar: %0d", bar);
   endfunction 
endclass

class Bar;
   int bar;
   function void body();
      Foo foo = new();
      bar = 3;
      foo.display();
      assert(foo.randomize() with {bar == this.bar;});
      foo.display();
   endfunction 
endclass

module top;
   Bar bar;
   initial begin 
      bar = new();
      bar.body();
   end 
endmodule 

See the problem there? Here's what prints out when you run the above:

bar: 0
bar: -1647275392

foo.bar is not constrained to be 3 like you might expect. That's because this.bar refers to bar that is a member of class Foo, not bar that's a member of class Bar. As far as I can tell, there is no way to refer to bar that is a member of Bar in the constraint. I guess Foo could have a reference back up to Bar, but that's really awkward. Has anyone else run into this? How do you deal with it?

UPDATE: Thank you to Mihai Oncica for pointing out that the local keyword with the scope resolution operator can be used to solve this problem. Here is the now-working code example:

class Foo;
   rand int bar;
   function void display();
      $display("bar: %0d", bar);
   endfunction 
endclass

class Bar;
   int bar;
   function void body();
      Foo foo = new();
      bar = 3;
      foo.display();
      assert(foo.randomize() with {bar == local::bar;});
      foo.display();
   endfunction 
endclass

module top;
   Bar bar;
   initial begin 
      bar = new();
      bar.body();
   end 
endmodule 

And here is the result:

bar: 0
bar: 3