DVCS and DAGs, Part 2
In Part 1 of this article, I talked about the differences between modeling version
control history as a DAG vs. a Line. The two most noteworthy kinds of feedback
I received on this entry were:
- Several people accused me of spreading pro-Line-model FUD
because I mentioned some of the problems that happen with the DAG model
and stopped short of saying that the DAG model is going to cure cancer,
eliminate global warming and bring peace to the Middle East.
- Several people asked me how I drew those really cool DAG pictures.
Before I continue with Part 2, allow me to briefly respond
to these two pieces of feedback.
My response about DVCS advocacy
Yes, my company ships a version control tool that is built
on the Line model of history. Therefore, any DVCS is, to a certain extent, a
competitor to my product.
I further acknowledge that I am breaking the rules.
- Business folks like me aren't supposed to ever say
anything positive about their competitors.
- Our job is to feel threatened by change, and to spread
that fear around to others.
- We're supposed to pretend like we don't know that every
design choice has tradeoffs, and to insist that our way is better in all cases.
As my Mother can confirm, I don't always follow the rules
very well. :-)
The simple fact is that I find this stuff interesting. I have
been working in the version control industry for over a decade. I am writing a
book on the topic. This is what I do. It's interesting to me.
But there is more happening here than just me being an
entrepreneurial rebel. Let's see -- how can I say this nicely?
You Git fans need to chill.
Seriously, rabid advocacy by Git fans is making the world a
lousy place to live. Git is really cool, but it is not the right tool for everyone.
In their defense, let's acknowledge that the apple didn't
fall far from the tree on this particular issue. When people begin exploring
DVCS, often one of the first things they find is the video of Linus Torvalds
and his 2007 presentation about Git. And what they find there is someone who doesn't
seem to get it.
Folks, Subversion is probably the most popular version
control tool in the world right now. Almost everyone using a version control
tool today is using one that is built on the Line model of history, and they're
using these tools successfully and productively. When someone refuses to
acknowledge any validity in that model, they look clueless.
The Torvalds video has done plenty of damage. That kind of
attitude is a big turn-off for people interested in what's new in the world of version control.
So, my fellow admirers of Git, if you are trying to prevent
people from using DVCS tools and make sure that they stay confined to their
current niches, then keep up the good work.
But if you really want to help the world see the benefits of
Git and similar tools, then start realizing that people were getting productive
work done before they existed.
My response about those cool diagrams
My DAG pictures were drawn by SourceGear's graphic artist,
John Woolley, who also did all the artwork for the Evil Mastermind comic books.
John is doing the layout and illustration work for my upcoming source control
book as well.
However, because John's DAG pictures got more praise than my
"thousand words", I have decided to be bitter and refuse to include any of his
work in this blog entry. :-)
OK, let's talk more about DAGs
As I mentioned in Part 1, if a DAG is allowed to grow
without guidance, things can turn into a real mess. DAGs are easier to
create. Lines are easier to use. As soon as we embrace the DAG model to gain
all its benefits, the very next thing that happens is that we want Lines back.
This is why every DVCS has features that can be used to make
sure the DAG grows with guidance. Those features are designed to discourage
people from committing without taking any responsibility for the complexity
that increases every time we add another point of divergence.
In other words, every DVCS has features that allow
developers to take a piece of the DAG and treat it like a Line.
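To make the idea of leaves concrete, here is a minimal Python sketch. It assumes a toy history stored as a dictionary that maps each node to its list of parents. The names find_leaves and has_single_leaf are hypothetical and are not taken from any DVCS.

```python
def find_leaves(parents):
    """Return the nodes that no other node lists as a parent."""
    has_child = {p for ps in parents.values() for p in ps}
    return sorted(n for n in parents if n not in has_child)


def has_single_leaf(parents):
    """With exactly one leaf, there is only one place to commit next."""
    return len(find_leaves(parents)) == 1


# Toy history: 1 <- 2 <- 3 is a plain Line.
line = {"1": [], "2": ["1"], "3": ["2"]}
# Node 4 is also committed on top of 2, so the DAG diverges.
diverged = dict(line, **{"4": ["2"]})
# A merge node with two parents brings the history back to one leaf.
merged = dict(diverged, **{"5": ["3", "4"]})

print(find_leaves(line))      # ['3']
print(find_leaves(diverged))  # ['3', '4']
print(find_leaves(merged))    # ['5']
```

Every divergence adds a leaf, and every merge removes one. That is roughly the bookkeeping these guidance features are trying to keep under control.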
Git guides the growth of the DAG through its support for named
branches. You are discouraged from committing something unless its parent is a leaf.
So, if I use the git checkout command to point my working
directory to a DAG node which is not a leaf, Git politely fusses at me:
    eric$ git checkout 9542b
    Note: moving to "9542b" which isn't a local branch
    If you want to create a new branch from this checkout, you may do so
    (now or later) by using -b with the checkout command again. Example:
      git checkout -b <new_branch_name>
    HEAD is now at 9542b5f... initial
If you only commit things that are based on the leaf, then
your history stays very Line-like.
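Here is a rough Python sketch of why the parent matters. This is a toy model and not how Git implements its warning. The commit function and the dictionary layout (node mapped to its list of parents) are assumptions made for illustration only.

```python
def commit(parents, new_node, parent):
    """Add new_node on top of parent.

    Returns True when parent already had a child, which means this
    commit created a new point of divergence in the DAG.
    """
    already_has_child = any(parent in ps for ps in parents.values())
    parents[new_node] = [parent]
    return already_has_child


history = {"a": [], "b": ["a"], "c": ["b"]}

print(commit(history, "d", "c"))  # False: "c" was the leaf, still a Line
print(commit(history, "e", "b"))  # True: "b" already had a child
```

A tool that wants to guide the growth of the DAG could use a check like this to decide when to fuss at the user.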
Historically, Mercurial has been described as supporting
only one branch per repository instance. Comparisons to Git often focused on
Mercurial's apparent lack of intra-repository branching.
I speak in the past tense here because I have heard that
Mercurial has added additional features in this area.
I mention Mercurial here only so that its fans don't feel
too left out. I can't speak from much experience using this particular tool.
Still, I feel comfortable citing Mercurial as an example of
my point: In [at least] its early releases, Mercurial was guiding the growth
of the DAG by preventing the user from diverging it. This almost certainly
contributed to the widespread perception of Mercurial as a very easy-to-use DVCS.
Bazaar (Bzr) is the DVCS I have used the most, but I still
can't call myself an expert. From my own experience, I would characterize Bzr as
a tool that works very hard to guide the growth of the DAG.
Whenever I push changes from my local repo to a central
server, Bzr requires me to merge in other changes and commit from the leaf,
just like a Line model tool would do.
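The rule can be sketched in a few lines of Python. This is not Bazaar's implementation, only my reading of the behavior described above: a push is accepted when the local tip already contains the server's tip in its ancestry. The function names and the toy DAG are hypothetical.

```python
def ancestors(parents, node):
    """Collect node and everything reachable by following parent links."""
    seen, stack = set(), [node]
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(parents[n])
    return seen


def can_push(parents, local_tip, server_tip):
    """Allow the push only if the local tip already contains the server tip."""
    return server_tip in ancestors(parents, local_tip)


# "2" is on the server. "mine" was committed locally on top of "1".
# "merged" is what I get after merging the server's work into mine.
dag = {"1": [], "2": ["1"], "mine": ["1"], "merged": ["mine", "2"]}

print(can_push(dag, "mine", "2"))    # False: I have to merge first
print(can_push(dag, "merged", "2"))  # True: the server tip is an ancestor
```

Under this rule the server's copy of the history never gains a second leaf, which is exactly what makes it feel like a Line.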
It's rather cool that Bazaar offers me the option of using it with a
central server instead of as a pure DVCS. But in this mode, the same basic
restriction applies: I can't commit anything unless my baseline is the leaf in
When I use Bzr, it usually feels like I am using a
My own preferences
On this particular issue, I actually prefer Git's way of
Bazaar seems to believe that DAG divergence is only
legitimate when it happens in separate repo instances and must be resolved
before anything can be pushed or committed together. This just feels too
heavy-handed for a DVCS. Once I know about the DAG, I want to be allowed to
think that way. I don't mind being warned when I am about to commit a DAG node
which would have an older sibling. But forcing me to merge in order to commit
feels very un-DVCS-like to me.
I like Git's ability to switch my baseline using "git
checkout branchname". I understand that people who are not accustomed to
thinking about the DAG do find this capability to be unintuitive. But I like it.
Note that I still like Line-model tools like Subversion and
Vault as well. I'm just saying that a DAG-model tool should act like one.
Lately, the DVCS which intrigues me the most is Fossil. It
was written by D. Richard Hipp, the same guy who wrote SQLite.
Fossil has a number of interesting features. Most notable
is the built-in support for bug tracking. This is one area where the other
DVCS's all fail. They bring you distributed version control, but when it comes
time for a developer to update the bug tracking system, things suddenly go
back to the centralized world.
Anyway, I'm just getting started with looking closely at
Fossil, but I do like the way its website talks
about this problem of DAG divergence:
Having more than one leaf in the
check-in tree is usually considered undesirable, and so forks are usually
either avoided entirely, as in figure 1, or else quickly resolved as shown in
figure 3. But sometimes, one does want to have multiple leaves. For example, a
project might have one leaf that is the latest version of the project under
development and another leaf that is the latest version that has been tested.
When multiple leaves are desirable, we call the phenomenon branching instead of forking.
Nice. So far, I get the impression that Fossil works like
Git does in this respect. When the DAG diverges, complexity increases. Feel
free to offer me a little protection from that complexity by informing me of
what's going on. But don't get in my way.