Git is a purely functional data structure (2013) (opens in new tab)

(blog.jayway.com)

227 pointseisokant8y ago97 comments

97 comments

I sometimes like to explain things the other way around. Immutability being version control for program state.

I rarely use functional programming but I certainly see its appeal for certain things.

I think the concept of immutability confuses people. It really clicked for me when I stopped thinking of it in terms of things not being able to change and started instead to think of it in terms of each version of things having different names, somewhat like commits in version control.

Functional programming makes explicit, not only which variables you are accessing, but which version of it.

It may seem like you are copying variables every time you want to modify them but really you are just giving different mutations, different names. This doesn't mean things are actually copied in memory. The compiler doesn't need to keep every versions. If it sees that you are not going to reference a particular mutation, it might just physically overwrite it with the next mutation. In the background "var a=i, var b=a+j", might compile as something like "var b = i; b+=j";

runeks8y ago

> I think the concept of immutability confuses people.

I think it confuses people because it’s framed oddly. Immutability isn’t about being unable to mutate state, it’s about no longer using containers (registers) as variables, such that the equal operator actually means “equals” as opposed to “store”.

In most programming languages, the equal operator works as a “store” operation, which stores a value in a named container/register. In “immutable by default” languages like Haskell, the equal operator actually means “equals”, as in “is synonymous with”.

The essence of immutability is referencing values directly, through synonyms, as opposed to storing them in named registers for later retrieval. When it’s done this way, immutability no longer makes sense: is the number 3 mutable? Can the number 3 be mutated into 4, or are they just two distinct numbers?

canes1234568y ago

I think I understand the point you are making but this is way more confusing for me. Partly this can be blamed on imperative lanuages and = vs ==.

2 more replies

sobellian8y ago

> In the background "var a=i, var b=a+j", might compile as something like "var b = i; b+=j";

Believe it or not, many compilers internally represent mutable variables as a sequence of immutable variables: https://en.wikipedia.org/wiki/Static_single_assignment_form

Edit: I should clarify that I mean "immutable" in the context of primitive values like integers.

macintux8y ago

That's an interesting way to describe it. I've talked a lot about immutability at conferences, but I've never thought about it in those terms. Thanks.

masklinn8y ago

That realisation has been used by environments like Elm's Reactor[0] or its (sadly defunct) Time Traveling debugger. I think Om also had something like that.

Basically, if you use persistent data structures and a unified application state, you can keep a list of all previous application states and you can browse it or ship it for debugging, it's not that expensive.

[0] in debug mode, you get a list of all events having occurred in your application and can instantly move back to that state, and you can import/export that state history: http://elm-lang.org/blog/the-perfect-bug-report

1 more reply

runT1ME8y ago

Nice, good explanation. And one way to look at the State Monad is when you always want the 'latest version' of an immutable 'value' that keeps getting new versions. :)

antoaravinth8y ago

That's good explanation. I guess Immutable.js does uses the same concept behind the scene to retrieve each references i.e like commits. Looks like Immutable.js uses Tries data structures for such operations. May be I'm wrong here as well.

Edmond8y ago

Perhaps for those familiar with "functional data structures" such an analogy is helpful but I find it easier to simply explain git for what it is without adding more exotic nomenclature to it.

Git lets you do version control via full snapshots as opposed to just tracking diffs (even though it does actually do this too behind the scene).

You can think of a full snapshot as saving a copy of your project structure every time you do a commit. The key trick is that git doesn't actually create new copies of the content for each commit but simply maintains a tree structure whose nodes are pointers (via hashing) to the content they represent.

The complication from git is not in understanding the core concept but knowing how best to apply them. There are all sorts of crazy workflows you could implement by manipulating git pointers and their associated patches. As with anything that is flexible, difficulty comes in knowing how to constraint yourself when using it.

pdonis8y ago

> git doesn't actually create new copies of the content for each commit

More precisely, it doesn't create new copies of content that you didn't change. For example, if you have 100 files in your repo and you change one of them and then commit, git creates a new copy of the content of the file you changed--a new blob storing the new file content--and a new tree object that references the new blob instead of the old one, plus the other 99 blobs that store the contents of files you didn't change; the new commit object then references the new tree object (plus the message and metadata). But git never stores diffs between old and new content; it just creates a new blob every time the content of a file changes.

jdmichal8y ago

> But git never stores diffs between old and new content; it just creates a new blob every time the content of a file changes.

Git pack files compress objects by storing them as diff files going backwards. That is, it stores the most recent state in full, then uses patches to go backwards. Because you're more likely to need a recent version in full than an older one.

https://git-scm.com/book/en/v2/Git-Internals-Packfiles

1 more reply

nine_k8y ago

Exactly. Those familiar with functional data structures would thus point you at git and say: heres a structure from the Okasaki book that you use every day.

One more step is to point that futures / promises, and even lists, are monads that a e.g. JS programmer uses every day, too. It reminds me of the old literary character who did not know that he's been speaking prose all his life.

rst8y ago

Particularly since a git repo, as a whole, isn't a functional data structure. The commits (and the graph in which they're embedded) are immutable, but the mapping between branch names and commits gets mutated all the time. (To say nothing of slightly deeper esoterica like the index and stashes.)

masklinn8y ago

That's pretty functional still, the "branches" are just (in clojure terms) atoms, aka atomically mutable references to a structure (namely a commit which is some metadata + a reference to a tree).

kazinator8y ago

> Git lets you do version control via full snapshots as opposed to just tracking diffs.

That's completely orthogonal to whether the version history is immutable with each branch being like singly linked list where we "cons" new things onto the front.

misnome8y ago

That’s the point; because sentences like

> the version history is immutable with each branch being like a singly linked list where we “cons” new things onto the front

Just contributes to the general confusion around git where people decide it’s too complicated to learn.

(Apologies if my sarcasm detector is just broken today)

kazinator8y ago

Git is a purely functional data structure, except for the mutating head pointers, rewriting of tags, various state in the repo related to things like on-going rebases, cherry picks, bisects, ... oh and the index which is one object changing in-place (not to mention working tree, of course).

vickychijwani8y ago

While what you say is technically accurate, I think you're missing the bigger picture here. The author's point still stands: git's commit history (arguably the most important part of a version-control system) can be viewed as a purely-functional data structure, and that view has practical benefits too. I tried to explain more here: https://news.ycombinator.com/item?id=15892013

1 more reply

rubenbe8y ago

I often recommend people to read "Git Internals". If you know how git works internally, it's much easier to understand how it works and the reasons behind it.

https://git-scm.com/book/en/v1/Git-Internals

nine_k8y ago

Even more enlightening is The git Parable.

http://tom.preston-werner.com/2009/05/19/the-git-parable.htm...

randomsearch8y ago

This suggests a poor abstraction.

goialoq8y ago

"Internals" is a poor choice of term. "Data structure" is a better term. Git is "plumbing and porcelain". The plumbing is the core of git. Porcelain are shortcuts. In general, Torvalds projects (Linux, Git) aren't big on abstractions that maximize simplicity-of-use, they focus on doing complex things correctly and quickly. Adding abstraction makes it hard to get details correct and run quickly.

1 more reply

mannykannot8y ago

I think the author has a point in saying that learning Git by trying to map it to Subversion is not the best way to do it, but I don't think analyzing it as a functional data structure adds much insight. To me, it is easier to understand when you look at its purpose, and how it solves the problems of that domain - and the biggest difficulties of version control are on account of the problem being essentially one of distributed, lockless concurrency, something not mentioned in this article.

icc978y ago

I found the explanation from the Immutable JS presentation easier to understand when talking about Immutable data structures [0]

[0]: https://youtu.be/I7IdS-PbEgI?t=5m7s

shurcooL8y ago

Even after 4 years, this remains my favorite, most influential article that helped me understand and feel comfortable with git. It's just a very good analogy.

dustingetz8y ago

for a real database that works like git, see http://www.datomic.com

if git killed svn, datomic kills postgres

xj98y ago

if it were open source maybe, but i'm definitely not going to switch (even though i might want to) for licensing reasons. in fact, i have more motivation to write an libre datomic clone than to pay cognitect anything for their proprietary db.

dustingetz8y ago

do it!

1 more reply

macintux8y ago

As much as I appreciate datomic, that's a poor conclusion to draw. git is objectively better than svn. Postgres is not objectively worse than datomic; there are things that datomic simply can't do efficiently.

dustingetz8y ago

if you go back and read the git vs svn flame wars back when it first came out, people said the same thing. it happens every time there is a paradigm shift technology. the reactjs flame wars of 2013/4 were particularly brutal as everyone and their mom felt qualified to comment. the key idea here is that, at scale, immutability is just better, at nearly everything

Scarbutt8y ago

datomic is to slow on writes, even slower than sqlite, that's because it serializes all writes.

ioquatix8y ago

.... and one day I had the crazy idea to make a database on top of it: https://github.com/ioquatix/relaxo because the underlying immutable data structure makes this quite feasible.

snissn8y ago

Is git a blockchain?

icebraining8y ago

They both use Merkle/Hash trees:

Hash trees are used in the IPFS, Btrfs and ZFS file systems, BitTorrent protocol, Dat protocol, Apache Wave protocol, Git and Mercurial distributed revision control systems, the Tahoe-LAFS backup system, the Bitcoin and Ethereum peer-to-peer networks, the Certificate Transparency framework, and a number of NoSQL systems like Apache Cassandra, Riak and Dynamo.

https://en.wikipedia.org/wiki/Merkle_tree

masklinn8y ago

Mostly the reverse, if you need to relate them a blockchain is a degenerate Git (history, which is a subset of Git itself):

A Git history is a DAG[0] (each commit can have multiple parents) and beyond that a polytree (it can have multiple roots); while a blockchain is an arborescence[2] (there's a single root — the "genesis block"; and each block can only have a single parent).

Further, beyond the technicalities blockchains are generally very linear (the side-chains tend to be pretty short, forks aside) while Git repositories can be extremely broad (have lots of concurrent branches).

[0] https://en.wikipedia.org/wiki/Directed_acyclic_graph

[1] https://en.wikipedia.org/wiki/Polytree

[2] https://en.wikipedia.org/wiki/Arborescence_(graph_theory)

finnthehuman8y ago

Is a hot dog a sandwich?

The answer depends entirely on definition. What properties of bitcoin are essential to a blockchan vs which properties are simply how bitcoin happens to use a blockchain?

If blockchain just means the Merkle tree, then yes.

If it means Merkle tree + a computational consensus system for adding nodes, then no.

linschn8y ago

Short answer, yes. The current commit includes the hash of its parent(s), so its own hash reflects the whole history, and one can not change the history without also changing the current hash. Just like a block contains the hash of the previous block.

sparkie8y ago

That's a Merkle Tree. A blockchain is an application of a Merkle tree in which each node contains transaction data, and a majority of clients agree that the longest chain of blocks is the correct one.

Git also uses a Merkle-DAG, but it is not a blockchain.

2 more replies

doug10018y ago

well the top-level git data structure is pretty close to eg, Scala's Vector, which is an immutable container implemented as a tree with a high branching factor of 32. Modification to such a vector, rebound to a new variable, relies on structural sharing of the original (http://www.codecommit.com/blog/scala/implementing-persistent...)

erikb8y ago

A data structure cannot be functional. I understand what he's trying to say, and agree with most of it, but the word "functional" is purely wrong. What he wants to say is "good". But not all "functional programming" is good, nor is all good programming functional by necessity, despite what your local Lambda The Ultimate nerd tries to tell you.

The Best, when it comes to data structures is a Directed, Acyclic Graph. For instance your typical linux filesystem is a DAG. But there's one problem with DAGs: When they reach a certain complexity human brains are not fit enough to parse them anymore. (programs still can though)

So in many circumstances at least a human programmer needs to take a look at the state of your program and make assumptions about its correctness, which is called debugging. And that's why in Good programs we often use Good data structures instead of The Best.

Good data structures are key->value stores (which you may know as "hash tables" or "dictionaries"), trees, and trees in a simplified special form: lists, each of them being somewhat able to represent the other two, if one can accept a performance hit and/or increased complexity in source code. Dictionaries, trees, lists. That's it. And you do that in every programming language that is at least a little bit interested in being Good.

So there's nothing special or functional about git's data structures, it's just normal Good programming, and a few programmers who are so good at programming that they don't even need to mention it anymore, they breath good programms.

Then of course to the normal bread-earning coder good programs are a rare sight. But the reason is not that they are really rare, the reason is that successful business doesn't really require Good programs to succeed. Mediocre programs are good enough to earn their rent, and most of us spend most of our coding hours to earn our rent.

All that being said, if you don't just want to make money, go and spend some time studying git internals. It will teach you a lot more than most of your teachers/professors taught you combined. Sadly the source code is written by Linux gurus, who like to encrypt their source code with a very special key that only people from their tribe can understand. But the Git Book is actually good enough that you can study quite a lot of the internals from that book. I also suggest writing your own git in your favorite programming language once, to really understand it.

how_gauche8y ago

> A data structure cannot be functional. I understand what he's trying to say, and agree with most of it, but the word "functional" is purely wrong. What he wants to say is "good". But not all "functional programming" is good, nor is all good programming functional by necessity, despite what your local Lambda The Ultimate nerd tries to tell you.

Why is it that whenever functional programming comes up, "real programmers" come out of the woodwork to bloviate, only to expose just how little they actually know about the topic?

"Purely functional" in this case is a jargon term that relates to Okasaki's 1996 PhD thesis, available online (https://www.cs.cmu.edu/~rwh/theses/okasaki.pdf). It's a classification for data structures much in the way "regular" or "context-free" or "turing-complete" serve as classifications for grammars.

For a data structure to be "purely functional" (and no, article author does not mean "good" here) simply means that it's implementable in a pure functional language without mutation. Examples of non-functional data structures would be ones where mutation is intrinsic to how its algorithms work: traditional linear probing hash tables, union-find for graphs, etc.

By this technical criterion, the git object graph is clearly a purely functional data structure, sorry.

erikb8y ago

See the other leaf of this purely functional comment tree for my answer.

1 more reply

LukeShu8y ago

As others have said, "purely functional data structure" is a jargon term with a precise meaning--that fact that this is true indicates that data structures can, in fact, be functional in this sense. To claim that "the word 'functional' is purely wrong" is, well, purely wrong; the data structure in Git being described is purely functional.

Beyond that, you seem to have shown misunderstanding of what that means (despite your claims of "I understand but I disagree"), as the other examples you give DAGs, dictionaries, trees, lists, and this comment thread, do not have the interesting properties of purely functional data structures that make them worth discussing in the context of Git. Namely, the immutability of existing entries.

The thesis of the article isn't that purely functional data structures are Good, or that functional programming is Good. Those are quality judgments that are not present in the article. The thesis is: Explaining Git in terms of the purely functional data structure it uses might be helpful to learning Git.

erikb8y ago

Yes, I didn't accept that part of the logic. Neither do I believe that immutability is something good, nor do I believe that git is very immutable, nor that git gains from immutability.

I believe mutability is a trade off. You can make objects more immutable by being less efficient in terms of storage size, search time, source code simplicity. In exchange for these negative attributes you get source code you can proof more easily. It's good when that attribute is needed, but not most of the time.

Git is only immutable on that meta level where we talk about the object key-value store. The implementation uses mutable files though, and even steps away from that immutable key-value store when the size gets too big and needs to store more efficiently.

And on the user level you can amend, edit, squash and delete commits easily, which is actually a strength of git not a weakness. People love it for being more mutable than SVN.

2 more replies

davidcuddeback8y ago

> A data structure cannot be functional.

I think the author is using the term "purely functional data structure" to mean a data structure that lends itself to an implementation in a purely functional language [1,2].

[1] https://en.wikipedia.org/wiki/Purely_functional_data_structu...

[2] https://www.amazon.com/Purely-Functional-Structures-Chris-Ok...

erikb8y ago

I feel like I already said "I understand but I disagree".

A round wheel is more useful for a car than a triangular wheel. That doesn't mean it's a "car wheel". It's just as good on a horse wagon or a bike.

1 more reply

_0w8t8y ago

Git is not a purely functional data structure [1]

[1] man git-rebase

198702138y ago

But git-rebase does not alter existing commits in the commit tree, it simply creates a new branch (meaning new commits) on the tree.

Simon_says8y ago

Alright wiseguy, git gc.

2 more replies

klodolph8y ago

If Git were purely functional, you would expect rebase not to modify the existing data in any way, and indeed that is exactly what happens. You can create, delete, or modify only the top-level pointers: branch names, reflog, etc. Instead, rebase creates a completely new set of commits, and points the current branch at a new one. This is exactly how functional data structures work in e.g. Haskell, where "inserting" an element into a dictionary means that you get a new copy of the dictionary, and all existing references to the original are unmodified.

_0w8t8y ago

Git rebase alters the structures that are relevant for me, like heads of named branches. In Haskell let bindings are immutable. To reference to the results one has to put them into new bindings. I.e. if Git was purely functional, the rebase would create new names for branches.

4 more replies

fiatjaf8y ago

I don't see that anywhere. A rebase is just a fork.

ianamartin8y ago

I find this conversation fascinating because there is so much disagreement on the meaning of "functional" and "immutable"

What I've gathered so far from reading the article and the comments is that some people who are in the know about a very specific paper agree that Git is a purely functional data structure. And that others look at the ways you can use Git and point a finger and say, "Look! It can be mutated! Therefore it cannot be functional!" And the response to that is, "Don't be so technical about how you define functional. Or immutable. You know it when you see it."

Is this some kind of Obi-wan Kenobi from a certain point of view stuff? Why is this so difficult to get a handle on?

If a thing says immutable on the tin, and it's mutable, how is that purely functional? I know, read the paper. I know. But still, it's a legit question.

It seems to me that a data structure so amazing as being purely functional shouldn't be so easy to misunderstand as what we're seeing here. And it's clearly being misunderstood. And not only by me.

how_gauche8y ago

> some people who are in the know about a very specific paper

This stuff isn't obscure just because you don't happen to know about it already. Chris Okasaki's publications have been cited over 1000 times, mostly for his PhD work -- those papers, especially on functional data structures and amortization analysis in lazy languages, are considered foundational for a whole research area in Computer Science.

> Is this some kind of Obi-wan Kenobi from a certain point of view stuff? Why is this so difficult to get a handle on?

Did you learn calculus overnight, or expect to understand a technical conversation about differential equations and Cauchy sequences without taking two years of classes or reading a couple of really thick books first? Why should this be any different?

> If a thing says immutable on the tin, and it's mutable, how is that purely functional? .... It seems to me that a data structure so amazing as being purely functional shouldn't be so easy to misunderstand as what we're seeing here. And it's clearly being misunderstood. And not only by me.

Sigh. Explaining it properly would involve a tour though the untyped lambda calculus, simply-typed lambda calculus and the Curry-Howard isomorphism, a discussion on denotational vs. operational semantics (Hoare logic, functional interpreters, type-preserving compilation, small-step vs large-step operational semantics, etc).

TL;DR: "purely functional" is a description of the program's meaning (in a technical sense), not its implementation.

ianamartin8y ago

Okay, so if a program means for an ibject to be immutable but it actually is mutable then it’s still immutable if you explain it in terms of basic CS theory.

Got it. Thanks for the attitude.

vickychijwani8y ago

> "Look! It can be mutated! Therefore it cannot be functional!" And the response to that is, "Don't be so technical about how you define functional. Or immutable. You know it when you see it."

I think I can offer something to the discussion here, as I'm straddling the 2 camps - having a fairly intricate understanding of git (I've held training seminars at my company about it), and basic familiarity with functional (technically "persistent") data structures thanks to a long-standing interest in Clojure.

What the functional camp is talking about, when they refer to git as a "purely functional data structure", is the core of git's implementation - commits are immutable snapshots of the repository arranged in a directed acyclic graph (or a tree, if there are no merge commits), and branches are "just" movable pointers to some commit. In this view, everything other than this core model of history as a graph is extra details layered on top.

The other camp (let's call them the "git camp") looks at git in its entirety, including branches, tags, the index, the working tree, stashes, and so on, and they can't help but come to the conclusion you mentioned - "Branches are mutable references! The index is mutable! The working tree is mutable! How can git possibly be called a functional data structure?"

While this 2nd perspective is understandable and technically correct, I think it misses the point the other camp is trying to make: that the immutable commit is the "fundamental unit of git" so to speak, that we interact with everyday, and that most of git's history-manipulating commands (git commit, git commit --amend, git rebase, git merge, git cherry-pick) can be described in terms of purely-functional operations on the commit graph. Once you understand this, many complex git scenarios become easier to understand (rebase, and filter-branch, for example), so this view does have practical value. Then branches, tags, the index, stashes, etc can indeed be understood separately.

In fact, this perspective of git as a data structure is so useful, that I rarely use "git log" as it is, preferring these additional flags to view the entire graph instead:

git log --graph --all --decorate --oneline

I hope that helps, cheers!

ianamartin8y ago

Thank you that. My understanding is that the unit of work in git—the commit—is not actually immutable. Commits can be changed. Am I wrong about that?

1 more reply

j / k navigate · click thread line to collapse

97 comments

BenoitEssiambre8y ago

I sometimes like to explain things the other way around. Immutability being version control for program state.

I rarely use functional programming but I certainly see its appeal for certain things.

Functional programming makes explicit, not only which variables you are accessing, but which version of it.

runeks8y ago

> I think the concept of immutability confuses people.

canes1234568y ago

I think I understand the point you are making but this is way more confusing for me. Partly this can be blamed on imperative lanuages and = vs ==.

2 more replies

sobellian8y ago

> In the background "var a=i, var b=a+j", might compile as something like "var b = i; b+=j";

Believe it or not, many compilers internally represent mutable variables as a sequence of immutable variables: https://en.wikipedia.org/wiki/Static_single_assignment_form

Edit: I should clarify that I mean "immutable" in the context of primitive values like integers.

macintux8y ago

That's an interesting way to describe it. I've talked a lot about immutability at conferences, but I've never thought about it in those terms. Thanks.

masklinn8y ago

That realisation has been used by environments like Elm's Reactor[0] or its (sadly defunct) Time Traveling debugger. I think Om also had something like that.

1 more reply

runT1ME8y ago

Nice, good explanation. And one way to look at the State Monad is when you always want the 'latest version' of an immutable 'value' that keeps getting new versions. :)

antoaravinth8y ago

Edmond8y ago

Perhaps for those familiar with "functional data structures" such an analogy is helpful but I find it easier to simply explain git for what it is without adding more exotic nomenclature to it.

Git lets you do version control via full snapshots as opposed to just tracking diffs (even though it does actually do this too behind the scene).

pdonis8y ago

> git doesn't actually create new copies of the content for each commit

jdmichal8y ago

> But git never stores diffs between old and new content; it just creates a new blob every time the content of a file changes.

https://git-scm.com/book/en/v2/Git-Internals-Packfiles

1 more reply

nine_k8y ago

Exactly. Those familiar with functional data structures would thus point you at git and say: heres a structure from the Okasaki book that you use every day.

rst8y ago

masklinn8y ago

That's pretty functional still, the "branches" are just (in clojure terms) atoms, aka atomically mutable references to a structure (namely a commit which is some metadata + a reference to a tree).

kazinator8y ago

> Git lets you do version control via full snapshots as opposed to just tracking diffs.

That's completely orthogonal to whether the version history is immutable with each branch being like singly linked list where we "cons" new things onto the front.

misnome8y ago

That’s the point; because sentences like

> the version history is immutable with each branch being like a singly linked list where we “cons” new things onto the front

Just contributes to the general confusion around git where people decide it’s too complicated to learn.

(Apologies if my sarcasm detector is just broken today)

kazinator8y ago

vickychijwani8y ago

1 more reply

rubenbe8y ago

I often recommend people to read "Git Internals". If you know how git works internally, it's much easier to understand how it works and the reasons behind it.

https://git-scm.com/book/en/v1/Git-Internals

nine_k8y ago

Even more enlightening is The git Parable.

http://tom.preston-werner.com/2009/05/19/the-git-parable.htm...

randomsearch8y ago

This suggests a poor abstraction.

goialoq8y ago

1 more reply

mannykannot8y ago

icc978y ago

I found the explanation from the Immutable JS presentation easier to understand when talking about Immutable data structures [0]

[0]: https://youtu.be/I7IdS-PbEgI?t=5m7s

shurcooL8y ago

Even after 4 years, this remains my favorite, most influential article that helped me understand and feel comfortable with git. It's just a very good analogy.

dustingetz8y ago

for a real database that works like git, see http://www.datomic.com

if git killed svn, datomic kills postgres

xj98y ago

dustingetz8y ago

do it!

1 more reply

macintux8y ago

dustingetz8y ago

Scarbutt8y ago

datomic is to slow on writes, even slower than sqlite, that's because it serializes all writes.

ioquatix8y ago

.... and one day I had the crazy idea to make a database on top of it: https://github.com/ioquatix/relaxo because the underlying immutable data structure makes this quite feasible.

snissn8y ago

Is git a blockchain?

icebraining8y ago

They both use Merkle/Hash trees:

https://en.wikipedia.org/wiki/Merkle_tree

masklinn8y ago

Mostly the reverse, if you need to relate them a blockchain is a degenerate Git (history, which is a subset of Git itself):

[0] https://en.wikipedia.org/wiki/Directed_acyclic_graph

[1] https://en.wikipedia.org/wiki/Polytree

[2] https://en.wikipedia.org/wiki/Arborescence_(graph_theory)

finnthehuman8y ago

Is a hot dog a sandwich?

The answer depends entirely on definition. What properties of bitcoin are essential to a blockchan vs which properties are simply how bitcoin happens to use a blockchain?

If blockchain just means the Merkle tree, then yes.

If it means Merkle tree + a computational consensus system for adding nodes, then no.

linschn8y ago

sparkie8y ago

Git also uses a Merkle-DAG, but it is not a blockchain.

2 more replies

doug10018y ago

erikb8y ago

how_gauche8y ago

Why is it that whenever functional programming comes up, "real programmers" come out of the woodwork to bloviate, only to expose just how little they actually know about the topic?

By this technical criterion, the git object graph is clearly a purely functional data structure, sorry.

erikb8y ago

See the other leaf of this purely functional comment tree for my answer.

1 more reply

LukeShu8y ago

erikb8y ago

Yes, I didn't accept that part of the logic. Neither do I believe that immutability is something good, nor do I believe that git is very immutable, nor that git gains from immutability.

And on the user level you can amend, edit, squash and delete commits easily, which is actually a strength of git not a weakness. People love it for being more mutable than SVN.

2 more replies

davidcuddeback8y ago

> A data structure cannot be functional.

I think the author is using the term "purely functional data structure" to mean a data structure that lends itself to an implementation in a purely functional language [1,2].

[1] https://en.wikipedia.org/wiki/Purely_functional_data_structu...

[2] https://www.amazon.com/Purely-Functional-Structures-Chris-Ok...

erikb8y ago

I feel like I already said "I understand but I disagree".

A round wheel is more useful for a car than a triangular wheel. That doesn't mean it's a "car wheel". It's just as good on a horse wagon or a bike.

1 more reply

_0w8t8y ago

Git is not a purely functional data structure [1]

[1] man git-rebase

198702138y ago

But git-rebase does not alter existing commits in the commit tree, it simply creates a new branch (meaning new commits) on the tree.

Simon_says8y ago

Alright wiseguy, git gc.

2 more replies

klodolph8y ago

_0w8t8y ago

4 more replies

fiatjaf8y ago

I don't see that anywhere. A rebase is just a fork.

ianamartin8y ago

I find this conversation fascinating because there is so much disagreement on the meaning of "functional" and "immutable"

Is this some kind of Obi-wan Kenobi from a certain point of view stuff? Why is this so difficult to get a handle on?

If a thing says immutable on the tin, and it's mutable, how is that purely functional? I know, read the paper. I know. But still, it's a legit question.

It seems to me that a data structure so amazing as being purely functional shouldn't be so easy to misunderstand as what we're seeing here. And it's clearly being misunderstood. And not only by me.

how_gauche8y ago

> some people who are in the know about a very specific paper

> Is this some kind of Obi-wan Kenobi from a certain point of view stuff? Why is this so difficult to get a handle on?

TL;DR: "purely functional" is a description of the program's meaning (in a technical sense), not its implementation.

ianamartin8y ago

Okay, so if a program means for an ibject to be immutable but it actually is mutable then it’s still immutable if you explain it in terms of basic CS theory.

Got it. Thanks for the attitude.

vickychijwani8y ago

> "Look! It can be mutated! Therefore it cannot be functional!" And the response to that is, "Don't be so technical about how you define functional. Or immutable. You know it when you see it."

In fact, this perspective of git as a data structure is so useful, that I rarely use "git log" as it is, preferring these additional flags to view the entire graph instead:

git log --graph --all --decorate --oneline

I hope that helps, cheers!

ianamartin8y ago

Thank you that. My understanding is that the unit of work in git—the commit—is not actually immutable. Commits can be changed. Am I wrong about that?

1 more reply

j / k navigate · click thread line to collapse