I don't mean a non-sucky CLI for git. I mean something more fundamental, something that connects with common programming workflows so well that we can stop discussing the tool altogether.
I'm not sure what that would be, but I hope that one day someone smarter than me will invent it.
Bzr just does what you mean. Revert reverts, pull pulls, merge merges. I don't have to remember whether I need a soft or hard reset, which command takes a file as an argument and which doesn't, or which can potentially destroy my changes. (Also, no command in bzr ever destroys your changes, not even hard reset; it keeps backups you must delete yourself.)
This makes for a simpler mental model and also makes it simpler to keep separate things separate (it also has its downsides, but life is full of trade-offs). It also makes it easier to visualize the revision history and allows one to identify versions globally via branch + serial number rather than a hash.
It is rather unfortunate that Bzr development has stagnated and that DAG-based tools (Git and Mercurial) are the only major players left. Different workflows and organizational requirements benefit from different tools, and the Git/HG monoculture has started to worry me a bit.
[1] To be clear, Bzr has added co-located branches as an option since then on its own.
Fortunately, Git does 99% of that, and with rebase it's just the right tool for the job. Especially with git-rerere enabled.
As far as I am concerned, the need to (sometimes!) do rebase by hand is an artifact of Git's commit history being strictly ordered by time. But just try to remove that constraint, and whoever considered rebase complex will go completely crazy ^^
That's the default of git log output but can be adjusted via --topo-order and --date-order. It's also worth pointing out that git commits have two timestamps, the author timestamp which is not normally affected by rebasing/amending, and the commit timestamp which is reset by rebasing/amending. Git log (again by default) shows the author timestamp but orders by commit timestamp.
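To make the two-timestamp point concrete, here is a rough sketch in a throwaway repository. The identity, file name, and the two pinned timestamps are placeholders I made up; the only thing it tries to show is that amending refreshes the commit timestamp and leaves the author timestamp alone.

```shell
set -eu
repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git config user.name "Example Dev"            # placeholder identity
git config user.email "dev@example.invalid"

echo one > file.txt
git add file.txt
# Pin both timestamps (raw "<unix seconds> <utc offset>" form) so the run is reproducible.
GIT_AUTHOR_DATE="1325419200 +0000" GIT_COMMITTER_DATE="1325419200 +0000" \
  git commit -q -m "first version"
author_before=$(git log -1 --format=%at)
committer_before=$(git log -1 --format=%ct)

# Amending rewrites the commit: the commit timestamp is reset,
# the author timestamp is carried over unchanged.
GIT_COMMITTER_DATE="1338552000 +0000" \
  git commit -q --amend -m "first version, reworded"
author_after=$(git log -1 --format=%at)
committer_after=$(git log -1 --format=%ct)

echo "author:    $author_before -> $author_after"
echo "committer: $committer_before -> $committer_after"

# Show both dates side by side, then the two alternative orderings.
git --no-pager log --date=iso --format='%h author=%ad committer=%cd %s'
git --no-pager log --oneline --topo-order
git --no-pager log --oneline --date-order
```

With a single commit the ordering flags obviously make no visible difference; they only start to matter once history has parallel branches.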
It signals to me that these tools are complicated in a way that will irritate me and that I should avoid these topics until the smart people have fixed the problem and reached a consensus.
Hey. It worked for CSS. I avoided it for a couple of years and most of the problems have been solved for me. ;-)
Workflows are very specific and personal. I do not believe that there will be any consensus.
Bazaar is one revision control system that does this right. Each commit is tagged with the branch to which it belongs, so any visualization of the commit graph will by default hide all the side commits. History appears neat and linear at first, but if you need to track something down to the original commits, you can expand the merges to see the exact order in which commits were made.
These suspicions aren't really based on anything, though.
This feeling lasted about five minutes. Then I moved on doing real work.
PS: you won't be able to pry rebase -i and --onto from my cold, dead hands because I'll be clinging to them all the way into whatever form of afterlife.
Also, take a step back and look at the history of git. Git was created by Linus Torvalds specifically for Linux kernel development. I'd argue that a key reason that the kernel is so successful is that people are able to maintain history as a first-class entity in their project. The idea that you can 'rebase -i' to build up small, neat commits that will almost always apply cleanly to a sane codebase is wonderful. The fact that I don't need extreme foresight to capture my meaningful units of work into individual commits means that years from now I can look back and see what I was actually doing instead of "wait, was that line deleted as part of the feature, or was he just cleaning up warnings?"
Remember that these features aren't for developers, they're for maintainers. If you want your code in the kernel, you follow the kernel development process or GTFO. Linus doesn't sit around saying "shucks darn, it didn't merge cleanly, I guess I'll go fix it for them." He just doesn't have the time, and neither do his "deputies."
That's not to say that these features don't benefit developers; they do. It's just that you need to have seen them in action to understand why.
And finally, I'm genuinely curious... Why are some people so obsessed with perfect preservation of history? Is this some sense of fear/paranoia? In practice I've never found project history to be useful without modification, so what am I missing? What are people trying to preserve?
I think it's a conflation of having something like incremental backups versus having (as you so eloquently put it) a cleaned up log of development. Sure, you can use a VCS to record the minutiae of every little thing that changes so you have a "snapshot" of the code at any point in time. And git will do that if you want it.
But I'd also have to second your thoughts that git is VCS done right, that is, by maintainers. All code will have to be maintained sooner or later, and as someone who has had to maintain plenty of code, I can tell you I don't care at all about every little change that's made. Even when I'm bisecting a bug, I don't want to have to skip over every stupid bit that was twiddled, or see commits that are immediately reverted by the next commit. That's garbage. I want to see conceptual chunks, things that hang together because a human thought of them in the terms of "this is a feature" or "this fixes a bug". Should commits make the Minimum Necessary Change? Yes. Should a new feature or bug fix be split across several commits, possibly separated by other, unrelated commits, because that's the way some sleep deprived programmer thought of them? Do you like to read author's notes about their novels instead of the edited novels?
To me, it's the same as testing code. You don't need tests when things work perfectly. You only need tests/history when things aren't... And then you are seriously happy you have them.
On the topic of `git pull --rebase`, I think if you have a hard-and-fast rule that you employ without thinking about what you are doing to your commits and the state of the repository then you are doing it wrong (whether that is blindly merging or rebasing)... But that's just me.
I've found that on projects which disallow the modification of history answering this question is more difficult than if each committer was responsible for recomposing their commits before merging their features (preferably a FF-merge, of course). Meaningful/useful code isn't lost as you're not modifying the long-term history of the project, just your own recent commits relative to the task at hand. Authorship isn't lost, as even if the recomposition is handled by another person, you can always set the author for a commit arbitrarily, and indicate your presence as the maintainer by signing.
Put differently, responsible devs never modify other people's history (and unless you're sharing the same machine, git makes this difficult with push vs push -f). They modify their own history in an effort to limit the noise that other devs are exposed to and to make the maintainer's job easier. The goal is to treat the repository as a full-fledged mechanism for communication and coordination with the rest of the team.
For example, have a central repo that is the source of immutable history, and have every developer clean up their history into a small linear set of commits before they merge into that. You still have just as much accountability -- nothing can get into master without a developer looking at it and tagging it with a commit message. It's just that the commit message comes from a developer looking at and curating the work he just did on a feature or bugfix, instead of the vague assumptions and notions he was working with during development.
If you think people looking back on their recent work will be better at summarizing their motives and achievements than they were while working and experimenting, as I do, then rewriting local history makes a lot of sense. If you don't trust people, and think they are likely to lose relevant information by haphazardly rebasing with messages like "squash for pushing to master, bug #1933" then you might not.
All in all, I think that, for example, 2 clean messages from 2 developers (even relatively uninformative ones) are better to look at than 1 commit from one developer and 13 from the other with messages like "first stab at xyz" and "Oops, forgot to also change the name here".
So, sure, falling back to the merge when things went wrong is ok and all, but odds are high you should go ahead and relook at all of your commits anyway. (Another thing, doesn't the rebase keep the initial author date? It isn't like the history is completely fabricated at this point.)
Of course, I'm a big fan of git rebase -i to do some basic cleanup of your commits before pushing. Leave an excessive amount of log messages in? Rebase them out. Neglect basic documentation since you weren't sure if things were going to change? Rebase them in. Sure, I can sympathise with the "you are messing with history" argument, but I find it challenging to believe that I actually care that you commented last. Or that you actually had a few extra helper classes at some point. etc.
Or 2 developers; it's still just as annoying. As long as you and the other developer are working at the same time you'll have almost as many merges.
This is the real key here. Most don't really want git-merge(1) or git-rebase(1). They want git-go-back-and-extract-my-commits-into-a-topic-branch(1).
A-B-C-D-E-F-G

Where B is master and origin/master, and you decide you want to make C..G into a topic branch, topicA:

git branch topicA

A-B-C-D-E-F-G (master)
A-B-C-D-E-F-G (topicA)

git reset --hard B

A-B (master)
A-B-C-D-E-F-G (topicA)

git merge --no-ff topicA

A-B-----------H (master)
   \         /
    C-D-E-F-G (topicA)

git checkout -b a-topic-branch
git checkout master

then you reset master to the remote master, there is a command to do it but I do this kind of things with gitk.

git checkout master
git reset --hard origin/master

The reflog tracks this locally, but is there any way to push it alongside commits centrally so that the people who wish to preserve a physical development commit history can achieve that? I imagine it will work something like: by default, you see the logical history; but if you wish to delve into the physical history (including a history of who ran rebase commands, and when), you could do that.
Does this make sense and would it be valuable?
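For anyone who wants to try the "extract my commits into a topic branch" steps from a few comments up without risking a real repository, here is a sketch in a scratch directory. I am assuming origin/master sits at B while local master has run ahead to G; the bare repository standing in for the remote, the identity, and the commit helper are all made up for illustration.

```shell
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# A bare repository stands in for the shared remote.
git init -q --bare "$work/origin.git"
git init -q "$work/clone"
cd "$work/clone"
git symbolic-ref HEAD refs/heads/master   # initial branch is 'master' whatever the local default
git config user.name "Example Dev"        # placeholder identity
git config user.email "dev@example.invalid"
git remote add origin "$work/origin.git"

commit() {  # hypothetical helper: one small commit whose message is $1
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "$1"
}

commit A
commit B
git push -q origin master                 # origin/master now sits at B
for c in C D E F G; do commit "$c"; done  # local-only work piled onto master

git branch topicA                         # topicA points at G, same as master
git reset -q --hard origin/master         # master goes back to B
git merge -q --no-ff -m "Merge topicA" topicA   # H, an explicit merge commit

git --no-pager log --graph --oneline master
```

Nothing is lost at the reset step because topicA still points at G; the reflog is only a safety net here, and as far as I know it stays local to the repository where the commands were run.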
This video comes to mind: http://www.youtube.com/watch?v=CDeG4S-mJts
Git is fast, but it's a clusterfuck of weird command calls and esoteric flags. I kind of miss Mercurial in this regard, but I had to make the switch due to the popularity of Github. Having open source projects is a very nice way to show potential employers that you are a good asset.
Everything about GitHub is great except for the fact that you have to use Git.
I have 40 developers working at my company, all doing pull --rebase. I even blocked trivial merges on the server itself (see my answer at http://stackoverflow.com/a/8936474/258689).
Laziness is only acceptable when you work alone.
If you are curious, check out this project: https://github.com/orefalo/g2
Along the same lines, what is the point of the "it builds" widgets that I'm seeing lately? Unless you have some kind of stable release available, it had better build.
If you use topic branches for every feature and bug fix, then you can even test them in an integration branch (often called 'next') so that they can interact with other new features before graduating to 'master'. This makes 'master' more stable which is good for users and good for developers because they can be more confident that a bug in their topic branch was introduced in their branch. It is also easier to make releases.
Use of a 'next' integration branch also relieves some of the pressure from merging new features. Other developers' _work_ is not affected if 'next' is broken and the merge can be reverted without impacting the history that ultimately makes it into 'master'. Running 'git log --first-parent master' [4] will show only merges, one per feature, and each feature has already been tested in 'next', interacting with everything in 'master' as well as other new features. See gitworkflows(7) [5] for more on 'master'/'next'.
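A small sketch of what 'git log --first-parent master' buys you, using a scratch repository with two made-up topic branches. Branch names, identity, and the commit helper are assumptions for illustration only.

```shell
set -eu
repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # initial branch is 'master' whatever the local default
git config user.name "Example Dev"        # placeholder identity
git config user.email "dev@example.invalid"

commit() {  # hypothetical helper: one small commit whose message is $1
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "$1"
}

commit initial
git checkout -q -b topic-a master
commit a1
commit a2
git checkout -q -b topic-b master
commit b1

# Integrate each topic with an explicit merge commit.
git checkout -q master
git merge -q --no-ff -m "Merge topic-a" topic-a
git merge -q --no-ff -m "Merge topic-b" topic-b

all_count=$(git rev-list --count master)
first_parent_count=$(git rev-list --count --first-parent master)
echo "all commits: $all_count, first-parent only: $first_parent_count"

git --no-pager log --first-parent --oneline master
```

The first-parent view lists the two merges plus the root commit; the individual a1/a2/b1 commits are still there, just one level down.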
If we acknowledge that 'master' (and possibly 'next') are only for integration, then we don't have the problem of 'git pull' creating a funny merge commit because we're developing in a topic branch, but the same behavior occurs when we run 'git merge master' (or 'git pull origin master'). This is a merge from upstream and usually brings a lot of code that we don't understand into our branch. These "just keeping current" commits annoy Linus [2,3] because they do not advance the purpose of the topic branch ("to complete feature/bugfix X so that it can be merged to 'master'"). Linus' short and sweet rule of thumb [3] is
If you cannot explain what and why you merged, you
probably shouldn't be merging.
We can usually only explain a merge from upstream when we (a) merge a known stable point like a release or (b) merge because of a specific conflict/interaction, in which case that should go into the merge commit. If you use 'git merge --log', merges from topic branches contain a nice summary while merges from upstream usually have hundreds or thousands of commits that are unrelated to the purpose of your branch.
[1] http://gitster.livejournal.com/42247.html (Junio Hamano: Fun with merges and purposes of branches)
[2] http://lwn.net/Articles/328436/ (Rebasing and merging: some git best practices)
[3] http://yarchive.net/comp/linux/git_merges_from_upstream.html (Linus Torvalds: Merges from upstream)
[4] http://git-blame.blogspot.com/2012/03/fun-with-first-parent.... (Junio Hamano: Fun with --first-parent)
[5] https://www.kernel.org/pub/software/scm/git/docs/gitworkflow...
I could see it making more sense if you're on a well understood periodic release cycle, where breaking next isn't critical, and everyone knows to have it stabilized in time for the next release.
The amount of time required for a topic to stabilize in 'next' depends on the topic and what it affects, but you can easily summarize "branches in next, but not in master" to look for candidates.
Feature releases are tagged on 'master' and 'next' is usually rewound at a release (create a new 'next' branch starting at the release, merge all the branches that failed to graduate in this release cycle, and discard the old 'next'). This is easy to automate.
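Roughly what the "branches in next, but not in master" summary and the rewind could look like as a script. This is one possible way to do it, not necessarily how any particular project automates it; the topic names, the v1.0 tag, and the helper are invented for the example.

```shell
set -eu
repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # initial branch is 'master' whatever the local default
git config user.name "Example Dev"        # placeholder identity
git config user.email "dev@example.invalid"

commit() {  # hypothetical helper: one small commit whose message is $1
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "$1"
}

commit initial
git branch next
for t in topic-a topic-b topic-c; do
  git checkout -q -b "$t" master
  commit "$t"
done

# topic-a and topic-b are cooking in 'next'; topic-c is merged nowhere yet.
git checkout -q next
git merge -q --no-ff -m "Merge topic-a into next" topic-a
git merge -q --no-ff -m "Merge topic-b into next" topic-b
# Only topic-a graduates to 'master' in this cycle.
git checkout -q master
git merge -q --no-ff -m "Merge topic-a" topic-a
git tag v1.0                              # hypothetical release tag on master

# Summary: branches whose tip is reachable from 'next' but not from 'master'.
candidates=""
for b in $(git for-each-ref --format='%(refname:short)' refs/heads/); do
  case "$b" in master|next) continue ;; esac
  if git merge-base --is-ancestor "$b" next &&
     ! git merge-base --is-ancestor "$b" master; then
    candidates="${candidates:+$candidates }$b"
  fi
done
echo "in next, not in master: $candidates"

# Rewind: restart 'next' at the release, then re-merge what failed to graduate.
git checkout -q -B next v1.0
for b in $candidates; do
  git merge -q --no-ff -m "Merge $b into next" "$b"
done
```

Recreating 'next' with checkout -B drops the old integration merges in one step, which is why nobody should base work directly on 'next'.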
[1] http://git-scm.com/2010/03/08/rerere.html
[2] http://www.kernel.org/pub/software/scm/git/docs/git-rerere.h...
-imo
http://williamdurand.fr/2012/01/17/my-git-branching-model/
i.e. before merging a feature branch, always rebase it on the tip of the integration branch, then merge it in with --no-ff to record an explicit merge commit on the integration branch, even though a fast-forward is possible. This gets you the temporal straightforwardness of rebase while preserving the fact that there WERE feature branches and their commits are partitioned in history.
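The rebase-then-merge flow described above, sketched in a scratch repository. The branch name 'feature', the identity, and the helper are placeholders; master plays the role of the integration branch.

```shell
set -eu
repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # initial branch is 'master' whatever the local default
git config user.name "Example Dev"        # placeholder identity
git config user.email "dev@example.invalid"

commit() {  # hypothetical helper: one small commit whose message is $1
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "$1"
}

commit initial
git checkout -q -b feature
commit f1
commit f2
git checkout -q master
commit m1                                 # the integration branch moves on meanwhile

git checkout -q feature
git rebase -q master                      # replay f1 and f2 on top of m1
git checkout -q master
# A fast-forward would be possible now; --no-ff records a merge commit anyway.
git merge -q --no-ff -m "Merge feature" feature

git --no-pager log --graph --oneline master
```

The graph shows f1 and f2 as a bump hanging off m1, closed by the merge commit, so the branch stays visible in history even though nothing actually diverged.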
It's not a dichotomy, it's about clear semantics: what a feature branch is (clearly defined linear progress off of an upstream) and what a merge means (integrating that progress and vetting the result).
I feel like this echoes the tables vs divs debate: use a table for tabular data and position containers for layout.
In both cases, there are some fringes who argue that you should do one or the other for both use cases - semantics be damned. Git is newer so the fringes are just bigger.
git config branch.master.rebase true
git config branch.develop.rebase true
This makes any pull on master or develop behave as pull --rebase.
Explicit is better than implicit.
Setting up the rebase in the config of a specific branch stays explicit because Git will
If the rebase is not straightforward then you can still abort it.
I would love for git to have a config option that forces ff-only on pull, but based on what I know you need to create an alias to have `pull --ff-only` replace `pull`.
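For what it's worth, Git releases newer than this thread have, as far as I know, a `pull.ff` setting that covers this without an alias. The sketch below assumes the installed Git supports it; the two scratch repositories, the identity, and the helper are made up. It sets `pull.ff` to `only` in a clone, lets the clone and its upstream diverge, and checks that a plain pull refuses to create a merge.

```shell
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

commit() {  # hypothetical helper: one small commit in the current repository
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "$1"
}

git init -q "$work/upstream"
cd "$work/upstream"
git symbolic-ref HEAD refs/heads/master   # initial branch is 'master' whatever the local default
git config user.name "Upstream Dev"       # placeholder identity
git config user.email "upstream@example.invalid"
commit base

git clone -q "$work/upstream" "$work/clone"
commit upstream-change                    # upstream moves on after the clone was taken

cd "$work/clone"
git config user.name "Example Dev"        # placeholder identity
git config user.email "dev@example.invalid"
git config pull.ff only                   # assumption: this Git honours pull.ff
commit local-change                       # the clone has now diverged from upstream

if git pull -q origin master 2>/dev/null; then
  result=merged
else
  result=refused
fi
echo "pull was $result"
```

When the pull is refused the local branch is left untouched, and you decide explicitly whether to rebase or merge.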
Does this fit into the above workflow at all, or is it only for those who are working off master or sharing branches with other developers?
(I usually follow something approximating this flow: http://julio-ody.tumblr.com/post/31694093196/working-remotel...)