Our own teams have a set of practices which are similar to, but different from, what Linus outlines here. And different projects at my company use different practices from those.
The worst thing is that there's no way of enforcing these workflows or practices other than out-of-band social conventions. And so minor mistakes happen, all the time. Our Git projects are never as pretty as they should be.
In other words, Git provides an awesome set of primitives for source control. I'm not sure what it'd look like, but I'd like to see a product that built on those primitives to enforce a little more order on projects.
Maybe there isn't a "right way". A workflow that suits a simple desktop application is different from what is used by a kernel or another product that has dozens of targets to worry about. Similarly, a web app that gets deployed in a controlled environment will most likely need a different way of working than an end-user application that goes into an app store to be downloaded and run on a variety of devices.
> Our own teams have a set of practices which are similar to, but different from, what Linus outlines here. And different projects at my company use different practices from those.
The culture around your product is probably very different from the kernel devs' culture so it makes sense for you to have a different model.
> The worst thing is that there's no way of enforcing these workflows or practices other than out-of-band social conventions. And so minor mistakes happen, all the time. Our Git projects are never as pretty as they should be.
Enforcing certain kinds of workflow would mean disallowing something that is currently possible: crippling one workflow to standardize on another, when there is no clear evidence that any one workflow is best for everyone.
Everyone has their own ideas on what a clean history is, whether it's linear or has --no-ff merges for every feature. The most important thing is that it is useful. To me and my team that means that every commit on master should build on every target we have (dozens!) so "git bisect" won't be painful.
I agree 100%. Tools that attempt to define culture are an enormous pain and often unusable outside the context understood by their creators. Tools that help you reinforce the culture you decide on for your project are wonderful, but they are rarely as un-opinionated as they need to be.
One thing that strikes me about source control culture is that in centralized environments people are very aggressive about installing pre-commit hooks to enforce rules, but I rarely see people using hooks for git, or even including hooks in their project as a suggestion for other developers to use. I wonder why not?
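For what it's worth, here is a minimal sketch of the kind of hook a project could ship as a suggestion. It is a client-side `commit-msg` check; the two rules (non-empty subject, at most 72 characters) and the function name `check_commit_message` are assumptions made up for illustration, not anyone's actual policy. Git does pass the path of the message file as the first argument to a `commit-msg` hook.

```shell
#!/bin/sh
# Sketch of a client-side commit-msg hook (hypothetical rules).
# Git runs .git/hooks/commit-msg with the path to the message file as $1.

check_commit_message() {
    # The first line of the message file is the subject.
    subject=$(head -n 1 "$1")

    if [ -z "$subject" ]; then
        echo "commit-msg: empty subject line" >&2
        return 1
    fi
    if [ "${#subject}" -gt 72 ]; then
        echo "commit-msg: subject longer than 72 characters" >&2
        return 1
    fi
    return 0
}

# Only act when called the way Git calls a hook (message file as $1).
if [ -n "${1:-}" ]; then
    check_commit_message "$1" || exit 1
fi
```

One likely reason such hooks are rare: the contents of `.git/hooks` are not copied by `git clone`, so every developer has to install the script by hand, and client-side hooks can be skipped with `git commit --no-verify`.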
I think he meant he wants the ability to enforce a certain behavior within his own group.
So... I don't understand. Do you want a tool that makes the kernel branching style illegal, or one that breaks your own team's workflow? If you want one that supports both, how is that providing clarity about the "right" way to do things?
You could even write a meta-tool that allows administrators to define and reify a workflow which would then be enforced for developers on a project.
https://github.com/stephenh/git-central/blob/master/server/u...
Unfortunately I haven't done a lot with this project in a few years since github doesn't allow bash post commit hooks; you'd have to run your own git server.
(Edit to add...)
So, I understand your impression that it's impossible to enforce workflow in git, given GitHub doesn't support it, and most users probably don't want to write complex post-commit scripts.
But it is actually possible.
It'd be nice if communities like git-flow/etc. codified their rules into post-commit hooks that you could install, and maybe GitHub could even vet (e.g. that the bash scripts won't nuke their servers), and provide as out-of-the-box/opt-in options in the admin section of their repos. E.g. "Enforce git-flow in my repo".
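As a rough idea of what such an installable rule might look like, here is a sketch of a server-side `update` hook. Rejecting a push has to happen in a server-side hook such as `update` or `pre-receive`; Git calls `update` once per pushed ref with the ref name, the old revision and the new revision. The policy itself (master is protected, no deletion, no history rewrites) and the names `protected_ref` and `check_update` are assumptions for illustration only.

```shell
#!/bin/sh
# Sketch of a server-side "update" hook (hypothetical policy).
# Git calls hooks/update once per pushed ref with: <refname> <oldrev> <newrev>.

protected_ref="refs/heads/master"   # assumption: master is the protected branch

check_update() {
    refname=$1
    oldrev=$2
    newrev=$3

    # Only police the protected branch; every other ref passes.
    [ "$refname" = "$protected_ref" ] || return 0

    # An all-zero new revision means the ref is being deleted.
    case "$newrev" in
        *[!0]*) ;;
        *) echo "update: deleting $refname is not allowed" >&2; return 1 ;;
    esac

    # An all-zero old revision means the ref is being created.
    case "$oldrev" in
        *[!0]*) ;;
        *) return 0 ;;
    esac

    # Accept only fast-forwards: the old tip must be an ancestor of the new tip.
    if ! git merge-base --is-ancestor "$oldrev" "$newrev"; then
        echo "update: non-fast-forward push to $refname rejected" >&2
        return 1
    fi
    return 0
}

# Only act when called the way Git calls the hook (three arguments).
if [ "$#" -eq 3 ]; then
    check_update "$1" "$2" "$3" || exit 1
fi
```

A git-flow flavoured version would presumably add more rules (branch naming, which branches may merge where), but the shape would be the same: inspect the ref and the two revisions, return non-zero to refuse the push.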
There is no right way. Think about styling. Is there a right style? No. It is silly to argue over your code's appearance. HOWEVER! As soon as you start collaborating with people and reviewing code, a uniform style is a very nice thing to have.
Teamwork creates the need for shared conventions. And that's where your ability to convince your team members of the value of some standardizations comes into play.
> different projects at my company use different practices...
It sounds like your problem is not Git, but lack of organization. I am not sure a more restrictive scm would fix that. You need to find a good way to use Git, and then sell everyone on the benefits of process uniformity.
False. It's called the Dictator and Lieutenants workflow. It's costly, but if you're in a position where you don't trust your own developers or your conventions are severe, then it's a price you have to pay.
If you can't afford it, hire trustworthy developers or dial back your conventions.
It's low-tech, but a human gatekeeper's really your only hope for enforcing whatever conventions your project has.
You need more than one person who can commit to master and is responsible for the merges, but you most certainly don't need every contributor to have commit access.
- hiring developers that appreciate and obey the conventions, or
- reducing the weight of the conventions.
Simply put, if you have conventions that the developers aren't following, your organization is dysfunctional in some way. Management should include the team when crafting the conventions, and management should make an effort to give the team the time and resources to obey them.
I think this is exactly what Linus intended when he designed Git. He explained in a Google talk that he controls what is committed to the kernel by just pulling from people he trusts.
If you try to use git as a centralized version control system you lose control of what gets pushed, regardless of how many rules and workflows you set up. Have devs send pull requests instead and don't accept/merge bad commits.
That said, IMO there is still quite a lot of room for customization in git workflow when using Github. For example, we don't "send patches around" as Linus says. Our private feature branches live on Github but we've adopted the convention that the "private" branch name is prefixed by who's working on it, e.g. mdeboard-oauth, jschmoe-url-routes. If it has someone's name at the front, don't touch it. That enables us to still use the "D" in DVCS while retaining the ability to safely rebase our own work to keep our history clean.
The only reason I'd want a git-based product to "enforce order" is a culture-related one: ensure that contributors/collaborators do things in line with the conventions we've established. However, IMO it's always better to have a conversation about that than work with an overly prescriptive tool.
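The name-prefix convention above is easy to check mechanically, if you want a small safety net rather than a prescriptive tool. A minimal sketch, assuming branches are named `<owner>-<topic>`; the function name `is_own_branch` is made up for illustration:

```shell
#!/bin/sh
# Sketch: guard for the "<owner>-<topic>" private-branch naming convention.

is_own_branch() {
    branch=$1
    owner=$2
    case "$branch" in
        # Owner prefix, a dash, then at least one more character.
        "$owner"-?*) return 0 ;;
        *) return 1 ;;
    esac
}

# Possible use before rewriting history (left as a comment on purpose):
#   branch=$(git rev-parse --abbrev-ref HEAD)
#   is_own_branch "$branch" "$USER" && git rebase master
```

With the branch names from above, `is_own_branch mdeboard-oauth mdeboard` succeeds, while `is_own_branch jschmoe-url-routes mdeboard` and `is_own_branch master mdeboard` fail.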
Firstly and most importantly, thanks to rebase I'm constantly working against the most recent mainline. Merge pain is reduced by frequently dealing with smaller rebase merges instead of trying to do one massive merge at the end, when I've finished a longer-lived task that might last a week or two. The more often you merge, the less painful it is.
Secondly, there's the history-cleaning part, which involves squashing. I believe the issue with viewing only the merge history of the mainline is that you will miss changes that were introduced by fast-forward, without a merge commit. And frankly no one else on the team cares that I committed 6 times in the process of one task; they want to see all the code relevant to that task, and ideally it's all in one changeset.
There's a pretty reasonable summary over here http://blog.sourcetreeapp.com/2012/08/21/merge-or-rebase/
For certain teams rebase just makes a lot of sense.
You can take care of that just by doing frequent regular merges, no need to do rebase ever, and rebase doesn't make this part any easier, does it?
I think the 'cleaning part of history', and trying to avoid those annoying merge commits in the logs, is in fact the only reason to do rebases, no? It's obviously an important one to many people.
With SVN your only real option is to commit something that is working, right? If you commit something broken to SVN then you will likely get yelled at.
With git, you can make a few changes, then think "hmm, that might not be the best way to fix it", do a commit, and then rip out everything you just did and do it a different way.
Or maybe you added some instrumentation for debugging the problem, committed, then fixed the problem, committed, then removed the instrumentation.
In both cases git has let you save off information that you might need during the bugfix process, but that ultimately isn't needed in the final history. With SVN, you likely wouldn't check in those intermediate steps, so the final history in SVN would be a single commit of "Fix bug foo".
Is there any need for everyone to have these intermediate commits in their history? I guess that's a matter of taste. I think the main thought is that rebases improve the signal-to-noise ratio of the changelog.
Sorry for basic questions but I'm new to git.
So why have it?
> I'm still new-ish to git and don't get why rebase is popular.
My most common use case for rebase is actually to keep my private branches up to date with master. `git rebase master` or `git fetch origin && git rebase origin/master` are common tools for me when I'm doing private work for an extended period of time. This way, I don't have a point where my private branch diverges from master; my changes are always fresh and based off the latest and greatest.
When I develop, I split my commits into as many small changes as I can so that the commit messages are single topic. I thought that was basically the idea. Every once in a while I use rebase to combine a few commits that should have been done together as they all addressed the same issue. This all seems right to me. I am left with a clean history of everything I have done on a very fine-grained time scale. But the large number of commits, each with little significance to the whole program, hides the large-scale structure of the development.
However, I could use rebase to start combining loosely related commits, trading the time resolution for clarity in the commit history. There seems to be a continuum along this scale. Where is the proper place in that continuum to say this is clean enough? Also, I don't like making changes where I am losing perfectly good information.
I know that I can group certain commits by defining a branch, developing on it, then merging (non-fast-forward) back to the original. The branch should keep the grouping in the commit history. I even suppose that this can be done after the fact using rebase with the proper amount of git-fu. Are branching and non-fast-forward merges the preferred method of grouping related commits in the history?
If so, this seems troubling, as it means that partially fixing something is difficult to do with a clean history. Until the piece of the program you wish to fix is completely working, it shouldn't be merged into master because it would ruin the grouping of the related commits. This means that there can't be any partial thoughts, like fixing bugs as you find them, because presumably you might want to group all bug fixes of a function together, but have a distinct commit for each.
Now I'm more confused than when I started. Seriously, any references or advice on this sort of topic are welcome.
In general, your commits should be the smallest atomic operation that makes sense. When people talk about 'clean history,' they're talking about working in the awesome workflow git provides:
1. Write half-written broken code.
2. Fix that code up.
3. Add some more onto that.
4. Fix a typo!
5. Forgot to update the README.
Now, you could push that to master, but then the main master is littered with commit messages like 'oops' and 'typo.' Instead, you can rebase 5-1 onto the latest master, squash them together, and have one 'nice' commit that only has the cleaned up final changes.
This is one of the most powerful things about git: in a private repo, you can commit all kinds of garbage and half-written stuff without caring. When you want to make your stuff public, rebase and squash, then send it out. Be careful though! Only rebase your own private branches, or you're gonna have a bad time™.
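To make the rebase-and-squash step concrete, here is a sketch you can run in a throwaway directory. Every name in it (file names, the `feature` branch, the commit messages) is made up for illustration. It also swaps the interactive `git rebase -i` squash for a non-interactive equivalent, `git reset --soft` followed by a single commit, purely so the script needs no editor.

```shell
#!/bin/sh
# Demo in a throwaway repository; all names are invented for illustration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

squash_repo=$(mktemp -d)
(
    cd "$squash_repo" || exit 1
    git init -q .
    git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"

    echo "base" > app.txt
    git add app.txt
    git commit -q -m "Initial commit"

    # Private branch with messy work-in-progress commits.
    git checkout -q -b feature
    echo "half-written" > feature.txt
    git add feature.txt
    git commit -q -m "wip"
    echo "fixed up" > feature.txt
    git commit -q -am "oops"
    echo "see feature.txt" > README
    git add README
    git commit -q -m "forgot README"

    # Meanwhile, master moves on.
    git checkout -q master
    echo "more" >> app.txt
    git commit -q -am "Unrelated change on master"

    # 1. Replay the private commits on top of the latest master.
    git checkout -q feature
    git rebase -q master

    # 2. Squash: keep the final tree, drop the intermediate commits.
    git reset -q --soft master
    git commit -q -m "Add feature"

    # 3. master can now take the single clean commit as a fast-forward.
    git checkout -q master
    git merge -q --ff-only feature
)
```

Afterwards `git -C "$squash_repo" log --oneline master` should list three commits, with "wip", "oops" and "forgot README" collapsed into "Add feature".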
There is the other issue I raised, however: is there a good way to group a series of commits that happen to be towards a single distinct goal? Using branches is a clear step in that direction, but it seems like a nightmare to perform a rebase like you described if the commits are mixed and I would like the end result to involve grouping via branches. That is confusing, hopefully this will clear it up:
1. Bugfix in function1.
2. Bugfix in function2.
3. New feature in function2.
4. Bugfix in function1.
5. Bugfix in function2.
...and we want in the end:
     /-- 1 ---- 4 ---\
---<                  >--HEAD
     \- 2 -- 3 -- 5 -/
Can rebase do this easily? Is this a good idea (it seems like it is to me)? The programmer would have to confirm that the code works at every state.

After you've published your work and someone else has checked it out, you don't want to touch your history unless there is a serious problem.
But when you're working on something, you can commit all you want, and do many commits. Then at some point you put your work up for review and get feedback. Then you address the feedback and commit as many times as you need to. When your code is good enough to be merged into master, you should clean up the history a little with rebase.
You should at least try to squash and rebase your commits so that there will not be any commit in the master history that is completely broken. The whole point of having a history is that you're able to go back. E.g. you might want to search for the point in history where a problem originated (git bisect can automate this with a "binary search"). You cannot effectively do that if your history is full of commits that do not work (e.g. won't build or will crash all tests).
To recap: never change published history unless there is a serious issue (like you committed your database password to github). But you can and should change your local history before you publish to master so that there are no broken commits that make it difficult to walk back in history.
git checkout -b featurebranch
git commit -am "foo"
git commit -am "bar"
git rebase master # to update my personal history with public history
git commit -am "baz"
I've used different flavors of merging it back in, though. Method 1 is to `git checkout master; git diff master..featurebranch | git apply`. Method 2 is `git rebase -i HEAD~10; git checkout master; git cherry-pick featurebranch`. I'm sure there are other and better methods, but those are the ones I've used recently that I like.

After I collapse a branch down into a single commit (I rarely want a branch to become multiple commits), I typically use `git commit --amend` to modify the commit message to something fitting and push it upstream. `--reset-author` is also good there to properly denote the correct date/time, rather than that of the first commit you squashed.
We've now decided to use this model, while only deleting feature branches after RC acceptance.
http://nvie.com/posts/a-successful-git-branching-model/
My colleague just suggested rebasing regularly from the develop branch while developing features: "I'm working on a branch. Someone (e.g. you) updates the develop branch. I will have no info on whether that is related to my stuff or not, so I should rebase regularly to the latest version of the develop branch."
I'm kinda clueless now. Git is really powerful and flexible in strategies, and that adds to complexity.
Nope, not simple. Yep, this is a git usability problem.
In the ruby/github world, people generally violate this and DO rewrite 'public' history in order to get 'cleanness', primarily because almost ALL history is 'public', since you tend to show people work in progress on github, or just push it there to have a reliable copy in the cloud. And yes, this sometimes leads to madness.
Or perhaps intentional. I can never tell when I read a Linus fiat.
http://www.mail-archive.com/dri-devel@lists.sourceforge.net/...
* distributed rather than centralised version control brings a new set of concepts to understand
* git is flexible enough to support many different workflows. This means you have to actually choose one, and choice is difficult especially when you're just trying to get to grips with a new tool. svn has much more of a "one standard way to do it" approach
* git's UI is in places confusing, inconsistent and occasionally just randomly and unnecessarily different from most other version control systems
The first two are 'essential complexity'; the third is more 'accidental complexity'. In any case I feel it's having to deal with all three sources of confusion that makes the svn->git transition tricky for many people.
I can totally see git is ridiculously powerful, and general purpose. I just wish it'd default to what most people want a bit more.
Why do I get prompted to enter a commit message when I'm just doing a git pull?
Why do I have to explicitly add every file I want to commit each time? Why can't it just default to "everything under the current dir" like svn does?
Now the most valuable thing to me in source control, history, I'm supposed to keep clean? That's like a sacred cow, you _don't_ mess with history.
>> That's fairly straightforward, no?
No _Linus_ it isn't. Git is hard to get right. If it wasn't for EGit I'd be lost. I tried Canonical's bzr and it is more understandable for ordinary humans.
All that aside I really like Linux. :)
"Don't mess with history"? I don't have to commit to my commits as long as my commits ain't public.
Rewriting history is a lie? Well, if you want to keep everything you do in history, maybe commit on each keystroke? That's insane.
Don't commit unless your ready to commit? Then that be hard to keep track of. Come time to commit you've got 50+ files modified good luck at doing decent commit messages.
I've used a lot of source control systems and the best always have a GUI and so guess what? I want a GUI unless the CLI for such system is inherently intuitive which if you read my comments I do not think git is intuitive at all.
>> I don't have to commit to my commits as long as my commits ain't public.
Huh?!?! I don't get that, it like makes no sense to me whatsoever. Why do you think I should even try to comprehend it?
>> Don't commit unless your ready to commit?
Are you suggesting I said or asked that??? Are you advising me? Seriously what?
>> Then that be hard to keep track of. Come time to commit you've got 50+ files modified good luck at doing decent commit messages.
Huh? I'm sorry is that English because it doesn't even make sense at all to me? Is it 50 lines changed all clearly related? Is it 50 totally different changes?