I got the strange feeling from that article that changing the code files was the hardest thing about the upgrade.
For DB schema updates: have a look at Sqitch by Postgres' David Wheeler. It should also support MySQL (or will in the future).
Solving this in a way that doesn't require restarting any services and doesn't introduce any more race conditions is nontrivial, although it tends to work pretty reliably as long as you have a front controller and it very rarely changes. Basically it comes down to a symlink change detector: after your deploy script runs, you hit a special (internal) URL that kills the stat cache. If people are interested I can post a more concrete example.
Because of the opcode cache issues we still do a rolling restart of the Apache processes, but we don't run into those issues a lot.
We use `rename` instead of `mv`. Don't know why :)
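A minimal sketch of the atomic swap being discussed (the paths here are throwaway stand-ins; `mv -T` is the GNU spelling, which calls rename(2) underneath):

```shell
# Atomic symlink swap: build the new link under a temporary name, then
# rename it over the old one. rename(2) replaces the target in a single
# step, so readers see either the old link or the new one, never neither.
set -eu

deploy_root=$(mktemp -d)            # stand-in for your real deploy root
mkdir -p "$deploy_root/releases/1" "$deploy_root/releases/2"
ln -s "$deploy_root/releases/1" "$deploy_root/current"

# NOT atomic: `ln -sfn` unlinks and re-creates, leaving a brief gap.
# Atomic: symlink under a temp name, then rename over the old link.
ln -s "$deploy_root/releases/2" "$deploy_root/current.tmp"
mv -T "$deploy_root/current.tmp" "$deploy_root/current"

readlink "$deploy_root/current"     # now points at releases/2
```

The key point is that `ln -sfn` alone is not atomic, which is presumably why people reach for `rename`/`mv` here.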
Edit: this also handles the problem of suddenly referencing different files mid-request, because you don't swap code out from under a request that's already running.
That being said, this strikes me as more of a pain under the traditional PHP model, where reloading code from disk per request is normal, than for something like Rails which loads everything into memory once at launch.
We handled deployments a different way though. Each release went into a different directory and was made available under a different URL, and users got redirected to the newest release when they logged in. Once in a session they stayed with the same version until they logged out. This also allowed us to do limited deployments; we could choose which version each customer group was sent to.
http://comments.gmane.org/gmane.os.solaris.opensolaris.zfs/2...
I would do an fsync() before switching the symlink to the new dir.
You basically take the nodes off one at a time, wait for connections to finish, sync over code, then bring it back up. This does make some assumptions about assets -- that they are in a different location, such as a CDN or static server. If you are removing assets, you need to do this at the very end of this node-syncing process, so that any live "old nodes" aren't linking to deleted assets.
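The loop described above can be sketched roughly like this; the `lb_drain`/`lb_enable`/`sync_code`/`wait_idle` functions are hypothetical stand-ins for your load balancer API and rsync step, stubbed out with echo for illustration:

```shell
# Rolling code sync: take each node out of rotation, let connections
# drain, sync the new code over, then bring it back. Stubs below just
# echo what a real implementation would do against your LB and servers.
set -eu

nodes="web1 web2 web3"

lb_drain()  { echo "drain $1";  }                 # stop sending new connections
wait_idle() { echo "wait for $1 to go idle"; }    # let in-flight requests finish
sync_code() { echo "rsync code to $1"; }          # e.g. rsync -a ./build/ "$1":/srv/app/
lb_enable() { echo "enable $1"; }                 # put node back in rotation

for n in $nodes; do
    lb_drain "$n"
    wait_idle "$n"
    sync_code "$n"
    lb_enable "$n"
done
```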
As for newly updated assets, you should be doing versioning for those anyway (even this 'symlink trick' fails when multiple application servers are involved and there is no shared code space).
As someone mentioned before, hard-linking old and new deploy files means duplicated content doesn't cost any disk space. Rotate out old deploys past X days, and use strong cache controls to expire the content quickly.
It seemed to work well, but there are issues with <base> tags you should read up on first.
The first is that we find it's a good idea to have your release directories on the server named after tags from your VCS. Each time we want to do a deploy we just make a tag, and the deployment script takes the name of the tag to deploy as its argument. It's very easy to see what version is deployed on a server by just looking at the target of the symlink.
The second is that you should use rsync with the --link-dest option. --link-dest allows you to specify a previous directory that rsync can use to create hard links from for files that haven't changed. For example, if you have a new version to deploy in a directory called "0.9.10/2" and on the remote server you have "0.9.10/1" currently deployed, you can "rsync 0.9.10/2 server:0.9.10/2 --link-dest 0.9.10/1". What this does is create a new dir tree in /2 with all the files that didn't change from /1 hard linked, but with new copies for the files that did. This saves a lot of disk space and it means you can keep versions around on the server for as long as you feel the need to.
As our deployment is ~8GB this is quite important for us. This means that we actually have releases sitting on the server for quite a while back.
The third thing is setting something up so you can have simple versioning of your deployment scripts.
We have a script that drives this whole process called "./realmctl". Deployment is split into a four-step process, with scripts like this in each release dir:
./0.9.10/1/prepare (create/upload new release)
./0.9.10/1/stop (stop existing servers)
./0.9.10/1/deploy (change symlinks over to this release)
./0.9.10/1/start (start servers)
Each of the releases contains its own version of the script. That means if you issue a command like "./realmctl restart --release=0.9.10/2", the script can find the stop script for the current version, then run the deploy and start scripts for the new version. In this way, if your deployment process changes between versions, you can still freely move around between versions without needing to worry about the version of your deployment scripts.
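A toy sketch of that dispatch (the directory names and script bodies are made-up stand-ins, not the actual realmctl): the old release's own stop script runs first, then the new release's deploy and start scripts:

```shell
# Each release dir carries its own stop/deploy/start scripts. To move to
# a new release: run the *current* release's stop, switch the symlink,
# then run the *new* release's deploy/start.
set -eu
root=$(mktemp -d)
for rel in 0.9.10/1 0.9.10/2; do
    mkdir -p "$root/$rel"
    for step in stop deploy start; do
        printf '#!/bin/sh\necho "%s %s"\n' "$step" "$rel" > "$root/$rel/$step"
        chmod +x "$root/$rel/$step"
    done
done
ln -s "$root/0.9.10/1" "$root/current"

new_release="0.9.10/2"
"$root/current/stop"                           # old version stops itself
"$root/$new_release/deploy"                    # new version's deploy logic
ln -sfn "$root/$new_release" "$root/current"   # (see the atomic-swap caveat elsewhere in the thread)
"$root/$new_release/start"
```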
The last thing is that if you're writing something similar, it's really nice for your scripts to have some idea about different parts of your infrastructure so that they can be controlled independently. It's really useful to be able to say something like "./realmctl restart all poe_webserver" (restart webserver processes on all servers) or "./realmctl stop ggg4 poe_instance" (stop the game instance servers on ggg4). Those kinds of commands are really useful during an emergency.
Do you do staged production deploys of new code for small groups of users? I found it was beneficial to be able to test a change on a random subset of users so if there's a production-only bug it doesn't hit everyone at once.
This also allows you to not have to "stop" the app servers because you're starting up the new version's instance in parallel with the old. The frontend just passes user-specific requests to the new instance and the old instance keeps chugging along with no downtime. Of course this usually requires no schema changes (unless you have lots of spare infrastructure handy).
It's worth bearing in mind that we are actually deploying an application that they play on their desktop machines, it's just that our website is tightly integrated with the live realm so they are deployed together in the same deployment system.
What we do have as a game though is the ability to have a separate alpha realm that we can deploy to for testing a release and we have a trusted set of our player base that is allowed access to it.
So here is the list of realms we have:
Testing (Local continuously integrated deploy of trunk. Updated every commit)
Staging1 (Local staging for the next major patch)
Staging2 (Local staged copy of whatever is on production. This is used for when we want to test bugfixes to production)
Alpha (Deploy of the next major patch for some community members to play and test in advance. This is deployed alongside the production realm on the live servers.)
Production
All of that said though, we are adding the ability very soon for the backend to be able to spawn game instance servers for multiple versions of the realm. This would mean that we can deploy a game patch without a restart (assuming the backend didn't change). Old clients would get old game instance servers but as players restart their game client and patch, they will get on new game instance servers.
If that is your goal, why do the mv at all?
...until your program needs to work with more than one file and the two are related in some way or opens 'the same file' twice.
For example, your compiler could fetch file X from the 'old' directory and file Y from the 'new' one, or your web server could log that it fetched file Z (which does not exist in the 'new' directory) to a log file in the 'new' directory.
It may be possible to mitigate this by requiring everybody to access all files through a directory you opened atomically for them, but even if it is: good luck enforcing that rule, especially when using third-party libraries.
1. Server opens (old) index.php and begins executing
2. Symlink swap
3. Interpreter gets to the line that includes stuff.php, and then opens the *new* file

So the assumption "On Unix, mv is an atomic operation" is not true in general. Only if your underlying FS is fully POSIX-compliant will mv be an atomic operation.
I think it's important to stress this because there are some distributed filesystems that may even try to be POSIX-compliant but do not guarantee atomic renames, and therefore this trick would not work well on them.
My preferred layout is
./releases/<datetime or rev or whatever makes sense for you stamped>
./current symlink to ./releases/<foo>
Keeping releases in a directory by themselves makes it easy to list them, archive old ones, etc.
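As commands (in a throwaway temp dir, with made-up datestamps), that layout and a rollback look like:

```shell
# Datestamped release dirs plus a `current` symlink. Rollback is just
# repointing `current` at an older release.
set -eu
app=$(mktemp -d)
mkdir -p "$app/releases/20120101" "$app/releases/20120102"
ln -s "$app/releases/20120102" "$app/current"

ls "$app/releases"        # easy to list releases, archive old ones, etc.

# Rollback: swap the symlink via a temp name + rename (atomic on POSIX).
ln -s "$app/releases/20120101" "$app/current.new"
mv -T "$app/current.new" "$app/current"
```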
I still think this is a valuable deployment strategy, just because you can rollback and switch deployed versions easily, which is always useful. And it's certainly better than rsyncing to a live directory, at any rate.