How We Deploy Python Code (opens in new tab)

(nylas.com)

250 pointsspang10y ago125 comments

125 comments

Back when I was doing Python deployments (~2009-2013) I was:

* Downloading any new dependencies to a cached folder on the server (this was before wheels had really taken off) * Running pip install -r requirements.txt from that cached folder into a new virtual environment for that deployment (`/opt/company/app-name/YYYY-MM-DD-HH-MM-SS`) * Switching a symlink (`/some/path/app-name`) to point at the latest virtual env. * Running a graceful restart of Apache.

Fast, zero downtime deployments, multiple times a day, and if anything failed, the build simply didn't go out and I'd try again after fixing the issue. Rollbacks were also very easy (just switch the symlink back and restart Apache again).

These days the things I'd definitely change would be:

* Use a local PyPi rather than a per-server cache * Use wheels wherever possible to avoid re-compilation on the servers.

Things I would consider:

* Packaging (deb / fat-package / docker) to avoid having any extra work done over per-machine + easy promotions from one environment to the next.

s_kilk10y ago

I built a system that did something very like this at a previous employer. We got really quick (mostly) atomic deployments which could be rolled-back instantly with one command.

Even at the time I thought Docker would be a great solution to the problem, but the organization was vehemently against using modern tech to manage servers and deployments, so I ended up writing that tool in bash instead. Good times.

thraxil10y ago

This is basically how we still do it (with nginx + gunicorn rather than apache). Wheels (and recent setuptools/pip versions that build and cache them automatically) just made the build step significantly faster (lxml, argh).

We're moving to the Docker approach, which is really nice, but it does change the shape of the whole deploy pipeline, so it's going to take some time.

crdoconnor10y ago

This is exactly what I do.

>Use a local PyPi rather than a per-server cache

I stil prefer a per-server cache. A local pypi is another piece of infrastructure you need to keep alive. You don't have to worry about the uptime of an rsync playbook.

morgante10y ago

Their reason for dismissing Docker are rather shallow, considering that it's pretty much the perfect solution to this problem.

Their first reason (not wanting to upgrade a kernel) is terrible considering that they'll eventually be upgrading it anyways.

Their second is slightly better, but it's really not that hard. There are plenty of hosted services for storing Docker images, not to mention that "there's a Dockerfile for that."

Their final reason (not wanting to learn and convert to a new infrastructure paradigm) is the most legitimate, but ultimately misguided. Moving to Docker doesn't have to be an all-or-nothing affair. You don't have to do random shuffling of containers and automated shipping of new images—there are certainly benefits of going wholesale Docker, but it's by no means required. At the simplest level, you can just treat the Docker contain as an app and run it as you normally would, with all your normal systems. (ie. replace "python example.py" with "docker run example")

viraptor10y ago

> (not wanting to upgrade a kernel) is terrible considering that they'll eventually be upgrading it anyways.

If they're running ubuntu 12.04 LTS they can keep the 3.2 kernel until late 2017. That's 2 more years. And they wrote "did not", so it was likely the situation months ago, not yesterday.

> (not wanting to learn and convert to a new infrastructure paradigm) is the most legitimate, but ultimately misguided

It depends on the amount of stuff they deploy. If they handle everything using Ansible (and from the list it looks like they do), then it's months of work to migrate to something else. They may need the right users / logging / secret management in the app itself, not outside of it.

morgante10y ago

> If they handle everything using Ansible (and from the list it looks like they do), then it's months of work to migrate to something else.

It's not. It would be months of work if they wanted to convert all their Ansible code to Docker, but that's by no means required.

Docker and Ansible can easily coexist peacefully.

1 more reply

mixmastamyk10y ago

    sudo apt-get install linux-generic-lts-quantal

Easy to upgrade.

1 more reply

Cieplak10y ago

Highly recommend FPM for creating packages (deb, rpm, osx .pkg, tar) from gems, python modules, and pears.

https://github.com/jordansissel/fpm

daurnimator10y ago

Rather than such a blunt tool like fpm, if you're deploying only python, look to something like py2deb: https://github.com/paylogic/py2deb

mixmastamyk10y ago

How does this compare with the "dh-virtualenv" tool recommended by the article?

Edit: found https://py2deb.readthedocs.org/en/latest/comparisons.html

1 more reply

zobzu10y ago

fpm is cool - very cool. that said it wont create proper spec files and friends, thus its harder to maintain the packages properly. also they usually ont get accepted upstream due to that.

that said, for python files and simple packages it works well enough!

jlees10y ago

That seems like a neat tool. I wonder if you could combine it with the sandboxing that dh-virtualenv provides to get the best of both worlds?

rhelmer10y ago

I've used fpm to make rpm and deb packages that simply include a virtualenv, it works ok.

One of the significant tradeoffs to this approach is you lose the carefully-crafted tree-of-dependencies that the distros favor, so it makes the package pretty much automatically unacceptable to package maintainers.

However, being able to have install instructions that amount to "yum/apt-get install <package>" is pretty great.

I am hoping for an app/container convergence at some point, but we might need to drop the fine-grained dependency dream and have them be more self-contained, like Mac OS X apps.

1 more reply

amatix10y ago

we use fpm & something like dh-virtualenv along exactly those lines. Helps us manage a complex mix of native/system-level dependencies (non-python) as well as python packages.

We also incorporate a set of meta packages which means we can have multiple codebase versions installed and switch the "active" one by installing the right version of the meta-package. There's also meta-packages for each service running off the same codebase, which deals with starting/stopping/etc.

doki_pen10y ago

We do something similar at embedly, except instead of dh-virtualenv we have our own homegrown solution. I wish I new about dh-virtualenv before we created it.

Basically, what it comes down to a build script that builds a deb with the virtualenv of your project versioned properly(build number, git tag), along with any other files that need to be installed (think init scripts and some about file describing the build). It also should do things like create users for daemons. We also use it to enforce consistent package structure.

We use devpi to host our python libraries (as opposed to applications), reprepro to host our deb packages, standard python tools to build the virtualenv and fpm to package it all up into a deb.

All in all, the bash build script is 177 LoC and is driven by a standard build script we include in every applications repository defining variables, and optionally overriding build steps (if you've used portage...).

The most important thing is that you have a standard way to create python libraries and application to reduce friction on starting new projects and getting them into production quickly.

remh10y ago

We fixed that issue at Datadog by using Chef Omnibus:

https://www.datadoghq.com/blog/new-datadog-agent-omnibus-tic...

It's more complicated than the proposed solution by nylas but ultimately it gives you full control of the whole environment and ensure that you won't hit ANY dependency issue when shipping your code to weird systems.

sytse10y ago

At GitLab we use Chef Omnibus too and we love it. More than 100k organizations use GitLab with Omnibus and it has lowered our support effort enormously. https://gitlab.com/gitlab-org/omnibus-gitlab/blob/master/REA...

jlees10y ago

Looks neat. I think there's definitely a difference in the requirements for a deploy method when you can control the underlying systems (internal servers) vs have to make it work with just about any weird config users throw at you (your agent).

kbar1310y ago

http://pythonwheels.com/ solves the problem of building c extensions on installation.

emidln10y ago

Pair this with virtualenvs in separate directories (so that "rollback" is just a ssh mv and a reload for whatever supervisor process) and you get to skip the mess of building system packages.

Also, are there seriously places that don't run their own PyPI mirrors? Places that have people who understand how to integrate platform-specific packages but can't be bothered to deploy one of the several PyPI-in-a-box systems or pay for a hosted PyPI?

toomuchtodo10y ago

> Also, are there seriously places that don't run their own PyPI mirrors? Places that have people who understand how to integrate platform-specific packages but can't be bothered to deploy one of the several PyPI-in-a-box systems or pay for a hosted PyPI?

Yes. I've seen them, and they've been huge shops.

viraptor10y ago

> you get to skip the mess of building system packages.

Only in cases where you don't have wheels depending on external libraries. If you do, you should still package with the right dependency constraints. Otherwise you can install a wheel which does not work (because of missing .so)

1 more reply

vamega10y ago

Can you point me to your recommended PyPI-in-a-box system?

3 more replies

tschellenbach10y ago

Yes, someone should build the one way to ship your app. No reason for everybody to be inventing this stuff over and over again.

Deploys are harder if you have a large codebase to ship. rSync works really well in those cases. It requires a bit of extra infrastructure, but is super fast.

erikb10y ago

Deployment is hard, system administration is hard. As a software developer you think there should be one way but there can't be one way. There never will be. If you want to make good software really spend some time learning the intricacies of packaging, deployment and sysadmin life. You make a lot of people happy by just knowing the problems, even if you can't apply the correct solution to each problem.

Come from the same island as you, trust me. But the more you learn about this the more you see how complex it is. You can't even say that one solution is better than the other (like apt vs yum). Each and every one of them has their pros and cons. And more often than not architectural decisions make it impossible to get both solutions into the same system working together.

rSync is not deploying. It's syncing files. But even if you have a 1:1 copy from your development computer on a server it still might not work because on that server package xyz is still in version 1.4.3b and not 1.4.3c. Deployment is getting it there AND getting it to work together nicely and maintainable with the other things that run on that computer/vm.

mattbillenstein10y ago

+1 rsync is pretty darn good at any scale -- I'm not sure why the simplest solution possible doesn't beat out docker as a suggestion in this thread.

I've been bundling libs and software into a single virtual environment like package that I distribute with rsync for a long time - it solves loads of problems, is easy to bootstrap a new system with, and incremental updates are super fast. Combine that with rsync distribution of your source and a good tool for automating all of it (ansible, salt, chef, puppet, et al) and you have a pretty fool-proof deployment system.

And a rollback is just a git revert and another push away -- no need to keep build artifacts lying around if you believe your build is deterministic.

viraptor10y ago

Rsync is good for simple things. But it will fail with more complicated apps:

- how do you know which version you're running right now?

- how do you deploy to two environments where different deps are needed?

- how do you tell when your included dependencies need security patches?

1 more reply

stephenr10y ago

They have. Every mainstream OS/Distro has a packaging system and an installer for those packages.

For server-side apps like this, that usually means a Deb or an RPM. These systems handle upgrades, rollbacks, dependencies, etc.

Just because some people decide that writing an RPM specfile or running dh_make is too hard to work out, doesn't mean that the solution doesn't exist.

sandGorgon10y ago

The fact that we had a weird combination of python and libraries took us towards Docker. And we have never looked back.

For someone trying out building python deployment packages using deb, rpm, etc. I really recommend Docker.

craigmccaskill10y ago

They specifically called that out in the article with an entire section called "just use docker".

knicholes10y ago

On the same vein as omni, the other reasons are "we don't want to learn a new technology, even though it was made exactly for this purpose" and "we don't want to learn how to install docker-registry."

x0x010y ago

if you have to choose between a kernel and docker, just choose docker. Python can't get their shit together deployment-wise, and docker is the one true route (tm) to python deployment happiness.

forget virtualenv; forget package dependencies on conflicting versions of libxml; forget coworkers that have 3 different conflicting versions of requests scattered through various services, and goddamnit I just want to run a dev build; forget coworkers that scribble droppings all over the filesystem, and assume certain services will never coexist on the same box

just use docker. It's going to go like this:

step 1: docker

step 2: happy

1 more reply

jlees10y ago

To be fair, what didn't work for Nylas might well not be an issue for others. There's definitely more than one way to skin a cat, especially in the Python world.

1 more reply

dwb10y ago

We love Docker where I work, but running it in production is a big challenge — there's no "just" using it. We build packages of Rails apps and dependencies quite happily. Sure, you have to make sure all dependencies are packaged too, but that's still easier than a full-on Docker roll-out.

Indeed, we actually use Docker to build packages. Blog post coming soon, maybe.

jacques_chester10y ago

In a few months Cloud Foundry will natively support launching, placing, managing, wiring, routing, logging and servicing Dockerised application images.

In the meantime you can get a taste with Lattice[0].

[0] http://lattice.cf/

jedberg10y ago

There was a whole paragraph in the article about why Docker didn't work for them.

x0x010y ago

they had 3 reasons:

one of which was just silly (kernel version -- are you living on that point release forever?)

one of which was valid (necessity to maintain method for distributing docker images), but probably dumb: you only get so many innovation points per company, and innovating on a problem docker just solves means you are supporting your in-house solution ad infinitum

and one of which definitely sounds painful (docker vs extant ansible playbooks)

2 more replies

sophacles10y ago

We use a devpi server, and just push the new package version, including wheels built for our server environment, for distribution.

On the app end we just build a new virtualenv, and launch. If something fails, we switch back to the old virtualenv. This is managed by a simple fabric script.

nZac10y ago

We just commit our dependencies into our project repository in wheel format and install into a virtual env on prod from that directory eliminating PyPi. Though I don't know many other that do this. Do you?

Bitbucket and GitHub are reliable enough for how often we deploy that we aren't all that worried about downtime from those services. We could also pull from a dev's machine should the situation be that dire.

We have looked into Docker but that tool has a lot more growing before "I" would feel comfortable putting it into production. I would rather ship a packaged VM than Docker at this point, there are to many gotchas that we don't have time to figure out.

erikb10y ago

You put the wheels into a git repo? That's the most sad thing I've heard today. You know that if you add a file in commit A and remove it in Commit B each and every clone still pulls in that file? It's okay for text files but it's very much not okay for binaries and packages.

kevinschumacher10y ago

    git clone --depth=1 path/to/repo

when doing a clone for a deploy, since you don't need the history

edit: but yes, cloning as a developer will take a long time. But, if it really gets out of hand, I can hand new devs a HDD with the repo on it, and they can just pull recent changes. Not ideal, but pretty workable

so0k10y ago

we download to a folder on the docker build server and build docker containers from this cache.

see here: http://stackoverflow.com/a/29936384/138469

viraptor10y ago

> curl “https://artifacts.nylas.net/sync-engine-3k48dls.deb” -o $temp ; dpkg -i $temp

It's really not hard to deploy a package repository. Either a "proper" one with a tool like `reprepro`, or a stripped one which is basically just .deb files in one directory. There's really no need for curl+dpkg. And a proper repository gives you dependency handling for free.

sciurus10y ago

Yes, that part really surprised me. Barebones repos are useful enough, and there's also some pretty fancy tools out there like http://www.aptly.info/

mixmastamyk10y ago

Could you elaborate on the simple folder?

For example I find the --instdir option to dpkg but it still would have to be downloaded from the other host, unless of course the folder was mounted somehow.

viraptor10y ago

Search for debian's "trivial archive". It replaces release/component elements with explicit path. It's deprecated now, but I believe still works.

perlgeek10y ago

Note that the base path /usr/share/python (that dh-virtualenv ships with) is a bad choice; see https://github.com/spotify/dh-virtualenv/issues/82 for a discussion.

You can set a different base path in debian/rules with export DH_VIRTUALENV_INSTALL_ROOT=/your/path/here

serkanh10y ago

"Distributing Docker images within a private network also requires a separate service which we would need to configure, test, and maintain." What does this mean? Setting up a private docker registry is trivial at best and having it deploy on remote servers via chef, puppet; hell even fabric should do the job.

juliangregorian10y ago

It's not necessarily true either; it's not difficult to have your continuous build process build images from the Dockerfile, run tests, swap green and blue, etc...

erikb10y ago

No No No No! Or maybe?

Do people really do that? Git pull their own projects into the production servers? I spent a lot of time to put all my code in versioned wheels when I deploy, even if I'm the only coder and the only user. Application and development are and should be two different worlds.

objectified10y ago

I recently created vdist (https://vdist.readthedocs.org/en/latest/ - https://github.com/objectified/vdist) for doing similar things - the exception being is that it uses Docker to actually build the OS package on. vdist uses FPM under the hood, and (currently) lets you build both deb and rpm packages. It also packs up a complete virtualenv, and installs the build time OS dependencies on the Docker machine where it builds on when needed. The runtime dependencies are made into dependencies of the resulting package.

rfeather10y ago

I've had decent results using a combination of bamboo, maven, conda, and pip. Granted, most of our ecosystem is Java. Tagging a python package along as a maven artifact probably isn't the most natural thing to do otherwise.

StavrosK10y ago

Unfortunately, this method seems like it would only work for libraries, or things that can easily be packaged as libraries. It wouldn't work that well for a web application, for example, especially since the typical Django application usually involves multiple services, different settings per machine, etc.

vacri10y ago

> different settings per machine

/etc/default/mycoolapp.conf

Debian packages have the concept of 'config' files. Files will be automatically overwritten when installing a new version of package FOO, unless they're marked as config files in the .deb manifest. This allows you to have a set of sane defaults, but not to lose customisations when upgrading.

acdha10y ago

Just wanted to +1 this. There are literally decades of convention built around patterns where you ship a standard config file which merges in system/user/instance-specific settings from a known location, command-line argument, environment, etc. The Debian world in particular has lead the community for decades with the use of debconf to store values such as a hostname or server role which can automatically be re-applied when otherwise unmodified files are modified upstream.

When I used this approach with a Django site years ago using RPM[1] we used the pattern vacri mentioned or the reverse one where you have an Apache virtualhost file which contains system-specific settings (hostname, SSL certs, log file name, etc.) and simply included the generic settings shipped in the RPM.

In either case the system-specific information can be set by hand (this was a .gov server…), managed with your favorite deployment / config tool, etc. and allows you to use the same signed, bit-for-bit identical package on testing, staging, and production with complete assurance that the only differences were intentional. This was really nice when you wanted to hand things off to a different group rather than having the dev team include the sysadmins.

1. http://chris.improbable.org/2009/10/16/deploying-django-site...

doki_pen10y ago

I use a similar method to ship app.embed.ly (emberapp on top of django). You can package whatever you want in a deb. The virtualenv is just a part of it. Config files are managed with configuration management system (chef in our case). Our django settings.py file just tries to import from /etc/blahblah

tspike10y ago

Nah, the approach is sound. We did this for a Flask app that used celery, redis, etc and we were happy with it. For the software, use a .deb, for the configuration, use a configuration management tool like Ansible.

avilay10y ago

Here is the process I use for smallish services -

1. Create a python package using setup.py 2. Upload the resulting .tar.gz file to a central location 3. Download to prod nodes and run pip3 install <packagename>.tar.gz

Rolling back is pretty simple - pip3 uninstall the current version and re-install the old version.

Any gotchas with this process?

mixmastamyk10y ago

If you are using it for small services it's probably fine. But the original article did say that uninstall sometimes doesn't work correctly. Apt is more formal than pip.

So at some point, as you know you'll need to move on.

webo10y ago

You have to do this every time there's a change in the codebase which is not easy. How do you stick this into a CI without the git & pip issue talked about in the post?

avilay10y ago

I have to do this everytime I have to deploy, which is similar to having to create a deb package everytime Nylas has to deploy.

There are no git dependencies in the process I describe above.

The pip drawback that is discussed in the post is of PyPi going down. In the process described above there is no PyPi dependency. Storing the .tar.gz package in a central location is similar to Nylas storing their deb package on S3.

1 more reply

velocitypsycho10y ago

For installing using .deb files, how are db migrations handled. Our deployment system handles running django migrations by deploying to a new folder/virtualenv, running the migrations, then switching over symlinks.

I vaguely remember .deb files having install scripts, is that what one would use?

viraptor10y ago

Depends on your environment, number of hosts, etc. really. You probably don't want to stick it into the same install script because:

- your app user doesn't need rights to modify the schema

- you need to handle concurrency of schema upgrades (what if two hosts upgrade at the same time?)

- if your migration fails it may leave you in the weird installation state and not restart the service

Ideal solution: deploy code which can cope with both pre-migration and post-migration schema -> upgrade schema -> deploy code with new features.

jlees10y ago

For e.g. changing the format of a column that's easy enough but it's tricky to create that intermediate state at the migration level for every migration. One option is to deploy the migration code without restarting the running services (or to a different box), rollback the code if the migration failed, restart the services to pick up the new code if it succeeded. This still means not writing migrations that actively break the running version though - if you're using database reflection, everything will go boom when the schema changes.

stephenr10y ago

Depends on your infra. If it's a single server with the app + Db (or a single app server + single DB server) you could have a postinst script that calls your app/framework's migration system.

If your migration system is smart enough (or you can easily check the migration status from a shell script) you could also do this in a multi-app-server environment too.

lifeisstillgood10y ago

Weirdly I am re-starting an old project doing this venv/ dpkg (http://pyholodeck.mikadosoftware.com). The fact that it's still a painful problem means Inam not wasting my time :-)

webo10y ago

> Building with dh-virtualenv simply creates a debian package that includes a virtualenv, along with any dependencies listed in the requirements.txt file.

So how is this solving the first issue? If PyPI or the Git server is down, this is exactly like the git & pip option.

stephenr10y ago

You need those things up to build the package not to install it

webo10y ago

Ah I misunderstood the article. I just package my application source in a tar during deployment. I thought that's what most people do.

1 more reply

compostor4210y ago

Great article. I had never heard of dh-virtualenv but will be looking into it.

How has your experience with Ansible been so far? I have dabbled with it but haven't taken the plunge yet. Curious how it has been working out for you all.

emfree10y ago

Ansible works well for us, although we use it in a somewhat different way than most folks. We previously wrote about our approach here, if you're curious: https://nylas.com/blog/graduating-past-playbooks

compostor4210y ago

Thanks for the article. Well written and a very interesting concept.

BuckRogers10y ago

Seems this method wouldn't work as well if you have external clients you deploy for. I'd use Docker instead of doing this, just to be in a better position for an internal or external client deployment.

ytjohn10y ago

If you took this a step further and set up a debian repo, then you could have your clients use that debian repo.

I'm looking to do something pretty similiar, but RPMs. I found rpmvenv that seems to work in the same fashion. https://pypi.python.org/pypi/rpmvenv/0.3.1

stephenr10y ago

Exactly this.

If a company wants to use Docker that's their choice, but I don't think its at all reasonable to insist on or only support that environment as a software vendor. If it works on Debian, give me a .deb or even better an Apt Repo to use.

stephenr10y ago

Why? Saying it runs on Debian X is much more easy to facilitate by your end users than "requires docker"

ah-10y ago

conda works pretty well.

TDL10y ago

Agreed (although biased since I used to work at Continuum.) I am wondering what others think of conda?

4lejandrito10y ago

We have used Conda for our first python deployment and the process has been seamless. It provides the same sandboxing concept using virtualenvs and also uses prebuilt binaries for native dependencies so you don't have to build them every time. The only drawback I would say is that we have to install miniconda in our production servers rather than just deploying an standalone package.

tuckermi10y ago

My team has been rolling our own conda packages for (frequent) internal software releases to local servers and have been pretty happy overall pulling down code from a locally managed conda package repo.

With that said, Conda is not a perfect solution. One thing that can be frustrating is that a package can include compiled code (shared objects/dylibs) that may be incompatible with your system. Unfortunately, while you can indicate dependencies on other conda packages, python versions, etc there isn't currently a convenient way to indicate things like GLIBC dependencies.

hcrisp10y ago

I use conda as well and found it to be great. I love how it detects and manages dependencies for you when you install a new module.

pekk10y ago

What was wrong with Python packaging that conda needed to replace it entirely (including virtualenv)? If you say the installation story for the scipy stack, that is less a matter of Python packaging and more a matter of the scipy stack.

nashequilibrium10y ago

Love it, i don't hear much about it on HN but i personally really like it.

rlvesco710y ago

I like it too, but I had issues when I had to compile packages and other stuff. In particular, its version of qmake can interfere in unpredictable ways.

jacques_chester10y ago

Here's how I deploy python code:

    cf push some-python-app

So far it's worked pretty well.

Works for Ruby, Java, Node, PHP and Go as well.

nobullet10y ago

That's interesting. I've tried to google cf CLI however wasn't able to find good documentation. Is it possible to install cf CLI on my server? Or is it Cloud Foundry tool only?

jacques_chester10y ago

The cf CLI tool interacts with a Cloud Foundry installation.

You'd use it for one in your own data centre, or Pivotal Web Services[0], or BlueMix. You point it at an API and login, then off you go.

If you need something more cut-down to play with, Lattice[1] is nifty, but currently doesn't do buildpack magic.

[0] https://run.pivotal.io/ [1] http://lattice.cf/

daryltucker10y ago

I see your issue of complexity. Glad I haven't ever reached the point where some good git hooks no longer work.

theseatoms10y ago

Does anyone have experience with PEX?

stefantalpalaru10y ago

> The state of the art seems to be ”run git pull and pray”

No, the state of the art where I'm handling deployment is "run 'git push' to a test repo where a post-update hook runs a series of tests and if those tests pass it pushes to the production repo where a similar hook does any required additional operation".

themartorana10y ago

Git deployments work great if you're packing an image (AMI, Docker) using, say, Packer. But we only deploy "immutable" images, not code on to existing servers.

hobarrera10y ago

> The state of the art seems to be ”run git pull and pray”

Looks like these guys never heard of things like CI.

jumpkick10y ago

You had to read further.

This is the core of how we deploy code at Nylas. Our continuous integration server (Jenkins) runs dh-virtualenv to build the package, and uses Python’s wheel cache to avoid re-building dependencies.

andrewchambers10y ago

Or release branches/tags.

j / k navigate · click thread line to collapse

125 comments

svieira10y ago

Back when I was doing Python deployments (~2009-2013) I was:

These days the things I'd definitely change would be:

* Use a local PyPi rather than a per-server cache * Use wheels wherever possible to avoid re-compilation on the servers.

Things I would consider:

* Packaging (deb / fat-package / docker) to avoid having any extra work done over per-machine + easy promotions from one environment to the next.

s_kilk10y ago

I built a system that did something very like this at a previous employer. We got really quick (mostly) atomic deployments which could be rolled-back instantly with one command.

thraxil10y ago

We're moving to the Docker approach, which is really nice, but it does change the shape of the whole deploy pipeline, so it's going to take some time.

crdoconnor10y ago

This is exactly what I do.

>Use a local PyPi rather than a per-server cache

I stil prefer a per-server cache. A local pypi is another piece of infrastructure you need to keep alive. You don't have to worry about the uptime of an rsync playbook.

morgante10y ago

Their reason for dismissing Docker are rather shallow, considering that it's pretty much the perfect solution to this problem.

Their first reason (not wanting to upgrade a kernel) is terrible considering that they'll eventually be upgrading it anyways.

Their second is slightly better, but it's really not that hard. There are plenty of hosted services for storing Docker images, not to mention that "there's a Dockerfile for that."

viraptor10y ago

> (not wanting to upgrade a kernel) is terrible considering that they'll eventually be upgrading it anyways.

If they're running ubuntu 12.04 LTS they can keep the 3.2 kernel until late 2017. That's 2 more years. And they wrote "did not", so it was likely the situation months ago, not yesterday.

> (not wanting to learn and convert to a new infrastructure paradigm) is the most legitimate, but ultimately misguided

morgante10y ago

> If they handle everything using Ansible (and from the list it looks like they do), then it's months of work to migrate to something else.

It's not. It would be months of work if they wanted to convert all their Ansible code to Docker, but that's by no means required.

Docker and Ansible can easily coexist peacefully.

1 more reply

mixmastamyk10y ago

    sudo apt-get install linux-generic-lts-quantal

Easy to upgrade.

1 more reply

Cieplak10y ago

Highly recommend FPM for creating packages (deb, rpm, osx .pkg, tar) from gems, python modules, and pears.

https://github.com/jordansissel/fpm

daurnimator10y ago

Rather than such a blunt tool like fpm, if you're deploying only python, look to something like py2deb: https://github.com/paylogic/py2deb

mixmastamyk10y ago

How does this compare with the "dh-virtualenv" tool recommended by the article?

Edit: found https://py2deb.readthedocs.org/en/latest/comparisons.html

1 more reply

zobzu10y ago

fpm is cool - very cool. that said it wont create proper spec files and friends, thus its harder to maintain the packages properly. also they usually ont get accepted upstream due to that.

that said, for python files and simple packages it works well enough!

jlees10y ago

That seems like a neat tool. I wonder if you could combine it with the sandboxing that dh-virtualenv provides to get the best of both worlds?

rhelmer10y ago

I've used fpm to make rpm and deb packages that simply include a virtualenv, it works ok.

However, being able to have install instructions that amount to "yum/apt-get install <package>" is pretty great.

I am hoping for an app/container convergence at some point, but we might need to drop the fine-grained dependency dream and have them be more self-contained, like Mac OS X apps.

1 more reply

amatix10y ago

we use fpm & something like dh-virtualenv along exactly those lines. Helps us manage a complex mix of native/system-level dependencies (non-python) as well as python packages.

doki_pen10y ago

We do something similar at embedly, except instead of dh-virtualenv we have our own homegrown solution. I wish I new about dh-virtualenv before we created it.

We use devpi to host our python libraries (as opposed to applications), reprepro to host our deb packages, standard python tools to build the virtualenv and fpm to package it all up into a deb.

The most important thing is that you have a standard way to create python libraries and application to reduce friction on starting new projects and getting them into production quickly.

remh10y ago

We fixed that issue at Datadog by using Chef Omnibus:

https://www.datadoghq.com/blog/new-datadog-agent-omnibus-tic...

sytse10y ago

jlees10y ago

kbar1310y ago

http://pythonwheels.com/ solves the problem of building c extensions on installation.

emidln10y ago

Pair this with virtualenvs in separate directories (so that "rollback" is just a ssh mv and a reload for whatever supervisor process) and you get to skip the mess of building system packages.

toomuchtodo10y ago

Yes. I've seen them, and they've been huge shops.

viraptor10y ago

> you get to skip the mess of building system packages.

1 more reply

vamega10y ago

Can you point me to your recommended PyPI-in-a-box system?

3 more replies

tschellenbach10y ago

Yes, someone should build the one way to ship your app. No reason for everybody to be inventing this stuff over and over again.

Deploys are harder if you have a large codebase to ship. rSync works really well in those cases. It requires a bit of extra infrastructure, but is super fast.

erikb10y ago

mattbillenstein10y ago

+1 rsync is pretty darn good at any scale -- I'm not sure why the simplest solution possible doesn't beat out docker as a suggestion in this thread.

And a rollback is just a git revert and another push away -- no need to keep build artifacts lying around if you believe your build is deterministic.

viraptor10y ago

Rsync is good for simple things. But it will fail with more complicated apps:

- how do you know which version you're running right now?

- how do you deploy to two environments where different deps are needed?

- how do you tell when your included dependencies need security patches?

1 more reply

stephenr10y ago

They have. Every mainstream OS/Distro has a packaging system and an installer for those packages.

For server-side apps like this, that usually means a Deb or an RPM. These systems handle upgrades, rollbacks, dependencies, etc.

Just because some people decide that writing an RPM specfile or running dh_make is too hard to work out, doesn't mean that the solution doesn't exist.

sandGorgon10y ago

The fact that we had a weird combination of python and libraries took us towards Docker. And we have never looked back.

For someone trying out building python deployment packages using deb, rpm, etc. I really recommend Docker.

craigmccaskill10y ago

They specifically called that out in the article with an entire section called "just use docker".

knicholes10y ago

x0x010y ago

if you have to choose between a kernel and docker, just choose docker. Python can't get their shit together deployment-wise, and docker is the one true route (tm) to python deployment happiness.

just use docker. It's going to go like this:

step 1: docker

step 2: happy

1 more reply

jlees10y ago

To be fair, what didn't work for Nylas might well not be an issue for others. There's definitely more than one way to skin a cat, especially in the Python world.

1 more reply

dwb10y ago

Indeed, we actually use Docker to build packages. Blog post coming soon, maybe.

jacques_chester10y ago

In a few months Cloud Foundry will natively support launching, placing, managing, wiring, routing, logging and servicing Dockerised application images.

In the meantime you can get a taste with Lattice[0].

[0] http://lattice.cf/

jedberg10y ago

There was a whole paragraph in the article about why Docker didn't work for them.

x0x010y ago

they had 3 reasons:

one of which was just silly (kernel version -- are you living on that point release forever?)

and one of which definitely sounds painful (docker vs extant ansible playbooks)

2 more replies

sophacles10y ago

We use a devpi server, and just push the new package version, including wheels built for our server environment, for distribution.

On the app end we just build a new virtualenv, and launch. If something fails, we switch back to the old virtualenv. This is managed by a simple fabric script.

nZac10y ago

erikb10y ago

kevinschumacher10y ago

    git clone --depth=1 path/to/repo

when doing a clone for a deploy, since you don't need the history

so0k10y ago

we download to a folder on the docker build server and build docker containers from this cache.

see here: http://stackoverflow.com/a/29936384/138469

viraptor10y ago

> curl “https://artifacts.nylas.net/sync-engine-3k48dls.deb” -o $temp ; dpkg -i $temp

sciurus10y ago

Yes, that part really surprised me. Barebones repos are useful enough, and there's also some pretty fancy tools out there like http://www.aptly.info/

mixmastamyk10y ago

Could you elaborate on the simple folder?

For example I find the --instdir option to dpkg but it still would have to be downloaded from the other host, unless of course the folder was mounted somehow.

viraptor10y ago

Search for debian's "trivial archive". It replaces release/component elements with explicit path. It's deprecated now, but I believe still works.

perlgeek10y ago

Note that the base path /usr/share/python (that dh-virtualenv ships with) is a bad choice; see https://github.com/spotify/dh-virtualenv/issues/82 for a discussion.

You can set a different base path in debian/rules with export DH_VIRTUALENV_INSTALL_ROOT=/your/path/here

serkanh10y ago

juliangregorian10y ago

It's not necessarily true either; it's not difficult to have your continuous build process build images from the Dockerfile, run tests, swap green and blue, etc...

erikb10y ago

No No No No! Or maybe?

objectified10y ago

rfeather10y ago

StavrosK10y ago

vacri10y ago

> different settings per machine

/etc/default/mycoolapp.conf

acdha10y ago

1. http://chris.improbable.org/2009/10/16/deploying-django-site...

doki_pen10y ago

tspike10y ago

avilay10y ago

Here is the process I use for smallish services -

1. Create a python package using setup.py 2. Upload the resulting .tar.gz file to a central location 3. Download to prod nodes and run pip3 install <packagename>.tar.gz

Rolling back is pretty simple - pip3 uninstall the current version and re-install the old version.

Any gotchas with this process?

mixmastamyk10y ago

If you are using it for small services it's probably fine. But the original article did say that uninstall sometimes doesn't work correctly. Apt is more formal than pip.

So at some point, as you know you'll need to move on.

webo10y ago

You have to do this every time there's a change in the codebase which is not easy. How do you stick this into a CI without the git & pip issue talked about in the post?

avilay10y ago

I have to do this everytime I have to deploy, which is similar to having to create a deb package everytime Nylas has to deploy.

There are no git dependencies in the process I describe above.

1 more reply

velocitypsycho10y ago

I vaguely remember .deb files having install scripts, is that what one would use?

viraptor10y ago

Depends on your environment, number of hosts, etc. really. You probably don't want to stick it into the same install script because:

- your app user doesn't need rights to modify the schema

- you need to handle concurrency of schema upgrades (what if two hosts upgrade at the same time?)

- if your migration fails it may leave you in the weird installation state and not restart the service

Ideal solution: deploy code which can cope with both pre-migration and post-migration schema -> upgrade schema -> deploy code with new features.

jlees10y ago

stephenr10y ago

Depends on your infra. If it's a single server with the app + Db (or a single app server + single DB server) you could have a postinst script that calls your app/framework's migration system.

If your migration system is smart enough (or you can easily check the migration status from a shell script) you could also do this in a multi-app-server environment too.

lifeisstillgood10y ago

Weirdly I am re-starting an old project doing this venv/ dpkg (http://pyholodeck.mikadosoftware.com). The fact that it's still a painful problem means Inam not wasting my time :-)

webo10y ago

> Building with dh-virtualenv simply creates a debian package that includes a virtualenv, along with any dependencies listed in the requirements.txt file.

So how is this solving the first issue? If PyPI or the Git server is down, this is exactly like the git & pip option.

stephenr10y ago

You need those things up to build the package not to install it

webo10y ago

Ah I misunderstood the article. I just package my application source in a tar during deployment. I thought that's what most people do.

1 more reply

compostor4210y ago

Great article. I had never heard of dh-virtualenv but will be looking into it.

How has your experience with Ansible been so far? I have dabbled with it but haven't taken the plunge yet. Curious how it has been working out for you all.

emfree10y ago

compostor4210y ago

Thanks for the article. Well written and a very interesting concept.

BuckRogers10y ago

ytjohn10y ago

If you took this a step further and set up a debian repo, then you could have your clients use that debian repo.

I'm looking to do something pretty similiar, but RPMs. I found rpmvenv that seems to work in the same fashion. https://pypi.python.org/pypi/rpmvenv/0.3.1

stephenr10y ago

Exactly this.

stephenr10y ago

Why? Saying it runs on Debian X is much more easy to facilitate by your end users than "requires docker"

ah-10y ago

conda works pretty well.

TDL10y ago

Agreed (although biased since I used to work at Continuum.) I am wondering what others think of conda?

4lejandrito10y ago

tuckermi10y ago

hcrisp10y ago

I use conda as well and found it to be great. I love how it detects and manages dependencies for you when you install a new module.

pekk10y ago

nashequilibrium10y ago

Love it, i don't hear much about it on HN but i personally really like it.

rlvesco710y ago

I like it too, but I had issues when I had to compile packages and other stuff. In particular, its version of qmake can interfere in unpredictable ways.

jacques_chester10y ago

Here's how I deploy python code:

    cf push some-python-app

So far it's worked pretty well.

Works for Ruby, Java, Node, PHP and Go as well.

nobullet10y ago

That's interesting. I've tried to google cf CLI however wasn't able to find good documentation. Is it possible to install cf CLI on my server? Or is it Cloud Foundry tool only?

jacques_chester10y ago

The cf CLI tool interacts with a Cloud Foundry installation.

You'd use it for one in your own data centre, or Pivotal Web Services[0], or BlueMix. You point it at an API and login, then off you go.

If you need something more cut-down to play with, Lattice[1] is nifty, but currently doesn't do buildpack magic.

[0] https://run.pivotal.io/ [1] http://lattice.cf/

daryltucker10y ago

I see your issue of complexity. Glad I haven't ever reached the point where some good git hooks no longer work.

theseatoms10y ago

Does anyone have experience with PEX?

stefantalpalaru10y ago

> The state of the art seems to be ”run git pull and pray”

themartorana10y ago

Git deployments work great if you're packing an image (AMI, Docker) using, say, Packer. But we only deploy "immutable" images, not code on to existing servers.

hobarrera10y ago

> The state of the art seems to be ”run git pull and pray”

Looks like these guys never heard of things like CI.

jumpkick10y ago

You had to read further.

This is the core of how we deploy code at Nylas. Our continuous integration server (Jenkins) runs dh-virtualenv to build the package, and uses Python’s wheel cache to avoid re-building dependencies.

andrewchambers10y ago

Or release branches/tags.

j / k navigate · click thread line to collapse