You've spent a lot of time setting up a private cloud of servers. Everything's virtualized and organized by function: your messaging VMs run on these hosts, your web servers run on those. You've tested it extensively and you're happy with how everything talks to each other. The worst is over, right? Wrong. Now you have to move past the theoretical and actually use this thing in production. It's time to start deploying the applications you're building into this cloud of virtualized resources, and time to develop some scheme to keep those applications updated when changes are made. Keep in mind that whatever mistakes you inject at this point will be multiplied by the number of machines you deploy to.
Don't be intimidated! It's really not that hard. In this article, I'll introduce you to some of the concepts I used in developing the fairly simple system of messages and scripts that deploys artifacts into our private cloud. This won't be a technical HOWTO so much as a casual dinner conversation about the pitfalls and rewards. Above all, I want to get across that having a bunch of virtual machines that do the same thing doesn't have to keep you up at night.
I haven't really made it a point to clarify how we handle security with an internal, private cloud. Since we don't run publicly accessible servers, we can use tools that make assumptions about our environment that you likely won't be able to make if you're running the same thing on a public cloud. Our servers are protected from all external access, segregated onto their own VLAN, and secured by exposing only essential ports like messaging, web, and SSH. We can get away with exposing critical system-management functionality to things like messaging servers because we have other "compensating controls" in place to manage our security. Running in a public cloud opens all manner of cans of worms when it comes to security; we sidestep those entirely by running our cloud in a private setting.
One of the backbones of UNIX-based server clusters is SSH. It's a great tool for managing groups of servers non-interactively—assuming you have your keys exchanged properly. Our pre-cloud architecture relied on just such a system: a Subversion check-in ran a hook script that pushed changes to the application servers by running "svn update" on each of them via SSH. But the severe limitation here is that you have to propagate SSH keys to every machine you want to connect to. Doing that, some might argue, isn't all that hard. I'd agree—if you're talking about doing it only once, from one master machine to multiple slave machines. But when each server needs to potentially talk to every other server, propagating that many SSH keys becomes, let's just say, inconvenient (I wanted to say "a huge pain in the arse"). Beyond that, using SSH as a deployment mechanism usually means becoming a slave to Bash and being forced to learn to use Awk correctly. I'd rather have Chinese water torture.
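Just to put numbers on that pain, here's a quick back-of-the-envelope sketch (my own arithmetic, not part of any deployment script): in a full mesh, every server's public key has to land in every other server's authorized_keys file, so the work grows quadratically with the cluster size.

```ruby
# Full-mesh SSH trust: each of n servers must install its public key
# on the other n-1 servers, so total key installations are n * (n - 1).
def key_installations(n_servers)
  n_servers * (n_servers - 1)
end

key_installations(3)   # a toy cluster: 6 installations
key_installations(20)  # a modest private cloud: 380 installations
```

Three machines are manageable by hand; twenty are not, and every VM you clone makes it worse.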
It makes a lot of sense to tie your deployment mechanism, in some fashion, to your source code control mechanism. (If you don't already have source code control, I'd have to frankly ask: why the heck not?) We used to use Subversion; now we've switched to Git. I love Git. It's a fantastic SCM; it's lightweight and easy to use. But the problem with using your SCM as your deployment tool is pretty obvious: I don't want stuff deployed to production every time I check something in. When Subversion check-ins were our deployment triggers, we would simply not check anything in until we were ready to deploy. That just doesn't work in a team environment: I can't wait until some far-future deployment date to get your changes incorporated into my own work; I need those changes whenever you're finished with them. At a minimum, the SCM hook scripts can't have the final say in what gets deployed.
SpringSource's tcServer can be managed by Hyperic, which has an administrative function to deploy artifacts to clusters of configured servers. I don't use Hyperic, so I can't speak to the efficacy of such a solution, but I'm sure it's a viable way to manage artifact deployment—so long as all you're talking about is WAR files destined for application servers. If you also have to deploy HTML artifacts and/or configuration files with that, then you'll have to supplement the WAR file deployment with one of the other mechanisms.
This is the route I took, because none of the three other solutions I've mentioned here would work for me. And until private cloud architectures are first-class citizens with a bevy of tools as easy to use as "sudo gem install cloud-utils && deploy myapp.war" (I've almost got us there; the only caveat is that you have to edit a config file before running "deploy"), you'll likely be writing something on your own. I'll help you along as much as I can, but I suspect that, given the dramatic variation in private cloud architectures, you'll still be stuck writing some custom scripts. Look at it as a learning experience—a chance to pad your resume with more cool stuff so you can finally bail when the economy picks back up, right?
When the initial build has completed successfully, all the tests have passed, and all the artifacts have been built, TeamCity stages those changes immediately onto the development server. That's the only server where we want every change we check in to show up right away, and the first half of the automated deployment process stops right there. This cycle of building source code and staging to development might happen once, or it might happen a half dozen times, before we've tested our changes and are ready to let them graduate to production. TeamCity has a web interface that lets a developer run a build configuration with the click of a button, so deploying our artifacts is, without exaggeration, as simple as clicking a button on a web page. But to do this, we need a message broker.
We've built some TeamCity configurations that leverage our asynchronous messaging infrastructure to notify the other servers that the artifacts TeamCity has just assembled are ready to be deployed. The reason we use a message broker to notify clients of deployable artifacts, rather than some sort of active "scp" or "ftp" process, is that the servers we might want to deploy to may not be active at the moment. By using a messaging server, we let the individual VMs themselves figure out whether they need to download a new artifact, and deploy it whenever it's appropriate for that VM to do so.

You probably also don't want new WAR files being deployed willy-nilly. If there's an urgent fix for some application that we need to deploy immediately, we can do so by calling the deploy script manually. But if at all possible, we let the automated process check for new deployable artifacts at 7:00 a.m. every morning. We don't have many users at that time, and it's close enough to normal business hours that, if something goes horribly wrong, we can fix it without a lot of downtime between the actual deployment and when we get in for the morning.

It doesn't matter all that much, of course, what messaging system you choose to perform this function. But if you've read any of my previous articles, you already know I'm a huge fan of the RabbitMQ AMQP message broker for all my messaging needs. The Ruby scripts I've written (packaged as a Gem on rubygems.org) use the Ruby AMQP client and are written against RabbitMQ 1.8. Within TeamCity, we have a build configuration that simply dumps a message into a queue, with the name of the artifact in the body of the message. It's no problem to send a bunch of these; when the monitor script on the application server runs, it will only look at the last message in the queue and disregard any intervening messages.
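To make that concrete, here's a rough sketch of what building such a message body might look like. Note that the JSON shape and key names are my illustration, not necessarily the exact format the cloud-utils gem uses; in the real setup, this string would then be published to a RabbitMQ queue via the Ruby AMQP client.

```ruby
require 'digest/md5'
require 'json'

# Build the body of a deployment message for a freshly built artifact:
# the artifact's name plus its MD5 sum, which the monitor script on
# each server uses later to decide whether it has already deployed it.
def deployment_message(artifact_path)
  {
    'artifact' => File.basename(artifact_path),
    'md5'      => Digest::MD5.file(artifact_path).hexdigest
  }.to_json
end
```

The MD5 sum rides along in the message so the receiving side never has to call back to the build server just to decide whether a message is news.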
There are two parts to deploying artifacts via RabbitMQ: the "monitor" script, which checks to see what artifacts need to be deployed, and the "deploy" script, which does the actual deployment. When we want to deploy artifacts manually, or force a server to download and deploy a new artifact, we can run the "deploy" script from the command line. Otherwise, we try to let the automated processes handle it, if for no other reason than that it enforces a little consistency on when artifacts get deployed.

At 7:00 a.m. every morning, a cron job runs the cloud-utils monitor script. This script reads a configuration file and, for each configured entry, checks that queue for a deployment message. Since we leave old messages on the queue for an undetermined amount of time, there might be several messages to sort through. Each deployment message contains the name of the artifact to be deployed and the MD5 sum of that artifact. Each server keeps track of which artifacts have already been deployed and will simply skip over any message with one of those MD5 sums. Each monitor configuration also contains a command to run when a deployment artifact is ready: if the monitor script sees a deployment message with an MD5 sum it hasn't seen yet, it runs whatever command you put in that configuration section. In our case, that command is the other half of the cloud-utils package: the "deploy" script. The deploy script attempts to download the artifact from the location you've configured in the deployer's configuration file. It will even go so far as to check ETags (assuming your deployment server sends such a header) to see whether it needs to re-download the file.
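A minimal sketch of that monitor-side bookkeeping looks something like the following. The message shape and the one-sum-per-line "seen" file are my assumptions for illustration; the real cloud-utils script differs in its details, and the block here stands in for whatever command the configuration names (in our case, the "deploy" script).

```ruby
require 'set'

# Skip any message whose MD5 sum this server has already deployed;
# for each new sum, run the configured command (modeled as a Ruby
# block) and remember the sum so the next run skips it too.
def process_deploy_messages(messages, seen_file)
  seen = File.exist?(seen_file) ? Set.new(File.readlines(seen_file, chomp: true)) : Set.new
  messages.each do |msg|
    next if seen.include?(msg['md5'])  # already deployed on this server
    yield msg                          # e.g. invoke the "deploy" script
    seen << msg['md5']
  end
  File.write(seen_file, seen.to_a.join("\n"))
end
```

Because each VM keeps its own seen-sums file, a server that was powered off for a week simply catches up the next time its cron job fires; no central process has to know which machines are behind.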
If the ETag doesn't match one it has already downloaded (or you specify the "force" flag), the script pulls the artifact down over HTTP and either copies it directly to the location you specify or, if you've configured it to, attempts to unpack the downloaded file to the location you've set. It does some rudimentary checking and uses the "unzip" or "tar" command, depending on the file extension (".zip" or ".tar.gz", respectively).

The flexibility inherent in using asynchronous messaging instead of an active SSH-based solution for deploying artifacts into your private cloud makes it worth the time it takes to set this system up. VMs don't need to share SSH keys, and developers need only click a button in a web interface to deploy their artifacts into the cloud. You have better things to do than schlep about on the command line come time to deploy your web artifacts into your private cloud.
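For the curious, that extension-based dispatch can be sketched in a few lines. The command flags shown are typical, plausible values, not necessarily cloud-utils' exact invocation.

```ruby
# Pick an external extraction command based on the artifact's file
# extension, as the deploy script does: "unzip" for .zip archives,
# "tar" for .tar.gz, and an error for anything it doesn't recognize.
def extract_command(artifact, dest)
  case artifact
  when /\.zip\z/     then ['unzip', '-o', artifact, '-d', dest]
  when /\.tar\.gz\z/ then ['tar', '-xzf', artifact, '-C', dest]
  else raise ArgumentError, "don't know how to extract #{artifact}"
  end
end
```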