Automating Openstack Testing on Ubuntu

During the Ubuntu precise development cycle the Canonical Platform Server Team have been working on automating testing of Openstack on Ubuntu.

The scope of this work was:

  1. Per-commit testing of Openstack trunk to evaluate the current state of the upstream codebase in conjunction with the current packaging in Ubuntu precise and the current Juju charms to deploy Openstack.
  2. SRU testing for Openstack Diablo on Ubuntu 11.10.

Openstack do a lot of pre-commit testing through the use of gerrit with Jenkins; we wanted to supplement this with Ubuntu-focused testing to provide another dimension to the testing already completed upstream.

So grab a coffee and make yourself comfortable; this is not a short read….

Lab Setup

The Ubuntu Openstack QA lab consists of 12 servers; the primary server in the solution is an Ubuntu 11.10 install providing the following functions:

  1. Juju – used to deploy Openstack charms in the Lab
  2. Cobbler to support server provisioning (using the Ubuntu Orchestra packages in Oneiric)
  3. Jenkins CI – provides triggering based on upstream commits to github repositories and general job control and reporting.
  4. Schroots for Oneiric and Precise for building packages locally
  5. A reprepro managed local archive for Oneiric and Precise
  6. Squid based archive caching to reduce installation times in the lab

This server also acts as the gateway into and out of the Lab (it’s set up as a NAT router).

The other 11 servers are registered in Cobbler. All servers are connected to a Sentry CDU (Cabinet Distribution Unit) which allows full power control from Cobbler – thanks go to Andres Rodriguez for developing the required fence component for Cobbler to support this type of CDU.

Preseeded LVM Snapshot Installs

Initiating a new integration test run requires all machines to be powered down and re-provisioned from scratch. It is essential that our deployment and test runs can cope with the frequency of upstream commits, particularly as that frequency increases when Openstack approaches milestones and releases. After getting the initial lab setup in place, we were able to tear down all machines, re-provision and deploy Openstack in ~30 minutes.

It was important that we were able to minimize the time taken to complete the testing cycle. To do so, we snapshot and restore the root partition with LVM during the netboot installation. The process is as follows:

  1. Test run begins
  2. Juju deploys a service (e.g. nova-compute)
  3. A machine is netbooted and a preseeded LVM-based Ubuntu installation takes place onto /dev/qalab/root
  4. At the end of the installation, the root filesystem is moved to /dev/qalab/pristine-[release]-root and a snapshot created at /dev/qalab/root
  5. The machine reboots, runs Juju and deploys nova-compute as part of the rest of the Openstack deployment. This deployment is smoke tested.
  6. The next test run begins.  All machines are terminated. Juju redeploys nova-compute, a machine is netbooted and Ubuntu installation kicks off.
  7. The installation checks for the existence of a logical volume at /dev/qalab/pristine-[release]-root.  If it exists, it creates a new snapshot at /dev/qalab/root and reboots. If it does not, it continues with the installation and proceeds to step 4.
  8. System reboots, Juju installs and redeploys nova-compute to a fresh Ubuntu installation.
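
The decision made in steps 4 and 7 can be sketched in a few lines of Python. This is only an illustration of the logic: in the lab it lives in debian-installer preseeds, not Python. The function name, the snapshot size and the removal of the previous run's snapshot before taking a new one are assumptions made for this example.

```python
def plan_root_volume(existing_lvs, release, vg="qalab", snapshot_size="10G"):
    """Return (action, commands) describing how one node gets its root volume.

    existing_lvs: names of the logical volumes already present in the
    volume group. snapshot_size is an assumed value, not the lab's.
    """
    pristine = f"pristine-{release}-root"
    origin = f"/dev/{vg}/{pristine}"
    take_snapshot = ["lvcreate", "--snapshot", "--name", "root",
                     "--size", snapshot_size, origin]

    if pristine in existing_lvs:
        commands = []
        if "root" in existing_lvs:
            # Assumption: the snapshot left by the previous run is discarded.
            commands.append(["lvremove", "--force", f"/dev/{vg}/root"])
        commands.append(take_snapshot)
        return "restore", commands

    # No pristine volume yet: after the full install, the fresh root becomes
    # the pristine volume and the node boots from a snapshot of it (step 4).
    return "install", [["lvrename", vg, "root", pristine], take_snapshot]
```

A node that already holds a pristine volume for the release only needs the "restore" commands, which is where the time saving comes from.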

This process takes place on all nodes in parallel.  With it in place, we were able to cut the time it took to tear down and re-provision a node from ~30 minutes to 10 to 15 minutes, depending on the service being deployed.

By taking this approach we also minimize the chance of any nodes hitting an archive inconsistency during installation. This is a known issue when deploying the development release and halts installation on any node that hits it, failing the entire deployment.

All of this is embedded in debian-installer preseeds via Cobbler snippets.  The snippets and kickstarts are available at lp:~openstack-ubuntu-testing/+junk/cobbler-lvm-snapshot.

In the future, we’ll be investigating the use of kexec as an alternative to reboot after snapshot restoration to reduce the time spent waiting on servers to boot.  This should minimize the test cycle even more. Credit to James Blair for the idea (see http://amo-probos.org/post/11).

Management of Jenkins

All of the projects in Jenkins are managed using Jinja2 XML templates in conjunction with python-jenkins; this makes it really easy to set up new jobs in the lab and reconfigure existing ones as required (as well as providing a great backup!).

Templates and management scripts can be found in lp:~openstack-ubuntu-testing/+junk/jenkins-qa-lab
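
To give a feel for the template-driven approach, here is a minimal sketch rather than the lab's actual scripts. The template is a deliberately tiny stand-in for a real Jenkins job definition, and the function name, job name and template variables are hypothetical. The `server` argument is expected to be a `jenkins.Jenkins` instance from python-jenkins; only its `job_exists`, `create_job` and `reconfig_job` methods are used.

```python
import jinja2

# Heavily simplified job definition; a real config.xml carries much more.
JOB_TEMPLATE = jinja2.Template("""<?xml version='1.0' encoding='UTF-8'?>
<project>
  <description>Build {{ component }} trunk for Ubuntu {{ release }}</description>
</project>
""")


def sync_job(server, name, **context):
    """Create the job if it is missing, otherwise reconfigure it in place."""
    config_xml = JOB_TEMPLATE.render(**context)
    if server.job_exists(name):
        server.reconfig_job(name, config_xml)
        return "reconfigured"
    server.create_job(name, config_xml)
    return "created"


# Example wiring (needs a reachable Jenkins; URL and credentials are made up):
#   import jenkins
#   server = jenkins.Jenkins("http://localhost:8080",
#                            username="admin", password="secret")
#   sync_job(server, "precise-essex-nova-trunk",
#            component="nova", release="precise")
```

Because every job is rendered from a template held in version control, re-running a script like this against an empty Jenkins rebuilds the whole job set, which is what makes the templates double as a backup.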

Testing Openstack Essex on Ubuntu Precise

This testing was the first to be set up in the lab.  Jenkins (using the git plugin) monitors the upstream github.com repositories for commits on the master branch.  When a change is detected, the following process is triggered:

Build

Objective: Validate that upstream trunk still builds OK with current packaging for Ubuntu.

  1. A new snapshot upstream tarball is generated based on the latest commit to the upstream component.
  2. The latest archive packaging for the component is pulled in from lp:~ubuntu-server-dev/<COMPONENT>/essex
  3. Any changes in the testing packaging for the component are merged from lp:~openstack-ubuntu-testing/<COMPONENT>/essex
  4. New changelog entries are automatically created for the new upstream commits.
  5. The source package is generated and built in a clean schroot using sbuild locally.

On the assumption that the package built OK locally:

  1. The source package is uploaded to the Testing PPA (ppa:openstack-ubuntu-testing/testing)
  2. The testing packaging branch is pushed back to lp:~openstack-ubuntu-testing/<COMPONENT>/essex.
  3. The binary packages from the sbuild are installed into the local reprepro managed archive.

This process is managed by a single script (tarball.sh). Credit to Chuck Short for pulling together this part of the process based on work from Openstack upstream.
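
As an illustration of the automatic changelog step in the build list, the sketch below formats one stanza in the standard Debian changelog layout. It is not taken from tarball.sh, which may well do this differently; the function name, the version string, the maintainer identity and the commit subject are all made up for the example.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def changelog_entry(package, version, distribution, changes, maintainer, when):
    """Format one debian/changelog stanza for a snapshot build."""
    lines = [f"{package} ({version}) {distribution}; urgency=low", ""]
    lines += [f"  * {change}" for change in changes]
    # The trailer is " -- name <email>", two spaces, then an RFC 2822 date.
    lines += ["", f" -- {maintainer}  {format_datetime(when)}", ""]
    return "\n".join(lines)


entry = changelog_entry(
    package="nova",
    version="2012.1~snapshot20120210-0ubuntu1",  # hypothetical version scheme
    distribution="precise",
    changes=["New upstream snapshot.", "Fix instance teardown race."],
    maintainer="QA Lab Bot <qa-lab@example.com>",
    when=datetime(2012, 2, 10, 12, 0, tzinfo=timezone.utc),
)
print(entry)
```

One bullet per upstream commit keeps the resulting package traceable back to the exact upstream changes it contains.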

For changes to the nova project the deploy phase is then executed.

Deploy

Objective: Validate that packages install, can be configured and reach a known good state prior to execution of testing.

This phase of testing uses Juju with Cobbler to deploy Openstack into the QA lab infrastructure. It uses branches of the Openstack charms that support a local archive, along with a deployer wrapper around Juju, written by Adam Gandelman, which executes the actual deployment and monitors for errors.

The deployer is configured to know where to get the right codebase for the Openstack charms, which services to deploy and which relations to set up between services. As you can see from the above diagram, this is non-trivial, but the charms and Juju do most of the hard work.
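
A toy version of that idea, not Adam's deployer, might describe the deployment as data and expand it into Juju commands. The service list and relations below are illustrative only and far smaller than the real lab topology; the function name and the data layout are assumptions.

```python
# Illustrative topology only; the real deployment has more services.
DEPLOYMENT = {
    "services": ["keystone", "glance", "nova-compute"],
    "relations": [("keystone", "glance"), ("glance", "nova-compute")],
}


def juju_commands(deployment):
    """Expand a deployment description into juju command lines."""
    services = set(deployment["services"])
    commands = [["juju", "deploy", name] for name in deployment["services"]]
    for left, right in deployment["relations"]:
        missing = {left, right} - services
        if missing:
            # Catch typos before any machine is provisioned.
            raise ValueError(f"relation uses undeployed service(s): {sorted(missing)}")
        commands.append(["juju", "add-relation", left, right])
    return commands


for command in juju_commands(DEPLOYMENT):
    print(" ".join(command))
```

Keeping the topology in data rather than in a shell script means the same description can be validated, diffed between runs and reused for different charm branches.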

Once Openstack is deployed successfully the test phase is then executed.

Test

Objective: Validate that the Openstack deployment in the lab actually works!

At this point, we can run any integration tests we wish against the newly deployed cloud.  This testing is able to help us achieve multiple goals:

  • Early detection of upstream bugs that break Openstack functionality on Ubuntu
  • Verification that packaging branches in the development version of Ubuntu are compatible with upstream trunk.
  • Using these packages, verification that our Juju charms are deploying a functional Openstack cloud and are up-to-date with any deployment-related configuration changes upstream.

At the moment this phase looks like this:

  1. Configure the Openstack deployment (Adam's deployer script provides some utility functions for locating specific services in the environment)
    • Create network configuration in Nova for the private instance network as well as a pool of public floating IPs.
    • Upload an image into the Glance server for use during testing.
    • Create EC2 credentials in the Keystone server for use during testing.
  2. Run the devstack exercise test scripts which ensure basic functionality of the deployment. Currently, this includes:
    • Basic euca-tools EC2 API for starting and stopping instances
    • EC2 AMI bundle uploads
    • Floating IP allocation, association and connectivity to instance
    • Volume creation and attachment to instance

Note: These are the same sets of tests that are currently run against proposed commits to gerrit upstream.
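
For a sense of how a Jenkins job might drive such scripts, here is a minimal runner sketch. It assumes the exercises are shell scripts in a single directory that signal failure through a non-zero exit status; the function name, directory path and `interpreter` parameter are hypothetical, and this is not the lab's actual test harness.

```python
import subprocess
from pathlib import Path


def run_exercises(directory, interpreter="bash"):
    """Run every *.sh script in `directory`; return {script name: passed}."""
    results = {}
    for script in sorted(Path(directory).glob("*.sh")):
        completed = subprocess.run(
            [interpreter, str(script)], capture_output=True, text=True
        )
        # A zero exit status is treated as a pass.
        results[script.name] = completed.returncode == 0
    return results


# Example: fail the Jenkins job if any exercise failed (path is made up).
#   results = run_exercises("/opt/devstack/exercises")
#   raise SystemExit(0 if all(results.values()) else 1)
```

Running every script and reporting per-script results, instead of stopping at the first failure, gives a fuller picture of what a given upstream commit broke.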

Longer term we aim to use the Openstack Tempest test suite in the lab; Adam is currently working on getting this up and running.

Reporting

The Jenkins instance in the QA lab is not publicly accessible; however, all jobs run in the lab are published out (using the Jenkins build-publisher plugin) to http://jenkins.qa.ubuntu.com so that people can see the current state of the testing packaging in Ubuntu precise.

We are also working on setting up email notifications.

Success so far

Juju charms deploy Openstack components in a configuration that is compatible with upstream trunk prior to updates to packaging in Ubuntu.  Previously packages were updated in the archive first while Juju charm updates lagged behind as incompatibilities were uncovered after the fact.

We enabled automated testing 2 days prior to the 3rd Essex milestone release.  We were able to uncover and help fix a handful of bugs upstream before the release, including critical bugs like 921784.  In the past, these bugs were typically uncovered after the release (both upstream and in Ubuntu).

Since E3, there have been even more critical bugs uncovered by this testing and fixed upstream, some of which are only applicable to Ubuntu-specific configurations (not tested upstream) and would have been uncovered by users after code hit the Ubuntu archive (See 922232).

Further Plans for the Lab

Pre-commit testing of changes to stable branches: the Ubuntu Server team are working upstream on maintaining the stable branches of released versions of OpenStack. This work will validate patches proposed to stable branches in review.openstack.org against the current version of the packaging in released versions of Ubuntu. Initially this will target Diablo on Ubuntu 11.10 but will also support Essex on Ubuntu 12.04 once released. Ideally the testing process will provide feedback on review.openstack.org to help the stable release team review proposed patches.

References

Jenkins job configurations: lp:~openstack-ubuntu-testing/+junk/jenkins-qa-lab

Scripts supporting the lab: lp:~openstack-ubuntu-testing/+junk/jenkins-scripts

LVM snapshot preseeds and Cobbler snippets: lp:~openstack-ubuntu-testing/+junk/cobbler-lvm-snapshot

All other relevant scripts, charm branches, etc: https://code.launchpad.net/~openstack-ubuntu-testing/

Credits

Overall management of delivery and general whip cracking: Dave Walker

Lab installation and base configuration: Pete Graner, Tim Gardner, Brad Figg, James Page

Fence agent for network power control of servers: Andres Rodriguez

Source package creation and build process: Chuck Short and James Page

Deployment testing using Juju: Adam Gandelman

Testing of Openstack: Adam Gandelman

Jenkins packaging, configuration and management: James Page

Gerrit Plugin for pre-commit testing and generally great ideas: Monty Taylor and James Blair

Writing and reviewing this post: Adam Gandelman, Chuck Short and Dave Walker.