Recap: OpenStack Atlanta Meetup Mar 01

Armed to the teeth with coffee, Danishes and laptops, a band of merry men, led by Sir Duncan McGreggor*, set out yesterday morning to rob the royal Essex release of any and all bugs.  The gathering of this crack dev team marked the inaugural Meetup of OpenStack Atlanta, a new local chapter in the rapidly growing community.

DreamHost OpenStack Atlanta Meetup

DreamHost’s team in Atlanta facilitated the Hack-In event and hosted it in a cozy cottage.  Rakish beards aside, these happy hackers made great strides in testing Essex.  In order to maximize their impact, the Atlanta-based group reached across the country and coordinated their efforts with fellow coders at Canonical in Colorado and DreamHost in San Francisco.

 

Tech highlights include:

  • Tested development deployments of OpenStack using Vagrant (some successes, some blockers)
  • Tested dev deployments of OpenStack using VirtualBox directly
  • Filed bugs against Horizon regarding error feedback to users and how the documentation is generated
  • Dug into issues with logging and inconsistencies in datestamps
  • Uncovered some oddities in the use of GNU screen, and hanging services/partial DevStack installs caused by sudo assumptions: DevStack assumes passwordless sudo, and will label an install as failed if it hangs on the apache log tail waiting for a password, even if the install actually succeeded and all the services started correctly
  • Doug Hellmann made his first commit upstream to OpenStack

Bug banishment:  A bug in Horizon was uncovered and confirmed later that day.  It’s currently marked as high priority and is slated to be fixed in the first Essex release candidate.

Special thanks to Lloyd Dewolf and Tristan Goode for facilitating the Essex OpenStack Global Hack-In.

*Knighthood could not be confirmed at the time of this posting.

 

Written by: Brent Scotten

Another Step Forward for OpenStack in Production

Today Mark Interrante, Rackspace’s head of product, announced that the next generation of Rackspace’s Cloud Servers, powered by OpenStack Compute (Nova), is moving from Alpha to Beta for customers.  This is a major endorsement of the continuing maturation of OpenStack and its readiness to be deployed today.  Here are a couple of highlights about this launch:

  1. Rackspace is running very close to trunk, with basically a two week delta between what the Cloud Servers Beta is deploying and what is available at OpenStack.org.  This means that anyone has access to the same technology Rackspace is using today.
  2. Rackspace is exposing the OpenStack Compute API v2 developed by the community over the past year.
  3. This offering is a major step forward in terms of performance.  Mark highlights the ability to rapidly spin up instances and the better performance from the API.  The early demos of the beta have been truly remarkable.
  4. Rackspace continues to run OpenStack Object Storage (Swift) in production with its Cloud Files offering, and is actively working on offerings powered by other projects in the OpenStack community.  There is much more to come.
  5. Rackspace is continuing to increase its investment in moving the OpenStack community forward.  We are continually adding developers to push the code base forward, and adding our expertise to other community efforts including testing and documentation.

This offering is not production-ready yet, but it is getting much closer.  Much of the work that remains is specific to Rackspace’s service provider scale.  In the meantime, companies with smaller deployment needs are putting OpenStack into full production today.  We expect to see that pace continue to accelerate.

Congrats to Mark and the entire Rackspace team!

Jim Curry
@jimcurry

Essex OpenStack Global Hack-In

Essex is a release focused on improving the integrity of OpenStack, so our first Global Hack-In falls right at the start of the release candidate cycle, the beginning of March 2012. The event physically brings together the developers who spend their days (or nights) making OpenStack.

http://wiki.openstack.org/OpenStackUsersGroup/EssexGlobalHackIn

The event is focused on testing and getting familiar with the release and exploring areas outside of your expertise. Collaborators will be working on resolving high priority bugs, and exploring, reviewing and testing OpenStack Essex.

Thanks in advance to DreamHost, Gold Coast Techspace, Piston Cloud, MercadoLibre, Yahoo, Mirantis, Citrix, Ubuntu, and Rackspace for sponsoring these events. Special thanks to my co-conspirator Tristan Goode of Aptira and the OzStackers for representing with 3 local events in Australia!

It’s not too late to have a Hack-In at your location: just add the information to the wiki. Even if it’s just you and a friend opening your door to fellow stackers, it’s a great way to start a local community.

OpenStack Sydney Meetup Special Event with James Williams

Tristan Goode from Aptira, James Williams from NASA, and Phil Rogers from Aptira

Last Friday, February 24, at the Harbour View Hotel in Sydney, the Australian OpenStack User Group held a special meetup with James Williams, CIO of NASA Ames Research Center. For those not in the know, James and Ames had a key role in the establishment of Nebula and OpenStack.

Last week we heard James was in the country for a conference and was passing through Sydney, so an invitation to meet the User Group was extended and James graciously accepted at very short notice. James’ enthusiastic evangelism for OpenStack was very inspiring and everyone attending managed to get a chat in. On behalf of the Australian OpenStack User Group and the evening’s sponsor Aptira, thanks to James for fitting us into his hectic schedule.

More pictures are available over at TechWorld.

Head to our Australian Meetup group to get involved, or join the AU Google group.

 


OpenStack Governance Elections Spring 2012: Time to vote!

UPDATE: PPB election update: we need to reboot the voting process. Please accept our apologies. Read more at http://ow.ly/9lNGf

The OpenStack community is called to elect the Project Technical Leads and two seats of the Project Policy Board. The nominations process is now officially closed and voting can begin: everyone entitled to vote will receive a personal message via email on February 28 and has until March 3, 11:59 PST to vote. The email message will go to the address included in the Authors file and to the one provided during registration for PPB votes.

The official list of nominees (in random order) is the following:

NOVA Project Technical Lead (1 position)

KEYSTONE Project Technical Lead (1 position)

HORIZON Project Technical Lead (1 position)

SWIFT Project Technical Lead (1 position)

GLANCE Project Technical Lead (1 position)

PROJECT POLICY BOARD (2 positions)

Voting process

Like previous OpenStack Governance Elections, we will use the Condorcet Internet Voting Service from Cornell University, http://www.cs.cornell.edu/andru/civs.html. This tool uses the Condorcet method of voting, which involves ranking the nominees instead of just selecting one choice. More information on this methodology is at http://www.cs.cornell.edu/w8/~andru/civs/rp.html. All registered voters will receive an email with a unique link allowing them to vote privately.

Please note that the voting system is run using private polls with restricted access to ensure voter authenticity; however, all results will be made public once the election ends. Voter anonymity is guaranteed. The final ranking will be evaluated using the Schulze (also known as Beatpath or CSSD) completion rule. If an individual should happen to be elected as both a PTL and a General Member of the PPB, they will take their PTL seat only, and the elected General Member seat will go to the next highest vote getter in the most recent election. Thanks for participating in this essential process.

The election committee is made up of Stefano Maffulli, Lloyd Dewolf and Dave Nielsen.


Community Weekly Review (Feb 17-24)

OpenStack Community Newsletter –February 24, 2012

HIGHLIGHTS

EVENTS

OTHER NEWS

COMMUNITY STATISTICS

  •  Activity on the main branch of OpenStack repositories, lines of code added and removed per developer during week 7 of 2012 (from Mon Feb 13 00:00:00 UTC 2012 to Mon Feb 20 00:00:00 UTC 2012)

This weekly newsletter is a way for the community to learn about all the various activities occurring on a weekly basis. If you would like to add content to a weekly update or have an idea about this newsletter, please leave a comment.


OpenStack Governance Elections Spring 2012: Action Item For All Candidates

The OpenStack community is electing its Project Technical Leads and two members of the Project Policy Board. Details are at http://www.openstack.org/blog/2012/02/openstack-governance-elections-spring-2012/. Nominations will close on February 26; voting will start on February 28 and finish on March 3.

The list of nominees is at http://etherpad.openstack.org/Spring2012-Nominees and is still open. You must register at http://ppbelectionsregistration.openstack.org/ to vote for the PPB.

Before the voting process starts, the election committee asks all nominees to create a page on the OpenStack wiki and answer three simple questions:

1a. [for PPB] Since the last elections, what areas have you focused on and what contributions have you made in order to improve OpenStack as a whole?

1b. [for PTL] Since the last elections, what areas have you focused on and what contributions have you made in order to improve your project?

2a. [for PPB] What are the most pressing/important issues facing OpenStack as a whole?

2b. [for PTL] What are the most pressing/important issues facing your project?

3. What is your relationship to OpenStack & why is its success important to you and/or your company?

If you’re a candidate, create a wiki page using the template http://wiki.openstack.org/Governance/ElectionsSpring2012/[Firstname_Lastname] and answer those questions there. Feel free to add more content, too. Those pages will be included in the link sent to all voters.

The election committee is made up of Stefano Maffulli, Lloyd Dewolf and Dave Nielsen.


Community Weekly Review (Feb 10-17)

OpenStack Community Newsletter –February 17, 2012

HIGHLIGHTS

EVENTS

OTHER NEWS

COMMUNITY STATISTICS

  •  Activity on the main branch of OpenStack repositories, lines of code added and removed per developer during week 6 of 2012 (from Mon Feb 06 00:00:00 UTC 2012 to Mon Feb 13 00:00:00 UTC 2012)

This weekly newsletter is a way for the community to learn about all the various activities occurring on a weekly basis. If you would like to add content to a weekly update or have an idea about this newsletter, please leave a comment.


TryStack.org – A Sandbox for OpenStack!

Today, a project that has been a long time in the making is finally coming to fruition. Back last summer, when I was working at Rackspace, Nati Ueno from NTT PF Labs came up with the idea of establishing a “Free Cloud” — a site running OpenStack that developers using the OpenStack APIs could use to test their software.

The involvement of a number of companies — Dell, NTT PF Labs, Rackspace, Cisco, Equinix and HP Cloud Services — eventually drove the idea of “The Free Cloud” from concept to reality. Though there were many delays, I’m happy to announce that this new OpenStack testing sandbox has now been launched.

The artist formerly known as The Free Cloud is now called TryStack. The front page at http://trystack.org now houses some information about the effort and a few FAQs. If you’re interested in trying out OpenStack, TryStack.org is right up your alley, and we encourage you to sign up. Instructions for signup are on the front page.

Using TryStack

Once you’ve gone through the TryStack registration process, you will receive a username and password for logging in to the TryStack Dashboard and Nova API. After logging in, you’ll see the user dashboard for TryStack. This dashboard is essentially the stock upstream OpenStack Dashboard — only lightly modified with a couple extensions that Nati wrote to show consumption of Stack Dollars and the TryStack logo.

Stack Dollars are the “unit of currency” for TryStack. When you perform certain actions in TryStack — launching instances, creating volumes, etc — you consume Stack Dollars. Likewise, instances consume Stack Dollars as long as they are running. When you run out of Stack Dollars, you won’t be able to use TryStack services until your Stack Dollars are replenished. Stack Dollars will be replenished on a periodic basis (haven’t quite decided on the interval yet…)

To prevent people from gaming the system or using TryStack as a tool for evil, instances will remain alive for up to 24 hours or until your Stack Dollar balance is depleted. Also, always keep in mind that TryStack should only be used for testing. Under no circumstances should you use it for any production uses. There is no service level agreement with TryStack.

An Automation and Administration Sandbox Too!

In addition to being a useful sandbox for developers using the OpenStack APIs and others interested in seeing what OpenStack is all about, TryStack.org is also a very useful testbed for work that the OpenStack community is doing to automate the deployment and administration of OpenStack environments.

TryStack is deployed using the Chef Cookbooks from the upstream repository, and changes that are needed will be pushed back upstream immediately for consumption by the broad community. We have a limited HA setup for the initial TryStack availability zone and lessons learned from the deployment of these HA setups are being incorporated into an online TryStack Administrator’s Guide that will serve as input for the upstream documentation teams as well.

Roadmap for TryStack

In the next three to six months, we’re planning to bring on-line at least one more availability zone. The next availability zone will be running HP hardware and will be housed in a datacenter in Las Vegas. It is likely that this new zone will be deployed with the Essex release of OpenStack components, enabling users to test against both a Diablo-based OpenStack installation and an Essex-based installation.

This first availability zone does not contain an installation of Swift. Of course, we want to change that, so an installation of Swift is definitely on the roadmap for either the next availability zone or as a separate service itself. Note that, just like the instances launched in TryStack, objects stored in a TryStack Swift cluster would be temporary. After all, TryStack is for trying out OpenStack services, not for providing a free CDN or storage system! 🙂

We will also eventually move towards a different registration process to accommodate non-Facebook users. If you are interested in helping with this effort, please do let us know.

Finally, we’ll be adding things like an automated Twitter status feed for each zone, lots of documentation gathered from running TryStack, and hopefully a number of videos showing usage of TryStack as well as common administrative tasks — all with the goal of providing more and better information to the broad and growing OpenStack community. I fully expect numerous hiccups and growing pains in these first couple months of operation, but we promise to turn any pain points into lessons learned and document them for the benefit of the OpenStack community.

Please do check out the trystack.org service. We look forward to your feedback. You can find us on Freenode.net #trystack. Nati and I will be hosting a webinar February 23, and we’ll be speaking at a San Francisco meetup March 6 if you’re interested in learning more or getting involved.

Update: I totally goofed and left Cisco off the list of donor organizations. My apologies to Mark Voelker and the excellent folks at Cisco who provided two 4948-10GE switches that are in use in the TryStack cloud. I also got the link wrong to HP Cloud… which is pretty lame, considering I work for HP. 🙁 That’s been corrected.

Under the hood of Swift: the Ring

This is the first post in a series that summarizes our analysis of the Swift architecture. We’ve tried to highlight some points that are not clear enough in the official documentation. Our primary source was an in-depth look into the source code.

The following material applies to version 1.4.6 of Swift.

The Ring is a vital part of the Swift architecture. Half database, half configuration file, it keeps track of where all data resides in the cluster. For each possible path to any stored entity in the cluster, the Ring points to the particular device on the particular physical node.

Swift recognizes three types of entities: accounts, containers and objects. Each type has a ring of its own, but all three rings are structured the same way. Swift services use the same source code to create and query all three rings. Two Swift classes are responsible for these tasks: RingBuilder and Ring, respectively.

Ring data structure

Each of the three Rings in Swift is a structure consisting of three elements (a rough Python sketch follows the list below):

  • a list of devices in the cluster, also known as devs in the Ring class;
  • a list of lists of device ids indicating partition-to-device assignments, stored in a variable named _replica2part2dev_id;
  • an integer number of bits to shift an MD5-hashed path to the account/container/object to calculate the partition index for the hash (partition shift value, part_shift).
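
For illustration only, the three elements can be pictured roughly like this. This is a hand-written Python sketch following the description above, not the exact on-disk serialization format, and the partition power used is an assumed value:

from array import array

part_power = 18                                                     # illustrative value
ring_structure = {
    'devs': [],                                                     # one dictionary per storage device (see the keys listed below)
    'replica2part2dev_id': [array('H'), array('H'), array('H')],   # one array per replica (3 replicas here)
    'part_shift': 32 - part_power,                                  # bits to shift the MD5 hash by
}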
List of devices

The list of devices includes all storage devices (disks) known to the ring. Each element of this list is a dictionary with the following keys:

  • id (integer): index into the devices list
  • zone (integer): zone the device resides in
  • weight (float): the relative weight of the device compared to the other devices in the ring
  • ip (string): IP address of the server containing the device
  • port (integer): TCP port the server uses to serve requests for the device
  • device (string): disk name of the device in the host system, e.g. sda1; used to identify the disk mount point under /srv/node on the host system
  • meta (string): general-use field for storing arbitrary information about the device; not used by the servers directly

Some device management can be performed using the values in this list. First, for removed devices, the 'id' value is set to 'None', and device IDs are generally not reused. Second, setting 'weight' to 0.0 temporarily disables a device, since no partitions will be assigned to it.
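
A hypothetical entry in the devs list might look like this (all values are purely illustrative; 6000 is simply a common choice of port for the object server):

{'id': 0,
 'zone': 1,
 'weight': 100.0,
 'ip': '10.0.0.1',
 'port': 6000,
 'device': 'sdb1',
 'meta': 'rack A, installed 2011-09'}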

Partitions assignment list

This data structure is a list of N elements, where N is the replica count for the cluster (3 by default). Each element of the partitions assignment list is an array('H'): a compact, efficient Python array of unsigned short integers. These values are actually indexes into the list of devices (see the previous section), so each array('H') in the partitions assignment list represents a mapping from partitions to device IDs.

The ring takes a configurable number of bits from a path’s MD5 hash and converts them to an integer. This number is used as an index into the array('H'); the array element at that index holds the ID of the device to which the partition is mapped. The number of bits kept from the hash is known as the partition power, and 2 to the partition power gives the partition count.

For a given partition number, each replica’s device will not be in the same zone as any other replica’s device. Zones can be used to group devices based on physical locations, power separations, network separations, or any other attribute that could make multiple replicas unavailable at the same time.

Partition Shift Value

This is the number of bits taken from the MD5 hash of the '/account[/container[/object]]' path to calculate the partition index for that path. The partition index is calculated by translating the binary portion of the hash into an integer.
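
As a minimal sketch, assuming a partition power of 18 and ignoring the hash path suffix that Swift mixes into the path before hashing, the calculation looks roughly like this (the path used here is a made-up example):

from hashlib import md5
from struct import unpack_from

part_power = 18
part_shift = 32 - part_power
path = '/AUTH_test/photos/puppy.jpg'
partition = unpack_from('>I', md5(path.encode('utf-8')).digest())[0] >> part_shift
print(partition)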

Ring operation

The structure described above is stored as a pickled (see Python pickle) and gzipped (see Python gzip.GzipFile) file. There are three files, one per ring, and their usual names are:

account.ring.gz
container.ring.gz
object.ring.gz

These files must exist in the /etc/swift directory on every Swift cluster node, both Proxy and Storage, as services on all of these nodes use them to locate entities in the cluster. Moreover, the ring files on all nodes in the cluster must have the same contents for the cluster to function properly.

There are no internal Swift mechanisms that can guarantee the ring is consistent, i.e. that the gzip file is not corrupt and can be read. Swift services have no way to tell whether all nodes have the same version of the rings. Maintaining the ring files is the administrator’s responsibility. These tasks can, of course, be automated by means external to Swift itself.
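
For example, one quick external check (plain shell, not a built-in Swift feature) is to compare checksums of the ring files across all nodes and make sure they match:

md5sum /etc/swift/*.ring.gz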

The Ring allows any Swift service to identify which Storage nodes to query for a particular storage entity. The method Ring.get_nodes(account, container=None, obj=None) is used to identify the target Storage nodes for a given path (/account[/container[/object]]). It returns a tuple of the partition and a list of node dictionaries. The partition is used to construct the local path to the object file or to the account/container database. The node dictionaries have the same structure as the devices in the list of devices (see above).
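
A minimal usage sketch, assuming the object ring file is in its usual place and using made-up account, container and object names:

from swift.common.ring import Ring

ring = Ring('/etc/swift/object.ring.gz')
part, nodes = ring.get_nodes('AUTH_test', 'photos', 'puppy.jpg')
print(part)
for node in nodes:
    print('%(ip)s:%(port)s /%(device)s zone %(zone)s' % node)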

Ring management

Swift services cannot change the Ring; it is managed by the swift-ring-builder script. When a new Ring is created, the administrator should first specify the builder file and the main parameters of the Ring: the partition power (which determines the partition shift value), the number of replicas of each partition in the cluster, and the minimum time in hours before a specific partition can be moved again:

swift-ring-builder <builder_file> create <part_power> <replicas> <min_part_hours>

Once the builder file is created, the administrator should add devices to the Ring. For each device, the required values are the zone number, the IP address of the Storage node, the port on which the server is listening, the device name (e.g. sdb1), optional device metadata (e.g. model name, installation date or anything else) and the device weight:

swift-ring-builder <builder_file> add z<zone>-<ip>:<port>/<device_name>_<meta> <weight>

The device weight is used to distribute partitions between the devices: the greater the weight, the more partitions are assigned to that device. The recommended initial approach is to use devices of the same size across the cluster and set a weight of 100.0 for each of them. For devices added later, the weight should be proportional to their capacity. At this point, all devices that will initially be in the cluster should be added to the Ring. The consistency of the builder file can be verified before creating the actual Ring file:

swift-ring-builder <builder_file>

If verification succeeds, the next step is to distribute the partitions between the devices and create the actual Ring file. This is called ‘rebalancing’ the Ring. The process is designed to move as few partitions as possible in order to minimize the data exchange between nodes, so it is important that all necessary changes to the Ring are made before rebalancing it:

swift-ring-builder <builder_file> rebalance

The whole procedure must be repeated for all three rings: account, container and object. The resulting .ring.gz files should be pushed to all nodes in the cluster. The builder files are also needed for future changes to the rings, so they should be backed up and kept in a safe place. One approach is to store them in Swift itself as ordinary objects. A concrete end-to-end example is sketched below.
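
For example, building an account ring with a partition power of 18, 3 replicas and a one-hour minimum between partition moves might look like this (the IP addresses, port and device names are illustrative only; 6002 is a common default port for the account server):

swift-ring-builder account.builder create 18 3 1
swift-ring-builder account.builder add z1-10.0.0.1:6002/sdb1 100
swift-ring-builder account.builder add z2-10.0.0.2:6002/sdb1 100
swift-ring-builder account.builder add z3-10.0.0.3:6002/sdb1 100
swift-ring-builder account.builder rebalance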

Physical disk usage

A partition is essentially a block of data stored in the cluster. This does not mean, however, that disk usage is constant across partitions. The distribution of objects between partitions is based on the hash of the object path, not on the object size or other parameters. Objects are not split across partitions, which means that an object is kept as a single file in the storage node’s file system (except for very large objects, greater than 5 GB, which can be uploaded in segments; see the Swift documentation).

A partition mapped to a storage device is actually a directory in the structure under /srv/node/<dev_name>. The disk space used by this directory may vary from partition to partition, depending on the size of the objects that have been placed in the partition by mapping the hash of the object path through the Ring.
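
As an illustration (the exact on-disk layout is an implementation detail of the object server, and the names here are placeholders, not real values), an object’s data file typically ends up under a path of roughly this shape:

/srv/node/<device>/objects/<partition>/<hash suffix>/<path hash>/<timestamp>.data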

In conclusion, it should be said that the Swift Ring is a beautiful structure, though it lacks a degree of automation and synchronization between nodes. I’m going to write about how to solve these problems in one of the following posts.

More information

More information about the Swift Ring can be found in the following sources:

  • Official Swift documentation: the base source for the description of the data structure
  • Swift Ring source code on GitHub: the code base of the Ring and RingBuilder classes
  • Blog of Chmouel Boudjnah: contains useful Swift hints
