The OpenStack Blog

Category: Documentation

OpenStack Outreach Program for Women Accepting Candidates

OpenStack provides open source software for building public and private clouds. We are constantly moving and growing and very excited to invite newcomers to our community. To this end, the OpenStack Foundation has joined the GNOME Outreach Program for Women.

The Women in OpenStack group has already found some mentors for the program and ideas for projects are flowing in. Each internship requires $5,000. We have funding from Rackspace and Red Hat for two participants, and the Foundation is also sponsoring one intern. More funding is welcomed.

Interns are expected to spend 40 hours a week on the project. Applicants do not need to have worked on FLOSS before.

Schedule

The program starts in November 2012 and the results of the internships will be ready by the Spring 2013 OpenStack Summit.

  • November 14: program announced and application form made available
  • November 14 – December 3: applicants need to get in touch with at least one project and make a contribution to it
  • December 3: application deadline
  • December 11: accepted participants announced
  • January 2 – April 2: internship period
  • April 2013: OpenStack Summit

Five Outreach Program for Women interns from previous rounds – Liansu, Christy, Meg, Tamara, and Barbara – created this cartoon to explain the application process!

If you have a pet project and want to mentor new women members of the OpenStack community, please contact Stefano Maffulli or Anne Gentle, add your ideas to the wiki page, and discuss on the openstack-dev mailing list.

If you know of students who would be interested in these internship opportunities, help us spread the word by linking to this post and our wiki page about the program.

Starter docs and articles

I wanted to send a note out to discuss the growth of all the starter docs and “articles” on a particular topic. Thanks all who are sending these as links to the mailing list or tweeting ‘em. We are listening.

The doc team has been discussing ways to ensure we help people find what they seek while still getting high-quality content into the “official” documentation. Here are some ideas. I’d like to get input from our wider community as well.

What we’re doing:

  • Add a “Where do I start?” section to the docs landing page. Let us know what you think of this approach by taking a look at the pending review. We discussed a friendlier approach to the docs site at some length, but I haven’t identified a web dev and designer to do the redesign; contact me if you’re interested.
  • Reach out to writers and, where licensing allows and something “official” is not already documented, bring the content into the official docs. We’ve done this a few times now; an example is how to custom brand the OpenStack Dashboard.
  • Add links to helpful blog entries to a “BloggersTips” wiki page.
  • Expand the install/deploy guide to include more distros so the “single distro” guides can stand alone. This effort is still a work in progress.
  • Hastexo has offered to write a separate high availability (HA) guide, so we won’t bring in their 12.04 “all in one” install guide after all, since the CSS OSS Starter Guide covers a similar scenario.
  • Remove “articles” from RST docs. (Currently nova only, in further discussion with the Project Technical Leads, QA and CI team leads.)
  • Add blog URLs to the Google Custom Search Engine at http://docs.openstack.org. I took this as an action item from our last doc team meeting.

What we’ve discussed:

  • Removing redundant docs. At the Design Summit, members of the nova core team asked for removal of “article” style RST documents from the nova source repo, creating a more doc-string based nova.openstack.org. Members of the swift core team, when asked, did not want to go to this architecture. I haven’t specifically asked all the PTLs on this particular item. So there’s still a potential problem here of consistency, where to write what, and having all the project.openstack.org sites that aren’t really tied together. I don’t have a good solution to suggest just yet but know we’re thinking about this particular problem. One idea was to have devs who want to write compose WordPress “articles” and that would aggregate together, but we haven’t found an ideal implementation (design is fine, working code, not so much).
  • Setting up a separate WordPress blog for documentation only. Apparently the aggregation tools just don’t give us all the requirements for version labels, bringing in one blog entry at a time (RSS feeds are needed), and so on.
  • Setting up a “support knowledge base” article site such as http://support.mozilla.org. We discussed this at the last doc team meeting. It seems to solve a lot of problems we have, but my current thinking (which of course can change) is that a support KB is for troubleshooting articles, while the “official” docs should create a happy path. These are two different scenarios, and I’m pretty sure the docs team cannot take on the support scenario with our current resources. A support knowledge base with translation built-in will go a long way in supporting our growing base, so this is important to me, but not in the Folsom plans currently.

I’ll follow up with each PTL for the docstring discussion, and welcome all input. Thanks for reading this far, and thanks for the docs. Now get started!

Starting line

Here is what happens inside Nova when you provision a VM

At the Essex conference summit this past month, we presented a session on OpenStack Essex architecture. As a part of that workshop we visually demonstrated the request flow for provisioning a VM and went over the Essex architecture. There was a lot of interest in this material; it’s now posted in Slideshare:

In fact, we’ve packaged up the architecture survey/overview as part of our 2-day Bootcamp for OpenStack. The next session is scheduled for 14-15 June. This time around we will carry out the training at the Santa Clara, CA offices of our friends at Nexenta. The last course was delivered at our Mountain View office right before the OpenStack summit in April to a sold-out crowd. You can find more information about the course at www.mirantis.com/training.

Under the hood of Swift: the Ring

This is the first post in a series that summarizes our analysis of the Swift architecture. We’ve tried to highlight some points that are not clear enough in the official documentation. Our primary source was an in-depth look into the source code.

The following material applies to version 1.4.6 of Swift.

The Ring is a vital part of the Swift architecture. This half database, half configuration file keeps track of where all data resides in the cluster. For each possible path to any stored entity in the cluster, the Ring points to a particular device on a particular physical node.

There are three types of entities that Swift recognizes: accounts, containers and objects. Each type has a ring of its own, but all three rings are built the same way. Swift services use the same source code to create and query all three rings. Two Swift classes are responsible for these tasks: RingBuilder and Ring, respectively.

Ring data structure

Each of the three rings in Swift is a structure that consists of three elements:

  • a list of devices in the cluster, also known as devs in the Ring class;
  • a list of lists of device IDs indicating partition-to-device assignments, stored in a variable named _replica2part2dev_id;
  • an integer number of bits to shift an MD5-hashed path to the account/container/object to calculate the partition index for the hash (partition shift value, part_shift).

List of devices

A list of devices includes all storage devices (disks) known to the ring. Each element of this list is a dictionary of the following structure:

| Key | Type | Value |
|---|---|---|
| id | integer | Index into the devices list |
| zone | integer | Zone the device resides in |
| weight | float | The relative weight of the device to the other devices in the ring |
| ip | string | IP address of server containing the device |
| port | integer | TCP port the server uses to serve requests for the device |
| device | string | Disk name of the device in the host system, e.g. sda1. It is used to identify disk mount point under /srv/node on the host system |
| meta | string | General-use field for storing arbitrary information about the device. Not used by servers directly |

Some device management can be performed using values in the list. First, for the removed devices, the 'id' value is set to 'None'. Device IDs are generally not reused. Second, setting 'weight' to 0.0 disables the device temporarily, as no partitions will be assigned to that device.
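As a rough illustration of the structure above, the following sketch builds a small device list by hand and filters out removed or disabled entries. The addresses, ports, and helper names (`usable_devices`, `mount_point`) are invented for this example and are not part of Swift.

```python
# Hypothetical device entries shaped like the table above; all values are made up.
devs = [
    {"id": 0, "zone": 1, "weight": 100.0, "ip": "10.0.0.1", "port": 6000,
     "device": "sda1", "meta": ""},
    {"id": 1, "zone": 2, "weight": 0.0, "ip": "10.0.0.2", "port": 6000,
     "device": "sdb1", "meta": "disabled for maintenance"},
    {"id": None, "zone": 3, "weight": 100.0, "ip": "10.0.0.3", "port": 6000,
     "device": "sdc1", "meta": "removed"},
]


def usable_devices(devices):
    """Return devices that are neither removed (id is None) nor disabled (weight 0.0)."""
    return [d for d in devices if d["id"] is not None and d["weight"] > 0.0]


def mount_point(dev):
    """Build the expected mount point of a device under /srv/node."""
    return "/srv/node/" + dev["device"]


for dev in usable_devices(devs):
    print(dev["id"], mount_point(dev))
```

Only the first device survives the filter here, since the second has weight 0.0 and the third has its 'id' set to None.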

Partitions assignment list

This data structure is a list of N elements, where N is the replica count for the cluster. The default replica count is 3. Each element of the partitions assignment list is an array('H'), a compact Python array of unsigned short integer values. These values are indexes into the list of devices (see previous section). So, each array('H') in the partitions assignment list maps partitions to device IDs.

The ring takes a configurable number of bits from a path’s MD5 hash and converts them to an integer. This number is used as an index into each array('H'), and the element at that index is the ID of the device to which the partition is mapped. The number of bits kept from the hash is known as the partition power, and 2 to the partition power gives the partition count.

For a given partition number, each replica’s device will not be in the same zone as any other replica’s device. Zones can be used to group devices based on physical locations, power separations, network separations, or any other attribute that could make multiple replicas unavailable at the same time.
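The lookup described in this section can be sketched with a toy assignment list. This is only an illustration under assumed values: the three devices, the partition power of 2, and the helper name `devices_for_partition` are made up, and a real ring is built by RingBuilder rather than written by hand.

```python
from array import array

# Hypothetical three-device cluster, one device per zone.
devs = [
    {"id": 0, "zone": 1, "device": "sda1"},
    {"id": 1, "zone": 2, "device": "sdb1"},
    {"id": 2, "zone": 3, "device": "sdc1"},
]

part_power = 2                 # assumed value for the example
part_count = 2 ** part_power   # 4 partitions

# One array('H') per replica; element p is the ID of the device that
# holds that replica of partition p.
replica2part2dev_id = [
    array("H", [0, 1, 2, 0]),
    array("H", [1, 2, 0, 1]),
    array("H", [2, 0, 1, 2]),
]


def devices_for_partition(part):
    """Return the device dictionaries that hold each replica of a partition."""
    return [devs[row[part]] for row in replica2part2dev_id]


for part in range(part_count):
    zones = [d["zone"] for d in devices_for_partition(part)]
    # Every replica of a partition should live in a different zone.
    assert len(set(zones)) == len(zones)
    print(part, zones)
```

Storing one compact array per replica keeps the whole mapping small even when the partition count is large, which is presumably why array('H') is used instead of nested Python lists.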

Partition Shift Value

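This is the number of bits by which the MD5 hash of the '/account/[container/[object]]' path is shifted to the right to calculate the partition index for the path. With a 32-bit value taken from the start of the hash, the shift equals 32 minus the partition power, so only the top bits (as many as the partition power) remain, and these are read as an integer number.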
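A minimal sketch of this calculation follows. It assumes that the first four bytes of the MD5 digest are read as a big-endian unsigned integer and shifted right by 32 minus the partition power, which is how I read the Swift source. The function name `partition_index` and the example path are made up, and the per-cluster hash suffix that Swift mixes into the path before hashing is left out, so the numbers will not match a real cluster.

```python
import hashlib
import struct


def partition_index(path, part_power):
    """Map an '/account[/container[/object]]' path to a partition index."""
    part_shift = 32 - part_power
    digest = hashlib.md5(path.encode("utf-8")).digest()
    # Read the first four bytes of the digest as a big-endian unsigned int,
    # then keep only its top part_power bits.
    top32 = struct.unpack_from(">I", digest)[0]
    return top32 >> part_shift


part_power = 10  # assumed value: 2 ** 10 = 1024 partitions
part = partition_index("/AUTH_demo/photos/cat.jpg", part_power)
assert 0 <= part < 2 ** part_power
print(part)
```

Because the index comes from the high bits of the hash, objects spread across partitions regardless of their names or sizes.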

Ring operation

The structure described above is stored as a pickled (see Python pickle) and gzipped (see Python gzip.GzipFile) file. There are three files, one per ring, and usually their names are:

account.ring.gz
container.ring.gz
object.ring.gz
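To make the storage format concrete, here is a small round trip through pickle and gzip.GzipFile using toy data. The dictionary keys, the file name `toy.ring.gz`, and the helper names are assumptions for illustration. Real ring files hold Swift's own ring data and should be read through the Ring class rather than by hand.

```python
import gzip
import os
import pickle
import tempfile
from array import array

# Toy ring data with the three elements described earlier; values are made up.
ring_data = {
    "devs": [{"id": 0, "zone": 1, "weight": 100.0, "ip": "10.0.0.1",
              "port": 6000, "device": "sda1", "meta": ""}],
    "replica2part2dev_id": [array("H", [0, 0, 0, 0])],
    "part_shift": 30,
}


def save_ring(path, data):
    """Pickle the structure and write it through gzip."""
    with gzip.GzipFile(path, "wb") as f:
        pickle.dump(data, f, protocol=2)


def load_ring(path):
    """Read a gzipped pickle back; only do this with files you trust."""
    with gzip.GzipFile(path, "rb") as f:
        return pickle.load(f)


path = os.path.join(tempfile.mkdtemp(), "toy.ring.gz")
save_ring(path, ring_data)
loaded = load_ring(path)
assert loaded == ring_data
```

Unpickling executes whatever the file describes, so a file like this should only ever come from a trusted source.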

These files must exist in the /etc/swift directory on every Swift cluster node, both Proxy and Storage, as services on all these nodes use them to locate entities in the cluster. Moreover, ring files on all nodes in the cluster must have the same contents so that the cluster can function properly.

There are no internal Swift mechanisms that can guarantee that the ring is consistent, i.e. that the gzip file is not corrupt and can be read. Swift services have no way to tell if all nodes have the same version of the rings. Maintenance of ring files is the administrator’s responsibility. These tasks can be automated by means external to Swift itself, of course.
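One simple external check is to compare checksums of the ring files collected from different nodes. The sketch below is only an example of that idea: the helper names `file_md5` and `rings_match` are made up, and the temporary files stand in for copies fetched from two nodes.

```python
import hashlib
import os
import tempfile


def file_md5(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, read in chunks."""
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest()


def rings_match(paths):
    """Return True if every file in paths has the same checksum."""
    return len({file_md5(p) for p in paths}) == 1


# Demonstration with temporary files standing in for copies from two nodes.
tmp = tempfile.mkdtemp()
copy_a = os.path.join(tmp, "node_a_object.ring.gz")
copy_b = os.path.join(tmp, "node_b_object.ring.gz")
for p in (copy_a, copy_b):
    with open(p, "wb") as f:
        f.write(b"same ring contents")

print(rings_match([copy_a, copy_b]))
```

A matching checksum only shows that the copies are identical. It does not show that the file itself is a valid ring.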

The Ring allows any Swift service to identify which Storage node to query for a particular storage entity. The method Ring.get_nodes(account, container=None, obj=None) identifies the target Storage nodes for the given path (/account[/container[/object]]). It returns a tuple of the partition and a list of node dictionaries. The partition is used for constructing the local path to the object file or account/container database. The node dictionaries have the same structure as the entries in the list of devices (see above).

Ring management

Swift services cannot change the Ring. The Ring is managed by the swift-ring-builder script. To create a new Ring, the administrator first specifies the builder file and the main parameters of the Ring: the partition power (which determines the partition shift value), the number of replicas of each partition in the cluster, and the time in hours before a specific partition can be moved again:

swift-ring-builder <builder_file> create <part_power> <replicas> <min_part_hours>

Once the builder file has been created, the administrator should add devices to the Ring. For each device, the required values are the zone number, the IP address of the Storage node, the port on which the server is listening, the device name (e.g. sdb1), and the device weight. Device metadata (e.g., model name, installation date or anything else) is optional:

swift-ring-builder <builder_file> add z<zone>-<ip>:<port>/<device_name>_<meta> <weight>

Device weight is used to distribute partitions between the devices. The higher the device weight, the more partitions are assigned to that device. The recommended initial approach is to use devices of the same size across the cluster and set a weight of 100.0 on each device. For devices added later, the weight should be proportional to the capacity. At this point, all devices that will initially be in the cluster should be added to the Ring. The consistency of the builder file can be verified before creating the actual Ring file:
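To see how weights translate into partition counts, the following sketch computes each device's proportional share of all partition replicas. It is a simplified estimate and not the RingBuilder algorithm, which also has to respect zones and min_part_hours. The function name `expected_partitions` and the example weights are made up.

```python
def expected_partitions(weights, part_power, replicas):
    """Estimate how many partition replicas each device should receive.

    weights maps a device name to its weight; the total weight must be
    greater than zero.
    """
    total_weight = sum(weights.values())
    total_assignments = (2 ** part_power) * replicas
    return {name: total_assignments * weight / total_weight
            for name, weight in weights.items()}


# Three devices of equal size plus one with twice the capacity.
weights = {"sda1": 100.0, "sdb1": 100.0, "sdc1": 100.0, "sdd1": 200.0}
shares = expected_partitions(weights, part_power=10, replicas=3)
for name in sorted(shares):
    print(name, round(shares[name], 1))
```

With these numbers the device with weight 200.0 is expected to hold twice as many partition replicas as each of the others.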

swift-ring-builder <builder_file>

After successful verification, the next step is to distribute partitions between devices and create the actual Ring file. This is called rebalancing the Ring. The process is designed to move as few partitions as possible to minimize the data exchange between nodes, so it is important that all necessary changes to the Ring are made before rebalancing it:

swift-ring-builder <builder_file> rebalance

The whole procedure must be repeated for all three rings: account, container and object. The resulting .ring.gz files should be pushed to all nodes in the cluster. Builder files are also needed for future changes to the rings, so they should be backed up and kept in a safe place. One approach is to store them in Swift itself as ordinary objects.
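The procedure can be summarized as a short script that only prints the swift-ring-builder commands for each ring, following the command forms shown above. The builder file names, ports, addresses, and the helper `ring_commands` are assumptions for illustration. Nothing here runs the commands.

```python
def ring_commands(ring_name, part_power, replicas, min_part_hours, devices):
    """Return the swift-ring-builder command lines for one ring, in order."""
    builder = ring_name + ".builder"  # hypothetical builder file name
    commands = [["swift-ring-builder", builder, "create",
                 str(part_power), str(replicas), str(min_part_hours)]]
    for dev in devices:
        spec = "z%(zone)d-%(ip)s:%(port)d/%(device)s" % dev
        if dev.get("meta"):
            spec += "_" + dev["meta"]
        commands.append(["swift-ring-builder", builder, "add",
                         spec, str(dev["weight"])])
    commands.append(["swift-ring-builder", builder])               # verify
    commands.append(["swift-ring-builder", builder, "rebalance"])  # write ring
    return commands


# Made-up ports, one per ring type, and two made-up devices.
ports = {"account": 6002, "container": 6001, "object": 6000}
for ring_name in ("account", "container", "object"):
    devices = [
        {"zone": 1, "ip": "10.0.0.1", "port": ports[ring_name],
         "device": "sdb1", "meta": "", "weight": 100.0},
        {"zone": 2, "ip": "10.0.0.2", "port": ports[ring_name],
         "device": "sdb1", "meta": "", "weight": 100.0},
    ]
    for command in ring_commands(ring_name, 18, 3, 1, devices):
        print(" ".join(command))
```

Generating the commands from one device list keeps the three rings consistent with each other, which matters because the same devices usually appear in all of them.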

Physical disk usage

A partition is essentially a block of data stored in the cluster. This does not mean, however, that disk usage is constant for all partitions. Distribution of objects between the partitions is based on the object path hash, not the object size or other parameters. Objects are not partitioned, which means that an object is kept as a single file in the storage node file system (except very large objects, greater than 5 GB, which can be uploaded in segments – see the Swift documentation).

The partition mapped to the storage device is actually a directory in the structure under /srv/node/<dev_name>. The disk space used by this directory may vary from partition to partition, depending on the size of the objects that have been placed in this partition by mapping the hash of the object path to the Ring.

In conclusion it should be said that the Swift Ring is a beautiful structure, though it lacks a degree of automation and synchronization between nodes. I’m going to write about how to solve these problems in one of the following posts.

More information

More information about the Swift Ring can be found in the following sources:
Official Swift documentation – base source for description of data structure
Swift Ring source code on Github – code base of Ring and RingBuilder Swift classes.
Blog of Chmouel Boudjnah – contains useful Swift hints

Hacking on Ebooks

Gentlemen prefer PDF, according to Tim O’Reilly’s data from Rough Cuts five years ago. At OpenStack we see some preference for PDF, though there are three times as many visits to the HTML version of our Compute Admin manual. Still, the PDF version of the guide is downloaded about five times a day. I do believe that gentlemen prefer PDF or some sort of book-like reading material. When asked, readers cite portability and search scope as two benefits to the form. However, as David Cramer, our doc tools developer put it at our recent hackathon, “PDFs are like cement.” With the boom of mobile and tablet screens, a stretchy and flexible screen-reader format like epub fills a need – we need content that works well on the 200 plus devices that fit into one hand.


So on 11/11/11, in the Austin Rackspace office, we did some hacking to be able to create epub output from our DocBook source files. Prior to the event, I wrote about some of our prep work in a post on my blog, “DocBook, ePub, Hackathon, what more could you ask for?”, which also went out to the OpenStack Planet blog.

I’m pleased to show you the results – we did get output for epub and also tested the Mobi output on a Kindle, all in one day, with a team of about seven hackers including developers, writers, and testers.

We first tested the process using built-in epub transforms that ship with Oxygen, our XML editor, which supports open source projects like OpenStack by donating licenses to documentation contributors. Thank you Oxygen! We were able to use that output to start testing. Here’s our white board with the list of bugs.

While the writers and testers were hacking on output, programmers were working on ensuring we could get the epub output through Apache Maven, our build tool. By the end of the day, we could output epub through our automatic build process also!

As happens with hackathons, there’s some cleanup work to do – for example, our neat-o dynamic SVG cover page that takes in variables like the book title doesn’t output a cover for the epub. Also, most “real” epub output workflows convert tables from text to image (I know, crazy huh, when you think about the loss of search capability), but the tables in epub output act a little funky when resizing. Also, mobi, the Kindle format, has a problem with the way lists are marked up, but these are fixable and on the bug log.

I haven’t decided yet whether the epub output is high quality enough to offer it for download for every book on the OpenStack docs site, nor do I know if there’s much demand for the output, but I’d like to offer the OpenStack Starter Guide as an epub download. The team at CSS OSS works hard on this content, and I’d like to see it get spread onto many devices. Let me know how well it works for you and if you think epub has a place as a regular output for OpenStack documentation.

Happy Ada Lovelace Day

Ada Lovelace day, October 7th, is a day for bloggers to write a story about an inspirational influence in their life in technology.

For me, there were two influential women in my life as an undergraduate chemistry student in the early 90s at Butler University in Indianapolis, Indiana. One was my first college chemistry professor, Anne McCowan, and the other was Butler’s scientific librarian, Mrs. Howes. Both influenced me through words, and by bringing the importance of words to my attention. Professor McCowan stated on the first day of class:

“Chemistry is a study of nomenclature. Once you understand the naming and vocabulary, the world of chemistry is opened to you.”

It was such a simplification of an intimidating subject that it crystallized the learning process for me. If I studied the vocabulary, the rest would follow. And here I am, combining the wonder of words and technology every day.

So today, Ada Lovelace day, I want to ask, how can OpenStack be a welcoming community for women in technology? I have ideas and want to share them with the community. These are both small ideas and large ideas.

  • Inspire girls when they’re young. I have volunteered with an organization called GirlStart here in Austin, Texas, and I think they’ve got the right idea: influence girls to enter technology in elementary and middle school and encourage them to go to college. A few years ago I went to lunch once a month with middle school girls where we talked about simple ideas such as “what does it mean to be smart?” That group of girls will be in high school now, and I hope they find technology a good path for them.
  • Invite women specifically. I spoke with Noirin Plunkett at OSCon this summer, and she said that women don’t necessarily have the confidence (or is it ego) to understand they are being specifically invited to participate in a tech initiative or open source project. You can make the invitation explicit, for example by saying to a group of female college students, “our project needs you specifically, not just your male colleagues.”
  • Start in your neighborhood, at your company. Since Rackspace is a huge supporter and founder of OpenStack, we want to ensure that we bring our women to the project and make them feel like Stackers are their kind of people. Stackers are professional, mature, and respectful of each other. We certainly have heated discussions but all input is valuable. I want to start locally by inviting women to Austin Cloud User Group meetings, by recruiting women for Rackspace jobs, and putting myself out there constantly, which is not always comfortable but it is rewarding.

How about your perspective here? Where will you start and when? Let’s take these first steps towards inviting more women to join our open source cloud computing efforts.

Documentation Contributors Styling Ts

Why give your time and efforts to an online community? Researchers like Peter Kollock have identified and studied reasons for people to contribute to online communities. I try to keep the basic principles of online participation in mind for documentation contributors all the time, and find ways to recognize the people making a difference with the docs. The motivating reasons for contributing to technical doc or offering technical support in a community include:

1. Reciprocity – Help out others who will help you later or already did help you out.
2. Reputation – Build your reputation as an expert in a given area.
3. Efficiency – Write it down so you save time later, either your own time or others’ time.
4. Attachment – Feel like you’re part of a bigger mission and vision.

It’s these motivating reasons, and the wish to help contributors find a place where they belong, that prompted me to send some t-shirts out last month. I also want to recognize the recipients’ efforts here on the blog! Here is the CSS Corp Open Source Service Team sporting their OpenStack t-shirts in a team photo, led by Murthyraju Manthena (far left). This team contributed the OpenStack Compute Starter Guide, which quickly jumped to the top ten list in the web stats. They’re working hard on revisions for Diablo, and this manual was a great addition to the OpenStack technical library for the Cactus release.

CSS Corp OSS Team led by Murthyraju Manthena

There’s also the sense of reciprocity – giving back your info since you got Volumes working in your environment. Here’s Razique Mahroua sporting his shiny new ringer T as well, after revising the entire Volume Management section of the Compute Administration Manual.

Wearing your OpenStack t-shirt is a great way to show you are a Stacker. I realize that sending t-shirts to contributors can seem like a small token of appreciation for the sweat poured into docs, but I like to send them anyway when I’m especially impressed with the dedication. These guys are also building a great reputation as OpenStack knowledge experts. They are also a huge reason why the number of doc contributors has jumped from six to nearly twenty in six months’ time!

OpenStack Documentation Blitz

I had a great idea come across my radar this week – a Documentation Blitz! I’ve been working on case studies for a second edition of my book, Conversation and Community: The Social Web for Documentation, and in one of the case studies from Sarah Maddox at Atlassian, I uncovered a gem of an idea. From Sarah:

We have also held a couple of documentation blitz tests. This is a very successful way of involving the development and support teams in testing the documentation just before the release date. The technical writers set up a plan, including a list of the documents to focus on and a couple of ways people can give us feedback. We usually include an IRC channel, as well as wiki pages and comments, so that the engineers can choose the way that suits them best. We allocate a time period, usually just an hour, and everyone dives into the documentation. The chat session goes wild, comments fly, and we end up with a lot of useful feedback.

I love this idea and want to experiment with it for OpenStack. Fortunately the timing is just right, with the Diablo release scheduled for September 22nd. So, here’s the plan.

On Monday September 19th, from 2:00-3:00 CST (Monday, September 19, 2011 at 17:00:00):

To get coverage on the other side of the globe, we’ll run the Doc Blitz for a second hour at 11:00 pm – 12:00 midnight CST (Tuesday, September 20, 2011 at 04:00:00).

Let’s go find some doc bugs!

Documentation Wrangling and Statistics Sharing

I’ve been tracking web analytics on the documentation site since we put it up in February, and I thought I’d share some of the more interesting nuggets of data I’ve mined. I believe the documentation statistics offer a crystal ball, a window showing the future of what’s up-and-coming for OpenStack. Let’s gaze together.

Flickr: pasukaru76

The docs.openstack.org site regularly tops 1,700 visits a day which is about 40,000 a month. Nearly 10% of visitors are site regulars, with 9-14 visits in a month, and new visitors account for over a third of the traffic. I find search and content analytics much more interesting than just site traffic, though.

At the top of the docs.openstack.org site is a custom search engine that searches the docs site, the wiki, and each developer doc site (such as nova.openstack.org). The engine is fine-tuned to only show results for the Cactus release documents in docs.openstack.org/cactus so that there aren’t a lot of duplicates with docs.openstack.org/trunk. Yesterday I further expanded the custom search engine to include the documentation for projects in docs.openstack.org/incubation, namely Keystone, the Identity Service for OpenStack. As a result, you can more easily find Keystone API documentation and Keystone developer documentation. Hopefully it means those of you tweeting that you can’t find the Keystone docs while you’re out shopping with your family can now find them no matter your mobile circumstances!

Last month, the top search term for the docs.openstack.org site was Quantum, which revealed the need for our newly incubated project Quantum to add more documentation. Fortunately Dan Wendlandt is on the case and working on developer and administrator documentation now. Also, the custom search engine gives results on the OpenStack wiki for Quantum.

We also have a rather fancy implementation of custom Event Tracking so I can track search data when a reader searches within a particular manual. We have data starting with mid-June. Popular searches once someone’s within a manual are glance, dashboard, vlan, floating, and zone. Interestingly, terms like accounting and billing show up in both the individual guides search and on the main search. I can extrapolate a couple of items from this type of data:

  1. People recognize project names, and the Image Service (glance) docs are embedded within the Compute book for the Cactus release. For Diablo, the Image Service will have its own set of books.
  2. The Dashboard had been trending for a while, so I put the docs in the Compute books prior to its incubation. That looks to be a good decision still.
  3. Accounting or billing solutions don’t exist in the OpenStack ecosystem yet, but people are certainly searching for them.

Our custom event tracking tells us that we’re also getting about 100 comments a month using the Disqus tool, and users are answering other users, which is excellent. Keep it up!

One additional tracking item that I find interesting is that downloading the PDF of the OpenStack Compute Admin Manual is in the top 10 exit pages. I think people get in, download what they need, and get out. PDF output is considerably more popular than I had realized. I guess a lot of people hop on a plane and read docs or want the manual at their bedside table to go to sleep with?

Hopefully this tracking doesn’t creep you out, because the data really can help me shape the future for OpenStack documentation. You can always opt out of these tracking devices, and I’m sure some of you do. Let me know if there are any other documentation insights you would like to know.

Community Weekly Newsletter (June 24 – July 1)

OpenStack Community Newsletter – July 1, 2011

This weekly newsletter is a way for the community to learn about all the various activities occurring on a weekly basis. If you would like to add content to a weekly update or have an idea about this newsletter, please email stephen.spector@openstack.org.


Musical Entertainment at OpenStack Meetup

HIGHLIGHTS

EVENTS

DEVELOPER COMMUNITY

GENERAL COMMUNITY

COMMUNITY STATISTICS (6/24 – 6/30)

  • Data Tracking Graphs – http://wiki.openstack.org/WeeklyNewsletter
  • OpenStack Compute (NOVA) Data
    • 12 Active Reviews
    • 279 Active Branches – owned by 78 people & 15 teams
    • 1382 commits by 65 people in last month
  • OpenStack Object Storage (SWIFT) Data
    • 1 Active Reviews
    • 67 Active Branches – owned by 22 people & 6 teams
    • 101 commits by 12 people in last month
  • OpenStack Image Registry (GLANCE) Data
    • 6 Active Reviews
    • 34 Active Branches – owned by 11 people & 5 teams
    • 164 commits by 12 people in last month
  • Twitter Stats for Week: #openstack 287 total tweets; OpenStack 762 total tweets (does not include RT)
  • Bugs Stats for Week: 500 Tracked Bugs; 76 New Bugs; 44 In-process Bugs; 6 Critical Bugs; 35 High Importance Bugs; 339 Bugs (Fix Committed)
  • Blueprints Stats for Week:  202 Blueprints; 9 Essential, 14 High, 16 Medium, 24 Low, 139 Undefined
  • OpenStack Website Stats for Week:  12,207 Visits, 30,317 Pageviews, 48.14% New Visits
    • Top 5 Pages: Home 40.51%; /projects 11.53%; /projects/compute 16.47%; /projects/storage 11.00%; /community 6.38%

OPENSTACK IN THE NEWS
