Boston
May 8-11, 2017

Event Details

Please note: All times listed below are in the Eastern Time Zone (UTC-4)


When Dataverse Meets OpenStack...

Cloud Dataverse is a new service for accessing and processing public data sets in an OpenStack cloud. It is based on Dataverse, a popular framework for sharing, preserving, and analyzing research data. Cloud Dataverse extends Dataverse to replicate datasets from per-institution repositories to a cloud-based repository and to store the data in Swift, so that applications running in the cloud can access the data in situ. We use OpenStack Sahara to launch on-demand Big Data applications that use Swift as a data source for analytics jobs running on Hadoop, Spark, or Pig.

We follow the user's journey through Cloud Dataverse: browsing datasets, the harvesting/replication process, viewing files in the object store, and using the compute provided by Sahara. To improve the Sahara user experience, we plan to generate default cluster templates automatically through a new UI, giving users an option to bypass the complexity of Horizon.


What can I expect to learn?
  • The features of the existing Dataverse project
  • The relevant new functionality that allows the integration of Dataverse with OpenStack
  • The basics of OpenStack Sahara
Wednesday, May 10, 11:00am-11:40am (3:00pm - 3:40pm UTC)
Difficulty Level: Intermediate
Technical Lead / Architect
Gustavo Durand works at Harvard University's Institute for Quantitative Social Science as the Technical Lead and Architect of the Dataverse application, an open source web application for publishing, citing, analyzing, and preserving data. He began his Java programming career at Cambridge Technology Partners in 1997 and has more than 20 years of total experience as a software developer and...
Software Engineer
Jeremy is currently a software engineer at Red Hat, focused on OpenStack. He is the PTL of Sahara for the Train and Ussuri release cycles and has been a core contributor to that project since Pike. He gained his knowledge of OpenStack and cloud computing during many years at the Massachusetts Open Cloud.
Senior Software Developer
Leonid Andreev works at Harvard University's Institute for Quantitative Social Science as the Senior Software Developer of Dataverse, an open source web application for publishing, citing, analyzing, and preserving data. He has more than 20 years of experience as an application developer and systems programmer. He has worked on all layers of application development, from front to back end, to...