OpenStack Job Board

Published on July 21
Senior Site Reliability Engineer Storage

Wikimedia

Remote

Come work within the Technology department at the Wikimedia Foundation! We host a public OpenStack cluster, among other curated cloud services, for Wikimedia and related communities. We are looking for someone with experience using distributed storage with OpenStack, including Ceph or GlusterFS. Read on for more details!

Our team maintains Infrastructure as a Service, Platform as a Service, and Data as a Service products. The team works in partnership with the larger Wikimedia volunteer community to manage these environments (our Puppet repo is public, and yes, you can contribute to it!). Candidates should be comfortable communicating in public and asynchronous ways with volunteers and developers from around the world.

You’ll work remotely with a full-time distributed team, with members spread between Europe and North America, and your working hours will need to overlap with the team’s (UTC-4 to UTC+2). Some examples of the type of work you’ll be doing include:

And the backlog has even more details!

You will be responsible for:

  • Designing and implementing Cloud Services storage infrastructure and services. We’d like to use distributed storage to make it easier to back up our users’ data and to perform maintenance on our services.
  • Performing day-to-day operational tasks on Wikimedia’s Cloud Services infrastructure (deployment, maintenance, configuration, troubleshooting), and developing and supporting automation tools and processes for these tasks.
  • Participating in on-call rotation and support in a 24x7 environment

Skills and experience:

  • Experience in designing and operating Ceph or similar distributed storage clusters in production environments
  • Comfortable working and thriving within a Linux ecosystem
  • Software development skills in at least one of the following languages: Python, Go, JavaScript, or Ruby
  • B.S. or M.S. in Computer Science or a related field, or equivalent related work experience

Qualities that are important to us:

  • Share our values, appreciate our code of conduct, support our team norms, and work in accordance with all three
  • Strong English language skills and the ability to work independently as an effective part of a globally distributed team
  • Support of our users (volunteer and staff developers) using our service offerings
  • Passionate about the value of learning and growing together

Additionally, we’d love it if you have:

  • Used configuration management tools such as Puppet, Ansible, Chef, or SaltStack
  • Built data pipelines and/or worked with streaming real-time data
  • Used Kubernetes, Docker Swarm, Mesos, or similar container orchestration platforms
  • Operated an elastic computing environment such as OpenStack or CloudStack
  • Operated open source database systems like MySQL and PostgreSQL
  • Experience in serverless computing environments
  • Linux systems troubleshooting and debugging skills
  • Interest in open source software projects and communities

See https://grnh.se/60b4936e1us for more information and to apply.