Swift suffers significant performance degradation when it has to handle large numbers of small files (0 to 100 KB), because each object is stored as a separate file on the backend. At iQiyi, we have developed a new durability interface for Hummingbird, a reimplementation of Swift in Go. Benchmarks show that write performance with the new interface is about 3x faster under heavy concurrent writes, without sacrificing read performance. Just as importantly, performance does not degrade as the number of small objects grows.
In this talk, we will discuss the design and implementation of the new durability interface, including:
- Packing multiple small objects into a single POSIX file
- Managing metadata and index data effectively with RocksDB
- Locating objects without modifying the current ring mechanism
- Replicating objects incrementally, based on the RocksDB transaction log, to achieve eventual consistency
- Reclaiming space from deleted objects with a background garbage collector
Attendees will learn:
- Why Swift performance degrades in workloads with lots of small files
- The on-disk format of the new disk file interface, in particular how small objects are packed
- How to manage object metadata and index data with RocksDB
- How to locate the physical position of an object using the existing ring mechanism and the index
- How to implement a new incremental object replicator based on the RocksDB transaction log to achieve eventual consistency
- How to implement a background garbage collector to reclaim space from deleted objects