Abstract

With the increasing amount of main memory available in standard servers, in-memory database systems have become increasingly popular. In-memory database systems store most or all relevant data in main memory. Many systems divide the data into two parts: a main storage optimized for read operations, and a differential buffer optimizing write operations. The data in the differential buffer is merged into the main storage after a certain time. In that process, the values in the main storage must first be decompressed, merged with the differential buffer, and finally compressed again. This thesis studies the use of SIMD instructions to accelerate this merge process.
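
To make the merge step concrete, the following is a minimal, illustrative sketch only (not the thesis implementation): "compression" is reduced to plain dictionary encoding, whereas the actual system works on bit-packed codes and, in this thesis, uses SIMD routines for the decompression and compression phases.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Simplified merge: decompress the main storage, append the differential
// buffer, rebuild the dictionary, and re-compress all values into codes.
std::vector<uint32_t> merge(const std::vector<uint32_t>& main_codes,
                            const std::vector<std::string>& main_dict,
                            const std::vector<std::string>& delta_values) {
    // 1. Decompress the main storage back into plain values.
    std::vector<std::string> values;
    for (uint32_t code : main_codes) values.push_back(main_dict[code]);
    // 2. Merge in the differential buffer.
    values.insert(values.end(), delta_values.begin(), delta_values.end());
    // 3. Rebuild a sorted, duplicate-free dictionary and re-compress.
    std::vector<std::string> dict = values;
    std::sort(dict.begin(), dict.end());
    dict.erase(std::unique(dict.begin(), dict.end()), dict.end());
    std::vector<uint32_t> codes;
    for (const auto& v : values)
        codes.push_back(static_cast<uint32_t>(
            std::lower_bound(dict.begin(), dict.end(), v) - dict.begin()));
    return codes;
}

int main() {
    std::vector<std::string> dict = {"apple", "pear"};
    std::vector<uint32_t> main_codes = {0, 1, 0};
    std::vector<std::string> delta = {"banana", "pear"};
    auto merged = merge(main_codes, dict, delta);  // codes over the rebuilt dictionary
    return 0;
}
```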

Since several papers have already focused on accelerating the decompression process with SIMD instructions, this thesis focuses on compression. We also work on combining the decompression and compression steps into a single operation called rollover.

Implementing SIMD compression, on the one hand, reduces the cost of memory access, since more values are read at a time and processed in parallel; on the other hand, it makes the algorithms more complex: compressed values are not always byte-aligned and may cross byte boundaries, so a single value cannot be manipulated with a single instruction.
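
The boundary-crossing problem can be illustrated with a small sketch (assumed 6-bit codes, not taken from the thesis): even in scalar code, a value whose bit offset is not a multiple of 8 must be assembled from two adjacent bytes with shifts and masks.

```cpp
#include <cstdint>
#include <cstdio>
#include <vector>

int main() {
    const unsigned bits = 6;               // assumed compressed value width
    std::vector<uint8_t> buf(8, 0);         // packed output buffer
    uint32_t values[] = {42, 17, 63, 5};    // example values, all < 2^6

    // Pack the values back to back; the second value (bits 6..11) straddles
    // the boundary between byte 0 and byte 1.
    unsigned bitpos = 0;
    for (uint32_t v : values) {
        unsigned byte = bitpos / 8, off = bitpos % 8;
        buf[byte] |= static_cast<uint8_t>(v << off);
        if (off + bits > 8)                  // value crosses the byte boundary
            buf[byte + 1] |= static_cast<uint8_t>(v >> (8 - off));
        bitpos += bits;
    }

    // Unpacking the second value requires reading parts of two bytes.
    uint32_t second = (buf[0] >> 6) | (static_cast<uint32_t>(buf[1] & 0x0F) << 2);
    std::printf("unpacked second value: %u\n", second);  // prints 17
    return 0;
}
```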

The results of this thesis show that our fastest SIMD compression routine is about 40% faster than its scalar counterpart in most cases, with some cases reaching a speedup of up to 2.4x. For rollover, the fastest version is 6.6x as fast as the naive version.