Deconstructing RDMA-enabled Transaction Processing: Hybrid Is Better!

Authors:
Xingda Wei Shanghai Jiao Tong University
Zhiyuan Dong Shanghai Jiao Tong University
Rong Chen Shanghai Jiao Tong University
Haibo Chen Shanghai Jiao Tong University

Introduction:

There is currently an active debate on which RDMA primitive (i.e., one-sided or two-sided) is optimal for distributed transactions. In this paper, the authors perform a systematic comparison between different RDMA primitives with a combination of various optimizations using representative OLTP workloads. They then investigate the implementation of optimistic concurrency control (OCC) by comparing different RDMA primitives using a phase-by-phase approach with various transactions from TPC-C, SmallBank, and TPC-E. The results show that no single primitive (one-sided or two-sided) wins over the other on all phases. An end-to-end comparison of prior designs on the same codebase further finds that none of them is optimal. Based on these studies, the authors build DrTM+H, a hybrid distributed transaction system that outperforms pure one-sided and two-sided systems by up to 1.89X and 2.96X for TPC-C, with over 49% and 65% latency reduction.

Abstract:

There is currently an active debate on which RDMA primitive (i.e., one-sided or two-sided) is optimal for distributed transactions. Such a debate has led to a number of optimizations based on one RDMA primitive, each of which was shown to perform better than the other primitive. In this paper, we perform a systematic comparison between different RDMA primitives with a combination of various optimizations using representative OLTP workloads. More specifically, we first implement and compare different RDMA primitives with existing and our new optimizations upon a single well-tuned execution framework. This gives us insights into the performance characteristics of different RDMA primitives. Then we investigate the implementation of optimistic concurrency control (OCC) by comparing different RDMA primitives using a phase-by-phase approach with various transactions from TPC-C, SmallBank, and TPC-E. Our results show that no single primitive (one-sided or two-sided) wins over the other on all phases. We further conduct an end-to-end comparison of prior designs on the same codebase and find none of them is optimal. Based on the above studies, we build DrTM+H, a new hybrid distributed transaction system that always embraces the optimal RDMA primitives at each phase of transactional execution. Evaluations using popular OLTP workloads including TPC-C and SmallBank show that DrTM+H achieves over 7.3 and 90.4 million transactions per second, respectively, on a 16-node RDMA-capable cluster (ConnectX-4), without assuming locality. For TPC-C, this outperforms the pure one-sided and two-sided systems by up to 1.89X and 2.96X, with over 49% and 65% latency reduction. Further, DrTM+H scales well with a large number of connections on modern RDMA networks.

You may want to know: