TY - GEN
T1 - Diff-index
T2 - 17th International Conference on Extending Database Technology, EDBT 2014
AU - Tan, Wei
AU - Tata, Sandeep
AU - Fong, Liana
AU - Tang, Yuzhe
PY - 2014
Y1 - 2014
N2 - Log-Structured-Merge (LSM) Tree gains much attention recently because of its superior performance in write-intensive workloads. LSM Tree uses an append-only structure in memory to achieve low write latency; at memory capacity, in-memory data are flushed to other storage media (e.g. disk). Consequently, read access is slower comparing to write. These specific features of LSM, including no in-place update and asymmetric read/write performance raise unique challenges in index maintenance for LSM. The structural difference between LSM and B-Tree also prevents mature B-Tree based approaches from being directly applied. To address the issues of index maintenance for LSM, we propose Diff-Index to support a spectrum of index maintenance schemes to suit different objectives in index consistency and performance. The schemes consist of sync-full, sync-insert, async-simple and async-session. Experiments on our HBase implementation quantitatively demonstrate that Diff-Index offers various performance/consistency balance and satisfactory scalability while avoiding global coordination. Syncinsert and async-simple can reduce 60%-80% of the overall index update latency when compared to the baseline syncfull; async-simple can achieve superior index update performance with an acceptable inconsistency. Diff-Index exploits LSM features such as versioning and the flush-compact process to achieve goals of concurrency control and failure recovery with low complexity and overhead. Diff-Index is included in IBM InfoSphere BigInsights, an IBM big data offering.
AB - Log-Structured-Merge (LSM) Tree gains much attention recently because of its superior performance in write-intensive workloads. LSM Tree uses an append-only structure in memory to achieve low write latency; at memory capacity, in-memory data are flushed to other storage media (e.g. disk). Consequently, read access is slower comparing to write. These specific features of LSM, including no in-place update and asymmetric read/write performance raise unique challenges in index maintenance for LSM. The structural difference between LSM and B-Tree also prevents mature B-Tree based approaches from being directly applied. To address the issues of index maintenance for LSM, we propose Diff-Index to support a spectrum of index maintenance schemes to suit different objectives in index consistency and performance. The schemes consist of sync-full, sync-insert, async-simple and async-session. Experiments on our HBase implementation quantitatively demonstrate that Diff-Index offers various performance/consistency balance and satisfactory scalability while avoiding global coordination. Syncinsert and async-simple can reduce 60%-80% of the overall index update latency when compared to the baseline syncfull; async-simple can achieve superior index update performance with an acceptable inconsistency. Diff-Index exploits LSM features such as versioning and the flush-compact process to achieve goals of concurrency control and failure recovery with low complexity and overhead. Diff-Index is included in IBM InfoSphere BigInsights, an IBM big data offering.
KW - Cap theorem
KW - Distributed databases
KW - Failure recovery
KW - Log-Structured Merge (LSM) tree
KW - Secondary index
UR - http://www.scopus.com/inward/record.url?scp=84941233947&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84941233947&partnerID=8YFLogxK
U2 - 10.5441/002/edbt.2014.76
DO - 10.5441/002/edbt.2014.76
M3 - Conference contribution
AN - SCOPUS:84941233947
T3 - Advances in Database Technology - EDBT 2014: 17th International Conference on Extending Database Technology, Proceedings
SP - 700
EP - 711
BT - Advances in Database Technology - EDBT 2014
A2 - Leroy, Vincent
A2 - Christophides, Vassilis
A2 - Christophides, Vassilis
A2 - Idreos, Stratos
A2 - Kementsietsidis, Anastasios
A2 - Garofalakis, Minos
A2 - Amer-Yahia, Sihem
PB - OpenProceedings.org, University of Konstanz, University Library
Y2 - 24 March 2014 through 28 March 2014
ER -