TY - GEN
T1 - Leveraging Keys In Key-Value SSD for Production Workloads
AU - Saha, Manoj P.
AU - Desai, Omkar
AU - Kim, Bryan S.
AU - Bhimani, Janki
N1 - Publisher Copyright: © 2023 Owner/Author.
PY - 2023/8/7
Y1 - 2023/8/7
AB - Key-Value SSDs (KV-SSDs) reduce host-side resource utilization for unstructured data management by streamlining the I/O stack. However, designing a robust KV-SSD with resource-constrained flash controllers has always been a challenge. The key-to-page (K2P) mapping inside a KV-SSD, which consolidates multiple layers of indirection in the traditional block I/O storage stack, has its own shortcomings. The sparsely populated NVMe KV namespace leads to a very large index, which cannot be optimized in the same way as a hybrid- or block-FTL in block-SSDs. In addition, background index management tasks (e.g., compaction on an LSM-tree index) also degrade performance. Moreover, existing KV index designs are not equipped to handle fast-changing workload patterns. These shortcomings have stalled the adoption of KV-SSDs in production environments. In this work, we take the position that these shortcomings can be addressed by leveraging the information that keys carry, as prefixes, about application keyspaces and groups. The prefixes can be used to partition the large monolithic index into smaller ones. We demonstrate a naive prefix-based index partitioning mechanism inside a KV-SSD that can reduce on-flash index accesses for multiple production workloads, and we discuss the shortcomings of this approach. Lastly, we discuss our proposed design of a society of indices that initialize, interact, and evolve based on workload characteristics over time.
KW - KV indexing
KW - data storage
KW - key prefix
KW - key-value SSD
UR - http://www.scopus.com/inward/record.url?scp=85169584039&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85169584039&partnerID=8YFLogxK
U2 - 10.1145/3588195.3595949
DO - 10.1145/3588195.3595949
M3 - Conference contribution
AN - SCOPUS:85169584039
T3 - HPDC 2023 - Proceedings of the 32nd International Symposium on High-Performance Parallel and Distributed Computing
SP - 327
EP - 328
BT - HPDC 2023 - Proceedings of the 32nd International Symposium on High-Performance Parallel and Distributed Computing
PB - Association for Computing Machinery, Inc
T2 - 32nd International Symposium on High-Performance Parallel and Distributed Computing, HPDC 2023
Y2 - 16 June 2023 through 23 June 2023
ER -