Bring Your Own Data Structures to Datalog

Arash Sahebolamri, Langston Barrett, Scott Moore, Kristopher Micinski

Research output: Contribution to journalArticlepeer-review

1 Scopus citations

Abstract

The restricted logic programming language Datalog has become a popular implementation target for deductive-analytic workloads including social-media analytics and program analysis. Modern Datalog engines compile Datalog rules to joins over explicit representations of relations - often B-trees or hash maps. While these modern engines have enabled high scalability in many application domains, they have a crucial weakness: achieving the desired algorithmic complexity may be impossible due to representation-imposed overhead of the engine's data structures. In this paper, we present the "Bring Your Own Data Structures"(Byods) approach, in the form of a DSL embedded in Rust. Using Byods, an engineer writes logical rules which are implicitly parametric on the concrete data structure representation; our implementation provides an interface to enable "bringing their own"data structures to represent relations, which harmoniously interact with code generated by our compiler (implemented as Rust procedural macros). We formalize the semantics of Byods as an extension of Datalog's; our formalization captures the key properties demanded of data structures compatible with Byods, including properties required for incrementalized (semi-naïve) evaluation. We detail many applications of the Byods approach, implementing analyses requiring specialized data structures for transitive and equivalence relations to scale, including an optimized version of the Rust borrow checker Polonius; highly-parallel PageRank made possible by lattices; and a large-scale analysis of LLVM utilizing index-sharing to scale. Our results show that Byods offers both improved algorithmic scalability (reduced time and/or space complexity) and runtimes competitive with state-of-the-art parallelizing Datalog solvers.

Original languageEnglish (US)
Pages (from-to)1198-1223
Number of pages26
JournalProceedings of the ACM on Programming Languages
Volume7
Issue numberOOPSLA2
DOIs
StatePublished - Oct 16 2023

Keywords

  • Datalog
  • Logic Programming
  • Program Analysis
  • Static Analysis

ASJC Scopus subject areas

  • Software
  • Safety, Risk, Reliability and Quality

Fingerprint

Dive into the research topics of 'Bring Your Own Data Structures to Datalog'. Together they form a unique fingerprint.

Cite this