{
  "$type": "site.standard.document",
  "coverImage": {
    "$type": "blob",
    "ref": {
      "$link": "bafkreiflo6xt7is6b2iafwghkjahlgggocme5jwjsbeuqqwcywuvjhmszm"
    },
    "mimeType": "image/png",
    "size": 24783
  },
  "path": "/abs/2602.04103v1",
  "publishedAt": "2026-02-05T01:00:00.000Z",
  "site": "https://arxiv.org",
  "tags": [
    "R. Groot Koerkamp"
  ],
  "textContent": "**Authors:** R. Groot Koerkamp\n\nGiven a text, a query $\\mathsf{rank}(q, c)$ counts the number of occurrences of character $c$ among the first $q$ characters of the text. Space-efficient methods to answer these rank queries form an important building block in many succinct data structures. For example, the FM-index is a widely used data structure that uses rank queries to locate all occurrences of a pattern in a text. In bioinformatics applications, the goal is usually to process a given input as fast as possible. Thus, data structures should have high throughput when used with many threads. Contributions. For the binary alphabet, we develop BiRank with 3.28% space overhead. It merges the central ideas of two recent papers: (1) we interleave (inline) offsets in each cache line of the underlying bit vector [Laws et al., 2024], reducing cache-misses, and (2) these offsets are to the middle of each block so that only half of them need popcounting [Gottlieb and Reinert, 2025]. In QuadRank (14.4% space overhead), we extend these techniques to the $σ=4$ (DNA) alphabet. Both data structures require only a single cache miss per query, making them highly suitable for high-throughput and memory-bound settings. To enable efficient batch-processing, we support prefetching the cache lines required to answer upcoming queries. Results. BiRank and QuadRank are around $1.5\\times$ and $2\\times$ faster than similar-overhead methods that do not use inlining. Prefetching gives an additional $2\\times$ speedup, at which point the dual-channel DDR4 RAM bandwidth becomes a hard limit on the total throughput. With prefetching, both methods outperform all other methods apart from SPIDER [Laws et al., 2024] by $2\\times$. When using QuadRank with prefetching in a toy count-only FM-index, QuadFm, this results in a smaller size and up to $4\\times$ speedup over Genedex, a state-of-the-art batching FM-index implementation.",
  "title": "QuadRank: Engineering a High Throughput Rank"
}