Qloverleaf - An Overpass QL interpreter using the QLever database

did:plc:4tuge3k3comfj4nfvqnwkemn June 8, 2026

I’ve been working recently on a personal project to investigate how much of the Overpass query language could be implemented using a different back end database, namely the QLever database.

Overpass is the most widely used public interface for querying OSM data. It is relatively easy to use and understand, and its outputs in GeoJSON or OSM XML are readily integrated into OSM or other geospatial tool chains for visualization or other processing.

But Overpass is something of a victim of its own success. Demand for the public Overpass service has recently outpaced server capacity. My earlier work on a container image for Overpass was intended to help scale capacity by making it easier for data consumers to run their own local Overpass servers.

This project looks in a different direction. What if Overpass QL, with its ease of use and its integration into other tools, were available on top of a different data source?

QLever

QLever is a SPARQL database developed by the Chair for Algorithms and Data Structures at the University of Freiburg. SPARQL is an RDF query language where the data are represented as “triples” of a subject, predicate, and object.

QLever represents geospatial data as WKT and can perform geospatial operations on WKT. The osm2rdf conversion generates WKT data for every OSM element, so the data in QLever includes the complete geometry from OSM.

But the key to QLever’s performance with OSM data is that in the conversion from OSM PBF to TTL (Terse RDF Triple Language), osm2rdf generates triples for every element to describe their spatial relations with other elements. That is, osm2rdf pre-computes the spatial relations sfIntersects, sfContains, sfCovers, sfTouches, sfCrosses, sfOverlaps, and sfEquals between all OSM elements.

QLever excels at querying RDF triples. And the precomputed spatial relations for imported OSM data make many types of queries blindingly fast in QLever. Want to find all the features within a closed way or relation? That answer is one lookup away in QLever.
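
As a rough illustration of what "one lookup away" means, the sketch below builds a SPARQL query that asks for everything a relation contains through a single precomputed triple pattern. The prefix IRIs and the ogc:sfContains predicate name are assumptions about the osm2rdf vocabulary, and the relation id and the function name are hypothetical.

```python
# Prefix IRIs and the ogc:sfContains predicate are assumptions about the
# osm2rdf vocabulary; check them against the data set actually loaded.
PREFIXES = (
    "PREFIX ogc: <http://www.opengis.net/rdf#>\n"
    "PREFIX osmrel: <https://www.openstreetmap.org/relation/>\n"
)


def contained_features_query(relation_id: int, limit: int = 100) -> str:
    """Build a SPARQL query listing the elements that a relation contains."""
    return (
        f"{PREFIXES}"
        "SELECT ?feature WHERE {\n"
        # One triple pattern: the containment relation is already stored.
        f"  osmrel:{relation_id} ogc:sfContains ?feature .\n"
        "}\n"
        f"LIMIT {limit}\n"
    )


query = contained_features_query(12345)  # hypothetical relation id
print(query)
```

No geometry is evaluated at query time in this form; the spatial test was done once during conversion.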

Differences

The imported RDF triples for OSM data also contain all the structural relations between elements. Want to recurse down to find all the ways and nodes that are direct or indirect members of a relation? Just follow the path through the triples. Traversing upward to find all the parent elements that refer to a node or a way just reverses that path.

That upward path to find parents is something that Overpass struggles with. Overpass has the original structural relationships: a way has member nodes, a relation has member relations, ways, or nodes. But it does not maintain the reverse relationships. To find the parents of a node, for example, Overpass must do a full scan of elements in the same spatial bin to search for references to the node.
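
The difference between the two lookup directions can be sketched with plain dictionaries. This is only a toy model with made-up element ids; it does not reflect how either Overpass or QLever actually stores data.

```python
from collections import defaultdict

# Forward structure: each parent element lists its members (toy data).
members = {
    "way/10": ["node/1", "node/2"],
    "way/11": ["node/2", "node/3"],
    "relation/100": ["way/10", "way/11"],
}


def parents_by_scan(child):
    """With forward references only, every candidate parent must be checked."""
    return {parent for parent, kids in members.items() if child in kids}


def build_reverse_index(forward):
    """Invert the member lists once so that parent lookups become direct."""
    reverse = defaultdict(set)
    for parent, kids in forward.items():
        for kid in kids:
            reverse[kid].add(parent)
    return reverse


reverse = build_reverse_index(members)
print(parents_by_scan("node/2"))  # scans all parents
print(reverse["node/2"])          # single dictionary lookup
```

Both calls return the same parents; only the amount of work per lookup differs.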

On the other hand, QLever does not have a direct geospatial index for OSM data. Finding OSM elements by spatial properties such as intersection with or containment in other elements is trivial in QLever. But finding OSM elements within a bounding box or other arbitrary region requires a full scan over all the elements’ geometry.
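
A minimal sketch of why a bounding box filter is expensive without a spatial index: every element has to be visited and tested. The point data and function names below are invented for illustration, and real elements would carry WKT geometries rather than a single coordinate pair.

```python
# Toy elements: id -> (lat, lon). Real data would hold WKT geometries.
points = {
    "node/1": (32.70, -116.00),
    "node/2": (33.50, -115.00),
    "node/3": (32.80, -115.90),
}


def in_bbox(lat, lon, south, west, north, east):
    """Test one coordinate pair against an Overpass-style (s, w, n, e) box."""
    return south <= lat <= north and west <= lon <= east


def bbox_scan(south, west, north, east):
    """Without a spatial index, every element's geometry is tested."""
    visited = 0
    hits = []
    for element_id, (lat, lon) in points.items():
        visited += 1
        if in_bbox(lat, lon, south, west, north, east):
            hits.append(element_id)
    return hits, visited


hits, visited = bbox_scan(32.68870, -116.14417, 32.88870, -115.84417)
print(hits, visited)
```

The number of elements visited equals the size of the data set regardless of how small the box is.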

QLever also does not store the version history of OSM elements as Overpass can. This notable gap is discussed in more detail below.

Overpass treats all scalar data types as strings, unless they are used in a numeric context and they “look” like numbers, in which case they are treated as numbers. Unlike the dynamic typing in Overpass, SPARQL has strict data typing for scalars with explicit type conversions required in many instances.
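
The contrast can be sketched as follows. The first two functions imitate the "looks like a number" behavior in a simplified way and are not the exact Overpass rules; the last one emits a SPARQL filter with an explicit xsd:integer cast. All function names are hypothetical.

```python
def as_number(value):
    """Return a float if the string looks numeric, otherwise None."""
    try:
        return float(value)
    except ValueError:
        return None


def overpass_style_less_than(a, b):
    """Compare numerically when both sides look numeric, else as strings."""
    na, nb = as_number(a), as_number(b)
    if na is not None and nb is not None:
        return na < nb
    return a < b


def sparql_numeric_filter(var, limit):
    """In SPARQL the conversion has to be written out explicitly."""
    return f"FILTER(xsd:integer(?{var}) < {limit})"


print(overpass_style_less_than("9", "10"))   # numeric comparison
print(overpass_style_less_than("9", "10a"))  # falls back to string comparison
print(sparql_numeric_filter("maxspeed", 50))
```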

In addition, the Overpass query language is imperative. Each statement specifies an operation to be performed and changes the state of the result sets. In contrast, SPARQL is a declarative language. Each statement describes constraints on the result set. One of the challenges of this project was to explore how the one language could be mapped to the other.

Grammar

The main references for the Overpass query language are the Overpass QL language reference and Overpass language guide on the OpenStreetMap Wiki. These two references are somewhat informal and sometimes don’t explain all the details of the query language or its grammar.

This project started with detailed research into Overpass QL using a local v0.7.62.11 Overpass server and the latest source code. The result of that research is a formal W3C EBNF grammar for Overpass QL.

The research into the Overpass QL grammar also looked into the Overpass XML query format and found that this is a direct transformation of the internal AST within the Overpass implementation. Because of the direct relationship between the Overpass XML query format and the internal Overpass implementation, several aspects of the Overpass XML query format could not be expressed using a standard XSD notation.

Because the Overpass XML query format cannot be fully described by an XSD schema and directly exposes Overpass implementation details, it was unsuitable for the Qloverleaf project.

Implementation

The proof-of-concept project was developed in Python.

One of the goals of the Qloverleaf proof-of-concept was to reproduce the Overpass API as faithfully as possible. Overpass implements a CGI application which is typically exposed via an Apache or Nginx web server. The POC project uses Uvicorn and Starlette to directly implement an HTTP endpoint. Qloverleaf’s inputs and outputs are directly compatible with Overpass, so it works as a drop-in replacement for an Overpass server.

Qloverleaf includes errors and warnings from the query parsing and processing in the output using the same format as Overpass, so the errors and warnings are properly displayed by Overpass Turbo.

Query Parsing and Analysis

Qloverleaf statically types the result data sets and scalar values in Overpass queries and annotates individual set references with a semi-single static assignment name – essentially an individual version for each result set assignment.
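
The versioning idea can be sketched as a counter per set name: each assignment produces a fresh versioned name, and each read resolves to the latest one. The class and the name#version format below are hypothetical and are not taken from the Qloverleaf source.

```python
from collections import defaultdict


class SetVersioner:
    """Hand out a fresh versioned name each time a result set is assigned."""

    def __init__(self):
        self._versions = defaultdict(int)

    def assign(self, name):
        """Record a new assignment and return its versioned name."""
        self._versions[name] += 1
        return f"{name}#{self._versions[name]}"

    def current(self, name):
        """Return the versioned name that a read of this set refers to."""
        if name not in self._versions:
            raise KeyError(f"set {name!r} is read before it is assigned")
        return f"{name}#{self._versions[name]}"


versions = SetVersioner()
print(versions.assign("_"))      # first statement writes the default set
print(versions.assign("_"))      # second statement overwrites it
print(versions.current("_"))     # a later read sees the second version
print(versions.assign("roads"))  # a named set gets its own counter
```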

Static typing allows Qloverleaf to detect semantic errors such as using a data set that cannot have the required element types in a context where those element types are required – or using an empty data set as input. Static typing also allows Qloverleaf to detect cases where scalars of incompatible types are used together in an expression.
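
A minimal sketch of such a check, assuming each result set is annotated with the element types it can possibly hold. The function name and the error wording are hypothetical.

```python
NODE, WAY, REL = "node", "way", "relation"


def check_input(required, available, context):
    """Return error messages if the input set cannot satisfy the context."""
    if not available:
        return [f"{context}: input set is always empty"]
    if not (required & available):
        return [f"{context}: input set cannot contain {sorted(required)}"]
    return []


# A set produced by a node-only query can never feed a recursion that
# needs relations as input.
print(check_input({REL}, {NODE}, "way(r)"))
# A set that may hold ways is acceptable input for a way-to-node recursion.
print(check_input({WAY}, {NODE, WAY}, "node(w)"))
```

Because the check works on possible types rather than actual data, it can run before any query is sent to the database.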

Qloverleaf also statically determines whether the query is “constrained.” An “unconstrained” query will cause QLever to perform a full scan of the entire data set, either as an intermediate query step or while assembling an impractically large output set.

Some Overpass queries are obviously too broad to be executed. For example, nwr; would return every node, way, and relation in the entire OSM data set. However, some Overpass queries unavoidably translate into QLever queries that are unworkable, even when they appear to operate on relatively small data sets.

A query like nwr(32.68870,-116.14417,32.88870,-115.84417); would seem to be a perfectly reasonable way to download all the data within a relatively small area. But the QLever translation of this query would load every node, way, and relation in the entire database and scan through each of them individually to find the features within the bounding box. In practice, this “unconstrained” query will fail in QLever either by timing out or running out of memory.
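
The classification can be sketched as a check on the kinds of filters attached to a statement. Which filter kinds count as selective here is an assumption made for illustration, not Qloverleaf's actual rule set, and the Statement class is hypothetical.

```python
from dataclasses import dataclass, field

# Filter kinds treated as selective in this sketch (an assumption).
# A bounding box is deliberately absent: without a spatial index it
# does not narrow the scan.
SELECTIVE = {"tag", "id", "area", "recurse"}


@dataclass
class Statement:
    element_type: str
    filters: list = field(default_factory=list)


def is_constrained(stmt):
    """A statement is constrained if at least one filter is selective."""
    return any(kind in SELECTIVE for kind in stmt.filters)


print(is_constrained(Statement("nwr")))                   # nwr;
print(is_constrained(Statement("nwr", ["bbox"])))         # nwr(s,w,n,e);
print(is_constrained(Statement("nwr", ["bbox", "tag"])))  # nwr[k=v](s,w,n,e);
```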

Query Translation

Qloverleaf translates each Overpass statement into a structured SPARQL pattern with components for each part of a SPARQL query and a named result variable to match the versioned output set assigned during transformation and augmentation.

The “evaluators” in Overpass QL (i.e., expressions) are translated first to evaluator patterns as components of expressions with supporting SPARQL query clauses where needed, and then composed into SPARQL clauses that become part of the statement’s SPARQL pattern.

As each statement is translated, it retains references to the SPARQL patterns that generate the input sets that it depends on. This maps the data flow through result sets from static analysis to a relationship between SPARQL queries.

In Qloverleaf, execution of query statements is deferred until a materialized result set is required – typically at an out statement.

When a statement needs to be executed, its SPARQL pattern is composed into a complete SPARQL query by merging SPARQL patterns for the statement’s input sets into the query. This allows SPARQL queries to be composed using constraints from the chain of Overpass QL statements that build the statement’s output set. Effectively, each materialized data set (e.g., at an out statement) maps to a single QLever query composed of all of the constraints that build the output data set.
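
A simplified sketch of this deferred composition: each pattern keeps references to the patterns that produce its inputs, and a full query is assembled only when a result has to be materialized. The Pattern class, the variable names, and the triple patterns are hypothetical, and prefix declarations are omitted for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Pattern:
    result_var: str
    clauses: list
    inputs: list = field(default_factory=list)


def compose(pattern, seen=None):
    """Collect clauses from input patterns first, then the pattern's own."""
    seen = set() if seen is None else seen
    if pattern.result_var in seen:  # a shared input is merged only once
        return []
    seen.add(pattern.result_var)
    clauses = []
    for dep in pattern.inputs:
        clauses.extend(compose(dep, seen))
    clauses.extend(pattern.clauses)
    return clauses


def to_query(pattern):
    """Build one SELECT query for the set that needs to be materialized."""
    body = "\n".join(f"  {clause}" for clause in compose(pattern))
    return f"SELECT ?{pattern.result_var} WHERE {{\n{body}\n}}"


# Roughly: area[name="Imperial County"]; way(area); out;
area = Pattern("a_1", ['?a_1 osmkey:name "Imperial County" .'])
ways = Pattern(
    "w_1",
    ["?a_1 ogc:sfContains ?w_1 .", "?w_1 rdf:type osm:way ."],
    inputs=[area],
)
print(to_query(ways))
```

Nothing is sent to the database while the patterns are being built; only the final call produces a query, and it carries the constraints of every statement in the chain.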

While this does result in some large and complex SPARQL queries, it prevents round trips to and from the QLever server with large materialized data sets. And it permits the query planner within QLever to optimize the query strategy – which it does very well.

Area Handling

Areas are derived data types in both Overpass and QLever and the derivations differ between the two systems. Overpass includes untagged closed ways in its area derivation and derives areas as a separate data type. QLever (via osm2rdf) does not treat untagged closed ways as areas, but it assigns area attributes to all relations and closed tagged ways.

Qloverleaf preserves QLever’s area model. Relations and tagged ways can be used directly in area queries without adaptation. This differs slightly from Overpass, where ways and relations must be converted to the derived area type before they can be used in area queries.

For example, Qloverleaf allows both rel[name="Imperial County"]; way(area)... and area[name="Imperial County"]; way(area)... and produces identical results for both forms, where Overpass would require area[...] for the query to work.

OSM Version History

Overpass has the very powerful capability of querying prior versions of OSM elements and reconstructing OSM data sets as they would have been at a specific point in time.

Overpass does this by storing a filtered set of OSM element versions with strictly ordered timestamps, which means that although Overpass can query prior versions of OSM data, some element versions are omitted from its data set and are inaccessible.

The TTL data produced by osm2rdf does not include historical versions of OSM data elements – or if it did, assembling a result set to reproduce the OSM elements at a specific point in time would require complex filtering on timestamps and versions. In practice, this means that the prior versions of OSM elements are unavailable in QLever.

Limitations

Some aspects of Overpass QL do not translate well to SPARQL queries or the QLever RDF schema for OSM. As noted above, QLever does not have OSM history data, so none of the Overpass QL operations that rely on attic data can be supported. The derived and constructed types in Overpass QL do not map well to SPARQL, nor have these been implemented in the local query interpreter. Some documented Overpass QL features (e.g., noids) are broken and the behavior is not reproducible. Individual way vertices are not addressable in QLever queries, so Overpass QL expressions that reference individual vertices cannot be implemented.

The following Overpass QL features cannot be implemented in Qloverleaf:

  • [date: ] global setting - no history data
  • [diff: ] global setting - no history data
  • [adiff: ] global setting - no history data
  • retro statement - no history data
  • timeline statement - no history data
  • local statement - no history data
  • compare statement - no history data
  • convert statement - constructed type
  • make statement - constructed type
  • (bbox) filter in out statement - broken in Overpass
  • noids mode in out statement - broken in Overpass
  • qt sort in out statement - no quad tile index
  • (changed: ) filter - no history data
  • (user_touched: ) filter - no history data
  • (uid_touched: ) filter - no history data
  • (way_link: ) filter - no practical query translation
  • keys() evaluator - constructed type
  • :: generic tag evaluator - constructed type
  • geom() evaluator - constructed type
  • center() evaluator - constructed type
  • trace() evaluator - constructed type
  • hull() evaluator - constructed type
  • pt() evaluator - constructed type
  • lstr() evaluator - constructed type
  • poly() evaluator - constructed type
  • per_member() evaluator - not addressable
  • per_vertex() evaluator - not addressable
  • pos() evaluator - not addressable
  • mtype() evaluator - not addressable
  • ref() evaluator - not addressable
  • role() evaluator - not addressable
  • angle() evaluator - not addressable
  • set() evaluator - constructed type
  • gcat() evaluator - constructed type
  • lrs_in() evaluator - constructed type
  • lrs_isect() evaluator - constructed type
  • lrs_union() evaluator - constructed type
  • lrs_min() evaluator - constructed type
  • lrs_max() evaluator - constructed type

Unimplemented Features

Some aspects of the Overpass QL language are technically feasible but were not implemented for the POC:

  • [maxsize: ] global setting
  • count_tags() evaluator
  • count_members() evaluator
  • count_distinct_members() evaluator
  • count_by_role() evaluator
  • count_distinct_by_role() evaluator
  • u() evaluator
  • min() evaluator
  • max() evaluator
  • sum() evaluator
  • count() evaluator
  • foreach statement - requires local execution
  • for statement - requires local execution
  • complete statement - requires local execution
  • if statement - requires local execution
  • set.val evaluator - requires local execution
  • CSV output
  • popup output
  • custom output

These features could be implemented using QLever and the Qloverleaf interpreter; they were simply out of scope for the POC. Their difficulty varies: some are relatively easy, others are relatively hard.

Lessons Learned

First, and this was the primary goal of the POC, it is possible to implement the Overpass QL using a completely different back end system. The Overpass QL is widely used in OSM community applications that query and process OSM data. Implementing the Overpass QL on top of different back end data sources opens the possibility of broadening the use of the query language and moving specific applications away from the public Overpass servers to a range of public and/or private alternatives.

Second, most of the Overpass QL query statements can be translated into database queries that filter or associate intermediate data sets to produce a result set. That is, any sufficiently capable database query system should, in principle, be able to serve as the back end for an Overpass QL interpreter. The success in translating multi-statement Overpass queries into composite QLever queries suggests that this translation is also possible for other similarly rich database query languages.

This approach of shifting most of the query processing to a dedicated database system substantially reduces the need to materialize large intermediate result sets and allows the database system to perform query planning and optimization based on its own internal data model and indexing.

However, depending on the capability of the underlying database, some of the Overpass QL statements may require local execution on materialized data sets. This was the case with the for, foreach, complete, and if statements, which SPARQL could not directly represent.

The Overpass QL statements and evaluators that generate or consume constructed types (e.g., convert, make, geom(), lrs_*) are somewhat idiosyncratic. It is unlikely that another system would natively support these parts of the query language. The alternatives are to find a way to emulate them, if that is possible, or to execute the statements locally using materialized input data.

The Overpass QL geometry evaluators (e.g., geom(), lstr(), poly()) are possible exceptions. The results of these evaluators are opaque and can only be consumed by a ::geom attribute in a convert or make statement. It may be possible to implement these operators with a completely different underlying geospatial data type such as GeoJSON or WKT.

Finally, the query parsing, augmentation with static typing, and evaluation to detect unconstrained queries were very successful. These parts of the code are not directly dependent on QLever or the translation to SPARQL queries and may be useful in future projects either as templates or reusable code.

Acknowledgements

This proof of concept would not have been possible without the prior work of the Chair for Algorithms and Data Structures at the University of Freiburg to develop QLever and the osm2rdf conversion for OSM data, and to host the public QLever instance with the OSM Planet data set.

I would like to thank them for their hard work and to acknowledge that this project hides many of the efficient capabilities of QLever and the standard SPARQL query language behind Overpass’s own (rather arcane) query language.

Additional Information

Qloverleaf is a proof of concept and is not designed for reliability or scalability. It also relies directly on the public QLever endpoint hosted by the University of Freiburg. As such, Qloverleaf is not a public service. Interested individuals with specific use cases who want to evaluate Qloverleaf should contact me for access to the POC endpoint.
