A weighted angle distance on strings

Theory of Computing Report April 23, 2026

Authors: Grant Molnar

We define a multi-scale metric $d_\rho$ on strings by aggregating angle distances between all $n$-gram count vectors with exponential weights $\rho^n$. We benchmark $d_\rho$ in DBSCAN clustering against edit and $n$-gram baselines, give a linear-time suffix-tree algorithm for evaluation, prove metric and stability properties (including robustness under tandem-repeat stutters), and characterize isometries.
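
To make the construction concrete, here is a minimal Python sketch of one plausible reading of the definition: the distance is taken to be the sum over n of rho**n times the angle between the n-gram count vectors of the two strings. The abstract does not state the exact formula, so the summation form, the default rho = 0.5, the treatment of a string with no n-grams of a given length (angle 0 if both are empty, pi/2 if only one is), and the function names are all assumptions made for illustration. This is a naive evaluation that enumerates every n-gram, not the linear-time suffix-tree algorithm the paper describes.

```python
import math
from collections import Counter


def ngram_counts(s, n):
    """Count every contiguous substring of length n in s (empty if n > len(s))."""
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))


def angle(u, v):
    """Angle in radians between two sparse count vectors.

    Assumed convention (not from the source): two empty vectors are at
    angle 0, and an empty vector is at angle pi/2 from any non-empty one.
    """
    if not u and not v:
        return 0.0
    if not u or not v:
        return math.pi / 2
    dot = sum(c * v[g] for g, c in u.items() if g in v)
    norm_sq_u = sum(c * c for c in u.values())
    norm_sq_v = sum(c * c for c in v.values())
    cos = dot / math.sqrt(norm_sq_u * norm_sq_v)
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    return math.acos(max(-1.0, min(1.0, cos)))


def weighted_angle_distance(x, y, rho=0.5):
    """Assumed form: sum over n >= 1 of rho**n * angle(n-gram counts of x and y).

    Terms with n larger than both string lengths contribute 0 under the
    convention above, so the sum stops at max(len(x), len(y)).
    """
    max_n = max(len(x), len(y))
    return sum(
        rho ** n * angle(ngram_counts(x, n), ngram_counts(y, n))
        for n in range(1, max_n + 1)
    )


if __name__ == "__main__":
    # A tandem-repeat "stutter" versus an unrelated string of similar length.
    print(weighted_angle_distance("banana", "bananana"))
    print(weighted_angle_distance("banana", "orange"))
```

Under these assumptions, smaller values of rho put most of the weight on short n-grams, while values closer to 1 let longer shared substrings influence the distance more.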