External Publication
Visit Post

Challenges in Solving Sequence-to-Graph Alignment with Co-Linear Structure

Theory of Computing Report February 6, 2026
Source

Authors: Xingfu Li

Sequence alignment is a cornerstone technique in computational biology for assessing similarities and differences among biological sequences. A key variant, sequence-to-graph alignment, plays a crucial role in effectively capturing genetic variations. In this work, we introduce two novel formulations within this framework: the Gap-Sensitive Co-Linear Chaining (Gap-CLC) problem and the Co-Linear Chaining with Errors based on Edit Distance (Edit-CLC) problem, and we investigate their computational complexity. We show that solving the Gap-CLC problem in sub-quadratic time is highly unlikely unless the Strong Exponential Time Hypothesis (SETH) fails -- even when restricted to binary alphabets. Furthermore, we establish that the Edit-CLC problem is NP-hard in the presence of errors within the graph. These findings emphasize that incorporating co-linear structures into sequence-to-graph alignment models fails to reduce computational complexity, highlighting that these models remain at least as computationally challenging to solve as those lacking such prior information.

Discussion in the ATmosphere

Loading comments...