
How to parse specific syntax elements and discard the rest?

Haskell Community [Unofficial] May 7, 2026

Hello,

I’m trying to write a tool to analyze Nix files, notably to analyze and rewrite their path literals. I would like to parse a Nix file into a list of path literals together with the source location of each one. With this data I can then check whether the targets of the path literals are valid given the location of the file, simplify the paths and rewrite them in place using the source locations, or generate a directed graph for fun.

I’m having trouble parsing the path literals in nix files. Specifically, I want to only parse the path literals and discard the rest. I thought about using regex, but since I am more familiar with parser combinator libraries I went with Megaparsec.


The difficulty is that I want to fish out all the path literals in a Nix file, disregarding all other syntactic elements. Megaparsec provides the getOffset primitive. However, when a parser fails, getOffset gives me the position of the start of the failure, so I can’t use this information to jump forward.

ghci> parseTest (liftA2 (,) (optional ("foo" :: Parser Text)) getOffset) "foo"
(Just "foo",3)
it :: ()
(0.02 secs, 80,576 bytes)
ghci> parseTest (liftA2 (,) (optional ("foo" :: Parser Text)) getOffset) "bar"
(Nothing,0)
it :: ()
(0.01 secs, 78,480 bytes)

I have also tried to use observing, but it too only reports the position at the start of the failure.

ghci> parseTest (observing ("foo" :: Parser Text)) "bar"
Left (TrivialError 0 (Just (Tokens ('b' :| "ar"))) (fromList [Tokens ('f' :| "oo")]))
it :: ()

What can I do to parse only the path literals efficiently and correctly while discarding the rest? Thanks a lot =D
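
One direction that might work, as a minimal sketch rather than a definitive solution: at every position, try the path-literal parser, and if it fails, consume a single token with anySingle and try again. This way no information about where the failure ended is needed. The pathLiteral parser below is a hypothetical, heavily simplified stand-in (it only recognizes literals starting with "./" or "../" and does not know about strings, comments, or the other path forms Nix allows); pathLiterals and the sample input are likewise made up for illustration.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Data.Char (isAlphaNum)
import Data.Maybe (catMaybes)
import Data.Text (Text)
import Data.Void (Void)
import Text.Megaparsec
import Text.Megaparsec.Char (string)

type Parser = Parsec Void Text

-- Hypothetical, simplified path literal: "./" or "../" followed by at
-- least one path character. Real Nix paths have more forms than this.
pathLiteral :: Parser Text
pathLiteral = do
  prefix <- string "../" <|> string "./"
  rest <- takeWhile1P (Just "path character") isPathChar
  pure (prefix <> rest)
  where
    isPathChar c = isAlphaNum c || c `elem` ("._-/+" :: String)

-- Scan the whole input. At each position either record a path literal
-- with its starting offset, or skip exactly one token and move on.
pathLiterals :: Parser [(Int, Text)]
pathLiterals = catMaybes <$> many (found <|> skip) <* eof
  where
    -- 'try' rewinds the input if pathLiteral fails after consuming its prefix.
    found = Just <$> try ((,) <$> getOffset <*> pathLiteral)
    skip = Nothing <$ anySingle

main :: IO ()
main = do
  -- Made-up sample input for illustration.
  let input = "{ a = ./foo/bar.nix; b = import ../lib; c = 1.5; }"
  case parse pathLiterals "example.nix" input of
    Left err -> putStr (errorBundlePretty err)
    Right found -> mapM_ print found
```

Because this sketch ignores the surrounding syntax, it would also report path-like text inside string literals and comments; handling those would mean adding alternatives that recognize and skip them as whole units. Skipping one token at a time is simple but not fast; if that matters, the skip branch could consume a longer run with takeWhileP up to the next character that can start a path.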
