From-text: type class to convert from Text

Haskell Community [Unofficial] May 24, 2026

hasufell:

filepath recommends to not interpret filepaths returned by the OS API unless absolutely necessary

My goal is to write general-purpose applications, and to me “absolutely necessary” means “don’t bother unless you know something extra about the target system”. I therefore assume that the suggested approach is to read/write UTF-8 with no regard for the target’s locale.

This directly contradicts how base handles file paths, which I wouldn’t care much about if the tools given were good enough to replicate what base does (with proper error handling, of course). Instead there’s decodeFS, which manages to compress everything I’m trying to escape into a single definition.
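For contrast, here is a minimal sketch of the locale-independent route, assuming a filepath version that ships System.OsPath (1.4.100 or later). As far as I understand, encodeUtf and decodeUtf use strict UTF-8 on Unix and strict UTF-16LE on Windows, and report failure through MonadThrow instead of consulting the locale. The roundTrip helper is hypothetical and exists only for illustration.

```haskell
import System.OsPath (decodeUtf, encodeUtf)

-- | Hypothetical helper: encode a String into an OsPath and decode it back.
-- Both steps run in Maybe (a MonadThrow instance), so a failed conversion
-- becomes Nothing instead of being silently patched over.
roundTrip :: FilePath -> Maybe FilePath
roundTrip s = encodeUtf s >>= decodeUtf
```

The point of the sketch is only that the failure case is visible in the type, which is the “proper error handling” I would want from a locale-aware equivalent as well.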

hasufell:

If you are however constructing filepaths out of thin air (e.g. from user input), then you can decide on the shape or form.

There’s no single “you” here.

Library users want to talk about file paths as if they are text, which is fine as long as the conversion is not their direct responsibility.

Library writers know that file paths are not text, and that any introspection more complex than “, , , , and ” requires a choice of encoding. And this choice is unavoidable when, say, parsing command-line options (think e.g. about bar in --foo=bar).

This divide cannot be bridged without an additional type that’s platform-independent (unlike OsString), but is still possibly erroneous (unlike String/Text).

hasufell:

what does it mean to construct a filepath on unix and have a windows system interpret it? Or… what is a reasonable JSON instance?

Both of these are up to library users to decide, squarely outside of the topic of correct file path handling.

hasufell:

WTF-8 is used to have a single internal representation on both unix and windows.

Used by… Rust? I have no relevant experience in Rust. When I’m talking about WTF-8 I solely mean how the bytes are laid out in memory, same with UCS2LE.

I hope you’re not getting confused by the fact that Rust has a type called OsString and that one seems to be in WTF-8. Their raw file path types are raw arrays, I guess? The documentation is lacking.

hasufell:

You can see part of the fallout here: 2295-os-str-pattern - The Rust RFC Book

This RFC looks thoroughly confusing to me. I do think of WTF-8 as “sliceable”, but only within the very narrow limits of the Portable Character Set, which guarantees that the characters are single-byte. This is enough to break down --foo=bar without scrutinizing bar, but that’s the furthest extent of what I found necessary.
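A minimal sketch of that kind of split, assuming the argument arrives as raw bytes in an ASCII-compatible layout such as UTF-8 or WTF-8, where bytes below 0x80 never occur inside a multi-byte sequence. splitLongOption is a hypothetical helper, not part of any library.

```haskell
import qualified Data.ByteString as BS
import Data.ByteString (ByteString)

-- | Hypothetical helper: split a raw @--name=value@ argument at the first
-- '=' byte (0x3D). The value is returned untouched and is never decoded.
splitLongOption :: ByteString -> Maybe (ByteString, ByteString)
splitLongOption arg = do
  -- Require the leading "--" (two 0x2D bytes).
  rest <- BS.stripPrefix (BS.pack [0x2D, 0x2D]) arg
  -- Everything before the first '=' is the option name.
  let (name, eqAndValue) = BS.break (== 0x3D) rest
  -- Drop the '=' itself; fails with Nothing if there was none.
  (_, value) <- BS.uncons eqAndValue
  if BS.null name then Nothing else Just (name, value)
```

Only the two delimiter characters are ever inspected, so the value part may contain bytes that are not valid in any text encoding and still pass through unchanged.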

This might well be an issue specific to Rust; it seems like OsStr is an interface.
