
[RFC] Sibyl: Time Series Analysis in Haskell

Haskell Community [Unofficial] March 18, 2026

Hello! I’ve recently begun work on a new time series analysis and forecasting library called Sibyl that will implement functions and models comparable to statsmodels in Python or forecast in R: things like ARIMA/SARIMA, exponential smoothing, classical decomposition, and automatic model selection via the Hyndman-Khandakar algorithm. I’m working with the folks over at DataHaskell (@mchav, @daikonradish) to flesh out the initial specifications and begin drafting design plans.

During initial design discussions, we ran into a roadblock around a major UX choice. My initial approach splits the library into two layers:

An unsafe facade layer for notebook users and statisticians, imported with import Sibyl. Functions throw runtime errors on failure rather than returning Either. The goal is R-style convenience.
A safe layer for production pipelines, where users import modules individually (import Sibyl.Safe.TimeSeries, import Sibyl.Models.SARIMAX, etc.) and handle error types explicitly.
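To make the relationship between the two layers concrete, here is a minimal sketch of how an unsafe facade function could be a thin wrapper over its safe counterpart. Everything in it is hypothetical and for illustration only: `SeriesError`, `Series`, `mkSeriesSafe`, and `mkSeries` are stand-in names, not Sibyl's actual API.

```haskell
import Control.Exception (Exception, throw)

-- Hypothetical error type for the safe layer (not Sibyl's real one).
data SeriesError
  = EmptySeries
  | NonFiniteValue Int  -- index of the first NaN or infinite observation
  deriving (Show, Eq)

instance Exception SeriesError

-- Hypothetical stand-in for a validated time series.
newtype Series = Series [Double]
  deriving (Show, Eq)

-- Safe layer: failure is visible in the return type.
mkSeriesSafe :: [Double] -> Either SeriesError Series
mkSeriesSafe [] = Left EmptySeries
mkSeriesSafe xs =
  case [i | (i, x) <- zip [0 ..] xs, isNaN x || isInfinite x] of
    (i : _) -> Left (NonFiniteValue i)
    []      -> Right (Series xs)

-- Unsafe facade: same validation, but a Left becomes a thrown exception.
mkSeries :: [Double] -> Series
mkSeries = either throw id . mkSeriesSafe
```

With this shape, the facade adds no logic of its own, so the two layers cannot drift apart: `mkSeries [1, 2, 3]` yields a `Series`, while `mkSeries []` throws `EmptySeries` when the result is forced.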

Here’s what each looks like in practice. A notebook user or scripter might write:

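import Sibyl
import qualified DataFrame as D

main :: IO ()
main = do
  raw <- D.readCsv "./data/sales.csv"
  -- Assuming the facade functions are pure and throw on failure,
  -- the pipeline is bound with let rather than <-.
  let result = fromDataFrame "date" "sales" raw
        |> fit (ARIMA (1, 1, 1))
        |> forecast 12
        |> toDataFrame
  D.writeCsv "./artifacts/forecast.csv" result
-- Nice and convenient!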

And the production pipeline version, where error handling is explicit:

module Pipeline where

import qualified Sibyl.Safe.TimeSeries as TS
import qualified Sibyl.Models.SARIMAX  as SARIMAX
import qualified Sibyl.Model           as M
import qualified DataFrame             as D

main :: IO ()
main = do
  raw <- D.readCsv "./data/sales.csv"
  case TS.fromDataFrame "date" "sales" raw of
    Left err     -> putStrLn $ "Bad series: " ++ show err
    Right series ->
      case SARIMAX.fitSARIMAWith settings series of
        Left err    -> putStrLn $ "Fit failed: " ++ show err
        Right model -> do
          M.summarize model
          let fc = SARIMAX.forecastSARIMA 12 model
          D.writeCsv "./artifacts/forecast.csv" (TS.toDataFrame (M.point fc))
          putStrLn "Done."
  where
    settings = SARIMAX.defaultSARIMASettings
      { SARIMAX.sarimaP      = 1, SARIMAX.sarimaD      = 1, SARIMAX.sarimaQ      = 1
      , SARIMAX.sarimaBigP   = 0, SARIMAX.sarimaBigD   = 1, SARIMAX.sarimaBigQ   = 1
      , SARIMAX.sarimaPeriod = 12, SARIMAX.sarimaMethod = SARIMAX.CSSML
      }

A third option has also come up in our discussions: using DataKinds and type families to encode model orders at the type level, so that Fitted ('ARIMA 1 1 1) and Fitted ('ARIMA 2 1 0) are different types. This gives much stronger guarantees. For example, SARIMAX’s requirement for future regressors at forecast time becomes a type-level constraint rather than a runtime check. The tradeoff is that it makes interactive use in GHCi or a notebook harder, since model orders would need to be known at compile time.
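As a rough sketch of what that could look like (not a design we've settled on), the block below encodes the orders as type-level naturals and uses a closed type family to decide what the caller must supply at forecast time. All names here (`Spec`, `Exog`, `Fitted`, `Future`, `orders`, `forecast`) are hypothetical, the coefficient payload is a placeholder, and `forecast` returns zeros rather than doing any real modelling.

```haskell
{-# LANGUAGE DataKinds           #-}
{-# LANGUAGE KindSignatures      #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeFamilies        #-}

import Data.Proxy   (Proxy (..))
import GHC.TypeLits (KnownNat, Nat, natVal)

-- Hypothetical model-spec kind. DataKinds promotes the constructor,
-- so 'ARIMA 1 1 1 and 'ARIMA 2 1 0 are distinct types.
data Spec = ARIMA Nat Nat Nat

-- Whether the model was fit with exogenous regressors.
data Exog = NoExog | WithExog

-- A fitted model tagged with its spec; the payload is a placeholder.
newtype Fitted (s :: Spec) (x :: Exog) = Fitted [Double]

-- What the caller has to provide at forecast time.
type family Future (x :: Exog) where
  Future 'NoExog   = ()          -- nothing extra needed
  Future 'WithExog = [[Double]]  -- one row of future regressors per step

-- Recover the (p, d, q) orders at run time from the type alone.
orders
  :: forall p d q x. (KnownNat p, KnownNat d, KnownNat q)
  => Fitted ('ARIMA p d q) x -> (Integer, Integer, Integer)
orders _ =
  ( natVal (Proxy :: Proxy p)
  , natVal (Proxy :: Proxy d)
  , natVal (Proxy :: Proxy q)
  )

-- Placeholder forecast: returns h zeros. Only the signature matters here.
forecast :: Int -> Future x -> Fitted s x -> [Double]
forecast h _ _ = replicate h 0
```

Given `m :: Fitted ('ARIMA 1 1 1) 'NoExog`, the call `forecast 12 () m` type-checks, whereas passing `()` to a model tagged `'WithExog` is rejected by the compiler because `Future 'WithExog` reduces to `[[Double]]`. The notebook friction is also visible: the orders have to appear in a type annotation before anything can be fitted.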

The tension we keep running into is this: the new-to-Haskell R or Python user wants minimal friction, sensible defaults, and something that “just works”. They’d likely use the unsafe facade and never touch Either. The Haskell engineer presumably wants strong guarantees and composability, with the type system doing real work rather than just wrapping error calls. By trying to serve both audiences, we risk ending up with something that fully satisfies neither: too unergonomic for the statistician, not strong enough for the seasoned Haskeller.

So, as we work through the initial stages of this library, I’d appreciate comments from anyone who has thoughts on all this, especially on the following questions:

Is the two-layer approach a reasonable compromise, or does it amount to two half-baked APIs?
For those of you who do statistical work in Haskell: what would you want out of a library like this?
Are there prior examples in the Haskell ecosystem that handle this well? Libraries that serve both “quick and dirty” and “production-grade” users?
Does the DataKinds approach seem worth the added complexity?

The GitHub repo is here if you want to see what exists so far. Happy to answer any questions! I’m also an intermediate Haskell programmer, and it’s possible I’m missing something really obvious in the details of this implementation. I welcome any and all feedback, as every opinion will help me work out how best to write this library to serve the most people.
