
Mutation Testing in Haskell

Haskell Community [Unofficial] June 9, 2026

andremarianiello:

The question is: does this mutation cause a change in behavior that is caught by a test?

Why should I care?

The basic assumption seems to be that a test suite must fail whenever the code is altered semantically. Suppose there are two data types, A and B, and a function f :: A -> B that is to be tested. Suppose both A and B are finite, with n and m elements respectively. Then there are m^n functions of that type, so f has m^n - 1 semantically distinct mutations. A naive test suite, adding one test after another, might need a number of test cases of that same order of magnitude, whereas a cleverly designed suite needs only about log(m^n) test cases to rule out all undesired mutations. With infinite types and general recursion, ruling out every mutation becomes infeasible altogether.

Consider the example in the announcement. The tests cover the design space of Int -> Int -> Bool, while the semantics seem to be concerned only with integers in a finite range from 0 to some number, e.g. 20. Hence there are infinitely many mutations of canCastFireball that could be distinguished by test cases yet to be written, but that are semantically identical on the part of the design space the programmer cares about. If a mutation alters the behaviour at level = -42, does that mean we ought to add a test for that?
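To make the counting argument concrete, here is a minimal sketch for a tiny finite case with n = 3 and m = 2. The type Colour and the function isWarm are made up for illustration; they are not from the announcement. Each function Colour -> Bool is represented by its table of outputs, and a test suite is modelled as the list of inputs on which the output is checked against the original.

```haskell
-- Hypothetical finite domain: n = 3 inputs, codomain Bool with m = 2 values.
data Colour = Red | Green | Blue
  deriving (Show, Eq, Enum, Bounded)

-- The function under test (an assumption, for illustration only).
isWarm :: Colour -> Bool
isWarm Red = True
isWarm _   = False

domain :: [Colour]
domain = [minBound .. maxBound]

-- Every function Colour -> Bool, written as its table of outputs over `domain`.
-- In the list monad, mapM enumerates all m^n = 2^3 = 8 combinations.
allTables :: [[Bool]]
allTables = mapM (const [False, True]) domain

-- The output table of a concrete function.
table :: (Colour -> Bool) -> [Bool]
table f = map f domain

-- All semantically distinct mutants: every table except the original one.
mutants :: [[Bool]]
mutants = filter (/= table isWarm) allTables

-- Mutants that survive a suite checking exactly the inputs in `tested`.
survivors :: [Colour] -> [[Bool]]
survivors tested =
  [ t
  | t <- mutants
  , and [ lookup c (zip domain t) == Just (isWarm c) | c <- tested ]
  ]
```

In GHCi, length mutants gives 7 (that is, 2^3 - 1), survivors [Red] still leaves 3 mutants, and survivors domain leaves none. So one test per input, n = log_m(m^n) tests in total, is enough to kill every mutant here, which is one way to read the log(m^n) bound.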

What I am trying to convey is that a spec should pin down the desired behaviour as exactly as possible, and if done right, the code could be extracted from it. If a mutation does not fail the tests, it does not necessarily mean the tests are bad. It only means that the desired behaviour is exhibited by a non-singleton set of possibilities. The question is, do my tests let only the desired programs pass? Mutation tests could indeed be very helpful with that. I would expect a mutation test to nudge the code in a direction specified by a “spec for tests”. Perhaps sydtest does just that, but one cannot tell from the announcement.
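A minimal sketch of such a surviving mutant, under loud assumptions: the body of canCastFireball below is a made-up stand-in (the announcement's actual definition is not reproduced here), and the intended range 0 to 20 is only the guess made above. The mutant differs from the original solely at level = -42, so it agrees with it on every input the programmer presumably cares about.

```haskell
-- Hypothetical stand-in for the function from the announcement;
-- the thresholds 5 and 10 are invented for this example.
canCastFireball :: Int -> Int -> Bool
canCastFireball level mana = level >= 5 && mana >= 10

-- A mutant that changes behaviour only at level = -42.
mutant :: Int -> Int -> Bool
mutant level mana = level == (-42) || canCastFireball level mana

-- The assumed intended design space: both arguments between 0 and 20.
intended :: [(Int, Int)]
intended = [ (l, m) | l <- [0 .. 20], m <- [0 .. 20] ]

-- True when the original and the mutant agree on every given input.
agreesOn :: [(Int, Int)] -> Bool
agreesOn = all (\(l, m) -> canCastFireball l m == mutant l m)
```

Here agreesOn intended is True while agreesOn [(-42, 0)] is False. Whether this mutant counts as a gap in the tests or as another member of the set of acceptable programs depends entirely on whether the spec says anything about negative levels.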
