A Home for Orphan Instances

by @eborden on December 12, 2018

It’s the holiday season, and as Haskellers we turn our attention to those in need: orphans.

There are three things we need to do for orphans this holiday season.

isolate them
prevent them from getting their dirty fingerprints everywhere
hurry up and decrease the surplus compilation

Bah Humbug!

Orphans Are Everywhere

Working with orphan instances is a necessary evil in a production Haskell codebase. They aren’t elegant and they aren’t desirable, but they are occasionally necessary. We almost always prefer a newtype, but there are some scenarios where this is insufficient. Maybe another package defines a pervasively used type, and the maintainer hasn’t responded to your pull request. Maybe the maintainer has responded, but they don’t want to take on a dependency on the package that defines the typeclass.

There are great talks that touch on the conceptual trouble of orphans, but we are only going to talk about one facet:

Where should you put them?

Give your Orphans a Home

It is very easy to spread orphan instances throughout a codebase. We often work iteratively and implement an instance close to where it was necessary. We set {-# OPTIONS_GHC -fno-warn-orphans #-} in that module and move along. This is how orphans infect a codebase.

Instead of allowing orphans to spread, we need to give them a home. Your codebase should have an OrphanInstances module, where all orphan instances reside. This achieves our Scroogly goals, but why?
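As a rough illustration, such a module might look like the sketch below. The instance is hypothetical and chosen only because it needs nothing beyond base: it assumes base ships no Semigroup instance for ExitCode, and the "first failure wins" semantics are invented for the example.

```haskell
{-# OPTIONS_GHC -fno-warn-orphans #-}
-- The one module in the project that is allowed to silence the orphan warning.
module OrphanInstances () where

import System.Exit (ExitCode (..))

-- An orphan: neither the class (Semigroup) nor the type (ExitCode) is
-- defined in this module. Illustrative semantics: the first failure wins.
instance Semigroup ExitCode where
  ExitSuccess <> code = code
  failure <> _ = failure
```

The export list is empty because instances are always exported. A module that needs the instance pulls it in with import OrphanInstances (), which brings no names into scope.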

Isolate Them

The biggest conceptual trouble with orphans is their destruction of global coherence. An orphan opens the door to conflicting instances. If GHC encounters two instances for the same typeclass/type pair, then we must resolve this conflict. We can remove an instance or introduce overlapping instances. When removing an instance, it can be tedious to trace which import chain introduced the conflict. Placing all of your orphans in a single module isolates them. This simplifies decision making and resolution.

Where should I go to resolve an instance conflict? OrphanInstances.hs
Where should I add this instance? OrphanInstances.hs
Where should I import this library that declares an orphan instance? OrphanInstances.hs

Done. Simple. Next!

Fewer Dirty Fingerprints

When GHC compiles a module it produces a fingerprint for it. This fingerprint determines if that file has changed since its last compilation. If it hasn’t changed then we can avoid doing more work. If it has changed then we must recompile it.

Orphan instances are extremely effective at invalidating fingerprints. If the fingerprint of a module containing an orphan instance is invalidated, then every module that imports that instance, directly or transitively, has its fingerprint invalidated too. This is tenable in small codebases, but a large codebase with orphans scattered throughout can suffer pervasive fingerprint invalidation.
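The transitive effect can be pictured with a toy model. The sketch below is not GHC's recompilation check, only the coarse rule described above: a change to one module invalidates everything that imports it, directly or through other modules. The module names and the import graph are made up, and the graph is assumed to be acyclic.

```haskell
type Module = String

-- Each entry pairs a module with the modules it imports.
type ImportGraph = [(Module, [Module])]

-- Modules that import the given module directly.
importers :: ImportGraph -> Module -> [Module]
importers graph m = [name | (name, imports) <- graph, m `elem` imports]

-- Every module invalidated by a change to `changed`, found by following
-- imports backwards until no new modules turn up.
invalidated :: ImportGraph -> Module -> [Module]
invalidated graph changed = go [] (importers graph changed)
  where
    go seen [] = reverse seen
    go seen (m : rest)
      | m `elem` seen = go seen rest
      | otherwise = go (m : seen) (rest ++ importers graph m)

-- A hypothetical project where an orphan lives in Types, mid-stack.
exampleGraph :: ImportGraph
exampleGraph =
  [ ("Types", [])
  , ("Api", ["Types"])
  , ("Main", ["Api"])
  , ("Util", [])
  ]
```

In this model, invalidated exampleGraph "Types" reaches Api and Main but leaves Util alone. The more modules that hold orphans, the more starting points there are for this kind of cascade.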

Decrease The Surplus Compilation

Once we’ve isolated our orphan instances they fall to the bottom of our stack. They import very few modules and see little fingerprint churn, so we reduce the amount of wasteful compilation. This is great! We are saving tons of compute time and idle waiting.

There is one hitch. What happens when you need to add or edit an orphan instance? Now you are forced to recompile your whole project. That is horrible! Or is it?

This is the recompilation tax that our system imposes. If you want to introduce an orphan instance you might think twice. You might consider using a newtype or composing some plain old functions instead. If your orphan instance is truly necessary then it will pay its way and recompiling the world will be justified. If it isn’t, then this disincentive has saved you the trouble.
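For comparison, here is a minimal sketch of the two alternatives. The names FirstFailure, getFirstFailure and firstFailure are hypothetical, and the "first failure wins" behaviour is invented for the example. Because the newtype is defined locally, its instances are not orphans and can live next to the code that uses them.

```haskell
import System.Exit (ExitCode (..))

-- A local wrapper type: instances for it are not orphans.
newtype FirstFailure = FirstFailure { getFirstFailure :: ExitCode }
  deriving (Eq, Show)

-- Combining two results keeps the first failure, if there is one.
instance Semigroup FirstFailure where
  FirstFailure ExitSuccess <> next = next
  failure <> _ = failure

instance Monoid FirstFailure where
  mempty = FirstFailure ExitSuccess

-- The plain old function version: no typeclass involved at all.
firstFailure :: [ExitCode] -> ExitCode
firstFailure codes =
  case filter (/= ExitSuccess) codes of
    [] -> ExitSuccess
    (failure : _) -> failure
```

Both getFirstFailure (foldMap FirstFailure codes) and firstFailure codes give the same answer for a list of exit codes, and neither one touches OrphanInstances or triggers the recompilation tax.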

the Freckle developer blog