A Home for Orphan Instances
by @eborden on December 12, 2018
It’s the holiday season and as Haskellers we turn our attention to those in need: orphans.
There are three things we need to do for orphans this holiday season.
- isolate them
- prevent them from getting their dirty fingerprints everywhere
- hurry up and decrease the surplus compilation
Bah Humbug!
Orphans Are Everywhere
Working with orphan instances is a
necessary evil in a production Haskell codebase. They aren’t elegant; they
aren’t desirable, but they are occasionally necessary. We almost always prefer
a newtype, but there are some scenarios where this is insufficient. Maybe
another package defines a pervasively used type, and the maintainer hasn’t
responded to your pull request. Maybe the maintainer has responded, but they
don’t want to take another dependency where the typeclass is defined.
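For the cases where a newtype does suffice, here is a minimal sketch of what that looks like. The scenario is hypothetical: we want to combine Int values by keeping the larger one. Writing a Semigroup instance directly on Int would be an orphan, since both the class and the type live in base, so we wrap Int in a type we own. The names MaxInt and largest are made up for illustration.

```haskell
-- Hypothetical scenario: combine Ints by taking the larger one.
-- `instance Semigroup Int` would be an orphan (class and type both come
-- from base), so we define the instance on a wrapper type we own instead.
newtype MaxInt = MaxInt Int
  deriving (Eq, Show)

-- Not an orphan: MaxInt is declared in this very module.
instance Semigroup MaxInt where
  MaxInt a <> MaxInt b = MaxInt (max a b)

-- Example use of the instance.
largest :: MaxInt
largest = MaxInt 3 <> MaxInt 7 <> MaxInt 5
```

Loading this in GHCi and evaluating largest gives MaxInt 7. The cost is the wrapping and unwrapping at call sites, which is exactly why a newtype is sometimes not enough for a pervasively used type.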
There are great talks that touch on the conceptual trouble of orphans, but we are only going to talk about one facet:
Where should you put them?
Give your Orphans a Home
It is very easy to spread orphan instances throughout a codebase. We often work
iteratively and implement an instance close to where it was necessary. We set
{-# OPTIONS_GHC -fno-warn-orphans #-} in that module and move along. This is
how orphans infect a codebase.
Instead of allowing orphans to spread, we need to give them a home. Your
codebase should have an OrphanInstances module, where all orphan instances
reside. This achieves our Scroogly goals, but why?
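As a rough sketch, such a module might look like the following. The instance is only a stand-in chosen because it needs nothing beyond base: Show is defined in base and the function type is built in, so an instance written anywhere in our own code is an orphan. Your real orphans would involve types and classes from the packages you depend on.

```haskell
{-# OPTIONS_GHC -fno-warn-orphans #-}
-- OrphanInstances.hs
-- The one module in the codebase allowed to silence the orphan warning.
-- The export list is empty because instances are always exported.
module OrphanInstances () where

-- Illustrative orphan: neither Show nor (->) is defined in this module.
instance Show (a -> b) where
  showsPrec _ _ = showString "<function>"
```

Other modules would then pull in the instances with an empty import list, import OrphanInstances (), and would not need the pragma themselves. Note that base ships an equivalent instance in Text.Show.Functions, so a codebase that imported both could run into precisely the kind of conflict discussed next.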
Isolate Them
The biggest conceptual trouble with orphans is their destruction of global coherence. An orphan opens the door to conflicting instances. If GHC encounters two instances for the same typeclass/type pair, then we must resolve this conflict. We can remove an instance or introduce overlapping instances. When removing an instance, it can be annoying to trace through which import chain a conflict has arisen. Placing all of your orphans in a single module isolates them. This simplifies decision making and resolution.
- Where should you go to resolve an instance conflict? OrphanInstances.hs
- Where should I add this instance? OrphanInstances.hs
- Where should I import this library that declares an orphan instance? OrphanInstances.hs
Done. Simple. Next!
Fewer Dirty Fingerprints
When GHC compiles a module it produces a fingerprint for it. This fingerprint determines if that file has changed since its last compilation. If it hasn’t changed then we can avoid doing more work. If it has changed then we must recompile it.
Orphan instances are extremely effective at invalidating fingerprints. If the fingerprint of a module containing an orphan instance is invalidated, then any module that imports its instance, either directly or transitively, will have its fingerprint invalidated too. This is tenable in small codebases, but a large codebase with orphans scattered throughout can suffer pervasive fingerprint invalidation.
Decrease The Surplus Compilation
Once we’ve isolated our orphan instances, they fall to the bottom of our stack. They import very few modules and see little fingerprint churn, so we reduce the amount of wasteful compilation. This is great! We are saving tons of compute time and idle waiting.
There is one hitch. What about when you need to add or edit an orphan instance? We’ve forced you to recompile your whole project. That is horrible! Or is it?
This is the recompilation tax that our system imposes. If you want to introduce
an orphan instance you might think twice. You might consider using a newtype
or composing some plain old functions instead. If your orphan instance is truly
necessary then it will pay its way and recompiling the world will be justified.
If it isn’t, then this disincentive has saved you the trouble.