Void Is a Smell
by @pbrisbin on September 23, 2020
Just this week, I and a co-worker were separately stymied by the same kind of
bug. In both cases, it boiled down to mis-use of Haskell’s void
function.
This is not the actual signature or implementation, but you can think of void
like this:
void :: Monad m => m a -> m ()
void action = do
_result <- action
pure ()
It runs the given action
and discards the _result
by returning the unit
value instead. In this case, you’re not just discarding a concrete result
value but also discarding information in the types: the input type m a
can
be anything and the fact that it’s discarded means you may not be aware of what
it is or when it changes. This can hide bugs. For this reason, from now on, I’m
going to consider the use of void
a smell.
Aside: I use the term “smell” very precisely. It’s not an indication that something is wrong. It’s an indication something may be wrong. That smell in your fridge might be rotting food, or it could be some really good cheese. It requires more investigation, that’s all.
Let’s walk through the two examples from this week, to see if you agree.
Example 1
In one of my co-worker’s side projects, they started with this code:
EnemyEvaded iid eid -> do
withQueue $ \queue ->
let
queue' = flip map queue $ \case
Damage EnemyJustEvadedTarget source n -> EnemyDamage eid iid source n
msg' -> msg'
in pure (queue', ())
pure g
They were helpfully presented with a warning indicating that withQueue
was
returning a result that was implicitly discarded.
error: [-Wunused-do-bind, -Werror=unused-do-bind]
A do-notation statement discarded a result of type '([Message], ())'
Suppress this warning by saying
'_ <- withQueue $ \ queue -> let queue' = ... in pure (q ueue', ())’
|
1026 | withQueue $ \queue ->
It’s always better to do things explicitly, so GHC suggests the wildcard pattern
(which requires no additional functions in scope) but using void
is the same
outcome.
EnemyEvaded iid eid -> do
void $ withQueue $ \queue ->
let
queue' = flip map queue $ \case
Damage EnemyJustEvadedTarget source n -> EnemyDamage eid iid source n
msg' -> msg'
in pure (queue', ())
pure g
This code type-checks, but doesn’t work. Gasp you say! In Haskell? It can’t be! Luckily, we know the if-it-compiles-it-works mantra has its limitations and so we test, and a test caught that this was flushing the queue instead of updating it.
The trouble is the extra pure
:
in pure (queue', ())
The correct code is:
in (queue', ())
This fixes things without requiring any other changes. How can that be? How do both of these type-check in the same context?
The type of withQueue
is (roughly),
withQueue :: ([Message] -> ([Message], r)) -> m r
GHC is trying to treat pure (queue', ())
as ([Message], r)
. Since we are
void
-ing out the final r
later, it has a lot of freedom to get it wrong. It
finds the Applicative
instance for tuples: Monoid a => (,) a
, so pure x
at
this type is ([], x)
– hence the flushed queue.
In this case, the goal was to return ([Message], ())
, and so the expected type
of the withQueue
expression would be m ()
already, so void
is just not
necessary.
EnemyEvaded iid eid -> do
withQueue $ \queue ->
let
queue' = flip map queue $ \case
Damage EnemyJustEvadedTarget source n -> EnemyDamage eid iid source n
msg' -> msg'
in (queue', ())
pure g
It’s not surprising my co-worker was led astray. We’re trained to follow GHC’s
guidance somewhat blindly, especially in the case where the code is well-typed
and GHC’s instruction is only a warning. Approaching void
as a smell may have
offset that tendency and led them to question the type of the argument to
withQueue
instead. That makes it more likely they’d spot the extra pure
.
We can help this along with another good practice, which is to extract and explicitly-type higher-order functional arguments. I’ve seen it happen a lot: higher-order functions + no type signatures = incorrect errors by incorrect type inference.
Following that practice means starting (even if temporarily) with code like this:
EnemyEvaded iid eid -> do
let
something :: [Message] -> ([Message], ())
something queue =
let
queue' = flip map queue $ \case
Damage EnemyJustEvadedTarget source n -> EnemyDamage eid iid source n
msg' -> msg'
in pure (queue', ())
withQueue something
pure g
The error from the buggy code in this case is:
error:
• Couldn't match type '([Message], ())' with '()'
Expected type: ([Message], ())
Actual type: ([Message], ([Message], ()))
• In the expression: pure (queue', ())
In the expression:
let
queue'
= flip map queue
$ \case
Damage EnemyJustEvadedTarget source n -> ...
msg' -> ...
in pure (queue', ())
In an equation for 'f':
f eid iid queue
= let queue' = flip map queue $ ... in pure (queue', ())
|
1654 | in pure (queue', ())
Now we can see the actual type error the extra pure
introduces. As mentioned,
GHC is choosing the Applicative
instance for Monoid a => (,) a
, which you
can think of as ([a], _)
. So you end up with this extra layer of
([Message], ...)
. I won’t go into the details on this, but for reasons like
what we’re seeing here, that instance is a bit of a foot-gun.
This error doesn’t necessarily do a perfect job of directing you to the fix, but it does a much better job of not directing you to the wrong fix. It’s also nice that it at least indicates the correct line of the bug. Also, it doesn’t take long for a working Haskeller to internalize the wart that is these tuple instances. Once that happens, error messages like this are less confusing.
Example 2
In our main package at work, we have a utility for retrying an operation on errors:
-- | Run an @'IO'@ action until successful
--
-- This assumes the action itself already represents failure in @'Either'@. If
-- you're dealing with a raising action, use @'try'@.
--
runOperationUntilSuccessful
:: (MonadIO m, MonadLogger m, Exception e) => Int -> m (Either e a) -> m a
runOperationUntilSuccessful attempts operation = do
res <- operation
case res of
Right x -> pure x
Left ex -> if attempts <= 0
then throwIO ex
else do
logWarnN $ "Operation failed: " <> tshow ex <> ". Trying again."
runOperationUntilSuccessful (attempts - 1) operation
Since we wrote this, the great retry
package was created, which
solves this space very nicely. So I was going to replace this function, as well
as a few locations where we’ve written the same thing in-line, with this:
-- | 'retrying' a function that uses 'Either' for errors
--
-- Will retry with the given policy on 'Left' results.
--
retryingEither
:: MonadIO m => RetryPolicyM m -> m (Either e a) -> m (Either e a)
retryingEither policy run = retrying policy (const $ pure . isLeft) (const run)
- void $ runOperationUntilSuccessful 5 $ sendEmail
+ void $ Retry.retryingEither policy $ sendEmail
(htmlEmail
(From billr)
(To EmailName { enEmail = recipient, enName = Nothing })
Can you see the bug?
The type of runOperationUntilSuccessful 5
and retryingEither policy
are not
the same. But both versions of this code type-checked. The difference is what
happens when retries are exhausted: with a void
-ed Either
function, a final
error is discarded instead of thrown.
The void
that was here to begin with is unnecessary.
runOperationUntilSuccessful 5
is already m ()
(because sendEmail
is
m ()
). If it weren’t there, my change would’ve failed to compile, since
retryingEither policy
returns m (Either _ ())
.
The solution was to make a version that was also already m ()
by throwing on
Left
in the same way, and to remove any unnecessary void
s:
-- | 'retryingEither' but throws the 'Left' value when retries are exhausted
retryingEitherThrow
:: (MonadIO m, Exception e) => RetryPolicyM m -> m (Either e a) -> m a
retryingEitherThrow policy run =
either throwIO pure =<< retryingEither policy run
- void $ Retry.retryingEither policy $ sendEmail
+ Retry.retryingEitherThrow policy $ sendEmail
Hopefully, you can see from these examples that void
is a risk because it
destroys type information and harms inference. I encourage you to give it some
thought when you come across it. You might save yourself some head-scratching.