Void Is a Smell

Notice: This website has moved to a new URL. Please visit us at Renaissance Learning R&D.

Void Is a Smell

by @pbrisbin on September 23, 2020

Just this week, I and a co-worker were separately stymied by the same kind of bug. In both cases, it boiled down to mis-use of Haskell’s void function.

This is not the actual signature or implementation, but you can think of void like this:

void :: Monad m => m a -> m ()
void action = do
  _result <- action
  pure ()

It runs the given action and discards the _result by returning the unit value instead. In this case, you’re not just discarding a concrete result value but also discarding information in the types: the input type m a can be anything and the fact that it’s discarded means you may not be aware of what it is or when it changes. This can hide bugs. For this reason, from now on, I’m going to consider the use of void a smell.

Aside: I use the term “smell” very precisely. It’s not an indication that something is wrong. It’s an indication something may be wrong. That smell in your fridge might be rotting food, or it could be some really good cheese. It requires more investigation, that’s all.

Let’s walk through the two examples from this week, to see if you agree.

Example 1

In one of my co-worker’s side projects, they started with this code:

EnemyEvaded iid eid -> do
  withQueue $ \queue ->
    let
      queue' = flip map queue $ \case
        Damage EnemyJustEvadedTarget source n -> EnemyDamage eid iid source n
        msg' -> msg'
    in pure (queue', ())
  pure g

They were helpfully presented with a warning indicating that withQueue was returning a result that was implicitly discarded.

error: [-Wunused-do-bind, -Werror=unused-do-bind]
    A do-notation statement discarded a result of type '([Message], ())'
    Suppress this warning by saying
      '_ <- withQueue $ \ queue -> let queue' = ... in pure (q ueue', ())’
     |
1026 |     withQueue $ \queue ->

It’s always better to do things explicitly, so GHC suggests the wildcard pattern (which requires no additional functions in scope) but using void is the same outcome.

EnemyEvaded iid eid -> do
  void $ withQueue $ \queue ->
    let
      queue' = flip map queue $ \case
        Damage EnemyJustEvadedTarget source n -> EnemyDamage eid iid source n
        msg' -> msg'
    in pure (queue', ())
  pure g

This code type-checks, but doesn’t work. Gasp you say! In Haskell? It can’t be! Luckily, we know the if-it-compiles-it-works mantra has its limitations and so we test, and a test caught that this was flushing the queue instead of updating it.

The trouble is the extra pure:

      in pure (queue', ())

The correct code is:

      in (queue', ())

This fixes things without requiring any other changes. How can that be? How do both of these type-check in the same context?

The type of withQueue is (roughly),

withQueue :: ([Message] -> ([Message], r)) -> m r

GHC is trying to treat pure (queue', ()) as ([Message], r). Since we are void-ing out the final r later, it has a lot of freedom to get it wrong. It finds the Applicative instance for tuples: Monoid a => (,) a, so pure x at this type is ([], x) – hence the flushed queue.

In this case, the goal was to return ([Message], ()), and so the expected type of the withQueue expression would be m () already, so void is just not necessary.

EnemyEvaded iid eid -> do
  withQueue $ \queue ->
    let
      queue' = flip map queue $ \case
        Damage EnemyJustEvadedTarget source n -> EnemyDamage eid iid source n
        msg' -> msg'
    in (queue', ())
  pure g

It’s not surprising my co-worker was led astray. We’re trained to follow GHC’s guidance somewhat blindly, especially in the case where the code is well-typed and GHC’s instruction is only a warning. Approaching void as a smell may have offset that tendency and led them to question the type of the argument to withQueue instead. That makes it more likely they’d spot the extra pure.

We can help this along with another good practice, which is to extract and explicitly-type higher-order functional arguments. I’ve seen it happen a lot: higher-order functions + no type signatures = incorrect errors by incorrect type inference.

Following that practice means starting (even if temporarily) with code like this:

EnemyEvaded iid eid -> do
  let
    something :: [Message] -> ([Message], ())
    something queue =
      let
        queue' = flip map queue $ \case
          Damage EnemyJustEvadedTarget source n -> EnemyDamage eid iid source n
          msg' -> msg'
      in pure (queue', ())

  withQueue something
  pure g

The error from the buggy code in this case is:

error:
    • Couldn't match type '([Message], ())' with '()'
      Expected type: ([Message], ())
        Actual type: ([Message], ([Message], ()))
    • In the expression: pure (queue', ())
      In the expression:
        let
          queue'
            = flip map queue
                $ \case
                    Damage EnemyJustEvadedTarget source n -> ...
                    msg' -> ...
        in pure (queue', ())
      In an equation for 'f':
          f eid iid queue
            = let queue' = flip map queue $ ... in pure (queue', ())
     |
1654 |   in pure (queue', ())

Now we can see the actual type error the extra pure introduces. As mentioned, GHC is choosing the Applicative instance for Monoid a => (,) a, which you can think of as ([a], _). So you end up with this extra layer of ([Message], ...). I won’t go into the details on this, but for reasons like what we’re seeing here, that instance is a bit of a foot-gun.

This error doesn’t necessarily do a perfect job of directing you to the fix, but it does a much better job of not directing you to the wrong fix. It’s also nice that it at least indicates the correct line of the bug. Also, it doesn’t take long for a working Haskeller to internalize the wart that is these tuple instances. Once that happens, error messages like this are less confusing.

Example 2

In our main package at work, we have a utility for retrying an operation on errors:

-- | Run an @'IO'@ action until successful
--
-- This assumes the action itself already represents failure in @'Either'@. If
-- you're dealing with a raising action, use @'try'@.
--
runOperationUntilSuccessful
  :: (MonadIO m, MonadLogger m, Exception e) => Int -> m (Either e a) -> m a
runOperationUntilSuccessful attempts operation = do
  res <- operation

  case res of
    Right x -> pure x
    Left ex -> if attempts <= 0
      then throwIO ex
      else do
        logWarnN $ "Operation failed: " <> tshow ex <> ". Trying again."
        runOperationUntilSuccessful (attempts - 1) operation

Since we wrote this, the great retry package was created, which solves this space very nicely. So I was going to replace this function, as well as a few locations where we’ve written the same thing in-line, with this:

-- | 'retrying' a function that uses 'Either' for errors
--
-- Will retry with the given policy on 'Left' results.
--
retryingEither
  :: MonadIO m => RetryPolicyM m -> m (Either e a) -> m (Either e a)
retryingEither policy run = retrying policy (const $ pure . isLeft) (const run)

- void $ runOperationUntilSuccessful 5 $ sendEmail
+ void $ Retry.retryingEither policy $ sendEmail
    (htmlEmail
        (From billr)
        (To EmailName { enEmail = recipient, enName = Nothing })

Can you see the bug?

The type of runOperationUntilSuccessful 5 and retryingEither policy are not the same. But both versions of this code type-checked. The difference is what happens when retries are exhausted: with a void-ed Either function, a final error is discarded instead of thrown.

The void that was here to begin with is unnecessary. runOperationUntilSuccessful 5 is already m () (because sendEmail is m ()). If it weren’t there, my change would’ve failed to compile, since retryingEither policy returns m (Either _ ()).

The solution was to make a version that was also already m () by throwing on Left in the same way, and to remove any unnecessary voids:

-- | 'retryingEither' but throws the 'Left' value when retries are exhausted
retryingEitherThrow
  :: (MonadIO m, Exception e) => RetryPolicyM m -> m (Either e a) -> m a
retryingEitherThrow policy run =
  either throwIO pure =<< retryingEither policy run

- void $ Retry.retryingEither policy $ sendEmail
+ Retry.retryingEitherThrow policy $ sendEmail

Hopefully, you can see from these examples that void is a risk because it destroys type information and harms inference. I encourage you to give it some thought when you come across it. You might save yourself some head-scratching.