Haskell on Actions, Part 3
by @pbrisbin on May 21, 2021
This is the last post in a series about our Haskell projects on GitHub Actions.
In this post, we’ll talk about Docker-based deployment of a Haskell project. Really, this applies to any compiled project, where careful layers, multi-stage builds, and layer caching on CI is important.
- Building a simple project including caching concerns
- Automated releases of libraries or executables
- Docker-based deployments from Actions
Layer management & multi-stage builds
If you already understand Docker layers and multi-stage builds, you can safely skip this section.
Docker builds work in layers. Each step in a `Dockerfile` establishes a layer, which is effectively a snapshot of the file-system at that point. This has two consequences worth discussing here:
- If inputs to a layer (the layers before, any files being added) have not changed from a previous build, and the artifacts from that previous build are still present, it will not be built again
- If you add large files in one layer, and remove them in another, they still physically exist in the original layer and resulting image.
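To illustrate the second point, here is a hypothetical two-layer `Dockerfile` (the file name is made up for the example):

```dockerfile
FROM ubuntu:18.04

# Layer 1: adds the archive's full size to the image
COPY big-dataset.tar.gz /tmp/big-dataset.tar.gz

# Layer 2: hides the file from later layers, but does NOT reclaim
# the space -- layer 1's snapshot still contains it
RUN rm /tmp/big-dataset.tar.gz
```

This is why you often see a download, its use, and its cleanup chained into a single `RUN` instruction: everything happens within one layer, so the deleted bytes never make it into the image.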
What does this mean for your Haskell `Dockerfile`s?
First, you should be sure that slow layers (such as installing the compiler and dependencies) are only “busted” when they need to be. You don’t want a change to your main `App.hs` to cause re-installing GHC. Concretely, this means you should `COPY` in only the files that impact dependency installation, then only install dependencies, as distinct layers.
```dockerfile
RUN mkdir -p /src
WORKDIR /src

# As long as these files don't change
COPY stack.yaml package.yaml /src/

# This step won't re-run
RUN stack --no-terminal build --dependencies-only

# If these files change
COPY library /src/library
COPY executables /src/executables

# Only this (faster) step will re-run
RUN stack --no-terminal build \
  --pedantic \
  --ghc-options '-j4 +RTS -A64m -n2m -RTS' \
  --copy-bins
```
This could be made even more granular. Only `stack.yaml` informs GHC choice, so you could `COPY` that and install GHC separately, before proceeding to `package.yaml` and dependencies installation. However, there are additional complexities with that, such as `extra-deps` and `stack` refusing to do anything without a `package.yaml`. These complexities are solvable, but put this idea on the far side of diminishing returns for me.
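For the curious, that more granular version might look something like the following. This is an untested sketch; in particular, the placeholder `package.yaml` trick (to work around `stack` refusing to run without one) may need adjusting for your setup:

```dockerfile
# Sketch: install GHC in its own layer, keyed off stack.yaml alone
COPY stack.yaml /src/stack.yaml

# stack won't run without a package.yaml, so fake a minimal one,
# install GHC, then remove it again
RUN echo 'name: placeholder' > /src/package.yaml \
 && stack --no-terminal setup \
 && rm /src/package.yaml

# Dependency installation now only re-runs when these files change
COPY stack.yaml package.yaml /src/
RUN stack --no-terminal build --dependencies-only
```

With this split, a resolver bump still rebuilds everything, but editing `package.yaml` no longer re-installs GHC.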
Second, you’ll want to make use of multi-stage builds. In the “old days”, we would do this by building one image with a compiler tool-chain, then running another `docker build` in that image to produce executables for use in a slimmer image built from an entirely different `Dockerfile`. What a headache.

To solve this, recent Docker allows `FROM` commands to be named, and for there to be multiple of them. Each `FROM` begins a new stage from a clean base (and the final `FROM` establishes the resulting image), and you can use `COPY --from` to grab files from any previous stage’s layers.
```dockerfile
# Stage 1
FROM fpco/stack-build-small:lts-17.8 AS builder
# ...
RUN stack install

# Stage 2
FROM ubuntu:18.04
COPY --from=builder /root/.local/bin/my-exe /my-exe
CMD ["/my-exe"]
```
This can interact poorly with caching on CI: if you pull the most recently deployed image before building a new one, in an attempt to re-use cached layers, you’ll find none of the `builder` stage’s layers are cached. This makes sense in retrospect, because they don’t exist in the final image, by design. We’ll solve this when we talk about caching in our example Workflow.
Example
Taking all of the above into account, here’s a mildly abridged example of a typical Haskell `Dockerfile`:
```dockerfile
FROM fpco/stack-build-small:lts-17.8 AS builder
# ...
RUN mkdir -p /src
WORKDIR /src
COPY stack.yaml package.yaml /src/
RUN stack --no-terminal build --dependencies-only
COPY library /src/library
COPY executables /src/executables
RUN stack --no-terminal build \
  --pedantic \
  --ghc-options '-j4 +RTS -A64m -n2m -RTS' \
  --copy-bins

FROM ubuntu:18.04
# ...
COPY --from=builder /root/.local/bin/my-exe /my-exe
CMD ["/my-exe", "+RTS", "-N"]
```
You can go much further in the slim image game. Using something like Alpine as the runtime base is common, but can cause problems with missing shared libraries. More aggressive executable stripping is also common. Again, the complexities that brings are not worth the size savings for me. We find images using this approach typically weigh in around 50MB and we’ve had zero issues working with images of that size.
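As one concrete instance of the missing-shared-library problem: GHC-built executables typically link `libgmp` dynamically, so a minimal runtime base may need it installed. A sketch of what that could look like in the runtime stage (untested; whether you need it depends on how your binary was linked):

```dockerfile
FROM ubuntu:18.04
# Install runtime dependencies the stripped-down base may lack;
# libgmp10 is the usual suspect for dynamically-linked GHC binaries
RUN apt-get update \
 && apt-get install -y --no-install-recommends libgmp10 ca-certificates \
 && rm -rf /var/lib/apt/lists/*
COPY --from=builder /root/.local/bin/my-exe /my-exe
CMD ["/my-exe", "+RTS", "-N"]
```

Running `ldd` on the built executable will tell you exactly which shared libraries the runtime image has to provide.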
Docker Layer Caching on GitHub Actions
There are a few ways to attempt layer caching on GitHub Actions, but I’ve only found one that works: this one. It uses Buildx, which I’m not familiar with, but the key part is the `cache-*` options, particularly `mode=max`, which ensures all the layers from a multi-stage build are included.

Here is a full `ci.yml` using it:
```yaml
name: CI

on: push

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: freckle/stack-cache-action@main
      - uses: freckle/stack-action@main

  image:
    runs-on: ubuntu-latest
    steps:
      # For example, say you push to Dockerhub under the same org/name as this
      # repository itself
      - id: prep
        run: |
          tags=${{ github.repository }}:${{ github.sha }}
          echo "::set-output name=tags::$tags"
      - id: buildx
        uses: docker/setup-buildx-action@v1
      - uses: actions/cache@v2
        with:
          path: /tmp/.buildx-cache
          key: ${{ runner.os }}-image-${{ github.sha }}
          restore-keys: |
            ${{ runner.os }}-image-
      - uses: docker/login-action@v1
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_PASSWORD }}
      - uses: docker/build-push-action@v2
        with:
          builder: ${{ steps.buildx.outputs.name }}
          cache-from: type=local,src=/tmp/.buildx-cache
          cache-to: type=local,mode=max,dest=/tmp/.buildx-cache-new
          push: true
          tags: ${{ steps.prep.outputs.tags }}
      # Avoids ever-growing cache hitting limits
      - run: |
          rm -rf /tmp/.buildx-cache
          mv /tmp/.buildx-cache-new /tmp/.buildx-cache
    outputs:
      tag: ${{ steps.prep.outputs.tags }}

  deploy:
    if: ${{ github.ref == 'refs/heads/main' }}
    needs: [test, image]
    # Most likely some AWS action to update an ECS task to:
    # ${{ needs.image.outputs.tag }}
```
This example uses DockerHub, but only the login step and `prep.outputs.tags` need to change if you use another registry, such as AWS ECR.
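For example, with ECR those steps might look something like the following sketch. The action versions, region, and the `my-repo` repository name are assumptions to fill in for your setup:

```yaml
      - uses: aws-actions/configure-aws-credentials@v1
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-east-1
      - id: ecr
        uses: aws-actions/amazon-ecr-login@v1
      - id: prep
        run: |
          # "my-repo" is a placeholder for your ECR repository name
          tags=${{ steps.ecr.outputs.registry }}/my-repo:${{ github.sha }}
          echo "::set-output name=tags::$tags"
```

The rest of the `image` job (Buildx setup, caching, and `build-push-action`) stays the same.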
Bonus!
I hope you enjoyed this series on Haskell and GitHub Actions. As a parting gift, here are two other Actions-related projects we maintain and use:
- `stack-bump-lts-action`: find the latest LTS, update your `stack.yaml` if it differs, and commit with a message including details about the changed dependencies.
- `hackage-team`: maintain the maintainers for your team’s Hackage libraries to match a centralized list, automatically.