In the pyca/cryptography project we ostensibly write Python (and Rust), but in reality we are CI engineers doomed to spend 90% of all our development time building and maintaining our annoyingly large CI[1]. As part of that unwanted work we possess a self-hosted runner for macOS arm64 builds[2].

Self-hosted runners have one big drawback[3]: they lack build isolation. In the typical security model of GitHub Actions every job runs in an isolated environment. At the end of a run that environment is destroyed[4] and future runs are not exposed to any products of a previous run. On a self-hosted runner, this is not true. Instead, each invocation of the runner executes on the same state-filled system that you built, with whatever mutations the last job performed still present. In some cases (e.g. closed source development using GHA) this is an annoyance (you can’t treat all your GHA runners the same!) that can be worked around. However, in the open source world[5] this represents an unacceptable security problem. CI systems run arbitrary code[6], and trivial persistence on a CI runner is clearly undesirable.

For this reason, ever since we installed our M1 runner we have restricted it to running only on pushes to main, but, like professional athletes, we always want more.

CiderMill, ephemeral macOS VM runners for GHA

We created CiderMill to get the ephemeral macOS arm64[7] bliss we craved but didn’t want to wait for. CiderMill uses tart and GHA’s ephemeral runner support to provide build isolation while integrating extremely well with the GitHub Actions ecosystem. Simply put, it does the following:

  • Boots one or more[8] fresh macOS VMs set up via your own custom Packer scripts using tart (which, in turn, leverages Apple’s Virtualization.framework).
  • Installs and registers an ephemeral action runner within the VM.
  • Waits until the runner exits (upon completion of a job) and then does it all again.
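
The cycle above can be sketched roughly as follows. This is a minimal illustration, not CiderMill's actual implementation: it assumes tart's `clone`, `run --no-graphics`, `stop`, and `delete` subcommands, and the names `run_one_job`, `boot_vm`, and `provision_and_wait` are hypothetical. The `provision_and_wait` callable stands in for whatever installs the runner inside the VM, registers it as ephemeral, and blocks until it exits.

```python
import subprocess
import uuid
from typing import Callable, List


def run_cmd(cmd: List[str]) -> None:
    """Run a host command and raise if it fails."""
    subprocess.run(cmd, check=True)


def boot_vm(vm: str) -> subprocess.Popen:
    """Start the VM in the background; `tart run` stays alive while the VM is up."""
    return subprocess.Popen(["tart", "run", "--no-graphics", vm])


def run_one_job(
    base_image: str,
    provision_and_wait: Callable[[str], None],  # hypothetical: install runner, block until it exits
    sh: Callable[[List[str]], None] = run_cmd,
    boot: Callable[[str], object] = boot_vm,
) -> str:
    """Boot a fresh VM, let it serve exactly one job, then throw it away."""
    vm = "runner-" + uuid.uuid4().hex[:8]
    # Every job gets its own clone, so nothing a previous job wrote survives.
    sh(["tart", "clone", base_image, vm])
    try:
        proc = boot(vm)
        try:
            provision_and_wait(vm)
        finally:
            # Assumption: stopping the VM makes the background `tart run` exit.
            sh(["tart", "stop", vm])
            proc.wait()
    finally:
        # Delete the clone even if provisioning or the job itself failed.
        sh(["tart", "delete", vm])
    return vm


def serve_forever(base_image: str, provision_and_wait: Callable[[str], None]) -> None:
    """'...and then does it all again.'"""
    while True:
        run_one_job(base_image, provision_and_wait)
```

The point of the `try`/`finally` nesting is that the clone is always deleted, which is what gives each job a clean slate regardless of how the previous one ended.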

In pyca/cryptography’s case, thanks to CiderMill we can relax our constraints and integrate our macOS arm64 job into our primary CI! Fewer lines of nested bash and YAML plus pre-merge testing on an important architecture: seems good.

We run this on a single 16GB M1 Mini, but projects with more resources can simply run more instances of CiderMill as needed[9].

Head on over to CiderMill to check out the documentation[10], try it out, and see if it fits your needs.

  1. We run over 70 separate jobs per PR, and this is despite aggressively limiting our combinatorial behavior. There are challenges with a matrix this large, mostly around CI cycle time, tail latency issues, and ephemeral problems with networking or other parts of the underlying infrastructure. Also we run a k8s cluster for ephemeral arm64 Linux runners, but that’s a story for another time.

  2. An M1 kindly donated by MacStadium’s Open Source Program.

  3. Beyond the really big one where you have to operate your own infrastructure. 

  4. Modulo whatever caching you may have chosen to do via actions/cache, etc. 

  5. Specifically, open source projects that allow anyone to submit PRs. 

  6. First-time contributor approval workflows and other things exist to mitigate a variety of CI abuse (e.g. crypto miner), but self-hosted runners have significantly more challenges that are not fully addressable via these mechanisms. 

  7. Apple Silicon, arm64, AArch64, M1, M2…sigh, too many names. 

  8. Apple’s godawful rules about macOS virtualization limit us to 2 VMs. 

  9. Just remember, there’s no advantage to running this on an M1 Pro/Max/Ultra machine unless you need more RAM, since you’re constrained to 2 VMs[8] per physical machine.

  10. This project was built for our own use so there are some documented missing features (e.g. no PATs, only GH Apps). Will we add them? Maybe. Could you add them? Yes!