Building a CICD pipeline

Summary

Keeping the game playable during development is extremely important for the whole production, and I have had to implement such a pipeline several times. Here I detail how I worked on one of my projects to automate the process of integrating commits and checking the game for compile errors, asset errors and broken gameplay features.

Goal

The importance of a continuous integration/continuous delivery pipeline in gamedev cannot be overstated.

The game needs to be

  • built and delivered on the target device for testing
  • delivered as early and as frequently as possible
  • playable at all times
  • checked for game-breaking errors as early and as automatically as possible

This allows people to focus on what they are best at: quick iteration for programmers, creativity for artists and designers, and game feel and quality feedback for QA testers.

Execution

The tech stack & infrastructure

I’ve used this paradigm a number of times. The actual software used is a mere implementation detail, with pros and cons whose balance depends on the team.

The key elements are:

  • a code/content versioning system (eg: Perforce)
  • a game engine (eg: Unreal)
  • a build management system (eg: TeamCity)
  • a test automation system (eg: Gauntlet & Daedalic Test Automation Plugin)
  • a server host capable of high uptime, high disk space, nightly builds and cheap operation costs (eg: a computer in a closet, a dedicated server, Windows OS)
  • bonus: a nice frontend (eg: UnrealGameSync)

Sidenote: why Windows? Linux would be far more stable, more efficient and easier to automate, but unless the whole studio runs Linux on its development machines or there are dedicated build engineers, platform-specific bugs in the game engine could appear only on the CICD pipeline, and dev time would be wasted investigating and fixing them. The occasional “no nightly build today, there was a Windows update tonight” is the price to pay for that.

The workflow

Every developer syncs their files with the version control system in use.
Before a commit, local checks are run. If these fail, an informative error message gives details on the issue at hand.
Server-side checks are then executed, and these can also fail the commit.
Then a build is made and delivered to the target devices for playtesting by QA and devs.
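
The ordering above can be sketched as a fail-fast sequence of stages. This is only an illustration in Python, not code from the project: the stage names and the `run_pipeline` helper are hypothetical and stand in for whatever the build system actually runs.

```python
from typing import Callable, List, Optional, Tuple

# A stage is a name plus a callable returning None on success or an error message.
# All names here are hypothetical, for illustration only.
Stage = Tuple[str, Callable[[], Optional[str]]]


def run_pipeline(stages: List[Stage]) -> Tuple[bool, List[str]]:
    """Run the stages in order and stop at the first failure."""
    log: List[str] = []
    for name, check in stages:
        error = check()
        if error is not None:
            log.append(f"{name}: FAILED ({error})")
            return False, log
        log.append(f"{name}: ok")
    return True, log


if __name__ == "__main__":
    ok, log = run_pipeline([
        ("local checks", lambda: None),
        ("server checks", lambda: "happy path test broke"),
        ("build and deploy", lambda: None),  # not reached: the previous stage failed
    ])
    print(ok)
    print("\n".join(log))
```

The point of the early return is that a later, more expensive stage never runs on top of a known failure.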

Local checks

Ideally we want to completely block commits that haven’t gone through a local check, but that might be too expensive, both in lost flexibility if we lock the developer out of the versioning system and in the development time needed to get a custom client done and performant. Perforce in particular has no client-side pre-commit hook to use, but a Perforce tool can be developed to replace that.

In my last iteration, the compromise was to make it a rule that all commits are checked locally first, and then to make that comfortable with an easy-to-use test automation system. In the iteration before that, we instead developed a tool in Unity to handle it.

The important thing is that this test is quick, so as not to waste time, since it quickly adds up.
At this stage the following checks are done:

  • missing references in committed assets
  • compile errors in committed code or blueprints
  • fast unit tests in code

A nice addition would have been:

  • missing files in the commit, namely files in the commit changelist referencing a modified or untracked file on disk

In total, this takes 5-10 seconds at most.
In my last iteration of CICD we ended up using Unreal Engine’s Session Frontend to handle the selection of the tests to be run.
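
The “missing files” check listed above as a nice addition could look roughly like the sketch below. It rests on several assumptions: the changelist contents, the set of modified or untracked files on disk and the reference map are passed in as plain Python data (how to obtain them from the versioning system or the engine is out of scope), and `find_missing_files` is a hypothetical helper, not part of any tool.

```python
from typing import Dict, List, Set


def find_missing_files(
    changelist: Set[str],
    dirty_on_disk: Set[str],
    references: Dict[str, List[str]],
) -> Dict[str, List[str]]:
    """For each file in the changelist, list the files it references that are
    modified or untracked on disk but not part of the changelist."""
    missing: Dict[str, List[str]] = {}
    for path in sorted(changelist):
        left_out = sorted(
            ref
            for ref in references.get(path, [])
            if ref in dirty_on_disk and ref not in changelist
        )
        if left_out:
            missing[path] = left_out
    return missing


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    report = find_missing_files(
        changelist={"Hero.uasset"},
        dirty_on_disk={"Hero.uasset", "HeroTexture.uasset"},
        references={"Hero.uasset": ["HeroTexture.uasset", "HeroMesh.uasset"]},
    )
    for path, refs in report.items():
        print(f"{path} references files left out of the commit: {', '.join(refs)}")
```

Being a pure set lookup, a check like this stays well within the time budget of the local stage.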

Build and deploy

Of course a CICD pipeline needs a “Deploy” part. That will change a lot depending on the tech stack used. In my last implementation we had engine code accessible but outside our project depot, and we handled precompiled binaries so that artists wouldn’t need to compile code locally. For reference see the first paragraphs in https://udn.unrealengine.com/s/article/UGS-Precompiled-Binaries-Guide

Server checks

Depending on the tech stack, a pre-commit check might be impossible, and a workaround with post-commit hooks may be needed to get an equivalent result. Although it’s not ideal, using post-commit checks and rushing to fix the issue is a livable workflow.

In TeamCity it’s possible to execute the tests as a post-commit reaction.
Here the test is as comprehensive as possible, but from a maintenance standpoint it’s not necessary to have 100% test coverage of all edge cases, as that would be extremely time consuming to build and maintain. Instead, a “done” feature has a “happy path” integration test that informs the team when the feature has been broken.
At this stage there are checks for:

  • gameplay features happy paths
  • performance budget constraints
  • level playability
  • slow unit tests in code

In my last iteration I implemented the tests in this stage using Gauntlet, Unreal Engine’s test automation framework, with the help of Daedalic Test Automation Plugin, which required relatively little work to be production ready.
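
One way to surface the results of such checks in the TeamCity UI is through its service messages, which are plain lines written to the build log. The sketch below is an assumption about how that could be wired up, not a description of the project’s setup: `service_messages` and the test name are hypothetical, while the `##teamcity[...]` message format and its `|` escaping follow TeamCity’s documented convention.

```python
from typing import List, Optional


def escape(value: str) -> str:
    """Escape a value for use inside a TeamCity service message."""
    # The vertical bar is the escape character, so it has to be handled first.
    for raw, escaped in (("|", "||"), ("'", "|'"), ("\n", "|n"),
                         ("\r", "|r"), ("[", "|["), ("]", "|]")):
        value = value.replace(raw, escaped)
    return value


def service_messages(test_name: str, error: Optional[str] = None) -> List[str]:
    """Build the log lines that report one test result to TeamCity."""
    name = escape(test_name)
    lines = [f"##teamcity[testStarted name='{name}']"]
    if error is not None:
        lines.append(f"##teamcity[testFailed name='{name}' message='{escape(error)}']")
    lines.append(f"##teamcity[testFinished name='{name}']")
    return lines


if __name__ == "__main__":
    # Hypothetical happy path test that broke.
    for line in service_messages("HappyPath.Jump", "Player never left the ground"):
        print(line)
```

Printing these lines from the script that runs the tests is enough for TeamCity to pick them up from standard output.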

The test automation system

Gauntlet can launch tests implemented in Unreal Engine from the command line, and the Daedalic plugin allows blueprint test maps to be used very comfortably both in this manner and manually.
A second plugin needs to be implemented to get code tests to run in the same fashion.

The result is a robust battery of test tools that can be used with as little overhead as possible by both coders and designers to create automated tests for their respective implemented features.
It’s also possible to get customized error reports that allow for rapid debugging once an issue is detected.

Code tests

Unreal offers a vast array of code test options. What I chose in my last iteration was Automation Spec (docs here https://docs.unrealengine.com/4.27/en-US/TestingAndOptimization/Automation/AutomationSpec/ ) because it minimized the overhead, with some extra macros to handle world, gameobject and component creation in the test in a concise and error-safe manner.
I recommend this approach also for Test Driven Development.
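
Independently of Gauntlet, tests written with the automation framework can also be launched headlessly through the editor’s `Automation RunTests` console command. The sketch below only builds such a command line. The executable name, project path, test filter and report folder are hypothetical, and the flags are the ones commonly used with UE4, so they should be verified against the engine version in use.

```python
def automation_command(editor_cmd: str, uproject: str, test_filter: str, report_dir: str) -> str:
    """Build a headless editor command line that runs the matching automation tests."""
    return " ".join([
        f'"{editor_cmd}"',
        f'"{uproject}"',
        # The console commands are quoted as one value: run the tests, then quit.
        f'-ExecCmds="Automation RunTests {test_filter};Quit"',
        f'-ReportOutputPath="{report_dir}"',  # where the test report is written
        "-unattended",  # no dialogs waiting for user input
        "-nopause",
        "-nullrhi",     # no rendering, which is fine for pure code tests
        "-log",
    ])


if __name__ == "__main__":
    # Hypothetical paths and test filter, for illustration only.
    print(automation_command(
        "UE4Editor-Cmd.exe", "MyGame.uproject", "MyGame.CICD", "Saved/TestReports"
    ))
```

The command is assembled as a single string on purpose, because the quotes belong around the value of `-ExecCmds` rather than around the whole argument.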

Blueprint tests

Integrating Daedalic Test Automation Plugin comes at a bit of a cost: it requires some fixes and tweaking to adapt it to your preferred workflow, but it comes with Gauntlet compatibility.

One of the most important tweaks is filtering the tests needed for CICD. In my last iteration I did so based on a CICD folder being present in the path of the test.
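
The folder-based filter can be illustrated outside the engine as follows. This is a Python sketch of the idea only, not the tweak made to the plugin: the asset paths and the `select_cicd_tests` helper are hypothetical.

```python
from pathlib import PurePosixPath
from typing import Iterable, List


def select_cicd_tests(test_paths: Iterable[str], marker: str = "CICD") -> List[str]:
    """Keep the tests whose path contains a folder named exactly `marker`."""
    selected = []
    for path in test_paths:
        # Only the folders count, not the name of the test asset itself.
        folders = PurePosixPath(path).parent.parts
        if marker in folders:
            selected.append(path)
    return selected


if __name__ == "__main__":
    # Hypothetical test map paths.
    print(select_cicd_tests([
        "/Game/Tests/CICD/Jump/TM_Jump",
        "/Game/Tests/Manual/TM_DebugCamera",
    ]))
```

Matching on whole folder names rather than substrings keeps a folder such as `CICD_Old` from being picked up by accident.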

In blueprints, there’s little difference between making tests with vanilla Unreal Engine and making them with the Daedalic plugin.

Performance tests

There are also ample ways to capture performance with a test. The one I followed in my last CICD implementation was to have a camera do a flythrough of the levels, triggering everything that the player can trigger, while recording everything with the stat Startfile/stat Stopfile commands. The result files are then converted to HTML files.
Additionally, when a performance budget limit is crossed, an error is logged and added to the report, complete with a screenshot.
Since this runs in the engine, the report isn’t 100% accurate: for example, the ticking UI of the editor is included, and your target platform can have different rendering performance compared to Unreal on a PC. It can still be used as a benchmark to spot anomalies and trigger an investigation.
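
The budget check itself reduces to comparing captured samples against a limit. The sketch below covers only that comparison, under the assumption that frame times in milliseconds have already been extracted from the capture. The 33.3 ms budget, the sample values and the `find_budget_violations` helper are hypothetical, and logging the error with a screenshot is left out.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class BudgetViolation:
    sample_index: int  # position of the sample in the capture
    frame_ms: float    # measured frame time in milliseconds


def find_budget_violations(
    frame_times_ms: Sequence[float], budget_ms: float
) -> List[BudgetViolation]:
    """Return every sample whose frame time is strictly above the budget."""
    return [
        BudgetViolation(index, frame_ms)
        for index, frame_ms in enumerate(frame_times_ms)
        if frame_ms > budget_ms
    ]


if __name__ == "__main__":
    # Hypothetical samples checked against a hypothetical 33.3 ms budget.
    for violation in find_budget_violations([16.0, 40.0, 33.3, 35.5], budget_ms=33.3):
        print(f"sample {violation.sample_index}: {violation.frame_ms} ms is over budget")
```

Keeping the sample index makes it possible to map a violation back to the point of the flythrough where it happened.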

References:

Here is a list of useful references for implementing such a pipeline:
https://www.emidee.net/ue4/2018/11/13/UE4-Unit-Tests-in-Jenkins.html
https://unrealcommunity.wiki/jenkins-ci-amp-test-driven-development-6912tx0c
https://qiita.com/donbutsu17/items/cd17d500a9fed143e061
https://horugame.com/gauntlet-automated-testing-and-performance-metrics-in-ue4/
https://github.com/DaedalicEntertainment/ue4-test-automation#gauntlet

Build management

The last step after all tests have been passed is to package the build for the use of the team and QA.
In a sense this is also a test: the packaging process, the delivery to the target platform and the match between in-editor and on-device behaviour are verified every time the build is ready and deployed.

I used TeamCity to handle this aspect in my last iteration of CICD, with a mix of command line commands, Python scripts and TeamCity’s own settings. As a bonus, TeamCity can also be used to handle light bakes.
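
As an example of the kind of Python script involved, the sketch below assembles a packaging command for Unreal’s automation tool. It is not the script used in the project: the paths, platform and configuration are hypothetical, and the `BuildCookRun` flags shown are the commonly used ones, to be checked against the engine version in use.

```python
def package_command(
    run_uat: str, uproject: str, platform: str, config: str, archive_dir: str
) -> str:
    """Build a RunUAT BuildCookRun command line that packages the project."""
    return " ".join([
        f'"{run_uat}"',
        "BuildCookRun",
        f'-project="{uproject}"',
        f"-platform={platform}",
        f"-clientconfig={config}",
        "-build",    # compile the game code
        "-cook",     # cook content for the target platform
        "-stage",    # gather the build into a staging folder
        "-pak",      # pack the content into pak files
        "-archive",  # copy the result to the archive directory
        f'-archivedirectory="{archive_dir}"',
    ])


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(package_command(
        "Engine/Build/BatchFiles/RunUAT.bat", "MyGame.uproject",
        "Win64", "Development", "Builds/Nightly",
    ))
```

A build step in TeamCity can then run the printed command and publish the archive directory as a build artifact.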

Retrospect

Implementing the whole pipeline in different environments and with different tech stacks has convinced me that this work is best done as soon as possible. There’s no stage of development where making and maintaining the tests isn’t worth the security that they give back.

Performance checks can perhaps be delayed until the Vertical Slice stage, but they need to be in place by the time Alpha work starts.

It’s important to get buy-in from the team when introducing this kind of system. It works wonders when most people contribute to implementing tests most of the time, and it is of little use when the ‘Janitor’ syndrome develops: a situation where only a few select individuals care to implement and maintain tests, both for themselves and for others.