Modernising CI infrastructure into a more scalable and fault-resilient solution

Nalys mission

The mission consisted of redesigning the existing CI infrastructure to decrease build times and to make the builds more maintainable and scalable.

About the project

Project challenges

  • The old build server was out of date. It had no master-slave architecture and had accumulated a mix of deprecated and unmanageable plugins. This led to an unresponsive GUI, long build times and security issues.

  • Over time, builds became more complex and congested the server because of the blocking nature of its architecture.

  • Build job configurations were not traceable: everything had to be managed through the GUI, where no history or logs of changes were recorded.

  • Anyone could make changes, which could introduce bugs with almost no way to revert them.

Nalys solution

The solution consists of three measures:

  1. First, the build server was split into one master node and multiple slaves. The master manages the slaves, which in turn run the build jobs. These slaves can be deployed locally or in the cloud. This makes the architecture more scalable and robust, as more slaves can be spawned dynamically when needed.

  2. Secondly, Docker was used to eliminate Jenkins plugins. Docker containers handle dependencies and versions more simply, are easier to update than the old Jenkins plugins, and can also be run locally during the development phase of a project. The Docker images are built generically so that they can be reused across multiple projects. They are stored in a private Docker repository from which they can be pulled.

  3. Thirdly, Jenkins Configuration as Code (JCasC) was introduced. All build job configurations can be written in YAML. These configurations, together with the Dockerfiles, can be maintained in a version control system such as Git. This increases traceability and reusability as well as robustness: a new master can be set up in minutes.
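
The effect of the first measure can be illustrated with a toy model. The sketch below is not how Jenkins schedules work internally; it only assumes that queued jobs are picked up, in order, by whichever executor becomes free first. The function name `makespan` and the job durations are hypothetical and chosen purely for illustration.

```python
import heapq


def makespan(job_minutes, executors):
    """Wall-clock minutes needed when jobs are taken in queue order
    by whichever executor becomes free first (toy model)."""
    if executors < 1:
        raise ValueError("need at least one executor")
    # Each heap entry is the time at which one executor becomes free.
    free_at = [0] * executors
    heapq.heapify(free_at)
    finish = 0
    for minutes in job_minutes:
        start = heapq.heappop(free_at)  # earliest available executor
        end = start + minutes
        finish = max(finish, end)
        heapq.heappush(free_at, end)
    return finish


# Hypothetical build durations in minutes (not measured data).
queue = [30, 10, 20, 10, 30, 20]

single_server = makespan(queue, 1)  # every job waits for the previous one
three_slaves = makespan(queue, 3)   # jobs spread over three executors
print(single_server, three_slaves)
```

In this model a single blocking server needs the sum of all job durations, while a pool of executors finishes as soon as the busiest one is done, which is why spawning more slaves relieves congestion.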

Key facts & figures

  • About 100 build jobs were converted and are managed through a private Docker repository.
  • A new master build server can be set up in about 3 minutes.
  • About 30 plugins were eliminated.
  • Yocto build time dropped from 1h30m per build to just 15m.
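
To put the Yocto figure in perspective, the short calculation below derives the speed-up factor and the relative time saving from the two durations quoted above. The variable names are illustrative only.

```python
# Durations taken from the Yocto figure above, in minutes.
old_minutes = 1 * 60 + 30  # 1h30m
new_minutes = 15

speedup = old_minutes / new_minutes
reduction_pct = 100 * (old_minutes - new_minutes) / old_minutes

print(f"{speedup:.0f}x faster, {reduction_pct:.0f}% less build time")
```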

Appendix

Want to know more about the technologies used? Have a look at the following links: