If you have ever worked with GitLab CI/CD pipelines, you are probably familiar with watching pipelines run as part of the release process. Depending on how complex your CI jobs are, they can finish quickly, or they can become a nightmare: we sometimes have to wait a couple of hours for a job to finish, or in the worst cases even longer. While working to improve our CI/CD pipelines, here are the techniques I used to reduce the pipeline runtime from about 19 minutes down to approximately 5 minutes.

Optimize CI/CD with underlying infrastructure

1. Set up your own GitLab Runner

GitLab.com (GitLab’s SaaS platform) offers shared runners for repository usage, which is excellent for a quick start. However, these shared runners sometimes do not provide enough resources for large projects with many build and test tasks, especially when we need to run heavy jobs.

Thus, the most significant performance boost I experienced came from deploying our own dedicated runners. In Continuous Integration and Continuous Delivery/Deployment, building and deploying are the steps you see most of the time. Building frequently involves downloading libraries, dependencies, Docker images, and GitLab-hosted projects, as well as compilation, while deploying entails transferring our assets to another location. The performance of the machine running the Docker containers becomes crucial at this point, and the shared runners could not provide enough of it to speed up the pipeline.

Hosting our own GitLab Runner gives us the choice to increase the CPU and RAM available to our runners. This speeds up our builds and deployments, resulting in better developer efficiency. Beyond CPU and RAM, there are other factors to consider when managing our own GitLab runners. One crucial aspect is the potential bottleneck caused by network speed. Fortunately, with the evolution of cloud services, we can easily get a server on a private cloud to address this network issue, such as one on AWS, Google Cloud Platform, or DigitalOcean. Hosting the server on these cloud platforms consistently delivered approximately double the network speed for us, which reduced interruptions during build and deploy jobs.

If you don’t know how to configure GitLab Runner, please visit my Configuring & Managing GitLab Runners with Reusable Templates: My Real-World Workflow blog post for an implementation guide.

2. Allocate resources to your runners appropriately

Once you have your own runners, you will notice that some CI/CD jobs do not require significant resources to run, while others need more CPU and RAM to run faster. Consider configuring multiple runners for different purposes, so that you can match heavy or light CI/CD jobs to a correspondingly sized runner.

To configure this, add a [[runners]] section to the config.toml file. Click here to learn more about the other settings available in config.toml.

For example, suppose you have a server with 8 GB of RAM and a 4-core CPU, a Build job with heavy tasks, and a Deploy job with light tasks. You can add one runner for your Build job and another for your Deploy job. Under each [[runners]] section there is a sub-section called [runners.docker]; this is where we set the CPU and RAM limits for the runner.

The example sections should be:

# Runner used by the Build job
[[runners]]
  name = "build-job-runner"
  limit = 2
  url = "https://gitlab.com"
  ...
  [runners.docker]
    image = "php:latest"
    # limit this runner's containers to 2 GB of RAM
    memory = "2048m"
    memory_swap = "2500m"
    # pin this runner's containers to CPU cores 1 and 2 (indexes 0 and 1)
    cpuset_cpus = "0,1"
    # limit this runner to 0.4 CPUs' worth of time on those cores
    cpus = ".4"

# Runner used by the Deploy job
[[runners]]
  name = "deploy-job-runner"
  limit = 10
  url = "https://gitlab.com"
  ...
  [runners.docker]
    image = "php:latest"
    # limit this runner's containers to 256 MB of RAM
    memory = "256m"
    memory_swap = "512m"
    # pin this runner's containers to CPU cores 3 and 4 (indexes 2 and 3)
    cpuset_cpus = "2,3"
    # limit this runner to 0.3 CPUs' worth of time on those cores
    cpus = ".3"

Once you have this configuration in place, you have registered these 2 runners with GitLab.com. You can tag each runner, for example “build_job_runner” and “deploy_job_runner” (you can choose any names you want). Finally, when you define your Build and Deploy jobs in the .gitlab-ci.yml file, specify which job uses which runner with the tags keyword. Click here to learn more about tags.

Here is an example:

build-job:
  stage: build
  tags:
    - build_job_runner  
  script:
    - echo "Run build"

deploy-job:
  stage: deploy
  tags:
    - deploy_job_runner  
  script:
    - echo "Run deploy"

Now your build job will use the “build-job-runner” and your deploy job will use the “deploy-job-runner”. Depending on your requirements and your GitLab Runner resources, you might want to set the CPU and RAM higher to meet your expectations.

3. Pre-download repositories

In case you’re hosting your own GitLab Runner server, this section may help you in some aspects. When a CI job runs, it normally downloads the repository where your project is hosted. By default, GitLab uses the git fetch strategy because it is faster: it re-uses the local working copy (and falls back to clone if it doesn’t exist). See Git strategy for more details.
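If you want to make this explicit, or restore it after experimenting with other strategies, the fetch strategy can be pinned in .gitlab-ci.yml. A minimal sketch:

```yaml
variables:
  # "fetch" re-uses the local working copy; "clone" starts fresh every run
  GIT_STRATEGY: fetch
```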

In some special cases, when you need to clone the entire repository or switch between branches in a CI job, it might not be ideal to clone the whole project every time the CI job runs. Especially when your project is very large. Having pre-downloaded repositories on your GitLab Runner server might be useful in this circumstance.

To carry out this method, first create a path on the gitlab-runner server where you will store your project.

Create /data/your-project/builds directories

mkdir -p /data/your-project/builds

Next, we need to configure the path inside the Docker container where CI jobs store your project. We set the builds_dir parameter under the [[runners]] section to /builds, and then configure the volumes parameter under the [runners.docker] section so that the /data/your-project/builds directory on the host is mounted to /builds inside the container when a CI job runs.

Here is what we should configure:

[[runners]]
  name = "build-job-runner"
  limit = 2
  url = "https://gitlab.com"
  builds_dir = "/builds"
  ...
  [runners.docker]
    volumes = ["/data/your-project/builds:/builds"]

Now, when a CI job uses the runner called “build-job-runner”, your project is downloaded on the first run and stored in /builds, which is mounted to /data/your-project/builds. On subsequent runs, CI jobs no longer download the whole project; they just fetch the latest commits (with a default history depth of 20 commits). This can save you a lot of time when you have multiple jobs running in different stages on the same runner.

4. Pre-install packages and dependencies

If you install packages and dependencies every time you run your CI jobs, you are wasting time. Instead, look for Docker images that already contain your desired packages and dependencies. For instance, instead of installing PHP and its dependencies in every job, you can use the php:latest Docker image, which already contains PHP.

If the available Docker images don’t meet your requirements and you want to customize them or add more dependencies, you may need to pre-build your own Docker image and store it in a container registry that GitLab’s runners or your own gitlab-runner server can access. You can use GitLab’s container registry or Amazon ECR.
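As a sketch of such a pre-built image (the base tag and package list here are purely illustrative, pick whatever your jobs actually need), the Dockerfile might look like this:

```dockerfile
# Hypothetical custom image: start from the official PHP image and
# bake in the extra tools our CI jobs need, so jobs can skip this step.
FROM php:8.1-cli
RUN apt-get update \
    && apt-get install -y --no-install-recommends git unzip \
    && rm -rf /var/lib/apt/lists/*
# Copy the Composer binary in from the official Composer image
COPY --from=composer:latest /usr/bin/composer /usr/bin/composer
```

You would then build and push this once to your registry and reference the resulting tag in your jobs’ image keyword.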

5. Use locally cached Docker images on your own GitLab Runner server

If you have your own GitLab runner server and you have built your custom Docker images, you can speed up a CI job by downloading your Docker image onto your GitLab runner server. When your CI job runs using your own GitLab runner, it will use the locally cached Docker image that has already been downloaded. If the Docker image is not present, the CI job will automatically download it.

To set this up, configure the pull_policy parameter either in .gitlab-ci.yml (learn more about image:pull_policy) or under the [runners.docker] section in config.toml (click here to learn more about configuring this file).

If you use .gitlab-ci.yml, it should be configured as follows:

build-job:
  stage: build
  image:
    name: "your-domain-container-registry/php:custom"
    pull_policy: if-not-present
  tags:
    - build_job_runner  
  script:
    - echo "Run build"

If you use the config file instead, set the if-not-present policy in config.toml; whenever a CI job uses this “build-job-runner”, the pull_policy is applied.

[[runners]]
  name = "build-job-runner"
  executor = "docker"
  ...

  [runners.docker]
    ...
    pull_policy = "if-not-present"

When you use the if-not-present policy, the runner first checks whether a local image exists; if not, it pulls the image from the registry. When you update your Docker image infrequently, the simplest approach is to remove the old image from your GitLab Runner server and let the CI job pull the latest image on its next run. Alternatively, you can run the docker pull command manually. You can also automate it: create a CI job that rebuilds your Docker image, pushes it to the container registry, and finally connects to your GitLab Runner server to remove the old image.
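A hedged sketch of that automated route, assuming the runner uses a shell executor (or has access to the host Docker daemon) and using GitLab’s predefined $CI_REGISTRY_IMAGE variable with an illustrative php:custom tag:

```yaml
refresh-custom-image:
  stage: build
  tags:
    - build_job_runner
  script:
    # Rebuild and publish the image
    - docker build -t "$CI_REGISTRY_IMAGE/php:custom" .
    - docker push "$CI_REGISTRY_IMAGE/php:custom"
    # Drop the stale local copy so jobs using pull_policy "if-not-present"
    # pull the fresh image on their next run
    - docker rmi "$CI_REGISTRY_IMAGE/php:custom" || true
```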

6. Set the overlay2 storage driver

As a best practice, GitLab recommends using the overlay2 storage driver instead of the default vfs storage driver (click here to learn more) to speed up your CI. The vfs storage driver copies the file system on every run, causing disk-intensive operations; overlay2 avoids this.

If you use .gitlab-ci.yml file, you can enable the driver for each project by adding the following lines to the top of the file.

variables:
  DOCKER_DRIVER: overlay2

If you host your own runners, you can simply set the DOCKER_DRIVER environment variable in the [[runners]] section of the config.toml file,

[[runners]]
  environment = ["DOCKER_DRIVER=overlay2"]

7. Organise your Dockerfiles carefully

Docker’s build cache is a feature that speeds up the process of building Docker images by reusing layers from previous builds. When you build a Docker image, each instruction in the Dockerfile creates a layer, and Docker caches these layers to avoid rebuilding them if the command and context have not changed. Click here to learn more about best practices for Dockerfiles.

The main idea is that it helps you avoid running unnecessary instructions that haven’t changed. For instance, you can build a main Docker image that contains PHP packages.

FROM ubuntu:22.04
RUN apt-get -y update && \
    apt-get -y install php8.1-cli php8.1-fpm 

Later, you create another Docker image based on the main image and add whatever configuration you need.

FROM php:base-22.04
COPY php/custom.ini /etc/php/8.1/mods-available/custom.ini

This way, you don’t re-run the installation steps from the main Docker image when nothing has actually changed; duplicating that installation would only slow down your builds.

8. Use a cached image when building a Docker Image

Thanks to Docker’s build cache, only the layers that have changed are rebuilt, which significantly reduces the time it takes to build images. But you’ll end up building images from scratch whenever Docker cannot locate an earlier build, which is exactly what happens when every CI job starts with a clean environment.

To fix this, we can use the --cache-from option to tell Docker where to look for a previously built image. Here is an example of building custom Ubuntu from a cache ubuntu:20.04.

DOCKER_REGISTRY_HOST="Your-docker-registry-url"

docker build --pull --cache-from "$DOCKER_REGISTRY_HOST/ubuntu:20.04" -t "$DOCKER_REGISTRY_HOST/myubuntu:20.04" .

You may want to check out docker build options to know more.

9. Base your CI images on small Linux distributions

For the best CI performance, it is usually a good idea to use a tiny Linux distribution to boost pipeline speed. One of the most popular choices is the Alpine Docker image, a minimal Alpine Linux-based image that is only about 5 MB in size yet has a full package index. Alpine Linux is a lighter alternative to common distributions like Ubuntu or Debian, which leads to much smaller images.

The reason to consider this is that when you run a few small tests or jobs that don’t involve many tasks, it might not be worth downloading a base image that is 50-60 times larger.
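For instance, a small job can run on Alpine directly; the job name and script below are just placeholders:

```yaml
quick-check:
  stage: test
  # alpine:latest is only a few MB, versus tens of MB (compressed) for Ubuntu
  image: alpine:latest
  script:
    - apk add --no-cache bash
    - echo "Run a small check"
```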

Optimize CI/CD Configuration in YAML settings

The .gitlab-ci.yml file can improve pipeline speed significantly if you find the right features and set them up properly. Here they are:

10. Utilize the cache feature.

In a CI job, you sometimes need to install dependencies for your project, things that you can’t pre-build into your own Docker images. For instance, when you use Composer to install dependencies for a PHP project, it creates a vendor folder under the project’s root directory. We can use the cache feature to share the vendor directory between CI jobs, like this:

build-job:
  script:
    - composer install
  cache:
    key: composer-cache
    paths:
      - vendor/

When this “build-job” finishes, it uploads the vendor directory as a cache archive to the runner’s configured cache storage (for example, an S3 bucket). Later, every CI job whose cache setting uses the key “composer-cache” downloads that archive and extracts it to restore the vendor/ directory.

Despite the advantages of the cache feature, it is essential to define your cache strategy carefully; otherwise, it may actually slow down your pipelines instead of speeding them up. For example, when multiple jobs share the same cache, especially a gigabyte-sized one, each job first downloads and unzips the cache, and then zips and uploads it again when the job completes. Every job follows the same procedure, which works against your goal of accelerating the pipeline. This default behavior is known as the “pull-push” policy. To avoid it, specify the “push” policy on the job that acts purely as a producer (it only uploads the cache), and the “pull” policy on jobs that only act as consumers (they only download it).
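A minimal sketch of splitting producer and consumer cache policies, with hypothetical job names and a hypothetical test command:

```yaml
# Producer: installs dependencies and only uploads the cache
build-job:
  stage: build
  script:
    - composer install
  cache:
    key: composer-cache
    paths:
      - vendor/
    policy: push

# Consumer: only downloads the cache, never re-uploads it
test-job:
  stage: test
  script:
    - ./run-tests.sh
  cache:
    key: composer-cache
    paths:
      - vendor/
    policy: pull
```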

11. Optimize your script to update the cache only when it really needs to

In relation to the cache feature, when we work on, for example, a PHP project, we usually run composer install in a CI job to install the project’s dependencies.

In this case, we want to run composer install only when the composer.json and composer.lock files have changed. We can create a script that uses the Linux sha1sum command to check the SHA-1 digests of those two files. After a successful install, the script writes their checksums to a composer_hash_project.sha12 file, which is then zipped and uploaded as a second cache.

The next time the CI job runs, it executes the script, which checks the files against composer_hash_project.sha12. If composer.json and composer.lock haven’t changed, it skips running composer install, but still uploads the vendor/ directory as an unchanged cache. You can follow the instructions below to achieve this; this method is also covered in this article.

First, create a script for your project, e.g. /your-project/scripts/check_composer_install.sh with the content below, make sure you have composer.phar file in your project:

#!/usr/bin/env bash

echo "$(date) - Checking composer files..."
if [[ -f composer.phar ]]; then
    checkFile="$COMPOSER_HASH_FILE"
    # Re-install only when the recorded checksums are missing or stale
    if [[ ! -f "$checkFile" ]] || ! sha1sum -c "$checkFile" > /dev/null 2>&1; then
        echo "$(date) - Composer files have changed. Running composer install"
        php composer.phar install --no-progress --verbose
        retVal=$?

        if [[ $retVal -eq 0 ]]; then
            # Record the new checksums only after a successful install
            sha1sum composer.json composer.lock > "$checkFile"
        fi
    fi
fi

echo "$(date) - Finished!"
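To see the sha1sum mechanism in isolation, here is a small self-contained demonstration, using throwaway files standing in for composer.json and composer.lock:

```shell
cd "$(mktemp -d)"

# Simulate the two Composer files and record their checksums
printf 'deps-v1' > composer.json
printf 'lock-v1' > composer.lock
sha1sum composer.json composer.lock > composer_hash_project.sha12

# Files unchanged: the check passes, so composer install would be skipped
if sha1sum -c composer_hash_project.sha12 > /dev/null 2>&1; then
    echo "unchanged: skip composer install"
fi

# Change one file: the check fails, so composer install would run
printf 'deps-v2' > composer.json
if ! sha1sum -c composer_hash_project.sha12 > /dev/null 2>&1; then
    echo "changed: run composer install"
fi
```

The exit status of `sha1sum -c` is what the CI script keys off: zero when every recorded checksum still matches, non-zero otherwise.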

On the job that keeps the default pull-push policy for uploading the cache, we configure:

produce-cache-job:
  variables:
    COMPOSER_HASH_FILE: composer_hash_project.sha12
  script:
    # Run the script from your project's root path.
    - ./scripts/check_composer_install.sh
  cache:
    - key: composer-cache
      paths:
        - vendor/
    - key: composer-cache-version
      paths:
        - $COMPOSER_HASH_FILE

In a job that only needs to download the cache, we use a pull policy. The job still downloads and unzips both caches, checks composer_hash_project.sha12, and runs composer install if necessary, but uploads no cache at the end. We can configure it as follows:

consume-cache-job:
  variables:
    COMPOSER_HASH_FILE: composer_hash_project.sha12
  script:
    # Run the script from your project's root path.
    - ./scripts/check_composer_install.sh
  cache:
    - key: composer-cache
      paths:
        - vendor/
      policy: pull
    - key: composer-cache-version
      paths:
        - $COMPOSER_HASH_FILE
      policy: pull

This method can also be applied to other ecosystems such as Python, Ruby, NodeJS, etc.

12. Parallelize your CI jobs

Parallelization can drastically cut down pipeline duration when you have large test suites or many tasks to complete. By using the parallel keyword in GitLab, you can run multiple copies of a job concurrently, improving pipeline speed by dividing the work across jobs that run at the same time.

Here is an example of using the parallel keyword to split test cases and run multiple jobs at the same stage.


test-cases:
  stage: test
  script:
    - echo "Test cases for $CI_NODE_INDEX/$CI_NODE_TOTAL"
  parallel: 3

You’ll have a pipeline with three concurrent jobs: test-cases 1/3, 2/3, and 3/3.

There is another example of using parallel jobs in this blog post; have a look there for more.
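GitLab also supports parallel:matrix, which spawns one job per combination of variable values; here is a sketch with hypothetical services and regions:

```yaml
deploy-service:
  stage: deploy
  script:
    - echo "Deploy $SERVICE to $REGION"
  parallel:
    matrix:
      - SERVICE: [api, frontend]
        REGION: [eu-west-1, us-east-1]
```

This creates four concurrent jobs, one per SERVICE/REGION pair.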

13. Run CI jobs only when files have changed.

For monorepos, this tip is very helpful when you have a number of separate microservices or apps. For example, say you want to build the VueJS files for your frontend, but this CI job should run only when files in the src/ directory have changed; we don’t want to run it every time we push a commit or create a new branch. Applying rules:changes:paths helps us avoid running unnecessary jobs in a pipeline when we don’t really need them.

generate-vuejs:
  script:
    - npm install
    - npm run build
  rules:
    - if: $CI_COMMIT_REF_SLUG
      changes:
        paths:
          - src/**/*

You might want to check out Choose when a job runs to know more.

14. Use the retry feature to rerun a failed job automatically

You might sometimes see your CI jobs fail for various reasons. Apart from real issues that you have to fix in your code, there are transient problems that resolve themselves: simply rerunning the CI job fixes the pipeline. These failures could be a network issue, a failed artifact upload, and so on. They usually require manual intervention to rerun the failed job, which is inconvenient, blocks your pipeline, and ends up slowing down your release process.

GitLab provides a retry feature that we can configure in .gitlab-ci.yml. If you define how many times a job should automatically retry after a failure, your pipeline can recover from temporary problems without human intervention.

For example, we can rerun a job that fails for any reason by specifying just the number of retries:

normal-job:
  script:
    - ./create-artifact.sh
  retry: 2

or specify when the job should be rerun, e.g. only when there is a runner system failure:

normal-job:
  script:
    - ./create-artifact.sh
  retry:
    max: 2
    when: runner_system_failure

You can click here to understand more about retry usage.

15. Choose an appropriate cache and artifact compression

Compression of caches and artifacts matters for the overall performance of your GitLab pipelines; choosing an appropriate compression option significantly affects pipeline speed. You might want to check out the runner feature flags, which let you select the faster FastZip compression tool instead of the default one. To further enhance performance, you can adjust the compression level as well:

variables:
  # Set to "true" to enable FastZip; the default is "false".
  FF_USE_FASTZIP: "true"
  # You can specify these options below for per job or per pipeline. 
  # There are five levels: slowest, slow, default, fast, and fastest
  ARTIFACT_COMPRESSION_LEVEL: "fastest"
  CACHE_COMPRESSION_LEVEL: "fastest"

16. Break down your big CI job

A “big” CI job is one whose script section contains several independent commands that could run concurrently rather than sequentially.

For example, you may have a “testing” job that runs multiple test suites one after another. You can split large jobs like this into several smaller ones, which your runners then work on concurrently to finish faster.
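Assuming two independent test scripts (the names are illustrative), the split might look like this:

```yaml
# Before: one job runs everything sequentially
# test:
#   stage: test
#   script:
#     - ./run-unit-tests.sh
#     - ./run-integration-tests.sh

# After: two jobs that runners can pick up concurrently
unit-tests:
  stage: test
  script:
    - ./run-unit-tests.sh

integration-tests:
  stage: test
  script:
    - ./run-integration-tests.sh
```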

17. Use the “dependencies” keyword to download only needed artifacts

By default, at the beginning of a job, GitLab downloads all artifacts from every job in every preceding stage. This can result in unnecessary downloads, particularly in intricate pipelines.

A job does not always need all the artifacts from previous jobs. We can optimise this behaviour by using the dependencies keyword to define a list of specific jobs from which to fetch artifacts. This avoids downloading artifacts the job doesn’t need, saving pipeline runtime.
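A minimal sketch, using hypothetical build outputs:

```yaml
build-a:
  stage: build
  script:
    - echo "build module A"
  artifacts:
    paths:
      - dist-a/

build-b:
  stage: build
  script:
    - echo "build module B"
  artifacts:
    paths:
      - dist-b/

deploy-a:
  stage: deploy
  script:
    - echo "deploy module A"
  # Fetch artifacts only from build-a; dist-b/ is never downloaded
  dependencies:
    - build-a
```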

18. Use Git shallow clone

For jobs that don’t need the entire history of the repository, you can use the GIT_DEPTH variable to perform a shallow clone. This speeds up your pipeline significantly by fetching only the most recent commits instead of the whole history. All you need to do is specify how many commits of history the job should clone. Here is how to configure it in .gitlab-ci.yml:

variables:
  GIT_DEPTH: "2"

With this configuration, we clone only the last 2 commits of history, which is very fast. Remember that if you specify 1 and combine it with the retry feature, the retried job may fail because the commit it needs is no longer reachable in such a shallow clone; adjust the GIT_DEPTH variable so that your jobs won’t fail.

Or, if the job doesn’t depend on the repository contents at all, we can skip cloning completely by setting the Git strategy to none:

variables:
  GIT_STRATEGY: "none"

19. Utilize the “needs” option.

Utilize needs to run jobs out of stage order. The relationships between needs-based jobs form a directed acyclic graph.

Let’s walk through an example to understand how the needs option works. Assume you have a pipeline that builds 2 modules in separate jobs within a single build stage; call these jobs build-a and build-b. The situation is:

  • build-a requires a pre-build-a job to run before it, while build-b has no dependencies. That’s why we add the pre-build-a job in a pre-build stage.
  • test-a has dependencies on build-a
  • test-b has dependencies on build-b

Without needs, build-b has to wait for the pre-build-a job to finish, even though build-b has no dependencies. Similarly, test-a and test-b can run only when all jobs in stages prior to “test” have completed. With the needs option, we can bypass this constraint, allowing build-b to run immediately without waiting for pre-build-a (or any other earlier job it doesn’t depend on). The same applies to the test jobs.

We simply add the following configuration to the .gitlab-ci.yml file:

pre-build-a:
  stage: pre-build
  script:
    - echo "Pre-build package for Build A"

build-a:
  stage: build
  script:
    - echo "Run build-a"
  needs: ["pre-build-a"]

build-b:
  stage: build
  script:
    - echo "Run build-b"
  needs: []

test-a:
  stage: test
  script:
    - echo "Run test-a"
  needs: ["build-a"]


test-b:
  stage: test
  script:
    - echo "Run test-b"
  needs: ["build-b"]

This configuration runs the jobs as follows:

  • build-b: runs immediately, without waiting for pre-build-a to finish.
  • build-a: runs as soon as pre-build-a finishes.
  • test-a: runs as soon as build-a finishes, without waiting for build-b.
  • test-b: similarly, runs as soon as build-b finishes, even if build-a and pre-build-a haven’t completed.

These dependencies between jobs form a directed acyclic graph.

Just note that you have to be careful when using needs in a real project. It can make your pipeline hard to read, and you’ll need to design it properly when you have a complex pipeline.

20. Enable interruptible pipelines on jobs

By default, GitLab allows multiple pipelines to run concurrently for the same branch, even if the corresponding commit or commits are no longer relevant. While this affords flexibility in terms of managing your pipelines, it can also consume unnecessary resources and put a strain on your runners.

However, there is a solution to mitigate this issue and optimize your pipeline execution. By automatically stopping jobs for obsolete pipelines, you can significantly reduce the workload on your runners. GitLab provides a keyword called interruptible, which can be set to true for individual jobs or by default for all jobs. By configuring this attribute, you lighten the load on your runners and ensure that only the most pertinent pipelines are allowed to run.

Here is an example to set it by default for every job:

default:
  interruptible: true

This approach has several benefits. Firstly, it helps ensure that your runners are effectively utilized and that valuable computational resources are not wasted. Secondly, by reducing the number of unnecessary pipelines running in parallel, you can enhance the overall efficiency and performance of your CI/CD process. This is especially important when you have a large number of developers working on a project concurrently.

In conclusion, GitLab offers a convenient way to optimize pipeline execution by automatically stopping jobs for obsolete pipelines. By utilizing the interruptible attribute and configuring it to true for relevant jobs, you can maximize the efficiency of your runners and streamline your CI/CD workflow.

Conclusion

Here are the 20 tips I’ve used to improve the speed of our GitLab CI/CD pipelines. Hope they can help you in some aspects. If you have any other tips, please share them in the comments below. I really appreciate other solutions that could save us time in the management of GitLab pipelines.

If you like my blog post and want to find a way to thank me, feel free to Buy me a coffee to support me in providing more quality content, or simply share it with others who may have the same issue. Thanks for reading my blog.


Discover more from Turn DevOps Easier


By Binh
