DevOps Basics: GitHub Actions
We’ve been dockerizing many applications recently and even ran them locally with Docker Compose.
Now, it’s time to take a step further and build a production-ready version of this setup.
The first step is not to build production artifacts locally.
Quite a few years ago, we were introduced to the concept of continuous integration: developers merge their changes frequently, and every merge is tested automatically. Back then, it ran on a shared machine! The concept has since evolved, and tools such as Jenkins, GitHub Actions, and Drone now play the role of that shared machine. The reasons we need it are simple: you aren’t working alone, and your local machine isn’t “production.” The “shared machine” is unbiased and often catches issues that would appear in production but can’t be easily reproduced locally because of your local settings.
A tool of choice for today is GitHub Actions. It’s a simple yet powerful implementation of a CI tool.
GitHub Actions has several moving parts:
- Workflows (aka pipelines)
- Cache
- Releases
- Artifacts
We’ll be using them all today.
Ideal Docker Build Workflow
You want your workflows to be as simple and as predictable as possible. The reason why is very simple: fewer things to support and change down the road.
This also leads to another tool in our pocket: templates. We’ll rely on them often to avoid copy-pasting when it doesn’t make sense. Certainly, CI becomes more complicated when you copy-paste a lot.
We also need to differentiate between pull requests and branch-type workflows. During the Pull Request Review process, we want to get much more information than when we do the actual build for deployment.
Here’s the pull request diagram (for each service):
As you see, there are 2 primary things we want to test:
- run tests on a service if they exist
- build a docker image and do security checks
Not every project has tests, but if a project does, we want to ensure they pass; otherwise, we fail the build. Passing tests don’t guarantee that your service works correctly, but if they fail, something has definitely gone wrong!
We want to ensure our deployable artifacts (Docker images) can be built and don’t have critical- or high-severity vulnerabilities. If there’s a critical vulnerability and a fix exists, you have to apply it.
We also don’t want to overload our PRs with checks. You can have multiple types of tests, so you have to decide which are worth running. Pipelines take time to run, and the faster they are, the better.
So, the worst-case scenario in our project is 21 pipelines if you change everything in a single PR! Our goal is to speed up the common scenarios as much as possible.
We’re using Google Microservices as an example, so I forked it into my repo. I also removed everything related to CI/CD, Infrastructure, etc., from it.
git clone YOUR_FORKED_VERSION microservices-demo
cd microservices-demo
rm -rf .deploystack .github helm-chart istio-manifests kubernetes-manifests kustomize release terraform cloudbuild.yaml skaffold.yaml src/adservice/Dockerfile src/cartservice/src/Dockerfile src/checkoutservice/Dockerfile src/currencyservice/Dockerfile src/emailservice/Dockerfile src/frontend/Dockerfile src/paymentservice/Dockerfile src/productcatalogservice/Dockerfile src/recommendationservice/Dockerfile src/shippingservice/Dockerfile
git add .
git commit -m "clean project"
git push -u origin main
If you were to do the same, please don’t submit your PRs to the central Google repository.
CI Architecture
We start with the Docker build. Since every project has a Dockerfile, we can create a typical workflow.
Every workflow in GitHub Actions has its own file. Read about the syntax and the different options in the official GitHub Actions docs.
All workflows live in the .github/workflows folder.
Let’s start by creating _docker-pr.yml, which will represent a typical Docker build workflow for a PR.
We begin by setting up when this workflow can be executed.
on:
  workflow_call:
    inputs:
      project:
        required: true
        type: string
This means that we expect this workflow to be triggered from another workflow. GitHub Actions and many other CI/CD tools allow that. This allows you to avoid doing a bunch of copy-pasting.
inputs:
  project:
    required: true
    type: string
This is a variable we’re expected to provide when calling this workflow.
The second thing for us is to define the jobs we want to execute. We can have multiple jobs. Right now, our steps are the following:
- Checkout the code
- Get Docker Build environment
- Build Docker image
jobs:
  docker-ci:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
      - name: Build Docker
        uses: docker/build-push-action@v4
        with:
          context: ./src/${{ inputs.project }}
          load: true
          cache-from: type=gha
          cache-to: type=gha,mode=max
          tags: ${{ inputs.project }}:${{ github.sha }}
- runs-on specifies the platform our CI job runs on. The list of platforms is available in the documentation.
- steps is the list of actions we need to perform (these can be bash scripts or what GitHub calls “actions”: packaged scripts so you don’t have to write them yourself).
- We check out the code via actions/checkout@v4 (available via this link).
- We set up Docker Buildx via docker/setup-buildx-action@v3, an environment that allows us to build Docker images for various platforms and to use the Docker cache efficiently (available via this link).
- We perform the actual build via the docker/build-push-action@v4 action (available via this link), with a few parameters passed through.
Parameters are:
- context: ./src/${{ inputs.project }} is the path to the Docker build context; here we use the “project” input we defined earlier.
- load: true loads the built image into the runner’s local Docker image store instead of pushing it, since we have no Docker registry to push to yet.
- cache-from: type=gha sets the source of the Docker layer cache, which is the GitHub Actions cache (gha).
- cache-to: type=gha,mode=max writes the cache back to gha; mode=max means “cache everything you can.”
- tags: ${{ inputs.project }}:${{ github.sha }} is how we construct the Docker image name and tag. The tag is the “version”. You could use a numbered version, but I prefer the commit SHA since it’s very descriptive: you know exactly which commit to look for.
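The build step can be approximated on a local machine, which is handy for checking a Dockerfile before pushing. This is only a sketch: build_image is a hypothetical helper name, it assumes Docker with Buildx and git are installed, and it does not use the gha cache, which exists only inside GitHub Actions.

```shell
#!/usr/bin/env bash
# Hypothetical local equivalent of the "Build Docker" step.
# Usage: build_image adservice
build_image() {
  project="$1"
  # Use the current commit SHA as the image tag, as the workflow does.
  sha="$(git rev-parse HEAD)"
  # --load keeps the built image in the local Docker image store.
  docker buildx build --load --tag "${project}:${sha}" "./src/${project}"
}
```

Running build_image adservice from the repository root should leave an adservice image tagged with the current commit in your local image list.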
Now, do you remember our C# service? Here’s how we dockerized it.
One important thing: its Docker context is ./src/cartservice/src, whereas in all other projects it is ./src/PROJECT_NAME, and we have to account for that in our Docker image tag. So, let’s split the name by / so we get just cartservice.
We’re adding a step before checkout:
- name: Split Name
  id: split
  env:
    PATH_CANDIDATE: ${{ inputs.project }}
  run: echo "::set-output name=imagename::${PATH_CANDIDATE%%/*}"
As you can see, we don’t use any actions here; the command runs natively on our ubuntu-latest runner.
The algorithm is straightforward:
- We put the input into an environment variable, PATH_CANDIDATE.
- We then split it by / and assign the first element to the output variable imagename.
::set-output is a GitHub Actions-specific command (GitHub has since deprecated it in favor of writing to the $GITHUB_OUTPUT file). Other CI systems have their own equivalent.
So, for adservice it will be just adservice, but for cartservice/src it’ll be cartservice.
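The split is easy to try in a plain shell before relying on it in CI. A minimal sketch, assuming two hypothetical helpers, image_name and last_element: the %%/* expansion keeps the first path element, while ##*/ keeps the last one, so only the former yields cartservice for cartservice/src.

```shell
#!/usr/bin/env bash
# Hypothetical helper: derive the image name from the "project" input.
image_name() {
  # %%/* removes the longest suffix that starts at the first "/",
  # which keeps only the first path element.
  printf '%s\n' "${1%%/*}"
}

# For comparison: ##*/ removes everything up to the last "/",
# which keeps only the last path element.
last_element() {
  printf '%s\n' "${1##*/}"
}

image_name "adservice"          # prints adservice
image_name "cartservice/src"    # prints cartservice
last_element "cartservice/src"  # prints src
```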
OK, we then need to modify our tag to use that imagename:
tags: ${{ steps.split.outputs.imagename }}:${{ github.sha }}
The reference format is steps.STEP_ID.outputs.OUTPUT_NAME.
Now, it’s time to add the Docker scanning step. There are multiple tools for this, such as Snyk, Trivy, or the AWS/GCP/Azure native Docker image security scanning services. Since we’re not at the cloud stage yet, we’ll use Trivy. It provides a free action (here it is) that we can add as a step in our job:
- name: Scan for Vulnerabilities
  uses: aquasecurity/trivy-action@master
  id: scan
  with:
    image-ref: '${{ steps.split.outputs.imagename }}:${{ github.sha }}'
    exit-code: 1
    output: 'vulnerabilities.table'
    ignore-unfixed: true
    severity: 'CRITICAL,HIGH'
Here’s an explanation of the parameters:
- image-ref is the same as tags.
- output: 'vulnerabilities.table' puts the results of the vulnerability scan into a file so we can use it later on.
- exit-code is 0 by default, but we want to fail the pipeline if there are security issues. You might not always want to do that, but if you don’t think about security, no one will.
- ignore-unfixed: true, because it doesn’t make sense to stop the build for vulnerabilities that don’t have a fix yet.
- severity: 'CRITICAL,HIGH' means we check only for CRITICAL- and HIGH-level issues.
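The same scan can be run by hand with the Trivy CLI, which helps when you want to reproduce a CI failure locally. This is a sketch under assumptions: scan_image is a hypothetical helper name, and it assumes trivy is installed and the image already exists in your local Docker image store.

```shell
#!/usr/bin/env bash
# Hypothetical local equivalent of the "Scan for Vulnerabilities" step.
# Usage: scan_image adservice:latest
scan_image() {
  image_ref="$1"
  # Exit with 1 when fixable CRITICAL or HIGH issues are found and
  # write the human-readable table to vulnerabilities.table.
  trivy image \
    --exit-code 1 \
    --ignore-unfixed \
    --severity CRITICAL,HIGH \
    --format table \
    --output vulnerabilities.table \
    "$image_ref"
}
```

The flags mirror the action inputs one to one, so a non-zero exit code here should correspond to a failed scan step in the pipeline.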
If we have vulnerabilities in our Docker images, this step will fail, and we need to process the output. For this, we’ll use the GitHub Script action, which is managed by the GitHub team.
It allows us to use JavaScript and the GitHub API to do stuff. Our use case is simple:
- if the scan fails,
- get the results from the vulnerabilities.table file and post its content as a comment.
You don’t have to post it as a comment. It’s just my preferred method. I like to have all the information in comments so the devs and I don’t have to navigate between different tabs when we don’t really have to. Plus, it provides an excellent audit trail.
Another way to do this is to post the results to GitHub Security. I prefer comments because they are more descriptive.
Here’s what our step looks like:
- uses: actions/github-script@v7
  if: ${{ failure() && steps.scan.outcome == 'failure' }}
  with:
    script: |
      const { readFileSync } = require('fs')
      const text = readFileSync('./vulnerabilities.table');
      github.rest.issues.createComment({
        issue_number: context.issue.number,
        owner: context.repo.owner,
        repo: context.repo.repo,
        body: "👋 We found the following vulnerabilities: \n\n\n```\n" + text + "\n```"
      })
Explanation of the properties:
- if: ${{ failure() && steps.scan.outcome == 'failure' }} means this step will be triggered only if the scan step fails.
- script is the JavaScript content:
const { readFileSync } = require('fs')
const text = readFileSync('./vulnerabilities.table');
github.rest.issues.createComment({
  issue_number: context.issue.number,
  owner: context.repo.owner,
  repo: context.repo.repo,
  body: "👋 We found the following vulnerabilities: \n\n\n```\n" + text + "\n```"
})
- First, we read our file (it’s available because all steps of a job run on the same runner).
- Then, we use the GitHub API to post a comment. A PR is an issue in GitHub terminology.
- \n is a new line, and the triple backticks wrap our text in a Markdown code block:
"```\n" + text + "\n```"
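If you’d rather build the comment body outside of JavaScript, the same wrapping can be done in a shell step. A minimal sketch, assuming a hypothetical helper named build_comment_body; it writes the backticks as octal escapes (\140) only so the fence characters don’t have to appear literally in the script.

```shell
#!/usr/bin/env bash
# Hypothetical helper: wrap a report file in a Markdown code fence.
# Usage: build_comment_body vulnerabilities.table > comment.md
build_comment_body() {
  report_file="$1"
  printf 'We found the following vulnerabilities:\n\n'
  # \140 is the octal code for a backtick; three of them open the fence.
  printf '\140\140\140\n'
  cat "$report_file"
  # Close the fence on its own line.
  printf '\n\140\140\140\n'
}
```

With the GitHub CLI available, the resulting file could then be posted with gh pr comment and its --body-file option, although the article’s github-script step remains the simpler route inside a workflow.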
For the comments to work, let’s navigate to the repository’s Settings -> Actions -> General. Scroll to the bottom and select the settings as in the image.
Let’s add another job that lints a Dockerfile.
I love Hadolint, a very simple yet powerful Dockerfile linter. It catches common mistakes and helps you write better Dockerfiles.
docker-lint:
  runs-on: ubuntu-latest
  steps:
    - name: Checkout
      uses: actions/checkout@v4
    - uses: hadolint/hadolint-action@v3.1.0
      id: scan
      with:
        dockerfile: ./src/${{ inputs.project }}/Dockerfile
        failure-threshold: error
        output-file: dockerfile.table
    - name: Comment on PR
      uses: actions/github-script@v7
      if: ${{ failure() && steps.scan.outcome == 'failure' }}
      with:
        script: |
          const { readFileSync } = require('fs')
          const text = readFileSync('./dockerfile.table');
          github.rest.issues.createComment({
            issue_number: context.issue.number,
            owner: context.repo.owner,
            repo: context.repo.repo,
            body: "👋 We found the following Dockerfile errors: \n\n\n```\n" + text + "\n```"
          })
- Here, we need to do another checkout.
- Plus, we need to use the hadolint action and pass it a Dockerfile path.
- failure-threshold: error says that we’ll fail the pipeline ONLY if there are errors. So, warnings and suggested improvements will be ignored.
- We don’t have to do anything to run the jobs in parallel because that’s the default behavior when you specify multiple jobs in a single workflow.
- Then, we take the output the same way as last time and post the results.
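Linting can also be run locally across every service before a PR is raised. The sketch below assumes the hadolint CLI is installed and that no path contains spaces; lint_all is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: lint every Dockerfile under a source directory
# with the same threshold the docker-lint job uses.
# Usage: lint_all ./src
lint_all() {
  src_dir="${1:-./src}"
  status=0
  # cartservice keeps its Dockerfile one level deeper (src/cartservice/src),
  # so search up to three levels down.
  for dockerfile in $(find "$src_dir" -maxdepth 3 -name Dockerfile | sort); do
    # Remember a failure but keep linting the remaining files.
    hadolint --failure-threshold error "$dockerfile" || status=1
  done
  return "$status"
}
```

The function returns non-zero if any Dockerfile has an error-level finding, which matches how the job fails in CI.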
Complete _docker-pr.yml
on:
  workflow_call:
    inputs:
      project:
        required: true
        type: string
jobs:
  docker-lint:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - uses: hadolint/hadolint-action@v3.1.0
        id: scan
        with:
          dockerfile: ./src/${{ inputs.project }}/Dockerfile
          failure-threshold: error
          output-file: dockerfile.table
      - name: Comment on PR
        uses: actions/github-script@v7
        if: ${{ failure() && steps.scan.outcome == 'failure' }}
        with:
          script: |
            const { readFileSync } = require('fs')
            const text = readFileSync('./dockerfile.table');
            github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: "👋 We found the following Dockerfile errors: \n\n\n```\n" + text + "\n```"
            })
  docker-ci:
    runs-on: ubuntu-latest
    steps:
      - name: Split Name
        id: split
        env:
          PATH_CANDIDATE: ${{ inputs.project }}
        run: echo "::set-output name=imagename::${PATH_CANDIDATE%%/*}"
      - name: Checkout
        uses: actions/checkout@v4
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
      - name: Build Docker
        uses: docker/build-push-action@v4
        with:
          context: ./src/${{ inputs.project }}
          load: true
          cache-from: type=gha
          cache-to: type=gha,mode=max
          tags: ${{ steps.split.outputs.imagename }}:${{ github.sha }}
      - name: Scan for Vulnerabilities
        uses: aquasecurity/trivy-action@master
        id: scan
        with:
          image-ref: '${{ steps.split.outputs.imagename }}:${{ github.sha }}'
          exit-code: 1
          output: 'vulnerabilities.table'
          ignore-unfixed: true
          severity: 'CRITICAL,HIGH'
      - name: Comment on PR
        uses: actions/github-script@v7
        if: ${{ failure() && steps.scan.outcome == 'failure' }}
        with:
          script: |
            const { readFileSync } = require('fs')
            const text = readFileSync('./vulnerabilities.table');
            github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: "👋 We found the following vulnerabilities: \n\n\n```\n" + text + "\n```"
            })
Let’s build the services
Let’s create .github/workflows/adservice-pr.yml
name: "PR Ad Service"
on:
  pull_request:
    paths:
      - 'src/adservice/**'
      - '.github/workflows/**'
    branches:
      - main
jobs:
  docker-workflow:
    uses: ./.github/workflows/_docker-pr.yml
    with:
      project: adservice
We name the workflow, then say that it runs only on pull requests where the changed files are in the ./src/adservice or .github/workflows folders.
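To get a feel for which service workflows a change would select, you can apply the same path logic to a list of changed files. This is a rough sketch, not what GitHub runs internally: changed_services is a hypothetical helper, and it ignores the .github/workflows/** rule, which triggers every workflow.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read changed file paths on stdin and print the
# services whose src/<service>/** filter would match.
# Usage: git diff --name-only origin/main...HEAD | changed_services
changed_services() {
  # Keep only paths under src/<service>/, print <service>, de-duplicate.
  sed -n 's|^src/\([^/]*\)/.*|\1|p' | sort -u
}
```

A PR that touches only src/adservice should list a single service here, which is the common, fast scenario we’re optimizing for.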
We also run it only when we raise a PR against our main branch. It’s important because the flow we stick to is:
- Branch out of main
- Raise a PR against main
- Release from main
It’s a straightforward branching strategy that allows us to avoid overcomplicated releases and do stuff quickly.
Our job uses the shared workflow we defined above and passes our project name there.
You can create other files for every project we have dockerized.
What about tests?
So, 3 projects have tests:
- Cart Service (C#)
- Shipping Service (Go)
- Product Catalog Service (Go)
There’s not much need to create a generic step yet, so we can avoid complications and put the tests there explicitly.
Starting with .github/workflows/cartservice-pr.yml
name: "PR Cart Service"
on:
  pull_request:
    paths:
      - 'src/cartservice/**'
      - '.github/workflows/**'
    branches:
      - main
jobs:
  docker-workflow:
    uses: ./.github/workflows/_docker-pr.yml
    with:
      project: cartservice/src
  tests:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Install Dotnet
        uses: actions/setup-dotnet@v4
        env:
          DOTNET_INSTALL_DIR: "./.dotnet"
        with:
          dotnet-version: '8.0'
      - name: Run Tests
        run: |
          dotnet test src/cartservice/
- We created another job, so we need to check out the code again.
- Then, since it’s a clean environment, we need to install dotnet. There’s an action available, and we’re installing the same version we specified in the Dockerfile.
- Plus, we’re installing it “locally” into the project’s folder so we don’t pollute the GitHub Actions environment. Installing everything you need into a “local” folder is a good practice.
- And then, we execute the tests.
Moving on to .github/workflows/shippingservice-pr.yml
name: "PR Shipping Service"
on:
  pull_request:
    paths:
      - 'src/shippingservice/**'
      - '.github/workflows/**'
    branches:
      - main
jobs:
  docker-workflow:
    uses: ./.github/workflows/_docker-pr.yml
    with:
      project: shippingservice
  tests:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Install Go
        uses: actions/setup-go@v5
        with:
          go-version: '1.21'
      - name: Run Tests
        run: |
          cd ./src/shippingservice
          go test
Here we do the same thing we did with dotnet, just using Go.
And a similar story with .github/workflows/productcatalogservice-pr.yml
name: "PR Product Catalog Service"
on:
  pull_request:
    paths:
      - 'src/productcatalogservice/**'
      - '.github/workflows/**'
    branches:
      - main
jobs:
  docker-workflow:
    uses: ./.github/workflows/_docker-pr.yml
    with:
      project: productcatalogservice
  tests:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Install Go
        uses: actions/setup-go@v5
        with:
          go-version: '1.21'
      - name: Run Tests
        run: |
          cd ./src/productcatalogservice
          go test
All other .github/workflows/*-pr.yml files look like .github/workflows/adservice-pr.yml.
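Since the remaining files differ only in the service name and title, you could generate them instead of copying by hand. A minimal sketch, assuming a hypothetical generator function write_pr_workflow; the output mirrors adservice-pr.yml, and services with tests would still need their tests job added manually.

```shell
#!/usr/bin/env bash
# Hypothetical generator for the per-service PR workflows.
# Usage: write_pr_workflow <service> <project path> <title> [<target dir>]
write_pr_workflow() {
  service="$1"
  project="$2"
  title="$3"
  target_dir="${4:-.github/workflows}"
  mkdir -p "$target_dir"
  # The heredoc expands the shell variables and leaves the YAML as is.
  cat > "${target_dir}/${service}-pr.yml" <<EOF
name: "PR ${title}"
on:
  pull_request:
    paths:
      - 'src/${service}/**'
      - '.github/workflows/**'
    branches:
      - main
jobs:
  docker-workflow:
    uses: ./.github/workflows/_docker-pr.yml
    with:
      project: ${project}
EOF
}
```

For example, write_pr_workflow emailservice emailservice "Email Service" would create .github/workflows/emailservice-pr.yml.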
Now, let’s switch to our test branch and raise a PR
git checkout -b feature/bring-ci
git add .
git commit -m "shared CI & workflows for every project"
git push -u origin feature/bring-ci
When you go to raise a PR, you have to switch the base repository because, by default, the PR targets Google’s repository (don’t spam them).
remote: Create a pull request for 'feature/bring-ci' on GitHub by visiting:
remote: https://github.com/snegas/microservices-demo/pull/new/feature/bring-ci
- Select your repo and hit Create Pull Request.
- Then wait. You’ll see all your workflows triggered for the first time.
- You’ll get 2 (at the time of writing) messages on vulnerabilities, for currencyservice and paymentservice.
And below, you’ll see:
Let’s go to the PR Currency Service workflow:
Let’s hit Summary and see how the jobs are running in parallel.
- Navigate to Usage to see exciting statistics on the billable time.
- Go to the docker-lint job to see how it skipped the Comment on PR step.
- Now, let’s go to the PR Cart Service workflow and navigate to the Summary.
You can see how all jobs are executed in parallel. Go through all the other workflows to see what they look like.
Let’s merge this PR despite the errors, and we’ll fix mistakes one by one to show how it would look in the real project.
PR in my fork from screenshots - Link
You can see that I made a dummy change there. The last PR had too many commits because I experimented with a few things. Please feel free to take a look at the history. Nothing gets done on the very first try. Embrace it!
Fix Currency Service Error
Let’s create a separate branch for it
git checkout main
git pull
git checkout -b fix/resolve-currencyservice-vulnerability
So, the vulnerability report says:
currencyservice:39ca357740a91263d531ac689e2baa378c91183b (alpine 3.19.1)
========================================================================
Total: 0 (HIGH: 0, CRITICAL: 0)
Node.js (node-pkg)
==================
Total: 1 (HIGH: 0, CRITICAL: 1)
┌───────────────────────────┬────────────────┬──────────┬────────┬───────────────────┬───────────────┬───────────────────────────────────────────────────────┐
│ Library                   │ Vulnerability  │ Severity │ Status │ Installed Version │ Fixed Version │ Title                                                 │
├───────────────────────────┼────────────────┼──────────┼────────┼───────────────────┼───────────────┼───────────────────────────────────────────────────────┤
│ protobufjs (package.json) │ CVE-2023-36665 │ CRITICAL │ fixed  │ 7.1.2             │ 7.2.4, 6.11.4 │ protobufjs: prototype pollution using user-controlled │
│                           │                │          │        │                   │               │ protobuf message                                      │
│                           │                │          │        │                   │               │ https://avd.aquasec.com/nvd/cve-2023-36665            │
└───────────────────────────┴────────────────┴──────────┴────────┴───────────────────┴───────────────┴───────────────────────────────────────────────────────┘
You can read about it here - CVE-2023-36665
You can see in package-lock.json that this version is a dependency of google-gax.
Let’s see if we can update our dependencies easily.
If you have node installed locally, you can simply run npm i, which will show the following:
5 vulnerabilities (3 high, 2 critical)
To address all issues, run:
npm audit fix
Run `npm audit` for details.
And then run npm audit fix:
added 4 packages, removed 1 package, changed 16 packages, and audited 344 packages in 4s
If you don’t have Node, you can use Docker:
docker run --rm -it -v "$(pwd)/src/currencyservice:/app" --entrypoint=sh node:lts-alpine
Inside this container, we need to run the following to be able to run npm i:
apk add --no-cache python3 make g++
cd /app
npm i
npm audit fix
This is explained in detail here - Docker Basics: NodeJS.
Let’s see what got updated, among other things:
That’s exactly what we needed. Let’s create a PR for it. Our changeset has to include just the package-lock.json file.
git status
On branch fix/resolve-currencyservice-vulnerability
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
modified: src/currencyservice/package-lock.json
no changes added to commit (use "git add" and/or "git commit -a")
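Before raising the PR, it can be worth confirming that the installed protobufjs version has reached the fixed version from the scan report (7.2.4). A small sketch, assuming GNU sort with the -V option is available; version_at_least is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when version $1 is at least version $2.
# Usage: version_at_least 7.2.5 7.2.4 && echo "fixed"
version_at_least() {
  # sort -V orders version strings numerically per component;
  # if the smallest of the two is $2, then $1 >= $2.
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}
```

You can look up the installed version with npm ls protobufjs inside src/currencyservice and pass it as the first argument.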
Let’s create our PR (again, don’t forget to create it against your main branch and not Google Microservices)
git add .
git commit -m "Fix Currency Service Vulnerabilities"
git push -u origin fix/resolve-currencyservice-vulnerability
Example of PR in my fork - Link
OK, you can do the same with the paymentservice; just don’t forget to pull the changes after you merge this PR:
git checkout main
git pull
git checkout -b fix/resolve-paymentservice-vulnerability
And do the same steps :)
You’ll see that the workflow was not triggered. So, I had to fix a typo, and that triggered all workflows :)
PR in my fork - Link
The good thing is, you’ll get everything green and nice. It’s ready to merge!
Releases and Artifacts
So, CI is fun, but we need to do a few more things for it to be production-ready.
- Build on push to main
- Create releases and artifacts
Build on main
Let’s create another branch and a new template file, .github/workflows/_docker-main.yml:
on:
  workflow_call:
    inputs:
      project:
        required: true
        type: string
That will be the start, because we want it to be generic enough. Let’s also copy the docker-ci job and modify it a bit:
env:
  REGISTRY: ghcr.io
  IMAGE_NAME_PREFIX: ${{ github.repository }}
jobs:
  docker-main:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
    steps:
      - name: Split Name
        id: split
        env:
          PATH_CANDIDATE: ${{ inputs.project }}
        run: echo "::set-output name=imagename::${PATH_CANDIDATE%%/*}"
      - name: Checkout
        uses: actions/checkout@v4
      - name: Login to GHCR
        uses: docker/login-action@v3
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
      - name: Build Docker
        uses: docker/build-push-action@v4
        with:
          context: ./src/${{ inputs.project }}
          push: true
          cache-from: type=gha
          cache-to: type=gha,mode=max
          tags: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME_PREFIX }}-${{ steps.split.outputs.imagename }}:${{ github.sha }}
First of all:
REGISTRY: ghcr.io
IMAGE_NAME_PREFIX: ${{ github.repository }}
- ghcr.io is the GitHub Docker registry.
- The IMAGE_NAME_PREFIX environment variable is the prefix for your images. They’ll be scoped to your user/organization so no one else can access them.
Then, we modified our job a bit to do 3 things:
- Have a token with a scope to publish packages
- Log in to that registry via docker/login-action, passing a registry, a username (our login), and a password (GITHUB_TOKEN)
- In docker/build-push-action@v4, instead of load: true we do push: true, and we modified tags to include our prefix
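It helps to see what the final image reference looks like once the registry, prefix, image name, and commit SHA are joined. The sketch below is illustrative only: image_ref is a hypothetical helper, the owner/repository prefix format is an assumption, and the lowercasing is there because Docker image repository names must be lowercase.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compose the full image reference pushed to GHCR.
# Usage: image_ref <registry> <prefix> <image name> <sha>
image_ref() {
  # Normalise everything before the tag to lowercase, since image
  # repository names may not contain uppercase letters.
  repo="$(printf '%s/%s-%s' "$1" "$2" "$3" | tr '[:upper:]' '[:lower:]')"
  printf '%s:%s\n' "$repo" "$4"
}

image_ref ghcr.io Octocat/Demo adservice abc123
# prints ghcr.io/octocat/demo-adservice:abc123
```

If your user or organization name contains uppercase letters, the push in the workflow may fail for the same reason, so a lowercase prefix is the safer choice.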
Now, let’s add all of the *-main.yml files (for example, adservice-main.yml):
name: "Main Ad Service"
on:
  push:
    paths:
      - 'src/adservice/**'
      - '.github/workflows/**'
    branches:
      - main
jobs:
  docker-workflow:
    uses: ./.github/workflows/_docker-main.yml
    with:
      project: adservice
You do the same with all the other services and send a PR (don’t forget to switch to your repo there so you don’t push to Google Microservices).
Link to my PR - Link
I did a small test with adservice-main.yml. Just use
on:
  push:
    paths:
      - 'src/adservice/**'
      - '.github/workflows/**'
without
    branches:
      - main
and do a quick push. You’ll be able to see the build!
Then, bring it back, and it’s ready for a merge! I had a few rounds of typos, which is OK.
After the merge, your builds will pass, and on the main page you’ll see the following:
Release
It’s time to create a release! Let’s hit Create a new release!
Conclusion
So, we got production-grade GitHub Actions CI configured. We hit a few bumps, and everything is recorded in the Git history :)
Try it yourself: you get a bunch of free GitHub Actions minutes with public repos.
Key takeaways:
- Use templates when it makes sense
- Split main & PR builds
- Don’t run EVERYTHING on PR builds
- Use comments to communicate pipeline failures
The next step is to add some infrastructure flavor on top of this! Stay tuned for Terraform!