Python is one of the most beloved programming languages. It’s been in the industry for a while and has its own fair share of fanatics.

Actually, many of the DevOps tools are written in Python, and teams often use it to develop internal tools for their DevOps platform.

It uses pip as a package manager, which works a bit differently from npm in Node. Python developers often rely on virtual environments for local development: each project gets its own isolated environment, so different projects can install different versions of the same package without conflicting with each other or with the system-wide Python.
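A minimal virtual-environment workflow looks like this (the directory name .venv is just a common convention, and requests is only an example package):

```shell
# create an isolated environment inside the project directory
python3 -m venv .venv

# activate it; python and pip now point inside .venv
. .venv/bin/activate

# packages install into .venv, not system-wide
pip install requests

# leave the environment when done
deactivate
```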

Python has been in the industry longer than NodeJS, so the solutions they came up with are a bit different, as you can see.

Exploring the Repository

We continue to use Microservices Demo as our learning example.

Let’s navigate to src/emailservice, where we see email_server.py. There’s no obvious entry point, like index or a plain server file, so email_server.py seems like a good place to start.

On line 128, you see port = os.environ.get('PORT', "8080"). This means the port we listen on defaults to 8080 unless the PORT environment variable says otherwise. Good to know.

It also needs GCP_PROJECT_ID and, similarly to the NodeJS project, looks for DISABLE_PROFILER on line 169. The environment variables are the same.
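The same “default unless overridden” pattern, expressed in shell terms for local testing (the value here just mirrors the one from the source):

```shell
# mirrors os.environ.get('PORT', "8080"): use $PORT if it's set, else 8080
PORT="${PORT:-8080}"
echo "listening on $PORT"
```

Running it with `PORT=9090` up front overrides the default, exactly like setting the environment variable for the Python service would.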

Dockerfile

Python devs don’t like Alpine; they like Debian. The reason is that pip pre-builds packages for Linux distributions that conform to the manylinux standard, and Alpine doesn’t :) You can read more about it here: Why using Alpine Docker images and Python is probably bad for your project (right now)

In a nutshell, we trade image size for faster build time.

Pro tip: a faster build is almost universally better than a smaller one, because the faster you ship, the more real your product is :) And users care very little whether your underlying Docker image is 200 MB bigger. Of course, in some cases you can argue that a smaller build is better, and you’ll be right; however, we’re talking about the vast majority.

To get the smallest possible Debian, we’ll use the slim suffix. This distribution of Debian is smaller than the typical one because it lacks a bunch of libraries you probably won’t need for your service anyway.

At the time of writing, version 3.12.1 is available, so we’ll select 3.12-slim, following the same logic as with Node.

Pro tip: be consistent :) The more predictable you are, the better for everyone, because your teammates will understand what you’re doing.

And we also need a working folder. Let it be /app again.

Also, as we saw when we looked at the port, we can leave it at the default 8080, so we need to expose it for other Docker containers.

We end up with the following Dockerfile.

FROM python:3.12-slim

WORKDIR /app

COPY requirements.txt .

RUN pip install -r requirements.txt

COPY . .

EXPOSE 8080

ENTRYPOINT ["python", "email_server.py"]

Let’s run our usual docker build -t emailservice .

We again get an error :) It says error: command 'gcc' failed: No such file or directory. Seems like we need to install g++ yet again.

Since it’s Debian, there’s a different package manager called apt-get. You can find how to work with it online. There are many tutorials, such as this one: Package management with APT

In practice, we first need to run the update command to fetch the latest package source lists, so apt-get knows where to download packages from; we do that via apt-get update -y. The -y flag automatically answers yes to any prompt, making the command non-interactive, which is exactly what we’re looking for in the Docker world.

Then, we install the package itself via apt-get install -y --no-install-recommends g++. The --no-install-recommends flag skips recommended packages, which aren’t required for the compiler to run, and we again pass -y to ensure the command executes in non-interactive mode.

These are two different commands, so we’ll chain them in a single RUN to make sure they run in the same layer. This produces the following Dockerfile:

FROM python:3.12-slim

WORKDIR /app

RUN apt-get update -y && apt-get install -y --no-install-recommends g++

COPY requirements.txt .

RUN pip install -r requirements.txt

COPY . .

EXPOSE 8080

ENTRYPOINT ["python", "email_server.py"]

And let’s build it again. This time it passes successfully, so let’s check the image size via docker inspect. It says around 417 megabytes. It’s OKish. We can make a small effort to reduce the size with a multistage build: we’ll move the g++ installation and dependency installation to another stage and later copy over the downloaded and compiled dependencies.

FROM python:3.12-slim as base

RUN apt-get update -y && apt-get install -y --no-install-recommends g++

COPY requirements.txt .
RUN pip install -r requirements.txt

FROM python:3.12-slim

WORKDIR /app

COPY --from=base /usr/local/lib/python3.12/ /usr/local/lib/python3.12/

COPY . .

EXPOSE 8080

ENTRYPOINT ["python", "email_server.py"]

This small change brings the image down to ~269 megabytes, which is more than enough of an improvement. It also turns out to be just slightly smaller than what we got with Node.
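Note that docker inspect reports the size in bytes, so a quick conversion helps when comparing builds. A small sketch, assuming GNU coreutils is available (numfmt ships with it) and the image is tagged emailservice as above:

```shell
# docker inspect -f '{{.Size}}' emailservice prints raw bytes, e.g. 417000000;
# numfmt converts bytes into a readable SI unit
numfmt --to=si 417000000
```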

Patterns

If you read the previous article, I think you should have noticed a pattern here :) And you’re right. It can be summarized in the following way:

  • Review the code for the exposed port and required environment variables
  • Start with a small image such as slim or alpine
  • COPY your dependencies list file first (so it can be cached) and install them before copying the source code
  • COPY the source code
  • Set ENTRYPOINT, ENV variables, and EXPOSE port.
  • There’s a big chance you’ll need some other tools, such as g++, to install some of your dependencies, so install them before COPYing the dependencies file.
  • Make this a multistage build, so your resulting image has only what it needs to work

Let’s see if this pattern will hold up with Go, C#, and Java in the next articles.

Go build dockerfiles.