Create a container image definition file for use with Docker or Apptainer
Starting with the base image or OS
For this tutorial, we are using tools from FSL and FreeSurfer. My local test has been run using FSL 6.0.4 and FreeSurfer 7.3.2.
I’ve provided examples of Dockerfiles and Apptainer build files for this example container that were set up a few different ways. Files in the image-definition-files/example-freesurfer_base_image folder use the “freesurfer/freesurfer:7.3.2” Docker container as a base image for both the Apptainer and Docker build examples.
The Apptainer definition file starts with
Bootstrap: docker
From: freesurfer/freesurfer:7.3.2
This indicates to Apptainer that the base image should be this particular image from Docker Hub, from user “freesurfer”, image name “freesurfer”, with version tag “7.3.2”.
The Dockerfile starts with
FROM freesurfer/freesurfer:7.3.2
Which has the same effect.
I’ve also provided an example that uses Ubuntu 22.04 as a base image. Files in the image-definition-files/example-ubuntu_base_image folder use the official “ubuntu:22.04” Docker container as a base image for both the Apptainer and Docker build examples.
The Apptainer definition file starts with
Bootstrap: library
From: ubuntu:22.04
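With Bootstrap: library, Apptainer pulls the “ubuntu” image, version 22.04, from a container library service instead of Docker Hub. Depending on your Apptainer version, a library remote may need to be configured before this works; using Bootstrap: docker with the same From: line pulls Docker’s official Ubuntu 22.04 image instead.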
The Dockerfile starts with
FROM ubuntu:22.04
Which uses Docker’s official Ubuntu 22.04 image as the base.
Container image definition file syntax
After the base container image is set, you will add lines to install the programs you need and configure the container.
This section gives a brief overview of the steps that matter for this project. For a more in-depth reference on your chosen platform, see the Apptainer documentation on Definition Files and the Docker Dockerfile documentation.
Apptainer
In general, an Apptainer definition file contains sections that group the setup steps by type. For example, all environment variables are set in the %environment section of the file. The example Apptainer definition file that uses Ubuntu as the container base includes the following sections:
%files
[source file] [destination file]
[source file] [destination file]
%environment
export [environment variable name]=[environment variable value]
export [environment variable name]=[environment variable value]
%post
[command 1]
[command 2]
[command 3]
Your local files get copied in by the %files block. The local files should be in the same directory that you will run your apptainer build command from. Any environment variables that need to be set in the container go in the %environment block. Any installation steps, including file downloads from URLs, unzipping or moving files, package manager steps, or installation script runs, go in the %post block.
Apptainer definition files have other blocks available but for simplicity these are the ones that will be discussed in this tutorial.
Dockerfile
A Dockerfile can run a set of commands in any order, unlike Apptainer which requires the commands to be organized into blocks by type. Commands in the Dockerfile must use Dockerfile instruction syntax. The Dockerfile commands I will use in this tutorial are:
RUN [command]
The RUN instruction runs the command you specify while the container image is being built. Use it for any installation steps, like file downloads from URLs, unzipping or moving files, package manager steps, or installation script runs. You would use a RUN line for each step. You can also run multiple commands in one RUN by joining them with &&, but the whole instruction will fail if any one of them fails.
COPY [source file] [destination file]
COPY should be added once for each file or directory you want to copy into your container image. The files you have locally should be in the same directory where you will run your docker build command from.
ENV [environment variable name]=[environment variable value]
Use the ENV instruction once for each environment variable you need to set within your container image.
Copying in your files (any non-public files or programs you have locally)
The Apptainer %files section is equivalent to running a bunch of Dockerfile COPY commands in a row. First you enter the name of the file locally, and then you say where you want the file to go within the container image. The destination needs to be an absolute path within the container.
In Docker you would enter
COPY run_workflow.sh /usr/local/bin/run_workflow.sh
COPY parc /usr/local/parc
and in Apptainer the equivalent would be
%files
run_workflow.sh /usr/local/bin/run_workflow.sh
parc /usr/local/parc
This example copies my most recently updated script file, run_workflow.sh, into the container image at the location /usr/local/bin, keeping the filename run_workflow.sh.
You’ll also want to incorporate any atlases, reference images, or libraries that you have stored locally but can’t get elsewhere. The last lines of these examples copy my local folder called parc and everything in it to the location /usr/local/parc within the container image, so once the image is built, every file from my local parc folder is available under /usr/local/parc.
You can change the destination names and locations within the container to anything you want, as long as you know how to find them within the container when you need them. For example, let’s say my custom run_workflow.sh script needs to use the file parc/Schaeffer200.nii.gz. I set up my container definition file copy step to copy the parc folder and its contents to /usr/local/parc. Then, to access that file from within the container, my run_workflow.sh script will need to use the path /usr/local/parc/Schaeffer200.nii.gz to find it.
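As a rough sketch of how a script might refer to that file, the excerpt below keeps the in-container location in one variable and fails early if the atlas is not readable. The names PARC_DIR, ATLAS, and require_file are my own illustrative choices and are not taken from the example run_workflow.sh.

```shell
# Hypothetical excerpt from a workflow script (names are illustrative).
# PARC_DIR defaults to the destination used in the COPY / %files step.
PARC_DIR="${PARC_DIR:-/usr/local/parc}"
ATLAS="${PARC_DIR}/Schaeffer200.nii.gz"

# Return non-zero with a readable message if a needed file is missing.
require_file() {
    if [ ! -r "$1" ]; then
        echo "error: cannot read $1" >&2
        return 1
    fi
}

# Typical use near the top of the script:
#   require_file "$ATLAS" || exit 1
```

Keeping the path in one variable means that if you later change the destination in your definition file, there is only one line in the script to update.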
If you have any license files that are needed for your programs to be operational, you would include lines for them in the file copy steps as well.
When using Docker, it can be useful to add the COPY step for your main run script file as one of the last lines in your Dockerfile. This is because docker build caches each step and checks whether an instruction, or a file it copies, has changed since the last successful build. If your script file changes, Docker reruns that COPY step and then all steps after it, which can take extra time. Putting your script file COPY step last means that if you build your container successfully and then need to modify your script, the very fast COPY step will be the only step that is re-done, since all prior steps will be pulled from the Docker cache.
Setting environment variables
Many programs use environment variables for configuration. You’ll also want to set the PATH variable within your container image so programs you install can be found by name rather than by typing their full location, e.g. being able to type fslmaths instead of /usr/local/fsl/bin/fslmaths every time.
Here is an example of setting environment variables in Apptainer syntax:
%environment
export OS=Linux
export FREESURFER_HOME=/usr/local/freesurfer
Here is the equivalent set of commands for a Dockerfile:
ENV OS=Linux
ENV FREESURFER_HOME=/usr/local/freesurfer
These lines are the equivalent of setting export OS=Linux in a bash shell, or setting setenv OS Linux in tcsh.
Any environment variables that were set in the base image you selected will carry over to your new container image automatically, so you won’t need to set those yourself unless you want to change them. For example, the definition files that use the freesurfer/freesurfer:7.3.2 Docker container as a base image will already have FREESURFER_HOME and other FreeSurfer-related environment variables set, so you will not need to set those. But if you are installing FreeSurfer “from scratch” in a container that uses a plain operating system base image, you will need to set those environment variables in your definition file.
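One way to catch a forgotten variable early is to have your workflow script check for it before doing any work. The helper below is a minimal sketch; require_env is a name I made up for illustration, and the list of variables to check is up to you.

```shell
# Hypothetical helper: fail with a clear message if any named
# environment variable is unset or empty.
require_env() {
    for name in "$@"; do
        # Look up the variable whose name is stored in $name.
        eval "value=\"\${${name}:-}\""
        if [ -z "$value" ]; then
            echo "error: environment variable $name is not set" >&2
            return 1
        fi
    done
}

# Typical use near the top of a workflow script:
#   require_env FREESURFER_HOME || exit 1
```

This works the same way whether the variable came from the base image or from your own %environment or ENV lines.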
On most operating systems, the PATH environment variable will be set by default. The PATH variable is a list of absolute paths to directories, with no spaces, with each item in the list separated by the : character. When you run a program by name on the command line, the system looks through each of the directories listed in PATH, in order, to find that program. If it can’t find a match in those directories, you see a “command not found” error.
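To make that lookup concrete, here is a simplified sketch of what the shell does when you type a program name. The function find_in_path is my own illustration, not a real system command; in practice you would just use command -v, and real shells also handle builtins, aliases, and caching that this sketch ignores.

```shell
# Simplified illustration of a PATH search. The body runs in a
# subshell (note the parentheses) so the IFS change does not leak.
find_in_path() (
    prog="$1"
    set -f      # do not expand wildcards while splitting PATH
    IFS=":"     # split PATH on the ':' separator
    for dir in $PATH; do
        # An empty PATH entry means the current directory.
        candidate="${dir:-.}/${prog}"
        if [ -f "$candidate" ] && [ -x "$candidate" ]; then
            echo "$candidate"
            exit 0
        fi
    done
    echo "${prog}: command not found" >&2
    exit 1
)

# Example: find_in_path fslmaths
```

The first matching directory wins, which is why the order of the entries in PATH matters.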
One of the steps when installing FSL into the container image is to add the full location of the bin directory in the FSL program folder to the PATH variable, by prepending it to the existing $PATH value. In Apptainer you would do this by adding this line in the %environment section:
export PATH=/usr/local/fsl/bin:$PATH
and in a Dockerfile you would add this line:
ENV PATH=/usr/local/fsl/bin:$PATH
After that, any program located in /usr/local/fsl/bin can be run by typing just the program name rather than its whole location, so you can type fslmaths instead of /usr/local/fsl/bin/fslmaths and the container will know what you’re talking about.
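Once an image is built, a quick way to confirm the PATH setup worked is to ask the shell inside the container where it finds each program. The function below is a small sketch of such a check; check_tools is a hypothetical name, and the tools you pass to it depend on what your workflow needs.

```shell
# Hypothetical smoke test: report where each tool is found on PATH
# and return non-zero if any of them is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool -> $(command -v "$tool")"
        else
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example, run from a shell inside the container:
#   check_tools fslmaths wget python3
```

If a tool shows up as missing, the usual causes are a typo in the PATH line or an install step that put the program somewhere other than where you expected.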
Add program install commands
The main installation steps of any programs you need will happen in the %post section (for Apptainer) or using RUN instruction steps (for Docker). For Docker, you will add RUN before each of your commands, like this:
RUN [command1]
RUN [command2]
RUN [command3]
and in Apptainer, you will have a section in your definition file for program commands, starting with the text %post and with an indented set of commands under it, like this:
%post
[command1]
[command2]
[command3]
The next sections will describe some types of commands you might need to include as part of your container setup.
Using operating system package managers
If your install steps involve downloading program files and you chose a base container that does not have a download program like curl or wget installed, you will have to install that program as part of your install commands. The command is the same one you would use to install the program on a Linux machine running the same operating system as your container, most likely through a Linux package manager. This is specific to the operating system of your base container.
If needed: Determining operating system of a base container image
The package manager your container uses depends on the operating system of your base container. If you chose a base Docker container for a particular program and it wasn’t clear what operating system it uses, you can quickly check by running that container from the command line with the docker run command and looking at the /etc/os-release file with the cat command. Here is an example of checking the operating system used in the freesurfer/freesurfer:7.3.2 Docker container.
docker run -t freesurfer/freesurfer:7.3.2 cat /etc/os-release
That command uses docker run to start a single container based on the image freesurfer/freesurfer:7.3.2, runs only the command cat /etc/os-release in it, and then stops the container.
In this case, the output looks something like this:
NAME="CentOS Stream"
VERSION="8"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="8"
PLATFORM_ID="platform:el8"
PRETTY_NAME="CentOS Stream 8"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:8"
HOME_URL="https://centos.org/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux 8"
REDHAT_SUPPORT_PRODUCT_VERSION="CentOS Stream"
So now you know the freesurfer/freesurfer:7.3.2 container image uses CentOS Stream 8 as its operating system, and can use CentOS 8 package manager commands to install any system programs you might need that weren’t already there.
The equivalent command for an Apptainer image called image.sif would be:
apptainer exec image.sif cat /etc/os-release
Most Linux installations have the /etc/os-release file available in that location, so you should be able to look at that file in many container images and see what operating system is in use.
You can use the operating system of your base image to look up which package manager is used by default. Then you would look up the commands of that particular package manager to install system programs.
Ubuntu Linux uses apt-get as its default package manager, so in container image definition files that use ubuntu:22.04 as the base image, you would use apt-get commands to install system packages. As shown above, the freesurfer/freesurfer:7.3.2 Docker image uses CentOS Stream 8 as the operating system, and CentOS includes the yum package manager, so you can use yum commands for installations. There might be multiple package managers available in an operating system, or you can use commands to install a different one if you prefer. Different operating systems and package managers have different packages and names available, so while there is a good amount of overlap for basic programs like zip and wget, some more advanced system libraries that are required for imaging programs might have different names or versions across package managers, and you will need to look them up in the online package repository for the package manager you are using.
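Because /etc/os-release is written as plain NAME=value lines, a shell can read it directly. The sketch below reads the ID field and suggests a package manager. The function name detect_pkg_manager and the short mapping in it are my own simplification covering only a handful of common distributions, so treat it as a starting point rather than a complete lookup table.

```shell
# Hypothetical helper: read the ID field from an os-release file and
# print a commonly used package manager for that distribution.
detect_pkg_manager() {
    os_release_file="${1:-/etc/os-release}"
    # Source the file in a subshell so its variables do not leak out.
    os_id="$( . "$os_release_file" && printf '%s' "$ID" )"
    case "$os_id" in
        ubuntu|debian) echo "apt-get" ;;
        centos|rhel)   echo "yum" ;;
        fedora)        echo "dnf" ;;
        alpine)        echo "apk" ;;
        *)
            echo "unknown distribution: ${os_id}" >&2
            return 1
            ;;
    esac
}

# Example, run from a shell inside the container:
#   detect_pkg_manager
```

For the CentOS Stream 8 output shown earlier, the ID field is "centos", so this sketch would suggest yum.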
Example installation steps using a package manager
Below are examples of the initial system package install steps in the Ubuntu-based containers. Since we are essentially installing packages in a new operating system, it’s best to refresh the package list first with apt-get -y update and then bring the installed packages up to date with apt-get -y upgrade. The -y flag is included so the command auto-selects “yes” rather than prompting you to enter something. When you are performing installations in a container image build, you will not be able to provide manual input or respond to prompts, so every command must be entirely automated. The && between the two commands chains them together into one line. This will shorten your definition file and reduce the number of build steps, but if one of the sub-steps fails, the entire line will fail, and you will need to figure out which sub-command failed and diagnose the issue.
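To see what && chaining means in practice, here is a small sketch you can run in any shell. The functions step_ok and step_fail are made-up stand-ins for real commands such as apt-get; they only print a message and succeed or fail.

```shell
# Stand-ins for real commands: one always succeeds, one always fails.
step_ok()   { echo "ok: $1"; }
step_fail() { echo "failed: $1" >&2; return 1; }

ran_install=no

# The chain stops at the first failing command, so the "install"
# step after the failed "upgrade" step never runs.
step_ok "update" && step_fail "upgrade" && { step_ok "install"; ran_install=yes; }
chain_status=$?

echo "chain exit status: ${chain_status}"
echo "install step ran: ${ran_install}"
```

The chain reports a non-zero exit status and the install step is skipped, which is the behavior that makes a chained build line fail as a whole.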
After the update and upgrade step we will install a list of packages with apt-get -y install. A couple of these are required for FSL. Some utilities like tar, zip, and unzip will be used for our install steps, or might be used in our custom workflow script, e.g. if we are zipping up files or un-gzipping NIFTI files.
In Apptainer, those first commands go in the %post section:
%post
apt-get -y update && apt-get -y upgrade
apt-get -y install bc git libopenblas-dev tar wget curl zip unzip python3
Here is the equivalent set of commands for a Dockerfile:
RUN apt-get -y update && apt-get -y upgrade
RUN apt-get -y install bc git libopenblas-dev tar wget curl zip unzip python3
Downloading external program files from a URL
Once you have the system packages in your container image updated, you can move on to running other program installations you need. Some program installations are relatively simple: you can just download a compressed archive and extract it. Here are example lines showing how that would happen in the Apptainer %post section:
%post
wget https://surfer.nmr.mgh.harvard.edu/pub/dist/freesurfer/7.3.2/freesurfer-linux-ubuntu22_amd64-7.3.2.tar.gz -O fs.tar.gz
tar --no-same-owner -xzvf fs.tar.gz
mv freesurfer /usr/local
rm fs.tar.gz
First the wget command you just installed is used to download the compressed archive from the URL and save it as the file fs.tar.gz. Then the tar command is used to extract the archive. As written, this command puts the extracted folder in the directory the command is run from. The output is a folder called freesurfer containing the program files, which you can then move with mv to the destination you want (in this case /usr/local). Then the rm command removes the original fs.tar.gz file to decrease the size of the container image.
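Here are the equivalent lines for a Dockerfile. They are joined into a single RUN instruction with && because each RUN creates its own image layer, so deleting fs.tar.gz in a separate RUN would not make the final image any smaller:
RUN wget https://surfer.nmr.mgh.harvard.edu/pub/dist/freesurfer/7.3.2/freesurfer-linux-ubuntu22_amd64-7.3.2.tar.gz -O fs.tar.gz && \
    tar --no-same-owner -xzvf fs.tar.gz && \
    mv freesurfer /usr/local && \
    rm fs.tar.gz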
These lines are in the example Docker and Apptainer definition files, but there the tar line also includes some --exclude options to skip certain sub-folders.
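As a rough sketch of how the extract, move, and clean-up steps can be wrapped together, the function below extracts an archive straight into a destination folder with tar's -C option and then deletes the archive. The name install_tarball is hypothetical, and any --exclude patterns you pass are your own choice; the sub-folder shown in the usage comment is a placeholder, not one of the folders excluded in the example definition files.

```shell
# Hypothetical helper: extract a .tar.gz archive into a destination
# directory, optionally passing extra tar options such as --exclude,
# and remove the archive afterwards to save space.
install_tarball() {
    archive="$1"
    dest="$2"
    shift 2
    mkdir -p "$dest" &&
        tar --no-same-owner "$@" -xzf "$archive" -C "$dest" &&
        rm "$archive"
}

# Example (the excluded sub-folder name is a placeholder):
#   install_tarball fs.tar.gz /usr/local --exclude='freesurfer/some_subfolder'
```

Extracting with -C avoids the separate mv step, and skipping folders you do not need keeps the image smaller than extracting everything and deleting afterwards.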
Running a script
For other installations you may need to run commands like installer scripts or configuration setup scripts. As long as the script doesn’t require user interaction when it runs, then you’ll just want to make sure that:
- You have an earlier step to get the script file into the container
- You have an earlier step that installs any program or programming language (like Python) needed to run the script
Then you can add the exact command needed to call the installation script, either after a RUN instruction for a Dockerfile, or in the %post section for an Apptainer definition file.
For the example of the FSL installer script, it runs on Python, so the earlier package manager installation steps include python3 as a package to install. To get the script into the container, I could use Docker COPY or a line in Apptainer’s %files section to copy the file into the container image, or in this case I can also use Docker RUN or a line in Apptainer’s %post section to download it from a URL with wget. Once the script is in the container, and I know where it is located, I can add the command I would use to install the program on a local computer, with a RUN instruction for a Dockerfile:
RUN python3 fslinstaller.py -d /usr/local/fsl --fslversion 6.0.4
Or in the %post section for Apptainer:
%post
python3 fslinstaller.py -d /usr/local/fsl --fslversion 6.0.4