Cloud Native 5 Minutes at a Time: Volumes and Persistent Storage
One of the biggest challenges for implementing cloud native technologies is learning the fundamentals—especially when you need to fit your learning into a busy schedule.
In this series, we’ll break down core cloud native concepts, challenges, and best practices into short, manageable exercises and explainers, so you can learn five minutes at a time. These lessons assume a basic familiarity with the Linux command line and a Unix-like operating system—beyond that, you don’t need any special preparation to get started.
In the last lesson, we learned how to use and share container images using a public registry such as Docker Hub. This time, we’re going to learn how to introduce persistent storage for containers.
Table of Contents
- What is a Container?
- Creating, Observing, and Deleting Containers
- Build Image from Dockerfile
- Using an Image Registry
- Volumes and Persistent Storage ← You are here
- Container Networking and Opening Container Ports
- Running a Containerized App
- Multi-Container Apps on User-Defined Networks
Persistent storage for containers
When technologists talk about why containers are useful for large-scale enterprise applications, they will often describe containers as “ephemeral”—a word that means short-lived, here and then gone.
At this point, we should have a firmer sense of why the ephemerality of containers is useful: we can spin up a container quickly and efficiently as needed, transferring data only for image layers that aren’t already stored on the host. This quality enables many of the systems we now describe as “cloud native,” including container orchestrators like Kubernetes.
But in the midst of all this transiency, there is a complication: most applications store persistent data for later use. Even simple websites—which can be built on static web servers with no need for persistent storage—are often created with content management systems like WordPress, and those content management systems use databases to save user logins, draft posts, and other data.
If we want to containerize an application such as WordPress, we need a way to create a persistent data store that is independent of any given container, so our ephemeral containers—created and destroyed as needed—can access, utilize, and update the same data.
With Docker, those persistent data stores are called volumes. By default, volumes are directories located on the host filesystem, within a specific directory managed by the Docker engine—but they can also be defined in remote locations such as cloud storage.
A given volume may be mounted to multiple containers at once, in either read-write or read-only mode, and it persists after any particular container is stopped or removed. If you’re using Docker Desktop on macOS or Windows, you will find that volumes have their own management tab.
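To make the idea concrete, here is a toy model in plain Python. It is not Docker, and it is not how Docker implements volumes: it only assumes that one temporary directory can stand in for a container's own writable filesystem and that a second directory can stand in for a volume. The helper name run_container and the file names are made up for illustration.

```python
import tempfile
from pathlib import Path


def run_container(volume_dir: Path, note: str) -> list:
    """Toy stand-in for one container's lifetime (hypothetical helper).

    The "container" gets a private scratch directory that is deleted when
    this function returns, much as a container's writable layer goes away
    when the container is removed. Files under volume_dir outlive it.
    """
    with tempfile.TemporaryDirectory() as scratch:
        # Written to the container's own filesystem: gone after this block.
        (Path(scratch) / "local.txt").write_text(note)
        # Appended to the "volume": still there for the next container.
        with open(volume_dir / "shared.txt", "a") as shared:
            shared.write(note + "\n")
        return (volume_dir / "shared.txt").read_text().splitlines()


# The outer directory plays the role of the volume for this demo.
with tempfile.TemporaryDirectory() as volume:
    volume_dir = Path(volume)
    first = run_container(volume_dir, "from container 1")
    second = run_container(volume_dir, "from container 2")
    print(first)
    print(second)
```

The second "container" sees what the first one wrote to the shared directory, while each private scratch directory disappears with its owner.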
Now let’s try creating a volume and mounting it to multiple containers.
Exercise: Opening the first volume
To start, we’ll create a new volume:
docker volume create d6app
With this command, we’re creating a new volume and giving it the name “d6app.”
Next, let’s create a new container from the d6 image we made in the last lesson. As we create this new container, we’ll mount the volume we just created.
docker run --name d6v2 -it -v d6app:/d6app <Your Docker ID>/d6:1.0 bash
The syntax here can be a little confusing at first glance—let’s take a moment to break it down:
- We’re executing docker run as usual, specifying the name “d6v2” for our new container.
- The -it flags give us an interactive shell: -i keeps standard input open, and -t allocates a terminal.
- The -v flag identifies and mounts a volume to the container.
- The first instance of “d6app” (before the colon) specifies the volume to mount—in this case, “d6app,” which we just created.
- "/d6app" tells us where we will be able to find the contents of this volume in our new container: a directory called d6app. We could have called this directory anything, but in this case, we’re simply using the same name as the volume.
- Next, we’re specifying the source of the container image.
- Finally, we’re opening a bash shell.
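The breakdown above can also be sketched as a small Python helper that assembles the same command piece by piece. This is only an illustration: the function names volume_flag and docker_run_command are hypothetical, "mydockerid" is a placeholder for your own Docker ID, and the script prints the command rather than running Docker. The optional :ro suffix is the Docker CLI's way of requesting a read-only mount.

```python
import shlex


def volume_flag(volume: str, mount_path: str, read_only: bool = False) -> str:
    """Build the value passed to -v: VOLUME:CONTAINER_PATH, plus :ro if read-only."""
    spec = f"{volume}:{mount_path}"
    return spec + ":ro" if read_only else spec


def docker_run_command(name, image, volume, mount_path, read_only=False):
    """Assemble the docker run argument list (hypothetical helper)."""
    return [
        "docker", "run",
        "--name", name,          # name for the new container
        "-it",                   # interactive shell
        "-v", volume_flag(volume, mount_path, read_only),
        image,                   # source image
        "bash",                  # command to run inside the container
    ]


# "mydockerid" is a placeholder, not a real account.
cmd = docker_run_command("d6v2", "mydockerid/d6:1.0", "d6app", "/d6app")
print(shlex.join(cmd))
print(volume_flag("d6app", "/d6app", read_only=True))
```

Keeping the volume name and the mount path as separate arguments makes it easier to see that they are independent: the part before the colon names the volume, and the part after it is a path inside the container.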
Now we should be working in a bash shell within our container. Let’s take a look at our container filesystem with the ls command. We should see the directory where the d6app volume is mounted:
bin boot d6.py d6app dev etc home lib lib64 media mnt opt proc root run sbin srv sys tmp usr var
If we change into that directory and list its contents, we should find it empty:
cd d6app
ls
While we’re here, we’ll create an empty text file named d6log.txt, which our app will write to shortly:
touch d6log.txt
Now we'll go back into the container’s root directory and open the Python app we wrote last lesson. (We won’t need to download nano this time—it’s part of the image now.)
Let’s add some functionality to our die-rolling app. It’s all well and good to get randomized die rolls, but wouldn’t it be nice if we could record those rolls, so we can keep a log of our amazing (or terrible) luck? Update the d6.py file contents to look like this:
from random import randint
#open the storage file in append mode
outfile = open('/d6app/d6log.txt', 'a')
#assign a random integer between 1 and 6, inclusive, to a variable
roll = randint(1, 6)
#print the variable
print(roll)
#convert the value to a string and write to the file
rollstr = str(roll)
outfile.write(rollstr + '\n')
#close the file
outfile.close()
When we run d6.py now, the program should open the text file in the d6app volume, convert the randomized output to a string, and record that string (and a line break) for posterity. Let’s test it out.
Within the file, we should find our recorded result.
Try running the app a few more times and check the file again.
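If you would like to check the append behavior outside a container, the sketch below wraps the same logic in a function and points it at a temporary directory instead of the /d6app volume. The function name roll_and_log and the temporary directory are assumptions made for this example; the with statement is an alternative to the explicit close used in d6.py.

```python
import tempfile
from pathlib import Path
from random import randint


def roll_and_log(log_path: Path) -> int:
    """Roll one six-sided die and append the result to log_path."""
    roll = randint(1, 6)
    # Append mode ("a") creates the file if needed and never truncates it.
    # The with statement closes the file even if the write fails.
    with open(log_path, "a") as outfile:
        outfile.write(str(roll) + "\n")
    return roll


# A temporary directory stands in for the mounted volume here.
with tempfile.TemporaryDirectory() as stand_in_volume:
    log_path = Path(stand_in_volume) / "d6log.txt"
    rolls = [roll_and_log(log_path) for _ in range(3)]
    logged = log_path.read_text().split()
    print(rolls, logged)
```

Each call adds one line to the log, so after three rolls the file holds three entries in the order they were rolled.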
But what happens when we stop the container? Well, first let’s commit and push this updated version of the d6 app to Docker Hub. With the container still running, open another terminal session and enter:
docker commit -m "Updating to v2" d6v2 d6:2.0
docker tag d6:2.0 <Your Docker ID>/d6:2.0
docker push <Your Docker ID>/d6:2.0
The push refers to repository [docker.io/<Your Docker ID>/d6]
3b2cfa75b8bf: Layer already exists
128b344cad66: Layer already exists
88597958b14e: Layer already exists
db26989c2f90: Layer already exists
a3232401de62: Layer already exists
204e42b3d47b: Layer already exists
613ab28cf833: Layer already exists
bed676ceab7a: Layer already exists
6398d5cccd2c: Layer already exists
0b0f2f2f5279: Layer already exists
Now we’re going to start a new container with nearly the same command that we used at the beginning of this lesson:
docker run --name d6test -it -v d6app:/d6app <Your Docker ID>/d6:1.0 bash
Here, we can see that we’re starting a new container based on the version 1.0 image of the d6 app—before we made any changes today. We’re mounting the d6app volume, as before, and opening an interactive shell session.
If we open the d6.py file, we’ll find it unchanged. This is the old 1.0 image, after all. But if we open the d6log.txt file in the mounted d6app volume…
Our file has persisted, carrying the data across multiple containers. We can make a manual edit to the file (typing “test,” say) and then return to the first container. There, let’s run the Python app again and then check the log file. It should show our earlier rolls, the manual “test” entry, and the newest roll, all in the same file.
With persistent storage, we’ve found a way to create some continuity between disparate containers. In the next lesson, we’ll go a step further and dive into the fundamentals of container networking, so we can make multiple containers work together in real time.