Cloud Native 5 Minutes at a Time: Volumes and Persistent Storage

Eric Gregory - March 04, 2022

One of the biggest challenges for implementing cloud native technologies is learning the fundamentals—especially when you need to fit your learning into a busy schedule.

In this series, we’ll break down core cloud native concepts, challenges, and best practices into short, manageable exercises and explainers, so you can learn five minutes at a time. These lessons assume a basic familiarity with the Linux command line and a Unix-like operating system—beyond that, you don’t need any special preparation to get started.

In the last lesson, we learned how to use and share container images using a public registry such as Docker Hub. This time, we’re going to learn how to introduce persistent storage for containers.

What is a Container?
Creating, Observing, and Deleting Containers
Build Image from Dockerfile
Using an Image Registry
Volumes and Persistent Storage ← You are here
Container Networking and Opening Container Ports
Running a Containerized App
Multi-Container Apps on User-Defined Networks

Persistent storage for containers

When technologists talk about why containers are useful for large-scale enterprise applications, they will often describe containers as “ephemeral”—a word that means short-lived, here and then gone.

At this point, we should have a firmer sense of why the ephemerality of containers is useful—we can quickly and very efficiently spin up a container as needed, only transferring data for image layers not shared by other containers. This quality enables many of the systems we now describe as “cloud native,” including container orchestrators like Kubernetes.

But in the midst of all this transiency, there is a complication: most applications store persistent data for later use. Even simple websites—which can be built on static web servers with no need for persistent storage—are often created with content management systems like WordPress, and those content management systems use databases to save user logins, draft posts, and other data.

If we want to containerize an application such as WordPress, we need a way to create a persistent data store that is independent of any given container, so our ephemeral containers—created and destroyed as needed—can access, utilize, and update the same data.

With Docker, those persistent data stores are called volumes. By default, volumes are directories located on the host filesystem, within a specific directory managed by the Docker engine. With an appropriate volume driver, they can also be backed by remote locations such as cloud storage.

What about bind mounts?

Docker provides another way for containers to access data on the host filesystem: bind mounts. A bind mount maps a file or directory of your choosing, from anywhere on the host filesystem, into the container, where the container can read and write it. This can be useful for experimentation on your local machine, but there are several reasons to prefer volumes for general usage: they are manageable with the Docker command line interface, they have a more precisely defined scope, and they are better suited to cloud native architectures. For these reasons, we will focus on volumes here.

A given volume may be mounted to multiple containers at once, in either read-write or read-only mode, and will persist after any particular container is stopped or removed. If you’re using Docker Desktop on macOS or Windows, you will find that volumes have their own management tab.

Now let’s try creating a volume and mounting it to multiple containers.

Exercise: Opening the first volume

To start, we’ll create a new volume:

docker volume create d6app

With this command, we’re creating a new volume and giving it the name “d6app.”
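To see where Docker keeps this volume, the docker volume inspect command prints a JSON description of it. The sketch below shows one way to read that output from Python. The helper names inspect_volume and volume_mountpoint are hypothetical, and the sample JSON is only an illustration shaped like typical output on a Linux host; the fields and path on your machine may differ.

```python
import json
import subprocess


def inspect_volume(name):
    """Run `docker volume inspect NAME` and return its raw JSON text.

    Assumes the docker CLI is installed and the volume already exists.
    """
    result = subprocess.run(
        ["docker", "volume", "inspect", name],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def volume_mountpoint(inspect_json):
    """Return the Mountpoint of the first volume in the inspect output."""
    volumes = json.loads(inspect_json)  # the CLI prints a JSON array
    return volumes[0]["Mountpoint"]


# Illustrative sample only (an assumption, not captured from a real host).
SAMPLE = """
[
    {
        "Driver": "local",
        "Mountpoint": "/var/lib/docker/volumes/d6app/_data",
        "Name": "d6app",
        "Scope": "local"
    }
]
"""

print(volume_mountpoint(SAMPLE))
```

With Docker running, volume_mountpoint(inspect_volume("d6app")) would report the real location. On Docker Desktop, that path lives inside Docker's virtual machine rather than directly on your macOS or Windows filesystem.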

Next, let’s create a new container from the d6 image we made in the last lesson. As we create this new container, we’ll mount the volume we just created.

docker run --name d6v2 -it -v d6app:/d6app <Your Docker ID>/d6:1.0 bash

The syntax here can be a little confusing at first glance—let’s take a moment to break it down:

We’re executing docker run as usual, specifying the name “d6v2” for our new container.
The -it flags (-i and -t together) keep standard input open and allocate a terminal, giving us an interactive shell.
The -v flag identifies and mounts a volume to the container:

The first instance of “d6app” (before the colon) specifies the volume to mount: in this case, “d6app,” which we just created.
“/d6app” (after the colon) tells Docker where the contents of this volume will appear in our new container: a directory called d6app at the root of the container filesystem. We could have called this directory anything, but in this case, we’re simply using the same name as the volume.

Next, we’re specifying the source of the container image.
Finally, bash is the command to run inside the container, which opens a bash shell.
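The two halves of the -v argument can be easier to see when split apart in code. The sketch below is a deliberately simplified illustration, not how Docker itself parses the flag: the VolumeSpec type and parse_volume_spec function are hypothetical names, and only the volume:path form used in this lesson (plus an optional third field such as ro) is handled.

```python
from typing import NamedTuple


class VolumeSpec(NamedTuple):
    source: str  # volume name (before the first colon)
    target: str  # path inside the container (after the colon)
    mode: str    # "rw" unless an option such as "ro" is given


def parse_volume_spec(spec):
    """Split a simple -v argument such as 'd6app:/d6app' into its parts.

    Simplified on purpose: handles only source:target[:mode].
    """
    parts = spec.split(":")
    if len(parts) == 2:
        source, target = parts
        mode = "rw"  # Docker mounts volumes read-write by default
    elif len(parts) == 3:
        source, target, mode = parts
    else:
        raise ValueError(f"expected source:target[:mode], got {spec!r}")
    return VolumeSpec(source, target, mode)


print(parse_volume_spec("d6app:/d6app"))
```

Here the source is the volume we created and the target is the directory where it appears in the container; the two happen to share a name only because we chose it that way.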

Now we should be working in a bash shell within our container. Let’s take a look at our container filesystem with the ls command. We should see the directory where the d6app volume is mounted:

ls
bin  boot  d6.py  d6app  dev  etc  home  lib  lib64  media  mnt  opt  proc  root  run  sbin  srv  sys  tmp  usr  var

If we change directory and inspect the contents, we should find it empty.

cd d6app
ls
<nothing>

While we’re here, we’ll create an empty text file:

touch d6log.txt

Now we'll go back into the container’s root directory and open the Python app we wrote last lesson. (We won’t need to download nano this time—it’s part of the image now.)

cd ..
nano d6.py

Let’s add some functionality to our die-rolling app. It’s all well and good to get randomized die rolls, but wouldn’t it be nice if we could record those rolls, so we can keep a log of our amazing (or terrible) luck? Update the d6.py file contents to look like this:

# import the random-integer function
from random import randint

# open the storage file in append mode; the with block closes it automatically
with open('/d6app/d6log.txt', 'a') as outfile:
    # assign a random integer between 1 and 6, inclusive, to a variable
    roll = randint(1, 6)

    # print the variable
    print(roll)

    # convert the value to a string and write it to the file, followed by a line break
    outfile.write(str(roll) + '\n')

When we run d6.py now, the program should open the text file in the d6app volume, convert the randomized output to a string, and record that string (and a line break) for posterity. Let’s test it out.

python d6.py
5

Within the file, we should find our recorded result.

nano d6app/d6log.txt

Try running the app a few more times and check the file again.
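Once the log has a few entries, a short script can summarize it instead of reading the file by eye. This is a minimal sketch, assuming the one-roll-per-line format that d6.py writes; summarize_rolls is a hypothetical helper, and the demo runs against a temporary stand-in file rather than the real volume.

```python
from collections import Counter
import os
import tempfile


def summarize_rolls(path):
    """Count how many times each die face appears in the log at `path`.

    Lines that are not a whole number from 1 to 6 are skipped.
    """
    valid_faces = {"1", "2", "3", "4", "5", "6"}
    counts = Counter()
    with open(path) as logfile:
        for line in logfile:
            text = line.strip()
            if text in valid_faces:
                counts[int(text)] += 1
    return counts


# Demo on a temporary file standing in for /d6app/d6log.txt.
with tempfile.TemporaryDirectory() as tmp:
    demo_log = os.path.join(tmp, "d6log.txt")
    with open(demo_log, "w") as f:
        f.write("5\n2\n5\n")
    for face, count in sorted(summarize_rolls(demo_log).items()):
        print(face, count)
```

Inside the container, the same function could be pointed at the real log with summarize_rolls("/d6app/d6log.txt").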

But what happens when we stop the container? Well, first let’s commit and push this updated version of the d6 app to Docker Hub. With the container still running, open another terminal session and enter:

docker commit -m "Updating to v2" d6v2 d6:2.0
docker tag d6:2.0 <Your Docker ID>/d6:2.0
docker push <Your Docker ID>/d6:2.0
The push refers to repository [docker.io/<Your Docker ID>/d6]
3280116fcd01: Pushed 
3b2cfa75b8bf: Layer already exists 
128b344cad66: Layer already exists 
88597958b14e: Layer already exists 
db26989c2f90: Layer already exists 
a3232401de62: Layer already exists 
204e42b3d47b: Layer already exists 
613ab28cf833: Layer already exists 
bed676ceab7a: Layer already exists 
6398d5cccd2c: Layer already exists 
0b0f2f2f5279: Layer already exists

Now we’re going to start a new container with nearly the same command that we used at the beginning of this lesson:

docker run --name d6test -it -v d6app:/d6app <Your Docker ID>/d6:1.0 bash

Here, we can see that we’re starting a new container based on the version 1.0 image of the d6 app—before we made any changes today. We’re mounting the d6app volume, as before, and opening an interactive shell session.

Read-only?

This is a good place for us to pause and note that not all containers need to mount volumes with read-write access, and as a general rule, we don’t want to give containers any more privileges than they require. To mount a volume with read-only access, we simply append :ro to the container path in the -v argument. For the command above, this would look like: -v d6app:/d6app:ro
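One consequence worth knowing: in a container that mounts the volume with :ro, an app that opens the log for appending will fail, because Python raises an OSError when a file on a read-only filesystem is opened for writing. The sketch below shows one possible way an app could cope, under the assumption that skipping the log is acceptable; record_roll is a hypothetical helper, and the demo uses a temporary directory in place of the volume.

```python
import os
import tempfile
from random import randint


def record_roll(roll, path):
    """Append `roll` to the log at `path`; return True if it was saved.

    If the file cannot be opened for writing (for example, because the
    volume is mounted read-only), the OSError is caught and False is
    returned instead of crashing the app.
    """
    try:
        with open(path, "a") as outfile:
            outfile.write(str(roll) + "\n")
    except OSError:
        return False
    return True


# Demo against a temporary directory standing in for the volume.
with tempfile.TemporaryDirectory() as tmp:
    roll = randint(1, 6)
    saved = record_roll(roll, os.path.join(tmp, "d6log.txt"))
    print(roll, "saved" if saved else "not saved")
```

Whether to ignore the failure, warn the user, or stop entirely is a design choice; the point is only that a read-only mount surfaces in the app as an ordinary file error.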

If we open the d6.py file, we’ll find it unchanged. This is the old 1.0 image, after all. But if we open the d6log.txt file in the mounted d6app volume…

nano d6app/d6log.txt

Our file has persisted, carrying the data across multiple containers. We can make a manual edit to the file (typing “test” on a new line, say), save it, and then return to the terminal session for our first container, d6v2. There, let’s run the updated Python app again and then check the log file:

python d6.py
nano /d6app/d6log.txt

With persistent storage, we’ve found a way to create some continuity between disparate containers. In the next lesson, we’ll go a step further and dive into the fundamentals of container networking, so we can make multiple containers work together in real time.
