Cloud Native 5 Minutes at a Time: Volumes and Persistent Storage

Eric Gregory - March 4, 2022

One of the biggest challenges for implementing cloud native technologies is learning the fundamentals—especially when you need to fit your learning into a busy schedule.

In this series, we’ll break down core cloud native concepts, challenges, and best practices into short, manageable exercises and explainers, so you can learn five minutes at a time. These lessons assume a basic familiarity with the Linux command line and a Unix-like operating system—beyond that, you don’t need any special preparation to get started.

In the last lesson, we learned how to use and share container images using a public registry such as Docker Hub. This time, we’re going to learn how to introduce persistent storage for containers.

Table of Contents

  1. What is a Container?
  2. Creating, Observing, and Deleting Containers
  3. Build Image from Dockerfile
  4. Using an Image Registry
  5. Volumes and Persistent Storage ← You are here
  6. Container Networking and Opening Container Ports
  7. Running a Containerized App
  8. Multi-Container Apps on User-Defined Networks
  9. Docker Compose and Next Steps

Persistent storage for containers

When technologists talk about why containers are useful for large-scale enterprise applications, they will often describe containers as “ephemeral”—a word that means short-lived, here and then gone.

At this point, we should have a firmer sense of why the ephemerality of containers is useful—we can quickly and very efficiently spin up a container as needed, only transferring data for image layers not shared by other containers. This quality enables many of the systems we now describe as “cloud native,” including container orchestrators like Kubernetes.

But in the midst of all this transiency, there is a complication: most applications store persistent data for later use. Even simple websites—which can be built on static web servers with no need for persistent storage—are often created with content management systems like WordPress, and those content management systems use databases to save user logins, draft posts, and other data.

If we want to containerize an application such as WordPress, we need a way to create a persistent data store that is independent of any given container, so our ephemeral containers—created and destroyed as needed—can access, utilize, and update the same data.

With Docker, those persistent data stores are called volumes. By default, volumes are directories located on the host filesystem, within a specific directory managed by the Docker engine—but they can also be defined in remote locations such as cloud storage.

What about bind mounts?

Docker provides another way for containers to access data on the host filesystem: bind mounts. A bind mount gives a container direct read-write access to an arbitrary path on the host. This can be useful for experimentation on your local machine, but there are several reasons to prefer volumes for general usage: they can be managed with the Docker command line interface, they have a more precisely defined scope, and they are better suited to cloud native architectures. For these reasons, we will focus on volumes here.
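To make the distinction concrete, here is a sketch of the two mount syntaxes side by side. The host path, volume name, and image name below are all hypothetical placeholders:

```shell
# Bind mount: the host path before the colon is exposed directly
# to the container (here, a hypothetical /home/me/data directory).
docker run -v /home/me/data:/data myimage

# Named volume: Docker manages the storage location itself; we only
# refer to the volume by name.
docker run -v mydata:/data myimage
```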

A given volume may be mounted to multiple containers at once, in either read-write or read-only mode, and will persist beyond the closure of a particular container. If you’re using Docker Desktop on macOS or Windows, you will find that volumes have their own management tab.

Volumes view in Docker Desktop

Now let’s try creating a volume and mounting it to multiple containers.

Exercise: Opening the first volume

You can follow a version of this lesson’s exercise in the video below.

To start, we’ll create a new volume:

docker volume create d6app

With this command, we’re creating a new volume and giving it the name “d6app.”
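If you'd like to confirm the volume exists and see where Docker keeps it, the CLI can list and inspect volumes. (The Mountpoint shown in the inspect output will vary by host and Docker installation.)

```shell
# List all volumes known to the Docker engine
docker volume ls

# Show details for our new volume, including its Mountpoint
# on the host filesystem
docker volume inspect d6app
```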

Next, let’s create a new container from the d6 image we made in the last lesson. As we create this new container, we’ll mount the volume we just created.

docker run --name d6v2 -it -v d6app:/d6app <username>/d6:1.0 bash

The syntax here can be a little confusing at first glance—let’s take a moment to break it down:

  • We’re executing docker run as usual, specifying the name “d6v2” for our new container.
  • The -it flags give us an interactive terminal session.
  • The -v flag identifies and mounts a volume to the container.
    • The first instance of “d6app” (before the colon) specifies the volume to mount—in this case, “d6app,” which we just created.
    • “/d6app” (after the colon) is the path where the volume’s contents will appear inside our new container: a directory called d6app. We could have chosen any path, but in this case, we’re simply reusing the name of the volume.
  • Next, we’re specifying the source of the container image.
  • Finally, we’re opening a bash shell.

Now we should be working in a bash shell within our container. Let’s take a look at our container filesystem with the ls command. We should see the directory where the d6app volume is mounted:

ls
bin  boot  d6.py  d6app  dev  etc  home  lib  lib64  media  mnt  opt  proc  root  run  sbin  srv  sys  tmp  usr  var

If we change directory and inspect the contents, we should find it empty.

cd d6app
ls

While we’re here, we’ll create an empty text file:

touch d6log.txt

Now we’ll go back into the container’s root directory and open the Python app we wrote last lesson. (We won’t need to install nano this time—it’s part of the image now.)

cd ..
nano d6.py

Let’s add some functionality to our die-rolling app. It’s all well and good to get randomized die rolls, but wouldn’t it be nice if we could record those rolls, so we can keep a log of our amazing (or terrible) luck? Update the d6.py file contents to look like this:

#import module
from random import randint
 
#open the storage file in append mode
outfile = open('/d6app/d6log.txt', 'a')
 
#assign a random integer between 1 and 6, inclusive, to a variable
roll = randint(1, 6)
 
#print the variable
print(roll)
 
#convert the value to a string and write to the file
rollstr = str(roll)
outfile.write(rollstr + '\n')
 
#close the file
outfile.close()

When we run d6.py now, the program should open the text file in the d6app volume, convert the randomized output to a string, and record that string (and a line break) for posterity. Let’s test it out.
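One side note on the code itself before we move on: the explicit open/close pattern above works, but idiomatic Python usually wraps file access in a with statement, which closes the file automatically even if an error occurs mid-write. Here is a sketch of the same logic in that style; the path parameter is only there to make the function easy to exercise outside the container:

```python
from random import randint

def roll_and_log(path='/d6app/d6log.txt'):
    """Roll a d6, append the result to the log file, and return it."""
    roll = randint(1, 6)
    # 'a' opens in append mode; the with block closes the file on exit
    with open(path, 'a') as outfile:
        outfile.write(str(roll) + '\n')
    return roll
```

Inside the container, print(roll_and_log()) would reproduce the behavior of d6.py.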

python d6.py
5

Within the file, we should find our recorded result.

nano d6app/d6log.txt

Contents of the d6log.txt file

Try running the app a few more times and check the file again.

Contents of the d6log.txt file
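Eyeballing the log works for a handful of rolls, but if we wanted to summarize our luck, a small helper could tally the results. This is a hypothetical addition, not part of the lesson’s app; it assumes the log lives at the mount point we chose above:

```python
from collections import Counter

def tally_rolls(path='/d6app/d6log.txt'):
    """Return a Counter mapping each die face to how often it was rolled."""
    with open(path) as f:
        return Counter(int(line) for line in f if line.strip())
```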

But what happens when we stop the container? Well, first let’s commit and push this updated version of the d6 app to Docker Hub. With the container still running, open another terminal session and enter:

docker commit -m "Updating to v2" d6v2 d6:2.0
docker tag d6:2.0 <username>/d6:2.0
docker push <username>/d6:2.0
The push refers to repository [docker.io/<username>/d6]
3280116fcd01: Pushed 
3b2cfa75b8bf: Layer already exists 
128b344cad66: Layer already exists 
88597958b14e: Layer already exists 
db26989c2f90: Layer already exists 
a3232401de62: Layer already exists 
204e42b3d47b: Layer already exists 
613ab28cf833: Layer already exists 
bed676ceab7a: Layer already exists 
6398d5cccd2c: Layer already exists 
0b0f2f2f5279: Layer already exists 

Now we’re going to start a new container with nearly the same command that we used at the beginning of this lesson:

docker run --name d6test -it -v d6app:/d6app <username>/d6:1.0 bash

Here, we can see that we’re starting a new container based on the version 1.0 image of the d6 app—before we made any changes today. We’re mounting the d6app volume, as before, and opening an interactive shell session.

Read-only?

This is a good place for us to pause and note that not all containers need read-write access to their volumes, and as a general rule, we don’t want to give containers any more privileges than they require. To mount a volume in read-only mode, we simply append :ro to the container path in the mount specification. For the command above, this would look like: -v d6app:/d6app:ro
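Put together, a read-only variant of the command would look like the following (the container name d6ro is just a hypothetical choice):

```shell
# Mount the d6app volume read-only: the container can read /d6app,
# but any attempt to write there will fail.
docker run --name d6ro -it -v d6app:/d6app:ro <username>/d6:1.0 bash
```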

If we open the d6.py file, we’ll find it unchanged. This is the old 1.0 image, after all. But if we open the d6log.txt file in the mounted d6app volume…

nano d6app/d6log.txt

Contents of the d6log.txt file

Our file has persisted, carrying the data across multiple containers. We can make a manual edit to the file—typing “test,” say—and then return to the first container. Here, let’s run the Python app again and then check the log file within our initial and most current container:

python d6.py
nano /d6app/d6log.txt
Contents of the d6log.txt file

With persistent storage, we’ve found a way to create some continuity between disparate containers. In the next lesson, we’ll go a step further and dive into the fundamentals of container networking, so we can make multiple containers work together in real time.
