Though most people may not consider it during the early stages of learning how to use Docker for deployment, data persistence is one of the problems that needs to be addressed when working with Docker containers. Previously, I wrote about virtualization using either Virtual Machines or Docker, though I mostly focused on how each works on top of an operating system and what each requires in terms of system resources. What I did not mention, however, was how Virtual Machines and Docker handle data persistence.
We are aware that the host systems we use retain the data we create between sessions. This means that if I power off the computer I am using now and turn it back on after submitting this blog post, the majority of the data created in the previous session will be available in the next one. Once we begin to work with virtualization, however, persistence becomes something we have to think about explicitly. In the case of virtual machines, data persistence is not much of an issue, since a virtual machine typically keeps its data on a virtual disk that survives a shutdown.
Data persistence for Docker containers, however, works differently. The Docker documentation states:
The data doesn’t persist when that container no longer exists, and it can be difficult to get the data out of the container if another process needs it.
When a container is removed, whatever files were created in its writable layer are deleted along with it, and a new container started from the same image will essentially run on a “clean state”. Stopping and restarting the same container does keep those files, but containers are usually treated as disposable and replaced rather than restarted, so data kept only inside the container is easy to lose. However, there is a way to make data outlive a Docker container: attaching a bind mount or a volume to it. Though Docker offers other types of mounts, such as tmpfs mounts and named pipes, I will mostly be focusing on bind mounts and volumes for the remainder of this post as ways of persisting a Docker container's data on the host machine.
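To make the two approaches more concrete, here is a minimal Python sketch that only assembles the docker run command lines for a named volume and for a bind mount; it does not invoke Docker itself. The --mount flag and its type, source, and target keys come from the Docker CLI, while the volume name blog-data, the host path /home/me/site, the container path /app/data, and the nginx image are placeholders I picked for illustration. The absolute-path check assumes a Linux or macOS host.

```python
import shlex


def docker_run_with_mount(image, mount_type, source, target, name=None):
    """Build (but do not execute) a `docker run` command with one mount.

    mount_type is "volume" (source is a volume name managed by Docker)
    or "bind" (source is an absolute path on the host).
    """
    if mount_type not in ("volume", "bind"):
        raise ValueError(f"unsupported mount type: {mount_type}")
    # Assumption: a Unix-like host, where absolute paths start with "/".
    if mount_type == "bind" and not source.startswith("/"):
        raise ValueError("bind mounts need an absolute host path")

    cmd = ["docker", "run", "-d"]
    if name:
        cmd += ["--name", name]
    mount_spec = f"type={mount_type},source={source},target={target}"
    cmd += ["--mount", mount_spec, image]
    return cmd


# Hypothetical names and paths, chosen only for this example.
volume_cmd = docker_run_with_mount("nginx", "volume", "blog-data", "/app/data")
bind_cmd = docker_run_with_mount("nginx", "bind", "/home/me/site", "/app/data")

# Print the commands so they can be reviewed before running them by hand.
print(shlex.join(volume_cmd))
print(shlex.join(bind_cmd))
```

In both cases, files the container writes under /app/data end up outside its writable layer, so they are still there after the container is removed. The only difference in the command is the mount type and what the source refers to.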
While researching the differences between bind mounts and volumes, I came across two resources: a tutorial titled Docker Volumes – Tutorial on buildVirtual.net, and an article titled Guide to Docker Volumes on Baeldung.com by Ashley Frieze. In the Baeldung article, Frieze shows how the Docker file system works and, in turn, how that affects data retention in a Docker container, before explaining the differences between volumes and bind mounts. The buildVirtual tutorial outlines the same differences and also shows how to create, use, and delete volumes through docker commands.
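As a rough illustration of what managing a volume through docker commands looks like, the sketch below lists the Docker CLI subcommands for a volume's typical life cycle. It is my own summary rather than the exact sequence used in either resource, it only prints the commands instead of running them, and the volume name blog-data is a placeholder.

```python
import shlex


def volume_lifecycle(volume_name):
    """Return the Docker CLI commands for a volume's typical life cycle."""
    return [
        ["docker", "volume", "create", volume_name],   # create a named volume
        ["docker", "volume", "ls"],                    # list existing volumes
        ["docker", "volume", "inspect", volume_name],  # show details such as the mountpoint
        ["docker", "volume", "rm", volume_name],       # delete the volume and its data
    ]


# "blog-data" is a hypothetical volume name used only for this example.
commands = volume_lifecycle("blog-data")
for cmd in commands:
    print(shlex.join(cmd))
```

Note that Docker refuses to remove a volume that a container is still using, so the last command is normally run after the containers that reference the volume have been removed.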
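Although both bind mounts and volumes can be used for data persistence, it is important to know which one to use depending on where we want the data to be stored on the host system and how other Docker or non-Docker processes may need to interact with it. A volume lives in a part of the host file system that Docker manages, whereas a bind mount maps a host path that we choose ourselves and that other processes on the host can read and modify directly.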
Direct links to the resources referenced in the post: https://www.baeldung.com/ops/docker-volumes and https://buildvirtual.net/amp/docker-volumes-tutorial/
Recommended materials/resources reviewed related to Docker mounts and volumes:
1) https://4sysops.com/archives/introduction-to-docker-bind-mounts-and-volumes/
2) https://medium.com/@BeNitinAgarwal/docker-containers-filesystem-demystified-b6ed8112a04a
3) https://www.baeldung.com/ops/docker-container-filesystem
4) https://digitalvarys.com/docker-volume-vs-bind-mounts-vs-tmpfs-mount/
5) https://medium.com/devops-dudes/docker-volumes-and-bind-mounts-2fb4bd9df09d
6) https://docs.microsoft.com/en-us/visualstudio/docker/tutorials/use-bind-mounts
7) https://blog.logrocket.com/docker-volumes-vs-bind-mounts/
From the blog CS@Worcester – CompSci Log by sohoda and used with permission of the author. All other rights reserved by the author.