The Kubernetes Series - Storage

Previously we took a closer look at containers. This post covers how to manage persistent storage for your Pods.

Volumes

Volumes can best be viewed as shared hard drives between containers: a place where one container can save data so that other containers can access it, and where the data persists even if the container itself gets destroyed.

Storing Data On The Node

Let's set up a Pod definition file that allows its containers to save and access data on the node.

apiVersion: v1
kind: Pod
metadata:
  name: my-awesome-app
  labels:
    app: awesome-app
spec:
  containers:
    - name: awesome-app-container
      image: awesome-app
      command: ["/bin/sh", "-c"]
      args: ["echo 'stuff in file' > /node-drive/file.md"]
      volumeMounts:
        - mountPath: /node-drive
          name: node-volume # must match the volume name below
  volumes:
    - name: node-volume
      hostPath:
        path: /node-volume # folder on the node
        type: Directory
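
To see this in action, here is a minimal sketch, assuming the manifest above is saved as node-volume-pod.yaml (a hypothetical filename). Note that hostPath with type: Directory expects /node-volume to already exist on the node:

kubectl apply -f node-volume-pod.yaml

# the container exits after writing the file, but the data
# survives on the node itself - inspect it from the node:
cat /node-volume/file.md

# deleting the Pod leaves the file on the node untouched
kubectl delete pod my-awesome-app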

Storing Data On a Cluster Level

But what if you have Deployments or ReplicaSets running with no taints & tolerations, and no node affinity? This would mean that your Pods could potentially get launched on any available node in your cluster, and a container would no longer be able to read data from a node volume if it is not actually running on that node anymore.

It might thus be better to use an external storage solution like Google Cloud Filestore or AWS Elastic Block Store (EBS). Let's look at setting up EBS. The Pod will reference an existing EBS volume by its ID, so the volume has to be created first; a minimal sketch using the AWS CLI (the availability zone and size are placeholder values):
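
# create a 10GiB gp2 EBS volume; the availability zone must
# match the zone of the node that will mount the volume
aws ec2 create-volume \
  --availability-zone eu-west-1a \
  --size 10 \
  --volume-type gp2

The VolumeId returned in the response goes into the volumeID field of the Pod definition: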

apiVersion: v1
kind: Pod
metadata:
  name: my-awesome-app
  labels:
    app: awesome-app
spec:
  containers:
    - name: awesome-app-container
      image: awesome-app
      command: ["/bin/sh", "-c"]
      args: ["echo 'stuff in file' > /node-drive/file.md"]
      volumeMounts:
        - mountPath: /node-drive
          name: aws-volume # must match the volume name below
  volumes:
    - name: aws-volume
      awsElasticBlockStore:
        volumeID: "vol-f37a03aa"
        fsType: ext4

Data storage and retrieval is now independent of the nodes the Pods run on.

Persistent Volumes

Persistent Volumes (PVs) are volumes that are available to the whole cluster; their storage is not limited to a particular node or a single Pod. They essentially simplify how storage is handled by abstracting it into a Kubernetes resource object.

Let's create a Persistent Volume with a definition file:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: shared-storage
spec:
  capacity:
    storage: 5Gi
  volumeMode: Filesystem # Block or Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Recycle
  awsElasticBlockStore:
    volumeID: "vol-f37a03aa"
    fsType: ext4

You can view it with:

kubectl get pv
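
The output will look roughly like this (the values here are illustrative):

NAME             CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS   REASON   AGE
shared-storage   5Gi        RWX            Recycle          Available                                    10s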

Note that we have 3 possible options for accessModes:

  • ReadWriteOnce – Read-write by a single node (RWO in the CLI)
  • ReadOnlyMany – Read-only by many nodes (ROX in the CLI)
  • ReadWriteMany – Read-write by many nodes (RWX in the CLI)

Persistent Volume Claims

A Persistent Volume Claim (PVC) can be viewed as a binding object, claiming a PV as an available resource for Pods, ReplicaSets and Deployments. During the binding process, Kubernetes tries to find an appropriate PV that matches the requirements set in the PVC. This is conceptually similar to how Pods get provisioned to nodes.

These requirements include:

  • Capacity
  • Access Modes
  • Volume Modes
  • Storage Class

You can also directly bind to a specific PV by referring to its label with a selector.
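
A minimal sketch of what that could look like, assuming the PV was labeled with tier: gold (a hypothetical label):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: gold-storage-claim
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 2Gi
  selector:
    matchLabels:
      tier: gold # only PVs carrying this label will be considered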

A caveat! There is a one-to-one relationship between a PVC and a PV, so if a 5Gi PVC binds to an 8Gi PV, no other PVC can take up the remaining 3Gi. Also, a PVC that finds no matching PV will wait until such a PV becomes available.

Let's create a PVC:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-storage-claim
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 2Gi

View it and its state with:

kubectl get pvc
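
Once bound, the output will look roughly like this (illustrative values):

NAME                   STATUS   VOLUME           CAPACITY   ACCESS MODES   STORAGECLASS   AGE
shared-storage-claim   Bound    shared-storage   5Gi        RWX                           5s

Note that the claim reports the full 5Gi of the PV it bound to, even though only 2Gi was requested. That is the one-to-one caveat from above in action.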

persistentVolumeReclaimPolicy

You'll notice that the definition file for creating a PV has a property called persistentVolumeReclaimPolicy. This property tells Kubernetes what to do with a PV when it is released from its PVC (when the PVC gets deleted, for instance).

We have 3 options:

  • Delete –> the PV gets deleted as well
  • Recycle –> the PV gets cleared/wiped and made available to other PVCs (this policy is deprecated in favor of dynamic provisioning)
  • Retain (default) –> the PV remains as-is, with any data on it intact, and will not be made available to new PVCs
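
The policy of an existing PV can also be changed after the fact, for instance with kubectl patch:

kubectl patch pv shared-storage -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'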

Using PVCs in Pods

Let's bind a PVC to a Pod in its definition file:

apiVersion: v1
kind: Pod
metadata:
  name: my-awesome-app
spec:
  containers:
    - name: my-awesome-app-container
      image: my-awesome-app-image
      volumeMounts:
        - mountPath: "/volume-storage"
          name: volume-storage # must match the volume name below
  volumes:
    - name: volume-storage # must match the volumeMount name above
      persistentVolumeClaim:
        claimName: shared-storage-claim
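
Assuming the PVC and Pod manifests are saved as shared-storage-claim.yaml and pod-with-pvc.yaml (hypothetical filenames), wiring it all together looks roughly like this:

kubectl apply -f shared-storage-claim.yaml
kubectl apply -f pod-with-pvc.yaml

# confirm the claim is bound and the Pod is running
kubectl get pvc shared-storage-claim
kubectl get pod my-awesome-app

# anything written to /volume-storage now ends up on the PV
kubectl exec my-awesome-app -- ls /volume-storage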

Conclusion

We just had a closer look at how to handle storage beyond the limitations of individual nodes, how to simplify and abstract away storage with Persistent Volumes, and how to bind storage to Pods with Persistent Volume Claims.

The next post will deal with another tricky subject - networking.
