Using AWS EBS as a Volume For Data Persistence

In Part I and Part II of this blog post, we saw how to use manual/direct storage and AWS EBS volumes as the storage volumes for Kubernetes. There is a third way of defining your Kubernetes storage: Kubernetes StorageClasses. Let’s take a deep dive into this option and see when and why you’d use StorageClasses for your Kubernetes persistent volumes.

What’s a StorageClass in Kubernetes?

First of all, what is a StorageClass? As the name suggests, Kubernetes StorageClasses are a way of defining the different classes or types of storage on which your Kubernetes persistent volumes will be created. This is useful when you need different volumes to reside on different storage types - perhaps because of internal organizational policies, specific throughput requirements, or security reasons. For example, as we’ll see shortly, the different AWS EBS volume types have different IO speeds, and you may want your most frequently accessed Kubernetes volumes (perhaps those that store a busy Postgres database) to sit on a very fast storage device.

In other words, think of StorageClasses as a way of simultaneously defining and abstracting the details of the underlying storage that backs your Kubernetes PersistentVolumes (PVs). The StorageClass object is described in much more detail in the official Kubernetes documentation.

Storage classes enable dynamic volume provisioning. As explained in the official documentation: “Dynamic volume provisioning allows storage volumes to be created on-demand. Without dynamic provisioning, cluster administrators have to manually make calls to their cloud or storage provider to create new storage volumes, and then create PersistentVolume objects to represent them in Kubernetes. The dynamic provisioning feature eliminates the need for cluster administrators to pre-provision storage. Instead, it automatically provisions storage when it is requested by users.”

How to Define a StorageClass

Below is a sample YAML config file (let’s call it ‘gp2-storage-class.yaml’) that will create an AWS EBS StorageClass of type ‘gp2’. Note especially that, unlike the sample PersistentVolume YAML config file in the previous blog post, you do not specify any details about the EBS volume within the Kubernetes volume itself when using storage classes. The storage details are abstracted away and ‘hidden’ by defining them fully within the StorageClass’s configuration.

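apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: my-aws-gp2-vol1  # object names must be lowercase (RFC 1123 DNS subdomain)
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
  iopsPerGB: "10"  # documented for io1 volumes only; shown here as a sample parameter
  fsType: ext4
reclaimPolicy: Retain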

Next, use this kubectl command to create the actual storage class:

kubectl create -f gp2-storage-class.yaml

And also confirm that it is now listed among your storage classes:

kubectl get storageclass

Now, let’s explain our “gp2-storage-class.yaml” file in more detail. The 5 most important fields are listed below, although there are more. The first 3 fields must be specified when creating a StorageClass, and the other 2 are optional. Still, it is recommended that you specify all 5:

kind: the value for this field is simply ‘StorageClass’.
name: the storage class’s name. This is part of the object’s metadata, and cannot be changed after creation. Use a descriptive name to quickly identify the type of StorageClass.
provisioner: the most important field. It specifies the volume plugin used for provisioning your PVs. In our case, we are using AWS EBS volumes, so our provisioner is kubernetes.io/aws-ebs. Other common provisioners are those for Google Cloud and Azure: kubernetes.io/gce-pd and kubernetes.io/azure-disk respectively. On recent Kubernetes releases, EBS volumes are served by the AWS EBS CSI driver (provisioner ebs.csi.aws.com), so check which one your cluster uses. You can also specify local storage via the kubernetes.io/no-provisioner provisioner type. Note that local storage does not support dynamic volume provisioning.
parameters: you can specify different parameters depending on the provisioner. For example, the type and iopsPerGB parameters are specific to EBS. For EBS, the most important parameter is type, which specifies the type of storage to be used in the underlying AWS volume. The applicable values are gp2, io1, st1 and sc1 (default is gp2). These are explained in more detail here, and the other sub-fields you can define under parameters (e.g. iopsPerGB, which applies only to io1 volumes, and fsType) are also explained in the official documentation. Each parameter has a default value in case it is not specified.
reclaimPolicy: the values for the reclaimPolicy field can be either Delete or Retain. If no reclaimPolicy is specified when a StorageClass object is created, it will default to Delete. This means that a dynamically provisioned volume will be automatically deleted when a user deletes the corresponding PersistentVolumeClaim. Read more about how to define reclaimPolicy here.
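As a sketch of how these fields change with the storage type, the block below defines two more classes: an io1 class, where iopsPerGB actually applies, and a local-storage class that uses kubernetes.io/no-provisioner. The names fast-io1 and local-storage are illustrative assumptions, and volumeBindingMode is an extra field not covered in the list above; it follows the local-volume example in the Kubernetes documentation.

```yaml
# Hypothetical io1 class: iopsPerGB is honoured for this volume type.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-io1            # assumed name, lowercase as Kubernetes requires
provisioner: kubernetes.io/aws-ebs
parameters:
  type: io1
  iopsPerGB: "10"           # must be quoted: parameter values are strings
  fsType: ext4
reclaimPolicy: Delete       # the default, written out for clarity
---
# Hypothetical local-storage class: no dynamic provisioning takes place.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage       # assumed name
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer  # delay binding until a Pod is scheduled
```

Both classes can be saved to a file and created with kubectl create -f, in the same way as the gp2 class.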

Next, you need to specify that your persistent volumes use the new storage class - see the sample config file below. The field to note here is storageClassName, which ties our volume to the previously defined storage class. Its value must match the StorageClass name exactly.

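apiVersion: v1
kind: PersistentVolume
metadata:
  name: pg-pv-volume
spec:
  capacity:
    storage: 5Gi
  # Fragment: a complete PV also needs accessModes and a volume source.
  storageClassName: my-aws-gp2-vol1  # class names must be lowercase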

The corresponding PersistentVolumeClaim (PVC) can be created using the sample config file below. Setting the storageClassName field on a claim is also how you request dynamically provisioned storage: if no existing PV satisfies the claim, the class’s provisioner creates a new one of the requested size. This way, you can ensure that all new PVs will use the storage type you want (in our case that’s AWS EBS).

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pg-claim
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: my-aws-gp2-vol1  # class names must be lowercase
  resources:
    requests:
      storage: 30Gi
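To show where the claim ends up being used, here is a minimal sketch of a Pod that mounts pg-claim. Everything except the claim name is an assumption made for illustration: the Pod name, the postgres:16 image tag, the placeholder password, and the mount path. PGDATA points at a subdirectory because a freshly formatted ext4 volume contains a lost+found directory at its root, which Postgres will not accept as an empty data directory.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pg-pod                      # hypothetical Pod name
spec:
  containers:
    - name: postgres
      image: postgres:16            # assumed image tag
      env:
        - name: POSTGRES_PASSWORD
          value: change-me          # placeholder only; use a Secret in practice
        - name: PGDATA
          value: /var/lib/postgresql/data/pgdata
      volumeMounts:
        - name: pg-data
          mountPath: /var/lib/postgresql/data
  volumes:
    - name: pg-data
      persistentVolumeClaim:
        claimName: pg-claim         # the PVC defined above
```

Once the Pod is scheduled, kubectl get pvc pg-claim should report the claim as Bound, and kubectl get pv lists the volume that backs it.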

Conclusion

We hope you found this tutorial useful. We covered storage classes in Kubernetes: how to create a storage class, when and how storage classes should be used, and their main benefits (dynamic volume provisioning and abstraction of the underlying storage) compared to manually defining storage parameters. Finally, we saw how to specify that persistent volumes and persistent volume claims use a storage class.