Background
By nature, Pods are ephemeral. This means that GKE destroys the state and value stored in a Pod when it is deleted, evicted, or rescheduled.
As an application operator, you may want to maintain stateful workloads. Examples of such workloads include applications that process WordPress articles, messaging apps, and apps that process machine learning operations.
By using Filestore on GKE, you can perform the following operations:
- Deploy stateful workloads that are scalable.
- Enable multiple Pods to use ReadWriteMany as their accessMode, so that multiple Pods can read from and write to the same storage at the same time.
- Set up GKE to mount volumes into multiple Pods simultaneously.
- Persist storage when Pods are removed.
- Enable Pods to share data and easily scale.
Objectives
This tutorial is for application operators and other users who want to set up a scalable stateful workload on GKE using a PersistentVolumeClaim (PVC) and NFS.
This tutorial covers the following steps:
- Create a GKE cluster.
- Configure the managed file storage with Filestore using CSI.
- Create a reader and a writer Pod.
- Expose the reader Pod through a Service load balancer and access it.
- Scale up the writer.
- Access data from the writer Pod.
Costs
This tutorial uses billable components of Google Cloud. Use the Pricing Calculator to generate a cost estimate based on your projected usage.
When you finish this tutorial, you can avoid continued billing by deleting the resources you created. For more information, see Clean up.
Before you begin
Set up your project
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
- In the Google Cloud console, on the project selector page, click Create project to begin creating a new Google Cloud project.
  Roles required to create a project: you need the Project Creator role (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.
- Verify that billing is enabled for your Google Cloud project.
- Enable the Compute Engine, GKE, and Filestore APIs.
  Roles required to enable APIs: you need the Service Usage Admin IAM role (roles/serviceusage.serviceUsageAdmin), which contains the serviceusage.services.enable permission. Learn how to grant roles.
Set defaults for the Google Cloud CLI
- In the Google Cloud console, start a Cloud Shell instance.
- Download the source code for this sample app:
    git clone https://github.com/GoogleCloudPlatform/kubernetes-engine-samples
    cd kubernetes-engine-samples/databases/stateful-workload-filestore
- Set the default environment variables:
    gcloud config set project PROJECT_ID
    gcloud config set compute/region COMPUTE_REGION
    gcloud config set compute/zone COMPUTE_ZONE
    gcloud config set filestore/zone COMPUTE_ZONE
    gcloud config set filestore/region COMPUTE_REGION
  Replace the following values:
  - PROJECT_ID: your Google Cloud project ID.
  - COMPUTE_REGION: the Compute Engine region.
  - COMPUTE_ZONE: the Compute Engine zone.
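As a rough illustration of how the region and zone values relate: a Compute Engine zone name is its region name plus a one-letter suffix, so the region can be derived from the zone in the shell. The sketch below writes the `gcloud config set` commands from this step to a file for review instead of running them. The file name `gcloud-defaults-sketch.sh`, the project ID `my-project`, and the zone value are placeholders chosen for illustration, not values from this tutorial.

```shell
# Hypothetical example values; substitute your own project ID and zone.
PROJECT_ID="my-project"
COMPUTE_ZONE="us-central1-a"

# A zone name is "<region>-<letter>", so stripping the last "-suffix"
# leaves the region name.
COMPUTE_REGION="${COMPUTE_ZONE%-*}"

# Write the commands to a file so they can be reviewed before being run.
cat > gcloud-defaults-sketch.sh <<EOF
gcloud config set project ${PROJECT_ID}
gcloud config set compute/region ${COMPUTE_REGION}
gcloud config set compute/zone ${COMPUTE_ZONE}
gcloud config set filestore/zone ${COMPUTE_ZONE}
gcloud config set filestore/region ${COMPUTE_REGION}
EOF

# Print the generated commands for inspection.
cat gcloud-defaults-sketch.sh
```

After checking the generated file, it could be run with `sh gcloud-defaults-sketch.sh` in Cloud Shell.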
 
Create a GKE cluster
- Create a GKE cluster:
    gcloud container clusters create-auto CLUSTER_NAME --location CONTROL_PLANE_LOCATION
  Replace the following values:
  - CLUSTER_NAME: your cluster name.
  - CONTROL_PLANE_LOCATION: the Compute Engine location of the control plane of your cluster. Provide a region for regional clusters, or a zone for zonal clusters.
  After the cluster is created, the outcome is similar to the following:
    gcloud container clusters describe CLUSTER_NAME
    NAME: CLUSTER_NAME
    LOCATION: northamerica-northeast2
    MASTER_VERSION: 1.21.11-gke.1100
    MASTER_IP: 34.130.255.70
    MACHINE_TYPE: e2-medium
    NODE_VERSION: 1.21.11-gke.1100
    NUM_NODES: 3
    STATUS: RUNNING
  The STATUS value is RUNNING.
Configure the managed file storage with Filestore using CSI
GKE provides a way to automatically deploy and manage the Kubernetes Filestore CSI driver in your clusters.
Using Filestore CSI allows you to dynamically create or delete Filestore instances and use them in Kubernetes workloads with a StorageClass or a Deployment.
You can create a new Filestore instance by creating a PVC that dynamically provisions a Filestore instance and the PV, or access pre-provisioned Filestore instances in Kubernetes workloads.
New instance
Create the Storage Class
- volumeBindingMode is set to Immediate, which allows the provisioning of the volume to begin immediately.
- tier is set to standard for faster Filestore instance creation time. If you need higher available NFS storage, snapshots for data backup, data replication over multiple zones, and other enterprise-level features, set tier to enterprise instead.
  Note: The reclaim policy for a dynamically created PV defaults to Delete if the reclaimPolicy in the StorageClass is not set.
- Create the StorageClass resource:
    kubectl create -f filestore-storageclass.yaml
- Verify that the Storage Class is created:
    kubectl get sc
  The output is similar to the following:
    NAME           PROVISIONER                    RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
    filestore-sc   filestore.csi.storage.gke.io   Delete          Immediate           true                   94m
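The contents of filestore-storageclass.yaml are not reproduced in this section. A minimal sketch of a manifest that is consistent with the two settings described above and with the `kubectl get sc` output might look like the following. The `network: default` parameter and the file name `filestore-storageclass-sketch.yaml` are assumptions for illustration; the file in the sample repository may differ.

```shell
# Write an illustrative StorageClass manifest to a separate file so the
# repository's own filestore-storageclass.yaml is left untouched.
cat <<'EOF' > filestore-storageclass-sketch.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: filestore-sc
provisioner: filestore.csi.storage.gke.io
# Provisioning starts as soon as a PVC that uses this class is created.
volumeBindingMode: Immediate
allowVolumeExpansion: true
parameters:
  # "standard" for faster creation; "enterprise" for enterprise-level features.
  tier: standard
  # Assumption: the cluster uses the VPC network named "default".
  network: default
EOF

# Show the result; it could then be applied with:
#   kubectl create -f filestore-storageclass-sketch.yaml
cat filestore-storageclass-sketch.yaml
```

Because no reclaimPolicy is set in this sketch, dynamically created PVs would use the default of Delete, as the note above describes.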
Pre-provisioned instance
Create the Storage Class
When volumeBindingMode is set to Immediate, it allows the provisioning of the volume to begin immediately.
- Create the StorageClass resource:
    kubectl create -f preprov-storageclass.yaml
- Verify that the Storage Class is created:
    kubectl get sc
  The output is similar to the following:
    NAME           PROVISIONER                    RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
    filestore-sc   filestore.csi.storage.gke.io   Delete          Immediate           true                   94m
Create a Persistent Volume for the Filestore instance
- Verify that the pre-existing Filestore instance is ready:
    gcloud filestore instances list
  The output is similar to the following, where the STATE value is READY:
    INSTANCE_NAME: stateful-filestore
    LOCATION: us-central1-a
    TIER: ENTERPRISE
    CAPACITY_GB: 1024
    FILE_SHARE_NAME: statefulpath
    IP_ADDRESS: 10.109.38.98
    STATE: READY
    CREATE_TIME: 2022-04-05T18:58:28
  Note the INSTANCE_NAME, LOCATION, FILE_SHARE_NAME, and IP_ADDRESS of the Filestore instance.
- Populate shell variables with the Filestore instance values:
    INSTANCE_NAME=INSTANCE_NAME
    LOCATION=LOCATION
    FILE_SHARE_NAME=FILE_SHARE_NAME
    IP_ADDRESS=IP_ADDRESS
- Replace the placeholders in the file preprov-pv.yaml with the values of these variables:
    sed "s/<INSTANCE_NAME>/$INSTANCE_NAME/" preprov-pv.yaml > changed.yaml && mv changed.yaml preprov-pv.yaml
    sed "s/<LOCATION>/$LOCATION/" preprov-pv.yaml > changed.yaml && mv changed.yaml preprov-pv.yaml
    sed "s/<FILE_SHARE_NAME>/$FILE_SHARE_NAME/" preprov-pv.yaml > changed.yaml && mv changed.yaml preprov-pv.yaml
    sed "s/<IP_ADDRESS>/$IP_ADDRESS/" preprov-pv.yaml > changed.yaml && mv changed.yaml preprov-pv.yaml
- Create the PV:
    kubectl apply -f preprov-pv.yaml
- Verify that the PV's STATUS is set to Bound:
    kubectl get pv
  The output is similar to the following:
    NAME         CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                STORAGECLASS   REASON   AGE
    fileserver   1Ti        RWX            Delete           Bound    default/fileserver   filestore-sc            46m
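The placeholder substitution above can be sketched end to end on a stand-in file. The template below is an assumption about what a pre-provisioned Filestore PV manifest with these four placeholders could look like (the actual preprov-pv.yaml in the sample repository may differ), and the values are taken from the example `gcloud filestore instances list` output. It performs all four replacements in a single `sed` call and then checks that no placeholder is left behind.

```shell
# Illustrative template; file names ending in -sketch/-rendered are hypothetical.
cat <<'EOF' > preprov-pv-sketch.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: fileserver
  annotations:
    pv.kubernetes.io/provisioned-by: filestore.csi.storage.gke.io
spec:
  storageClassName: filestore-sc
  capacity:
    storage: 1Ti
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Delete
  volumeMode: Filesystem
  csi:
    driver: filestore.csi.storage.gke.io
    volumeHandle: "modeInstance/<LOCATION>/<INSTANCE_NAME>/<FILE_SHARE_NAME>"
    volumeAttributes:
      ip: <IP_ADDRESS>
      volume: <FILE_SHARE_NAME>
EOF

# Example values copied from the sample output earlier in this section.
INSTANCE_NAME="stateful-filestore"
LOCATION="us-central1-a"
FILE_SHARE_NAME="statefulpath"
IP_ADDRESS="10.109.38.98"

# Apply all substitutions in one pass and write to a new file.
sed -e "s/<INSTANCE_NAME>/$INSTANCE_NAME/g" \
    -e "s/<LOCATION>/$LOCATION/g" \
    -e "s/<FILE_SHARE_NAME>/$FILE_SHARE_NAME/g" \
    -e "s/<IP_ADDRESS>/$IP_ADDRESS/g" \
    preprov-pv-sketch.yaml > preprov-pv-rendered.yaml

# Stop if any <PLACEHOLDER> survived the substitution.
if grep -q '<[A-Z_]*>' preprov-pv-rendered.yaml; then
  echo "unreplaced placeholder in preprov-pv-rendered.yaml" >&2
  exit 1
fi

cat preprov-pv-rendered.yaml
```

Writing to a separate output file keeps the template reusable; the tutorial's own commands instead overwrite preprov-pv.yaml in place.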
Use a PersistentVolumeClaim to access the volume
The pvc.yaml manifest references the Filestore CSI driver's StorageClass named filestore-sc.
To let multiple Pods read from and write to the volume, the accessMode is set to ReadWriteMany.
- Deploy the PVC:
    kubectl create -f pvc.yaml
- Verify that the PVC is created:
    kubectl get pvc
  The output is similar to the following:
    NAME         STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
    fileserver   Bound    pvc-aadc7546-78dd-4f12-a909-7f02aaedf0c3   1Ti        RWX            filestore-sc   92m
- Verify that the newly created Filestore instance is ready:
    gcloud filestore instances list
  The output is similar to the following:
    INSTANCE_NAME: pvc-5bc55493-9e58-4ca5-8cd2-0739e0a7b68c
    LOCATION: northamerica-northeast2-a
    TIER: STANDARD
    CAPACITY_GB: 1024
    FILE_SHARE_NAME: vol1
    IP_ADDRESS: 10.29.174.90
    STATE: READY
    CREATE_TIME: 2022-06-24T18:29:19
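For orientation, a PVC that matches the description above and the `kubectl get pvc` output (name fileserver, 1Ti, RWX, filestore-sc) could be sketched as follows. This is an approximation inferred from that output, written to a hypothetical file name `pvc-sketch.yaml`; the pvc.yaml in the sample repository may differ.

```shell
# Write an illustrative PVC manifest to a separate file.
cat <<'EOF' > pvc-sketch.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: fileserver
spec:
  # ReadWriteMany lets many Pods mount the volume read-write at once.
  accessModes:
    - ReadWriteMany
  storageClassName: filestore-sc
  resources:
    requests:
      storage: 1Ti
EOF

# Show the result; it could then be applied with:
#   kubectl create -f pvc-sketch.yaml
cat pvc-sketch.yaml
```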
Create a reader and a writer Pod
In this section, you create a reader Pod and a writer Pod. This tutorial uses Kubernetes Deployments to create the Pods. A Deployment is a Kubernetes API object that lets you run multiple replicas of Pods that are distributed among the nodes in a cluster.
Create the reader Pod
The reader Pod reads the file that the writer Pods write to. The reader Pod shows when each entry was written and which writer Pod replica wrote it.
The reader Pod reads from the path /usr/share/nginx/html, which is shared between all the Pods.
- Deploy the reader Pod:
    kubectl apply -f reader-fs.yaml
- Verify that the reader replicas are running by querying the list of Pods:
    kubectl get pods
  The output is similar to the following:
    NAME                      READY   STATUS    RESTARTS   AGE
    reader-66b8fff8fd-jb9p4   1/1     Running   0          3m30s
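To make the volume sharing concrete, here is a rough sketch of how a reader Deployment could mount the fileserver PVC at /usr/share/nginx/html. The container image, the `app: reader` label, the volume name, and the file name `reader-sketch.yaml` are all assumptions for illustration; reader-fs.yaml in the sample repository is the authoritative manifest.

```shell
# Write an illustrative reader Deployment manifest to a separate file.
cat <<'EOF' > reader-sketch.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: reader
spec:
  replicas: 1
  selector:
    matchLabels:
      app: reader
  template:
    metadata:
      labels:
        app: reader
    spec:
      containers:
        - name: nginx
          # Assumption: a stock nginx image serves the shared directory.
          image: nginx:stable-alpine
          ports:
            - containerPort: 80
          volumeMounts:
            # The shared Filestore volume appears at nginx's web root.
            - name: fileserver
              mountPath: /usr/share/nginx/html
      volumes:
        - name: fileserver
          persistentVolumeClaim:
            claimName: fileserver
EOF

cat reader-sketch.yaml
```

Every Pod that references `claimName: fileserver` sees the same files, which is what lets the writers and the reader exchange data.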
Create the writer Pod
The writer Pod will periodically write to a shared file that other writer and reader Pods can access. The writer Pod records its presence by writing its host name to the shared file.
The image used for the writer Pod is a custom image of Alpine Linux, which is used for utilities and production applications. It includes a script, indexInfo.html, that obtains the metadata of the most recent writer and keeps count of all the unique writers and total writes.
For this tutorial, the writer Pod writes every 30 seconds to the path /html/index.html. To change the write frequency, modify the sleep value.
- Deploy the writer Pod:
    kubectl apply -f writer-fs.yaml
- Verify that the writer Pods are running by querying the list of Pods:
    kubectl get pods
  The output is similar to the following:
    NAME                      READY   STATUS    RESTARTS   AGE
    reader-66b8fff8fd-jb9p4   1/1     Running   0          3m30s
    writer-855565fbc6-8gh2k   1/1     Running   0          2m31s
    writer-855565fbc6-lls4r   1/1     Running   0          2m31s
Expose and access the reader workload to a Service Load Balancer
To expose a workload outside the cluster, create a Service of type LoadBalancer. This type of Service creates an external load balancer with an IP address reachable through the internet.
- Create a Service of type LoadBalancer named reader-lb:
    kubectl create -f loadbalancer.yaml
- Watch the Service to see that GKE assigns an EXTERNAL-IP to the reader-lb Service:
    kubectl get svc --watch
  When the Service is ready, the EXTERNAL-IP column displays the public IP address of the load balancer:
    NAME         TYPE           CLUSTER-IP    EXTERNAL-IP     PORT(S)        AGE
    kubernetes   ClusterIP      10.8.128.1    <none>          443/TCP        2d21h
    reader-lb    LoadBalancer   10.8.131.79   34.71.232.122   80:32672/TCP   2d20h
- Press Ctrl+C to terminate the watch process.
- Use a web browser to navigate to the EXTERNAL-IP assigned to the load balancer. The page refreshes every 30 seconds. The more writer Pods there are and the shorter the write interval, the more entries the page shows.
To see more details about the load balancer service, see loadbalancer.yaml.
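As a rough guide to what such a Service contains, the sketch below writes a LoadBalancer Service named reader-lb that forwards port 80, matching the `80:32672/TCP` entry in the sample output. The `app: reader` selector and the file name `loadbalancer-sketch.yaml` are assumptions; loadbalancer.yaml in the sample repository is the authoritative manifest.

```shell
# Write an illustrative Service manifest to a separate file.
cat <<'EOF' > loadbalancer-sketch.yaml
apiVersion: v1
kind: Service
metadata:
  name: reader-lb
spec:
  # LoadBalancer asks GKE to provision an external load balancer.
  type: LoadBalancer
  # Assumption: the reader Pods carry the label app=reader.
  selector:
    app: reader
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
EOF

cat loadbalancer-sketch.yaml
```

The selector must match the labels on the reader Pods; otherwise the load balancer gets an IP address but has no backends to route to.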
Scale up the writer
Because the PV accessMode was set to ReadWriteMany, GKE can scale up the number of Pods so that more writer Pods can write to this shared volume (or more reader Pods can read from it).
- Scale up the writer to five replicas:
    kubectl scale deployment writer --replicas=5
  The output is similar to the following:
    deployment.extensions/writer scaled
- Verify the number of running replicas:
    kubectl get pods
  The output is similar to the following:
    NAME                      READY   STATUS    RESTARTS   AGE
    reader-66b8fff8fd-jb9p4   1/1     Running   0          11m
    writer-855565fbc6-8dfkj   1/1     Running   0          4m
    writer-855565fbc6-8gh2k   1/1     Running   0          10m
    writer-855565fbc6-gv5rs   1/1     Running   0          4m
    writer-855565fbc6-lls4r   1/1     Running   0          10m
    writer-855565fbc6-tqwxc   1/1     Running   0          4m
- Use a web browser to navigate again to the EXTERNAL-IP assigned to the load balancer.
At this point, you have configured and scaled your cluster to support five stateful writer Pods, all of which write to the same file simultaneously. The reader Pods can be scaled up in the same way.
Optional: Access data from the writer Pod
This section demonstrates how to use a command-line interface to access a reader or writer Pod. You can see the shared component that the writer is writing to and the reader is reading from.
- Obtain the writer Pod name:
    kubectl get pods
  The output is similar to the following:
    NAME                      READY   STATUS    RESTARTS   AGE
    writer-5465d65b46-7hxv4   1/1     Running   0          20d
  Note the hostname of a writer Pod (for example, writer-5465d65b46-7hxv4).
- Run the following command to access the writer Pod:
    kubectl exec -it WRITER_HOSTNAME -- /bin/sh
- See the shared component in the file indexData.html:
    cd /html
    cat indexData.html
- Clear the indexData.html file:
    echo '' > indexData.html
  Refresh the web browser hosting the EXTERNAL-IP address to see the change.
- Exit the environment:
    exit
Clean up
To avoid incurring charges to your Google Cloud account for the resources used in this tutorial, either delete the project that contains the resources, or keep the project and delete the individual resources.
Delete the project
- In the Google Cloud console, go to the Manage resources page.
- In the project list, select the project that you want to delete, and then click Delete.
- In the dialog, type the project ID, and then click Shut down to delete the project.
Delete the individual resources
- Delete the load balancer Service:
    kubectl delete service reader-lb
  Wait until the load balancer provisioned for the reader Service is deleted.
- Verify that the list returns Listed 0 items:
    gcloud compute forwarding-rules list
- Delete the Deployments:
    kubectl delete deployment writer
    kubectl delete deployment reader
- Verify that the Pods are deleted. The following command returns No resources found in default namespace:
    kubectl get pods
- Delete the PVC. This also deletes the PV and the Filestore instance, because the reclaim policy is set to Delete:
    kubectl delete pvc fileserver
- Delete the GKE cluster:
    gcloud container clusters delete CLUSTER_NAME --location=CONTROL_PLANE_LOCATION
  This deletes the resources that make up the GKE cluster, including the reader and writer Pods.
What's next
- Learn how to deploy Cloud SQL with GKE
- Access Modes for PV and PVC
- Learn more about GKE and Filestore
- Learn more about Filestore CSI Driver
- How to create a Filestore instance
- See how to access Filestore instances from GKE clusters
- Explore other Kubernetes Engine tutorials.
- Learn more about exposing applications using Services in GKE.