Google Cloud Filestore: Managed NFS File Storage for GCP Workloads
Some applications need a file system, not an object store. Content management systems that read and write files through POSIX paths, media rendering jobs that write frames to a shared directory, machine learning training jobs that read large datasets from a mounted drive — these workloads do not fit the object storage model where you upload and download discrete objects via HTTP.
Cloud Filestore provides a managed NFS (Network File System) service. You create a file share, mount it on Compute Engine VMs or Kubernetes pods, and your application reads and writes it through ordinary file paths, much as it would a local file system (with the usual NFS caveats around caching and locking). Google manages the underlying infrastructure, replication, and patching.
Service Tiers
Filestore offers four tiers with different performance and availability characteristics:
┌────────────────────┬────────────────────┬───────────────────────────────────┐
│ Tier               │ Performance        │ Availability & use case           │
├────────────────────┼────────────────────┼───────────────────────────────────┤
│ Basic HDD          │ 100 MB/s, 300 IOPS │ Dev/test, low-throughput batch    │
│                    │ per TB             │ jobs. Minimum 1 TB.               │
├────────────────────┼────────────────────┼───────────────────────────────────┤
│ Basic SSD          │ 240 MB/s,          │ General web serving, CMS,         │
│                    │ 15,000 IOPS        │ dev databases. Min 2.5 TB.        │
├────────────────────┼────────────────────┼───────────────────────────────────┤
│ Zonal (SSD)        │ Scales linearly    │ High-performance workloads.       │
│                    │ up to 10 GB/s      │ Single zone. EDA, rendering,      │
│                    │                    │ ML training. Min 1 TB.            │
├────────────────────┼────────────────────┼───────────────────────────────────┤
│ Enterprise         │ Scales linearly    │ Mission-critical, requires HA     │
│                    │ up to 10 GB/s      │ across zones. Supports snapshots  │
│                    │                    │ and backups. Min 1 TB.            │
└────────────────────┴────────────────────┴───────────────────────────────────┘
Basic tiers provision a fixed capacity-performance bundle. To get more throughput, you increase capacity.
Zonal and Enterprise tiers scale throughput independently from capacity — you can provision 10 TB of storage at 1 GB/s, or 10 TB at 5 GB/s.
Of these four tiers, Enterprise is the only one that replicates data across zones within a region.
Creating a Filestore Instance
    # Create a Basic SSD instance for a web application
    gcloud filestore instances create web-filestore \
      --tier=BASIC_SSD \
      --file-share=name=webdata,capacity=5TB \
      --zone=us-central1-a \
      --network=name=default

    # Create an Enterprise instance for ML training
    gcloud filestore instances create ml-filestore \
      --tier=ENTERPRISE \
      --file-share=name=training-data,capacity=20TB \
      --region=us-central1 \
      --network=name=default

    # Get the instance's NFS endpoint IP
    gcloud filestore instances describe ml-filestore \
      --region=us-central1 \
      --format="value(networks[0].ipAddresses[0])"
Mounting on Compute Engine VMs
    # Install NFS client utilities
    sudo apt-get install -y nfs-common

    # Create mount point
    sudo mkdir -p /mnt/filestore/training-data

    # Mount the NFS share
    # Replace 10.0.0.5 with the actual Filestore IP
    sudo mount -t nfs \
      -o vers=3,rw,hard,intr,timeo=600,retrans=5 \
      10.0.0.5:/training-data \
      /mnt/filestore/training-data

    # Verify
    df -h /mnt/filestore/training-data
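The df check confirms the mount by eye, but a startup script usually wants a yes/no answer. The following is a minimal sketch, assuming a Linux client where /proc/mounts is available. The function name is_nfs_mounted and its optional second argument (an alternate mounts table, useful for testing) are hypothetical and not part of any Filestore tooling.

```shell
# Sketch: succeed only if MOUNT_POINT is currently listed as an NFS mount.
# is_nfs_mounted and the optional mounts-file argument are illustrative names.
is_nfs_mounted() {
  local mount_point="$1"
  local mounts_file="${2:-/proc/mounts}"
  # /proc/mounts fields: device, mount point, fs type, options, dump, pass
  awk -v mp="$mount_point" '
    $2 == mp && ($3 == "nfs" || $3 == "nfs4") { found = 1 }
    END { exit found ? 0 : 1 }
  ' "$mounts_file"
}

# Example use (not executed here):
#   is_nfs_mounted /mnt/filestore/training-data || echo "share is not mounted" >&2
```

Checking the file system type as well as the path guards against the case where the mount failed and the directory is just an empty local folder.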
    # Add to /etc/fstab for persistence across reboots
    echo "10.0.0.5:/training-data /mnt/filestore/training-data nfs vers=3,rw,hard,intr,timeo=600,retrans=5 0 0" | sudo tee -a /etc/fstab
The NFS options matter:
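vers=3: Filestore supports NFSv3 (and NFSv4.1 for Zonal/Enterprise tiers)
hard: Retry indefinitely on network interruption (preferred for data integrity)
intr: Historically allowed signals to interrupt a stalled request; Linux kernels after 2.6.25 accept the option but ignore it
timeo=600,retrans=5: Wait 60 seconds for a reply (timeo is in tenths of a second) and retry 5 times before the client attempts further recovery, which rides out transient network issues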
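Appending to /etc/fstab with tee -a adds a duplicate line every time the command is re-run, for example from a provisioning script. The following is a minimal sketch of an idempotent variant, under the assumption that one entry per mount point is what you want. The function name add_filestore_fstab and the optional fourth argument (the fstab path) are hypothetical, introduced only for illustration.

```shell
# Sketch: append a Filestore entry to an fstab file only if the mount point
# is not already listed. Names and argument order are illustrative.
add_filestore_fstab() {
  local server_ip="$1"
  local share="$2"
  local mount_point="$3"
  local fstab="${4:-/etc/fstab}"
  # Same option string as the mount command in this article.
  local opts="vers=3,rw,hard,intr,timeo=600,retrans=5"
  local entry="${server_ip}:/${share} ${mount_point} nfs ${opts} 0 0"

  # Field 2 of a non-comment fstab line is the mount point.
  if awk -v mp="$mount_point" \
       '$1 !~ /^#/ && $2 == mp { found = 1 } END { exit found ? 0 : 1 }' \
       "$fstab"; then
    echo "already present: $mount_point"
  else
    echo "$entry" >> "$fstab"
    echo "added: $entry"
  fi
}

# Example use (needs root to write /etc/fstab; not executed here):
#   add_filestore_fstab 10.0.0.5 training-data /mnt/filestore/training-data
```

Matching on the mount point rather than the whole line means a changed IP or option string is not silently added as a second entry for the same directory.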
Mounting in GKE with ReadWriteMany
Kubernetes persistent volumes backed by Filestore allow multiple pods to read and write the same share simultaneously — a capability that standard Persistent Disks cannot provide.
    # PersistentVolume backed by Filestore
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: filestore-pv
    spec:
      capacity:
        storage: 5Ti
      accessModes:
        - ReadWriteMany  # Multiple pods can mount this
      persistentVolumeReclaimPolicy: Retain
      nfs:
        path: /webdata
        server: 10.0.0.5  # Filestore IP
    ---
    # PersistentVolumeClaim
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: filestore-pvc
    spec:
      accessModes:
        - ReadWriteMany
      resources:
        requests:
          storage: 1Ti
      storageClassName: ""  # Empty string disables dynamic provisioning
      volumeName: filestore-pv
    ---
    # Deployment using the PVC
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: media-processor
    spec:
      replicas: 5
      # apps/v1 requires a selector that matches the pod template labels;
      # the label value here is illustrative.
      selector:
        matchLabels:
          app: media-processor
      template:
        metadata:
          labels:
            app: media-processor
        spec:
          containers:
            - name: processor
              image: gcr.io/my-project/media-processor
              volumeMounts:
                - name: shared-storage
                  mountPath: /data/media
          volumes:
            - name: shared-storage
              persistentVolumeClaim:
                claimName: filestore-pvc
All five pods mount the same Filestore share at /data/media. A file written by one pod becomes readable by the other four without any application-level copying or synchronization; under NFS close-to-open consistency, readers reliably see the data once the writer has closed the file.
Snapshots and Backups
Enterprise tier supports point-in-time snapshots of individual file shares.
    # Create a manual snapshot
    gcloud filestore snapshots create data-snapshot-$(date +%Y%m%d) \
      --file-share=training-data \
      --instance=ml-filestore \
      --instance-region=us-central1

    # Create a backup (stored in GCS, cross-region capable)
    gcloud filestore backups create weekly-backup-$(date +%Y%m%d) \
      --file-share=training-data \
      --instance=ml-filestore \
      --instance-region=us-central1 \
      --region=us-east1  # Store backup in a different region
    # Restore from backup (creates a new instance)
    # The backup is referenced inside --file-share; --network is required on create.
    gcloud filestore instances create ml-filestore-restored \
      --tier=ENTERPRISE \
      --file-share=name=training-data,capacity=20TB,source-backup=weekly-backup-20250315,source-backup-region=us-east1 \
      --region=us-central1 \
      --network=name=default
Snapshots are cheap (you pay for the delta, not a full copy) and nearly instantaneous. Backups are stored in GCS and persist even if the Filestore instance is deleted.
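Because the backup names carry a date stamp, a restore script has to decide which one to use. The following is a minimal sketch, assuming names follow the prefix-YYYYMMDD pattern used above. The helper newest_backup is hypothetical: it reads bare backup names on standard input, one per line, however you obtain them (for instance from your own records or from a backup listing reduced to names).

```shell
# Sketch: print the most recent backup name for a given prefix.
# Names that do not match <prefix>-<8 digits> are ignored.
newest_backup() {
  local prefix="$1"
  # YYYYMMDD sorts correctly as plain text, so a lexical sort is enough.
  grep -E "^${prefix}-[0-9]{8}\$" | sort | tail -n 1
}

# Example use (not executed here):
#   latest=$(printf '%s\n' weekly-backup-20250308 weekly-backup-20250315 \
#              | newest_backup weekly-backup)
#   echo "$latest"
```

The function prints nothing when no name matches, so the caller should check for an empty result before starting a restore.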
Real-World Use Case: ML Training Pipeline
A machine learning team runs distributed training jobs on a cluster of GPU VMs. Each job reads a 2 TB training dataset and writes checkpoints to a shared directory.
Data Pipeline:
    GCS (raw data lake)
        │
        │ preprocessing job (Dataflow)
        ▼
    Filestore Enterprise 20 TB (us-central1)
        │
        │ NFS mount on all GPU VMs
        ├──► Training VM 1 (A2 with 8x NVIDIA A100)
        ├──► Training VM 2 (A2 with 8x NVIDIA A100)
        ├──► Training VM 3 (A2 with 8x NVIDIA A100)
        └──► Training VM 4 (A2 with 8x NVIDIA A100)
                │
                │ writes model checkpoints every 10 minutes
                ▼
        /mnt/filestore/checkpoints/experiment-42/
With 5 GB/s of aggregate throughput, four VMs reading concurrently get roughly 1.25 GB/s each, so one full pass over the 2 TB dataset takes about 27 minutes per VM (2,000 GB / 1.25 GB/s = 1,600 s). As long as the training job consumes data no faster than that, I/O does not become the bottleneck. If a VM fails, another can pick up from the last checkpoint because the checkpoints are on the shared Filestore volume rather than local disk.
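Resuming from the last checkpoint only works if a replacement VM never picks up a half-written file. The following is a minimal sketch of one common approach, write to a temporary name and then rename, under stated assumptions: the file naming scheme (step-<zero-padded number>.ckpt) and both function names are hypothetical, and the training framework's own checkpoint writer is represented by a plain file copy.

```shell
# Sketch: publish checkpoints atomically and find the newest one.
# Zero-padded step numbers make lexical order equal to step order.
write_checkpoint() {
  # $1 = checkpoint dir, $2 = step number (plain decimal), $3 = file to publish
  local dir="$1"
  local name
  name=$(printf 'step-%08d.ckpt' "$2")
  # Hidden temp name, unique per host and process.
  local tmp="$dir/.$name.$(uname -n).$$.tmp"
  mkdir -p "$dir"
  # Copy fully, then rename. The final name appears only after the data is
  # complete, so readers matching step-*.ckpt never see a partial file.
  cp "$3" "$tmp" && mv "$tmp" "$dir/$name"
}

latest_checkpoint() {
  # Prints the path of the highest-numbered checkpoint; returns 1 if none.
  local dir="$1"
  local latest=""
  local f
  for f in "$dir"/step-*.ckpt; do
    if [ -e "$f" ]; then latest="$f"; fi
  done
  if [ -z "$latest" ]; then return 1; fi
  printf '%s\n' "$latest"
}

# Example use (not executed here):
#   write_checkpoint /mnt/filestore/checkpoints/experiment-42 1200 /tmp/model.bin
#   latest_checkpoint /mnt/filestore/checkpoints/experiment-42
```

The temporary file lives in the same directory as its final name so that the rename stays within one file share and does not turn into a copy.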
Filestore vs Other GCP Storage Options
Decision guide:
    Need to share files across multiple VMs/pods?  → Filestore
    POSIX file system semantics required?          → Filestore
    Simple object storage, HTTP access?            → Cloud Storage
    Block storage for a single VM?                 → Persistent Disk
    Sub-millisecond latency, key-value access?     → Memorystore
Filestore fills the NFS gap in GCP’s storage portfolio. Most cloud workloads are better served by Cloud Storage or Persistent Disks, but the workloads that specifically need shared POSIX-compatible file access (rendering farms, CMS platforms, legacy application lift-and-shift, distributed ML training) require Filestore.
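The decision guide can also be written down as a small lookup, which is handy in provisioning scripts that branch on a workload's storage need. This is a minimal sketch only: the requirement keywords (shared-files, posix, and so on) are hypothetical labels invented for this example, and real choices usually weigh cost and performance as well.

```shell
# Sketch: map a storage requirement keyword to the service named in the
# decision guide. Keywords are illustrative, not an official taxonomy.
pick_storage() {
  case "$1" in
    shared-files|posix) echo "Filestore" ;;
    object|http)        echo "Cloud Storage" ;;
    block-single-vm)    echo "Persistent Disk" ;;
    low-latency-kv)     echo "Memorystore" ;;
    *)
      echo "unknown requirement: $1" >&2
      return 1
      ;;
  esac
}

# Example use (not executed here):
#   pick_storage posix
```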
Summary
Cloud Filestore is the right storage choice when your application needs a shared file system with POSIX semantics. Basic tiers (HDD, SSD) serve lower-performance workloads at fixed capacity-performance bundles. Zonal and Enterprise tiers scale throughput independently and support higher-performance use cases. Of the tiers covered here, Enterprise is the only one with built-in cross-zone HA. GKE integration via ReadWriteMany PersistentVolumes allows multiple pods to share a single Filestore instance. Snapshots provide fast recovery from accidental deletion; backups in GCS provide cross-region disaster recovery. The typical use cases are rendering farms, ML training data sharing, web application uploads, and any legacy application expecting a shared drive.