This is the documentation for the latest development version of Velero. Both code and docs may be unstable, and these docs are not guaranteed to be up to date or correct. See the latest version.
Velero provides robust support for Volume Group Snapshots (VGS), a powerful Kubernetes feature for creating atomic, crash-consistent snapshots of multiple volumes simultaneously. This capability is essential for stateful applications that distribute data across several PersistentVolumeClaims (PVCs) and require that all data be captured at the exact same moment to ensure data integrity.
Who is this for? This guide is for application owners and backup administrators who need to ensure data consistency for multi-volume stateful applications, such as distributed databases (e.g., Cassandra, Zookeeper) or complex stateful services.
You should consider using Volume Group Snapshots when:
Before diving in, let’s clarify the Kubernetes resources involved in this process:
Velero’s integration with VGS is designed to be as seamless as possible, automating the complexities of group snapshots. Velero supports three distinct workflows depending on your configuration:
Velero automatically selects the appropriate workflow based on your backup configuration:
VGS + Data Movement
When: PVCs have VGS labels AND the --snapshot-move-data=true flag is used
Use case: Need atomic consistency + long-term storage/cross-cloud portability
1. Creates a VolumeGroupSnapshot for write-order consistency across all labeled PVCs
2. Derives individual VolumeSnapshot objects from the VGS
3. Creates DataUpload CRs that move each volume’s data to object storage
VGS Only
When: PVCs have VGS labels BUT --snapshot-move-data=false (or flag omitted)
Use case: Need atomic consistency with local snapshot storage
1. Creates a VolumeGroupSnapshot for write-order consistency across all labeled PVCs
2. Derives individual VolumeSnapshot objects from the VGS
3. Keeps the VolumeSnapshots, with no DataUpload or data mover involved
4. Leaves the VolumeSnapshots stored on your storage system
Individual Snapshots
When: PVCs have NO VGS labels (standard CSI snapshot behavior)
Use case: Independent volume backups, no consistency requirements
1. Creates a VolumeSnapshot per PVC independently
2. Moves snapshot data only if the --snapshot-move-data=true flag is set
Select your workflow based on your application’s requirements:
Scenario | VGS Labels | Data Movement Flag | Workflow | Best For |
---|---|---|---|---|
Multi-volume app + cross-cloud backup | ✅ | --snapshot-move-data=true | VGS + Data Movement | Distributed databases with portability needs |
Multi-volume app + local snapshots | ✅ | --snapshot-move-data=false (or omitted) | VGS Only | Applications requiring consistency with fast local snapshots |
Single volumes or independent backups | ❌ | --snapshot-move-data=true (optional) | Individual Snapshots | Simple applications, testing, or independent services |
Example Commands:
# VGS + Data Movement (cross-cloud, long-term storage)
velero backup create db-backup --include-namespaces my-database --snapshot-move-data=true
# VGS Only (atomic consistency, local storage)
velero backup create db-backup --include-namespaces my-database --snapshot-move-data=false
# Individual Snapshots (standard CSI behavior)
velero backup create app-backup --include-namespaces my-app
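The selection rule in the table above can be sketched as a small shell function. This is only an illustration of the decision logic, not Velero code: the function name select_workflow and its two arguments are hypothetical.

```shell
# Hypothetical helper that mirrors the decision table; not part of Velero.
# $1 = "yes" if the PVCs carry the VGS label, anything else otherwise
# $2 = value given to --snapshot-move-data ("true" or "false"; empty if omitted)
select_workflow() {
  has_label="$1"
  move_data="${2:-false}"   # an omitted flag is treated like "false"
  if [ "$has_label" = "yes" ] && [ "$move_data" = "true" ]; then
    echo "VGS + Data Movement"
  elif [ "$has_label" = "yes" ]; then
    echo "VGS Only"
  else
    # Without VGS labels, each PVC is snapshotted independently
    echo "Individual Snapshots"
  fi
}

select_workflow yes true
select_workflow yes
select_workflow no true
```

The third call shows that the data movement flag does not change the workflow name when no VGS labels are present.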
The VGS backup workflow is triggered by a simple label on your PVCs.
Grouping PVCs: When a backup is initiated, Velero’s PVCAction plugin scans for PVCs with the VGS label (the default is velero.io/volume-group). All PVCs within the same namespace that share the same label value are collected into a single ItemBlock. This ensures they are processed as a single, atomic unit.
Orchestrating the Snapshot: The CSI plugin takes over to manage the snapshot creation:
- It determines which VolumeGroupSnapshotClass to use based on your configuration.
- A VolumeGroupSnapshot resource is created, signaling the CSI driver to begin the snapshot process for the entire group.
Snapshot Finalization: Velero monitors the process, and once the VolumeGroupSnapshot is ready, it performs these final steps:
- It retrieves the individual VolumeSnapshot objects.
- It labels each VolumeSnapshot for tracking.
Resource Cleanup: To keep your cluster tidy, Velero deletes the temporary VolumeGroupSnapshot and VolumeGroupSnapshotContent resources after the individual VolumeSnapshots have been created and secured.
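To make the grouping rule concrete (same namespace plus same label value means one group), here is a minimal sketch that groups PVCs from plain text input. The function name group_pvcs and the three-column input format (namespace, PVC name, label value) are assumptions made for this illustration; Velero performs the grouping internally.

```shell
# Hypothetical illustration of the grouping rule; not Velero code.
# Reads lines of "<namespace> <pvc-name> <label-value>" on stdin.
# Lines without a label value (unlabeled PVCs) are skipped.
group_pvcs() {
  awk '$3 != "" {
    key = $1 "/" $3                      # group identity: namespace + label value
    if (key in members) members[key] = members[key] "," $2
    else members[key] = $2
  }
  END { for (k in members) print k ": " members[k] }' | sort
}

printf '%s\n' \
  'my-database db-data-pvc db-cluster-1' \
  'my-database db-logs-pvc db-cluster-1' \
  'my-app cache-pvc' | group_pvcs
```

Here the two my-database PVCs fall into one group, while the unlabeled cache-pvc is left out and would be snapshotted individually.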
Here is a visual representation of the backup workflow:
Restoring from a VGS backup is simple and flexible. During backup, Velero creates individual VolumeSnapshots from the VolumeGroupSnapshot, so the restore process works with standard volume snapshot restoration.
Good to know: No special VGS-related logic is needed during the restore. This means you can restore your data to a cluster that doesn’t have VGS support enabled, providing excellent portability.
Before using Volume Group Snapshots with Velero, ensure your environment meets these requirements:
# The --short flag was removed in kubectl v1.28; plain "kubectl version" now prints the short form
kubectl version
Check the Volume Group Snapshot CRDs on your cluster:
# Check if VGS CRDs are installed
kubectl get crd | grep volumegroup
Verify your CSI driver supports Volume Group Snapshots:
# Check if your CSI driver has VolumeGroupSnapshotClass resources
kubectl get volumegroupsnapshotclass
# Verify CSI driver capabilities (example for AWS EBS)
kubectl describe csidriver ebs.csi.aws.com
Ensure a VolumeGroupSnapshotClass exists for your storage and is properly labeled for Velero discovery:
# List available VolumeGroupSnapshotClasses
kubectl get volumegroupsnapshotclass -o wide
Important: The VolumeGroupSnapshotClass must have the label velero.io/csi-volumegroupsnapshot-class: "true" for Velero to automatically discover and use it:
apiVersion: groupsnapshot.storage.k8s.io/v1alpha1
kind: VolumeGroupSnapshotClass
metadata:
  name: csi-vgs-class
  labels:
    velero.io/csi-volumegroupsnapshot-class: "true"
# driver and deletionPolicy are top-level fields of this kind, not nested under spec
driver: ebs.csi.aws.com
deletionPolicy: Delete
Verify your VolumeGroupSnapshotClass has the correct label:
# Check if VolumeGroupSnapshotClass has the required label
kubectl get volumegroupsnapshotclass --show-labels
Here’s how to get started with VGS backups:
Verify Prerequisites: Ensure all prerequisites above are met.
Label Your PVCs: The key to grouping volumes is to apply a consistent label to all PVCs that should be snapshotted together. Important: All PVCs in a group must use the same CSI driver and exist in the same namespace.
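For PVCs that already exist, the label can be applied with kubectl label. The sketch below only prints the command so it can be reviewed first. The helper name vgs_label_cmd is hypothetical, and the namespace, group, and PVC names are the ones used in the example on this page.

```shell
# Hypothetical helper: prints the kubectl command that puts a set of PVCs
# into one volume group. It does not contact the cluster itself.
# $1 = namespace, $2 = group name (label value), remaining args = PVC names
vgs_label_cmd() {
  namespace="$1"
  group="$2"
  shift 2
  echo "kubectl label pvc $* -n $namespace velero.io/volume-group=$group"
}

vgs_label_cmd my-database db-cluster-1 db-data-pvc db-logs-pvc
```

After reviewing the printed command, run it directly or pipe the output to sh. If you use a custom label key, substitute it for velero.io/volume-group.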
Here’s a complete end-to-end example of using VGS with a database application that has multiple volumes:
Deploy a database application with multiple PVCs:
PVC for Primary Data:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: db-data-pvc
  namespace: my-database
  labels:
    velero.io/volume-group: db-cluster-1
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: my-csi-storage-class
PVC for Transaction Logs:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: db-logs-pvc
  namespace: my-database
  labels:
    velero.io/volume-group: db-cluster-1
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  storageClassName: my-csi-storage-class
When you next back up the my-database namespace, Velero will see the velero.io/volume-group: db-cluster-1 label on both PVCs and will trigger a VolumeGroupSnapshot for the db-cluster-1 group.
# Create backup that will use VGS for labeled PVCs
velero backup create my-app-backup --include-namespaces my-database
# Monitor backup progress
velero backup describe my-app-backup
velero backup logs my-app-backup
# Verify VolumeSnapshots were created from the VGS
kubectl get volumesnapshot -n my-database -o wide
# Check that snapshots have the correct labels
kubectl get volumesnapshot -n my-database --show-labels
# Confirm backup completed successfully
velero backup describe my-app-backup | grep Phase
# Create a test namespace for restore
kubectl create namespace my-database-restore
# Restore to the new namespace
velero restore create test-restore \
  --from-backup my-app-backup \
  --namespace-mappings my-database:my-database-restore
# Monitor restore progress
velero restore describe test-restore
velero restore logs test-restore
# Verify PVCs were restored correctly
kubectl get pvc -n my-database-restore --show-labels
# Remove test namespace after verification
kubectl delete namespace my-database-restore
# List backups and restores
velero backup get
velero restore get
You can customize the label key that Velero uses to identify VGS groups. This is useful if you have pre-existing labels or want to use a different convention. The configuration is applied with the following order of precedence:
Backup Resource Spec (Highest Priority): For the most granular control, you can specify the label key directly in your Backup resource definition.
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: my-app-backup
  namespace: velero
spec:
  volumeGroupSnapshotLabelKey: "my-organization.io/snapshot-group"
  includedNamespaces: [ "my-database" ]
  # ... other backup spec details
Velero Server Argument: You can set a cluster-wide default by providing the --volume-group-snapshot-label-key command-line argument when you install or start the Velero server.
Default Value (Lowest Priority): If you don’t provide any custom configuration, Velero defaults to using velero.io/volume-group.
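The precedence order above amounts to a fallback chain. The following sketch illustrates it with a hypothetical function, resolve_vgs_label_key; it is not Velero’s implementation, and the second example key (team.example/group) is made up for the demonstration.

```shell
# Hypothetical illustration of the precedence order; not Velero code.
# $1 = label key from the Backup spec (empty if unset)
# $2 = label key from the server argument (empty if unset)
resolve_vgs_label_key() {
  backup_key="$1"
  server_key="$2"
  # Backup spec wins, then the server argument, then the built-in default
  echo "${backup_key:-${server_key:-velero.io/volume-group}}"
}

resolve_vgs_label_key "my-organization.io/snapshot-group" "team.example/group"
resolve_vgs_label_key "" "team.example/group"
resolve_vgs_label_key "" ""
```

The three calls print the Backup spec key, the server-level key, and the default key, in that order.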
Symptoms: Backup completes but individual VolumeSnapshots are created instead of VGS
# Check if PVCs have the correct label
kubectl get pvc -n my-database --show-labels
# Verify all PVCs use the same CSI driver
kubectl get pv $(kubectl get pvc -n my-database -o jsonpath='{.items[*].spec.volumeName}') -o jsonpath='{range .items[*]}{.metadata.name}: {.spec.csi.driver}{"\n"}{end}'
Solutions:
Symptoms: Backup fails with VGS-related errors
# Check Velero logs for VGS errors
velero backup logs my-app-backup | grep -i "VolumeGroup"
# Check CSI driver logs
kubectl logs -n kube-system -l app=ebs-csi-controller --tail=100
Solutions:
Issue
When creating VolumeGroupSnapshot backups, you may encounter this error:
VolumeSnapshot has a temporary error Failed to set default snapshot class with error cannot find default snapshot class. Snapshot controller will retry later.
Cause
The Kubernetes snapshot controller requires a default VolumeSnapshotClass to be configured in the cluster, but none is currently set.
Solution
Set a default VolumeSnapshotClass that uses the same CSI driver as your VolumeGroupSnapshotClass:
# List available VolumeSnapshotClasses
kubectl get volumesnapshotclasses
# Set the appropriate class as default for your CSI driver
kubectl patch volumesnapshotclass <snapshot-class-name> \
  -p '{"metadata":{"annotations":{"snapshot.storage.kubernetes.io/is-default-class":"true"}}}'
# Example for Ceph RBD:
kubectl patch volumesnapshotclass ocs-storagecluster-rbdplugin-snapclass \
  -p '{"metadata":{"annotations":{"snapshot.storage.kubernetes.io/is-default-class":"true"}}}'
Important: Ensure the default VolumeSnapshotClass uses the same CSI driver as your VolumeGroupSnapshotClass. For example, if your VolumeGroupSnapshotClass uses ebs.csi.aws.com, the default VolumeSnapshotClass should also use ebs.csi.aws.com.
Note: Only one VolumeSnapshotClass should be marked as default per CSI driver to avoid conflicts.