Documentation

Velero File System Backup Performance Guide

When using Velero to do file system backup & restore, Restic uploader or Kopia uploader are both supported now. But the resources used and time consumption are a big difference between them.

We’ve done series rounds of tests against Restic uploader and Kopia uploader through Velero, which may give you some guidance. But the test results will vary from different infrastructures, and our tests are limited and couldn’t cover a variety of data scenarios, the test results and analysis are for reference only.

Infrastructure

Minio is used as Velero backend storage, Network File System (NFS) is used to create the persistent volumes (PVs) and Persistent Volume Claims (PVC) based on the storage. The minio and NFS server are deployed independently in different virtual machines (VM), which with 300 MB/s write throughput and 175 MB/s read throughput representatively.

The details of environmental information as below:

### KUBERNETES VERSION
root@velero-host-01:~# kubectl version
Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.4"
Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.14"

### DOCKER VERSION
root@velero-host-01:~# docker version
Client:
 Version:           20.10.12
 API version:       1.41

Server:
 Engine:
  Version:          20.10.12
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.16.2
 containerd:
  Version:          1.5.9-0ubuntu1~20.04.4
 runc:
  Version:          1.1.0-0ubuntu1~20.04.1
 docker-init:
  Version:          0.19.0

### NODES
root@velero-host-01:~# kubectl get nodes |wc -l 
6 // one master with 6 work nodes

### DISK INFO
root@velero-host-01:~# smartctl -a /dev/sda
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-126-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               VMware
Product:              Virtual disk
Revision:             1.0
Logical block size:   512 bytes
Rotation Rate:        Solid State Device
Device type:          disk
### MEMORY INFO
root@velero-host-01:~# free -h
              total        used        free      shared  buff/cache   available
Mem:          3.8Gi       328Mi       3.1Gi       1.0Mi       469Mi       3.3Gi
Swap:            0B          0B          0B

### CPU INFO
root@velero-host-01:~# cat /proc/cpuinfo | grep name | cut -f2 -d: | uniq -c
      4  Intel(R) Xeon(R) Gold 6230R CPU @ 2.10GHz

### SYSTEM INFO
root@velero-host-01:~# cat /proc/version
root@velero-host-01:~# cat /proc/version
Linux version 5.4.0-126-generic (build@lcy02-amd64-072) (gcc version 9.4.0 (Ubuntu 9.4.0-1ubuntu1~20.04.1)) #142-Ubuntu SMP Fri Aug 26 12:12:57 UTC 2022

### VELERO VERSION
root@velero-host-01:~# velero version
Client:
	Version: main ###v1.10 pre-release version
	Git commit: 9b22ca6100646523876b18a491d881561b4dbcf3-dirty
Server:
	Version: main ###v1.10 pre-release version

Test

Below we’ve done 6 groups of tests, for each single group of test, we used limited resources (1 core CPU 2 GB memory or 4 cores CPU 4 GB memory) to do Velero file system backup under Restic path and Kopia path, and then compare the results.

Recorded the metrics of time consumption, maximum CPU usage, maximum memory usage, and minio storage usage for node-agent daemonset, and the metrics of Velero deployment are not included since the differences are not obvious by whether using Restic uploader or Kopia uploader.

Compression is either disabled or not unavailable for both uploader.

Case 1: 4194304(4M) files, 2396745(2M) directories, 0B per file total 0B content

result:

Uploader Resources Times Max CPU Max Memory Repo Usage
Kopia 1c2g 24m54s 65% 1530 MB 80 MB
Restic 1c2g 52m31s 55% 1708 MB 3.3 GB
Kopia 4c4g 24m52s 63% 2216 MB 80 MB
Restic 4c4g 52m28s 54% 2329 MB 3.3 GB

conclusion:

  • The memory usage is larger than Velero’s default memory limit (1GB) for both Kopia and Restic under massive empty files.
  • For both using Kopia uploader and Restic uploader, there is no significant time reduction by increasing resources from 1c2g to 4c4g.
  • Restic uploader is one more time slower than Kopia uploader under the same specification resources.
  • Restic has an irrational repository size (3.3GB)

Case 2: Using the same size (100B) of file and default Velero’s resource configuration, the testing quantity of files from 20 thousand to 2 million, these groups of cases mainly test the behavior with the increasing quantity of files.

Case 2.1: 235298(23K) files, 137257 (10k)directories, 100B per file total 22.440MB content

result:

Uploader Resources Times Max CPU Max Memory Repo Usage
Kopia 1c1g 2m34s 70% 692 MB 108 MB
Restic 1c1g 3m9s 54% 714 MB 275 MB

Case 2.2 470596(40k) files, 137257 (10k)directories, 100B per file total 44.880MB content

result:

Uploader Resources Times Max CPU Max Memory Repo Usage
Kopia 1c1g 3m45s 68% 831 MB 108 MB
Restic 1c1g 4m53s 57% 788 MB 275 MB

Case 2.3 705894(70k) files, 137257(10k) directories, 100B per file total 67.319MB content

result:

Uploader Resources Times Max CPU Max Memory Repo Usage
Kopia 1c1g 5m06s 71% 861 MB 108 MB
Restic 1c1g 6m23s 56% 810 MB 275 MB

Case 2.4 2097152(2M) files, 2396745(2M) directories, 100B per file total 200.000MB content

result:

Uploader Resources Times Max CPU Max Memory Repo Usage
Kopia 1c1g OOM 74% N/A N/A
Restic 1c1g 41m47s 52% 904 MB 3.2 GB

conclusion:

  • With the increasing number of files, there is no memory abnormal surge, the memory usage for both Kopia uploader and Restic uploader is linear increasing, until exceeds 1GB memory usage in Case 2.4 Kopia uploader OOM happened.
  • Kopia uploader gets increasingly faster along with the increasing number of files.
  • Restic uploader repository size is still much larger than Kopia uploader repository.

Case 3: 10625(10k) files, 781 directories, 1.000MB per file total 10.376GB content

result:

Uploader Resources Times Max CPU Max Memory Repo Usage
Kopia 1c2g 1m37s 75% 251 MB 10 GB
Restic 1c2g 5m25s 100% 153 MB 10 GB
Kopia 4c4g 1m35s 75% 248 MB 10 GB
Restic 4c4g 3m17s 171% 126 MB 10 GB

conclusion:

  • This case involves a relatively large backup size, there is no significant time reduction by increasing resources from 1c2g to 4c4g for Kopia uploader, but for Restic uploader when increasing CPU from 1 core to 4, backup time-consuming was shortened by one-third, which means in this scenario should allocate more CPU resources for Restic uploader.
  • For the large backup size case, Restic uploader’s repository size comes to normal

Case 4: 900 files, 1 directory, 1.000GB per file total 900.000GB content

result:

Uploader Resources Times Max CPU Max Memory Repo Usage
Kopia 1c2g 2h30m 100% 714 MB 900 GB
Restic 1c2g Timeout 100% 416 MB N/A
Kopia 4c4g 1h42m 138% 786 MB 900 GB
Restic 4c4g 2h15m 351% 606 MB 900 GB

conclusion:

  • When the target backup data is relatively large, Restic uploader starts to Timeout under 1c2g. So it’s better to allocate more memory for Restic uploader when backup large sizes of data.
  • For backup large amounts of data, Kopia uploader is both less time-consuming and less resource usage.

Summary

  • With the same specification resources, Kopia uploader is less time-consuming when backup.
  • Performance would be better if choosing Kopia uploader for the scenario in backup large mounts of data or massive small files.
  • It’s better to set one reasonable resource configuration instead of the default depending on your scenario. For default resource configuration, it’s easy to be timeout with Restic uploader in backup large amounts of data, and it’s easy to be OOM for both Kopia uploader and Restic uploader in backup of massive small files.
Getting Started

To help you get started, see the documentation.