Virtalis Reach Automated Backup System

Overview

Virtalis Reach comes with an optional automated backup system that allows an administrator to restore to an earlier snapshot in the event of a disaster. We will install Velero to back up the state of your Kubernetes cluster and use a custom-built solution which leverages Restic to back up the persistent data imported into Virtalis Reach.

Alternatively, you can consider using your own backup solution, for example PersistentVolumeSnapshot, which creates a snapshot of a persistent volume at a point in time. You should be aware, however, that volume snapshots may only be supported on a limited number of platforms, such as Azure and AWS.
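
As an illustration only, on platforms with a CSI driver that supports snapshots, such a snapshot is typically requested with a VolumeSnapshot resource. The sketch below is not part of the Virtalis Reach backup system; the name, namespace, snapshot class and claim name are placeholders you would replace with your own values:

cat <<EOF | kubectl apply -f -
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: my-database-snapshot
  namespace: <namespace>
spec:
  volumeSnapshotClassName: <snapshot class provided by your CSI driver>
  source:
    persistentVolumeClaimName: <name of the persistent volume claim to snapshot>
EOF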

If you decide to use a different solution to the one provided by Virtalis, you should be aware that not all databases used by Virtalis Reach support live backups. This means that the databases must be taken offline before backups are performed.

You should also consider creating regular backups of the buckets which hold the backed-up data, in case they fail. This can be done through your cloud provider, or manually if you host your own bucket.
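
For example, if your buckets are hosted on AWS, a periodic copy could be made to a second, independent bucket with the AWS CLI; the destination bucket names below are placeholders:

#mirror the backup buckets into separate copy buckets
aws s3 sync s3://reach-restic s3://reach-restic-copy
aws s3 sync s3://reach-velero s3://reach-velero-copy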

Please note: The following databases used by Virtalis Reach can only be backed up while offline:

  • Minio
  • Neo4j

Variables and Commands

In this section, variables enclosed in <> angle brackets should be replaced with the appropriate values. For example:

docker login -u <my_id> -p <my_password>


becomes 

docker login -u admin -p admin


Commands to execute in a shell are shown as code, and each block of code is designed to be a single command that can be copied and pasted:

These are commands to be entered in a shell in your cluster's administration console
This is another block of code \
that uses "\" to escape newlines \
and can be copied and pasted straight into your console


Installation

Creating a Storage Location

Recommended

Follow the “Create S3 Bucket” and “Set permissions for Velero” sections from the velero-plugin-for-aws guide linked below, and make sure that you create the following two buckets in your S3 storage:

  • reach-restic
  • reach-velero

https://github.com/vmware-tanzu/velero-plugin-for-aws#create-s3-bucket 

Export the address and port of the bucket you have created:

export S3_BUCKET_ADDRESS=<address>
#e.g. S3_BUCKET_ADDRESS=192.168.1.3 or S3_BUCKET_ADDRESS=mydomain.com
export S3_BUCKET_PORT=<port>
export S3_BUCKET_PROTOCOL=<http or https>
export S3_BUCKET_REGION=<region the bucket was created in, e.g. eu-west-1>
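
Optionally, confirm that the bucket endpoint is reachable from the machine you are installing from; any HTTP response (including an access-denied error) indicates that the endpoint can be reached:

curl -I $S3_BUCKET_PROTOCOL://$S3_BUCKET_ADDRESS:$S3_BUCKET_PORT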

Not Recommended - Create an S3 Bucket on the Same Cluster, Alongside Virtalis Reach

Customize persistence.size if the total size of your data exceeds 256Gi, and change the storage class REACH_SC if needed.

export REACH_SC=local-path
kubectl create ns reach-backup
#check if pwgen is installed for the next step
command -v pwgen
#create a secret containing randomly generated S3 credentials
kubectl create secret generic reach-s3-backup -n reach-backup \
--from-literal='access-key'=$(pwgen 30 1 -s | tr -d '\n') \
--from-literal='secret-key'=$(pwgen 30 1 -s | tr -d '\n')
#deploy a standalone MinIO instance to host the backup buckets
helm upgrade --install reach-s3-backup bitnami/minio \
-n reach-backup --version 3.6.1 \
--set persistence.storageClass=$REACH_SC \
--set persistence.size=256Gi \
--set mode=standalone \
--set resources.requests.memory='150Mi' \
--set resources.requests.cpu='250m' \
--set resources.limits.memory='500Mi' \
--set resources.limits.cpu='500m' \
--set disableWebUI=true \
--set useCredentialsFile=true \
--set volumePermissions.enabled=true \
--set defaultBuckets="reach-velero reach-restic" \
--set global.minio.existingSecret=reach-s3-backup
#write a Velero credentials file using the generated keys
cat <<EOF > credentials-velero
[default]
aws_access_key_id=$(kubectl get secret reach-s3-backup \
-n reach-backup -o jsonpath="{.data.access-key}" | base64 --decode)
aws_secret_access_key=$(kubectl get secret reach-s3-backup \
-n reach-backup -o jsonpath="{.data.secret-key}" | base64 --decode)
EOF
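
Check that the MinIO pod has started and that its persistent volume claim has been bound before moving on:

kubectl get pods -n reach-backup
kubectl get pvc -n reach-backup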


Export the Address and Port of the Bucket You Have Created

export S3_BUCKET_ADDRESS=reach-s3-backup-minio.reach-backup.svc.cluster.local
export S3_BUCKET_PORT=9000
export S3_BUCKET_PROTOCOL=http
export S3_BUCKET_REGION=local

Set Up Variables

For the duration of this installation, remain in the k8s folder that you downloaded by following the Virtalis Reach Installation Guide.
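
For example, if the installation files were extracted to your home directory (the path below is only a placeholder for wherever you placed them):

cd ~/reach-install/k8s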

Make scripts executable:

sudo chmod +x \
trigger-database-restore.sh \
trigger-database-backup.sh \
install-backup-restore.sh


Export the following variables:

export ACR_REGISTRY_NAME=virtaliscustomer


Export the address of the reach-restic bucket:

export REPO_URL=s3:$S3_BUCKET_PROTOCOL://\
$S3_BUCKET_ADDRESS:$S3_BUCKET_PORT/reach-restic


Substitute the variable values and export them:

export REACH_NAMESPACE=<name of kubernetes namespace Virtalis Reach is deployed in>

Optional Configuration Variables

export MANAGED_TAG=<custom image tag for Virtalis Reach services>
export DISABLE_CRON=<set to true to install without an automated cronSchedule>

Velero Installation

The following steps will assume you named your Velero bucket “reach-velero”.

Add the VMware helm repository and update:

helm repo add vmware-tanzu https://vmware-tanzu.github.io/helm-charts
helm repo update

Install Velero

helm install velero vmware-tanzu/velero \
--namespace velero \
--create-namespace \
--set-file credentials.secretContents.cloud\
=./credentials-velero \
--set configuration.provider=aws \
--set configuration.backupStorageLocation.name\
=reach-velero \
--set configuration.backupStorageLocation.bucket\
=reach-velero \
--set configuration.backupStorageLocation.config.region\
=$S3_BUCKET_REGION \
--set configuration.backupStorageLocation.config.s3Url\
=$S3_BUCKET_PROTOCOL://$S3_BUCKET_ADDRESS:$S3_BUCKET_PORT \
--set configuration.backupStorageLocation.config.publicUrl\
=$S3_BUCKET_PROTOCOL://$S3_BUCKET_ADDRESS:$S3_BUCKET_PORT \
--set configuration.backupStorageLocation.config.s3ForcePathStyle\
=true \
--set initContainers[0].name=velero-plugin-for-aws \
--set initContainers[0].image=velero/velero-plugin-for-aws:v1.1.0 \
--set initContainers[0].volumeMounts[0].mountPath=/target \
--set initContainers[0].volumeMounts[0].name=plugins \
--set snapshotsEnabled=false \
--version 2.23.1 \
--set deployRestic=true
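
Check that the Velero server and restic pods have started and that the reach-velero backup storage location has been created:

kubectl get pods -n velero
kubectl get backupstoragelocations -n velero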

Install the Velero CLI Client

wget https://github.com/vmware-tanzu/velero/releases\
/download/v1.5.3/velero-v1.5.3-linux-amd64.tar.gz
tar -xzvf velero-v1.5.3-linux-amd64.tar.gz
rm -f velero-v1.5.3-linux-amd64.tar.gz
sudo mv $(pwd)/velero-v1.5.3-linux-amd64/velero /usr/bin/
sudo chmod +x /usr/bin/velero
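
Confirm that the CLI is installed and can connect to the Velero server running in the cluster:

velero version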


Manually create a single backup to verify that the connection to the S3 bucket is working:

velero backup create test-backup-1 \
--storage-location=reach-velero --include-namespaces $REACH_NAMESPACE


Watch the status of the backup until it has finished; it should show as Completed if everything was set up correctly:

watch -n2 velero backup get


Create a scheduled backup:

velero create schedule cluster-backup --schedule="45 23 * * 6" \
--storage-location=reach-velero --include-namespaces $REACH_NAMESPACE


This schedule will run a backup every Saturday at 23:45.
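
You can list the configured schedules to confirm that it was created:

velero schedule get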

Restic Integration

The custom restic integration uses Kubernetes jobs to mount the data, encrypt it, and send it to a bucket. Kubernetes CustomResourceDefinitions are used to store the information about the restic repositories as well as any created backups.

By default, the scheduled data backup runs every Friday at 23:45. This can be modified by editing the cronSchedule field in each values.yaml file located in backup-restore/helmCharts/<release_name>/, with the exception of common-lib.
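
As a sketch of such an edit, assuming cronSchedule is a top-level field in each values.yaml, the following loop would move the schedule to Sundays at 01:30; adjust the cron expression to suit your environment:

#example only: rewrite the cronSchedule field in every chart except common-lib
for f in backup-restore/helmCharts/*/values.yaml; do
  case "$f" in *common-lib*) continue ;; esac
  sed -i 's|^cronSchedule:.*|cronSchedule: "30 1 * * 0"|' "$f"
done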

All of these backups are offline backups; Virtalis Reach will therefore be unavailable while a backup runs, as a number of databases have to be taken down.

Create an S3 bucket with the name “reach-restic” by following the same guide from the Velero section.

Replace the keys and create a secret containing the reach-restic bucket credentials:

kubectl create secret generic reach-restic-bucket-creds \
-n "$REACH_NAMESPACE" \
--from-literal='AWS_ACCESS_KEY'='<ACCESS_KEY>' \
--from-literal='AWS_SECRET_KEY'='<SECRET_KEY>'


If you instead opted to deploy an S3 bucket on the same cluster, run this instead:

kubectl create secret generic reach-restic-bucket-creds \
-n "$REACH_NAMESPACE" \
--from-literal='AWS_ACCESS_KEY'=$(kubectl get secret reach-s3-backup \
-n reach-backup -o jsonpath="{.data.access-key}" | base64 --decode) \
--from-literal='AWS_SECRET_KEY'=$(kubectl get secret reach-s3-backup \
-n reach-backup -o jsonpath="{.data.secret-key}" | base64 --decode)


Export the address of the reach-restic bucket:

export REPO_URL=s3:$S3_BUCKET_PROTOCOL://\
$S3_BUCKET_ADDRESS:$S3_BUCKET_PORT/reach-restic


Run the installation:

./install-backup-restore.sh


Check that all the -init-repository- jobs have completed:

kubectl get pods -n $REACH_NAMESPACE | grep init-repository


Query the list of repositories:

kubectl get repository -n $REACH_NAMESPACE


The output should look something like this with the status of all repositories showing as Initialized:

NAME                    STATUS        SIZE   CREATIONDATE
artifact-binary-store   Initialized   0B     2021-03-01T10:21:53Z
artifact-store          Initialized   0B     2021-03-01T10:21:57Z
job-db                  Initialized   0B     2021-03-01T10:21:58Z
keycloak-db             Initialized   0B     2021-03-01T10:21:58Z
vrdb-binary-store       Initialized   0B     2021-03-01T10:21:58Z
vrdb-store              Initialized   0B     2021-03-01T10:22:00Z


Once you are happy to move on, delete the completed job pods:

kubectl delete jobs -n $REACH_NAMESPACE -l app=backup-restore-init-repository


Trigger a manual backup:

./trigger-database-backup.sh


After a while, all the -triggered-backup- jobs should show up as Completed:

kubectl get pods -n "$REACH_NAMESPACE" | grep triggered-backup


Query the list of snapshots:

kubectl get snapshot -n "$REACH_NAMESPACE"


The output should look something like this with the status of all snapshots showing as Completed:

NAME                       STATUS      ID       CREATIONDATE
artifact-binary-store...   Completed   62e...   2021...
artifact-store-neo4j-...   Completed   6ae...   2021...
job-db-mysql-master-1...   Completed   944...   2021...
keycloak-db-mysql-mas...   Completed   468...   2021...
vrdb-binary-store-min...   Completed   729...   2021...
vrdb-store-neo4j-core...   Completed   1c2...   2021...


Once you are happy to move on, delete the completed job pods:

kubectl delete jobs -n $REACH_NAMESPACE -l app=backup-restore-triggered-backup


Triggering a Manual Backup

Set Up Variables

Substitute the variable values and export them:

export REACH_NAMESPACE=<name of kubernetes namespace Virtalis Reach is deployed in>

Run the Backup

Consider scheduling system downtime and scaling down the ingress to prevent people from accessing the server during the backup procedure.

Make a note of the replica count for nginx before scaling it down:

kubectl get deploy -n ingress-nginx 
export NGINX_REPLICAS=<CURRENT_REPLICA_COUNT>


Scale down the nginx ingress service:

kubectl scale deploy --replicas=0 ingress-nginx-ingress-controller \
-n ingress-nginx


Create a cluster resource level backup:

velero backup create cluster-backup-$(date +"%m-%d-%Y") \
--storage-location=reach-velero --include-namespaces $REACH_NAMESPACE


Check the status of the velero backup: 

watch -n2 velero backup get


Create a database level backup:

./trigger-database-backup.sh


Check the status of the database backup:

watch -n2 kubectl get snapshot -n "$REACH_NAMESPACE"

Restoring Data

Set Up Variables

Substitute the variable values and export them:

export REACH_NAMESPACE=<name of kubernetes namespace Virtalis Reach is deployed in>

Restoration Plan

Plan your restoration by gathering a list of the snapshot IDs you will be restoring from and exporting them as variables.

Begin by querying the list of repositories:

kubectl get repo -n "$REACH_NAMESPACE"
NAME                    STATUS        SIZE   CREATIONDATE
artifact-binary-store   Initialized   12K    2021-07-02T12:03:26Z
artifact-store          Initialized   527M   2021-07-02T12:03:29Z
comment-db              Initialized   180M   2021-07-02T12:03:37Z
job-db                  Initialized   181M   2021-07-02T12:03:43Z
keycloak-db             Initialized   193M   2021-07-02T12:03:43Z
vrdb-binary-store       Initialized   12K    2021-07-02T12:03:46Z
vrdb-store              Initialized   527M   2021-07-02T12:02:44Z


Perform a dry run of the restore script to gather a list of the variables you have to export:

DRY_RUN=true ./trigger-database-restore.sh


Sample output:

Error: ARTIFACT_BINARY_STORE_RESTORE_ID has not been exported. Please run 'kubectl get snapshot -n develop -l repository=artifact-binary-store' to see a list of available snapshots.
Error: ARTIFACT_STORE_RESTORE_ID has not been exported. Please run 'kubectl get snapshot -n develop -l repository=artifact-store'...
...


Query available snapshots or use the commands returned in the output above to query by specific repositories:

kubectl get snapshot -n "$REACH_NAMESPACE"


This should return a list of available snapshots:

NAME                            STATUS      ID       CREATIONDATE
artifact-binary-store-mini...   Completed   4a2...   2021-07-0...
artifact-store-neo4j-core-...   Completed   41d...   2021-07-0...
comment-db-mysql-master-16...   Completed   e72...   2021-07-0...
job-db-mysql-master-162522...   Completed   eb5...   2021-07-0...
keycloak-db-mysql-master-1...   Completed   919...   2021-07-0...
vrdb-binary-store-minio-16...   Completed   cf0...   2021-07-0...
vrdb-store-neo4j-core-1625...   Completed   08d...   2021-07-0...


It’s strongly advised to restore all the backed-up data using snapshots from the same day to avoid any missing/inaccessible data.
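
Export one restore ID for each repository, copying the value from the ID column of the snapshot list. As the dry run output shows, each variable name is the repository name in upper case with hyphens replaced by underscores, followed by _RESTORE_ID. For example (the IDs are placeholders for values from your own snapshot list):

export ARTIFACT_BINARY_STORE_RESTORE_ID=<snapshot_id>
export ARTIFACT_STORE_RESTORE_ID=<snapshot_id>
export COMMENT_DB_RESTORE_ID=<snapshot_id>
export JOB_DB_RESTORE_ID=<snapshot_id>
export KEYCLOAK_DB_RESTORE_ID=<snapshot_id>
export VRDB_BINARY_STORE_RESTORE_ID=<snapshot_id>
export VRDB_STORE_RESTORE_ID=<snapshot_id>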

Note down the replica count for nginx before scaling it down:

kubectl get deploy -n ingress-nginx
export NGINX_REPLICAS=<CURRENT_REPLICA_COUNT>


Scale down the nginx ingress service to prevent people from accessing Virtalis Reach during the restoration process:

kubectl scale deploy --replicas=0 ingress-nginx-ingress-controller \
-n ingress-nginx


Run the Restore Script

./trigger-database-restore.sh


Unset the exported restore IDs:

charts=( $(ls backup-restore/helmCharts/) ); \
for chart in "${charts[@]}"; do if [ $chart == "common-lib" ]; \
then continue; fi; id_var="$(echo ${chart^^} | \
sed 's/-/_/g')_RESTORE_ID"; unset ${id_var}; done


After a while, all the -triggered-restore- jobs should show up as Completed:

kubectl get pods -n "$REACH_NAMESPACE" | grep triggered-restore


Once you are happy to move on, delete the completed job pods:

kubectl delete jobs -n $REACH_NAMESPACE \
-l app=backup-restore-triggered-restore


Watch and wait for all pods that are running to be Ready:

watch -n2 kubectl get pods -n "$REACH_NAMESPACE"


Scale back nginx:

kubectl scale deploy --replicas="$NGINX_REPLICAS" \
ingress-nginx-ingress-controller -n ingress-nginx


Verify that everything is working by testing standard functionality such as
importing a file or viewing a visualisation.
