So there I was… Had a custom built vSphere cluster, 3 hosts, iSCSI target (setup with Lio-Target), everything running fine, smoothly, perfect… I’m done right? NOPE!
Most of us who care, are concerned about our Disaster Recovery or Backup solution. With virtualization things get a little bit interesting in the fact that either you have a CRAZY large setup, and can use the VMware backup stuff, or you have a smaller environment and want something simple, easy to use, and with a low footprint. Configuring a backup and/or disaster recovery solution for a virtualized environment may be difficult and complicated, however after it’s fully implemented; management, use, and administration is easy, and don’t forget about the abilities and features you get with virtualization.
Reasons why virtualization rocks when it comes to backup and disaster recovery:
-Unlike traditional backups you do not have to install a bare OS to run the backup software to restore over
-You can restore to hardware that is not like nor similar to the original hardware
-Backups are now simple files that can be easily moved, transported, copied, and saved on to normal or non-normal media (you could save an entire system on to a USB key if it was big enough). On a 2TB external USB drive, you could have a backup of over 16+ virtual machines!
-Ease of recovery: Copy the backed up VM files to host/datastore, and simply hit play. Restore complete and you’re up and running!
So with all that in mind, here we go! (Scroll to bottom of post for a quick conclusion).
For my solution, here are some of the requirements I had:
1) Utilize snapshots to take restorable backups while the VMs are running (no downtime).
2) Move the backups to a different location while running (this could be a drive, NFS export, SMB share, etc…).
3) Have the backups stored somewhere easy to access where I can move them to a removable external USB drive to take off-site. This way, I have fast disk-to-disk access to restore backups in the event my storage system goes down or RAID array is lost (downtime would be minimal), or in the event of something more serious like a fire, I would have the USB drive off-site to restore from. Disk-to-disk backups could happen on a daily basis, and disk-to-USB could be done weekly and taken off-site.
4) In the event of a failure, be able to bring USB drive onsite, transfer VMs back and be up and running in no time.
So with this all in mind, I started designing a solution. My existing environment (without backup) composed of:
2 X HP DL360 G5 Servers (running ESXi)
1 X HP ML350 G5 Server (running ESXi)
1 X Super Micro Intel Xeon Server (Running CentOS 6 & Lio-Target backports: providing iSCSI VMFS)
2 X HP MSA20 Storage Units
First, I need a way to create snapshots of my 16+ virtual machines. After, I would need to have the snapshots moved to another location (such as a backup server). There is a free script available called ghettoVCB. ghettoVCB is a “Free alternative for backup up VM’s for ESX(i) 3.5, 4.x+ & 5.x” and is available (along with tons of documentation) at: http://communities.vmware.com/docs/DOC-8760. This script is generally ran on the ESXi host, generates a snapshot and clones it to a seperate datastore configured on that host. It does this for all virtual machines named in the VM list you specify, or for all VMs on the host if a specific switch is passed to ghettovcb.sh.
So now that we have the software, we need to have a location setup to back up to. We could either create a new iSCSI target, or we could setup a new Linux server and have a RAID array configured and formatted with ext4 and exported via NFS. This would allow us to have the NFS setup as a datastore on ESXi (so we can backup to the NFS export), and afterwards be able to access the backups natively in Linux to copy/move to a external drive formatted with EXT4.
We configure a new server, running CentOS 6 with enough storage to backup all VMs. We create NFS exports and mount these to all the ESXi hosts. We copy the ghettovcb script to a location on the NFS export so it’s accessible to all hosts easily (without having to update the script on each host individually), and we create lists for each physical host containing the names of the virtual machine it virtualizes. We then edit the ghettovcb.sh file to specify the new destination datastore (the backup datastore) and how we want it to back up.
When executing:
./ghettovcb.sh -f esxserverlist01
It creates the snapshot for each VM in the list for that host, clones it to the destination datastore (which in my setup is the NFS export on the new backup server), then deletes the snapshot when the backup is complete, finally moving on to the next VM and repeating the process until done. The script needs to be ran on all hosts, and list files for VMs have to be created for each host.
We now have a backup server and have done a disk-to-disk backup of our virtual machines. We can now plug in a large external USB drive to the backup server, and simply copy over the backups to it.
I do everything manually because I like to confirm everything is done and backed up properly, however you can totally create scripts to automate the whole process. After this we have our new backup solution!
Quick recap:
1) Setup a new backup server with enough disk space to back up all VMs. Setup an NFS export and mount it to ESXi hosts.
2) Download and configure the ghettoVCB script. Run the script on each ESXi host to disk-to-disk backup your VMs to your new backup server.
3) Copy the backup files from the backup server to a external USB drive that has enough space. Take off-site.
I have had to restore a couple VMs in the past due to a damaged RAID array, and I did so using a backup from above. It worked great! I will create a post on the restore process sometime in the future (for now feel free to look at the ghettoVCB documentation)!