Virtual Machines: A Primer

 

What is a Virtual Machine (VM)?

A VM is a software-based emulation of a computer. It includes an operating system, software programs, and virtualized hardware resources. A virtualization layer, often referred to as the hypervisor, links the guest VM to the host computer’s hardware resources, such as CPUs, memory, network cards, video cards, USB ports, and hard disks. To the user, the VM’s operating system behaves almost identically to a natively installed one.

The NMRbox VM is a fully functional CentOS Linux operating system pre-configured with software commonly employed in NMR data processing and analysis.

Server- and Client-based VMs

VMs can run remotely on a server that provides Software-as-a-Service (SaaS) or locally on a host computer. When run locally, the VM is downloaded and imported using virtualization software, and the host computer supplies the hardware resources. The host computer must have enough resources to run both its native operating system and the VM’s operating system. SaaS VMs can draw on large pools of shared computing power and place very low system requirements on the user, who accesses the SaaS VM through an SSH terminal (possibly with X Window System forwarding), a VNC server, or a client program that displays remote VMs.
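As a rough illustration of the resource requirement for a locally run VM, the Python sketch below checks whether a proposed VM allocation still leaves something for the host operating system. The function name and the default reserve (one CPU and 2048 MB of memory for the host) are assumptions made for this example, not NMRbox requirements.

```python
def fits_on_host(host_cpus, host_mem_mb, vm_cpus, vm_mem_mb,
                 host_reserve_cpus=1, host_reserve_mem_mb=2048):
    """Return True if the VM allocation leaves the assumed reserve for the host OS.

    The reserve values are illustrative assumptions; adjust them for your host.
    """
    cpus_ok = vm_cpus <= host_cpus - host_reserve_cpus
    mem_ok = vm_mem_mb <= host_mem_mb - host_reserve_mem_mb
    return cpus_ok and mem_ok
```

For example, under these assumptions a host with 4 CPUs and 8192 MB of memory can give the VM 2 CPUs and 4096 MB, while a host with 2 CPUs and 4096 MB cannot.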

Our release of NMRbox will provide a downloadable VM for use on a host computer as well as a SaaS version.

Basic steps required to install a local copy of the NMRbox VM

NOTE: Detailed installation instructions can be found on the NMRbox website.

1. Download the NMRbox VM as a .ova file

2. Install virtualization software on your host computer if not already installed (see the next section for options)

3. Import the nmrbox.ova file and create a local NMRbox VM

4. Alter the hardware settings of the NMRbox VM appropriately for your host computer.

a. This step is critical. NMRbox will be configured with default hardware settings, but it is up to the user to choose appropriate settings for memory and the number of CPUs. Making the second hard drive independent of snapshots is also suggested at this point.

5. Boot the NMRbox VM

6. Install tools (such as VMware Tools or VirtualBox Guest Additions) to allow a more streamlined interface between your host computer operating system and the virtual operating system
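For VirtualBox, steps 3 to 5 above can also be carried out from the command line with the VBoxManage tool. The following is a minimal Python sketch that only builds the command lines and does not run them. The function name is hypothetical, and the VM name, memory size, and CPU count are values the user must choose; nothing here is an NMRbox recommendation. Making the second disk independent of snapshots is not covered.

```python
def virtualbox_setup_commands(ova_path, vm_name, memory_mb, cpus):
    """Build VBoxManage command lines for import, hardware settings, and boot."""
    return [
        # Step 3: import the appliance and register it under the chosen name
        ["VBoxManage", "import", ova_path, "--vsys", "0", "--vmname", vm_name],
        # Step 4: set memory (in MB) and the number of virtual CPUs
        ["VBoxManage", "modifyvm", vm_name,
         "--memory", str(memory_mb), "--cpus", str(cpus)],
        # Step 5: boot the VM
        ["VBoxManage", "startvm", vm_name],
    ]
```

On a host where VirtualBox is installed, each command list could then be passed to subprocess.run(command, check=True). The VMware products have their own import dialogs and are not covered by this sketch.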

What host software do I need to run the NMRbox VM?

There is a wide range of programs that act as virtualization environments (hypervisors) and can import and run VMs, some free and some paid. Here is a partial list of commonly used software:

•Oracle’s VirtualBox (free; Windows, OSX, Linux, Solaris)

•VMware Player (free; Windows, Linux)

•VMware Workstation (paid; Windows, Linux)

•VMware Fusion (paid; OSX)

•Parallels (paid; OSX)

•KVM (free; Linux)

•QEMU (free; Linux)

NMRbox has been tested with Oracle’s VirtualBox and with VMware Player, Fusion, and Workstation, and works well with all of them.

VirtualBox Guest Additions and VMware Tools

VirtualBox Guest Additions and VMware Tools are extra packages that can be installed inside the VM after the guest operating system is installed. They improve the usability of the VM by adding integration between the host and guest operating systems, such as:

•Better mouse integration between host and guest OSes

•Shared folders between host and guest OSes

•Copy/paste between host and guest OSes

•Automatic screen resolution changes in the guest OS when the VM window on the host is resized

•USB pass-through allowing USB devices to be mounted in either the host or guest OS

•Driverless printing from guest OS

•Better video support

VirtualBox and VMware Player, Fusion, and Workstation all provide easy installation of these tools. Unfortunately, NMRbox will not have these tools pre-installed. They depend on an interaction between components installed inside the VM operating system and components of the virtualization software, and they are version specific, which makes it difficult for NMRbox to support all possible combinations. However, the NMRbox documentation has detailed instructions on how to install Guest Additions or VMware Tools.

Where are user data files stored inside a VM?

For local NMRbox VMs

By default, a user’s files will be stored in a virtual hard disk inside the VM. However, we wanted to design the NMRbox VM so that it can be updated to newer versions without affecting any user data residing inside it. To accomplish this, we have created user home folders in /home/NMR, where NMR is the mount point of a second virtual hard disk, nmrbox*-disk2.vmdk. A user can then download a new version of the NMRbox VM as a .ova file and import it under a different name. Once it is imported, the nmrbox*-disk2.vmdk file can be copied from the previous VM to the new VM and all the files will be migrated. There are a few caveats with this process if the previous VM contains snapshots and the second virtual hard disk was not set up to be independent. See the NMRbox documentation for more detailed instructions.
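The copy step of this migration can be sketched in Python as follows. This is a minimal illustration only: the function name is hypothetical, both VMs are assumed to be powered off, and the folder locations depend on where your virtualization software keeps its VMs. It does not handle the snapshot caveats mentioned above, and the new VM may still need to be pointed at the copied disk from within the hypervisor.

```python
import shutil
from pathlib import Path


def copy_data_disk(old_vm_dir, new_vm_dir, pattern="nmrbox*-disk2.vmdk"):
    """Copy the second virtual disk from the old VM folder into the new one.

    Overwrites a file of the same name in new_vm_dir, so back it up first
    if in doubt.
    """
    matches = sorted(Path(old_vm_dir).glob(pattern))
    if len(matches) != 1:
        raise FileNotFoundError(
            f"expected exactly one data disk in {old_vm_dir}, found {len(matches)}"
        )
    destination = Path(new_vm_dir) / matches[0].name
    shutil.copy2(matches[0], destination)  # copies contents and timestamps
    return destination
```

Refusing to continue unless exactly one matching file is found is a deliberate choice here, so that a folder holding snapshot files or several disks is not migrated by accident.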

Shared folders: A shared folder on the host computer allows both the host computer and the guest VM to access the same data files. VirtualBox and VMware Player, Fusion, and Workstation all allow shared folders to be easily configured while the VM is powered off. In our testing we have found some idiosyncrasies with the use of shared folders and suggest they be used primarily for moving files between the host and guest operating systems rather than for active data processing and analysis.
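In line with that suggestion, a dataset can be copied out of the shared folder into a working directory on the VM’s own disk before any processing is started. The Python sketch below shows one way to do this from inside the guest; the function name is hypothetical, and the location of the shared folder inside the guest depends on the hypervisor and on how the share was configured.

```python
import shutil
from pathlib import Path


def stage_from_shared_folder(shared_dir, dataset_name, work_root):
    """Copy one dataset directory from the shared folder to a local work area."""
    source = Path(shared_dir) / dataset_name
    target = Path(work_root) / dataset_name
    # copytree raises FileExistsError if the target already exists,
    # which protects work that is already in progress.
    shutil.copytree(source, target)
    return target
```

Results can be copied back to the shared folder the same way once processing is finished.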

Remote file server: If you already have a remote file server, you can mount it directly from within the VM and create a link to the file share in your home folder.
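Assuming the file server has already been mounted inside the VM (the mount itself normally requires administrator rights and is not shown), the link can be created with a few lines of Python. The function name and the default link name are hypothetical.

```python
from pathlib import Path


def link_share_into_home(mount_point, home_dir, link_name="fileserver"):
    """Create a symbolic link in the home folder that points at the mounted share."""
    link = Path(home_dir) / link_name
    if not link.is_symlink():
        # Raises FileExistsError if a regular file or folder already has this name.
        link.symlink_to(Path(mount_point), target_is_directory=True)
    return link
```

Calling the function a second time leaves an existing link untouched.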

For server NMRbox VMs

For server-based VMs, user files will be stored on a remote file share that is independent of the VM, with full disaster recovery and default permissions that deny read, write, and execute access to other users.
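How the server enforces these defaults is not described here. As an illustration only, and assuming ordinary POSIX permission bits rather than whatever mechanism the NMRbox file share actually uses, the following Python sketch removes all group and other access from a file or folder while leaving the owner’s permissions unchanged. The function name is hypothetical.

```python
import os
import stat


def restrict_to_owner(path):
    """Clear every group and other permission bit; keep the owner's bits."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode & ~(stat.S_IRWXG | stat.S_IRWXO))
```

A folder that started with mode 755 ends up with mode 700 after this call.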

ovf, ova, vmdk, vbox, vdi; what are all these formats?

(OVF) stands for Open Virtualization Format. The OVF format is not tied to any particular hypervisor or processor architecture and is therefore a good format for distributing a VM. A VM exported as OVF is a directory with a .ovf extension; inside it are an XML descriptor file for the VM, also with a .ovf extension, and one or more disk images.
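Because the descriptor is plain XML, it can be inspected with standard tools. The Python sketch below lists the files a descriptor refers to. The sample descriptor is hand-written and heavily abbreviated for this example, and its disk file names are hypothetical; a real .ovf file contains many more sections.

```python
import xml.etree.ElementTree as ET

# Hand-written, abbreviated example; not taken from a real NMRbox export.
SAMPLE_OVF = """
<Envelope xmlns="http://schemas.dmtf.org/ovf/envelope/1"
          xmlns:ovf="http://schemas.dmtf.org/ovf/envelope/1">
  <References>
    <File ovf:href="example-disk1.vmdk" ovf:id="file1"/>
    <File ovf:href="example-disk2.vmdk" ovf:id="file2"/>
  </References>
</Envelope>
"""


def local_name(qualified):
    """Strip a '{namespace}' prefix from an ElementTree tag or attribute name."""
    return qualified.rsplit("}", 1)[-1]


def referenced_files(ovf_text):
    """Return the href of every File element in an OVF descriptor."""
    root = ET.fromstring(ovf_text)
    hrefs = []
    for element in root.iter():
        if local_name(element.tag) != "File":
            continue
        for attr_name, value in element.attrib.items():
            if local_name(attr_name) == "href":
                hrefs.append(value)
    return hrefs
```

Matching on local names rather than on a fixed namespace is a choice made so that the sketch does not depend on the exact OVF version of the descriptor.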

(OVA) A .ova file is a tar archive of an OVF-formatted VM. The tar command can be used to unpack the .ova file into a directory structure, but this is often unnecessary because hypervisor software has an import feature that handles it automatically.
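Since a .ova file is a tar archive, its contents can also be listed without unpacking it, for example to check which disk images it contains before importing. A minimal Python sketch using the standard tarfile module (the function name is hypothetical):

```python
import tarfile


def list_ova_members(ova_path):
    """Return the names of the files stored inside an .ova archive."""
    with tarfile.open(ova_path, "r") as archive:
        return archive.getnames()
```

The result is a list of member names in the order they are stored, typically the .ovf descriptor and the disk images.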

(VMDK) stands for Virtual Machine Disk and is an open virtual hard disk image format for VMs. The NMRbox VM is distributed as a .ova file containing a .vmdk disk image.

(VBOX) A .vbox file is much like the .ovf file above and is an XML file descriptor of the VM in Oracle’s VirtualBox format. While it is possible to edit the .vbox file directly, it is highly recommended that changes to the VM be made from the VirtualBox Manager.

(VDI) stands for VirtualBox Disk Image. A .vdi file, like a .vmdk file, is a virtual disk image in a format supported by Oracle’s VirtualBox. In the NMRbox VM we are using the VMDK format for the virtual disk image. However, if you create a separate disk image from within VirtualBox, you will be prompted for a disk image type and .vdi will be the default. Other choices will be VMDK, VHD (Virtual Hard Disk), HDD (Parallels hard disk), QED (QEMU enhanced disk, used with KVM), or QCOW (QEMU copy-on-write image, also used with KVM).
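As a quick reference, the disk image types named above can be collected in a small lookup table keyed by file extension. This Python sketch goes by the file name only and does not inspect the file contents, so it is a convenience rather than a reliable format check; the function and table names are hypothetical.

```python
from pathlib import Path

# Disk image types as described in the text above, keyed by file extension.
FORMAT_BY_EXTENSION = {
    ".vmdk": "VMDK (Virtual Machine Disk)",
    ".vdi": "VDI (VirtualBox Disk Image)",
    ".vhd": "VHD (Virtual Hard Disk)",
    ".hdd": "HDD (Parallels hard disk)",
    ".qed": "QED (QEMU enhanced disk)",
    ".qcow": "QCOW (QEMU copy-on-write image)",
}


def disk_format(filename):
    """Guess the disk image format from the file extension alone."""
    return FORMAT_BY_EXTENSION.get(Path(filename).suffix.lower(), "unknown")
```

For example, disk_format("data.VDI") gives the VDI entry, and a .ova file is reported as unknown because it is an archive rather than a disk image.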