P2V: How To Make a Physical Linux Box Into a Virtual Machine


Over the last four days, I've been exploring how to convert physical Linux boxes into virtual machines. VMWare has a tool for doing P2V conversions, as they're called, but as far as I can tell it only works for Windows physical machines and for converting various flavors of virtual machines into others.

I've had a Linux machine that I've used in my CS462 (Large Distributed Systems) class for years. The Linux distro has been updated over the years, but the box is an old 266MHz Pentium with 512MB of RAM. Overall, it's done surprisingly well--a testament to the small footprint of Linux. Still, I decided it was time for an upgrade.

Why Go Virtual

In an effort to simplify my life, I'm trying to cut down on the number of physical boxes I administer, so I decided I wanted the new version of my class server to be running on a virtual machine. This offers several advantages:

  • Fewer physical boxes to manage
  • Easier to move to faster hardware when needed
  • Less noise and heat

I could have just rebuilt the whole machine from scratch on a new virtual machine, but that takes a lot of time and the old build isn't that out of date (one year) and works fine. So, I set out to discover how to transfer a physical machine to a virtual machine. The instructions below give a few details specific to VMWare and OS X, but if you happen to use Parallels (or Windows), the vast majority of what I did is applicable and where it's not, figuring it out isn't hard. I've tried to leave clues and I'm open to questions.

Note: I've used this same process to transfer a VMWare virtual image to run on Parallels. There are probably easier ways, but this technique works fine for that purpose as well--it doesn't matter if the source machine is physical or virtual.

The Process

The first step is to make an image of the source machine. I recommend g4l, Ghost for Linux. There are some detailed instructions on g4l available, but the basics are:

  • Download the g4l bootable ISO and put it on a CD.
  • Boot it on the source machine.
  • Select the latest version from the resulting menu and start it up (you have to type g4l at the prompt).
  • Select raw transfer over the network and configure the IP address and the username/password for the FTP server you want the image transferred to.
  • Give the new image a name.
  • Select "backup" and sit back and watch it work.

Note that if you have more than one hard drive on the source machine, you'll have to do each separately. I found that separately imaging each partition on each drive worked best. One tip: there are three compression options. Lzop compresses, in this application, nearly as well as GZip or BZip but with much less CPU load. Compression helps not only with storing the images, but also with transferring them around on the 'Net, so you'll probably want some kind of compression.
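
g4l does all of this through its menus, but the idea of a compressed raw image is easy to sketch by hand. The snippet below is only an illustration, not what g4l runs internally: image_partition and restore_partition are hypothetical helper names, and the COMPRESSOR variable defaults to gzip on the assumption that lzop may not be installed (lzop accepts the same -c and -d switches).

```shell
# Hypothetical helpers, not part of g4l: a hand-rolled compressed raw image.
# COMPRESSOR is an assumption; export COMPRESSOR=lzop to try lzop instead.
COMPRESSOR=${COMPRESSOR:-gzip}

# Read every block of a partition (or any file) and store it compressed.
image_partition() {
    dd if="$1" bs=1M 2>/dev/null | "$COMPRESSOR" -c > "$2"
}

# Decompress an image and write it back block for block.
restore_partition() {
    "$COMPRESSOR" -dc "$1" | dd of="$2" bs=1M 2>/dev/null
}

# Example (device and file names are placeholders):
#   image_partition /dev/hda1 root.img.gz
#   restore_partition root.img.gz /dev/sda1
```

Because the image is a block-for-block copy, free space is imaged along with real data, which is one reason compression pays off here.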

The next step is to create a virtual machine and put the images on its drive(s). Create a virtual machine in VMWare as you normally would, selecting the right options for the source OS. When you get to the screen that asks "Startup virtual machine and load OS" (or something like that), uncheck the box and you should be able to change the machine options.

The first thing you need to do with the new VM is create the right number and size of hard drives--and partitions on those drives--to match the partition images you're going to restore.

For transferring single image machines to VMWare, just using the default drive, appropriately sized, worked fine. For more than one drive image, however, I found that making the drive type (SCSI/IDE) match the type on the source was the easiest thing to do. Note that VMWare won't let you make the main drive an IDE drive by default. You can always delete it and create a new drive that's an IDE drive if you need to.

The second thing you need to do with the new VM is set the machine to boot from the CD ROM since we've got to start up g4l on the target machine.

On VMWare, you can enter the BIOS by pressing F2 while the virtual machine is loading. This isn't as easy as it sounds since it starts quickly. Once you're there, however, it's a pretty standard BIOS setup and changing the boot order is straightforward. On Parallels this is easier since the boot order is an option you can change in the VM's settings.

If you're creating partitions on the drives, you'll need to boot from an ISO image for the appropriate Linux distro and create the partitions using the partition wizard, parted, or some other tool--whatever you'd normally do.

Next boot the VM from the g4l ISO image on your computer or the physical CD you made. If you have trouble, be sure the virtual CDROM is connected and powered on when the virtual machine is started. Start g4l and configure it the same way you did before, but this time, you'll select "restore" from the options. g4l should start putting the images from the source machine onto the target. If you have more than one hard drive or partition image, you'll have to restore each to a separate drive or partition--as appropriate--on the virtual machine.

When doing a raw transfer, you need to make the drives the same size as those on the machine you're moving the image from (I've found that larger works OK, but smaller doesn't). If the drives aren't big enough to hold the entire image, you'll get "short reads" and not everything will be transferred. Note that you won't get much complaint from g4l.
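
Since g4l won't warn you, it's worth comparing sizes yourself before restoring. Here is a minimal sketch of that check; target_fits is a hypothetical helper, and it assumes you've collected the byte counts on each machine with something like blockdev --getsize64 (the device names in the comments are just examples).

```shell
# Hypothetical helper: succeed only if the target can hold the whole source image.
# Byte counts can be read with, for example:
#   blockdev --getsize64 /dev/hda1    (on the source)
#   blockdev --getsize64 /dev/sda1    (on the target VM)
target_fits() {
    src_bytes=$1
    dst_bytes=$2
    if [ "$dst_bytes" -ge "$src_bytes" ]; then
        echo "ok: target has room for the whole image"
    else
        echo "too small: short by $((src_bytes - dst_bytes)) bytes"
        return 1
    fi
}
```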

The virtual drives should theoretically only take as much space as they need, but it turns out that since you're doing a raw transfer, you'll fill them up with "space." This is one of those instances where copying a sparse data structure results in one that isn't. This results in awfully large disks--make sure you've got plenty of scratch disk space for this operation. More on large disks later.
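
You can see the same effect on a small scale with an ordinary file. This is only an illustration on a scratch directory, assuming GNU coreutils (truncate, du); the file names are made up for the demo.

```shell
# Illustration only: a sparse file loses its holes when copied byte for byte.
demo=$(mktemp -d)

# 10 MB apparent size, but almost nothing actually allocated on disk.
truncate -s 10M "$demo/sparse.img"

# A raw byte-for-byte copy writes every zero out explicitly.
cat "$demo/sparse.img" > "$demo/raw-copy.img"

ls -l "$demo"            # apparent sizes are identical
du -k "$demo"/*.img      # allocated space: the copy is typically far larger
```

The exact du numbers depend on the file system, so treat them as indicative rather than guaranteed.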

Repairing and Booting the New Machine

Linux panics if the init RAM disk is not updated

Once the images are copied, you have to make them usable. If you just try to boot from them, you'll likely see something like the screenshot shown on the right: a short message followed by a kernel panic. Before you can use the new machine, you have to do a little repair work on the old images.

  • Get an emergency boot CD ISO for your flavor of Linux and boot the new virtual machine from it. Often you can just boot from the installation image and then enter a rescue mode. For example, for Red Hat, you can type "linux rescue" at the boot prompt and get into recovery mode.
  • It will search for Linux partitions and should find any you've restored to the machine. You'll have the option to mount these. Do so.
  • Now, use the chroot command to change the root of the file system to the root partition. Mount any of the other partitions that you need (e.g. /boot).
  • Run kudzu to find any new devices and get rid of old ones.
  • Use mkinitrd to create a new init RAM disk. This command should work:
    /sbin/mkinitrd -v -f /boot/initrd-2.2.12-20.img 2.2.12-20
    
    Of course, you'll have to substitute the right initrd name (look in /boot) and use the right version (look in /lib/modules).

If you get an error message about not being able to find the right modules, be sure that the last argument to mkinitrd matches what you see in /lib/modules exactly.
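
To avoid typos in the version string, you can generate the command from the directory names themselves. This is a rough sketch: suggest_mkinitrd is a hypothetical helper, it only prints commands rather than running them, and it assumes your initrd files follow the initrd-<version>.img naming shown above (check /boot to be sure).

```shell
# Hypothetical helper: print one mkinitrd command per kernel version found.
# The directory defaults to /lib/modules; nothing is executed, only printed.
suggest_mkinitrd() {
    modules_dir=${1:-/lib/modules}
    for path in "$modules_dir"/*/; do
        [ -d "$path" ] || continue          # skip if the glob matched nothing
        version=$(basename "$path")
        echo "/sbin/mkinitrd -v -f /boot/initrd-$version.img $version"
    done
}

# Run inside the chroot, then copy and paste the line for the kernel you boot:
#   suggest_mkinitrd
```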

Now, you should be able to boot the machine. With any luck, it should work.

Disk Size Issues

When you restore the image, your new sparse disk will grow to the size of the image, even if the image is only partially full of real data. For example, my Linux box had a 6GB drive (I told you it was ancient) that contained the root partition and a 100GB drive that I'd partitioned into two pieces: one 40GB partition mounted as /home and a 60GB partition mounted as /web. After restoring the images for these three partitions, I ended up with 6GB and 107GB files representing the virtual disks. This despite the fact that only 8GB of the 107GB actually contained any data.

Clearly, you don't want 107GB files hanging around if they can be smaller. One option is to do a file copy rather than an image. This would work fine for the /home and /web partitions in my case, but wouldn't have worked for the root partition--I wanted an image for that. If you've just got one big partition, then you can't use the file transfer option and still have exactly the same machine.

Fortunately there's a relatively painless way of reducing the size of the disk to just what's needed (thanks to Christian Mohn for the technique).

The first step is to zero out all the free space on each partition of the drive you want to shrink. This, in effect, marks the free space. You can do that easily with this command:

cat /dev/zero > zero.fill;sync;sleep 1;sync;rm -f zero.fill 

After this runs, you'll get an error that says "cat: write error: No space left on device". That's normal--you just filled the drive with one BIG file full of zeros, made sure it was flushed to the disk, and then deleted it.
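
If you have several partitions, the same steps can be wrapped in a small function and run once per mount point. This is a sketch under a few assumptions: zero_fill is a hypothetical helper, it uses dd rather than cat so that an optional size cap (in MB) can be passed for a trial run, and the mount points in the usage comment are the ones from my machine.

```shell
# Hypothetical helper: zero the free space under a mount point, then clean up.
#   $1  directory on the partition to fill
#   $2  optional cap in MB (for a trial run); omit it to fill all free space
zero_fill() {
    fill="$1/zero.fill"
    if [ -n "${2:-}" ]; then
        dd if=/dev/zero of="$fill" bs=1M count="$2" 2>/dev/null || true
    else
        # dd stops with "No space left on device" when the disk is full; that's expected.
        dd if=/dev/zero of="$fill" bs=1M 2>/dev/null || true
    fi
    sync; sleep 1; sync     # make sure the zeros actually reach the disk
    rm -f "$fill"
}

# Run as root inside the virtual machine, once per partition, for example:
#   for mp in / /home /web; do zero_fill "$mp"; done
```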

Next you can use the VMWare supplied disk management tool to do the actual shrinking. For VMWare Workstation Manager, you use vmware-vdiskmanager, but the version of this program that ships with Fusion doesn't support the shrink option. Note that this, and other support programs, are in

/Library/Application Support/VMware Fusion/

on OS X.

Fortunately, in OS X at least, there's another program, called diskTool in

/Applications/VMware Fusion.app/Contents/MacOS/

that does support the shrink option (-k 1). Running this command

diskTool -k 1 Luwak-IDE_0-1.vmdk 

on my large disk reduced it from 107GB to 8GB!

A few notes: Apparently you have to perform the shrink option on the disks for a machine before any snapshots have been taken. Also, be sure to run the zero fill operation in each partition on the disk. The shrinking option takes a little time, but it's well worth it. I haven't tried this in Parallels, but I suspect the disk compaction option would work. If someone tries it, let me know.

Conclusion

So, after a lot of experimentation, some playing around, and a lot of long operations on large files, I have a virtual machine that's a fairly accurate reproduction of the physical machine that it came from. I'll be testing it over the next few days to make sure it's usable.

On reflection, I needn't have been so faithful to the structure on the physical machine. I could have created the right number of partitions on one drive rather than creating multiple drives. After all, the new drive can be as big as I like. Maybe I'll do that next and see how things go...