A couple of weeks ago I started working with Proxmox.  Now that the ‘free’ options for premium hypervisors are drying up (there is no free Hyper-V Server SKU for Windows Server 2022; VMware is pulling all non-subscription licensing) I can see a growing trend toward less mainstream hypervisors.

Proxmox has actually been around for a while.  In essence, it’s a web-skinned Linux KVM.  Proxmox is built on Debian and relies heavily on the operator being somewhat proficient with Linux.  While it takes some of the guesswork out of the Linux experience, there is no way an operator can fully run Proxmox without dropping to bash occasionally.

Initially I was very impressed with Proxmox.  It installs very easily and the web console is both intuitive and provides quite a lot of help.  After watching a few training videos I could see that importing VMs can be done numerous ways, and that networking appeared to be fairly flexible.  As I continued down the road of converting my ESXi 6.7 hosts, though, I started to see where Proxmox still has a ways to go.

My first trial was to install Windows 10.  This went quite smoothly: I was able to load the VirtIO disk drivers into the installer without too much difficulty and get the VM booted.  After that, though, I encountered numerous issues with the VirtIO drivers and the Windows 10 image.  Once the VirtIO driver package and guest software were installed, it took no less than 2 hours to boot the image again.  It took quite a while to figure out that the issues were with VirtIO, so 3 builds later I was on SeaBIOS with the default emulated hardware and found that to be stable.  Not as quick as the VirtIO driver set (when that did eventually boot to a desktop), but stable and with far more reasonable boot times.
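For reference, a rebuild on default hardware can be done from the CLI as well as the web UI.  This is only a sketch – the VM id, storage name, and ISO path are examples, not from my actual build:

```shell
# Create a Windows 10 VM on SeaBIOS with default (non-VirtIO) devices:
# an IDE disk and an e1000 NIC, so no extra drivers are needed at install time.
# VM id 100, storage "local-zfs", and the ISO name are hypothetical.
qm create 100 --name win10 --memory 4096 --cores 2 \
    --bios seabios --ostype win10 \
    --ide0 local-zfs:64 \
    --net0 e1000,bridge=vmbr0 \
    --cdrom local:iso/Win10.iso
```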

My second trial was to add storage.  This was relatively easy.  On my second Proxmox node I used an old Datto Siris box I’d reset.  It has JBOD, no RAID to speak of, so I set up ZFS.  Within a couple of minutes I’d wiped out the VMFS 6 partitions and set up a single ZFS pool.  I also have a Synology NAS – 4 NICs, 4 bays.  I went the easy route there and just set up an SMB/CIFS share and connected it to Proxmox as storage.
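Both storage additions can also be done from the shell.  The device names, pool name, and Synology details below are stand-ins for illustration, not my actual values:

```shell
# Build a ZFS pool across the JBOD disks (wipes the old VMFS partitions):
zpool create -f pve-zfs /dev/sda /dev/sdb /dev/sdc /dev/sdd

# Register the pool with Proxmox as VM storage:
pvesm add zfspool pve-zfs --pool pve-zfs --content images,rootdir

# Add the Synology SMB/CIFS share (server, share, and user are examples):
pvesm add cifs synology --server 192.168.1.10 --share proxmox \
    --username proxmox --password "$SMB_PASS" --content backup,iso
```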

The third trial was a bit more daunting.  On my first Proxmox node I’d already created a Windows 10 instance.  My second node had nothing on it.  I set up the second node as the cluster to join, then discovered that the first node couldn’t join the cluster because it already had a VM running on it.  So I set up the first node as the cluster to join instead, and discovered that in the web UI there is absolutely no way to ‘unjoin’ or even remove the cluster config.  Now I had 2 nodes, each believing it was the master node.
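For anyone following along, the create/join itself is simple with pvecm – the cluster name and IP here are examples from a hypothetical lab:

```shell
# On the node that should seed the cluster:
pvecm create homelab

# On the joining node, pointing at the seed node's IP:
pvecm add 192.168.1.20

# Verify membership from either node:
pvecm status
```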

After a lot of searching I finally figured out how to remove the cluster info from both servers so I could try again.

To Remove a Node:

  1. SSH to host (better to not use the Web UI Shell for this)
  2. Check nodes
    pvecm nodes
  3. Identify the node you want to remove by its Node id
  4. Remove the node
    pvecm delnode <nodeid>
  5. Now we want to remove the Cluster
    1. Set the cluster to indicate that there’s now only the 1 member
      pvecm expected 1
    2. Stop the Cluster Service
      systemctl stop pve-cluster
    3. Start the PVE local service
      pmxcfs -l
    4. Delete the PVE cluster conf files and the lock file
      rm -r /etc/pve/cluster.conf /etc/pve/corosync.conf
      rm -r /etc/cluster/cluster.conf /etc/corosync/corosync.conf
      rm /var/lib/pve-cluster/.pmxcfs.lockfile
    5. Stop the Cluster Service (again)
      systemctl stop pve-cluster
    6. Reboot

The first time I did this, the /etc/pve/cluster.conf and /etc/cluster/cluster.conf files did exist.  Of course I’ve blown this up several times now, and on later attempts those files were no longer there.  Execute the commands anyway (it doesn’t hurt anything) and ignore the warnings that these two files cannot be found.

I did eventually get the cluster to connect the nodes, and the Proxmox web UI quite conveniently shifted to a Datacenter console to manage both the nodes and the storage.  Unfortunately, unless both nodes are identical it’s not quite so simple to seamlessly move VMs around, but it does work quite well.

My fourth trial was to migrate VMs from my VMware ESXi host to Proxmox.  The first one I selected was a VM called ‘Unifi Controller’.  It’s a VM I no longer use, but it’s small, making it relatively portable, and since it’s no longer in use there was no danger of losing anything critical.

This ended up being quite the exercise.  The space in the name caused a lot of problems.  Exporting the VM from the VMware web UI was no issue, nor was uploading it to Proxmox.  The Debian OS had no trouble with it either, simply escaping it as Unifi\ Controller, but the ‘qm importovf’ command native to Proxmox couldn’t handle it, converting the space to %20 instead.  Eventually I worked out that if I renamed the .ovf and .vmdk files to something without a space, and edited the .ovf file to match the file names I’d just reset, then the ‘qm importovf’ command was able to blast through the conversion from VMDK to RAW format and add the VM to the node.  Obviously I did have to recreate the network config, but that was relatively easy and the VM came up without issue.
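The rename-and-edit workaround looks roughly like this.  The first few lines create stand-in files in a scratch directory to simulate the export (a real export produces these files), and the sed pattern assumes the .ovf references the disk by its file name:

```shell
# Scratch directory with stand-in files simulating the VMware export
cd "$(mktemp -d)"
printf '<References><File ovf:href="Unifi Controller-1.vmdk"/></References>\n' > "Unifi Controller.ovf"
: > "Unifi Controller-1.vmdk"

# Drop the space from both file names
mv "Unifi Controller.ovf" unificontroller.ovf
mv "Unifi Controller-1.vmdk" unificontroller-1.vmdk

# Update the disk reference inside the .ovf to match the renamed file
sed -i 's/Unifi Controller-1\.vmdk/unificontroller-1.vmdk/g' unificontroller.ovf
```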

Convert the VMDK disk to RAW:
qemu-img convert -f vmdk /pve-zfs/Unifi\ Controller-1.vmdk -O raw /pve-zfs/Unifi\ Controller/Unifi\ Controller-1.raw -p

Import the VM to Proxmox (after renaming the files to unificontroller.xxx):
qm importovf 200 unificontroller.ovf pve-zfs

The problem with this process is the download/upload of the OVF from VMware.  If you’re exporting a 1 TB server and you’re on a laptop with only 500 GiB of storage, you can’t get there from here.  So the more logical approach is to import the OVF directly from ESXi to Proxmox, which turns out to be fairly easy through a program called ovftool (a VMware utility – download it from their site and install it on your Proxmox server):

ovftool --overwrite vi://192.168.1.5/FileServ /pve-zfs

I had to add --overwrite because this process failed numerous times, and when it does fail you have to start over.

The syntax is easy:

  • ovftool – the command
  • --overwrite – option
  • vi://192.168.1.5/FileServ – the virtual interface IP address of your ESXi host and the /VM-Name
  • /pve-zfs – the storage name you are importing to

The only problem I have with this process is that it’s very slow.  The VM must be shut down to execute the export/import and I’ve found this process on a 1 TB transfer can easily take a day.  I’m going to keep trying to figure out if there’s a way to speed this up.  If I do, I will document it here.