VI3 pdf updated to version 1.2.1

Last night I updated my VI3 card with a smattering of new bits and pieces.  I made small changes to the Networking, MSCS, VMs and VirtualCenter sections.  This isn’t a big update and you’d need a keen eye to spot anything.  A big thanks to Adam White for a couple of corrections he alerted me to.

The new goodness can be downloaded now.

I have a couple of bigger bits I want to add in, but it might take a few days to collate the information.  Keep checking back for version 1.3 sometime soon 😉

Problems with Storage VMotion

I’ve been using the new Storage VMotion functionality fairly heavily over the last couple of months.  I have encountered a couple of significant bugs whilst using it, which have resulted in corrupted VMDK files.  Here are my two tips when using Storage VMotion, to avoid the problems I have seen:

  • Ensure that you have sufficient space in the source datastore, as Storage VMotion uses snapshots to transfer the VMDKs over.  This is a particular problem with large, frequently changing VMs, e.g. database servers.  Large disks mean that the whole transfer can take several hours, and frequently changing data means large snapshot files.  If sufficient space is not available, the datastore can fill up, preventing any more data from being written to the disks and causing the Storage VMotion to fail.  I would recommend you have empty space equivalent to the size of the VM’s disk files before you start (a quick way to check this from the console is shown after this list).  VMFS extents are an excellent solution to this short-term problem.
  • The other problem I have encountered occurs when the VM has its VMDK disk files spread across more than one datastore.  The operation begins successfully, but during the transfer the host gets confused and the process fails.  The disks have snapshots applied, but are corrupt.  Unfortunately, the only way to avoid this situation is to cold migrate all the VM’s files onto one datastore first.
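
As a rough pre-flight check for the first point, the service console can tell you both numbers.  This is just a sketch, and <source_datastore> and <vm_name> are placeholders for your own names:

vdf -h
du -sh /vmfs/volumes/<source_datastore>/<vm_name>

vdf reports the free space on each VMFS datastore and du totals up the size of the VM’s files; if the free space isn’t at least as big as the du figure, I’d think twice before kicking the migration off.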

Great care has to be taken to recover disks after a Storage VMotion failure, to avoid total data loss.

When your mind goes blank

For anyone who doesn’t subscribe to Duncan Epping’s great blog, he reminded me of this great command line snippet:

Enter maintenance mode from the ESX command line:
vimsh -n -e /hostsvc/maintenance_mode_enter

I have used this command several times in anger.  It is particularly useful if you can’t get to your VirtualCenter server for whatever reason.
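
For completeness, the matching call to bring the host back out of maintenance mode lives in the same hostsvc namespace (worth a quick check against your own build before relying on it):

vimsh -n -e /hostsvc/maintenance_mode_exit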

There are some commands which are definitely worth learning.  Fortunately you can get by with just a small handful.

Sometimes people ask me why they would want to learn CLI commands, when the VI Client provides a nice GUI.  This topic is worthy of a long post of its own, but to cover some of the obvious ones:

  • Some people find it easier/quicker than the VI Client
  • Some functionality is still not available from the VI Client
  • CLI allows for scripting
  • Sometimes VirtualCenter is not available (particularly if it is a VM itself)
  • If the VirtualCenter database server is offline, VirtualCenter itself can become unworkable
  • If the ESX host’s console networking gets screwed
  • You have physical access to the ESX consoles but no way to get to a Windows GUI

When things need fixing quickly, usually at some ungodly hour of the night, those CLI commands can be difficult to remember.  However, as long as you know a bit about the CLI, if you remember these simple things then you can do just about anything:

ls /usr/sbin
This lists out most commands that you would use.  Some of the command names aren’t that obvious e.g. vmkfstools.  This listing will often help you remember what you are looking for, or spark an idea when your mind has gone blank.
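
If the full listing is too long to scan, narrowing it down helps.  Most of the VMware-specific tools live in /usr/sbin and follow the esxcfg- naming convention, so this is usually enough of a nudge:

ls /usr/sbin | grep esxcfg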

-h
Once you’ve figured out which command it was that you were looking for, run it with the -h option.  For all VMware commands this will show a short listing of all the command’s options, switches and syntax; very similar to /? from DOS.  You can use the man command if you need a more detailed explanation, but -h usually does enough to spark some synapse.
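
For example, taking one of the esxcfg- tools from the listing above:

esxcfg-vswitch -h

which prints the usage summary for the vSwitch configuration tool without changing anything.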

-l (l for lima)
Often worth remembering, as this switch will usually list the current configuration set.
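
Sticking with the same example:

esxcfg-vswitch -l

lists the current vSwitches with their port groups and uplinks.  Most of its siblings (esxcfg-nics, esxcfg-vswif, esxcfg-nas and so on) behave the same way with -l.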

ls /var/log
These days, just about all the important log files are here (and in its sub-directories).  With this command you can see which log files you might want to look at, and then view them with a simple tail command (or even grep).
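
On an ESX 3.x host the VMkernel logs are the ones I end up in most often, for example:

tail -20 /var/log/vmkernel
grep -i scsi /var/log/vmkwarning

(file names from memory, so check the ls output first if your build differs).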

/sbin/service --status-all
This command will produce a fair amount of information so you can pipe it to grep to look for running or stopped, or query the status of a single service.  However it can also provide some useful extras – for example it tells you what registered services have the local firewall ports opened for them.
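
A couple of examples of what I mean; treat mgmt-vmware as an assumption here, it is the host agent service on the ESX 3.x boxes I look after:

/sbin/service --status-all | grep -i running
/sbin/service mgmt-vmware status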

history
Most commands I run, I have usually run before.  This is the ultimate reminder.
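
Combined with grep it becomes a search of everything you have typed before, e.g.:

history | grep esxcfg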

You see, very simple commands, but great starting points.  Unfortunately, now that the basic VMware Install and Configure course doesn’t cover much (if any?) CLI stuff, it is often thought of as a real black art.

It is always worth reminding people that even if the VirtualCenter server is unavailable, you do have a couple of other options in an emergency:

  • You can connect the VI Client directly to the ESX host servers.  A lot of less experienced people think of the VI Client as VirtualCenter.  However, if VirtualCenter is unavailable, the VI Client can still connect directly to the hosts.
  • VI Web Access.  You can connect to each host server via its Web Access page at https://hostname/ui.  From Web Access you can do basic VM management.  Even VirtualCenter itself hosts its own Web Access portal at https://VC_hostname/ui.

These two methods are often forgotten in a panic, but they have helped me countless times.

You don’t always need to remember how to do everything, just how to help yourself remember.

Interesting SRM things

I’ve been reading up on the new Site Recovery Manager (SRM) product.  Here are some things I’ve noticed, that others might be interested in:

  • NFS is not currently supported.  Only Fibre Channel and iSCSI.
  • Raw Device Mappings (RDMs) are not supported.  This will really hit those using MSCS for critical services – I suppose this makes “Cluster in a box” more viable. (see the update below)
  • SRM per-host licensing at the Primary site, but per-CPU at the DR site (presumably host licenses also at the second site if you need to be able to fail both ways, or be able to fail-back from a disaster).
  • SRM uses FLEX (FLEXnet) licensing.
  • SRM server components are only supported on 32-bit Windows Server (not 64-bit).
  • Only MS SQL and Oracle supported for the SRM database.
  • One VirtualCenter and one SRM license on each site.
  • Only two sites are really supported (a protected site paired with a recovery site).

I haven’t discovered which firewall ports need to be opened between the two sites, and between the SRM server, SRM DB server, VC server and ESX hosts.

SRM log files:
C:\Documents and Settings\All Users\Application Data\VMware\VMware Site Recovery Manager\Logs

dr-ip-reporter.exe
SRM introduces a new Windows command-line tool, which generates an XML file detailing the network structure on both sites.

Maximums:

  • Protected VMs: 500
  • Protection groups: 150
  • Replicated LUNs: 150
  • Running recovery plans: 3

Update: I had indicated that you couldn’t use RDMs with SRM.  The VMware documentation states that VMFS volumes are a prerequisite, and because of this, and the omission of any mention of RDMs, I had assumed that RDMs were not supported.  NetApp’s documentation covering their SRA (Site Recovery Adapter) also only mentions VMFS volumes.

However, at the weekend I watched one of Mike Laverick’s SRM videos, in which he commented about a success with an RDM LUN.  This doesn’t necessarily mean that RDMs are supported with SRM, just that it can work.  VMware often include functionality without supporting it, until they can test it thoroughly themselves.  I would expect that this is only possible with Virtual Compatibility Mode RDMs, and would be dependent on your particular SRA/storage appliance.

Does anyone know if RDMs are officially supported?

Update 2: Here is the official word I was looking for.  Hiding in the SRM 1.0 release notes 🙂

Experimental Support for RDMs
SRM supports RDMs experimentally. VMware encourages you to try features in test environments and report issues, but do not put RDMs in production SRM workflows.