ESX 4.1 to include Likewise AD authentication?

*** Please note, I am not in any ESX4+ beta programme, so anything I write below is not covered by an NDA. I found this openly published on the internet ***

Following my last post about ESX and AD authentication, I have been investigating how I could refine things.  This caused me to take a closer look at Likewise’s solutions, which I have used previously for managing Apple Macs in an AD environment.  Whilst digging around their site, I noticed that VMware ESX was a supported option.  So I moved to their forum to see if I could find any users who had implemented this to find out what their experience had been like.  A simple search for VMware popped up this thread: http://www.likewise.com/community/index.php/forums/viewthread/542/ posted on the 10th December by one of the forum’s Administrators.

(The emboldening is my own emphasis)

Q: Which VMware products are supported by Likewise?

A: VMware ESX and ESXi 4.1 are the first VMware products to provide Likewise based Active Directory authentication as part of its hypervisor host OS.  VMware provides full support for the Likewise technologies in its platform.  Likewise Open and Likewise Enterprise are supported on previous versions of VMware.  For more information, please contact [email protected] or post a question to the VMware Virtualization forum.

Q: What components of Likewise Open are included in VMware?

A: VMware has licensed the Likewise Identity Service from Likewise Software and integrated it into its hypervisor host operating systems ESX and ESXi.  This includes the components required to support domain join, authentication and name based lookups of users among other features.

Q: How do I join a VMware 4.1 ESX or ESXi server to Active Directory?
A: VMware ESX 4.1 system is in early beta.  Contact VMware for directions on joining to AD.

Q: Are event logging and group policy features available for VMware?
A: Event logging and group policy features are unique to Likewise Enterprise.  These are not available on ESXi systems.

Q: Is VMware Server on other OS distributions supported?

A: Yes, as long as the OS is supported by Likewise.  sudo can be used with VMware and Likewise to control access to the VMware management commandline.

Q: Can I install Likewise Enterprise or Likewise Open agents on an existing VMware 4.1 system?
A: This is not currently supported in Likewise 5.3 and VMware 4.1 is still in beta.  Stay tuned to the forums for updates.

Q: Is VMware vMA supported by Likewise Enterprise or Likewise Open?

A: vMA is the vSphere Management Assistant, a Red Hat Linux VM used to enable automation and troubleshooting scenarios with ESXi which doesn’t normally support a service console.  As a Red Hat compatible distribution, Likewise is supported on this system, but may require specific changes or additional packages.

Q: I installed Likewise on a VMware 4.0 system and the domain-join failed.  How can I get it to join properly?

A: The pam configuration of VMware changed from 3.5 to 4.0.  Likewise 5.3 does not currently support these changes.  However, the join can be completed with instructions from [email protected].

This is certainly exciting news as far as I’m concerned.  Likewise provides some great functionality, and should make user management in ESX much easier for Enterprise deployments.  You can read about the features of the Likewise Identity Service, which is the component that VMware is licensing.

Here’s a quick rundown of a few of the nice things it might offer:

  • Authenticate with AD users and groups. AD schema changes not required.
  • Cached credentials support if the DCs are unavailable.
  • Backup alternative to ntpd via AD.
  • Support for AD site affinity.
  • Support for multiple forests.

You think you might find this useful?

AD and sudo integration in kickstart

Following on from my last post about kickstart scripts which looked at partitioning, this one concentrates on user account provisioning.  There are lots of useful guides online about how to configure user accounts, however none that fitted all my requirements.  So nothing below is groundbreakingly new, but it does demonstrate a complete working solution.

I had 2 basic requirements that I wanted to implement:

  • AD integration for passwords

Although the thought of making the ESX hosts reliant on a Microsoft technology gives me the “willies”, it is the de facto authentication method in most enterprises.  I didn’t want everyone logging in under the one account, but password management for multiple accounts quickly becomes impossible when you have more than a handful of host servers.  AD integration means you can offload the burden of maintaining local passwords.

  • Use of sudo

In my experience, it has become quite common for companies to create a single root password across all their ESX servers and share this amongst the administrators.  These days no-one would create a single Domain Admins account for their Windows computers and share this around their staff, encouraging everyone to log in with it.

There are several approaches to reducing the (obvious) risk that this creates.  For example, VMware disables root access via SSH as a default, but this is usually the first thing most people enable once the install is finished.  I don’t purport to be any sort of security expert, and I certainly don’t think my solution below is the most secure possible, but I do consider it a sensible balance of security versus convenience. We all know that if it’s anything more than a mild nuisance, then we’ll just break it open.

How to implement this in a kickstart script

I will explain each part of the script, but it is worth noting that all the commands can be run on the Service Console, or from a shell script, if you want to retroactively fit this sort of user model to an existing server.  It was tested to run on ESX 4 servers, but should run fine against ESX 3.x hosts.

%post --interpreter=bash

# Enable  AD Authentication
/usr/sbin/esxcfg-auth --enablead --addomain=[DOMAIN] --addc=[DOMAIN]

This allows the local accounts to authenticate against your AD domain.  I found the --addc option would run fine if I just specified the domain instead of hard coding it to an individual DC.  There are several additional switches available for Kerberos authentication; however, I found that in my test environment I didn’t need to stipulate them.  Your mileage will undoubtedly vary, depending on your AD mode and setup.  There are some excellent guides out there, if you need to add this in.

# Give new accounts the path variables to run esxcfg commands
sed -e 's|PATH=\$PATH:\$HOME/bin|PATH=$PATH:/usr/local/sbin:/sbin:/usr/sbin:$HOME/bin|g' /etc/skel/.bash_profile > /etc/skel/.bash_profile.new
mv -f /etc/skel/.bash_profile.new /etc/skel/.bash_profile

This adds in all the normal root path variables to new user accounts, so when using sudo you don’t need to specify the whole path. This is one of those things that isn’t strictly necessary, but without it, using sudo is such a pain for the uninitiated that users get fed up with “change”.
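If you want to check the substitution before baking it into a build, it can be run against a scratch copy first.  This is only a sketch: it assumes the stock profile contains the line PATH=$PATH:$HOME/bin, and the function name and scratch file are my own, for illustration.

```shell
#!/bin/bash
# Hypothetical helper: add the sbin directories to the PATH line of a profile.
# Assumes the file contains the literal line PATH=$PATH:$HOME/bin.
add_sbin_paths() {
    local profile="$1"
    # Do nothing if the file has already been patched, so a re-run is harmless
    grep -q '/usr/local/sbin:/sbin:/usr/sbin' "$profile" && return 0
    # "|" is the sed delimiter so the slashes in the paths need no escaping
    sed -e 's|PATH=\$PATH:\$HOME/bin|PATH=$PATH:/usr/local/sbin:/sbin:/usr/sbin:$HOME/bin|' \
        "$profile" > "$profile.new" && mv -f "$profile.new" "$profile"
}

# Try it on a scratch file rather than /etc/skel/.bash_profile
scratch=$(mktemp)
printf '%s\n' 'PATH=$PATH:$HOME/bin' 'export PATH' > "$scratch"
add_sbin_paths "$scratch"
cat "$scratch"
```

Using a different delimiter for sed is the main point here, as the paths themselves are full of forward slashes.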

# Help identify when logged in as root
echo "PS1='\[\e[31m\]\u@\h:\w#\[\e[m\]'" >> /root/.bashrc
echo "PS1='\[\e[32m\]\u@\h:\w#\[\e[m\]'" >> /etc/skel/.bashrc

Again another nicety that I like to add in.  It just helps to highlight when you are “su”ing or logging in as root.

# Add enterprise Groups and Users
/usr/sbin/groupadd -g 5000 esxadmin
/usr/sbin/useradd -u 501 -G esxadmin tom -m
/usr/sbin/useradd -u 502 -G esxadmin dick -m
/usr/sbin/useradd -u 503 -G esxadmin harry -m

# Add local users needing admin access
# /usr/sbin/useradd -u 601 -G esxadmin [LOCAL_USER1] -m
# /usr/sbin/useradd -u 602 -G esxadmin [LOCAL_USER2] -m

Firstly, this creates a group called “esxadmin”.  It then creates local accounts for 3 users: tom, dick and harry and adds them to the group. The second section is commented out, but allows for additional accounts to be added.  My thinking here is that in a largish enterprise environment there will always be some users that need to log into all ESX servers – your “domain admins” of the ESX world if you like.  You would leave their names in the script for all your servers.  However, you’re likely to have some administrators that are specific to just a few local servers, so these would be added in on a per server basis.  The usernames used here have to match their AD usernames.
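If the list of global admins gets longer, the useradd lines can be generated rather than typed.  The sketch below is a dry run that only prints the commands; the function name is hypothetical and the starting UID of 501 is simply the one used above.

```shell
#!/bin/bash
# Hypothetical dry-run helper: print the useradd commands for a list of admins,
# numbering the UIDs upwards from a starting value. Nothing is created.
print_useradd_cmds() {
    local uid="$1"
    shift
    local user
    for user in "$@"; do
        echo "/usr/sbin/useradd -u $uid -G esxadmin $user -m"
        uid=$((uid + 1))
    done
}

print_useradd_cmds 501 tom dick harry
```

The output can be pasted into the %post section, which keeps the UIDs consistent across every host built from the same script.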

# Add esxadmin to sudoers
echo "#" >> /etc/sudoers
echo "# Allow esxadmin group to sudo" >> /etc/sudoers
echo "%esxadmin ALL = (ALL) ALL" >> /etc/sudoers

This allows all members of the esxadmin group to run commands using sudo with effectively the elevated privileges of root.
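If you are retrofitting this to an existing server, running the script twice would add the sudoers entry twice.  A small guard avoids that.  This is a sketch with a hypothetical helper name, demonstrated on a scratch file rather than the real /etc/sudoers.

```shell
#!/bin/bash
# Hypothetical helper: append a line to a file only if that exact line
# is not already present.
append_once() {
    local file="$1"
    local line="$2"
    # -x matches whole lines only, -F treats the text as a fixed string
    grep -qxF -- "$line" "$file" 2>/dev/null || echo "$line" >> "$file"
}

# Demonstrate on a scratch file; on a host the target would be /etc/sudoers
scratch=$(mktemp)
append_once "$scratch" "%esxadmin ALL = (ALL) ALL"
append_once "$scratch" "%esxadmin ALL = (ALL) ALL"
cat "$scratch"
```

On a real host it is worth running visudo -c afterwards to confirm the file still parses, as a broken sudoers file locks everyone out of sudo.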

# Allow ROOT access using SSH
sed -e 's/PermitRootLogin no/PermitRootLogin yes/' /etc/ssh/sshd_config > /etc/ssh/sshd_config.new
mv -f /etc/ssh/sshd_config.new /etc/ssh/sshd_config
service sshd restart

Now this section is a little controversial :).  Why go to all this trouble and then allow root access via SSH?  Well, I have included it for completeness, as it’s a common request.  There is a good reason that you may choose to include it though.  If the service console cannot connect to a DC for whatever reason (networking problem, DC is offline, vswif0 is screwed,…), then you won’t be able to log in with one of your local esxadmin accounts.  Imagine your whole environment is virtualised including all DCs and you start to see the chicken and egg possibilities. However, you can always log in with the root password.  So this isn’t an issue if all your hosts are in the server room next door, you have an iLO/RSA/DRAC card in them all, or have remote access to the console KVM.  If you don’t, then you might want to leave this in.

# Enable the SSH client (outbound from an ESX host)
/usr/sbin/esxcfg-firewall -e sshClient

This just lets you bounce from one server to the next.  Effectively saves you having 8 different putty sessions open on your desktop at once.  It also allows you to do things like SCP files across to another host.

# Enable TCP outgoing kerberos, there are issues with udp and enable blockOutgoing
/usr/sbin/esxcfg-firewall --openport 88,tcp,out,KerberosClientTCP
/usr/sbin/esxcfg-firewall --openport 53,tcp,out,dns
/usr/sbin/esxcfg-firewall --blockOutgoing

Lots of people warned that the above was needed to get around some issues with the AD authentication.  I’m not sure if this has been fixed since then, and haven’t had a chance to test it myself, so I’ve included it here.

# Remove dangerous default of ctrl-alt-del from inittab
sed -e 's/ca::ctrlaltdel/# ca::ctrlaltdel yes/' /etc/inittab > /etc/inittab.new
mv -f /etc/inittab.new /etc/inittab

This snippet stops Ctrl-Alt-Del at the console from rebooting the host, by commenting out the relevant inittab entry.  I’ve been told that this default is going to be changed in an upcoming patch, but until then this removes the threat.
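To see exactly what the substitution leaves behind, it can be run against a sample line instead of the real file.  The sample below is the usual Red Hat style ctrlaltdel entry, which I am assuming is what the Service Console ships with; the function name is just for illustration.

```shell
#!/bin/bash
# Apply the same substitution to text on stdin instead of /etc/inittab.
comment_out_cad() {
    sed -e 's/ca::ctrlaltdel/# ca::ctrlaltdel yes/'
}

# Assumed stock entry (Red Hat style); check your own /etc/inittab
echo 'ca::ctrlaltdel:/sbin/shutdown -t3 -r now' | comment_out_cad
# prints: # ca::ctrlaltdel yes:/sbin/shutdown -t3 -r now
```

The stray “yes” ends up inside the comment, so it is harmless.  In a kickstart build the host reboots anyway, but on a running host init would need to re-read the file (init q) before the change takes effect.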

# SSH Legal Message…
echo >> /etc/banner
echo "*************************************************************************" >> /etc/banner
echo "*   Legal banner if required                                            *" >> /etc/banner
echo "*************************************************************************" >> /etc/banner
echo >> /etc/banner
echo "Banner /etc/banner" >> /etc/ssh/sshd_config

If you need a message displayed when a user connects over SSH, then this takes care of it.

# Create post config script
cat << EOF > /etc/rc3.d/S99postconf
#!/bin/bash

# Allow hostd etc. some time to load
/bin/sleep 90

# Grant the group named esxadmin admin permission to ha-folder-root
/usr/bin/vmware-vim-cmd vimsvc/auth/entity_permission_add vim.Folder:ha-folder-root esxadmin true Admin true

# Reset system to normal boot mode
echo "Removing automated post script."
rm /etc/rc3.d/S99postconf
EOF
chmod +x /etc/rc3.d/S99postconf

This last section runs after the first reboot and gives the local esxadmin group “Administrator” privileges.  This allows the local accounts in the esxadmin group to log into the host directly with the vSphere GUI client.

What’s the end result?

Once all these steps are implemented, the users tom, dick and harry can log into their ESX server using their regular AD accounts and passwords.  They will be able to run commands that normally need root privileges using sudo, all without having to know the root password.  All the commands will be logged against their own user accounts so everything is now auditable and a bit more SOX compliant.

New: esxtop precis

This precis is a handy little guide to using the premier vSphere performance tools – esxtop and vscsiStats.  These are some of the most useful tools in a VMware administrator’s arsenal, but as most people don’t need to use them daily, it can be difficult to remember what’s important.

The card was inspired by Duncan Epping’s fantastic post, highlighting what he regards as the more important fields, providing some thresholds, and what you might look at doing to resolve the problems.  esxtop provides so much potential information, it can be overwhelming trying to figure out what you should be looking for.  When Mr VMware says this is what he looks for, then you sit up and take notice.

I’ve also added in a short guide to using vscsiStats.  This tool helps to identify performance issues specifically with your storage.  Most performance experts will tell you that storage issues lead to more performance problems than any other single bottleneck.

I’ve formatted this precis as credit card sized, so you can slip it into your wallet and always have it on you.  It would be great to print out on business cards (Duncan thinks I should make a batch for VMworld :)), unfortunately business cards are different sizes the world over.  Hence the credit card size, as they are the same and it should fit in everyone’s wallet (or purse) nicely.

Anyone know where you can get credit card sized things nicely printed or laminated?

Head over to the esxtop page and grab yourself the latest version.

Create local VMFS with 8MB block size during ESX4 kickstart install

I’ve been creating a new kickstart script, to automate ESX 4 deployments.  There were a couple of pieces missing that I couldn’t find a good solution to online.  This was the first. Hopefully recording it here might save someone else a bit of time.

Before I start, I would like to point everyone to this excellent blog post by Mike La Spina.  If you are thinking about using kickstart for your ESX deployments, then this post is fantastic.  Thanks Mike.

Problem: When you create VMFS partitions during the ESX install, they are all created with a 1MB block size.

To get around this, I found a couple of suggestions:

  • This forum post has a suggestion by Mike, however it doesn’t seem to work as advertised.
  • PatrickD has an interesting solution on the forums, which runs before the install and effectively changes the default to 8MB (I found this via Duncan’s post here).  However this would mean all VMFS partitions would take this default during the install and I’m always concerned that the format on the DVD might change and break the script.
  • Gabe has a great post here, about doing this manually at the command line.  But I wanted something scriptable.

To be clear, I wanted a VMFS partition that would house my esxconsole.vmdk file and a separate VMFS partition to fill the rest of the remaining local disk.  This has the advantage of splitting the VMs from the OS, meaning VM snapshots are never going to impact your OS and rebuilding the OS at a later stage should be much less invasive on your precious data.

Here is the partition structure that I was looking for:

# Create new partitions
part /boot --fstype=ext3 --size=1100 --onfirstdisk
part None --fstype=vmkcore --size=100 --onfirstdisk
part [HOSTNAME]-cos --fstype=vmfs3 --size=40960 --onfirstdisk
part temp_partition --fstype=vmfs3 --size=1024 --maxsize=2047000 --grow --onfirstdisk
virtualdisk cos --size=35840 --onvmfs=[HOSTNAME]-cos
part swap --fstype=swap --size=1600 --onvirtualdisk=cos
part /var --fstype=ext3 --size=2048 --onvirtualdisk=cos
part /tmp --fstype=ext3 --size=2048 --onvirtualdisk=cos
part /home --fstype=ext3 --size=2048 --onvirtualdisk=cos
part /opt --fstype=ext3 --size=20480 --onvirtualdisk=cos
part / --fstype=ext3 --size=5120 --onvirtualdisk=cos

This creates the following physical partitions (you can see this with fdisk after the build):

  1. primary partition – /boot
  2. primary partition – vmkcore
  3. extended partition
  4. unused primary partition
  5. VMFS partition for my esxconsole file (40GB)
  6. VMFS partition for my VMs
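It is worth checking that the partitions placed on the cos virtual disk actually fit inside it, and that the virtual disk fits inside its VMFS partition.  The sketch below just adds up the sizes (in MB) from the kickstart lines above; the variable names are my own.

```shell
#!/bin/bash
# Sizes in MB, copied from the kickstart partition lines
cos_vmfs=40960      # [HOSTNAME]-cos VMFS partition
cos_disk=35840      # virtualdisk cos
# swap + /var + /tmp + /home + /opt + /
parts_total=$((1600 + 2048 + 2048 + 2048 + 20480 + 5120))

echo "partitions on cos: ${parts_total} MB"
echo "free inside the virtual disk: $((cos_disk - parts_total)) MB"
echo "free on the cos VMFS: $((cos_vmfs - cos_disk)) MB"
```

That works out at 33344 MB of partitions, leaving 2496 MB spare inside the virtual disk and 5120 MB spare on the VMFS partition that holds it.  Some of that headroom is needed for the VMFS metadata itself, so I wouldn’t trim it too aggressively.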

It is the last partition that I want to create with an 8MB block size. So using the great suggestion from Gabe above, I’ve made it work in the kickstart script.

To do this, I create the VMFS partition during the kickstart install in the usual way with this line:

part temp_partition --fstype=vmfs3 --size=1024 --maxsize=2047000 --grow --onfirstdisk

This creates a partition called “temp_partition”, 1GB in size and grows it to fill the remaining space.

Then I add a script in the post install section, which runs after the first reboot (thanks to Mike for the script format):

# Create post config script
# Quoting 'EOF' stops the installer shell expanding the backticks and $disk now
cat << 'EOF' > /etc/rc3.d/S99postconf
#!/bin/bash

# Allow hostd etc. some time to load
/bin/sleep 90

# Recreate VMFS partition for VMs to an 8MB block size
disk="`ls /vmfs/devices/disks/vml*6`";vmkfstools -C vmfs3 -b 8m -S [HOSTNAME]-01 $disk

# Reset system to normal boot mode
echo "Removing automated post script."
rm /etc/rc3.d/S99postconf
EOF
chmod +x /etc/rc3.d/S99postconf

The important line here is:

disk="`ls /vmfs/devices/disks/vml*6`";vmkfstools -C vmfs3 -b 8m -S [HOSTNAME]-01 $disk

Basically it sets a variable “disk”, which is equal to the original disk label given to partition 6 created during the install.  It then uses that to recreate the disk but with an 8MB block size.  How come I know it’s always going to be disk 6?  I can do this because I know my partitioning structure is always going to be the same – it’s scripted!
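The vml*6 pattern matches any device name ending in 6, so on a host with more than one disk it could return several entries.  As vmkfstools -C is destructive, a guard that insists on exactly one match is cheap insurance.  This is only a sketch: the helper name is hypothetical and the demo uses made-up file names in a temporary directory rather than real devices.

```shell
#!/bin/bash
# Hypothetical guard: succeed (and print the path) only when the glob passed in
# expanded to exactly one existing entry.
single_match() {
    # An unmatched glob is passed through literally, hence the -e test
    if [ "$#" -ne 1 ] || [ ! -e "$1" ]; then
        echo "expected exactly one existing match" >&2
        return 1
    fi
    echo "$1"
}

# Demo with made-up names in a scratch directory
demo=$(mktemp -d)
touch "$demo/vml.0001:5" "$demo/vml.0001:6"
single_match "$demo"/vml*6

# On a host the call would look like this (left commented out here):
# disk=$(single_match /vmfs/devices/disks/vml*6) || exit 1
```

If the guard fails, the post script can bail out before vmkfstools runs, rather than reformatting the wrong device.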

One thing to note – if you choose to use this script in the future to rebuild a host and don’t want to overwrite any existing VMFS volumes in the main kickstart script (by not specifying the --overwritevmfs flag), this post-install script will just ignore that and overwrite it anyway.

Edit: I’ve changed the name of the temporary partition as the old name might have been confusing.