I’ve recently been thinking about the practicalities of PXE booting ESXi servers. Sounds great, but how do you make this work in a typical environment?
Using trunked connections on ESXi hosts is very much commonplace. It’s likely that your ESXi’s Management Network connection, which by default will be your first onboard NIC (vmnic0), is connected to a trunked uplink switch port. Probably the most popular configuration is bonding your Management Network with your vMotion vmknic on a vSwitch with two trunk uplinks, one of which is vmnic0. The drive towards 10GbE and cable consolidation only increases the likelihood that your vmnic0 will be patched into a trunked port.
VMware are starting to pursue solutions using servers’ ability to PXE boot. The potential to PXE boot into an installation routine is not a new concept; VMware’s AutoDeploy and the recently announced PXE Manager fling both use this technique. In fact, the idea extends beyond PXE booting the install to PXE booting the OS itself via the network, or stateless as it is being referred to (although this term really defines something specific, not just PXE booting).
The question comes – how do I PXE boot my servers which are connected to trunked interfaces on the switch? If your servers are physically connected to a trunked connection, then a standard PXE boot won’t tag the traffic appropriately (tell me if I’m wrong – is this something you can set in a server BIOS these days?) You don’t want to re-patch a server’s network cables if you have to quickly rebuild it. Or if you are PXE booting (stateless) then you’d have to do this for each reboot. And you don’t want to trouble your Network Admin to change it back to an access port every time.
This is where I think Native VLANs can help out. As a vSphere server guy, what I know about Native VLANs is VMware’s advice that you avoid tagging traffic with VLAN 1, because this is what Cisco set as the default Native VLAN for switches. When thinking about VLAN IDs for your trunked ESXi ports, you just choose something other than 1. But Native VLANs could provide a solution to the problem of PXE booting on trunks.
If the interface for your vmnic0 has a Native VLAN, then when the server tries to PXE boot, it can get out onto the network. If untagged traffic is being received on a switch’s trunked interface, then it will assume it is for that interface’s Native VLAN. You could have the Native VLAN set as the same VLAN as your Management Network subnet. Then it will PXE boot straight on to the same subnet that it will get once the Management Network is brought up. Alternatively, if you only want to PXE boot into an installer, you could set your Native VLAN to a special build subnet. Once the server is built, then the Management Network traffic is tagged back on to your regular trunked VLAN.
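To make the untagged-versus-tagged behaviour concrete, here is a minimal Python sketch of how a trunk port might classify incoming frames. It is a simplified model, not any vendor's implementation: the function names, the MAC address and the VLAN numbers are all made up for illustration, and it assumes the Native VLAN must itself be permitted on the trunk.

```python
import struct
from typing import Optional, Set

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q VLAN tag


def build_frame(vlan_id: Optional[int] = None) -> bytes:
    """Build a minimal broadcast Ethernet frame, optionally 802.1Q tagged.

    The source MAC and payload are placeholders for illustration only.
    """
    dst = b"\xff" * 6
    src = bytes.fromhex("005056000001")
    tag = b"" if vlan_id is None else struct.pack("!HH", TPID_8021Q, vlan_id)
    return dst + src + tag + struct.pack("!H", 0x0800) + b"\x00" * 46


def classify_frame(frame: bytes, native_vlan: int, allowed: Set[int]) -> Optional[int]:
    """Return the VLAN a trunk port would place the frame in, or None if dropped.

    Assumption: a frame is dropped when its VLAN is not allowed on the trunk.
    """
    # Bytes 12-13 hold the EtherType (untagged) or the 802.1Q TPID (tagged).
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype == TPID_8021Q:
        # Tagged frame: the low 12 bits of the tag control field are the VLAN ID.
        (tci,) = struct.unpack_from("!H", frame, 14)
        vlan = tci & 0x0FFF
    else:
        # Untagged frame, such as one from a PXE boot ROM: use the Native VLAN.
        vlan = native_vlan
    return vlan if vlan in allowed else None


if __name__ == "__main__":
    allowed_vlans = {10, 20, 428}  # hypothetical VLANs carried on the trunk
    print(classify_frame(build_frame(), 428, allowed_vlans))    # untagged PXE traffic
    print(classify_frame(build_frame(20), 428, allowed_vlans))  # tagged management traffic
```

In this model the untagged PXE frame lands in whichever VLAN is set as the Native VLAN, while traffic tagged by the vSwitch after boot goes to its own VLAN, which is the split between build and management subnets described above.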
So what do you think? Feasible, secure enough, any potential issues? Or do you have other ways you set this up in your environment that you can recommend to everyone?
This is indeed a workable solution and is how we have tested auto deploy, PXE Manager and also how we PXE boot for scripted installs.
Great article, Forbes, and you have indeed found a solution to PXE booting on a trunked interface. I provided more detail on the interaction between vSwitch port groups and the native VLAN back in 2007 in this article:
http://blog.scottlowe.org/2007/11/13/esx-server-and-the-native-vlan/
Basically, whatever VLAN you want to use as your build VLAN should be marked as the native VLAN, and traffic on that VLAN will be untagged and will therefore work fine during a PXE boot, scripted install, etc., where VLAN support isn’t present.
Thanks Scott. Yeah, I realise my revelation isn’t anything new per se, but I hadn’t heard it discussed with regard to PXE booting stateless ESXi servers. I thought it was something that folks might find interesting.
Forbes, your hunch that you can set the VLAN in the BIOS these days was a good one. Most modern PXE boot ROMs will allow you to set a VLAN. Another issue I’m running into lately is PXE booting when there are only two link-aggregated connections. I should have a post up soon describing the issues and the options around the problem.
I have successfully PXE kickstarted ESXi 5 on a trunked interface.
It was made possible with the following two technologies:
1. Multiple Boot Agent (MBA): the NIC supports a VLAN tag in its BIOS settings, such as the MBA of Broadcom NICs.
2. VMware ESXi 5: a new parameter, vlanid, is introduced as a boot option (not the same as vlanid in the kickstart configuration file).
details:
http://honglus.blogspot.com.au/2012/05/pxe-kickstart-vmware-esxi-5-on-trunked.html
There is now a new option for Auto Deploy, set with:
Set-DeployOption "vlan-id" 428
This would set the VLAN ID during boot time for the ESXi Auto Deploy to 428. Haven’t tried it yet though.
yea but vlan-id does not work with a host profile… then what? two step process?
Hi Bryan,
I’m not sure I fully understand. Try this – in Host Profiles > Network Configuration > Host Port Group > Management > VLAN ID configuration (this is on vSphere 5.1, I haven’t checked older versions). This is where you set the VLAN ID. Does this not do what you need?
Thanks,
Forbes