I had a previously stable XenDesktop farm start having issues today, with the client saying they could not connect to any of the Windows 7 desktops in their farm. A quick investigation revealed most of the desktop VMs had been placed in maintenance mode by the Desktop Delivery Controller. It will do this automatically if the VMs do not register after several reboots, see CTX126704 for the registry entries that control this behaviour.
After much investigation looking at the PXE and TFTP services on the Provisioning Servers, checking DHCP scopes weren’t full, confirming switch ACLs and VLANs were correct and more, I found the setting on the DHCP server for Conflict Detection had been recently enabled. It was set to 2 attempts, but it seems this setting caused enough of a delay that the VMs would time out trying to PXE boot. Once the setting was reverted to zero everything was back to normal.
Note that this client’s XenDesktop setup was XenServer 6.0.2, and the XenDesktop VMs were in the same VLAN as the Provisioning Servers so the client VMs could just broadcast to receive the PXE settings. This is one of the preferred options for enabling TFTP high availablility as per Citrix Blog post by Nick Rintalan last year here (option 4 in his post).
I haven’t yet checked to see if this issue manifests itself in all environments, but now that I’ve seen it I recall having similar issues at another site where we couldn’t get PXE boot behaving reliably and we ended up handing out the TFTP server via DHCP options 66 & 67. Hopefully this helps if you are troubleshooting a similar issue, and I’d be keen to hear from anyone that has the same issue and whether my fix above worked for you.