Tuesday, July 24, 2018

ESXi host upgrade failed - 0x8b in position 513


I spend most of my time as a Consulting Architect at VMware Professional Services with clients, helping them create innovative solutions, overcome challenges, and so on.
Since every environment is unique, I sometimes stumble into weird situations, and this past week brought one of them.

The client was upgrading their ESXi hosts from version 6.0 to 6.5. While the majority of the hosts went smoothly, a couple of them presented some undesired behavior.

Update Manager was used to remediate the hosts and everything was going fine: the patches had been staged and the first reboot occurred as expected, but during the installation it crashed with a blue screen and an error message:

*******************
An unexpected error occurred
See logs for details
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 513: invalid start byte
*******************

(I'm sorry about the image quality, I was in a hurry trying to figure it out)

And then the installation rolled back automatically to ESXi 6.0.
Surprisingly, all the hosts were the same model, installed in the same period, in the same way, with the same ISO, so there was nothing special about those hosts that we could think of.

After some basic troubleshooting nothing popped up, and an internet search for this error did not return anything relevant.
Time to search internally, and voilà... that's when I found a couple of past cases with the same behavior.

Long story short, the altbootbank for some reason was corrupted; we never found out why.
The solution was to recreate the altbootbank from the bootbank partition.
First, we got rid of the content in /altbootbank, and then we copied the content from /bootbank to it.
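
For reference, the commands we ran from the ESXi shell looked roughly like the sketch below. Treat it as a sketch only: confirm that /bootbank really holds the good, active image on your host before touching /altbootbank.

    # Clear the corrupted alternate boot bank
    rm /altbootbank/*
    # Repopulate it with a copy of the active boot bank
    cp /bootbank/* /altbootbank/

Depending on your build, you may also want to review the bootstate entry in the copied /altbootbank/boot.cfg; check current VMware guidance before relying on it.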

Wait a minute, what are /altbootbank and /bootbank all about?

ESXi keeps two independent copies of its boot partition: bootbank and altbootbank. One of them holds the active image (bootbank), which is used to boot up the system, and the other holds an alternate image (altbootbank). You can think of the alternate image as the last known good state, so in case your boot partition becomes corrupted you can reboot your host from that last known good state (altbootbank).
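
If you are curious what that looks like on a host, here is a quick, illustrative peek from the ESXi shell (assuming you have shell or SSH access):

    # Both banks are just symlinks in / pointing to separate VFAT volumes
    ls -l / | grep bootbank
    # Each bank carries its own boot.cfg describing the image it holds
    grep -E 'build|bootstate' /bootbank/boot.cfg /altbootbank/boot.cfg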

It really took me a while to figure out how to solve it. I'm publishing it hoping it can save some of your time too; just let me know if you have faced this issue as well.


Tuesday, July 17, 2018

vSphere Integrated Containers – Affinity rules


Managing a vSphere environment is not about using the technology for its own sake; what really matters is leveraging that technology to fulfill business needs. vSphere administrators often use DRS affinity rules to control virtual machine placement, specifying a group of hosts that can fulfill those needs; the reasons vary from licensing constraints to specific hardware requirements to increased availability.

With the advent of vSphere Integrated Containers (VIC), developers can instantiate their own containers, container-VMs to be more precise, without the intervention of a vSphere admin. While this increases the agility of the business, it also poses a new challenge: as container-VMs come and go as needed, how can admins keep their affinity rules updated in order to fulfill the business need? Manual intervention is certainly not up for debate.

Luckily, VIC 1.4 brought a new capability: host affinity. When it is enabled, there will be a DRS VM Group for each Virtual Container Host (VCH), and as containers are created or deleted this group will be updated accordingly, helping administrators and developers adhere to those business needs automatically.

During the creation of a VCH, you enable host affinity simply by specifying the "--affinity-vm-group" option on the vic-machine command line (not yet available in the VCH creation wizard).
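
As an illustration, a create command could look like the sketch below; apart from "--affinity-vm-group", every value here is a placeholder you would replace with your own target, credentials, networks, and stores:

    vic-machine-linux create \
        --target vcenter.example.com \
        --user administrator@vsphere.local \
        --compute-resource Cluster01 \
        --image-store datastore1 \
        --bridge-network vic-bridge \
        --name VCH01 \
        --thumbprint <vcenter-thumbprint> \
        --affinity-vm-group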

A new DRS VM Group will be created with the same name as the VCH. You will also notice that the VCH VM itself is part of this group; it's made that way because it's impossible to create an empty VM Group, although an empty group can exist as a result of removing all VMs from it.

But what about existing VCHs?
Starting with VIC 1.4.1, you can reconfigure them to enable host affinity as well.
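
The reconfiguration would be along these lines; this is a sketch that assumes vic-machine configure accepts the same "--affinity-vm-group" flag, so double-check the 1.4.1 documentation for the exact syntax (the id and thumbprint below are placeholders):

    vic-machine-linux configure \
        --target vcenter.example.com \
        --user administrator@vsphere.local \
        --thumbprint <vcenter-thumbprint> \
        --id <vch-id> \
        --affinity-vm-group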
 
After creating the VCH, vSphere administrators just need to create a VM-Host affinity rule that matches this newly created VM Group with a Host Group, before handing the VCH over to the developers.
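
If you prefer to script that step instead of clicking through the vSphere Client, a hedged sketch with govc is shown below; the cluster, host, group, and rule names are made up for the example, and the VM Group called VCH01 is the one VIC created for the VCH:

    # Host group with the hosts that should run this VCH's container-VMs
    govc cluster.group.create -cluster Cluster01 -name vch01-hosts -host esx01 esx02
    # VM-Host rule tying the VCH's VM Group to that host group
    govc cluster.rule.create -cluster Cluster01 -name vch01-placement -enable -vm-host \
        -vm-group VCH01 -host-affine-group vch01-hosts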

So every time a developer creates or deletes containers on the VCH, the VM Group membership will be updated accordingly, and DRS will automatically take care of scheduling the container-VMs based on the rule created before.
That enables higher agility and improves operational efficiency while keeping the business needs in mind.

If you are still not sure why you would use this feature, I'd like to present a few use cases.
In a hypothetical scenario of a single vSphere cluster made up of 10 hosts distributed across 2 physical racks, you may have:

*** Licensing needs ***
Let's imagine you have an application that is licensed per physical host or processor. To decrease your license cost, you might create a host group containing just the hosts you have licensed for this application and match this group with an affinity rule for the VCH VM Group; this way you don't need to license your entire cluster.



*** Specific Hardware needs ***
Now, if your containers benefit from a graphics processing unit (GPU), you can create a host group containing the GPU-equipped hosts and match this group with the VCH VM Group; those graphics-intensive containers will then always be scheduled on the right hosts.



*** Fault Domains ***
Increasing the number of fault domains is always a plus when it comes to availability.
You can use a Host Group to create a kind of virtual cluster inside your vSphere cluster, where the members of this group are spread among the racks. While you cannot guarantee your application will always be spread evenly between racks, HA will restart your container-VMs on the remaining hosts in case of a rack failure.



But if you want to make sure your application will always be spread evenly between racks, you can create two VCHs, each with an affinity rule to a Host Group made up of the hosts of a single rack.
For example, VCH01 will use hosts from rack-A and VCH02 will use hosts from rack-B; now you can control the placement of your containers, ensuring that your application remains available in case of a rack failure.


As you can see, there are many use cases for this feature, but even more important is that it supports the agility you need while staying aligned with your business needs.

Do you have a different use case for this feature? Let us know...
