Feb 182019
 
ESXi Fatal error: 8 (Device Error)

Unable to boot ESXi from USB or SD Card on HPE Proliant Server

After installing HPE iLO Amplifier on your network and updating iLO 4 firmware to 2.60 or 2.61, you may notice that your HPE Proliant Servers may fail to boot ESXi from a USB drive or SD-Card.

This was occuring on 2 ESXi Hosts. Both were HPE Proliant DL360p Gen8 Servers. One server was using an internal USB drive for ESXi, while the other was using an HPE branded SD Card.

The issue started occuring on both hosts after a planned InfoSight implementation. Both hosts iLO controllers firmware were upgraded to 2.61, iLO Amplifier was deployed (and the servers added), and the amplifier was connected to an HPE InfoSight account.

Update – May 24th 2019: As an HPE partner, I have been working with HPE, the product manager, and development team on this issue. HPE has provided me with a fix to test that I have been able to verify fully resolves this issue! Stay tuned for more information!

Update – June 5th 2019: Great news! As Bob Perugini (WW Product Manager at HPE) put it: “HPE is happy to announce that this issue has been fixed in latest version of iLO Amplifier Pack, v1.40. To download iLO Amplifier Pack v1.40, go to http://www.hpe.com/servers/iloamplifierpack and click “download”.” Scroll to the bottom of the post for more information!

Please see below for errors:

Errors

ESXi Fatal error: 8 (Device Error)
ESXi Fatal error: 8 (Device Error)
Error loading /s.v00
Compressed MD5: 00000000000000000000000000
Decompressed MD5: 00000000000000000000000000
Fatal error: 8 (Device Error)
Error mboot.c32 attempted DOS system call
Error mboot.c32 attempted DOS system call
mboot.c32: attempted DOS system call INT 21 0d00 E8004391
boot:

Symptoms

This issue may occur intermittently, on the majority of boots, or on all boots. Re-installing ESXi on the media, as well as replacing the USB/SD Card has no effect. Installation will be successful, however you the issue is still experiences on boot.

HPE technical support was unable to determine the root of the issue. We found the source of the issue and reported it to HPE technical support and are waiting for an update.

The Issue and Fix

This issue occurs because the HPE iLO Amplifier is running continuous server inventory scans while the hosts are booting. When one inventory completes, it restarts another scan.

The following can be noted:

  • iLO Amplifier inventory percentage resets back to 0% and starts again numerous times during the server boot
  • Inventory scan completes, only to restart again numerous times during the server boot
  • Inventory scan resets back to 0% during numerous different phases of BIOS initialization and POST.
HPe iLO Amplifier Inventory
HPE iLO Amplifier Inventory

We noticed that once the HPE iLO Amplifier Virtual Machine was powered off, not only did the servers boot faster, but they also booted 100% succesfully each time. Powering on the iLO Amplifier would cause the ESXi hosts to fail to boot once again.

I’d also like to note that on the host using the SD-Card, the failed boot would actually completely lock up iLO, and would require physical intervention to disconnect and reconnect the power to the server. We were unable to restart the server once it froze (this did not happen to the host using the USB drive).

There are some settings on the HPE iLO amplifier to control performance and intervals of inventory scans, however we noticed that modifying these settings did not alter or stop the issue, and had no effect.

As a temporary workaround, make sure your iLO amplifier is powered off during any maintenance to avoid hosts freezing/failing to boot.

To fully resolve this issue, upgrade your iLO Amplifier to the latest version (1.40 as of the time of this update). The latest version can be downloaded at: http://www.hpe.com/servers/iloamplifierpack.

Update – April 10th 2019

I’ve attempted to try downgrading to the earliest supported iLo version 2.54, and the issue still occurs.

I also upgraded to the newest version 2.62 which presented some new issues.

On the first boot, the BIOS reported memory access issues on Processor 1 socket 1, then another error reporting memory access issues on Processor 1 socket 4.

I disconnected the power cables, reconnected, and restarted the server. This boot, the server didn’t even detect the bootable USB stick.

Again, after shutting down the iLo Amplifier, the server booted properly and the issue disappeared.

Update – May 24th 2019

As an HPE partner, I have been working with HPE, the product manager, and development team on this issue. HPE has provided me with a fix to test that I have been able to verify fully resolves this issue! Stay tuned for more information!

Update – June 5th 2019 – ITS FIXED!!!

Great news as the issue is now fixed! As Bob Perugini (WorldWide Product Manager at HPE) said it:

HPE is happy to announce that this issue has been fixed in latest version of iLO Amplifier Pack, v1.40.

To download iLO Amplifier Pack v1.40, go to http://www.hpe.com/servers/iloamplifierpack and click “download”.


Here’s what’s new in iLO Amplifier Pack v1.40:
─ Available as a VMware ESXi appliance and as a Hyper-V appliance (Hyper-V is new)
─ VMware tools have been added to the ESXi appliance
─ Ability to schedule the time of the daily transmission of Active Health System (AHS) data to InfoSight
─ Ability to opt-in and allow the IP address and hostname of the server to be transmitted to InfoSight and displayed
─ Test connectivity button to help verify iLO Amplifier Pack has successfully connected to InfoSight
─ Allow user authentication credentials for the proxy server when connecting to InfoSight
─ Added ability to specify IP address or hostname for the HPE RDA connection when connection to InfoSight
─ Ability to send updated AHS data “now” for an individual server
─ Ability to stage firmware and driver updates to the iLO Repository and then deploy the staged updates at a later date or time (HPE Gen10 servers only)
─ Allow the firmware and driver updates of servers whose iLO has been configured in CNSA (Commercial National Security Algorithm) mode (HPE Gen10 servers only)

  10 Responses to “VMware ESXi Boot Failure from USB or SD Card with HPE iLO Amplifier and InfoSight”

  1. Hey this is interesting because I got the same issue. Thanks for this!

  2. Sorry I meant to ask if there was any update from HP or VMWare?

  3. Hi Faud,

    I reached out to HPe and they were unaware of the issue. I also reached out to the solution architect team for Canada, but haven’t heard anything back.

    Its nice to know someone else experienced the issue!

    I’m not interested in reaching out to VMware because it’s not related to their product, as it’s the HPe side that’s causing the issue.

    Cheers,
    Stephen

  4. Stephen and Faud,

    Please create an HPE support case for this issue. HPE is unable to reproduce the issue and needs additional information.

    Thanks, Bob

  5. Stephen,

    HPE is happy to announce that this issue has been fixed in latest version of iLO Amplifier Pack, v1.40.

    Thank you very much for not only discovering this issue, but for assisting HPE duplicate the issue and testing the fix.

    To download iLO Amplifier Pack v1.40, go to http://www.hpe.com/servers/iloamplifierpack and click “download”.

    Here’s what’s new in iLO Amplifier Pack v1.40:
    ─ Available as a VMware ESXi appliance and as a Hyper-V appliance (Hyper-V is new)
    ─ VMware tools have been added to the ESXi appliance
    ─ Ability to schedule the time of the daily transmission of Active Health System (AHS) data to InfoSight
    ─ Ability to opt-in and allow the IP address and hostname of the server to be transmitted to InfoSight and displayed
    ─ Test connectivity button to help verify iLO Amplifier Pack has successfully connected to InfoSight
    ─ Allow user authentication credentials for the proxy server when connecting to InfoSight
    ─ Added ability to specify IP address or hostname for the HPE RDA connection when connection to InfoSight
    ─ Ability to send updated AHS data “now” for an individual server
    ─ Ability to stage firmware and driver updates to the iLO Repository and then deploy the staged updates at a later date or time (HPE Gen10 servers only)
    ─ Allow the firmware and driver updates of servers whose iLO has been configured in CNSA (Commercial National Security Algorithm) mode (HPE Gen10 servers only)

    Thanks, Bob

  6. Hi Bob,

    Thanks for the update and involving me! I’ve updated the post to reflect the new information.

    Cheers,
    Stephen

  7. Hi guys,

    The problem seems to appears again with 1.50 !

  8. Hi
    This is so helpful, I got the same issue.
    Thanks for this!

  9. You helped me a lot to solve the issue with this bug. in version 2.8, this was preserved, but after understanding the essence of the bug, I was able to fix the server’s performance!!

 Leave a Reply

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>

(required)

(required)