VMware – Knowledge Base Questions and Answers

Determining whether virtual machines are configured to autostart

Details

VirtualCenter allows you to control the automatic startup/shutdown of virtual machines through Configuration > Virtual Machine Startup/Shutdown.

If, however, you are unable to connect to your host through the VMware Infrastructure Client, you must use the service console to check whether it is enabled.

Solution

Note: Before you begin, refer to Restarting the Management agents on an ESX Server (1003490) for important information on restarting the mgmt-vmware service.

To determine if your virtual machines are configured to autostart:

  1. Log in as root to your ESX host with SSH.
  2. Open the /etc/vmware/hostd/vmAutoStart.xml file in a text editor.
  3. Search for a line that reads: <enabled>true</enabled>

    If you find the line, the functionality is enabled. If you do not find the line, the functionality is disabled.

  4. If you want to disable autostart for all virtual machines on the host, remove the line.
  5. If you want to enable the functionality, add the line immediately after the line ending with: </dynamicProperty>
  6. Save your changes and exit.
  7. Restart the management agents on the ESX host. For instructions on restarting the agents, see Restarting the Management agents on an ESX Server (1003490).
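The check in steps 2–3 can also be scripted. The function below is a minimal sketch, not an official tool: it reports whether the autostart line is present in a given copy of vmAutoStart.xml (the default path is the one used in this article).

```shell
# Sketch: report whether VM autostart is enabled in a vmAutoStart.xml file.
# The default path is the ESX 3.x location described in this article.
check_autostart() {
  file=${1:-/etc/vmware/hostd/vmAutoStart.xml}
  if grep -q '<enabled>true</enabled>' "$file" 2>/dev/null; then
    echo "autostart enabled"
  else
    echo "autostart disabled"
  fi
}
```

Run it on the host as root, or point it at a copied file for offline inspection, for example: check_autostart /tmp/vmAutoStart.xml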

Caution: For this change to take effect, you must restart the management agent. This poses a circular logic problem in the case where you are disabling virtual machine autostart because you want to restart the agent. In this case, use the vimsh command as discussed in Restarting hostd (mgmt-vmware) on ESX Server Hosts Restarts Hosted Virtual Machines Where Virtual Machine Startup/Shutdown is Enabled (1003312).

Additional Information

  • ESX Servers can be shut down from the command line using the halt command.
  • ESX Servers can be rebooted from the command line with the reboot command.

Troubleshooting vmware-hostd service if it fails or stops responding

Details

The vmware-hostd management service is the main communication channel between ESX hosts and the VMkernel. If vmware-hostd fails, the ESX host disconnects from VirtualCenter/vCenter Server and cannot be managed, even if you try to connect to the ESX host directly.

If vmware-hostd is not working as expected, you may see these errors:

VPXA Log Errors

  • Authd error: 514 Error connecting to hostd-vmdb service instance.
  • Failed to connect to host :902. Check that authd is running correctly (lib/connect error 11)

vCenter Server Errors

  • Unable to access the specified host. It either does not exist, the server software is not responding, or there is a network problem.

When you try to add or reconnect the host to vCenter Server using VMware Infrastructure/vSphere Client, you see the error:

VMware Infrastructure Client could not establish the initial connection with server <your server>. Details: A connection failure occurred.

When you try to connect directly to the ESX host, you see this error in the VMware Infrastructure/vSphere Client:

Unable to access the specified host.  It does not exist, the server software is not responding, or there is a network problem.

Solution

Validate that each troubleshooting step below is true for your environment. Each step provides instructions or a link to a document for validating the step and taking corrective action as necessary. The steps are ordered in the most appropriate sequence to isolate the issue and identify the proper resolution. After each step, attempt to restart the management agents. Do not skip a step.

Note: For information on restarting mgmt-vmware, see Restarting the Management agents on an ESX or ESXi Server (1003490).

When the vmware-hostd service fails to respond

  1. Verify network connectivity to the ESX service console. For more information, see Testing network connectivity with the Ping command (1003486).
  2. Verify that vmware-hostd is running. For more information, see Verifying that the Management Service is running on an ESX host (1003494).

To verify that the ESX management service (vmware-hostd) is running:

Log in as root to your ESX host with an SSH client. For more information, see Connecting to an ESX host using an SSH client (1019852).

Run this command:

ps -ef | grep hostd | grep -v grep

The output appears similar to this if vmware-hostd is running:

[root@server]# ps -ef | grep hostd | grep -v grep
root     23204     1  0 15:27 ?        00:00:00 /bin/sh /usr/bin/vmware-watchdog -s hostd -u 60 -q 5 -c /usr/sbin/hostd-support /usr/sbin/vmware-hostd -u
root     23209 23204  1 15:27 ?        00:04:23 /usr/lib/vmware/hostd/vmware-hostd /etc/vmware/hostd/config.xml -u
[root@server]#

The output appears similar to this if vmware-hostd is not running:

[root@server]# ps -ef | grep hostd | grep -v grep
[root@server]#
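The `grep -v grep` filter above can also be avoided with the bracket trick, which keeps the grep process itself from matching. A small sketch:

```shell
# Sketch: report whether a hostd process is present in the process table.
# The [h] bracket keeps this grep from matching its own command line,
# replacing the `grep -v grep` filter used above.
status_hostd() {
  if ps -ef | grep '[h]ostd' > /dev/null; then
    echo "running"
  else
    echo "not running"
  fi
}
```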

  3. Verify that port 80 or port 443 is open with the command: netstat -an

    For more information, see Determining if a port is in use (1003971).
    Checking port usage from Windows

To check the listening ports and applications with Netstat:

Open a command prompt. For more information, see Opening a command or shell prompt (1003892).

Run this command:

netstat -ban

You see an output similar to:

C:\>netstat -ban

Proto  Local Address          Foreign Address        State           PID
TCP    0.0.0.0:<port>         0.0.0.0:0              LISTENING       <process ID>
[<process>.exe]
TCP    0.0.0.0:<port>         0.0.0.0:0              LISTENING       <process ID>
[<process>.exe]

Where <process> is the name of the application, <port> is the port that is being used, and <process ID> is the process ID of the process.

The output shows the processes that are listening, along with the name of the process and the process ID. When reviewing the information, look only at ports in the LISTENING state to ensure that you find the correct application listening on that port. If you do not see any process listening on a port, that port is free to be used.

When you have determined what is listening on the port, decide what action to take to resolve the conflict. This may involve stopping a service or uninstalling the application that is using the port.

Checking port usage from Linux / Mac OS / ESX

Note: Mac OS and certain distributions of Linux do not support listing the process name with Netstat. If you are using Mac OS or are seeing errors on your distribution of Linux, follow the lsof instructions below.

To check the listening ports and applications with Netstat:

  1. Open a shell prompt. For more information, see Opening a command or shell prompt (1003892).
  2. In the shell prompt window, run this command: netstat -pan

    You see an output similar to:

    [root@server]# netstat -pan

    Active Internet connections (servers and established)
    Proto Recv-Q Send-Q Local Address               Foreign Address             State       PID/Program name
    tcp        0      0 0.0.0.0:<port>              0.0.0.0:*                   LISTEN      <process ID>/<process>
    tcp        0      0 0.0.0.0:<port>              0.0.0.0:*                   LISTEN      <process ID>/<process>

    Where <process> is the name of the application, <port> is the port that is being used, and <process ID> is the process ID of the process.

To check the listening ports and applications with lsof:

  1. Open a shell prompt. For more information, see Opening a command or shell prompt (1003892).
  2. In the shell prompt window, run this command: lsof -i -P -n

You see an output similar to:

[root@server]# lsof -i -P -n
COMMAND     PID          USER    FD   TYPE    DEVICE SIZE NODE NAME
<process>   <process ID> root    3u   IPv4    3011        TCP  *:<port> (LISTEN)
<process>   <process ID> root    3u   IPv4    3011        TCP  *:<port> (LISTEN)

Where <process> is the name of the application, <port> is the port that is being used, and <process ID> is the process ID of the process.

The output from either of these two commands shows the processes that are listening, the name of the process, and the process ID. When reviewing the information, look only at ports in the LISTEN state to ensure that you find the correct application listening on that port. If you do not see any process listening on a port, then that port is free to be used.

When you have determined what is listening on the port, decide what action to take to resolve the conflict. This may involve stopping a service or uninstalling the application that is using the port.
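If you check a specific port often, the lookup can be scripted. This sketch assumes the `netstat -pan` field layout shown above (column 4 = local address, column 6 = state, column 7 = PID/program name); adjust the field numbers if your distribution formats the output differently.

```shell
# Sketch: print the PID/program listening on a given TCP port,
# reading `netstat -pan`-style output from stdin.
listener_for_port() {
  awk -v p=":$1" '$6 == "LISTEN" && $4 ~ p "$" { print $7 }'
}
```

For example: netstat -pan | listener_for_port 443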

  1. Verify that the /etc/hosts file is written correctly and has entries similar to the following:

    # Do not remove the following line, or various programs
    # that require network functionality will fail.
    127.0.0.1 <localhost>.<localdomain> <localhost>
    10.0.0.1 <server>.<domain> <server>
  2. Verify that the service console partitions have available disk space. If either / or /var/log is full, vmware-hostd cannot start because it is trying to write information to a full disk. For more information on disk space usage on the ESX host, see Identifying disk space on an ESX host (1003564).
  3. Verify that there is SAN connectivity and that the SAN has been properly added or removed by running the command: ls /vmfs/volumes

    or

    vdf -h

    If the commands take a very long time to complete or report an error, see Identify shared storage issues with ESX (1003659).

  4. Verify that the /etc/vmware/esx.conf file is not missing or corrupt. If the file is missing or corrupt, replace it with a backup copy from /var/log/oldconf/. For more information, see Troubleshooting an ESX host that does not boot (10065).
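The /etc/hosts check in step 1 can be partially automated. The sketch below only confirms that a given name appears in the file on a non-comment line; it is a coarse sanity check, not full validation, and `esx01.example.com` in the usage example is a hypothetical hostname.

```shell
# Sketch: check whether a hostname appears in /etc/hosts
# on a non-comment line.
hosts_has() {
  name=$1
  file=${2:-/etc/hosts}
  grep -v '^#' "$file" | grep -qw "$name"
}
```

For example: hosts_has esx01.example.com && echo "present in /etc/hosts"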

Details

  • When an ESX host boots up and reaches the login screen, you see a line or error similar to: 0:00:47.675
  • The line or error changes every time the host reboots. It also displays the error: Util : 815 : Status 0xbad0001 trying to get a valid VMKernel MAC address. Various vmkernel subsystems will provide lower quality of service
  • If you restart the ESX host, you see the error: Grub error 15: File not found

Solution

This issue occurs if the esx.conf file is corrupted.

To repair the file:

  1. Select the Troubleshooting or Service Console Only boot option when the GRUB bootloader appears.
  2. Log in to the terminal as root.
  3. Run the commands:

    For ESX 3.x:

    esxcfg-boot -p
    esxcfg-boot -b
    esxcfg-boot -r

    For ESX 4.x:

    esxcfg-boot -b

  4. Reboot the ESX host using the reboot command.

Note:

  • These commands repair the esx.conf file. If you do not want to reboot immediately, you can omit the reboot step and restart the host at a later time. If you choose this option, VMware recommends restarting the management agents on ESX. To restart the management agents, see Restarting the Management agents on an ESX Server (1003490).
  • Previous ESX 3.0 configuration files can be recovered from /var/log/oldconf/; other ESX hosts with identical configurations can be inspected to compare settings.
  • VMware ESX 4.0 and above only require the esxcfg-boot -b option.

Tags

esx-does-not-boot

  1. Verify that there are no syntax errors in the /etc/vmware/firewall/services.xml file:
    • Check /var/log/vmware/hostd.log for these errors:

      [2007-10-23 14:48:56.644 'ServiceSystem' 3076444288 verbose] Command finished with status 0
      [2007-10-23 14:48:56.644 'FirewallSystem' 3076444288 verbose] Loading firewall configuration file '/etc/vmware/firewall/services.xml'
      [2007-10-23 14:48:56.647 'App' 3076444288 panic] Application error: no element found
    • Run the command: esxcfg-firewall -q

      You may see this error:

      No element found at line 480, column 0, byte 11664 at /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi/XML/Parser.pm line 185

      If you observe any of these errors, see Troubleshooting the firewall policy on an ESX host (1003634).

  2. Verify that CPU usage is below 90% by running the command: esxtop

    For more information regarding esxtop, see Using esxtop to Troubleshoot Performance Problems.

    If vmware-hostd is using more than 90% CPU, increase the amount of memory that is assigned to the ESX service console. For more information, see Increasing the amount of RAM assigned to the ESX Server service console (1003501).

    If a third party component is using more than 90% CPU:

    • Check if HP Insight Manager process cmahostd is consuming CPU. If this process is running, upgrade HP Insight Manager. For more information, see Third-Party Software in the Service Console.
    • Check if third-party software is running on the service console. If you have third-party products installed in the service console, stop the applicable processes and services and attempt to start the management agent. For more information, see Third-Party Software in the Service Console.
  3. Check any virtual machines that were migrated from ESX 2.5.x or converted (P2V) with VMware Converter. For more information, see vmware-hostd uses a lot of CPU or has generated a core dump on ESX (4718356).
  4. Check for security scanners on your network. For more information, see The ESX Server Management agent fails when scanned by network security scanner (1002707).

If additional assistance is required for any of the above steps, file a support request with VMware Support and note the relevant KB article IDs in the problem description. For more information, see How to Submit a Support Request.

When the vmware-hostd service fails to start

If the vmware-hostd service fails to start, perform these troubleshooting steps:

  1. Check for failed Network File System (NFS) or Server Message Block (SMB) mounts on the ESX host. If there are failed NFS or SMB mounts, disable or remove the mounts and restart mgmt-vmware.
  2. Check the /etc/vmware/firewall directory for any files other than services.xml. If there are any extraneous files in the directory, move them to an alternate location.
  3. Check for corruption of virtual machine configuration files. For more information, see Re-registering orphaned virtual machines (1007541).
  4. Check for corruption of /etc/vmware/hostd/config.xml by looking for blank hostd logs. If the config.xml file is corrupt, reinstall it:
    1. Copy the RPM package from your installation media. On the installation CD, it is located at \VMware\RPMS\VMware-hostd-esx-3.x.x-xxxxx.i386.rpm.

      Note: Be sure to copy the same version of hostd for ESX 3.x that you are using. To find the exact version of hostd you are using, run the command:

      rpm -qa | grep hostd

    2. Run the command: rpm -ivh --replacepkgs VMware-hostd-esx-3.x.x-xxxxx.i386.rpm
  5. Check if any third-party monitoring applications are using port 9080. Here are some third-party monitoring applications:
  • Computer Associates (CA) Network System Manager (NSM) (R11)
  • CA Advanced System Manager (ASM) (R11.1)
  • CAeAC – etrust

If a third-party monitoring application is using port 9080, you may see these error messages:

['Solo' 3076436096 info] Micro web server port: 9080

['App' 3076436096 panic] Application error: Address already in use

['App' 3076436096 panic] Backtrace generated

Disabling the services resolves the issue. For more information, see Third-Party Software in the Service Console.

If the issue continues to exist after trying the steps in this article, file a support request with VMware Support. For more information, see How to Submit a Support Request.

Tags

hostd-service-fails hostd-service-stops-responding

Request a Product Feature

To request a new product feature or to provide feedback on a VMware product, please visit the Request a Product Feature page.

Restarting hostd (mgmt-vmware) on ESX hosts restarts hosted virtual machines where virtual machine Startup/Shutdown is enabled

Details

This is an issue with virtual machines that are set to automatically start or stop and that are hosted on ESX 3.x. Manually shutting down, starting up, or restarting hostd through the service console causes hosted virtual machines that are set to automatically change power states to stop, start, or restart, respectively.

Solution

Note: Before you begin, see KB1003490 for important information on restarting the mgmt-vmware service.

Disable Virtual Machine Startup/Shutdown for the ESX host through VirtualCenter or a VMware Infrastructure (VI) Client that is directly connected to the host.

To disable Virtual Machine Startup/Shutdown:

  1. Log in to VirtualCenter.
  2. Select the ESX Server host where you want to restart hostd.
  3. Select the Configuration tab.
  4. Select Virtual Machine Startup/Shutdown.
  5. Select Properties.
  6. Deselect Allow Virtual machines to start and stop automatically with the system.

If the host is not reachable through VirtualCenter or the VI Client:

  1. Log in to the ESX Server service console as root.
  2. At the command line run vimsh.
  3. At the [/] prompt, type: hostsvc/autostartmanager/enable_autostart 0
  4. Type exit. You can now safely restart mgmt-vmware (hostd).

[Archived] Troubleshooting the VMware ESX Server Management Service when it will not start

Symptoms

  • ESX host is Not Responding in VirtualCenter/vCenter Server
  • ESX host is Disconnected in vCenter Server
  • Cannot connect an ESX host to vCenter Server. This error message appears in VMware Infrastructure/vSphere Client when you try to add or reconnect the server to vCenter Server: Unable to access the specified host. It does not exist, the server software is not responding, or there is a network problem.
  • Cannot connect directly to the ESX host with the vSphere Client. When trying to connect directly to the ESX host with the VMware Infrastructure/vSphere Client, you see an error similar to: VMware Infrastructure Client could not establish the initial connection with server <your server>. Details: A connection failure occurred.

Purpose

This article guides you through troubleshooting the Management Service (vmware-hostd) if it is not starting on your ESX host. The article aims to help you eliminate the common causes for your problem by verifying the ESX host network configuration, storage configuration, and by ensuring the Management Service configuration is not corrupt.

Resolution

Validate that each troubleshooting step below is true for your environment. Each step provides instructions or a link to a document in order to eliminate possible causes and take corrective action as necessary. The steps are ordered in the most appropriate sequence to isolate the issue and identify the proper resolution. Do not skip a step.

  1. Verify that the ESX Server Management service cannot be restarted. For more information, see Restarting the Management agents on an ESX or ESXi Server (1003490).

Note: If you perform a corrective action in any of the following steps, attempt to restart the ESX Server Management service again.

  2. Verify that there is adequate disk space available on the ESX Server service console. For more information, see Investigating disk space on an ESX Server (1003564).

http://kb.vmware.com/selfservice/search.do?cmd=displayKC&externalId=1003564

Investigating disk space on an ESX or ESXi host

Symptoms

  • Virtual machine fails to power on
  • You see the error: Could not power on VM: No space left on device. Failed to power on VM
  • The management agent (hostd) cannot start because the root partition is full
  • The vCenter Server agent (vpxa) cannot start because the root partition is full
  • The system cannot create any new files or directories on the / or /tmp partition
  • You see the error: no space left on device
  • The /tmp directory is filled with cimclient_root* logs, potentially thousands (~600,000 or more)
  • The vpxa log may contain entries similar to:

    [2008-10-13 11:02:05.423 'Libs' 3076454304 warning] Cannot make directory /var/run/vmware/root/27591: No space left on device
    [2008-10-13 11:02:05.423 'App' 3076454304 error] Exception: Failed to initialize authd server
    [2008-10-13 11:02:05.423 'App' 3076454304 error] Backtrace:
    [00] eip 0x909dd92
    [01] eip 0x9043444
    [02] eip 0x907f975
    [03] eip 0x908024c
    [04] eip 0x9033d74

  • vMotion fails at 10%
  • VMware High Availability (HA) configuration produces the error: Configuration of host IP address is inconsistent on host <hostname>: address resolved to <IP Address> and <IP Address>
  • During an ESX host update, the update fails with the following /var/log/vmware/esxupdate.log entry: Encountered error FileIOError: The error data is: Filename - None Message - I/O Error (28) on file: [Errno 28] No space left on device Errno - 10 Description - Unable to create, write or read a file as expected.
  • When you try to clone a virtual machine or when you try to hot migrate (vMotion) the virtual machine to another host, you see the error: A general system error occurred: Failed to create journal file provider. Failed to open "/var/log/vmware/journal/1269032951.9" for write

Purpose

For troubleshooting purposes, you may have to check the available free disk space on your ESX or ESXi host. This article provides steps to check the available disk space and steps to free space if required.

Resolution

Checking disk space usage on the ESX or ESXi service console partitions

To check the free space on an ESX or ESXi service console partitions:

  1. Open a console to the ESX or ESXi host. For more information, see Unable to connect to an ESX host using Secure Shell (SSH) (1003807) or Using Tech Support Mode in ESXi 4.1 and 5.0 (1017910).
  2. Type df -h. For ESX, you see an output similar to:

    Filesystem            Size  Used Avail Use% Mounted on
    /dev/sda2             4.9G  3.0G  1.6G  66% /
    /dev/sda1              99M   18M   77M  19% /boot
    none                  145M     0  145M   0% /dev/shm
    /dev/sda7             2.0G  135M  1.7G   8% /var/log
    [root@server]#

    Note: The partitions shown are from a default installation of an ESX host. If you have modified the partition configuration, the output may appear differently.

    For ESXi, you see an output similar to:

Filesystem                Size      Used Available Use% Mounted on

visorfs                   1.3G    322.3M      1.0G  24% /
vmfs3                    63.3G    570.0M     62.7G   1% /vmfs/volumes/4d71190d-5921bfa8-03ea-001e0be916ba
vfat                    285.9M    135.5M    150.4M  47% /vmfs/volumes/3c3693e8-f77a642a-1910-5c6bdcb26d3a
vfat                      4.0G      2.7M      4.0G   0% /vmfs/volumes/4d71190d-190fbdb0-ff95-001e0be916ba
vfat                    249.7M    102.0M    147.7M  41% /vmfs/volumes/474ef17b-e05aa697-c0fe-f8c0bde4916e
vfat                    249.7M      4.0k    249.7M   0% /vmfs/volumes/51aa187c-89f6786d-a281-5b966197c73c

  3. Review the Use% for each of the listed items. If any of the volumes listed are 100% full, they must be investigated to determine if space can be freed. The most important mount points to investigate on a default installation of ESX are the / and /var/log mounts, because if they are full they can prevent proper operation of the ESX host.
  4. When you have finished reviewing the output, type logout and press Enter to exit the system.
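Reviewing Use% by eye can be supplemented with a small filter. This sketch assumes the single-line `df -h` layout shown for ESX above (column 5 = Use%, column 6 = mount point); wrapped lines, as in some `vdf -h` output, would need extra handling.

```shell
# Sketch: print mount points at or above a usage threshold,
# reading `df -h`-style output from stdin (threshold as first argument).
full_mounts() {
  awk -v t="$1" 'NR > 1 && $5 ~ /%$/ {
    use = $5; sub(/%/, "", use)
    if (use + 0 >= t) print $6, $5
  }'
}
```

For example: df -h | full_mounts 90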

Checking disk space usage on a VMFS volume of an ESX or ESXi host

To check the free space on a VMFS volume of an ESX or ESXi host:

  1. Open a console to the ESX or ESXi host. For more information, see Unable to connect to an ESX host using Secure Shell (SSH) (1003807) or Using Tech Support Mode in ESXi 4.1 and 5.0 (1017910).
  2. Determine the free disk space on each filesystem using the command: vdf -h

    The output appears similar to:

    Filesystem            Size  Used Avail Use% Mounted on
    /dev/sda2             4.9G  3.0G  1.6G  66% /
    /dev/sda1              99M   18M   77M  19% /boot
    none                  145M     0  145M   0% /dev/shm
    /dev/sda7             2.0G  135M  1.7G   8% /var/log
    /vmfs/devices         439G     0  439G   0% /vmfs/devices
    /vmfs/volumes/458865ba-b31110fd-43d5-00127994e616
                           68G   47G   20G  69% /vmfs/volumes/San_Storage
    /vmfs/volumes/45b5eb1a-808343db-ecab-00114335854b
                           26G  9.7G   16G  36% /vmfs/volumes/Local_Storage

    Note: The partitions shown are dependent on the VMFS volumes you have defined and presented to the ESX or ESXi host.

  3. Review the Use% for each of the listed items. If any of the volumes listed are 100% full, they must be investigated to determine if space can be freed. If a VMFS volume is full, you cannot create any new virtual machines, and any virtual machines that are using snapshots may fail.
  4. When you have finished reviewing the output, type logout and press Enter to exit the system.

Note: Another useful command is du -h --max-depth=1 <dir>. This command lists the directories within a given filesystem that contain the largest files. By starting at the root (/) directory and finding the largest directories, you can then drill down into these directories (using cd) and run the same command repeatedly until you find the files themselves which are occupying space.
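The drill-down described in the note can be shortened by sorting the `du` output. This sketch uses GNU `sort -h`, which understands human-readable sizes; an older service console without that option could use `du -k` with `sort -rn` instead.

```shell
# Sketch: list the ten largest first-level directories under a path,
# biggest first. Requires GNU sort's -h (human-readable) option.
largest_dirs() {
  du -h --max-depth=1 "$1" 2>/dev/null | sort -rh | head -n 10
}
```

For example: largest_dirs /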

Identifying large files on an ESX or ESXi host

Information may accumulate on disks for several reasons. Log files may grow after a substantial amount of messages are written to them. Content such as virtual machines or ISOs may be copied to the ESX or ESXi host, but placed in an inappropriate location. Coredumps from past outages may have accumulated.

To confirm this, review the size of these directories:

  • The /vmimages/ directory is used to store operating system install files such as the VMware Tools or other ISO files.
  • The /var/core/ and /root/ directories are used to store crash files for the service console and the VMkernel.
  • The /var/log/ directory is used to store the majority of the logs for the ESX host.
  • The /vmfs/volumes/ Datastores are used to store the virtual machine data.

To review the space consumed by several of these common directories, run this command:

du -cshP /vmimages /var/core /root /var/log (this command is not available on ESXi 4.0)

If you are unable to determine what is consuming the disk space, use the find command to locate all files matching given criteria. For example, to find files larger than 10 MB anywhere under the root (/) directory, use the command:

find / -size +10M -exec du -h {} \; | less

To find files within /var/ that are larger than 1 MB without traversing mount points, use the command:

find /var/ -size +1M -mount -exec du -h {} \; | less

Note: The find command is flexible and can be used to find files matching specific criteria. For more information, see the GNU Find documentation.

The preceding link was correct as of March 29, 2011.   If you find the link is broken, provide feedback and a VMware employee will update the link.

Deleting unnecessary files

After you have determined what is consuming disk space, delete the unnecessary files:

  1. Open a console to the ESX or ESXi host. For more information, see Unable to connect to an ESX host using Secure Shell (SSH) (1003807) or Using Tech Support Mode in ESXi 4.1 and 5.0 (1017910).
  2. Use the rm command to permanently delete files. For example: rm /var/log/oldlogfile

    Caution: If you delete a file, there is no way to recover it. Therefore, use caution when deleting files. If you are unsure about deleting a specific file, contact VMware Support for assistance. If a system file is removed inadvertently, it may cause damage to your ESX or ESXi host that can require a re-installation of the software.

Archiving old files

Historical log files on an ESX or ESXi host may be required for reference in troubleshooting or trending. They can be compressed and archived rather than deleted. See also vmsupport files left on ESX or ESXi host fill the filesystem on which they reside (1026359).

To compress historic log files:

  1. Open a console to the ESX or ESXi host. For more information, see Unable to connect to an ESX host using Secure Shell (SSH) (1003807) or Using Tech Support Mode in ESXi 4.1 and 5.0 (1017910).
  2. Compress the old log files. In ESX, run these commands to compress the old /var/log/vmkwarning and /var/log/vmkernel log files:

    tar czvf /tmp/vmkwarning-logs.tgz /var/log/vmkwarning*
    tar czvf /tmp/vmkernel-logs.tgz /var/log/vmkernel.*

  3. In ESX and ESXi, run this command to compress the old /var/log/messages log files: tar czvf /tmp/messages-logs.tgz /var/log/messages.*
  4. Remove the source files using the command: rm /var/log/vmkwarning.* /var/log/vmkernel.* /var/log/messages.*
  5. Move the new archive files back to your /var/log/ partition for long-term storage using the command: mv /tmp/vmkwarning-logs.tgz /tmp/vmkernel-logs.tgz /tmp/messages-logs.tgz /var/log/
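The compress/remove/move sequence above can be wrapped in one function. This is a sketch only: it archives whatever rotated files match `<logfile>.*`, so verify the glob before pointing it at a real log directory. The destination is a parameter rather than a hard-coded /var/log/.

```shell
# Sketch: archive rotated copies of a log file (logfile.1, logfile.2.gz, ...)
# into a tarball, remove the originals, and move the tarball to a destination.
archive_rotated() {
  src=$1    # e.g. /var/log/messages
  dest=$2   # e.g. /var/log
  base=$(basename "$src")
  tar czf "/tmp/${base}-logs.tgz" "$src".* &&
    rm -f "$src".* &&
    mv "/tmp/${base}-logs.tgz" "$dest/"
}
```

For example: archive_rotated /var/log/messages /var/log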
  3. Verify that the ESX Server service console firewall is actively allowing connections to be established. For more information, see Troubleshooting the firewall policy on an ESX host (1003634).

http://kb.vmware.com/selfservice/search.do?cmd=displayKC&externalId=1003634

 

Troubleshooting the firewall policy on an ESX host

Purpose

For troubleshooting purposes, it may be necessary to investigate the firewall policy on an ESX host. If the policy is too restrictive, you may experience connectivity issues on your ESX host, such as the inability to connect to a Network Time Protocol (NTP) host or iSCSI array. This article provides steps to verify whether your ESX host firewall policy is too restrictive and steps to reset the policy to its default state.

Resolution

Validating if the ESX host firewall policy is too restrictive

Note: In a default installation of ESX 3.x, VMware has provided a firewall to secure the ESX host service console. By default, it blocks incoming and outgoing communication for everything but essential system services used by your ESX host. Communication to and from the host may be interrupted if the firewall policy has become corrupt.

To validate that the ESX host firewall policy is too restrictive:

  1. Log in to your ESX host as root from either an SSH session or directly from the console of the host.
  2. Stop the firewall with the command: service firewall stop

    This stops the firewall, and all traffic is then allowed to and from the ESX host.

    This output appears:

    [root@host]# service firewall stop
    Stopping firewall                                          [  OK  ]
    [root@host]#

After stopping the firewall, retry the task that was failing to verify whether your problem still exists. If the task still fails, the ESX host firewall is not the problem. If it now completes successfully, the ESX host firewall policy is either corrupt or the ports have not been opened properly to allow communication. For more information, see Resetting the ESX host firewall policy.

To restart the ESX host firewall after you have completed the validation:

  1. Log in to your ESX host as root from either an SSH session or directly from the console of the host.
  2. Start the firewall with the command: service firewall start

The following output appears:

[root@host]# service firewall start
Starting firewall                                          [  OK  ]
[root@host]#

  3. Disconnect from the ESX host with the command: logout

Resetting the ESX host firewall policy

Resetting the ESX host firewall policy resets the rules to the default state. Resetting firewall policies is useful for troubleshooting problems with misconfiguration or corrupt configuration of the ESX host firewall.
Caution: All customizations to the firewall policy are lost when you reset the ESX host firewall policy. For more information on customizing the firewall rules, see Service Console Firewall Configuration in the ESX host Configuration Guide.
To reset the ESX host firewall policy:

  1. Log in to your ESX host as root from either an SSH session or directly from the console of the host.
  2. Reset the firewall with the command: esxcfg-firewall -r

    Note: There is no output.

  3. Restart the firewall when returned to the prompt with the command: service firewall restart

    This output appears:

    [root@host]# service firewall restart
    Stopping firewall                                          [  OK  ]
    Starting firewall                                          [  OK  ]
    [root@host]#

Resetting the ESX host firewall policy in ESX 4.x

You cannot stop the firewall service in ESX 4.x. If you try, you see the message:

Firewall can’t be stopped. To disable the firewall, run: esxcfg-firewall --allowIncoming --allowOutgoing

You can allow all packets through the firewall by running the command:

esxcfg-firewall --allowIncoming --allowOutgoing

To return the firewall configuration to its previous settings, run the command:

esxcfg-firewall --blockIncoming --blockOutgoing

Additional Information

There is a known issue with ESX Server 3.0.x where, if any files other than the service.xml file are located in the /etc/vmware/firewall directory, the ESX Server Management service (vmware-hostd) fails. If this is the case, move all other files to a different directory, then restart the firewall service followed by the ESX Server Management service. This issue has been resolved in ESX Server 3.5.0 and later.

ESXi 4.0/ESXi 4.1 does not include a firewall because it runs a limited set of well-known services and prevents the addition of further services. With such restrictions, the factors that necessitate a firewall are significantly reduced. As such, no firewall is integrated into ESXi. For more information, see the ESXi Configuration Guide.

  1. Verify that no processes are over utilizing the resources on the ESX service console. For more information, see Checking for resource starvation of the ESX Service Console (1003496).

Checking for resource starvation of the ESX/ESXi Service Console

Symptoms

  • High CPU utilization on an ESX/ESXi host.
  • High memory utilization on an ESX host.
  • Slow response when administering an ESX/ESXi host.

Purpose

For troubleshooting purposes it may be necessary to check if any processes are consuming a substantial amount of resources on the service console. Processes consuming a substantial amount of resources can prevent correct operation of the ESX/ESXi system. This article provides you with the steps to check for starvation of resources on the ESX/ESXi host service console.

Resolution

Introduction to performance monitoring

If any process is utilizing a substantial amount of CPU or memory on your ESX/ESXi host service console it can prevent correct operation of the system. ESX/ESXi includes the top utility to be able to check for resource utilization on the service console. It can be used to view the current values for the statistics and to determine if there is starvation of resources on the ESX/ESXi host service console.

To check the utilization of the processes on the service console:

  1. Log in to your ESX host service console as root from either an SSH session or directly from the console of the server. On ESXi, log in using Tech Support Mode.
  2. Type top and press Enter.
  3. To exit top, press Q.
  4. When you have finished reviewing the output, type logout and press Enter to exit the system.

This screen appears and shows the resource utilization and running processes on the server:

Checking for CPU Starvation of an ESX/ESXi host

The statistics you must review are load average and CPU Idle. These statistics provide an overall indication of how busy the ESX/ESXi host is.

Load average is a measurement of the number of processes currently waiting in the run queue plus the number of processes being executed, averaged over one-, five-, and 15-minute intervals. A load average of 1.00 means that the ESX/ESXi host machine’s physical CPUs are fully utilized, and a load average of 0.5 indicates they are half utilized. A load average of 2.00 indicates that the system is busy. If the load average is over 4.00, the system is heavily utilized and performance is impacted.
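Because the service console is Linux-based, the same load averages top displays can be read directly from /proc/loadavg. A sketch, using the 4.00 heavy-utilization threshold described above:

```shell
#!/bin/sh
# Fields 1-3 of /proc/loadavg are the 1-, 5- and 15-minute load averages.
read one five fifteen rest < /proc/loadavg
echo "1min=$one 5min=$five 15min=$fifteen"

# Flag heavy utilization: 15-minute average above 4.00.
busy=$(awk '{print ($3 > 4.0) ? "yes" : "no"}' /proc/loadavg)
echo "heavily-utilized=$busy"
```

This is useful for scripting a periodic check without running top interactively.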

This screen indicates that the ESX/ESXi Service Console does not have a queue of tasks waiting to process:

A load average similar to this screen indicates that tasks are waiting in the run queue to be processed:

The CPU state counters provide an overview of the CPU utilization in each state on the system. This screen shows a system with a high CPU idle percentage. A high CPU idle percentage means that the system is not busy:

If the CPU idle counter output is low, investigate into which state is consuming the CPU time. The different states mean:

  • User is the percentage of the processor time used for running user processes, such as an application.
  • Nice is the percentage of the processor time used for a user process that is running with an altered scheduling priority.
  • System is the percentage of the processor time used for a system process, such as kernel or driver calls.
  • Irq is the percentage of the processor time used for hardware interrupt requests.
  • Softirq is the percentage of the processor time used for software interrupt requests.
  • Iowait is the percentage of the processor time waiting on the completion of disk Input/Output.
  • Idle is the percentage of the processor time that processors are free.

This screen shows the CPU idle state at 0%:

In this screen, the CPU time is being consumed in the iowait state. When CPU time is being consumed in the iowait state, check the disk subsystem to determine what is causing the delay in response from the storage subsystem.

Note: If the CPU time is being consumed in the user state, you can determine the process that is consuming the CPU from the list of tasks below the statistics. The list of tasks refreshes every few seconds to provide an updated view of the process list. In the following example vmware-hostd is consuming 0.9% of the available CPU:
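The same ranking of CPU consumers can be captured non-interactively on the Linux-based service console with ps (a sketch; standard procps ps output format assumed):

```shell
#!/bin/sh
# Top five CPU consumers, mirroring the top task list:
# sort the per-process %CPU column in descending order.
ps -eo pcpu,pid,comm | sort -rn | head -5
```

This is convenient when collecting diagnostics over SSH or from a script, where an interactive top session is impractical.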

Checking for Memory Starvation of an ESX host

Note: VMware ESXi does not have a physical service console. Therefore, there is no method to change the amount of RAM assigned to it. Tech Support Mode is only a running state of the system and does not require additional resources.

Memory and swap are the statistics you need to review. These statistics provide an overall indication of how much memory is being used and if there is heavy swapping occurring on the system. This screen shows an example of the expected output:

The example above indicates that there is 268248KB (268MB) of RAM in the system and that 84864KB (85MB) is free. There is 554168KB (554MB) of swap available in the system and 503152KB (503MB) is free. In this case there is substantial RAM available for the service console to use and therefore very little swapping occurs.

Note: This view only shows you the amount of RAM that is assigned to the ESX host service console; it does not provide a view of the total RAM in the server.

To troubleshoot an ESX host that shows a low amount of RAM and high amount of swapping:

  • Disable any third party services that have been installed for testing. The third party services may be using up memory resources.
  • Try increasing the amount of RAM that has been assigned to the ESX host service console. For more information, see Increasing the amount of RAM assigned to the ESX Server Service Console (1003501).
  • Check all virtual machine configurations to ensure none of them have an unreasonably high CPU reservation, like 10000MHz.

Note: You can also see the amount of memory and swap currently in use from the /proc/meminfo file.
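For example, the figures top reports can be pulled from /proc/meminfo, and the swap actually in use computed from them (a sketch for the Linux-based service console):

```shell
#!/bin/sh
# Memory and swap totals, in KB, as top reports them.
grep -E '^(MemTotal|MemFree|SwapTotal|SwapFree):' /proc/meminfo

# Swap in use = SwapTotal - SwapFree.
awk '/^SwapTotal:/ {t=$2} /^SwapFree:/ {f=$2} END {print "SwapUsedKB=" t - f}' /proc/meminfo
```

A consistently large SwapUsedKB value is the symptom of heavy swapping described above.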

I/O starvation can be caused by many issues, but commonly occurs when a LUN is removed and the ESX/ESXi host is not rescanned. To properly remove LUNs from your ESX/ESXi host, see Unpresenting a LUN containing a datastore from ESX 4.x and ESXi 4.x (1015084).

Additional Information

For more information, see VMware HA configuration fails with a VMWareClusterManager Rule not enabled error (1004495).

Tags

high-cpu-usage  slow-performance  resource-starvation

  1. Verify that the storage configuration for the ESX Server service console is correct.  If the storage is not properly configured the ESX Server management service may stop responding causing the ESX Server to appear to be non-responsive. For more information, see Verifying the storage configuration of an ESX Server (1003659).

Identifying shared storage issues with ESX or ESXi

Purpose

This article helps you identify problems related with the storage subsystem of ESX/ESXi.

Resolution

Troubleshooting ESX host storage issues begins with identifying how far-reaching the problem is (its scope). In many cases, a problem may be misidentified until its scope has been ascertained.

To identify the scope of the problem:

  1. Determine whether the storage device cannot be seen by all, or only a subset, of the hosts in the ESX cluster. If so, select the appropriate storage technology:
  2. Determine whether only a single ESX host cannot see the shared storage. If so, select the appropriate storage technology:
  3. Verify that the LUN is presented and available. For more information, see Troubleshooting LUN connectivity issues (1003955).
  4. Verify that the ESX host cannot see the datastore:

Additional Information

Troubleshooting flow chart:

shared-storage  storage-connectivity  iscsi-connectivity  nfs-connectivity  fibre-channel-connectivity  lun-connectivity

Request a Product Feature

To request a new product feature or to provide feedback on a VMware product, please visit the Request a Product Feature page.

Note: If your problem still exists after trying the steps in this article, please:

Tags

cannot-access-host connection-failure does-not-exist esx network-issues not-responding not-responding server-software

Request a Product Feature

To request a new product feature or to provide feedback on a VMware product, please visit the Request a Product Feature page.

Troubleshooting fibre channel storage connectivity

Symptoms

  • No targets from an array can be seen by:
    • All of the ESX hosts
    • All of the ESX hosts on a specific fabric or connected through an ISL link
    • One ESX host
  • Targets on the storage array are visible but one or more LUNs are not
  • LUN not visible
  • LUN cannot connect
  • Connectivity issues to the storage array
  • LUN is missing
  • ESX host initiators are not logging into the array
  • You see one of these errors:
    • Unknown inaccessible
    • SCSI: 4506: “Cannot find a path to device vmhba1:0:8 in a good state”

Purpose

This article guides you through the most common steps to identify a connectivity problem to a shared storage device.

Resolution

Validate that each troubleshooting step below is true for your environment. Each step provides instructions or a link to a document, in order to eliminate possible causes and take corrective action as necessary. The steps are ordered in the most appropriate sequence to isolate the issue and identify the proper resolution. Do not skip a step.

These are common items for troubleshooting connectivity issues to the storage array.

  1. Verify that none of the hosts can see the shared storage. For more information, see Obtaining LUN pathing information for ESX hosts (1003973).

Obtaining LUN pathing information for ESX or ESXi hosts

Purpose

This article explains using tools to determine LUN pathing information for ESX hosts.

Resolution

There are two methods used to obtain the multipath information from the ESX host:

  • ESX command line – use the command line to obtain the multipath information when performing troubleshooting procedures.
  • VMware Infrastructure/vSphere Client – use this option when you are performing system maintenance.

ESXi 5.x

Command line

To obtain LUN multipathing information from the ESXi host command line:

  1. Log in to the ESXi host console.
  2. Type esxcli storage core path list to get detailed information regarding the paths. For example:

    fc.5001438005685fb7:5001438005685fb6-fc.5006048c536915af:5006048c536915af-naa.60060480000290301014533030303130
    UID: fc.5001438005685fb7:5001438005685fb6-fc.5006048c536915af:5006048c536915af-naa.60060480000290301014533030303130
    Runtime Name: vmhba1:C0:T0:L0
    Device: naa.60060480000290301014533030303130
    Device Display Name: EMC Fibre Channel Disk (naa.60060480000290301014533030303130)
    Adapter: vmhba1
    Channel: 0
    Target: 0
    LUN: 0
    Plugin: NMP
    State: active
    Transport: fc
    Adapter Identifier: fc.5001438005685fb7:5001438005685fb6
    Target Identifier: fc.5006048c536915af:5006048c536915af
    Adapter Transport Details: WWNN: 50:01:43:80:05:68:5f:b7 WWPN: 50:01:43:80:05:68:5f:b6
    Target Transport Details: WWNN: 50:06:04:8c:53:69:15:af WWPN: 50:06:04:8c:53:69:15:af
  3. Type esxcli storage core path list -d <naaID> to list the detailed information of the corresponding paths for a specific device.
  4. The command esxcli storage nmp device list lists LUN multipathing information. For example:

    naa.60060480000290301014533030303130
    Device Display Name: EMC Fibre Channel Disk (naa.60060480000290301014533030303130)
    Storage Array Type: VMW_SATP_SYMM
    Storage Array Type Device Config: SATP VMW_SATP_SYMM does not support device configuration.
    Path Selection Policy: VMW_PSP_FIXED
    Path Selection Policy Device Config: {preferred=vmhba0:C0:T1:L0;current=vmhba0:C0:T1:L0}
    Path Selection Policy Device Custom Config:
    Working Paths: vmhba0:C0:T1:L0

    Note: For information on multipathing and path selection options, see  Multipathing policies in ESX/ESXi 4.x and ESXi 5.x (1011340).

vSphere Client

To obtain multipath settings for your storage in vSphere Client:

  1. Select an ESX/ESXi host, and click the Configuration tab.
  2. Click Storage.
  3. Select a datastore or mapped LUN.
  4. Click Properties.
  5. In the Properties dialog, select the desired extent, if necessary.
  6. Click Extent Device > Manage Paths and obtain the paths in the Manage Paths dialog.

For information on multipathing options, see Multipathing policies in ESX/ESXi 4.x and ESXi 5.x (1011340).

ESX 4.x

Command line

To obtain LUN multipathing information from the ESX/ESXi host command line:

  1. Log in to the ESX host console.
  2. Type esxcfg-mpath -b to list all devices with their corresponding paths:

    naa.6006016095101200d2ca9f57c8c2de11: DGC Fibre Channel Disk (naa.6006016095101200d2ca9f57c8c2de11)
    vmhba3:C0:T0:L0 LUN:0 state:active fc Adapter: WWNN: 20:00:00:1b:32:86:5b:73 WWPN: 21:00:00:1b:32:86:5b:73 Target: WWNN: 50:06:01:60:b0:20:f2:d9 WWPN: 50:06:01:60:b0:20:f2:d9
    vmhba3:C0:T1:L0 LUN:0 state:active fc Adapter: WWNN: 20:00:00:1b:32:86:5b:73 WWPN: 21:00:00:1b:32:86:5b:73 Target: WWNN: 50:06:01:60:b0:20:f2:d9 WWPN: 50:06:01:60:b0:20:f2:d9

    The device "naa.6006016095101200d2ca9f57c8c2de11" has 2 paths: vmhba3:C0:T0:L0 and vmhba3:C0:T1:L0.

  3. Type esxcfg-mpath -l to get more detailed information regarding the paths. For example:

fc.2000001b32865b73:2100001b32865b73-fc.50060160c6e018eb:5006016646e018eb-naa.6006016095101200d2ca9f57c8c2de11
Runtime Name: vmhba3:C0:T1:L0
Device: naa.6006016095101200d2ca9f57c8c2de11
Device Display Name: DGC Fibre Channel Disk (naa.6006016095101200d2ca9f57c8c2de11)
Adapter: vmhba3 Channel: 0 Target: 1 LUN: 0
Adapter Identifier: fc.20000000c98f3436:10000000c98f3436
Target Identifier: fc.50060160c6e018eb:5006016646e018eb
Plugin: NMP
State: active
Transport: fc
Adapter Transport Details: WWNN: 20:00:00:1b:32:86:5b:73 WWPN: 21:00:00:1b:32:86:5b:73
Target Transport Details: WWNN: 50:06:01:60:b0:20:f2:d9 WWPN: 50:06:01:60:b0:20:f2:d9

  4. The command esxcli nmp device list lists LUN multipathing information. For example:

    naa.6006016010202a0080b3b8a4cc56e011
    Device Display Name: DGC Fibre Channel Disk (naa.6006016010202a0080b3b8a4cc56e011)
    Storage Array Type: VMW_SATP_ALUA_CX
    Storage Array Type Device Config: {navireg=on, ipfilter=on}{implicit_support=on;explicit_support=on; explicit_allow=on;alua_followover=on;{TPG_id=2,TPG_state=ANO}{TPG_id=1,TPG_state=AO}}
    Path Selection Policy: VMW_PSP_FIXED_AP
    Path Selection Policy Device Config: {preferred=vmhba3:C0:T1:L0;current=vmhba3:C0:T1:L0}
    Working Paths: vmhba3:C0:T1:L0

    The Path Selection Policy (PSP) is what the ESX host uses when it determines which path to use in the event of a failover. Supported PSP options are:

    VMW_PSP_FIXED Fixed Path Selection
    VMW_PSP_MRU Most Recently Used Path Selection
    VMW_PSP_RR Round Robin Path Selection
    VMW_PSP_FIXED_AP Fixed Path Selection with Array Preference (introduced in ESX 4.1)

    Note: For information on multipathing and path selection options, see Multipathing policies in ESX/ESXi 4.x and ESXi 5.x (1011340).
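esxcfg-mpath is only available on the host itself, but its -b output is easy to summarize offline. A sketch that counts paths per device, run here against a captured two-path sample like the one above:

```shell
#!/bin/sh
# Captured `esxcfg-mpath -b` output: a device line followed by its paths.
cat > /tmp/mpath-sample.txt <<'EOF'
naa.6006016095101200d2ca9f57c8c2de11: DGC Fibre Channel Disk (naa.6006016095101200d2ca9f57c8c2de11)
   vmhba3:C0:T0:L0 LUN:0 state:active fc Adapter: WWNN: 20:00:00:1b:32:86:5b:73 WWPN: 21:00:00:1b:32:86:5b:73
   vmhba3:C0:T1:L0 LUN:0 state:active fc Adapter: WWNN: 20:00:00:1b:32:86:5b:73 WWPN: 21:00:00:1b:32:86:5b:73
EOF

# Device lines start with "naa."; indented vmhba lines are its paths.
awk '/^naa\./ {dev=$1; next} /vmhba/ {count[dev]++}
     END {for (d in count) print d, count[d], "paths"}' /tmp/mpath-sample.txt
```

A device reporting fewer paths than expected (for example, one instead of two) is a quick indicator of a dead or unzoned path.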

vSphere Client

To obtain multipath settings for your storage in vSphere Client:

  1. Select an ESX/ESXi host, and click the Configuration tab.
  2. Click Storage.
  3. Select a datastore or mapped LUN.
  4. Click Properties.
  5. In the Properties dialog, select the desired extent, if necessary.
  6. Click Extent Device > Manage Paths and obtain the paths in the Manage Paths dialog.

For information on multipathing options, see Multipathing policies in ESX/ESXi 4.x and ESXi 5.x (1011340).

ESX 3.x

Command line

To obtain LUN multipathing information from the ESX host command line:

  1. Log in to the ESX host console.
  2. Type esxcfg-mpath -l and press Enter. The output appears similar to the following:

    Disk vmhba2:1:4 /dev/sdh (30720MB) has 2 paths and policy of Most Recently Used

    FC 10:3.0 210000e08b89a99b<-> 5006016130221fdd vmhba2:1:4 On active preferred
    FC 10:3.0 210000e08b89a99b<-> 5006016930221fdd vmhba2:3:4 Standby

    Disk vmhba2:1:1 /dev/sde (61440MB) has 2 paths and policy of Most Recently Used

    FC 10:3.0 210000e08b89a99b<->5006016130221fdd vmhba2:1:1 On active preferred
    FC 10:3.0 210000e08b89a99b<->5006016930221fdd vmhba2:3:1 Standby

    In this example, two LUNs are presented.

    As there are no descriptions given, here is an analysis of the information provided for the first LUN:

    • vmhba2:1:4 – This is the canonical device name the ESX host used to refer to the LUN.

      Note: When there are multiple paths to a LUN, the canonical name is the first path that was detected for this LUN.
    • /dev/sdh – This is the associated Linux device handle for the LUN. You must use this reference when using utilities like fdisk.
    • 30720MB – The disk capacity of the LUN, e.g. 30720MB (30GB).
    • Most Recently Used – This is the policy the ESX host uses when it determines which path to use in the event of a failover. The choices are:
      • Most Recently Used: The path used by a LUN is not altered unless an event (user, ESX host, or array initiated) instructs the path to change. If the path changed because of a service interruption along the original path, the path does not fail back when service is restored. This policy is used for Active/Passive arrays and many pseudo active/active arrays.
      • Fixed: The path used by a LUN is always the one marked as preferred, unless that path is unavailable. As soon as the path becomes available again, the preferred path becomes the active path again. This policy is used for Active/Active arrays. An Active/Passive array should never be set to Fixed unless specifically instructed to do so, as this can lead to path thrashing, performance degradation, and virtual machine instability.
      • Round Robin: This is experimentally supported in ESX 3.x and fully supported in ESX 4.x.

        Note: See the Additional Information section for references to the arrays and the policy they use.
    • FC – The LUN disk type. There are three possible values:
      • FC: This LUN is presented through a fibre channel device.
      • iSCSI: This LUN is presented through an iSCSI device.
      • Local: This LUN is a local disk.
    • 10:3.0 – This is the PCI slot identifier, which indicates the physical bus location this HBA is plugged into.
    • 210000e08b89a99b – The HBA World Wide Port Numbers (WWPN) are the hardware addresses (much like the MAC address on a network adapter) of the HBAs.
    • 5006016930221fdd – The Storage processor port World Wide Port Numbers (WWPN) are the hardware addresses of the ports on the storage processors of the array.
    • vmhba2:1:4 – This is the true name for this path. In this example, there are two possible paths to the LUN (vmhba2:1:4 and vmhba2:3:4).
    • On active preferred – The Path status contains the status of the path. There are six attributes that comprise the status:
      • On: This path is active and able to process I/O. When queried, it returns a status of READY.
      • Off: The path has been disabled by the administrator.
      • Dead: This path is no longer available for processing I/O. This can be caused by a physical medium error, or by switch or array misconfiguration.
      • Standby: This path is inactive and cannot process I/O. When queried, it returns a status of NOT_READY.
      • Active: This path is processing I/O for the ESX Server host.
      • Preferred: This is the path that is preferred to be active. This attribute is ignored when the policy is set to Most Recently Used (mru).

VI Client

To obtain multipathing information from VI Client:

  1. Select an ESX host.
    1. Click the Configuration tab.
    2. Click Storage.
    3. Click the VMFS-3 datastore you are interested in.
    4. Click Properties. The following dialog appears:

      From this example, you can see that the canonical name is vmhba2:1:0 and the true paths are vmhba2:1:0 and vmhba2:3:0.
      The active path is vmhba2:1:0 and the policy is Most Recently Used.

    5. Click Manage Paths. The Manage Paths dialog appears:

Additional Information

For more information, see the documentation for your version of ESX and consult the Storage/SAN Compatibility Guide.

  1. Verify that a rescan does not bring the LUNs back. For more information, see Performing a rescan of the storage (1003988).

Performing a rescan of the storage

Purpose

This article explains how to perform a rescan of storage devices. A rescan of storage devices is needed when a storage device has been added, removed, or changed from the array.

Resolution

You can perform a rescan in these ways:

Note: Performing a rescan does not cause a service interruption.

Using the VMware vSphere or VI Client to perform a rescan

To rescan using the vSphere or VI Client:

  1. Log in to the client and select an ESX/ESXi host in your inventory.
  2. Click the Configuration tab.
  3. Click Storage Adapters.
  4. Click the Rescan link.
  5. Click OK to begin the rescan.

    Note: This performs a rescan of every installed Host Bus Adapter (HBA), regardless of the HBA that is selected in the Storage Adapters view.

    The progress of the rescan can be monitored from the ESX/ESXi host console in the /var/log/vmkernel (for ESX hosts) or /var/log/messages (for ESXi) logfiles.

Note: The Rescan in the VI Client, by default, combines the rescan for new LUNs (and the removal of retired ones) with the detection of new VMFS datastores, depending on which check boxes are selected when the rescan is initiated. The rescan and datastore detection are asynchronous processes, so the detection of new datastores may complete before the detection of new LUNs. If a newly added LUN has a VMFS datastore on it, you may need to perform the rescan twice, or perform the HBA rescan and VMFS rescan as separate tasks. You can select either or both operations in the dialog box that appears when you begin a rescan.

Using the ESX/ESXi 4.x and earlier host command line

To perform a rescan from the ESX/ESXi host command-line:

  1. Log in to the ESX/ESXi host console.
  2. Run the command: esxcfg-rescan <vmkernel SCSI adapter name>

    Where <vmkernel SCSI adapter name> is the vmhba# to be rescanned.

    Note: The rescan must be performed on each HBA that is attached to the storage that changed. In ESX 4.x there may not be any output if there are no changes.

    When rescanning a fibre channel Host Bus Adapter (HBA) or local storage, you see an output similar to:

    Rescanning vmhba2…done.
    On scsi3, removing: 0:0 1:0 1:1 1:2 1:3 1:4.
    On scsi3, adding: 0:0 1:0 1:1 1:2 1:3 1:4.

    When rescanning an iSCSI HBA, you see an output similar to:

    Doing iSCSI discovery. This can take a few seconds …
    Rescanning vmhba1…done.
    On scsi2, removing: 0:0 0:10 1:0.
    On scsi2, adding: 0:0 0:10 1:0.

    Note: You do not need to rescan local storage.

    Although the first pass states that it is removing LUNs, no LUN is removed until after the adding phase is complete. Any LUN that was not marked as adding is removed.

  3. To search for new VMFS datastores, run this command: vmkfstools -V

    Note: This command does not generate any output.

    If a new datastore has been detected, it is mounted in /vmfs/volumes/ using its friendly name (if it has one) or its UUID.
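As noted above, a LUN listed on the removing line but absent from the subsequent adding line is one that actually disappeared. That comparison can be scripted against captured rescan output (a sketch; the sample below is hypothetical, with 1:3 and 1:4 not re-added):

```shell
#!/bin/sh
# Captured rescan output: two LUNs (1:3, 1:4) removed and not re-added.
cat > /tmp/rescan-sample.txt <<'EOF'
On scsi3, removing: 0:0 1:0 1:1 1:2 1:3 1:4.
On scsi3, adding: 0:0 1:0 1:1 1:2.
EOF

removed=$(sed -n 's/.*removing: \(.*\)\./\1/p' /tmp/rescan-sample.txt)
added=$(sed -n 's/.*adding: \(.*\)\./\1/p' /tmp/rescan-sample.txt)

# Report target:LUN IDs that were removed but never re-added.
for id in $removed; do
    case " $added " in
        *" $id "*) ;;                 # still present after the rescan
        *) echo "gone: $id" ;;
    esac
done
```

Here the script reports 1:3 and 1:4 as gone, matching the rule that any LUN not marked as adding is removed.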

Using the ESXi 5.x and later host command line

To perform a rescan from the ESXi host command-line:

  1. Log in to the ESXi host console. For more information, see Using Tech Support Mode in ESXi 4.1 (1017910).
  2. To rescan, run one of these commands:
    • To rescan all HBAs:
      esxcli storage core adapter rescan --all
    • To rescan a specific HBA:
      esxcli storage core adapter rescan --adapter <vmkernel SCSI adapter name>

      Where <vmkernel SCSI adapter name> is the vmhba# to be rescanned. To get a list of all adapters, run the esxcli storage core adapter list command.

      Note: There may not be any output if there are no changes.

  3. To search for new VMFS datastores, run this command: vmkfstools -V

    Note: This command does not generate any output.

    If a new datastore has been detected, it is mounted in /vmfs/volumes/ using its friendly name (if it has one) or its UUID.

Tags

lun-paths lun-rescan

Request a Product Feature

To request a new product feature or to provide feedback on a VMware product, please visit the Request a Product Feature page.

  1. Verify the connectivity to the LUNs. For more information, see Troubleshooting LUN connectivity issues (1003955).

Troubleshooting LUN connectivity issues

Symptoms

  • Targets on the storage array are visible but one or more LUNs are not visible
  • LUN not visible
  • LUN cannot connect
  • LUN is missing
  • LUN not presented

Purpose

This document assists you in troubleshooting a scenario where LUNs are missing.

Resolution

Validate that each troubleshooting step below is true for your environment. Each step provides instructions or a link to a document, in order to eliminate possible causes and take corrective action as necessary. The steps are ordered in the most appropriate sequence to isolate the issue and identify the proper resolution. Do not skip a step.

These steps assist you in identifying a LUN connectivity issue:

  1. Verify that the LUN is presented to the ESX host. You may need to contact your array vendor for assistance.
    1. Verify that the LUN is in the same storage group as all the ESX hosts (if applicable to the array).
    2. Verify that the LUN is configured correctly for use with the ESX host.

      Note: Consult the appropriate SAN configuration guide for your array (listed in the Additional Information section).
    3. Verify that the LUN is not set to read-only on the array.
  2. Verify that the ESX host can see the LUN(s). For more information, see Obtaining LUN pathing information for ESX hosts (1003973).

    Note: If LUNs are not visible on the ESX host, see:

Verify that a rescan restores visibility to the LUN(s). For more information, see Performing a rescan of the storage (1003988) and Interpreting SCSI sense codes (289902).

Check the storage for latency. For more information, see Using esxtop to identify storage performance issues (1008205).

Verify that SCSI reservation conflicts are not in excess. For more information, see:

For issues related to the VMFS datastore, see:

Note: If your problem still exists after trying the steps in this article:

Additional Information

For ESX/ESXi 4.1, see:

For ESX / ESXi 4.0, see:

For ESX / ESXi 3.x, see:

For ESX Server 3.5 and 3i (Embedded and Installable), see:

  1. Verify that the fibre switch zoning configuration permits the ESX host to see the storage array. Consult your switch vendor if you require assistance.
  2. Verify that the fibre switch propagates RSCN messages to the ESX hosts. For more information, see Configuring fibre switch so that ESX Server doesn’t require a reboot after a zone set change (1002301).

Configuring fibre switch so that ESX Server doesn’t require a reboot after a zone set change

Details

A change was made to the active zone set of the fabric switches. After a rescan from the VMware Infrastructure/vSphere Client or the ESX command line, all targets affected by the zoning configuration changes are not visible. These targets become visible after the ESX has been rebooted.

Solution

When a change occurs on an active zone set of a fabric switch, most fibre channel switches issue a Register for State Change Notification (RSCN) event to the devices attached to them, such as ESX Servers and storage arrays. The Host Bus Adapter (HBA) drivers used on ESX Server register with the fabric switch to receive RSCN events.  However, the fabric switch may be configured to not issue these events, preventing the ESX Server from receiving these events. This causes target visibility and failover problems on the ESX Server.

The following activities are examples of zone set changes:

  • Adding a zone
  • Removing zones
  • Modifying zones
  • Activating zone sets
  • Deactivating zone sets
  • Enabling and disabling the default zone set

These switches can be configured to suppress RSCN events:

  • Brocade SilkWorm 4100 series switch (re-branded McData Sphereon-3232 series switch).
  • EMC connectrix ED-140M switch.

To enable RSCN events, configure the Switch Operating Parameters so that the Suppress Zoning RSCN on Zone Set Activations is disabled.

Other fibre switches may also be configured to suppress or allow RSCN events. For more information on configuring the fabric switch operating parameters, please contact your switch vendor.

Tags

fibre-channel-connectivity fibre-switch-zoning

  1. Verify that the storage array is listed in the VMware Hardware Compatibility Guide. For more information on confirming hardware compatibility, see Verifying ESX/ESXi host hardware (System, Storage and I/O) devices are supported (1003916).

    Note: Some array vendors have a minimum microcode/firmware version that is required to work with ESX. Consult your array vendor.

Confirming ESX/ESXi host hardware (System, Storage, and I/O) compatibility

Details

This article provides links to ESX/ESXi host Hardware Compatibility Documents (HCLs) so that you can verify your System, Storage, and I/O devices are on the VMware Certified and Supported Hardware Compatibility Lists.

Additionally, you can also verify if your systems and hardware require specific BIOS and firmware versions. If your System, Storage, or I/O devices are not listed or no specific BIOS or firmware versions are listed, contact your OEM or third party vendor for further verification and support.

Solution

VMware Hardware Compatibility Guides

Compare your hardware information with the VMware ESX Server Systems, I/O, and SAN Compatibility guides located at VMware Hardware Compatibility Guides. Review these lists to verify correct system BIOS and firmware levels.

If you have any additional questions, contact your OEM hardware vendor directly to verify that your hardware has the recommended BIOS and firmware versions for all hardware installed in your system and storage devices.

Confirming Hardware Compatibility

To confirm hardware compatibility:

  1. Check the ESX/ESXi host hardware information by running the following command as root in an SSH session: esxcfg-info | less -i

    You see an output similar to:

    |----Product Name............................................ProLiant DL380 G6
    |----Vendor Name.............................................Hewlett-Packard

    Note: Log in to the ESXi host physical console using Tech Support Mode. For more information, see Tech Support Mode for Emergency Support (1003677).
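The two fields above can be lifted out of the full esxcfg-info listing with standard text tools. A minimal sketch, run here against a captured sample of the output rather than a live host:

```shell
# Extract the system vendor/model lines from a captured esxcfg-info listing.
# The sample text below stands in for real output from: esxcfg-info | less -i
sample='|----Product Name............................................ProLiant DL380 G6
|----Vendor Name.............................................Hewlett-Packard'

# Drop the tree prefix and collapse the dot padding into "field: value".
echo "$sample" | sed -e 's/^|----//' -e 's/\.\{2,\}/: /'
```

On a live host the same sed filter can be applied directly to the esxcfg-info output.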

  2. Identify the SCSI shared storage devices by doing the following:

For ESX 3.x, run the command:

cat /proc/vmware/scsi/vmhba#/#:#

Note: The vmhba#/#:# represents the canonical name for the path. For more information, see Identifying disks when working with VMware ESX (1014953).

The output for ESX 3.x is similar to:

Vendor: DGC Model: RAID 5 Rev: 0324

For ESX 4.x, run the command:

esxcfg-scsidevs -l | egrep -i 'display name|vendor'

The output for ESX 4.0 is similar to:

Display Name: Local ServeRA Disk (mpx.vmhba0:C0:T0:L0)
Vendor: ServeRA Model: 8k-l Mirror Revis: V1.0

  3. Run the following command from the ESX host service console to find additional peripherals and devices:

     lspci -vvv

    You see an output similar to:

    02:0e.0 RAID bus controller: Dell Computer Corporation PowerEdge Expandable RAID Controller 4E/SI/DI (rev 06)
    Subsystem: Dell Computer Corporation: Unknown device 016d
    Flags: bus master, stepping, 66Mhz, medium devsel, latency 64, IRQ 24
    Memory at d80f0000 (32-bit, prefetchable) [size=64K]
    Memory at dfdc0000 (32-bit, non-prefetchable) [size=256K]
    Expansion ROM at dfe00000 [disabled] [size=128K]
    Capabilities: [c0] Power Management version 2
    Capabilities: [d0] Message Signalled Interrupts: 64bit+ Queue=0/1 Enable-
    Capabilities: [e0] PCI-X non-bridge device.

    06:07.0 Ethernet controller: Intel Corporation 8254NXX Gigabit Ethernet Controller (rev 05)
    Subsystem: Dell Computer Corporation: Unknown device 016d
    Flags: bus master, 66Mhz, medium devsel, latency 32, IRQ 25
    Memory at dfae0000 (32-bit, non-prefetchable) [size=128K]
    I/O ports at ecc0 [size=64]
    Capabilities: [dc] Power Management version 2
    Capabilities: [e4] PCI-X non-bridge device.

    07:08.0 Ethernet controller: Intel Corporation 8254NXX Gigabit Ethernet Controller (rev 05)
    Subsystem: Dell Computer Corporation: Unknown device 016d
    Flags: bus master, 66Mhz, medium devsel, latency 32, IRQ 26
    Memory at df8e0000 (32-bit, non-prefetchable) [size=128K]
    I/O ports at dcc0 [size=64]
    Capabilities: [dc] Power Management version 2
    Capabilities: [e4] PCI-X non-bridge device.

  4. Compare your hardware information to the VMware ESX Server Systems, I/O, and SAN Compatibility guides.
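Because the full lspci -vvv listing is long, it can help to reduce it to the controller summary lines before checking the compatibility guides. A sketch, run against a captured sample of the output above (the filter pattern is illustrative, not exhaustive):

```shell
# Reduce a captured `lspci -vvv` listing to the controller summary lines,
# which carry the vendor/model strings needed for an HCL lookup.
sample='02:0e.0 RAID bus controller: Dell Computer Corporation PowerEdge Expandable RAID Controller 4E/SI/DI (rev 06)
Subsystem: Dell Computer Corporation: Unknown device 016d
06:07.0 Ethernet controller: Intel Corporation 8254NXX Gigabit Ethernet Controller (rev 05)'

echo "$sample" | grep -Ei 'raid bus controller|ethernet controller|fibre channel'
```

On a live host, pipe `lspci` through the same grep to get one summary line per controller.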
  1. Verify that the initiator is registered on the array, and that the storage array is configured correctly. You may need to contact your storage vendor for instructions on this procedure. For more information, see the Fibre Channel SAN Configuration Guide for your version of ESX/ESXi.
  2. Verify the physical hardware:
    • The storage processors on the array.
    • The fibre switch and the Gigabit Interface Converter (GBIC) units in the switch.
    • The fibre cables between the fibre switch and the array.
    • The array itself.

Partner with the hardware vendor to ensure that the array is properly configured.

Note: A rescan is required after any change is made to see if the targets are detected.

Note: If your problem still exists after trying the steps in this article, please:

Troubleshooting iSCSI array connectivity issues

Symptoms

  • No targets from an array are seen by:
    • All of the ESX hosts
    • All of the ESX hosts on a specific switch or connected through an uplink
    • One ESX host
  • Targets on the array are visible but one or more LUNs are not
  • iSCSI LUN not visible
  • iSCSI LUN cannot connect
  • Connectivity issues to the storage array
  • LUN is missing

Purpose

This article provides steps to troubleshoot iSCSI storage array connectivity issues.

Resolution

Validate that each troubleshooting step below is true for your environment. Each step will provide instructions or a link to a document, in order to eliminate possible causes and take corrective action as necessary. The steps are ordered in the most appropriate sequence to isolate the issue and identify the proper resolution.

The steps outlined below may involve terminal commands. Depending on the version of ESX used, the procedure for running terminal commands varies.

To troubleshoot VMware ESX to iSCSI array connectivity:

Note: A rescan is required after every storage presentation change to the environment.

  1. Log into the ESX or ESXi host and verify that the host can vmkping the iSCSI targets with this command:

     vmkping <target ip>

    If you are running an ESX host, also check that the iSCSI target is pingable:

    ping <target ip>

    Note: Pinging the storage array only applies when using the Software iSCSI initiator. In ESXi, ping and ping6 both run vmkping.

  2. Verify that the host's Host Bus Adapters (HBAs) are able to access the shared storage. For more information, see Obtaining LUN pathing information for ESX hosts (1003973).
  3. Confirm that no firewall is interfering with iSCSI traffic. For details on the ports and firewall requirements for iSCSI, see Port and firewall requirements for NFS and SW iSCSI traffic (1021626). For more information, see Troubleshooting network connection issues caused by firewall configuration (1007911).

     Note: Check SAN and switch configuration, especially if you are using Jumbo Frames (supported from ESX 4.x). To test the ping to a storage array with Jumbo Frames from ESX, run this command:

    vmkping -s <MTUSIZE> <IPADDRESS OF SAN>
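Note that the -s flag specifies the ICMP payload size rather than the full frame size, so a 9000-byte jumbo MTU is usually tested with 9000 minus the 20-byte IP header and 8-byte ICMP header. A sketch of the arithmetic (the 9000-byte MTU is an example value):

```shell
# The -s value is the ICMP payload: interface MTU minus the 20-byte IP
# header and the 8-byte ICMP header. MTU 9000 is an example value.
mtu=9000
payload=$((mtu - 20 - 8))
echo "vmkping -s $payload <IPADDRESS OF SAN>"   # prints: vmkping -s 8972 <IPADDRESS OF SAN>
```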

  4. Ensure that the LUNs are presented to the ESX hosts. On the array side, ensure that the LUN IQNs and access control list (ACL) allow the ESX host HBAs to access the array targets. For more information, see Troubleshooting LUN connectivity issues (1003955).
  5. Verify that a rescan of the HBAs displays presented LUNs in the Storage Adapters view of a VMware ESX host. For more information, see Performing a rescan of the storage (1003988).
  6. Verify your CHAP authentication. If CHAP is configured on the array, ensure that the authentication settings for the VMware ESX hosts are the same as the settings on the array. For more information, see Checking CHAP authentication on the ESX Server (1004029).
  7. Consider pinging any ESX host iSCSI initiator (HBA) from the array’s targets. This test is run from the storage array itself.
  8. Verify that the storage array is listed on the Storage/SAN Compatibility Guide. For more information, see Verifying that ESX/ESXi host hardware (System, Storage, and I/O) devices are supported (1003916).

     Note: Some array vendors have a minimum-recommended microcode/firmware version to operate with VMware ESX. This information is available from the array vendor and the VMware Hardware Compatibility Guide.
  9. Verify that the physical hardware is functioning correctly, including:
    • The Storage Processors (sometimes known as heads) on the array
    • The storage array itself
    • Check SAN and switch configuration, especially if you are using Jumbo Frames (supported from ESX 4.x). To test the ping to a storage array with Jumbo Frames from ESX, run this command:

      vmkping -s <MTUSIZE> <IPADDRESS OF SAN>

Note: Consult your storage array vendor if you require assistance.

  10. Perform some form of network packet tracing and analysis, if required. For more information, see:

Note: If your problem still exists after trying the steps in this article, please:

Tags

iscsi-connectivity iSCSI-Hardware-Software-Initiator esx-esxi-iscsi-connectivity

See Also

Request a Product Feature

To request a new product feature or to provide feedback on a VMware product, please visit the Request a Product Feature page.

http://kb.vmware.com/selfservice/search.do?cmd=displayKC&externalId=1003967

Troubleshooting connectivity issues to an NFS datastore

Symptoms

  • The NFS share cannot be mounted by the ESX/ESXi host.
  • The NFS share is mounted, but nothing can be written to it.

Purpose

This document guides you through the most common steps to identify a connectivity problem from an ESX/ESXi host to an NFS shared storage device.

Resolution

Validate that each troubleshooting step below is true for your environment. The steps provide instructions or a link to a document for validating the step and taking corrective action as necessary. The steps are ordered in the most appropriate sequence to isolate the issue and identify the proper resolution. Please do not skip a step.

  1. Verify connectivity to the NFS server and ensure that it is accessible through the firewalls. For more information, see Cannot connect to NFS network share (1007352).
  2. Verify that the ESX host can vmkping the NFS server. For more information, see Testing VMkernel connectivity with the vmkping command (1003728).
  3. Verify that the NFS host can ping the VMkernel IP of the ESX host.
  4. Verify that the virtual switch being used for storage is configured correctly. For more information, see the Networking Attached Storage section of the ESX Configuration Guide.

    Note: Ensure that there are enough available ports on the virtual switch. For more information, see Network cable of a virtual machine appears unplugged (1004883) and No network connectivity if all ports are in use (1009103).
  5. Verify that the storage array is listed in the Hardware Compatibility Guide. For more information, see the VMware Compatibility Guide. Consult your hardware vendor to ensure that the array is configured properly.

    Note: Some array vendors have a minimum microcode/firmware version that is required to work with ESX.
  6. Verify that the physical hardware functions correctly. Consult your hardware vendor for more details.
  7. Verify that the server (if it is Windows) is correctly configured for NFS. For more information, see Troubleshooting adding a data store from a Windows Services NFS device (1004490).

To troubleshoot a mount being read-only:

  1. Verify that the permissions of the NFS server have not been set to read-only for this ESX host.
  2. Verify that the NFS share was not mounted with the read-only box selected.
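From the service console, a quick way to test for a read-only mount is to attempt a small write on the datastore. A sketch, demonstrated here against a temporary directory standing in for the real /vmfs/volumes path:

```shell
# Attempt a test write; on a read-only NFS mount the touch fails.
# A real check would target the datastore mount point under /vmfs/volumes.
dir=$(mktemp -d)              # stand-in for the NFS datastore path
if touch "$dir/.rw-test" 2>/dev/null; then
    echo "writable"
    rm -f "$dir/.rw-test"
else
    echo "read-only (or permission denied)"
fi
rmdir "$dir"
```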

Note: If your problem still exists after trying the steps in this article, please:

Tags

nfs-connectivity add-nfs-datastore

See Also

Update History

05/11/2010 – Additional troubleshooting steps.

Request a Product Feature

To request a new product feature or to provide feedback on a VMware product, please visit the Request a Product Feature page.

Feedback

http://kb.vmware.com/selfservice/search.do?cmd=displayKC&externalId=1003682

Troubleshooting ESX and ESXi connectivity to fibre channel arrays

Symptoms

  • One ESX host or ESXi host cannot see any targets from all storage arrays.
  • The storage array does not report the HBA of the ESX or ESXi as being logged in.

Tags: fibre-channel-connectivity

Purpose

This article is designed to guide you through the most common steps to identify a connectivity problem from ESX or ESXi to a shared storage device.

Resolution

Please validate that each troubleshooting step below is true for your environment. Each step will provide instructions or a link to a document, in order to eliminate possible causes and take corrective action as necessary. The steps are ordered in the most appropriate sequence to isolate the issue and identify the proper resolution. Please do not skip a step.

To troubleshoot connectivity issues to a fibre channel array:

  1. Verify that ESX or ESXi cannot see any targets in a shared storage environment. For more information, see Obtaining LUN pathing information for ESX hosts (1003973).
  2. Verify that a rescan does not restore visibility to all the targets. For more information, see Performing a rescan of the storage (1003988).
  3. Verify that the Host Bus Adapter (HBA) firmware is at the certified level and is in the VMware Hardware Compatibility Guide. You may need to contact your storage array vendor for instructions on this procedure. For more information, see the Fibre Channel SAN Configuration Guide.
  4. Verify that the storage array is listed in the VMware Hardware Compatibility Guide. For more information on confirming hardware compatibility, see Confirming ESX/ESXi host hardware (System, Storage, and I/O) compatibility (1003916).
  5. Verify that the initiator is registered on the storage array, and that the storage array is configured correctly. You may need to contact your storage array vendor for further assistance. For more information, see Fibre Channel SAN Configuration Guide.
  6. Verify all the fibre channel physical hardware:
    • The fibre switch and the Gigabit Interface Converter (GBIC) units in the switch.
    • The fibre cables between the SAN and the ESX Server.
    • The Host Bus Adapter (HBA).

Note: You may need to contact your hardware vendor for more information about verifying correct functionality.

Note: If your problem still exists after trying the steps in this article, please:

Request a Product Feature

To request a new product feature or to provide feedback on a VMware product, please visit the Request a Product Feature page.

http://kb.vmware.com/selfservice/search.do?cmd=displayKC&externalId=1003951

Troubleshooting ESX and ESXi connectivity to iSCSI arrays using hardware initiators

Symptoms

  • One ESX or ESXi host cannot see any targets from all storage arrays
  • The array does not report the HBA of the ESX or ESXi as being logged in

Purpose

This article guides you through the most common steps to identify a connectivity problem from an ESX or ESXi to an iSCSI shared storage device when using a hardware initiator.

Resolution

Validate that each troubleshooting step below is true for your environment. Each step provides instructions or a link to a document, in order to eliminate possible causes and take corrective action as necessary. The steps are ordered in the most appropriate sequence to isolate the issue and identify the proper resolution. Do not skip a step.

To troubleshoot a connectivity problem from an ESX or ESXi to an iSCSI shared storage device when using a hardware initiator:

  1. Verify that the ESX or ESXi cannot see any targets on shared storage. For more information, see Obtaining LUN pathing information for ESX hosts (1003973).
  2. Verify that a rescan does not restore visibility to the targets. For more information, see Performing a rescan of the storage (1003988).
  3. Verify that the Host Bus Adapter (HBA) is listed in the I/O Compatibility Guide. For more information, see Confirming ESX/ESXi host hardware (System, Storage, and I/O) compatibility (1003916).
  4. Verify that the storage array is listed in the VMware Hardware Compatibility Guide and that the firmware is at the required level. Consult your storage array vendor for details regarding the latest supported firmware version.
  5. Verify that the initiator is registered on the array, and that the storage array is configured correctly. Partner with the hardware vendor to ensure that the array is properly configured. For more information, see the iSCSI SAN Configuration Guide.
  6. Verify that the physical hardware is functioning correctly. You may have to contact your hardware vendor for information about verifying your hardware functions correctly.

Note: If your problem still exists after trying the steps in this article, please:

Tags

iSCSI-Hardware-Software-Initiator iscsi-connectivity esx-esxi-iscsi-connectivity

Request a Product Feature

To request a new product feature or to provide feedback on a VMware product, please visit the Request a Product Feature page.

Feedback

http://kb.vmware.com/selfservice/search.do?cmd=displayKC&externalId=1003952

Troubleshooting ESX and ESXi connectivity to iSCSI arrays using software initiators

Symptoms

  • One ESX or ESXi host cannot see any targets from all storage arrays.
  • The array does not report the HBA of the ESX or ESXi host as being logged in.
  • The array cannot ping the software initiator on the ESX or ESXi host.
  • The ESX or ESXi host cannot ping the storage processor on the array.
  • The ESX or ESXi host cannot vmkping the storage processor on the array.

Tags: iscsi-connectivity iSCSI-Hardware-Software-Initiator esx-esxi-iscsi-connectivity

Purpose

This article guides you through the most common steps to identify a connectivity problem from an ESX or ESXi host to an iSCSI shared storage device using the software initiator.

Resolution

Validate that each troubleshooting step below is true for your environment. Each step will provide instructions or a link to a document, in order to eliminate possible causes and take corrective action as necessary. The steps are ordered in the most appropriate sequence to isolate the issue and identify the proper resolution. Please do not skip a step.

To troubleshoot connectivity to iSCSI arrays using software initiator:

  1. Verify that the ESX or ESXi host can see any targets on shared storage. For more information, see Obtaining LUN pathing information for ESX hosts (1003973).
  2. Verify that a rescan restores visibility to the targets. For more information, see Performing a rescan of the storage (1003988).
  3. Verify that the virtual switch being used for storage has been configured correctly. See Networking Configuration for Software iSCSI Storage in the Server Configuration Guide.

     Note: Ensure that there are enough available ports on the virtual switch. For more information, see Network cable of a virtual machine appears unplugged (1004883) and No network connectivity if all ports are in use (1009103).
  4. Log in to the ESX or ESXi host and verify that the host can vmkping the iSCSI targets with the command:

     vmkping <target ip>

    If you are running on an ESX host, check that the iSCSI target is pingable:

    ping <target ip>

    For more information, see Testing network connectivity with the Ping command (1003486) and Testing vmkernel network connectivity with the vmkping command (1003728).

  5. Verify that the storage array is listed in the VMware Hardware Compatibility Guide and that the initiator is registered on the array. Consult your storage vendor for instructions on this procedure.
  6. Verify that the array is configured correctly for use with ESX Server or ESXi hosts. Partner with your hardware vendor to ensure that the array is properly configured. For more information, see the iSCSI SAN Configuration Guide.
  7. Verify that the physical hardware and physical network hardware are functioning correctly. You may have to contact your hardware vendor for information about verifying your hardware functions correctly.

Note: If your problem still exists after trying the steps in this article, please:

Additional Information

For related information, see:

Request a Product Feature

http://kb.vmware.com/selfservice/search.do?cmd=displayKC&externalId=1003964

Troubleshooting VMFS-3 datastore issues

Symptoms

  • LUN is visible but the datastore is not available in /vmfs/volumes
  • Virtual machines fail to power on
  • Running virtual machines may stop responding, fail, or generate a Blue Screen
  • The ESX host becomes disconnected from VirtualCenter
  • You see these warnings:
    • WARNING: LVM: 4844: [vmhbaH:T:L:P] detected as a snapshot device. Disallowing access to the LUN since resignaturing is turned off.
    • <Date> esx vmkernel: 10:19:07:07.881 cpu3: 10340 SCSI: 5637: status SCSI LUN is in snapshot state, rstatus 0xc0de00 for vmhba1:0:6. residual R 999, CR 8-, ER3.
    • <Date> esx vmkernel: 10:19:07:07.881 cpu3: <world ID> SCSI 6624: Device vmhba1:0:6. is a deactivated snapshot.

Tags: cannot-power-on-vm vm-power-on-fails lun-connectivity

Purpose

This article provides steps to troubleshoot issues when the VMFS-3 datastore does not mount.

Resolution

Validate that each troubleshooting step below is true for your environment. The steps will provide instructions or a link to a document, for validating the step and taking corrective action as necessary. The steps are ordered in the most appropriate sequence to isolate the issue and identify the proper resolution. Please do not skip a step.

  1. Verify that the LUN is presented to the ESX Server host. For more information, see Troubleshooting LUN connectivity issues (1003955).
  2. Verify that the LUN is not being detected as a deactivated snapshot. For more information, see, LUN is incorrectly detected as a snapshot (1002351).
  3. Verify that the datastore is not being detected as a snapshot. For more information, see VMFS Volume Can Be Erroneously Recognized as a Snapshot (6482648).

     Note: A resignature may have occurred, leaving certain ESX Server hosts believing that the LUN is now a snapshot. If you decide to perform a resignature, plan a major outage window to do this. For more information, see:
  4. Verify that the LUN is not larger than 2TB/2047GB. This can occur if a LUN is extended. For more information, see Troubleshooting a LUN that was extended in size past the 2TB/2047GB limit (1004230).
  5. Verify that the LUN is not being masked by the ESX Server. For more information, see:
  6. Verify that write caching is not disabled on the array. This is verified using the storage array management interface. Consult your storage array vendor if you require assistance. Also, see Write-cache disabled on storage array causing performance issues or failures (1002282).
  7. Verify that the partition type for the VMFS-3 partition is set to fb. For more information, see Unable to access the VMFS datastore when the partition is missing or is not set to type fb (1002168).
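The partition system id can be read from fdisk -l output; the VMFS partition should report type fb. A sketch that checks a captured partition-table line (the device and geometry shown are made-up examples):

```shell
# A VMFS-3 partition should show system id 'fb' in `fdisk -l` output.
# Sample line standing in for: fdisk -l /dev/sdb
sample='/dev/sdb1   1  1305  10482381  fb  Unknown'

# Field 5 of this sample line is the partition system id.
if echo "$sample" | awk '{ exit ($5 == "fb") ? 0 : 1 }'; then
    echo "partition type fb (VMFS)"
else
    echo "partition type is not fb"
fi
```

On a live host the column position of the id can shift with flags such as the boot marker, so check the fdisk header row before relying on a fixed field number.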

If the issue continues to exist after trying the steps in this article:

Request a Product Feature

To request a new product feature or to provide feedback on a VMware product, please visit the Request a Product Feature page.

http://kb.vmware.com/selfservice/search.do?cmd=displayKC&docType=kc&docTypeID=DT_KB_1_1&externalId=1011340

Purpose

This article describes the various pathing policies that can be used with VMware ESX/ESXi 4.x and VMware ESXi 5.x.

Resolution

These pathing policies can be used with VMware ESX/ESXi 4.x and ESXi 5.x:

  • Most Recently Used (MRU) — Selects the first working path discovered at system boot time. If this path becomes unavailable, the ESX/ESXi host switches to an alternative path and continues to use the new path while it is available. This is the default policy for Logical Unit Numbers (LUNs) presented from an Active/Passive array. ESX/ESXi does not return to the previous path if, or when, that path returns; it remains on the working path until it, for any reason, fails.

    Note: The preferred flag, while sometimes visible, is not applicable to the MRU pathing policy and can be disregarded.
  • Fixed (Fixed) — Uses the designated preferred path flag, if it has been configured. Otherwise, it uses the first working path discovered at system boot time. If the ESX/ESXi host cannot use the preferred path or it becomes unavailable, ESX/ESXi selects an alternative available path. The host automatically returns to the previously-defined preferred path as soon as it becomes available again. This is the default policy for LUNs presented from an Active/Active storage array.
  • Round Robin (RR) — Uses an automatic path selection rotating through all available paths, enabling the distribution of load across the configured paths.
    • For Active/Passive storage arrays, only the paths to the active controller will be used in the Round Robin policy.
    • For Active/Active storage arrays, all paths will be used in the Round Robin policy.

Note: This policy is not currently supported for Logical Units that are part of a Microsoft Cluster Service (MSCS) virtual machine.

  • Fixed path with Array Preference — The VMW_PSP_FIXED_AP policy was introduced in ESX/ESXi 4.1. It works for both Active/Active and Active/Passive storage arrays that support ALUA. This policy queries the storage array for the preferred path based on the array’s preference. If no preferred path is specified by the user, the storage array selects the preferred path based on specific criteria.

    Note: The VMW_PSP_FIXED_AP policy has been removed from the ESXi 5.0 release, and VMW_PSP_MRU became the default PSP for all ALUA devices.

Notes:

  • These pathing policies apply to VMware’s Native Multipathing (NMP) Path Selection Plugins (PSP). Third party PSPs have their own restrictions.
  • Switching to Round Robin from MRU or Fixed is safe and supported for all arrays. Check with your vendor for supported multipathing policies for your storage array; switching to an unsupported pathing policy can cause an outage.

Warning: VMware does not recommend changing the LUN policy from Fixed to MRU, as the automatic selection of the pathing policy is based on the array that has been detected by the NMP PSP.

Additional Information

The Round Robin (RR) multipathing policy has configurable options that can be modified at the command-line interface. Some of these options include:

  • Number of bytes to send along one path for this device before the PSP switches to the next path.
  • Number of I/O operations to send along one path for this device before the PSP switches to the next path.

For more information, see Round Robin Operations with esxcli nmp roundrobin in the vSphere Command-Line Interface Installation and Reference Guide for the appropriate version of your VMware product.
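For example, on ESX/ESXi 4.x the I/O-operations option is set per device with esxcli nmp roundrobin setconfig. A dry-run sketch that only assembles the command string; the device identifier is a placeholder, and the exact option syntax should be confirmed in the CLI guide for your version:

```shell
# Assemble (but do not execute) an esxcli invocation that switches a
# device's Round Robin policy to rotate paths after every N I/O operations.
# naa.xxxxxxxx is a placeholder device identifier, not a real device.
device="naa.xxxxxxxx"
iops=1
cmd="esxcli nmp roundrobin setconfig --device $device --type iops --iops $iops"
echo "$cmd"
```

Run the assembled command on the host itself once the device identifier and option values have been verified for your environment.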

See the Multipathing Considerations topic in the vSphere 5 Documentation Center.

See the VMware PSPs topic in the vSphere 5 Documentation Center.

Tags

multipathing-san-policies

See Also

Update History

10/3/2011 – Added: Check with your vendor for supported multipathing policies for your storage array; switching to an unsupported pathing policy can cause an outage.

Request a Product Feature

To request a new product feature or to provide feedback on a VMware product, please visit the Request a Product Feature page.

http://kb.vmware.com/selfservice/documentLinkInt.do?micrositeID=&popup=true&languageId=&externalID=1013003

Collecting information about tasks in VMware ESX and ESXi

Symptoms

While troubleshooting issues with VMware ESX and VMware vCenter, there may be differences between what vCenter and ESX consider tasks. An issue may occur when a task within vCenter server times out, and when attempting to run other tasks, it reports the error:

Another task is already in progress.

Purpose

This article provides steps to collect information about tasks for ESX and ESXi hosts.

Resolution

Note: For more information on resolving the symptoms described above, see Restarting the Management agents on an ESX or ESXi Server (1003490).
If your problem is recurring and you need to find out which task the ESX host is taking a long time to process, use the following steps to isolate the task.

ESX

To collect information about tasks for ESX hosts:

  1. Log into the ESX host at the console or via SSH. For more information, see Unable to connect to an ESX host using Secure Shell (SSH) (1003807).
  2. In order to get a list of tasks on this host, run the command:

     vmware-vim-cmd vimsvc/task_list

    The output is similar to:

    (ManagedObjectReference) [
    'vim.Task:haTask-112-vim.VirtualMachine.createSnapshot-3887',
    'vim.Task:haTask-pool21-vim.ResourcePool.updateConfig-33252',
    'vim.Task:haTask-pool22-vim.ResourcePool.updateConfig-33253',
    'vim.Task:haTask-pool3-vim.ResourcePool.updateConfig-33254',
    'vim.Task:haTask-pool5-vim.ResourcePool.updateConfig-33255',
    'vim.Task:haTask-pool6-vim.ResourcePool.updateConfig-33256',
    'vim.Task:haTask-pool7-vim.ResourcePool.updateConfig-33257',
    'vim.Task:haTask-pool8-vim.ResourcePool.updateConfig-33258',
    'vim.Task:haTask-pool10-vim.ResourcePool.updateConfig-33260'
    ]

  3. To get a list of tasks associated with specific virtual machines, you must first get the Vmid of the virtual machine. Run the command:

     vmware-vim-cmd vmsvc/getallvms

    The output is similar to:

    Vmid        Name                  File                       Guest OS       Version   Annotation
    112    VM-1           [Datastore] VM-3/VM-3.vmx      winLonghornGuest        vmx-04
    128    VM-2           [Datastore] VM-3/VM-3.vmx      winXPProGuest           vmx-04
    144    VM-3           [Datastore] VM-3/VM-3.vmx      winNetStandardGuest     vmx-04

  4. Make note of the values under the Vmid column as they will be referenced in later steps.
  5. When you have the Vmid, you can then get a list of tasks associated with a specific virtual machine. Run the command:

     vmware-vim-cmd vmsvc/get.tasklist <VMID>

    where <VMID> is the number identified in step 4.

    The output is similar to:

    (ManagedObjectReference) [
    'vim.Task:haTask-112-vim.VirtualMachine.createSnapshot-3887'
    ]

  6. Make note of the task identifier. In the above example, the task identifier is 3887.
  7. To get information about a particular task’s status, run the command:

     vmware-vim-cmd vimsvc/task_info <task identifier>

    where <task identifier> is the number recorded in step 6.

    The output is similar to:

    (vmodl.fault.ManagedObjectNotFound) {
    dynamicType = <unset>,
    faultCause = (vmodl.MethodFault) null,
    obj = 'vim.Task:3887',
    msg = "The object has already been deleted or has not been completely created",
    }
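Note that the Vmid and the task identifier used in steps 4 to 6 are both embedded in the task name itself. A sketch extracting them from the sample task with standard shell tools:

```shell
# A task name such as haTask-112-vim.VirtualMachine.createSnapshot-3887
# carries the Vmid (112) and the task identifier (3887).
task="haTask-112-vim.VirtualMachine.createSnapshot-3887"

vmid=$(echo "$task" | cut -d- -f2)   # field between the first two dashes
taskid=${task##*-}                   # everything after the last dash
echo "Vmid=$vmid task=$taskid"       # prints: Vmid=112 task=3887
```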

ESXi

To collect information about tasks for ESXi hosts:

  1. Log into the ESXi host at the console. For more information, see Tech Support Mode for Emergency Support (1003677).
  2. To get a list of tasks on this host, run the command: vim-cmd vimsvc/task_list

    The output is similar to:

    (ManagedObjectReference) [
    'vim.Task:haTask-112-vim.VirtualMachine.createSnapshot-3887',
    'vim.Task:haTask-pool21-vim.ResourcePool.updateConfig-33252',
    'vim.Task:haTask-pool22-vim.ResourcePool.updateConfig-33253',
    'vim.Task:haTask-pool3-vim.ResourcePool.updateConfig-33254',
    'vim.Task:haTask-pool5-vim.ResourcePool.updateConfig-33255',
    'vim.Task:haTask-pool6-vim.ResourcePool.updateConfig-33256',
    'vim.Task:haTask-pool7-vim.ResourcePool.updateConfig-33257',
    'vim.Task:haTask-pool8-vim.ResourcePool.updateConfig-33258',
    'vim.Task:haTask-pool10-vim.ResourcePool.updateConfig-33260'
    ]

  3. To get a list of tasks associated with a specific virtual machine, you must first get the Vmid of the virtual machine. Run the command: vim-cmd vmsvc/getallvms

    The output is similar to:

    Vmid        Name                  File                       Guest OS       Version   Annotation
    112    VM-1           [Datastore] VM-3/VM-3.vmx      winLonghornGuest        vmx-04
    128    VM-2           [Datastore] VM-3/VM-3.vmx      winXPProGuest           vmx-04
    144    VM-3           [Datastore] VM-3/VM-3.vmx      winNetStandardGuest     vmx-04

  4. Make note of the values under the Vmid column as they will be referenced in later steps.
  5. When you have the Vmid, you can get a list of tasks associated with a specific virtual machine by running the command: vim-cmd vmsvc/get.tasklist <VMID>

    where <VMID> is the number identified in step 4.

    The output is similar to:

    (ManagedObjectReference) [
    'vim.Task:haTask-112-vim.VirtualMachine.createSnapshot-3887'
    ]

  6. Make note of the task identifier. In the above example, the task identifier is 3887.
  7. To get information about a particular task's status, run the command: vim-cmd vimsvc/task_info <task identifier>

    where <task identifier> is the number recorded in step 6.

    The output is similar to:

    (vmodl.fault.ManagedObjectNotFound) {
    dynamicType = <unset>,
    faultCause = (vmodl.MethodFault) null,
    obj = 'vim.Task:3887',
    msg = "The object has already been deleted or has not been completely created",
    }
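
Steps 3 through 5 above can be combined into a small loop. The sketch below extracts the Vmid column from sample `getallvms` output; on a live ESXi host you would pipe the real `vim-cmd vmsvc/getallvms` output instead (the VM names and paths here are illustrative):

```shell
# Sample output standing in for `vim-cmd vmsvc/getallvms` on a live host.
getallvms_output='Vmid   Name   File                        Guest OS            Version
112    VM-1   [Datastore] VM-1/VM-1.vmx   winLonghornGuest    vmx-04
128    VM-2   [Datastore] VM-2/VM-2.vmx   winXPProGuest       vmx-04'

# Skip the header row and keep the first column (the Vmid)
vmids=$(echo "$getallvms_output" | awk 'NR > 1 {print $1}')
echo "$vmids"

# On an ESXi host, the equivalent loop would be:
# for id in $(vim-cmd vmsvc/getallvms | awk 'NR > 1 {print $1}'); do
#     vim-cmd vmsvc/get.tasklist "$id"
# done
```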

Request a Product Feature

To request a new product feature or to provide feedback on a VMware product, please visit the Request a Product Feature page.

http://kb.vmware.com/selfservice/documentLinkInt.do?micrositeID=&popup=true&languageId=&externalID=1005566

Symptoms

  • Unable to connect with the VMware Infrastructure/vSphere Client to the ESX host from VirtualCenter/vCenter Server.
  • Unable to directly connect with the vSphere Client to the ESX host.

Purpose

The vCenter Server Agent, also referred to as vpxa or the vmware-vpxa service, allows vCenter Server to connect to an ESX host. Specifically, vpxa is the communication conduit to hostd, which in turn communicates with the ESX kernel. This article provides troubleshooting steps for when vpxa does not start.

Resolution

Validate that each troubleshooting step below is true for your environment. Each step provides instructions or a link to a document in order to eliminate possible causes and take corrective action as necessary. The steps are ordered in the most appropriate sequence to isolate the issue and identify the proper resolution.

  1. Log in to the ESX host service console and acquire root privileges.
  2. Verify that the installation of the vCenter Server agent is not corrupted. From the command line, run: rpm -V VMware-vpxa

    The command produces output only if errors are found. For example:

    rpm -V VMware-vpxa
    S.5....T   /opt/vmware/vpxa/sbin/vpxa

    This indicates that the Size, MD5 checksum, and Timestamp for the file /opt/vmware/vpxa/sbin/vpxa are wrong, and therefore the installation of the VMware-vpxa package is corrupt. If you do have a corrupted installation, proceed to the next step to re-install the VirtualCenter agent.

    Note: The Red Hat Package Manager does not verify dynamic configuration files, so errors in them are not reported.

  3. Verify that the vCenter Server agent is the correct version. For more information, see Verifying and reinstalling the correct version of VMware VirtualCenter Server agent (1003714).
  4. Verify that there is adequate disk space available on the ESX host service console. For more information, see Investigating disk space on an ESX/ESXi host (1003564).
  5. Verify that no processes are over utilizing the resources on the ESX host service console. For more information, see Checking for resource starvation of the ESX service console (1003496).
  6. Verify that the vpxa process is not exceeding its allocated memory. Change to the vpxa log directory:

    cd /var/log/vmware/vpxa

    Then view the vpxa log file:

    more vpxa.log

    Examine the log for errors. For example:

    [2007-07-28 17:57:25.416 'Memory checker' 5458864 error] Current value 143700 exceeds hard limit 128000. Shutting down process.
    [2007-07-28 17:57:25.420 'Memory checker' 3076453280 info] Resource checker stopped.

    These errors indicate there is not enough service console memory allocated for the vCenter Server agent. This may be due to one or more causes:
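
The log check in step 6 can be scripted. This sketch searches a sample log for hard-limit violations; on an ESX host you would point it at /var/log/vmware/vpxa/vpxa.log instead:

```shell
# Build a sample vpxa log; on a real host, use /var/log/vmware/vpxa/vpxa.log.
log_file=$(mktemp)
cat > "$log_file" <<'EOF'
[2007-07-28 17:57:25.416 'Memory checker' 5458864 error] Current value 143700 exceeds hard limit 128000. Shutting down process.
[2007-07-28 17:57:25.420 'Memory checker' 3076453280 info] Resource checker stopped.
EOF

# Count hard-limit violations, which indicate vpxa exhausted its allocation
hits=$(grep -c "exceeds hard limit" "$log_file")
echo "memory-limit errors found: $hits"
rm -f "$log_file"
```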

Tags

vpxa-agent  virtualcenter-agent-cannot-start

See Also

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1001596

Troubleshooting VMware High Availability (HA)

Details

You are experiencing:

  • VMware High Availability (HA) failover errors:
    HA agent on <server> in cluster <cluster> in <datacenter> has an error

    Insufficient resources to satisfy HA failover level on cluster

  • HA agent configuration errors on ESX hosts:
    • Failed to connect to host
    • Failed to install the VirtualCenter agent
    • cmd addnode failed for primary node: Internal AAM Error – agent could not start
    • cmd addnode failed for primary node:/opt/vmware/aam/bin/ft_startup failed
  • Configuration of hosts IP address is inconsistent on host <hostname> address resolved to <IP> and <IP>
  • Port errors: Ports not freed after stop_ftbb
  • The first node in the HA cluster enables correctly but the second node fails to configure HA just after 90%.
  • The network settings and HA configuration are all correct. DNS and ping tests are all okay.
  • Disabling and re-enabling HA on the cluster did not resolve the issue.
  • VMware Infrastructure (VI) Client displays the error: Internal AAM Errors – agent could not start
  • In the aam logs on the ESX host, the file aam_config_util_addnode.log shows text similar to:
    11/27/09 16:20:49 [myexit ] Failure location:
    11/27/09 16:20:49 [myexit ] function main::myexit called from line 2199
    11/27/09 16:20:49 [myexit ] function main::start_agent called from line 1168
    11/27/09 16:20:49 [myexit ] function main::add_aam_node called from line 171
    11/27/09 16:20:49 [myexit ] VMwareresult=failure
  • Adding a host to the cluster fails with the error: Cannot complete the configuration of the HA agent on the host. Other HA configuration error.

Solution

Note: For VMware vCenter Server 5.0 HA/FDM troubleshooting, see Troubleshooting Fault Domain Manager (FDM) issues (2004429).

This article guides you through the process of troubleshooting a VMware HA cluster. It identifies common configuration problems and confirms the availability of required resources on your ESX Server.

Validate that each troubleshooting step below is true for your environment. Each step provides instructions or a link to a document in order to eliminate possible causes and take corrective action as necessary. The steps are ordered in the most appropriate sequence to isolate the issue and identify the proper resolution. Do not skip a step.

Note: If you perform a corrective action in any of the following steps, attempt to reconfigure VMware HA. Many issues can also be resolved by unconfiguring and then reconfiguring HA; try this before proceeding with these steps.

  1. Check the release notes for current releases to see if the problem has been resolved in a bug fix. See the Documentation page for vSphere 4 or VMware Infrastructure 3.
  2. Verify that there are enough licenses to configure VMware HA. For more information, see Verifying that a feature is licensed (1003692).
  3. Verify that name resolution is correctly configured on the ESX Server. For more information, see Identifying issues with and setting up name resolution on ESX Server (1003735).
  4. Verify that name resolution is correctly configured on the vCenter Server. For more information, see Configuring name resolution for VMware VirtualCenter (1003713).
  5. Verify that the time is correct on all ESX Servers with the date command. For more information on setting up time synchronization with ESX Server, see Installing and Configuring NTP on VMware ESX Server (1339).
  6. Verify that network connectivity exists from the VirtualCenter Server to the ESX Server. For more information, see Testing network connectivity with the Ping command (1003486).
  7. Verify that network connectivity exists from the ESX Server to the isolation response address. For more information, see Testing network connectivity with the Ping command (1003486).
  8. Verify that all of the required network ports are open. For more information, see Testing port connectivity with the Telnet command (1003487).
    Notes:
    • HA uses the following ports:
      Incoming: TCP/UDP 8042-8045
      Outgoing: TCP/UDP 2050-2250
    • Also, ensure that AAM (Automated Availability Manager) is enabled in the ESX Security Profile. If these ports are not open on the ESX firewall, HA will not configure.
  9. If configured with Advanced Settings, confirm that the configuration is valid. For more information, see Advanced Configuration options for VMware High Availability (1006421).
  10. Verify that the correct version of the VirtualCenter agent service is installed. For more information on determining agent versions and how to manually uninstall and reinstall the HA agents on an ESX host, see Verifying and reinstalling the correct version of VMware VirtualCenter Server agent (1003714).
  11. Verify the VirtualCenter Server Service has been restarted. To restart the VirtualCenter Server Service, see Stopping, starting, or restarting the vCenter Server service (1003895).
  12. Verify that VMware HA is only attempting to configure on one Service Console. For more information, see VMware High Availability configuration issues when an iSCSI Service Console is on the same network (1003789).
  13. Verify that the VMware HA cluster is not corrupted. To do this you need to create another cluster as a test. For more information, see Recreating VMware High Availability Cluster (1003715).
  14. Verify that UDP 8043 packets used for the HA backbone communications are not dropped between the ESX hosts. For more information, see HA fails to configure after task passes 90% "Internal AAM Error – agent could not start" (1018217).
  15. Ensure that the ESXi host userworld swap option is enabled. For more information, see ESXi hosts without swap enabled cannot be added to a VMware High Availability Cluster (1004177).
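
The port checks in step 8 can be sketched with bash's built-in /dev/tcp pseudo-device, which avoids depending on telnet or nc being installed. The host name below is a placeholder; substitute the ESX host you are testing (a refused or filtered port is reported as closed):

```shell
# Probe the HA incoming TCP ports 8042-8045 on a host. "localhost" is a
# placeholder; substitute the ESX host's name or IP. Requires bash for
# the /dev/tcp redirection.
host="localhost"
results=""
for port in $(seq 8042 8045); do
    if timeout 2 bash -c "echo > /dev/tcp/$host/$port" 2>/dev/null; then
        status="open"
    else
        status="closed"
    fi
    results="$results$port=$status "
done
echo "$results"
```

Note that this only covers the TCP side; UDP reachability on the same ports still needs a capture-based check such as the tcpdump method described in KB 1018217.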

Notes: If your problem still exists after trying the steps in this article:

Additional Information:

Tags

service-console ha-agent ha-agent-failure ha-fails ha-host-failure vmware-high-availability

This Article Replaces

1003691

1018125

1001068

http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&externalId=1003735

Identifying issues with and setting up name resolution on ESX/ESXi Server

Symptoms

  • Unable to configure VMware High Availability (HA)
  • Configuring VMware HA fails at about 97% completion
  • Configuring VMware HA fails between 85% and 91%.
  • Reconfiguring VMware HA fails
  • Adding an ESX host to a cluster fails
  • Enabling VMware HA in a cluster fails
  • These errors are generated when attempting to configure VMware HA:
    • An error occurred during configuration of the HA agent on the host
    • HA agent on <ESX> in cluster <cluster> in <Datacenter> has an error
    • Error: Cannot complete the configuration of the HA agent on the host. Other HA configuration error
    • cmd addnode failed for primary node:/opt/vmware/aam/bin/ft_startup failed
  • Host fails to remediate or exit Maintenance Mode.
  • You see the error: Operation timed out
  • In the Tasks view of VirtualCenter/vCenter Server, you see the error: There are errors during the remediation operation
    Failed to find host
  • Update Manager is unable to scan ESX/ESXi hosts.

Purpose

The errors listed in the Symptoms section are generated as a result of name resolution issues.

This article guides you through identifying issues with name resolution which can seriously impact the normal operation of ESX/ESXi, particularly in HA clustered environments. The article also details correctly configuring host files when there is no DNS server in the environment or if the DNS server is incorrectly configured.

Resolution

Identifying issues

There is a problem with name resolution if any of the following tests fail. ESX/ESXi hosts must be able to find each other by:

  • IP address
  • Short Name
  • Fully Qualified Domain Name (FQDN)

If an issue with name resolution has been identified it must be resolved either on the DNS server or by using hosts files.

Note: After making any changes to DNS or hosts files, be sure to delete the file /etc/FT_HOSTS (or /etc/opt/vmware/aam/FT_HOSTS and /var/run/vmware/aam/FT_HOSTS) on all affected ESX/ESXi hosts.

  1. Verify that all ESX/ESXi hosts can ping each other by short name. All ESX/ESXi hosts in the environment must be able to ping each other by using short name only. For more information, see Testing network connectivity with the Ping command (1003486).
  2. Verify that all ESX/ESXi hosts can look up each other's IP addresses. Use nslookup to verify that the right name is associated with a particular IP address (a reverse lookup).

    For example:

    [root@esx-server-1 /]# nslookup 192.168.0.5
    Server: 192.168.0.7
    Address: 192.168.0.7#53

    5.0.168.192.in-addr.arpa name = esx-server-2.domain.com.

  3. Verify that all ESX/ESXi hosts can look up each other's names. Use nslookup to verify that the right IP address is associated with a particular name (a forward lookup).

    For example:

    [root@esx-server-1 /]# nslookup esx-server-2
    Server: 192.168.0.7
    Address: 192.168.0.7#53

    Name: esx-server-2.domain.com
    Address: 192.168.0.5
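
These lookups can be scripted across a list of hosts. The sketch below uses getent, which consults the same resolver order (DNS and /etc/hosts) that the lookups above rely on; localhost stands in for a real ESX host name such as esx-server-2:

```shell
# Resolve a name and report the first address returned, or flag a failure.
check_name() {
    local name="$1"
    local entry
    entry=$(getent hosts "$name") || { echo "$name: NOT RESOLVED"; return 1; }
    echo "$name -> $(echo "$entry" | awk '{print $1}' | head -n 1)"
}

# "localhost" is a placeholder; loop over your ESX host names instead.
check_name localhost
```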

Additional checks for VMware High Availability (HA) environments

The following are additional checks for VMware High Availability environments:

  1. Verify that the reported host name is in lowercase when you run:
    [root@esx-server-1 /]# hostname
    [root@esx-server-1 /]# hostname -s
  2. Verify that all host names in /etc/hosts are in lowercase.
  3. Verify that search domain in /etc/resolv.conf is in lowercase.
  4. Verify that the host name in /etc/sysconfig/network is a fully qualified domain name, and is lowercase.
  5. Verify that the host name in /etc/vmware/esx.conf is a fully qualified domain name, and is lowercase.
  6. If your ESX/ESXi hosts are registered in DNS, verify that your system host name is lowercase. Run the following command to ensure the FQDN is resolvable and all lowercase: nslookup <short hostname>
  7. Verify that all primary Service Consoles in the VMware HA cluster have the same name.
  8. Verify that all primary Service Consoles are in the same IP subnet.
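
The lowercase requirement in steps 1 through 6 can be verified mechanically by comparing each value against its tr-folded form (a sketch; feed it the hostname output and the values from the files listed above):

```shell
# Succeed only if the argument is entirely lowercase, as HA requires.
is_lowercase() {
    [ "$1" = "$(printf '%s' "$1" | tr 'A-Z' 'a-z')" ]
}

is_lowercase "esx-server-1.domain.com" && echo "ok"
is_lowercase "ESX-Server-1.domain.com" || echo "mixed case detected"
```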

Note: If a VMotion VMkernel port is on same vSwitch as the primary Service Console, or if a host has multiple Service Consoles, refer to After installation or upgrade to VirtualCenter 2.5.0 Update 2 an Incompatible HA Networks error is generated (1006541).

Configuring hosts files on ESX/ESXi

The hosts file on the ESX host is located at /etc/hosts.

Open the file for editing using a text editor such as nano or vi.

Below is an example hosts file:

# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost

# Any line beginning with a pound sign is a comment and will not be read.
192.168.0.5 esx-server-1.domain.com esx-server-1
192.168.0.6 esx-server-2.domain.com esx-server-2

# The VMware Virtual Center Server must also have an entry here

192.168.0.20 virtualcenter.domain.com virtualcenter

Note: localhost must always be present in the hosts file. Do not modify or remove the entry for localhost.

  • The hosts file must be identical on all ESX/ESXi hosts in the cluster.
  • There must be an entry for every ESX/ESXi host in the cluster.
  • Every host must have an IP address, Fully Qualified Domain Name (FQDN), and short name.
  • The hosts file is case sensitive. Be sure to use lowercase throughout the environment.
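
Consistency of the hosts file can be spot-checked with a short script. This sketch validates a sample file built to match the example above; on an ESX host you would check /etc/hosts against your own list of cluster hosts:

```shell
# Build a sample hosts file matching the example; on ESX, use /etc/hosts.
hosts_file=$(mktemp)
cat > "$hosts_file" <<'EOF'
127.0.0.1 localhost.localdomain localhost
192.168.0.5 esx-server-1.domain.com esx-server-1
192.168.0.6 esx-server-2.domain.com esx-server-2
EOF

# Each expected host must appear with both its FQDN and its short name.
missing=0
for h in esx-server-1 esx-server-2; do
    grep -q "$h\.domain\.com.* $h$" "$hosts_file" || { echo "missing: $h"; missing=1; }
done
[ "$missing" -eq 0 ] && echo "hosts file OK"
rm -f "$hosts_file"
```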

Tags

cannot-exit-maintenance-mode ha-fails host-fails-to-remediate name-resolution-esx name-resolution-issues

See Also

Update History

http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&externalId=1003713

Configuring name resolution for VMware vCenter Server

Symptoms

  • Unable to add a host to VirtualCenter by name, but able to add it by IP address.
  • VMware High Availability (HA) Cluster fails to configure.
  • When adding an ESX or ESXi host to VirtualCenter, you see the error:

The host type is not supported, or was added to a cluster but does not support clustering features.

Purpose

This article guides you through the process of configuring name resolution for VMware VirtualCenter. Configuring name resolution is often necessary when there is no DNS server on the network or the DNS server is incorrectly configured.

Resolution

If you are using a DNS server, use the nslookup utility on the client machine to check that the DNS name (for example, esx1.mycompany.com) resolves to the correct IP address. For example:

# nslookup esx1.mycompany.com

If the name does not resolve to the correct IP address, edit the hosts file to configure name resolution. The default location for the hosts file is %SystemRoot%\system32\drivers\etc\hosts.

The hosts file location may be changed. The directory is determined by the registry key:

\HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\DataBasePath

Populate the hosts file with IP address, Fully Qualified Domain Name (FQDN), and Short Name of all ESX Servers.

The contents of the hosts file must look similar to the following.

192.168.0.1         server1.domain.com         server1

192.168.0.2         server2.domain.com         server2

192.168.0.3         server3.domain.com         server3

where server# is the name of your server.

Verify that all ESX Servers are pingable by name from the VMware VirtualCenter Server. For more information, see Testing network connectivity with the Ping command (1003486).

You must repeat the above steps on machines using VMware Virtual Infrastructure Client.

Tags

configure-name-resolution-fo-vcenter dns-incorrectly-configured ha-fails host-type-unsupported no-dns-server

See Also

http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&externalId=1018217

HA fails to configure at 90% completion with the error: Internal AAM Error – agent could not start

Symptoms

  • On the first node in a cluster, HA enables correctly.
  • On the second node in a cluster, HA configuration fails at 90% with the errors:
    • In VI Client or vSphere Client: Internal AAM Errors – agent could not start
    • In the aam logs on the ESX host, aam_config_util_addnode.log contains entries similar to:
      11/27/09 16:20:49 [myexit ] Failure location:
      11/27/09 16:20:49 [myexit ] function main::myexit called from line 2199
      11/27/09 16:20:49 [myexit ] function main::start_agent called from line 1168
      11/27/09 16:20:49 [myexit ] function main::add_aam_node called from line 171
      11/27/09 16:20:49 [myexit ] VMwareresult=failure
  • The network settings and HA configuration are correct.
  • DNS and ping tests work correctly.

Resolution

It has been reported in a small number of cases that UDP packets between the ESX hosts on port 8043 are dropped. Port 8043 is used between primary nodes for FT backbone communication. This has been known to occur with HP ProCurve 1810G switches that have the automatic denial-of-service protection feature enabled and on Cisco 4948 switches with ICMP rate limiting enabled.

This can be checked (on ESX classic) by running the tcpdump command on the service consoles of the ESX hosts in question:

For example:

# tcpdump -i vswif0 -s 900 -n udp port 8043 -w ${HOSTNAME}.pcap

Run the above command on each ESX host in the cluster at the same time, then enable HA. When it fails, stop tcpdump and compare the capture files with a packet analyzer such as Wireshark. If you see packets sent from ESX A to ESX B but not received by ESX B, there may be an issue with the external network infrastructure.

Check whether the physical switch is using any IPS/IDS (Intrusion Prevention/Intrusion Detection) or anti-DoS (Denial of Service) features. If possible, disable these features temporarily and re-test HA.

Additionally, try connecting the service console network uplinks to a different physical switch.

Note: HA uses the following ports:

  • Incoming port – TCP/UDP 8042-8045
  • Outgoing port – TCP/UDP 2050-2250

Additional Information

If this issue persists, file a support request with VMware Support and note this KB article ID in the problem description. For more information, see Filing a Support Request (1021619).

Tags

iscsci-service-console

See Also

http://kb.vmware.com/selfservice/documentLinkInt.do?micrositeID=&popup=true&languageId=&externalID=1004050

Troubleshooting template deployment or cloning when it fails

Symptoms

  • Deploying a template fails
  • Cannot deploy a virtual machine
  • Failure when cloning virtual machines
  • Failure to boot guest operating system after installing VMware Tools
  • Cannot clone a virtual machine
  • You receive errors similar to: Network copy failed for file.
    [] /home/vmware/xxx/nvram
  • You receive this error when attempting to clone a virtual machine through vCenter Server: Failed to connect to host

Purpose

This article provides troubleshooting steps for failures when deploying templates or cloning virtual machines in vCenter Server. These steps eliminate common causes of the problem by verifying the sysprep files, the virtual machine configuration, and the vCenter Server configuration.

Resolution

Validate that each troubleshooting step below is true for your environment. Each step provides instructions or a link to a document to eliminate possible causes and take corrective action as necessary. The steps are ordered in the most appropriate sequence to isolate the issue and identify the proper resolution. Do not skip a step.

  1. Name resolution problems in the ESX/ESXi environment can cause issues with file copies. Verify that Name Resolution is valid on ESX/ESXi. For more information, see Identifying issues with and setting up name resolution on ESX/ESXi Server (1003735).
  2. The correct version of sysprep is critical in deployments. Verify that the correct version of Microsoft sysprep has been installed to properly do the customization. For more information, see Sysprep files locations and versions (1005593).
  3. Verify that virtual machines are installed with the latest version of VMware Tools. For more information, see Verifying a VMware Tools build version (1003947).
  4. In some cases, other VMware products can interfere with the normal operation of vCenter Server. Verify that no other VMware products have been installed on vCenter Server. For more information, see Ensuring vCenter Server is the only VMware product installed on host (1005594).
  5. Verify that the guest operating system is defined correctly. Incorrect definitions can cause deployment failures. For more information, see Ensuring the guest operating system type is set correctly (1005870).
  6. Verify whether the slow deployment is specific to one template or affects all templates. To properly test this, VMware recommends creating a brand new template and then testing the deployment. This provides a clean test of the environment. For more information, see Deploying a single template is slow (1004028).
  7. Determine whether the issue is caused by conditions related to slow cloning or deployment of templates. For more information, see Diagnosing slow deployment of templates or clones from VirtualCenter (1004002).

Note: If your problem still exists after trying the steps in this article, please:

Additional Information

Verify on ESX/ESXi 4.1 whether the storage array devices in the environment support the hardware acceleration functionality and respond correctly to VAAI primitives. If there is no VAAI support on the array, cloning or Storage vMotion may fail at 18%. For more information, see Cloning or Storage vMotion fails at 18% with the error: Failed to clone: Connection timed out (1029244).

For further information on VAAI, see:

Incompatible HA Networks appearing when attempting to configure HA (High Availability)

Symptoms

  • VMware High Availability (HA) does not work.
  • Configuration of VMware HA fails with VirtualCenter 2.5 Update 2.
  • Errors similar to these appear in the Tasks and Events tab:
    • HA agent on <esxhostname> in cluster <clustername> in <datacenter> has an error Incompatible HA Networks:
      Cluster has network(s) missing on host: x.x.x.x
      Consider using the Advanced Cluster Settings das.allowNetwork to control network usage.
    • Currently has no available networks for HA communication. The following networks are currently used by HA: Service console

Resolution

As of VirtualCenter 2.5 Update 2, two advanced options have been added to allow greater control over the networks used for cluster communication. These two advanced settings enable greater flexibility in the control and usage of VMware HA networks, while still preserving the VMware HA requirement that all cluster nodes have identical, compatible networks.

Note: This new network compatibility check is implemented to increase cluster reliability and only impacts existing clusters after VirtualCenter is upgraded to Update 2 if they were previously configured with incompatible networks.

The parameters are:

  • das.allowVmotionNetworks – Allows for a NIC that is used for VMotion networks to be considered for VMware HA usage. This parameter enables a host that has only one NIC configured for management and VMotion combined to be used in VMware High Availability communication. By default, any VMotion network is ignored. To use the VMotion network, add das.allowVmotionNetworks with a value of true to the HA advanced options.
  • das.allowNetwork[…]- Allows the use of port group names to control the networks used for VMware HA. The value is set as the name of the portgroup, for example, Service Console or Management Network. When configured, the VMware HA cluster only uses the specified networks for VMware HA communication.

This error is often experienced after an upgrade when the cluster contains both ESX and ESXi hosts, if they previously had incompatible networks and were successfully configured. The reason is that standard ESX hosts have a Service Console portgroup for the service console IP address and a VMkernel portgroup for the VMotion IP address, whereas on ESXi both the management IP address (similar to the service console in standard ESX 3.x) and the VMotion IP address are on the VMkernel portgroup. Therefore, standard ESX hosts use only the service console network, whereas ESXi hosts use both (or whichever was picked up first), causing the incompatible network error to be displayed.

To configure VMware HA to use the new settings:

  1. Log in to VirtualCenter with VMware Infrastructure (VI) Client as an administrator.
  2. Edit the settings of the cluster and deselect Enable VMware HA.
  3. Click OK, and wait for the servers to unconfigure for VMware HA.
  4. Click ESX Server > Configuration > Networking on each of the ESX hosts in the cluster and note the portgroups that are common between the servers.
    Note: The portgroup names should all be identical, and the portgroups must have network connectivity with each other. For more information about testing network connectivity using ping, see Testing network connectivity with the Ping command (1003486).
  5. Edit the settings of the cluster, and select Enable VMware HA.
  6. Click VMware HA.
  7. Click Advanced Options.
  8. Add the das.allowNetworkX option with a value of the portgroup name, where X is a number between 0 and 10. You can add more than one portgroup, if required (to a maximum of 10).
  9. When all network portgroups have been added, click OK to exit out of the advanced options.
  10. Click OK to exit out of the Cluster configuration, and start the reconfiguration of VMware HA with the modified network settings.
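
For example, assuming portgroups named Service Console and Management Network are common to every host in the cluster, the advanced options entered in step 8 would look similar to the following (a sketch; substitute the portgroup names you noted in step 4):

```
das.allowNetwork0 = Service Console
das.allowNetwork1 = Management Network
```

With these set, HA restricts its heartbeat traffic to the listed portgroups on every host, which is what resolves the mixed ESX/ESXi incompatibility described above.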

Cloning or Storage vMotion fails at 18% with the error: Failed to clone: Connection timed out

Symptoms

  • Storage vMotion of a virtual machine fails at 18% on ESX/ESXi 4.1 hosts
  • You see the error: Failed to clone: Connection timed out.
  • The ESX /var/log/vmkernel log or the ESXi 4.x /var/log/messages log contains entries similar to:
    Oct 5 19:15:12 vmkernel: 26:15:27:13.225 cpu4:4100)NMP: nmp_CompleteCommandForPath: Command 0x83 (0x41033f4e0040) to NMP device "naa.60000970000192601889533031343443" failed on physical path "vmhba2:C0:T0:L72" H:0x0 D:0x2 P:0x0 Valid sense data: 0x5
    Oct 5 19:15:12 0x25 0x0.
    Oct 5 19:15:12 vmkernel: 26:15:27:13.226 cpu4:4100)NMP: nmp_CompleteCommandForPath: Command 0x83 (0x41033f6b0540) to NMP device "naa.60000970000192601889533031343443" failed on physical path "vmhba2:C0:T0:L72" H:0x0 D:0x2 P:0x0 Valid sense data: 0x5
    Oct 5 19:15:12 0x25 0x0.
    Oct 5 19:15:12 vmkernel: 26:15:27:13.227 cpu4:4100)NMP: nmp_CompleteCommandForPath: Command 0x83 (0x41033f37cb40) to NMP device "naa.60000970000192601889533031343443" failed on physical path "vmhba2:C0:T0:L72" H:0x0 D:0x2 P:0x0 Valid sense data: 0x5
    Oct 5 19:15:12 0x25 0x0.
    Oct 5 19:15:13 vmkernel: 26:15:27:14.237 cpu4:4440)WARNING: NMP: nmpDeviceAttemptFailover: Retry world failover device "naa.60000970000192601889533031343443" - issuing command 0x41033f6ba240
    Oct 5 19:15:13 vmkernel: 26:15:27:14.249 cpu4:4100)WARNING: NMP: nmpCompleteRetryForPath: Retry command 0x83 (0x41033f6ba240) to NMP device "naa.60000970000192601889533031343443" failed on physical path "vmhba2:C0:T0:L72" H:0x0 D:0x2 P:0x0 Valid sen
    Oct 5 19:15:13 se data: 0x5 0x25 0x0.
  • ESX/ESXi 3.5 and 4.0 do not experience this issue and can perform clones and Storage vMotion when sharing connectivity to the same storage

Resolution

This issue is caused by an incorrect response provided by the attached SAN storage for a SCSI command that it did not support. According to the logs, the SAN array responded with Illegal Request (0x5) – Logical Unit Not Supported (0x25 0x0) to SCSI command type 0x83 (Extended Copy). The Extended Copy command is part of the vStorage APIs for Array Integration feature in VMware ESX and ESXi 4.1.

When VMware ESX or ESXi performs a Storage vMotion task, virtual machine data is moved between disks. If ESX is connected to an array that supports vStorage APIs for Array Integration (VAAI), it attempts an Extended Copy for disk clones. When the array does not support VAAI, ESX performs the move as a standard disk clone using host and SAN I/O.

In the above case, the Extended Copy was not supported by the SAN array. A valid response is Illegal Request – Invalid Field in Command Descriptor Block (CDB), as opposed to Illegal Request – Logical Unit Not Supported. It may be possible to enable VAAI support for a SAN array with additional firmware updates; engage your array vendor.

This miscommunication results in an operation time out for the disk copy on VMware ESX and a failed Storage vMotion. As a workaround, the VAAI primitives can be disabled in VMware ESX. With this configuration, ESX performs the data copy via standard disk clone using host and SAN I/O.

To disable array hardware acceleration for Cloning and Storage vMotion operations, see Disabling the VAAI functionality in ESX/ESXi 4.1 (1033665).

Note: In addition, you may have to update the firmware of the array to support VAAI.

Additional Information

VMware ESX, if connected to an array that supports VAAI, allows the array to offload workloads, such as block copying and allocation. This allows for operations such as virtual machine disk clones or Storage vMotion and block zeroing/allocations to be performed without consuming additional host computing resources or inducing unnecessary storage I/O. This capability is introduced with VMware vStorage APIs for Array Integration (VAAI) for VMware ESX/ESXi 4.1.

For more information, see vStorage APIs for Array Integration FAQ (1021976).

Note: VMware ESX attempts to perform or test VAAI capabilities (primitives) on attached storage devices associated to an array which conforms to a VAAI rule every 15 minutes. In the above scenario, there may be benign or informational errors logged every 15 minutes against all storage devices while these primitives are tested.

For additional technical information on SCSI command 0x83, see T10/99-143r1: 7.1 EXTENDED COPY command.

Note: The preceding link was correct as of April 14, 2010. If you find the link is broken, provide feedback and a VMware employee will update the link.

Request a Product Feature

To request a new product feature or to provide feedback on a VMware product, please visit the Request a Product Feature page.

Sysprep file locations and versions

Symptoms

  • When attempting to customize the deployment of a virtual machine, the radio buttons are disabled (grayed out).
  • When a virtual machine is deployed from a template, the SID is always the same, despite the fact that you chose the option to generate a new SID during template deployment and guest operating system customization.
  • When attempting to create a new virtual machine from a template in vCenter Server, you see the error: Warning: Windows customization resources were not found on this server
  • In the guestcust.log file, you see the error:

deploy doesn't contain known sysprep files

  • When you ignore the errors, the virtual machine deployment does not get customized

Purpose

This issue may occur if Microsoft's Sysprep files are not found on the vCenter Server host, are not the correct version, or are not in the expected location.

This article guides you through the process of determining the correct version of Sysprep to use and the correct locations for these files.

Resolution

Microsoft has a different version of Sysprep for each release and service pack of Windows. You must use the version of Sysprep specific to the operating system you are deploying. The differences are not immediately visible in the packaging and documentation of the service packs, so it is necessary to manually investigate.

The contents of the Sysprep deploy.cab file must be extracted to the Sysprep Directory on the vCenter Server host. If the file downloaded from the Microsoft Web Site is a .cab file, the Installing the Microsoft Sysprep Tools appendix of the aa guide details how to install the Sysprep Tools.

If the file downloaded from the Microsoft Web Site is a .exe file, these additional steps must be executed to extract the files from the .exe:

  1. Open a Windows command prompt. For more information, see Opening a command or shell prompt (1003892).
  2. Change to the directory where the .exe file is saved.
  3. Enter the name of the .exe file with the /x switch to extract the files. For example: WindowsServer2003-KB926028-v2-x86-ENU.exe /x
  4. When prompted, choose a directory for the extracted files.
  5. Browse the directory and double-click the deploy.cab file. Note: In some cases, the deploy.cab file may be located within one of the subfolders created in Step 3.
  6. Select all the files, and copy them to the Sysprep Directory.

When the contents of the Sysprep deploy.cab file have been extracted to the Sysprep Directory on the vCenter Server:

  1. Log in to the vCenter Server as an Administrator.
  2. Click Start > Programs > Accessories > Windows Explorer.
  3. Navigate to the Sysprep Directory as listed in the table below.
  4. Right-click on the Sysprep .exe file and choose Properties.
  5. Click the Version tab. Record the number shown next to File Version.

The table below lists the Sysprep version for the Windows versions that are supported for Image Customization. Compare the Sysprep version number with the Windows version for which it is intended:

Notes:

  • If vCenter Server is installed on Windows Server 2008 and above, <directory_path> is %ALLUSERSPROFILE%\VMware\VMware VirtualCenter\Sysprep which generally translates to C:\ProgramData\VMware\VMware VirtualCenter\Sysprep by default.
  • If vCenter Server is installed on any other Windows operating system, <directory_path> is %ALLUSERSPROFILE%\Application Data\VMware\VMware VirtualCenter\Sysprep\ which generally translates to C:\Documents and Settings\All Users\Application Data\VMware\VMware VirtualCenter\Sysprep\ by default.
  • To check the SID of a server deployed from a template, you can use the PsGetSid. For more information, see http://technet.microsoft.com/en-us/sysinternals/bb897417.

  • Windows 2000 Server SP4 with Update Rollup 1
    Sysprep Directory: <directory_path>\2k
    Sysprep Version: 5.0.2195.2104
    Download: http://www.microsoft.com/downloads/details.aspx?FamilyID=0c4bfb06-2824-4d2b-abc1-0e2223133afb
  • Windows XP Pro SP2
    Sysprep Directory: <directory_path>\xp
    Sysprep Version: 5.1.2600.2180
    Download: http://www.microsoft.com/downloads/details.aspx?FamilyId=3E90DC91-AC56-4665-949B-BEDA3080E0F6
  • Windows 2003 Server SP1
    Sysprep Directory: <directory_path>\svr2003
    Sysprep Version: 5.2.3790.1830 (srv03_sp1_rtm.050324-1447)
    Download: http://www.microsoft.com/downloads/details.aspx?familyid=A34EDCF2-EBFD-4F99-BBC4-E93154C332D6
  • Windows 2003 Server SP2
    Sysprep Directory: <directory_path>\svr2003
    Sysprep Version: 5.2.3790.3959 (srv03_sp2_rtm.070216-1710)
    Download: http://www.microsoft.com/downloads/details.aspx?FamilyID=93f20bb1-97aa-4356-8b43-9584b7e72556
  • Windows 2003 Server R2
    Sysprep Directory: <directory_path>\svr2003
    Sysprep Version: 5.2.3790.3959 (srv03_sp2_rtm.070216-1710)
    Download: http://www.microsoft.com/downloads/details.aspx?FamilyID=93f20bb1-97aa-4356-8b43-9584b7e72556&displaylang=en
  • Windows 2003 x64
    Sysprep Directory: <directory_path>\svr2003-64
    Sysprep Version: 5.2.3790.3959 (srv03_sp2_rtm.070216-1710)
    Download: http://www.microsoft.com/downloads/details.aspx?familyid=C2684C95-6864-4091-BC9A-52AEC5491AF7&displaylang=en
  • Windows XP x64
    Sysprep Directory: <directory_path>\xp-64
    Sysprep Version: 5.2.3790.3959 (srv03_sp2_rtm.070216-1710)
    Download: http://www.microsoft.com/downloads/details.aspx?familyid=C2684C95-6864-4091-BC9A-52AEC5491AF7&displaylang=en
  • Windows XP Pro SP3
    Sysprep Directory: <directory_path>\xp
    Sysprep Version: 5.1.2600.5512
    Download: http://www.microsoft.com/downloads/details.aspx?familyid=673a1019-8e3e-4be0-ac31-70dd21b5afa7&displaylang=en
  • Windows Vista, Windows Server 2008, Windows Server 2008 R2, and Windows 7
    System Preparation tools are built into these operating systems and do not have to be downloaded. Sysprep Directory and Sysprep Version: Not applicable.
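
A quick way to confirm the layout above is to check that each expected subdirectory exists and is populated. A minimal shell sketch (the `check_sysprep_dirs` helper and the `SYSPREP_ROOT` default are illustrative, not VMware tooling):

```shell
# Verify that each Sysprep subdirectory from the table exists and contains
# files. SYSPREP_ROOT stands in for <directory_path> on the vCenter Server.
SYSPREP_ROOT="${SYSPREP_ROOT:-C:/ProgramData/VMware/VMware VirtualCenter/Sysprep}"

check_sysprep_dirs() {
    root="$1"
    for sub in 2k xp xp-64 svr2003 svr2003-64; do
        if [ -d "$root/$sub" ] && [ -n "$(ls -A "$root/$sub" 2>/dev/null)" ]; then
            echo "$sub: populated"
        else
            echo "$sub: missing or empty"
        fi
    done
}

check_sysprep_dirs "$SYSPREP_ROOT"
```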

Disabling the VAAI functionality in ESX/ESXi 4.1

Purpose

This article provides steps to disable the vStorage APIs for Array Integration (VAAI) functionality in ESX/ESXi 4.1. You may want to disable VAAI if the storage array devices in the environment do not support the hardware acceleration functionality or are not responding correctly to VAAI primitives.

For information on VAAI support in a given storage array or required firmware levels, contact the storage array vendor.

For more information on the VAAI functionality, see vStorage APIs for Array Integration FAQ (1021976).

Resolution

To disable VAAI in ESX/ESXi 4.1, you must modify these advanced configuration settings:

  • HardwareAcceleratedMove
  • HardwareAcceleratedInit
  • HardwareAcceleratedLocking

You can modify these settings using the vSphere Client, vSphere CLI, or a console connection to the ESX/ESXi host.

Disabling VAAI using the vSphere Client

  1. Open the VMware vSphere Client.
  2. In the Inventory pane, select the ESX host.
  3. Click the Configuration tab.
  4. Click Advanced Settings under Software.
  5. Click DataMover.
  6. Change the DataMover.HardwareAcceleratedMove setting to 0.
  7. Change the DataMover.HardwareAcceleratedInit setting to 0.
  8. Click VMFS3.
  9. Change the VMFS3.HardwareAcceleratedLocking setting to 0.
  10. Click OK to save your changes.
  11. Repeat this process for all ESX/ESXi 4.1 hosts connected to the storage.

Disabling VAAI using vSphere CLI

Note: Ensure that the vSphere CLI (vCLI) is installed and is able to connect to the ESX hosts. For more information, see the vSphere Command-Line Interface Installation and Scripting Guide.

  1. Run these vicfg-advcfg commands to change the three settings:
    vicfg-advcfg <connection_options> -s 0 /DataMover/HardwareAcceleratedMove
    vicfg-advcfg <connection_options> -s 0 /DataMover/HardwareAcceleratedInit
    vicfg-advcfg <connection_options> -s 0 /VMFS3/HardwareAcceleratedLocking

    For more information and examples on using vCLI connection options, see the Common Options for vCLI Execution section of the vSphere Command-Line Interface Installation and Scripting Guide.

  2. Repeat this process for all the ESX/ESXi 4.1 hosts connected to the storage.
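
As an illustration, the three vCLI commands can be applied to several hosts in one pass. This is a dry-run sketch: the host names and credential options are placeholders, and the commands are only printed. Remove the leading echo to actually run them from a vCLI or vMA prompt.

```shell
# Dry-run: print the three vicfg-advcfg commands for each host.
HOSTS="esx01.example.com esx02.example.com"   # hypothetical host names
SETTINGS="/DataMover/HardwareAcceleratedMove /DataMover/HardwareAcceleratedInit /VMFS3/HardwareAcceleratedLocking"

for host in $HOSTS; do
    for setting in $SETTINGS; do
        # Drop "echo" to execute; add --password or a session file as needed.
        echo vicfg-advcfg --server "$host" --username root -s 0 "$setting"
    done
done
```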

Disabling VAAI using an ESX/ESXi console connection

  1. Open a console to the ESX/ESXi host.
  2. Log in as root.
  3. Run these esxcfg-advcfg commands to change the three settings:
    esxcfg-advcfg -s 0 /DataMover/HardwareAcceleratedMove
    esxcfg-advcfg -s 0 /DataMover/HardwareAcceleratedInit
    esxcfg-advcfg -s 0 /VMFS3/HardwareAcceleratedLocking
  4. Repeat this process for all ESX/ESXi 4.1 hosts connected to the storage.

Note: You need not reboot the ESX/ESXi host for the changes to take effect.
For more information, see the Turn off Hardware Acceleration section of the ESX Configuration Guide.

Additional Information

To revert this configuration and to enable vStorage APIs for Array Integration functionality, change each of these settings from 0 to 1.
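
Before or after changing a setting, you can confirm its current value with esxcfg-advcfg -g. The sketch below only parses a captured output line; the exact wording ("Value of HardwareAcceleratedMove is 0") is an assumption about the command's output format, so adjust the parsing if your build prints something different.

```shell
# Extract the trailing numeric value from an esxcfg-advcfg -g output line,
# e.g. "esxcfg-advcfg -g /DataMover/HardwareAcceleratedMove".
# value_of is an illustrative helper name, not an ESX command.
value_of() {
    echo "$1" | awk '{print $NF}'
}

value_of "Value of HardwareAcceleratedMove is 0"
```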

ESX/ESXi 4.x and ESXi 5.0 shutdown and reboot commands

Details

When virtual machines are running, ESX/ESXi might not clear the RAID controller's cache if you shut down or reboot the ESX/ESXi host by using the following commands on the service console:

  • reboot -f
  • halt
  • shutdown

Solution

You can shut down or reboot ESX/ESXi 4.x or ESXi 5.0 hosts using any of the following methods:

ESX 4.x

Log in to the ESX service console and run one of the following commands to shut down or reboot an ESX 4.x host.

  1. Run the shutdown -r now command to reboot the system.
    Note: This command shuts down the virtual machines running on the ESX 4.0 hosts.
  2. Run the reboot command to reboot the system.
  3. Run the poweroff command to shut down ESX. After the shutdown, a message indicates that it is safe to power off your system. Press the power button until the machine powers off. You can then manually reboot the system.

ESXi 4.x/5.0

  1. In the console screen of the ESXi 4.0 host, press Ctrl+Alt+F2 to see the Direct Console User Interface (DCUI) screen.
  2. In the DCUI screen, press F12 to view the shutdown-related options for the ESXi host.
    • Press F2 to shut down.
    • Press F11 to reboot.

ESX/ESXi 4.x or ESXi 5.0

From vSphere Client

Before shutting down or rebooting the ESX/ESXi 4.x or ESXi 5.0 hosts, ensure that the hosts are put in maintenance mode. Powering off a managed host disconnects it from vCenter Server, but does not remove it from the inventory.

  1. Shut down or vMotion all virtual machines running on ESX/ESXi 4.x or ESXi 5.0 hosts.
  2. Put the ESX/ESXi 4.x or ESXi 5.0 hosts in the maintenance mode.
  3. Right-click the ESX/ESXi 4.x or ESXi 5.0 host that you want to shut down, and click Reboot or Shut Down.
    • If you select Reboot, the ESX/ESXi 4.x or ESXi 5.0 host shuts down and reboots.
    • If you select Shut Down, the ESX/ESXi 4.x or ESXi 5.0 host shuts down. You must manually power the system back on.
  4. Provide a reason for the shut down or reboot. This information is added to the log.

From vCLI or vMA

Note: By default, the hostops.pl file in VMware vSphere CLI 4.0 U1 installations is located at C:\Program Files\VMware\VMware vSphere CLI\Perl\apps\host.

  • To put ESX/ESXi 4.x or ESXi 5.0 hosts in maintenance mode, run the following command from the vMA (vSphere Management Assistant) or vCLI (vSphere Command-Line Interface) console:

    /usr/lib/vmware-vcli/apps/host/hostops.pl --target_host <ESX-Host-FQDN> --operation enter_maintenance --url https://<vCenter-Host>/sdk/vimService.wsdl

  • To reboot ESX/ESXi 4.x or ESXi 5.0 hosts, run the following command from the vMA or vCLI console:

    /usr/lib/vmware-vcli/apps/host/hostops.pl --target_host <ESX-Host-FQDN> --operation reboot --url https://<vCenter-Host>/sdk/vimService.wsdl

  • To shut down ESX/ESXi 4.x or ESXi 5.0 hosts, run the following command from the vMA or vCLI console:

    /usr/lib/vmware-vcli/apps/host/hostops.pl --target_host <ESX-Host-FQDN> --operation shutdown --url https://<vCenter-Host>/sdk/vimService.wsdl
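
The operations above can be sequenced, for example entering maintenance mode before a reboot. This dry-run sketch only prints the hostops.pl invocations; the host and vCenter names are placeholders, and you would remove the leading echo to execute from a vMA or vCLI prompt.

```shell
# Dry-run: print the hostops.pl commands to enter maintenance mode and reboot.
HOSTOPS=/usr/lib/vmware-vcli/apps/host/hostops.pl
TARGET=esx01.example.com                              # hypothetical host FQDN
VCURL=https://vc01.example.com/sdk/vimService.wsdl    # hypothetical vCenter URL

for op in enter_maintenance reboot; do
    echo "$HOSTOPS" --target_host "$TARGET" --operation "$op" --url "$VCURL"
done
```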

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1004340

Powering off an unresponsive virtual machine on an ESX host

Symptoms

  • You cannot power off an ESX host virtual machine.
  • A virtual machine is unresponsive and cannot be killed or stopped.
  • You cannot access or unlock files on a virtual machine. For more information, see Virtual machine does not power on because of missing or locked files (10051).
  • After shutting down a virtual machine, vCenter Server shows the virtual machine as up and running.
  • There is no indication that a virtual machine is shut down.
  • You cannot edit properties in the virtual machine.
  • You see one or more of these errors:
    • Soap error 999. The operation is not allowed in current state.
    • The attempted operation cannot be performed in the current state (Powered Off).
    • The request refers to an object that no longer exists or has never existed

To determine if you must use the command line, attempt to power off the virtual machine:

  1. Connect VMware Infrastructure/vSphere Client to the vCenter Server. Right-click the virtual machine and click Power off.
  2. Connect vSphere Client directly to the ESX host. Right-click the virtual machine and click Power off. If this does not work, you must use the command line method.

Determining the virtual machine’s state

  1. Determine the host on which the virtual machine is running. This information is available in the virtual machine’s Summary tab when viewed in the vSphere Client page.
  2. Log in as root to the ESX host using an SSH client.
  3. Run this command to verify that the virtual machine is running on this host:
    # vmware-cmd -l

    The output of this command returns the full path to each virtual machine running on the ESX host. Verify that the virtual machine is listed, and record the full path for use in this process. For example:

    # /vmfs/volumes/<UUID>/<VMDIR>/<VMNAME>.vmx

  4. Run this command to determine the state in which the ESX host believes the virtual machine to be operating:
    # vmware-cmd <path.vmx> getstate

    If the output from this command is getstate() = on, the vCenter Server may not be communicating with the host properly. This issue must be addressed in order to complete the shutdown process.

    If the output from this command is getstate() = off, the ESX host may be unaware it is still running the virtual machine. This article provides additional assistance in addressing this issue.
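
When scripting this check, the getstate output can be reduced to the bare state string. A minimal sketch, assuming the output format shown above (getstate() = on or getstate() = off); the vm_state helper name is illustrative:

```shell
# Reduce a "vmware-cmd <path.vmx> getstate" output line to "on" or "off".
vm_state() {
    # $1: a line such as "getstate() = on"
    echo "$1" | awk -F'= ' '{print $2}'
}

vm_state "getstate() = on"
```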

Powering off the virtual machine using the vmware-cmd command

Caution: If you want to collect the virtual machine logs to assist in troubleshooting, do not perform the steps in this section.

This procedure uses the ESX command line tool and attempts to gracefully power off the virtual machine. It works if the virtual machine’s process is running properly and is accessible. If unsuccessful, the virtual machine’s process may not be running properly and may require further troubleshooting.

  1. From the Service Console of the ESX host, run the command:
    vmware-cmd <path.vmx> stop

    Note: <path.vmx> is the complete path to the configuration file, as determined in the previous section. To verify that it is stopped, run the command:

    # vmware-cmd <path.vmx> getstate

  2. From the Service Console of the ESX host, run the command:
    # vmware-cmd <path.vmx> stop hard

    Note: <path.vmx> is the complete path to the configuration file, as determined in the previous section. To verify that it is stopped, run the command:

    # vmware-cmd <path.vmx> getstate

  3. If the virtual machine is still inaccessible, proceed to the next section.

Powering off the virtual machine while collecting diagnostic information using the vm-support script

Use this procedure when you want to investigate the cause of the issue. This command attempts to power off the virtual machine while collecting diagnostic information. Perform these steps in order, as they are listed in order of potential impact to the system if performed incorrectly.

Perform these steps first:

  1. Determine the WorldID with the command:
    # vm-support -x
  2. Kill the virtual machine by using this command in the home directory of the virtual machine:
    # vm-support -X <world_ID>

    This can take upwards of 30 minutes to terminate the virtual machine.

    Note: This command uses several different methods to stop the virtual machine. When attempting each method, the command waits for a pre-determined amount of time. The timeout value can be set to 0 by adding the -d0 switch to the vm-support command.

If the preceding steps fail, perform these steps for an ESX 3.x host:

  1. List all running virtual machines to find the VMID of the affected virtual machine with the command:
    # cat /proc/vmware/vm/*/names
  2. Determine the master world ID with the command:
    # cat /proc/vmware/vm/####/cpu/status | less
  3. Scroll to the right with the arrow keys until you see the group field. It appears similar to:
    Group
    vm.####
  4. Run this command to shut the virtual machine down with the group ID:
    # /usr/lib/vmware/bin/vmkload_app -k 9 ####

If the preceding steps fail, perform these steps for an ESX 4.x host:

  1. List all running virtual machines to find the vmxCartelID of the affected virtual machine with the command:
    # /usr/lib/vmware/bin/vmdumper -l
  2. Scroll through the list until you see your virtual machine's name. The output appears similar to:
    vmid=5151 pid=-1 cfgFile="/vmfs/volumes/4a16a48a-d807aa7e-e674-001e4ffc52e9/mdineeen_test/vm_test.vmx" uuid="56 4d a6 db 0a e2 e5 3e-a9 2b 31 4b 69 29 15 19" displayName="vm_test" vmxCartelID=####
  3. Run this command to shut the virtual machine down with the vmxCartelID:
    # /usr/lib/vmware/bin/vmkload_app -k 9 ####

Using the ESX command line to kill the virtual machine

If the virtual machine does not power off using the steps in this article, it has likely lost control of its process. You need to manually kill the process at the command line.

Caution: This procedure is potentially hazardous to the ESX host. If you do not identify the appropriate process id (PID), and kill the wrong process, it may have unanticipated results. If you are not comfortable with these procedures, contact VMware Technical Support and open a Service Request. Refer to this article when you create the SR.

  1. To determine if the virtual machine process is running on the ESX host, run the command:
    # ps auxwww | grep -i <VMNAME>.vmx

    The output of this command appears similar to this if the .vmx process is running:

    root 3093 0.0 0.3 2016 860 ? S< Jul30 0:17 /usr/lib/vmware/bin/vmkload_app /usr/lib/vmware/bin/vmware-vmx -ssched.group=host/user -# name=VMware ESX Server;version=3.5.0;licensename=VMware ESX Server;licenseversion=2.0 build-158874; -@ pipe=/tmp/vmhsdaemon-0/vmx569228e44baf49d1; /vmfs/volumes/49392e30-162037d0-17c6-001f29e9abec/<VMDIR>/<VMNAME>.vmx

    The process ID (PID) appears in the second column of the output. In this example, the PID is 3093. Take note of this number for use in these steps.

    Caution: Ensure that you identify the line specific only to the virtual machine you are attempting to repair. If you continue this process for a virtual machine other than the one in question, you can cause downtime for that virtual machine.

    If the .vmx process is listed, it is possible that the virtual machine has lost control of the process and that it must be stopped manually.

  2. To kill the process, run the command:
    # kill <PID>
  3. Wait 30 seconds and check for the process again.
  4. If it is not terminated, run the command:
    # kill -9 <PID>
  5. Wait 30 seconds and check for the process again.
  6. If it is not terminated, the ESX host may need to be rebooted to clear the process. This is a last resort option, and should only be attempted if the preceding steps in this article are unsuccessful.
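
The PID extraction described above can be scripted. In this sketch, pid_from_ps_line isolates the second column of a captured ps line (so it can be verified against the example output), and find_vmx_pid performs the live lookup; both helper names are illustrative, not part of ESX:

```shell
# Extract the PID (second column of "ps auxwww" output) of a VM's .vmx process.
pid_from_ps_line() {
    echo "$1" | awk '{print $2}'
}

# Live lookup: filter ps output for the named VM's .vmx process,
# excluding the grep process itself.
find_vmx_pid() {
    ps auxwww | grep -i "$1.vmx" | grep -v grep | awk '{print $2}'
}

# The sample line from this article yields PID 3093:
pid_from_ps_line "root 3093 0.0 0.3 2016 860 ? S< Jul30 0:17 /usr/lib/vmware/bin/vmkload_app"
```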


Collecting diagnostic information for VMware products

Purpose

VMware technicians request diagnostic information from you when a support request is addressed. This diagnostic information contains product specific logs and configuration files from the host on which the product is run. This information is gathered using a specific script or tool within the product.

Note: Collecting diagnostic information is the same as collecting or gathering log files.

This article provides procedures for obtaining diagnostic information for all VMware products.

The diagnostic information obtained by using this article is uploaded to VMware Technical Support. To properly identify your information, you need to use the Support Request (SR) number you receive when you create the new SR.

Resolution

Select your product and, where appropriate, the version from this list:

VMware ACE (v2.x) Collecting diagnostic information for VMware ACE 2.x (1000588)
VMware Capacity Planner Collecting diagnostic information for VMware Capacity Planner (1008424)
VMware Consolidated Backup (v1.5) Collecting diagnostic information for VMware Virtual Consolidated Backup 1.5 (1006784)
VMware Converter Collecting diagnostic information for VMware Converter (1010633)
VMware Data Recovery Collecting diagnostic information for VMware Data Recovery (1012282)
VMware ESX/ESXi Collecting diagnostic information for VMware ESX/ESXi using the vSphere Client (653)
Collecting diagnostic information for VMware ESX/ESXi using the vm-support command (1010705)
Collecting diagnostic information for VMware vSphere vCenter Server and ESX/ESXi using the vSphere PowerCLI (1027932)
VMware Fusion Collecting diagnostic information for VMware Fusion (1003894)
VMware Infrastructure SDK
( VMware vSphere Web Services, vSphere SDK for Perl, vSphere PowerCLI, vSphere vCLI, CIM SDK)
Collecting diagnostic information for VMware Infrastructure SDK (1001457)
Collecting VMware Infrastructure Virtual Disk Development Kit Diagnostic Information (1006186)
VMware Lab Manager
v2.4 & 2.5 Collecting diagnostic information for Lab Manager 2.x (4637378)
v3.0 Collecting diagnostic information for Lab Manager 3.x (1006777)
v4.0 Collecting diagnostic information for Lab Manager 4.x (1012324)
VMware Server Collecting diagnostic information for VMware Server (1008254)
VMware Stage Manager (1.0.x) Collecting diagnostic information for VMware Stage Manager 1.0.x (1005865)
VMware Storage Appliance Collecting diagnostic information for a vSphere Storage Appliance cluster (2003549)
VMware ThinApp (v4.0x) Collecting diagnostic information for VMware ThinApp (1006152)
VMware Thinstall (v2.0 & 3.0) Collecting diagnostic information for VMware ThinApp (1006152)
VMware Tools Collecting diagnostic information for VMware Tools (1010744)
VMware Update Manager Collecting diagnostic information for VMware Update Manager (1003693)
VMware vCenter AppSpeed Collecting diagnostic information for VMware vCenter AppSpeed (1012876)
VMware vCenter CapacityIQ Collecting diagnostic information for VMware vCenter CapacityIQ (1022927)
VMware vCenter Chargeback Collecting diagnostic information for vCenter Chargeback (1020274)
VMware vCenter Configuration Manager (v4.x and 5.x) Collecting diagnostic information for VMware vCenter Configuration Manager (2001258)
VMware vCenter Orchestrator (v4.0) Collecting diagnostic data for VMware Orchestrator APIs (1010959)
VMware vCenter Operations Enterprise Collecting diagnostic information for VMware vCenter Operations Enterprise (2006599)
VMware vCenter Operations Standard Collecting diagnostic information for VMware vCenter Operations (1036655)
VMware vCenter Server Heartbeat Retrieving the VMware vCenter Server Heartbeat Logs and other useful information for support purposes (1008124)
VMware vCenter Site Recovery Manager Collecting diagnostic information for Site Recovery Manager (1009253)
VMware vCloud Director Collecting diagnostic information for VMware vCloud Director (1026312)
VMware View Manager (v3.x & v4.x) Collecting diagnostic information for VMware View 3.x and 4.0.x (1017939)
VMware Virtual Desktop Manager (v2.x) Collecting diagnostic information for VMware Virtual Desktop Manager (VDM) (1003901)
VMware VirtualCenter
v1.0, 1.1 & 1.2 Collecting diagnostic information for VirtualCenter 1.0, 1.1 and 1.2 (1365)
v1.3.1, 1.4 & 1.5 Collecting diagnostic information for VirtualCenter 1.3.1 and 1.4 (1935)
v2.0 & 2.5 Collecting diagnostic information for VMware VirtualCenter 2.x (1003688)
VMware Virtual Infrastructure Client %UserProfile%\Local Settings\Application Data\VMware\vpx\viclient*.log
VMware vCenter Server (v4.x, 5.0) Collecting diagnostic information for VMware vCenter Server (1011641)
VMware Workstation (v6.0, Windows and Linux) See the Preface of the Workstation 6 User Manual
VMware Workstation (v7.0, Windows and Linux) See the Running the Support Script section of the Workstation 7 User Manual

Extending partitions in Windows using DiskPart

Caution: VMware strongly recommends that you have backups in place before performing any disk partition operation. Also ensure that the virtual machine has no snapshots before you start extending the VMDK. If the virtual machine has snapshots, use Delete All in the Snapshot Manager to commit them, then verify in the Snapshot Manager, in the virtual machine's Edit Settings, and in the virtual machine's datastore that the snapshots were committed.

To expand VMDK and extend a partition:

  1. Verify that the virtual machine does not have any snapshots by going into the virtual machine's directory and looking for delta files. Run the command:
    # ls -lah /vmfs/volumes/datastore_name/vm_name/*delta*
    -rw------- 1 root root 1.8G Oct 10 10:58 vm_name-000001-delta.vmdk
  2. If the virtual machine does have snapshots, commit them using these commands:
    # vmware-cmd -l /vmfs/volumes/datastore_name/vm_name/vm_name.vmx
    # vmware-cmd /vmfs/volumes/datastore_name/vm_name/vm_name.vmx removesnapshots
    removesnapshots() = 1

  3. Power off the virtual machine.
  4. To expand the VMDK using the VI Client (if the option exists), edit the settings of the virtual machine and click the hard disk you want to expand.
  5. Enter a new value in the New Size field.

    Alternatively, to expand the VMDK using the vmkfstools -X command, run:

    # vmkfstools -X <New Disk Size> <VMDK to extend>

    For example:

    # vmkfstools -X 30G /vmfs/volumes/datastore_name/vm_name/vm_name.vmdk

    Note: Ensure that you point to the <vm_name>.vmdk, and not to the <vm_name>-flat.vmdk. Using vmkfstools -X is the only way to expand an IDE virtual disk.

  6. To extend the C: partition, find a helper virtual machine and attach the disk from the first virtual machine to the helper. To add an existing virtual disk to the helper virtual machine:
    1. Go to the Edit Settings menu of the virtual machine.
    2. Click Add > Hard Disk > Use Existing Virtual Disk.
    3. Navigate to the location of the disk and add it to the virtual machine.

    Note: A helper virtual machine is a virtual machine that runs the same operating system as the disk you attach.
  7. Start the virtual machine.
  8. Verify the volume in question has been mounted and has been assigned a drive letter. This can be set in Windows Disk Management or by selecting the volume and typing assign from within the DiskPart command. In versions of Windows prior to 2008, open a command prompt and run the DiskPart command:

    C:\Documents and Settings\username>diskpart

    Microsoft DiskPart version 5.1.3565
    Copyright (C) 1999-2003 Microsoft Corporation.
    On computer: USERNAME-HELPER-VM
    DISKPART> list volume

    Volume ###  Ltr  Label        Fs     Type        Size     Status     Info
    ----------  ---  -----------  -----  ----------  -------  ---------  --------
    Volume 0    D                        CD-ROM          0 B
    Volume 1    C                 NTFS   Partition     30 GB  Healthy    System
    Volume 2    E                 NTFS   Partition     10 GB  Healthy

    DISKPART> select Volume 2
    Volume 2 is the selected volume.

    DISKPART> extend
    DiskPart successfully extended the volume.
    DISKPART> exit
    Leaving DiskPart…

    Note: Ensure that you choose the correct volume. The Size shown is still the old value.

    Note: If you are in Windows 2003, and you see the error The volume you have selected may not be extended. Please select another volume and try again, see the Microsoft Knowledge Base article 841650.

  9. In Windows 2008, click Start > Computer Management > Disk Manager, right-click on the partition and select Extend Volume. For more information, see the Microsoft Knowledge Base article 325590.
  10. Power off the helper virtual machine and detach the disk from it. Keep all default settings and do not delete the VMDK from the datastore.
  11. Power on the first virtual machine and verify the disk size change.
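
The interactive session above can also be scripted: DiskPart accepts a command file via its /s switch. This sketch generates such a file (the volume number is taken from the example listing; substitute your own). On the helper virtual machine you would then run diskpart /s extend_vol.txt from a command prompt.

```shell
# Write a DiskPart command file equivalent to the interactive session above.
# Volume 2 comes from the example "list volume" output; change it to match
# the volume you want to extend.
cat > extend_vol.txt <<'EOF'
select volume 2
extend
exit
EOF

cat extend_vol.txt
```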

Multipathing policies in ESX/ESXi 4.x and ESXi 5.x

Purpose

This article describes the various pathing policies that can be used with VMware ESX/ESXi 4.x and VMware ESXi 5.x.

Resolution

These pathing policies can be used with VMware ESX/ESXi 4.x and ESXi 5.x:

  • Most Recently Used (MRU) — Selects the first working path, discovered at system boot time. If this path becomes unavailable, the ESX/ESXi host switches to an alternative path and continues to use the new path while it is available. This is the default policy for Logical Unit Numbers (LUNs) presented from an Active/Passive array. ESX/ESXi does not return to the previous path if, or when, it returns; it remains on the working path until that path, for any reason, fails.
    Note: The preferred flag, while sometimes visible, is not applicable to the MRU pathing policy and can be disregarded.
  • Fixed (Fixed) — Uses the designated preferred path flag, if it has been configured. Otherwise, it uses the first working path discovered at system boot time. If the ESX/ESXi host cannot use the preferred path or it becomes unavailable, ESX/ESXi selects an alternative available path. The host automatically returns to the previously-defined preferred path as soon as it becomes available again. This is the default policy for LUNs presented from an Active/Active storage array.
  • Round Robin (RR) — Uses an automatic path selection rotating through all available paths, enabling the distribution of load across the configured paths.
    • For Active/Passive storage arrays, only the paths to the active controller are used in the Round Robin policy.
    • For Active/Active storage arrays, all paths are used in the Round Robin policy.

Note: This policy is not currently supported for Logical Units that are part of a Microsoft Cluster Service (MSCS) virtual machine.

  • Fixed path with Array Preference — The VMW_PSP_FIXED_AP policy was introduced in ESX/ESXi 4.1. It works for both Active/Active and Active/Passive storage arrays that support ALUA. This policy queries the storage array for the preferred path based on the array's preference. If no preferred path is specified by the user, the storage array selects the preferred path based on specific criteria.

    Note: The VMW_PSP_FIXED_AP policy was removed in the ESXi 5.0 release, and VMW_PSP_MRU became the default PSP for all ALUA devices.
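To see which policy each device currently uses, the output of esxcli storage nmp device list can be summarized with awk. The here-doc below is abbreviated, illustrative sample output; on an actual host you would pipe the live command into the same awk program instead.

```shell
#!/bin/sh
# Sketch: report the Path Selection Policy per device. On a host, replace
# the here-doc with:  esxcli storage nmp device list | awk '...'
awk '
  /^naa\./                 { dev = $1 }          # device header line
  /Path Selection Policy:/ { print dev ": " $NF } # PSP line for that device
' <<'EOF'
naa.6006016010202a0080b3b8a4cc56e011
   Device Display Name: DGC Fibre Channel Disk
   Path Selection Policy: VMW_PSP_MRU
naa.600508b4000f2fd30000a00000510000
   Device Display Name: HP Fibre Channel Disk
   Path Selection Policy: VMW_PSP_RR
EOF
```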

Notes:

  • These pathing policies apply to VMware's Native Multipathing (NMP) Path Selection Plugins (PSPs). Third-party PSPs have their own restrictions.
  • Switching to Round Robin from MRU or Fixed is safe and supported for all arrays. However, check with your vendor for the supported multipathing policies for your storage array; switching to an unsupported pathing policy can cause an outage.

Warning: VMware does not recommend changing the LUN policy from Fixed to MRU, as the automatic selection of the pathing policy is based on the array that has been detected by the NMP PSP.

Additional Information

The Round Robin (RR) multipathing policy has configurable options that can be modified at the command-line interface. Some of these options include:

  • Number of bytes to send along one path for this device before the PSP switches to the next path.
  • Number of I/O operations to send along one path for this device before the PSP switches to the next path.
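As a sketch, these limits are set per device with esxcli on ESXi 5.x; the option names below are from the esxcli storage nmp psp roundrobin namespace, and the NAA ID is the example device used elsewhere in this article. Verify the exact syntax against the vSphere CLI reference for your build.

```shell
# Illustrative only; run on the ESXi 5.x host.
# Switch paths after every single I/O for the given device:
esxcli storage nmp psp roundrobin deviceconfig set \
    --device naa.6006016010202a0080b3b8a4cc56e011 \
    --type iops --iops 1

# Confirm the per-device Round Robin configuration:
esxcli storage nmp psp roundrobin deviceconfig get \
    --device naa.6006016010202a0080b3b8a4cc56e011
```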

For more information, see Round Robin Operations with esxcli nmp roundrobin in the vSphere Command-Line Interface Installation and Reference Guide for the appropriate version of the VMware product.

For more information, search for Multipathing Considerations and PSP in the vSphere 5 Documentation Center.

Modifying path information for ESX hosts

Purpose

This article explains how to enable and disable LUN paths, as well as how to change the multipath policy for ESX/ESXi hosts.

Resolution

There are two methods to change multipath policy and to enable/disable paths on the ESX/ESXi host:

  • ESX command line – use the command line to modify the multipath information when performing troubleshooting procedures.
  • VMware Infrastructure/vSphere Client – use this option when you are performing system maintenance.

ESXi 5.x

Change multipath policy

To change the multipath policy information from the ESXi host command line:

  1. Log in to the ESXi host.
  2. Run the command:
    esxcli storage nmp device set --device <naa_id> --psp <path_policy>

    Where <naa_id> is the NAA ID of the device and <path_policy> is one of the PSP options listed in Multipathing policies in ESX 4 (1011340).

    For example, to change the path policy for a device to Round Robin:

    esxcli storage nmp device set --device naa.6006016010202a0080b3b8a4cc56e011 --psp VMW_PSP_RR
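To apply the same policy across many devices, the set command can be generated once per device. The sketch below only prints the commands it would run; the device IDs are placeholders, and on a host you would feed the loop from esxcli storage nmp device list. Review the generated commands before piping them to sh.

```shell
#!/bin/sh
# Sketch: emit one `esxcli storage nmp device set` command per device.
# Placeholder device list; on a host, feed it from:
#   esxcli storage nmp device list | grep '^naa\.'
for dev in naa.6006016010202a0080b3b8a4cc56e011 naa.600508b4000f2fd30000a00000510000; do
    echo "esxcli storage nmp device set --device $dev --psp VMW_PSP_RR"
done
```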

To change multipath settings for your storage in the vSphere Client:

  1. Select an ESX/ESXi host, and click the Configuration tab.
  2. Click Storage.
  3. Select a datastore or mapped LUN.
  4. Click Properties.
  5. In the Properties dialog, select the desired extent, if necessary.
  6. Click Extent Device > Manage Paths and obtain the paths in the Manage Path dialog.
  7. Under the Policy section, select the desired path from the dropdown menu.
  8. Click Change to confirm the change in path policy.

For information on multipathing options, see Multipathing policies in ESX 4 (1011340).

Enable or disable path

To enable or disable a path from the ESX/ESXi host command line:

  1. Log in to the ESX/ESXi host.
  2. Run the command:

    esxcli storage core path set --state=<state> --path=<path>

    Where:

    • <path> is the particular path to be enabled/disabled
    • <device> is the NAA ID of the device
    • <state> is active or off

For example, to disable path fc.2000001b32865b73:2100001b32865b73-fc.50060160c6e018eb:5006016646e018eb-naa.6006016095101200d2ca9f57c8c2de11, which has a Runtime Name of: vmhba3:C0:T1:L0, for device naa.6006016010202a0080b3b8a4cc56e011:

esxcli storage core path set --state=off --path=fc.2000001b32865b73:2100001b32865b73-fc.50060160c6e018eb:5006016646e018eb-naa.6006016095101200d2ca9f57c8c2de11
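These long path names have a fixed structure: adapter endpoint, target endpoint, and device NAA ID, joined by hyphens. The sketch below splits the example path name from above into its parts using POSIX parameter expansion; it runs in any /bin/sh and touches no host state.

```shell
#!/bin/sh
# Sketch: decompose a path name of the form
#   fc.<adapter WWNN:WWPN>-fc.<target WWNN:WWPN>-naa.<device>
path="fc.2000001b32865b73:2100001b32865b73-fc.50060160c6e018eb:5006016646e018eb-naa.6006016095101200d2ca9f57c8c2de11"

device="naa.${path##*-naa.}"   # everything after the last "-naa."
rest="${path%-naa.*}"          # drop the device suffix
adapter="${rest%%-fc.*}"       # first fc. endpoint (the HBA)
target="fc.${rest#*-fc.}"      # second fc. endpoint (the array port)

echo "adapter: $adapter"
echo "target:  $target"
echo "device:  $device"
```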

To enable or disable a path for your storage in the vSphere Client:

  1. Select an ESX/ESXi host, and click the Configuration tab.
  2. Click Storage.
  3. Select a datastore or mapped LUN.
  4. Click Properties.
  5. In the Properties dialog, select the desired extent, if necessary.
  6. Click Extent Device > Manage Paths and obtain the paths in the Manage Path dialog.
  7. Highlight the desired path, and right-click and click Disable or Enable.
  8. Note that if the currently active path is disabled, a path failover is forced.

For information on multipathing options, see Multipathing policies in ESX 4 (1011340).

ESX/ESXi 4.x

Change multipath policy

To change the multipath policy information from the ESX/ESXi host command line:

  1. Log in to the ESX/ESXi host.
  2. Run the command:

    esxcli nmp device setpolicy --device <naa_id> --psp <path_policy>

    Where <naa_id> is the NAA ID of the device and <path_policy> is one of the PSPs listed in Multipathing policies in ESX 4 (1011340).

    For example, to change the path policy for a device to Round Robin:

    esxcli nmp device setpolicy --device naa.6006016010202a0080b3b8a4cc56e011 --psp VMW_PSP_RR

To change multipath settings for your storage in the vSphere Client:

  1. Select an ESX/ESXi host, and click the Configuration tab.
  2. Click Storage.
  3. Select a datastore or mapped LUN.
  4. Click Properties.
  5. In the Properties dialog, select the desired extent, if necessary.
  6. Click Extent Device > Manage Paths and obtain the paths in the Manage Path dialog.
  7. Under the Policy section, select the desired path from the dropdown menu.
  8. Click Change to confirm the change in path policy.

For information on multipathing options, see Multipathing policies in ESX 4 (1011340).

Enable or disable path

To enable or disable a path from the ESX/ESXi host command line:

  1. Log in to the ESX/ESXi host.
  2. Run the command:

    esxcfg-mpath --path=<path> --state=<state>

    Where:

    • <path> is the particular path to be enabled/disabled
    • <device> is the NAA ID of the device
    • <state> is active or off

For example, to disable path fc.2000001b32865b73:2100001b32865b73-fc.50060160c6e018eb:5006016646e018eb-naa.6006016095101200d2ca9f57c8c2de11, which has a Runtime Name of: vmhba3:C0:T1:L0, for device naa.6006016010202a0080b3b8a4cc56e011:

esxcfg-mpath --path=fc.2000001b32865b73:2100001b32865b73-fc.50060160c6e018eb:5006016646e018eb-naa.6006016095101200d2ca9f57c8c2de11 --state=off

To enable or disable a path for your storage in the vSphere Client:

  1. Select an ESX/ESXi host, and click the Configuration tab.
  2. Click Storage.
  3. Select a datastore or mapped LUN.
  4. Click Properties.
  5. In the Properties dialog, select the desired extent, if necessary.
  6. Click Extent Device > Manage Paths and obtain the paths in the Manage Path dialog.
  7. Highlight the desired path, and right-click and click Disable or Enable.
  8. Note that if the currently active path is disabled, a path failover is forced.

To change the preferred path for a device or a LUN from the command line:

  1. Log in to the ESX/ESXi host.
  2. Run this command to list the available paths and the path policy details:

    esxcfg-mpath -l

  3. Run this command to change the preferred path:

    esxcfg-mpath --preferred --path=<path> --lun=<device>

    Where

    <path> is the path to set as preferred. For example, vmhba2:3:4
    <device> is the canonical name of the device. For example, vmhba2:1:4

  4. Run this command to verify the changes to the preferred path:

    esxcfg-mpath -l

    Note that On active preferred is now shown for the specified path.

For information on multipathing options, see Multipathing policies in ESX 4 (1011340).

ESX/ESXi 3.x

Change multipath policy

To change the multipath policy information from the ESX/ESXi host command line:

  1. Log in to the ESX/ESXi host
  2. Run the command:

    esxcfg-mpath --policy=<path_policy> --lun=<device>

    Where <device> is the canonical name of the device and <path_policy> is one of fixed, mru, or rr.

To change multipath settings for your storage in the vSphere Client:

  1. Select an ESX/ESXi host and click the Configuration tab.
  2. Click Storage.
  3. Select a datastore or mapped LUN.
  4. Click Properties. In this example, the canonical name is vmhba2:1:0 and the true paths are vmhba2:1:0 and vmhba2:3:0.
    The active path is vmhba2:1:0 and the policy is Most Recently Used.
  5. Click Manage Paths. The Manage Paths dialog appears.
  6. To change the policy, click Change in the Policy section. The Manage Paths – Selection Policy dialog appears.
  7. Click OK to return to the Manage Paths dialog.

Enable or disable path

To enable or disable a path from the ESX/ESXi host command line:

  1. Log in to the ESX/ESXi host.
  2. Run the command:

    esxcfg-mpath --path=<path> --lun=<device> --state=<state>

    Where:

    • <path> is the particular path to be enabled/disabled (for example, vmhba2:3:4)
    • <device> is the canonical name of the device (for example, vmhba2:1:4)
    • <state> is on or off

VI Client

To enable or disable a path for your storage in the VI Client:

  1. To enable or disable a path, open the Manage Paths dialog (follow Steps 1-6 above).
  2. Select the desired path and click Change. As the policy for this LUN is Most Recently Used, the Preferred option is unavailable. If you disable the currently active path, it forces a path failover.
  3. Click OK to return to the Manage Paths dialog.

 

Configuring networking from the ESX service console command line

To configure networking from the ESX service console command line:

  1. Ensure the network adapter you want to use is currently connected with the command:

    [root@server root]# esxcfg-nics -l

    The output appears similar to:

    Name PCI Driver Link Speed Duplex Description
    vmnic0 06:00.00 tg3 Up 1000Mbps Full Broadcom Corporation NetXtreme BCM5721 Gigabit Ethernet
    vmnic1 07:00.00 tg3 Up 1000Mbps Full Broadcom Corporation NetXtreme BCM5721 Gigabit Ethernet

    In the Link column, Up indicates that the network adapter is available and functioning.

  2. List the current virtual switches with the command:

    [root@server root]# esxcfg-vswitch -l

    The output appears similar to:

    Switch Name Num Ports Used Ports Configured Ports Uplinks
    vSwitch0 32 3 32 vmnic0

    PortGroup Name Internal ID VLAN ID Used Ports Uplinks
    VM Network portgroup2 0 0 vmnic0

    In the example output, there is a virtual machine network named VM Network but no Service Console portgroup. For illustration, the following steps show you how to create a new virtual switch and place the Service Console portgroup on it.

  3. Create a new virtual switch with the command:

    [root@server root]# esxcfg-vswitch -a vSwitch1
  4. Create the Service Console portgroup on this new virtual switch:

    [root@server root]# esxcfg-vswitch -A "Service Console" vSwitch1

    Because there is a space in the name (Service Console), you must enclose it in quotation marks.

    Note: To create Service Consoles one at a time, you may need to delete all previous settings. For more information, see Recreating Service Console Networking from the command line (1000266).

  5. Uplink vmnic1 to the new virtual switch with the command:

    [root@server root]# esxcfg-vswitch -L vmnic1 vSwitch1
  6. If you need to assign a VLAN, use the command:

    [root@server root]# esxcfg-vswitch -v <VLAN> -p "Service Console" vSwitch1

    where <VLAN> is the VLAN ID. A zero here specifies no VLAN.

  7. Verify the new virtual switch configuration with the command:

    [root@server root]# esxcfg-vswitch -l

    The output appears similar to:

    Switch Name Num Ports Used Ports Configured Ports Uplinks
    vSwitch0 32 3 32 vmnic0

    PortGroup Name Internal ID VLAN ID Used Ports Uplinks
    Service Console portgroup5 0 1 vmnic0

    Switch Name Num Ports Used Ports Configured Ports Uplinks
    vSwitch1 64 1 64 vmnic1

    PortGroup Name Internal ID VLAN ID Used Ports Uplinks
    Service Console portgroup14 0 1 vmnic1

  8. Create the vswif (Service Console) interface. For example, run the command:

    [root@server root]# esxcfg-vswif -a vswif0 -i 192.168.1.10 -n 255.255.255.0 -p "Service Console"
    ['Vnic' warning] Generated New Mac address, 00:50:xx:xx:xx:xx for vswif0

    Nothing to flush.

  9. Verify the configuration with the command:

    [root@esx]# esxcfg-vswif -l
    Name Port Group IP Address Netmask Broadcast Enabled DHCP
    vswif0 Service Console 192.168.1.10 255.255.255.0 192.168.1.255 true false
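The numbered procedure above can be condensed into a single sketch. It must be run on the ESX service console itself; vmnic1, vSwitch1, and the IP settings are the example values used above, so substitute your own.

```shell
#!/bin/sh
# Sketch of the service console networking procedure above (ESX only).
esxcfg-nics -l                                      # confirm the NIC link is Up
esxcfg-vswitch -a vSwitch1                          # create a new virtual switch
esxcfg-vswitch -A "Service Console" vSwitch1        # add the portgroup (quotes: name has a space)
esxcfg-vswitch -L vmnic1 vSwitch1                   # uplink the physical NIC
esxcfg-vswitch -v 0 -p "Service Console" vSwitch1   # VLAN 0 = no VLAN tagging
esxcfg-vswif -a vswif0 -i 192.168.1.10 -n 255.255.255.0 -p "Service Console"
esxcfg-vswif -l                                     # verify the new interface
```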

Committing snapshots on ESXi host from command line

Symptoms

  • When checking the Snapshot Manager, no snapshots are detected for the virtual machine
  • The virtual machine is running on snapshot disks

    Note: To verify whether the virtual machine is running on snapshot disks:

    1. Right-click the virtual machine and click Edit Settings.
    2. Select the virtual disk and check the Disk File. If it is labelled as VM-000001.vmdk, the virtual machine is running on snapshot disks.

Purpose

This article provides the steps to consolidate delta disks created by taking a snapshot of the virtual machine.

Resolution

To consolidate snapshots on an ESXi host:

  1. Log in as root to the ESXi console through iLO/DRAC. To log in to ESXi hosts using SSH, see Tech Support Mode for Emergency Support (1003677) and Using Tech Support Mode in ESXi 4.1 (1017910).
  2. Navigate to the virtual machine directory containing the .vmdk files.
  3. Run this command to list the files in the directory:

    ls -ltrh *.vmdk
  4. Locate any VM_NAME-00000#.vmdk or VM_NAME-00000#-delta.vmdk snapshot files. Look for numbered files following the hyphen (-) in the name. Also, verify that the timestamp is current on the delta files.
  5. Run this command to get a list of virtual machines and the VMID for each virtual machine:

    vim-cmd vmsvc/getallvms
  6. Make a note of the VMID for the specific virtual machine.
  7. To verify if the snapshot exists, run this command and check the Snapshot Name, Snapshot Created On, and Snapshot State:

    vim-cmd vmsvc/snapshot.get [VMID]

    You see an output similar to:

    Get Snapshot:
    |-ROOT
    --Snapshot Name        : Test
    --Snapshot Description :
    --Snapshot Created On  : 8/27/2009 13:49:55
    --Snapshot State       : powered on

  8. Run this command to remove all snapshots:

    vim-cmd vmsvc/snapshot.removeall [VMID]

  9. If the snapshot.removeall command fails with the error cannot find vmid, run this command to create a new snapshot:

    vim-cmd vmsvc/snapshot.create [VMID] [snapshotName]

  10. Run this command and try to remove all snapshots again:

    vim-cmd vmsvc/snapshot.removeall [VMID]
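Step 4's check for snapshot descriptors and delta extents can be scripted. The sketch below builds a throwaway mock VM directory with mktemp so it can be tried anywhere; on an ESXi host you would run the same find against the real directory under /vmfs/volumes/<datastore>/<vm> instead.

```shell
#!/bin/sh
# Sketch: flag snapshot files (numbered descriptors and -delta extents)
# in a VM directory. A mock directory stands in for the real one.
vmdir=$(mktemp -d)
touch "$vmdir/VM.vmdk" "$vmdir/VM-flat.vmdk" \
      "$vmdir/VM-000001.vmdk" "$vmdir/VM-000001-delta.vmdk"

# Matches either the snapshot descriptor (VM-00000#.vmdk) or its
# delta extent (VM-00000#-delta.vmdk); any hit means snapshots exist.
find "$vmdir" -name '*-00000*-delta.vmdk' -o -name '*-00000[0-9].vmdk'

rm -r "$vmdir"
```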

Tags

create-snapshots  snapshot-command-line

Configuring IPv6 and IPsec on vSphere ESX and ESXi 4.1

Purpose

VMware vSphere ESX/ESXi 4.1 supports IPv4 and IPv6, though IPv6 support is disabled by default. This article provides steps to enable IPv6, and optionally configure IPsec for IPv6 VMkernel traffic.

For more information, see Advanced Networking: Internet Protocol Version 6 in the ESX/ESXi 4.1 Configuration Guide.

For ESX/ESXi 4.0, see Configuring IPv6 on ESX 4.0.x (1010812).

Resolution

VMware vSphere ESX/ESXi 4.1 supports IPv6 for use with the Service Console and VMkernel management interfaces, and is compatible with Software iSCSI, vMotion, High Availability (HA) and Fault Tolerance (FT).

Note: IPv6 is not supported for a dependent hardware iSCSI adapter or with TCP Checksum Offload.

Enabling IPv6 on vSphere ESX/ESXi 4.1

IPv6 support can be enabled or disabled on a vSphere ESX/ESXi 4.1 host using the vSphere Client, the console, or the vSphere Command-Line Interface. Enabling IPv6 requires a reboot to take effect.

To enable IPv6 using the vSphere Client:

  1. Connect to the host or vCenter Server using the vSphere Client.
  2. Select the host in the inventory and click the Configuration tab.
  3. Under the Hardware section, click the Networking link.
  4. In the Virtual Switch view, click the top-level Properties link.
  5. Select Enable IPv6 support on this host system.
  6. Click OK.
  7. Reboot the host for changes to take effect.

    Note: To disable IPv6, deselect the checkbox and reboot.

To enable IPv6 using the console or vCLI commands:

  1. Open a console to the ESX or ESXi host, or to the system where the vCLI is installed.
  2. Enable IPv6 support on the VMkernel network interfaces using one of the commands:
    • At the console: esxcfg-vmknic --enable-ipv6 true
    • Using the vCLI: vicfg-vmknic <connection_options> --enable-ipv6 true
  3. For ESX only, additionally enable IPv6 support for the Service Console network interfaces using the command:
    • At the console: esxcfg-vswif --enable-ipv6 true
  4. Reboot the host for the changes to take effect.

    Note: To disable IPv6, replace true with false in the commands and reboot.

Configuring IPv6 interface addresses on vSphere ESX/ESXi 4.1

IPv6 addresses can be configured for VMkernel and Service Console network interfaces using the vSphere Client or using the command line.

To set an IPv6 address using the vSphere Client, see VMkernel Networking Configuration and Service Console Configuration in the ESX/ESXi 4.1 Configuration Guide.

To set an IPv6 address for a VMkernel network interface using the console or vCLI, use one of the commands:

esxcfg-vmknic --ip X:X:X:X::/X PortgroupName

vicfg-vmknic <connection_options> --ip X:X:X:X::/X PortgroupName

To set an IPv6 address for a Service Console network interface using the console, use the command:

esxcfg-vswif --ip X:X:X:X::/X vSwifName
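Before handing an address to esxcfg-vmknic or esxcfg-vswif, a coarse shape check on the X:X:X:X::/X value can catch obvious typos. The sketch below is only a rough filter (hex groups and colons, plus a prefix length of at most 128), not a full RFC 4291 validator, and runs in any /bin/sh.

```shell
#!/bin/sh
# Sketch: coarse sanity check of an IPv6 address/prefix string before
# passing it to esxcfg-vmknic / esxcfg-vswif. Intentionally permissive.
is_ipv6_cidr() {
    # Shape: hex digits and colons, then "/<1-3 digit prefix>"
    echo "$1" | grep -Eq '^[0-9A-Fa-f:]+/[0-9]{1,3}$' || return 1
    prefix=${1##*/}
    [ "$prefix" -le 128 ]   # prefix length must be 0-128
}

addr="fd00:10:20:1::5/64"   # example address; substitute your own
if is_ipv6_cidr "$addr"; then
    echo "would run: esxcfg-vmknic --ip $addr PortgroupName"
else
    echo "rejected: $addr" >&2
fi
```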

Configuring IPsec for IPv6 on vSphere ESX/ESXi 4.1

Internet Protocol Security (IPsec) secures IP communications coming from and arriving at an ESX/ESXi host. VMware vSphere ESX/ESXi 4.1 supports IPsec using IPv6 with manual key exchange for VMkernel network interfaces only.

When IPsec is enabled on a host, authentication and encryption of incoming and outgoing packets is performed. When and how IP traffic is encrypted depends on configuration of the system’s security associations and policies. For more information, see the Internet Protocol Security section of the ESX/ESXi Server Configuration Guide.

Configuration can be performed from the ESX/ESXi host console using the esxcfg-ipsec command, or remotely via the vSphere Command-Line Interface using the vicfg-ipsec command. Configuration of IPsec cannot be performed using the vSphere Client. The two commands have the same syntax, and only vicfg-ipsec is used in subsequent examples. For more information, see the vSphere Command-Line Interface documentation and the vicfg-ipsec command reference.

  • To add a Security Association (SA), use the command:

    vicfg-ipsec <connection_options> --add-sa --sa-src x:x::/x --sa-dst x:x::/x --sa-mode transport --ealgo null --spi 0x200 --ialgo hmac-sha1 --ikey key SAName

  • To add a Security Policy (SP), use the command:

    vicfg-ipsec <connection_options> --add-sp --sp-src x:x::/x --sp-dst x:x::/x --src-port 100 --dst-port 200 --ulproto tcp --dir out --action ipsec --sp-mode transport --sa-name SAName SPName

    For example, to add a generic security policy with default options:

    vicfg-ipsec <connection_options> --add-sp --sp-src any --sp-dst any --src-port any --dst-port any --ulproto any --dir out --action ipsec --sp-mode transport --sa-name SAName SPName

    For example, to add a security policy to filter traffic like a firewall:

    vicfg-ipsec <connection_options> --add-sp --sp-src x:x::/x --sp-dst x:x::/x --src-port 100 --dst-port 200 --ulproto tcp --dir out --action discard SPName

  • To list the defined Security Associations and Security Policies, use the commands:

    vicfg-ipsec <connection_options> --list-sa
    vicfg-ipsec <connection_options> --list-sp

  • To delete a defined Security Association or Security Policy, use the commands:

    vicfg-ipsec <connection_options> --remove-sa SAName
    vicfg-ipsec <connection_options> --remove-sp SPName

Additional Information

The Internet Engineering Task Force has designated IPv6 as the successor to IPv4. The adoption of IPv6, both as a standalone protocol and in a mixed environment with IPv4, is rapidly increasing. With IPv6, you can use vSphere features in an IPv6 environment.

A major difference between IPv4 and IPv6 is address length. IPv6 uses a 128-bit address rather than the 32-bit addresses used by IPv4. This helps alleviate the problem of address exhaustion that is present with IPv4 and eliminates the need for network address translation (NAT). Other notable differences include link-local addresses that appear as the interface is initialized, addresses that are set by router advertisements, and the ability to have multiple IPv6 addresses on an interface.

An IPv6-specific configuration in vSphere involves providing IPv6 addresses, either by entering static addresses or by using an automatic address configuration scheme for all relevant vSphere networking interfaces.

For more information, see the Advanced Networking: Internet Protocol Version 6 section of the ESX/ESXi 4.1 Configuration Guide

Editing configuration files in VMware ESXi and ESX

Purpose

This article provides steps to edit files in VMware ESX and VMware ESXi.

Resolution

Datastore Browser

This section is applicable to all versions of VMware ESX and ESXi.

To download, edit and upload files to a datastore using the Datastore Browser:

  1. Open the vSphere Client and connect to the vCenter, ESX or ESXi machine using appropriate administrator-level credentials.
  2. Select an ESXi or ESX host that can access the datastore containing the files you want to edit.
  3. Click the Configuration tab > Storage.
  4. Right-click on the datastore and click Browse Datastore.
  5. In the left pane, navigate to the directory that contains the files.
  6. In the right pane, right-click on the file you wish to edit and click Download.
  7. Download the file and make note of its location.
  8. Open a preferred text editor. For more information about preferred editors, see Preferred Editors in this article.
  9. Open the downloaded file, edit it, and save the file.
  10. Return to the vSphere Client.
  11. Right-click on the original file and click Rename.
  12. Add a .bak extension to the file name. This step is optional but ensures that any changes can be reverted easily.
  13. Identify the folder in the left pane where you want to upload the modified file.
  14. Click the upload icon in the toolbar. The icon is a cylinder with a green arrow pointing up.
  15. Click Upload File.
  16. Navigate to and click the file you just modified.
  17. Click Open.
  18. A warning appears concerning file naming and the potential to overwrite files. Read the warning and click Yes.

vSphere Management Assistant and vSphere Command-Line Interface

This section is applicable to VMware ESX 3.5 Update 2 and later.

To download, edit and upload files to a datastore using the vifs utility:

  1. Open a console to your vSphere Management Assistant (vMA) appliance or a vSphere Command-Line Interface (CLI).
  2. Download the file you wish to edit by executing:

    vifs.pl <connection parameters> --get '[<datastore>] <path>/<filename>' <localpath>/<filename>

    Where:

    • <connection parameters> specify the host that has access to the datastore or vCenter and the username and password of an Administrator account.
    • <datastore> is the name of the datastore that contains the file you wish to edit.
    • <path> is the path within the datastore that contains the file.
    • <filename> is the name of the file you wish to edit.
    • <localpath> is the path where you will download the file to. Make note of this location for future steps.

      For more information and examples about the vifs utility, see Performing File System Operations with vifs in the vSphere CLI documentation.
  3. Modify the file as required using the vMA, or transfer it to another system for modification. If you are using the vCLI, proceed to the following step.
  4. Open a preferred text editor. For more information about preferred editors, see Preferred Editors in this article.
  5. Open the downloaded file, modify it as required, and save the file.
  6. Return to the vMA appliance console or vSphere CLI.
  7. Make a backup copy of the original by executing:

    vifs.pl <connection parameters> --move '[<datastore>] <path>/<filename>' '[<datastore>] <path>/<filename>.bak'

    Where:

    • <connection parameters> specify the host that has access to the datastore or vCenter and the username and password of an Administrator account.
    • <datastore> is the name of the datastore that contains the file you wish to edit. For our purposes, both <datastore> values should be the same.
    • <path> is the path within the datastore that contains the file. For our purposes, both <path> values should be the same.
    • <filename> is the name of the file you wish to edit. For our purposes, the second <filename> value should include a .bak extension to indicate a backup copy.

      Caution: Omitting the additional .bak extension from the destination file name will overwrite your original file.

      For more information and examples about the vifs utility, see Performing File System Operations with vifs section of the vSphere CLI documentation.

  8. Upload the modified file to the original location by executing:

    vifs.pl <connection parameters> --put <localpath>/<filename> '[<datastore>] <path>/<filename>'

    Where:

    • <connection parameters> specify the host that has access to the datastore or vCenter and the username and password of an Administrator account.
    • <datastore> is the name of the datastore that contains the file you wish to edit.
    • <path> is the path within the datastore that contains the file.
    • <filename> is the name of the file you wish to edit.
    • <localpath> is the path of the local, modified copy of the file to upload.

      For more information and examples about the vifs utility, see the Performing File System Operations with vifs section of the vSphere CLI documentation.
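The vifs.pl calls above can be assembled from one set of variables. This sketch only prints the commands for review rather than executing them; the server, datastore, and file names are placeholders, and --server/--username are the standard vCLI connection options.

```shell
#!/bin/sh
# Sketch: build the get / move-to-backup / put sequence from variables.
# All values below are placeholders; substitute your own.
server="--server esxhost.example.com --username root"
ds="datastore1"; dir="vm1"; file="vm1.vmx"; local="/tmp/$file"

echo "vifs.pl $server --get '[$ds] $dir/$file' $local"
echo "vifs.pl $server --move '[$ds] $dir/$file' '[$ds] $dir/$file.bak'"
echo "vifs.pl $server --put $local '[$ds] $dir/$file'"
```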

VMware ESXi or ESX Terminal

This section applies to VMware ESXi and ESX 4.1 and earlier.

To edit files using the VMware ESX Service Console or VMware ESXi Technical Support Mode:

  1. Log into the VMware ESX host as the root user.
  2. Make a backup copy of the file you wish to edit by executing:

    cp <path>/<filename> <path>/<filename>.bak

    Where:

    • <path> is the full path of the file.
    • <filename> is the name of the file you wish to edit.

      Note: The second parameter in the cp command should have a filename with the .bak extension to indicate that it is a backup copy.

  3. Edit the file by executing:

    <editor> <path>/<filename>

    Where:

    • <editor> is your preferred editor. For more information about preferred editors, see Preferred Editors in this article.
    • <path> is the full path to the file.
    • <filename> is the name of the file you wish to edit.
  4. If prompted to overwrite, ensure that you have made a backup copy and type y. Press Enter to commit your changes.
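The backup-then-edit pattern in steps 2-3 can be sketched as follows, with a temporary file standing in for a real configuration file and a non-interactive write standing in for the editor session.

```shell
#!/bin/sh
# Sketch: back up a file before editing it, so the change can be reverted.
file=$(mktemp)
echo "original setting" > "$file"

cp "$file" "$file.bak"            # step 2: keep a pristine copy
echo "edited setting" > "$file"   # step 3: the edit (normally via vi/nano)

# The backup still holds the original content and can restore it:
cat "$file.bak"
rm "$file" "$file.bak"
```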

Preferred Editors

There are different editors to choose from, depending on your platform. This is a non-exhaustive list of editors available on different platforms:

  • VMware ESX includes the open source terminal-based editors nano and vi. For more information, see the vi man pages or the nano man pages.
  • VMware ESXi includes the open source terminal-based editor vi. For more information, see the vi man pages.
  • Windows-based machines include basic text editors such as Notepad and WordPad. If you are using either of these text editors, ensure that you save files in their original format and encoding.

Notes:

  • The editors available in the shell (vi or nano) are meant for troubleshooting purposes only and must be used only when directed by VMware. The vSphere Client, vCLI, and PowerCLI should be the primary method of propagating changes to your ESX/ESXi host.
  • Notepad saves documents as ANSI text which may not be the same file format as the files downloaded from a VMware ESX or ESXi host. For more information, see Using different language formats in Notepad from Microsoft’s Windows XP Professional Product Documentation.

See Also

Collecting diagnostic information for VMware ESX/ESXi using the vSphere Client

Details

Note: If you have been directed to this article through a VMware Fusion support request, see Collecting diagnostic information for VMware Fusion (1003894).

VMware Technical Support routinely requests diagnostic information from you when a support request is addressed. This diagnostic information contains product-specific logs and configuration files from the host on which the product runs. This information is gathered using a specific script or tool within the product.

This article provides procedures for obtaining diagnostic information for a VMware ESX/ESXi host using the vSphere or VI Client. For other methods of collecting the same information, see Collecting diagnostic information for VMware ESX/ESXi using the vm-support command (1010705) and Collecting diagnostic information for VMware vCenter Server and ESX/ESXi using the vSphere PowerCLI (1027932).

The diagnostic information obtained by using this article is uploaded to VMware Technical Support. To uniquely identify your information, use the Support Request (SR) number you receive when you create the new SR.

Solution

Diagnostic information can be obtained from VMware ESX/ESXi hosts using the vSphere or VI Client. The user interface differs between versions; use the instructions appropriate for your version:

Obtaining Diagnostic Information for ESXi 5.0 hosts using the vSphere Client

ESXi 5.0 host diagnostic information can be gathered using the vSphere Client connected to the ESXi host or to vCenter Server.

To gather diagnostic data using the VMware vSphere Client:

  1. Open the vSphere Client and connect to vCenter Server or directly to an ESXi 5.0 host.
  2. Log in using an account with administrative privileges or with the Global.Diagnostics permission.
  3. Select an ESXi host, cluster, or datacenter in the inventory.
  4. Click File > Export > Export System Logs.
  5. If a group of ESXi hosts is available in the selected context, select the host or group of hosts from the Source list.
  6. Click Next.
  7. In the System Logs pane, select the components for which the diagnostic information must be obtained. To collect diagnostic information for all the components, click Select All.
  8. If required, select the Gather performance data option and specify a duration and interval.
  9. Click Next.
  10. In the Download Location pane, click Browse and select a location on the client's disk where you want to save the support bundle.
  11. Click Next.
  12. In the Ready to Complete pane, review the summary and click Finish. The Downloading System Logs Bundles dialog appears and provides progress status for the creation and downloading of the support bundle from each source. A Generate system logs bundles task is created.
  13. When complete, upload the logs to the FTP site. For more information, see Uploading diagnostic information to VMware (1008525).

Obtaining Diagnostic Information for ESX/ESXi 4.x hosts using the vSphere Client

ESX/ESXi 4.x host diagnostic information can be gathered using the vSphere Client connected to the ESX/ESXi host or vCenter Server.

To gather diagnostic data using the VMware vSphere Client:

  1. Open the vSphere Client and connect to vCenter Server or directly to an ESXi 4.x host.
  2. Log in with an administrative user or other account with the Global.Diagnostics permission.
  3. Click the File menu and select Export, Export System Logs.
  4. In the Export System Logs dialog, select the host or group of hosts to collect diagnostic information from. Notes:
    • The list of hosts is not displayed when the vSphere Client is directly connected to an ESX/ESXi host.
    • Selecting Include information from vCenter Server and vSphere Client includes logs from vCenter Server and the Client in the same export.
  5. Specify a location on the client’s disk to save the support bundle. Click the Browse button, select a directory, and click OK.
  6. A Generate system logs task is created. When complete, the logs are downloaded by the client. At this stage, it is common to receive a Certificate Security Warning. Optionally view the certificate and install it to prevent future warnings, or click Ignore to download the log bundle.
  7. The log bundle(s) from the selected host(s) appear in the specified directory. When complete, upload the logs to the FTP site. For more information, see Uploading diagnostic information to VMware (1008525).

Obtaining Diagnostic Information For ESX 2.x and 3.x hosts using the Virtual Infrastructure Client

ESX/ESXi 3.x and ESX 2.x host diagnostic information can be gathered using the VMware Virtual Infrastructure (VI) Client connected to the ESX/ESXi host or to the VirtualCenter Server.

To gather diagnostic data using the VMware VI Client:

  1. Open the VI Client and connect to VirtualCenter Server or directly to an ESX/ESXi host.
  2. Log in with an administrative user or other account with the Global.Diagnostics permission.
  3. Click the File menu and select Export, Export Diagnostic Data.
  4. In the Save Diagnostic Data dialog, select the host or group of hosts to collect diagnostic information from. Notes:
    • The list of hosts is not displayed when the VI Client is directly connected to an ESX/ESXi host.
    • Selecting Include information from VirtualCenter Server includes logs from VirtualCenter Server in the same export.
  5. Specify a location on the client’s disk to save the support bundle. Click the Browse button, select a directory, and click OK.
  6. A Generate System Logs task is created. When complete, the logs are downloaded by the client. At this stage, it is common to receive a Certificate Problem Warning. Optionally choose to ignore certificate errors, then click Ignore to download the log bundle.
  7. The log bundle(s) from the selected host(s) appear in the specified directory. When complete, upload the logs to the FTP site. For more information, see Uploading diagnostic information to VMware (1008525).

ESX and ESXi 4.x password requirements and restrictions

Symptoms

  • You are unable to set a password in ESXi 4.x
  • You see the VMware vSphere Client error:
    • A general system error occurred: passwd: Authentication token manipulation error
    • An internal error has occurred, and the wizard is unable to store the Administrator password securely. The customization cannot proceed. Please contact VMware technical support for more information.
  • You see the Console error: Weak password: not enough different characters or classes for this length.
    passwd: Authentication token manipulation error

Purpose

This article provides information about ESX and ESXi 4.x password requirements and restrictions.

Resolution

This issue may occur if a password is invalid.

A valid password requires a mix of upper and lower case letters, digits, and other characters. You can use a password of at least 7 characters with characters from at least 3 of these 4 classes, or a password of at least 6 characters containing characters from all 4 classes. An upper case letter that begins the password and a digit that ends it do not count towards the number of character classes used. It is recommended that the password does not contain the username.

A passphrase must contain at least 3 words, be 8 to 40 characters long, and contain enough different characters.
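The length and character-class rules above can be sketched in a few lines (a simplified illustration of the stated policy, not pam_passwdqc's actual implementation; the function names are hypothetical):

```python
def character_classes(password: str) -> int:
    """Count character classes used. Per the rule above, a leading
    upper-case letter and a trailing digit do not count."""
    core = password
    if core and core[0].isupper():
        core = core[1:]
    if core and core[-1].isdigit():
        core = core[:-1]
    classes = set()
    for ch in core:
        if ch.islower():
            classes.add("lower")
        elif ch.isupper():
            classes.add("upper")
        elif ch.isdigit():
            classes.add("digit")
        else:
            classes.add("other")
    return len(classes)

def is_valid_password(password: str) -> bool:
    """7+ characters from at least 3 classes, or 6+ characters
    from all 4 classes."""
    n, c = len(password), character_classes(password)
    return (n >= 7 and c >= 3) or (n >= 6 and c >= 4)

# "Passwd1" looks mixed, but the leading capital and trailing digit
# are discounted, leaving a single character class:
print(is_valid_password("Passwd1"))   # False
print(is_valid_password("pAsswd1!"))  # True
```

The discounting rule is why seemingly mixed passwords such as Passwd1 are still rejected.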

Notes:

  • vCenter 4.0 supports passwords of up to 26 characters.
  • In ESXi 4.x, the password cannot contain the words admin, root, or administrator in any form.

Caution: Modifying password restrictions may reduce the security of your VMware environment.

ESX 4.0

VMware ESX 4.x uses the PAM module pam_passwdqc.so. To learn more about this module and its syntax, see the pam_passwdqc man page.

To disable the restriction:

  1. Modify the /etc/pam.d/system-auth-generic file. Run the command: vi /etc/pam.d/system-auth-generic
  2. Change the line: password   required   /lib/security/$ISA/pam_passwdqc.so   min=8,8,8,7,6 similar=deny match=0

    to:

    password   required   /lib/security/$ISA/pam_passwdqc.so   min=0,0,0,0,0 similar=deny match=0

    or

    password   required   pam_cracklib.so try_first_pass retry=3

  3. Save the changes and change the password.

ESXi 4.0 and ESXi/ESX 4.1

VMware ESXi/ESX 4.1 and ESXi 4.0 use the pam_passwdqc.so module to check password strength. By default, it uses these parameters:

pam_passwdqc.so retry=3 min=8,8,8,7,6

To modify these settings on an ESX/ESXi 4.1.x host, enter technical support mode and edit the /etc/pam.d/system-auth file.

Note: To ensure that changes to the file persist upon reboot, run this command before making edits to the /etc/pam.d/system-auth file: chmod +t /etc/pam.d/system-auth.

For more information about technical support mode, see:

For more information about the pam_passwdqc.so module syntax, see the pam_passwdqc man page.

Sample Configuration – Network Load Balancing (NLB) Multicast mode over routed subnet – Cisco Switch Static ARP Configuration

Purpose

NLB Multicast Mode – Static ARP Resolution

  • NLB packets are unconventional: the IP address is unicast while the MAC address is multicast, so switches and routers drop them.
  • Because routers and switches drop NLB multicast packets, the ARP tables of switches are never populated with the cluster IP and MAC address.
  • Manual ARP resolution of the NLB cluster address is therefore required on physical switch and router interfaces.
  • Cluster IP and MAC static resolution is set on each switch port that connects to ESX/ESXi host
  • Virtual Switch NIC Team Policy > Notify Switches is set to Yes.
  • If CDP (Cisco Discovery Protocol) is enabled on ESX/ESXi and the Cisco switch, you can determine the switch ports connecting to ESX/ESXi via the vSphere/Virtual Infrastructure Client.

Resolution

To configure the switch:

  1. Telnet to the Cisco switch console and log in.
  2. Run this command to enter configuration mode: config t
  3. Configure static ARP resolution in Cisco global configuration mode. For example:

    arp [ip] [cluster multicast mac] ARPA
    arp 192.168.1.100 03bf.c0a8.0164 ARPA

  4. Configure static MAC resolution in Cisco global configuration mode. For example:

    mac-address-table static [cluster multicast mac] [vlan id] [interface]
    mac-address-table static 03bf.c0a8.0164 vlan 1 interface GigabitEthernet1/1 GigabitEthernet1/2
    GigabitEthernet1/15 GigabitEthernet1/16
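The cluster multicast MAC in the examples above is not arbitrary: NLB multicast mode derives it from the cluster IP by prefixing 03bf to the IP's four octets in hexadecimal (192.168.1.100 → c0.a8.01.64). A small sketch of that derivation (an illustrative helper, not a Cisco or VMware tool):

```python
def nlb_multicast_mac(cluster_ip: str) -> str:
    """Derive the NLB multicast-mode cluster MAC from the cluster IP,
    formatted in Cisco dotted-hex (xxxx.xxxx.xxxx) notation."""
    octets = [int(o) for o in cluster_ip.split(".")]
    # 0x03bf prefix, then the IP's four octets as two hex digits each.
    raw = "03bf" + "".join(f"{o:02x}" for o in octets)
    # Group the 12 hex digits into Cisco's three 4-digit fields.
    return ".".join(raw[i:i + 4] for i in range(0, 12, 4))

print(nlb_multicast_mac("192.168.1.100"))  # 03bf.c0a8.0164
```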

Sample Configuration – Network Load Balancing (NLB) Multicast Mode Configuration

Purpose

Setting up NLB Multicast Mode

Multicast mode does not have the problem that unicast operation does since the servers can communicate with each other via the original addresses of their NLB network cards.

Each server’s NLB network card operating in multicast mode has two MAC addresses (the original one and the virtual one for the cluster), which causes some problems. Most routers reject the ARP replies sent by hosts in the cluster, since the router sees a response to the ARP request that contains a unicast IP address with a multicast MAC address. The router considers this to be invalid and rejects the update to the ARP table. In this case, you need to manually configure static ARP resolution at the switch or router for each port connecting to ESX’s NICs. For related information, see Sample Configuration – Network Load Balancing (NLB) Multicast mode over routed subnet – Cisco Switch Static ARP Configuration (1006525).

Resolution

These versions of Windows are recommended:

  • Microsoft Windows 2003 Server and later
  • Microsoft Windows 2000 Server with Load Balancing

To configure NLB in your guest operating system:

Note: Each NLB cluster node is required to have NLB enabled and configured with the same cluster IP address and FQDN.

  1. Go to Local Area Connection Properties > General tab and select the Network Load Balancing check box.
  2. Click Properties.
  3. Click the Cluster Parameters tab.
  4. Enter the Cluster IP address.
  5. Select the Multicast option.
  6. Click the Host Parameters tab.
  7. Set the Priority (unique host identifier).
  8. Enter the dedicated host IP. This is the same as the NIC IP.
  9. Click OK.
  10. Select the Internet Protocol (TCP/IP) networking protocol.
  11. Enter the dedicated host IP specified in Step 8.
  12. Click Advanced.
  13. Click Add.
  14. Enter the Cluster IP.
  15. Click OK.

For additional information, see Sample Configuration – Network Load Balancing (NLB) Multicast mode over routed subnet – Cisco Switch Static ARP Configuration (1006525).

Note: For more information on weak and strong host behavior in Windows, see the Microsoft TechNet article http://technet.microsoft.com/en-us/magazine/2007.09.cableguy.aspx

 

 

Removing an ESX host with running virtual machines from VirtualCenter

Details

In the event that you need to remove an ESX host from VirtualCenter without entering maintenance mode or affecting the running virtual machines, you must manually remove the RPMs associated with the registration of the ESX host to VirtualCenter.

Solution

To remove the ESX host from VirtualCenter without affecting the running virtual machines you must:

  1. Right-click the ESX host in the VirtualCenter inventory.
  2. Choose Disconnect.
  3. Right-click on the ESX host and select Remove. The ESX host is safely removed from VirtualCenter.

VMware recommends removing the VirtualCenter management agent and VMware HA services afterward. This can be done from the ESX Service Console by running the following commands:

  • export LGTO_AAM_VMWARE_REMOVAL=1
  • rpm -e LGTOaama
  • rpm -e LGTOaamvm
  • rpm -e VMware-vpxa

At this point, the ESX host may be added to another VirtualCenter server for management or be used as a standalone server.

VMware ESX and ESXi 4.0 Comparison

Purpose

This article provides a detailed comparison of VMware ESX and ESXi 4.0. The article is separated into capabilities or features and compared at that level.

Resolution

Capability VMware ESX VMware ESXi
Service Console Service Console is a standard Linux environment through which a user has privileged access to the VMware ESX kernel. This Linux-based privileged access allows you to manage your environment by installing agents and drivers and executing scripts and other Linux-environment code. VMware ESXi is designed to make the server a computing appliance. Accordingly, VMware ESXi behaves more like firmware than traditional software. To provide hardware-like security and reliability, VMware ESXi does not support a privileged access environment like the Service Console for management of VMware ESXi. To enable interaction with agents, VMware has provisioned CIM Providers through which monitoring and management tasks – traditionally done through Service Console agents – can be performed. VMware has provided remote scripting environments such as vCLI and PowerCLI to allow the remote execution of scripts.
CLI-Based Configuration VMware ESX Service Console has a host CLI through which VMware ESX can be configured. VMware ESX can also be configured using vSphere CLI (vCLI). The vSphere CLI (vCLI) is a remote scripting environment that interacts with VMware ESXi hosts to enable host configuration through scripts or specific commands. It replicates nearly all the equivalent COS commands for configuring ESX. Notes:

  • vCLI is limited to read-only access for the free version of VMware ESXi. To enable full functionality of vCLI on a VMware ESXi host, the host must be licensed with vSphere Essentials, vSphere Essential Plus, vSphere Standard, vSphere Advanced, vSphere Enterprise, or vSphere Enterprise Plus.
  • VMware vSphere PowerCLI (for Windows) and vSphere SDK for Perl access ESXi through the same API as vCLI. Similarly, these toolkits are limited to read-only access for the free version of VMware ESXi. When the host is upgraded to vSphere Essentials, vSphere Essential Plus, vSphere Standard, vSphere Advanced, vSphere Enterprise, or vSphere Enterprise Plus, these toolkits have write access and provide a scriptable method for managing ESXi hosts. Certain COS commands have not been implemented in the vCLI because they pertain to the management of the COS itself and not ESXi. For details, see the vSphere Command-Line Interface Documentation.
Scriptable Installation VMware ESX supports scriptable installations through utilities like KickStart. VMware ESXi Installable does not support scriptable installations in the manner ESX does, at this time. VMware ESXi does provide support for post-installation configuration scripts using vCLI-based configuration scripts.
Boot from SAN VMware ESX supports boot from SAN. Booting from SAN requires one dedicated LUN per server. VMware ESXi may be deployed as an embedded hypervisor or installed on a hard disk. In most enterprise settings, VMware ESXi is deployed as an embedded hypervisor directly on the server. This operational model does not require any local storage, and no SAN booting is required because the hypervisor image is directly on the server.

The installable version of VMware ESXi does not support booting from SAN.

Serial Cable Connectivity VMware ESX supports interaction through direct-attached serial cable to the VMware ESX host. VMware ESXi does not support interaction through direct-attached serial cable to the VMware ESXi host at this time.
SNMP VMware ESX supports SNMP. VMware ESXi supports SNMP when licensed with vSphere Essentials, vSphere Essential Plus, vSphere Standard, vSphere Advanced, vSphere Enterprise, or vSphere Enterprise Plus. The free version of VMware ESXi does not support SNMP.
Active Directory Integration VMware ESX supports Active Directory integration through third-party agents installed on the Service Console. VMware ESXi does not support Active Directory authentication of local users at this time.
HW Instrumentation Service Console agents provide a range of HW instrumentation on VMware ESX. VMware ESXi provides HW instrumentation through CIM Providers. Standards-based CIM Providers are distributed with all versions of VMware ESXi. VMware partners include their own proprietary CIM Providers in customized versions of VMware ESXi. These customized versions are available either from VMware’s web site or the partner’s web site, depending on the partner. Remote console applications like Dell DRAC, HP iLO, IBM RSA, and FSC iRMC S2 are supported with ESXi.
Software Patches and Updates VMware ESX software patches and upgrades behave like traditional Linux based patches and upgrades. The installation of a software patch or upgrade may require multiple system boots as the patch or upgrade may have dependencies on previous patches or upgrades. VMware ESXi patches and updates behave like firmware patches and updates. Any given patch or update is all-inclusive of previous patches and updates. That is, installing patch version “n” includes all updates included in patch versions n-1, n-2, and so forth.  Furthermore, third party components such as OEM CIM providers can be updated independently of the base ESXi component, and vice versa.
VI Web Access VMware ESX supports managing your virtual machines through VI Web Access. You can use the VI Web Access to connect directly to the ESX host or to the VMware Infrastructure Client. VMware ESXi does not support web access at this time.
Licensing For licensing information, see the VMware vSphere Editions Comparison. For licensing information, see the VMware vSphere Editions Comparison.
Diagnostics and Troubleshooting VMware ESX Service Console can be used to issue commands that can help diagnose and repair support issues with the server. VMware ESXi has several ways to enable support of the product:

  • Remote command sets such as the vCLI include diagnostic commands such as vmkfstools, resxtop, and vmware-cmd.
  • The console interface of VMware ESXi (known as the DCUI or Direct Console User Interface) has functionality to help repair the system, including restarting of all management agents.
  • Tech Support Mode, which allows low-level access to the system so that advanced diagnostic commands can be issued. For more information, see Tech Support Mode for Emergency Support (1003677).
Jumbo Frames VMware ESX 4.0 fully supports Jumbo Frames. VMware ESXi 4.0 fully supports Jumbo Frames.

Timekeeping best practices for Windows, including NTP

Details

This article presents best practices for achieving accurate timekeeping in Windows Guest operating systems. These recommendations include a suggested configuration for time synchronization in the guest and on the host.

Solution

For achieving accurate timekeeping in Windows guest operating systems, there are two main issues to consider: correctly configuring time synchronization and avoiding excessive CPU and memory overcommitment. Time synchronization utilities are necessary to correct time drift introduced by hardware clock inaccuracy and guest operating system timekeeping imprecision. Excessive overcommitment can cause timekeeping drift at rates that are uncorrectable by time synchronization utilities. This best practices document covers time synchronization recommendations.

Time Synchronization

Use either w32time or NTP as the primary time synchronization utility. w32time is the time synchronization utility that ships with Windows. NTP (the Network Time Protocol daemon) is available for Windows through a variety of third-party ports.

Windows Version Recommended Time Sync Utility
Windows 2008 w32time or NTP
Windows Vista w32time or NTP
Windows 2003 w32time or NTP
Windows XP NTP
Windows 2000 NTP

Configuring w32time

When using w32time, there are a number of configuration parameters that can be changed. The table below describes the relevant parameters and gives a recommended value. All of the parameters are stored in the registry. Some of them can also be modified via the w32tm utility instead of directly editing the registry. This best practices guide covers running w32time in NTP mode. w32time can also use the windows domain hierarchy as time servers, which is not covered in this best practices guide.

After changing w32time’s settings it is necessary to restart w32time. Either reboot the virtual machine, run net stop w32time && net start w32time from the command line, or stop and start the w32time service. After restarting the w32time service, run the command w32tm /resync to force w32time to resync the time.

Key: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\W32Time\TimeProviders\NtpClient\SpecialPollInterval

Recommended Value: 900
Type: REG_DWORD

Description: This parameter controls how often w32time polls the time server to check whether time on the client needs to be corrected. The parameter is specified as a number of seconds to wait between polls. The recommended value of 900 specifies that the time server should be polled once every 15 minutes.

Key: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\W32Time\Parameters\NtpServer

Recommended Value: 1.pool.ntp.org,0x1 2.pool.ntp.org,0x1 3.pool.ntp.org,0x1
Type: REG_SZ

Description: This parameter specifies the time servers to use. It is specified as a string of space-separated servers. Specifying ",0x1" after a server name indicates that the server should be contacted at the frequency specified by the SpecialPollInterval setting.

Note: Modify the recommended value to point to the ntp servers available in your environment.

w32tm Command:

w32tm /config /manualpeerlist:"1.pool.ntp.org,0x1 2.pool.ntp.org,0x1 3.pool.ntp.org,0x1"

Note: Modify the command to use the ntp servers available in your environment.
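The NtpServer value's syntax can be illustrated with a small parser (an illustration of the space-separated server,flags format described above, not Windows code; the function name is hypothetical):

```python
def parse_ntp_server_value(value: str) -> list:
    """Split a W32Time NtpServer REG_SZ value into (server, flags)
    pairs. Flag 0x1 means the server is polled at the frequency
    given by SpecialPollInterval."""
    entries = []
    for item in value.split():
        if "," in item:
            server, flags = item.rsplit(",", 1)
            entries.append((server, int(flags, 16)))
        else:
            entries.append((item, 0))  # no flags specified
    return entries

value = "1.pool.ntp.org,0x1 2.pool.ntp.org,0x1 3.pool.ntp.org,0x1"
print(parse_ntp_server_value(value))
```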

Key: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\W32Time\Parameters\Type

Recommended Value: NTP
Type: REG_SZ

Description: This parameter specifies the mode that w32time should use. A value of NTP indicates that w32time synchronizes from the NTP servers specified in the NtpServer key; a value of NT5DS indicates that w32time should instead use the Windows domain hierarchy as its time source.

w32tm Command:

w32tm /config /syncfromflags:MANUAL

VMware Tools Time Synchronization and Configuration

When using w32time or NTP in the guest, disable VMware Tools periodic time synchronization.

To disable VMware Tools periodic time sync, use one of these options:

  • Set tools.syncTime = "0" in the configuration file (.vmx file) of the virtual machine. OR
  • Deselect Time synchronization between the virtual machine and the host operating system in the VMware Tools toolbox GUI of the guest operating system. OR
  • Run the VMwareService.exe -cmd "vmx.set_option synctime 1 0" command in the guest operating system. VMwareService.exe is typically installed in C:\Program Files\VMware\VMware Tools.

These options do not disable one-time synchronizations done by VMware Tools for events such as tools startup, taking a snapshot, resuming from a snapshot, resuming from suspend, or VMotion. These events synchronize time in the guest operating system with time in the host operating system even if VMware Tools periodic time sync is disabled, so it is important to make sure that the host operating system’s time is correct. For more information, see Timekeeping in VMware Virtual Machines.

To ensure the host operating system’s time is correct for VMware ACE, VMware Fusion, VMware GSX Server, VMware Player, VMware Server, and VMware Workstation run time synchronization software such as NTP or w32time in the host. For VMware ESX, run NTP in the service console. For VMware ESXi, run NTP on the VMkernel.

Necessary Patches

Time runs too fast in a Windows virtual machine when the Multimedia Timer interface is used (1005953) describes a known issue that may cause problems when running Windows in a virtual machine. This issue is addressed in recent VMware products. The table below specifies the actions required to ensure that you are using a product version that contains the fix:

Product Action
ESX 3.5 and later No action required
ESX 3.0.3 Ensure patch ESX303-200910401-BG is applied
ESX 3.0.2 Ensure patch ESX-1002087 is applied
ESX 3.0.1 Ensure patch ESX-1002082 is applied
ESX 3.0.0 Ensure patch ESX-1002081 is applied
ESX 2.5.x and earlier Upgrade to ESX 3.0.0 or later
Fusion 2.0 and later No action required
Fusion 1.x Upgrade to Fusion 2.0 or later
Player 2.0 and later No action required
Player 1.x Upgrade to Player 2.0 or later
Workstation 6.0 and later No action required
Workstation 5.x Upgrade to Workstation 6.0 or later
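The action matrix above can be encoded as a simple lookup (a hypothetical helper covering only the ESX rows; version strings and patch IDs are taken from the table):

```python
# Patch required per affected ESX version, from the table above.
REQUIRED_PATCH = {
    "3.0.0": "ESX-1002081",
    "3.0.1": "ESX-1002082",
    "3.0.2": "ESX-1002087",
    "3.0.3": "ESX303-200910401-BG",
}

def action_for_esx(version: str) -> str:
    """Return the required action for a given ESX version string."""
    major, minor, *_ = (int(p) for p in version.split("."))
    if (major, minor) >= (3, 5):
        return "No action required"
    if version in REQUIRED_PATCH:
        return "Ensure patch %s is applied" % REQUIRED_PATCH[version]
    return "Upgrade to ESX 3.0.0 or later"

print(action_for_esx("3.0.2"))  # Ensure patch ESX-1002087 is applied
print(action_for_esx("2.5.4"))  # Upgrade to ESX 3.0.0 or later
```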

Reserved or overhead ports for virtual switches

Details

To account for overheads such as physical NIC ports (uplinks), CDP traffic, and network discovery, ESX 3.x, ESX/ESXi 3.5.x, and ESX/ESXi 4.x allocate and reserve an additional eight ports per virtual switch beyond what is available for virtual machine use. This additional overhead is allocated regardless of the number of ports on the virtual switch, and accounts for the most common product deployment scenarios.

Solution

Configuring a virtual switch from vSphere Client
When configuring a virtual switch from vSphere Client, the ESX machine provides available virtual switch port count, which already reflects the fixed overhead (8, 24, 56, 120, 248, 504, 1016, 2040, and 4088 available ports respectively).

Note: Only ESX/ESXi 4.x have 2040 or 4088 ports.

Configuring a virtual switch using esxcfg-vswitch command
When configuring a virtual switch using esxcfg-vswitch command, this overhead must be explicitly taken into account when specifying the total number of ports to allocate. Available port counts as displayed in vSphere Client need to be incremented by eight when working with esxcfg-vswitch command.

vSphere Client  esxcfg-vswitch Total Ports
8 16 8 virtual machine + 8 Reserved
24 32 24 virtual machine + 8 Reserved
56* 64 56 virtual machine + 8 Reserved
120** 128 120 virtual machine + 8 Reserved
248 256 248 virtual machine + 8 Reserved
504 512 504 virtual machine + 8 Reserved
1016 1024 1016 virtual machine + 8 Reserved
2040*** 2048 2040 virtual machine + 8 Reserved
4088*** 4096 4088 virtual machine + 8 Reserved

* = System default for new virtual switches in ESX/ ESXi 3.x

** = System default for new virtual switches in ESX/ ESXi 4.x

*** = Only for ESX/ESXi 4.x

Note: Certain higher-complexity product deployment scenarios might require more than eight virtual switch ports for overhead.

Typically, each additional uplink connected to the same virtual switch beyond the first six uplinks reduces the number of ports available on that virtual switch for virtual machine use by one.

Deployment scenarios where a very large number of uplinks are teamed together on a single virtual switch might significantly impact the number of ports on that virtual switch available for virtual machine use, and the overall size of the virtual switch might need to be adjusted accordingly.

The current port utilization data for virtual switches can be reviewed by using the esxcfg-vswitch --list command.

The current overhead utilization on a given virtual switch can be calculated by subtracting the Used Ports value for all PortGroups from the Used Ports value for that virtual switch.
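Both rules above amount to simple arithmetic, sketched here with hypothetical port counts (an illustration of the calculations, not a VMware tool):

```python
RESERVED_PORTS = 8  # fixed per-virtual-switch overhead

def esxcfg_total_ports(client_visible_ports: int) -> int:
    """Total ports to pass to esxcfg-vswitch for a given vSphere
    Client-visible port count."""
    return client_visible_ports + RESERVED_PORTS

def overhead_in_use(vswitch_used: int, portgroup_used: list) -> int:
    """Current overhead utilization: the virtual switch's Used Ports
    minus the sum of Used Ports across all its port groups."""
    return vswitch_used - sum(portgroup_used)

print(esxcfg_total_ports(56))          # 64, the ESX 3.x default
print(overhead_in_use(40, [16, 16]))   # 8 overhead ports in use
```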

Synchronizing ESX/ESXi time with a Microsoft Domain Controller

Symptoms

An ESX or ESXi host configured to use a Microsoft Windows 2003 or newer Domain Controller as a time source never synchronizes its clock with a default configuration.

Resolution

This issue is resolved in ESX/ESXi 4.1 Update 2. You can download the most recent version of ESX/ESXi 4.1 from the VMware Download Center.

Workaround

If you are using ESX/ESXi 4.1 Update 1, you can use this workaround:

When using Active Directory integration in ESX/ESXi 4.1 and newer, it is important to synchronize time between ESX/ESXi and the directory service to facilitate the Kerberos security protocol.

ESX and ESXi support synchronization of time with an external NTPv3 or NTPv4 server compliant with RFC 5905 and RFC 1305. Microsoft Windows 2003 and newer use the W32Time service to synchronize time for Windows clients and facilitate the Kerberos v5 protocol. For more information, see the Microsoft Knowledge Base article 939322 and How the Windows Time Service Works.

By default, an unsynchronized Windows server chooses a 10-second dispersion and adds to the dispersion on each poll interval that it remains unsynchronized. An ESX/ESXi host, by default, does not accept any NTP reply with a root dispersion greater than 1.5 seconds.
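The consequence of those two defaults can be sketched as follows (illustrative arithmetic only, not VMware's NTP implementation):

```python
ESX_MAX_ROOT_DISPERSION = 1.5       # seconds; default ESX/ESXi acceptance limit
UNSYNCED_WINDOWS_DISPERSION = 10.0  # seconds; unsynced W32Time default

def esx_accepts_reply(root_dispersion: float) -> bool:
    """ESX/ESXi rejects NTP replies whose root dispersion
    exceeds 1.5 seconds."""
    return root_dispersion <= ESX_MAX_ROOT_DISPERSION

# An unsynchronized Windows server's dispersion starts at ~10 s and
# only grows, so with defaults the ESX host never accepts a reply:
print(esx_accepts_reply(UNSYNCED_WINDOWS_DISPERSION))  # False
print(esx_accepts_reply(0.05))                         # True
```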

The preceding links were correct as of March 14, 2010. If you find a link is broken, provide feedback and a VMware employee will update the link.

Configure Windows NTP Client

ESX/ESXi requires an accurate time source to synchronize with. To use a Windows 2003 or newer server, it should be configured to get its time from an accurate upstream NTP server. For more information, see the Microsoft Knowledge Base article 816042.

The preceding link was correct as of March 14, 2010. If you find a link is broken, provide feedback and a VMware employee will update the link.

Use the registry editor on the Windows server to make the configuration changes:

  1. Enable NTP mode:
    1. Locate HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\W32Time\Parameters
    2. Set the Type value to "NTP"
  2. Configure the server to announce itself as a reliable time source:
    1. Locate HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\W32Time\Config
    2. Set the AnnounceFlags value to 5
  3. Specify the upstream NTP servers to sync from:
    1. Locate HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\W32Time\Parameters
    2. Set the NtpServer value to a list of at least 3 NTP servers. Example: You might set the value to 1.pool.ntp.org,0x1 2.pool.ntp.org,0x1 3.pool.ntp.org,0x1
  4. Specify a 15-minute update interval:
    1. Locate HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\W32Time\TimeProviders\NtpClient
    2. Set the SpecialPollInterval value to 900
  5. Restart the W32time service for the changes to take effect.

Configure ESX/ESXi NTP and Likewise Clients

Configure ESX/ESXi to synchronize time with the Windows server Active Directory Domain Controller:

  1. Connect to the ESX/ESXi host or vCenter Server using the vSphere Client.
  2. Select the ESX/ESXi host in the inventory.
  3. Click the Configuration tab.
  4. Under the Software heading, click Time Configuration.
  5. Click Properties.
  6. Ensure that the NTP Client Enabled option is selected.
  7. Click Options.
  8. Click NTP Settings.
  9. Click Add and specify the fully qualified domain name or IP address of the Windows server Domain Controller(s).
  10. Click OK.
  11. Click OK to save the changes.

Additional configuration must be done from the command line.

  1. Open a console to the ESX or ESXi host. For more information, see Connecting to an ESX host using a SSH client (1019852) or Using Tech Support Mode in ESXi 4.1 and ESXi 5.0 (1017910).
  2. Open the file /etc/ntp.conf in a text editor. For more information, see Editing configuration files in VMware ESXi and ESX (1017022).
  3. Add the tos maxdist command on its own line: tos maxdist 30
  4. Save the configuration file.
  5. Make the file /etc/likewise/lsassd.conf writable using the command: chmod +w /etc/likewise/lsassd.conf
  6. Open the file /etc/likewise/lsassd.conf in a text editor. For more information, see Editing configuration files in VMware ESXi and ESX (1017022).
  7. Locate the sync-system-time option, uncomment it, and set the value to no: sync-system-time = no
  8. Save the configuration file.
  9. On ESXi, save the configuration changes to the boot bank so they persist across reboots using the command:/sbin/auto-backup.sh
  10. Restart the ntpd and lsassd services for the configuration change to take effect using the commands: service lsassd restart
    service ntpd restart

    Note: To restart the ntpd and lsassd services on an ESXi host use these commands:

    /etc/init.d/lsassd restart
    /etc/init.d/ntpd restart

If the ntpd and lsassd services are not restarting, consider restarting the management agents first. For more information about restarting the management agent, see Restarting the Management agents on an ESX or ESXi Server (1003490).

Once the configuration changes are completed, ensure that the time is synchronized between the ESX/ESXi host and the Windows server. For more information, see Troubleshooting NTP on ESX and ESXi (1005092).

Updating ESX 4.x to a newer released update

Purpose

This article provides steps which may be useful when updating from ESX 4.x to a newer released update.

This article does not include steps for upgrading from ESX 3.5 and previous versions. If you are upgrading from ESX 3.x, see Upgrading to ESX 4.0 and vCenter 4.0 best practices (1009039).

Note: This article assumes that you have read the appropriate documentation for the update you are applying:

Note: These guides contain definitive information. If there is a discrepancy between the guide and this article, you should assume that the guide is correct.

Resolution

vSphere 4.x currently offers these applications for updating from ESX 4.x to a newer version:

  • vihostupdate Command-Line Utility—The vihostupdate command applies software updates to ESX 4.x/ESXi 4.x hosts and installs and updates ESX/ESXi extensions such as VMkernel modules, drivers, and CIM providers. For more information, see the vSphere Upgrade Guide.

Note: You cannot use the vSphere Host Update utility to upgrade ESX 4.x hosts. This utility is only for standalone ESX 3.x and ESXi hosts. A standalone host is an ESX host that is not managed by vCenter Server. For more information, see Cannot patch or upgrade ESX 4.0 hosts with vSphere Host Update Utility (1012467) and the vSphere Upgrade Guide.

  • esxupdate Command-Line Utility—The esxupdate command applies software updates to ESX 4.x hosts. For more information, see About the esxupdate Utility in the ESX 4 Patch Management Guide. Download the appropriate update package for your ESX Server 4.x and unzip the file to an ESX host directory that has sufficient space: you need room for both the ZIP file and its extracted contents. You can verify the amount of space needed by inspecting the update package with a third-party archive utility.

    To install using esxupdate:

  1. Log in to the ESX host service console as root using an SSH Client, or directly from the Service Console.
  2. Create a local depot directory on the ESX host, such as /tmp/upgrade or under a VMFS datastore. For example:
    • # mkdir -p /tmp/upgrade
    • # mkdir -p /vmfs/volumes/<Name of the datastore>/upgrade

      Note: Ensure that you have enough free space.
  3. Download the update ZIP file from the VMware Download Center.
  4. Copy the upgrade package to the directory you created using a program such as WinSCP or use the upload file option in browse datastore and then copy it to the directory.
  5. Change your working directory to the directory you created in Step 2.
  6. Place the host in maintenance mode and run this command: # esxupdate --bundle <upgrade.zip> update
  • VMware vCenter Update Manager—For ESX/ESXi hosts that are managed by vCenter Server.

To upgrade a host using Update Manager, you must first create a baseline to use for host remediation. The procedures required to create this baseline vary depending on the currently deployed revision of ESX/ESXi in your environment. For more information, see the VMware vCenter Update Manager Administration Guide and find the guide corresponding to the version currently in use. In addition, read the release notes for the ESX version you are attempting to deploy. For more information, see the VMware vSphere 4 Documentation.
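The esxupdate steps above can be sketched as one script. BUNDLE is a placeholder file name, and the vmware-vim-cmd call for maintenance mode is an assumption (you can also enter maintenance mode from the vSphere Client). Commands are only echoed by default; set DRYRUN=0 on the ESX host to execute them.

```shell
#!/bin/sh
# Sketch of the esxupdate procedure. BUNDLE is a placeholder; the
# maintenance-mode CLI call is an assumed equivalent of the GUI step.
DRYRUN=${DRYRUN:-1}
run() { if [ "$DRYRUN" = "1" ]; then echo "+ $*"; else "$@"; fi; }

DEPOT=/tmp/upgrade                   # local depot directory (step 2)
BUNDLE=ESX-4.x-update-bundle.zip     # placeholder: the ZIP from step 3

run mkdir -p "$DEPOT"
# copy the bundle into $DEPOT (WinSCP or the datastore browser) before continuing
run vmware-vim-cmd hostsvc/maintenance_mode_enter   # assumed CLI step
run esxupdate --bundle "$DEPOT/$BUNDLE" update
```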

Upgrading to ESX 4.0 and vCenter 4.0 best practices

Purpose

This article provides steps which may be useful when upgrading to ESX 4.0 and vCenter Server 4.0.

Note: This article assumes that you have read the vSphere Upgrade Guide. This upgrade guide contains definitive information. If there is a discrepancy between the guide and this article, assume that the guide is correct.

Note: ESX 4.0 Update 1 is available through Update Manager. For information on upgrading to Update 1, see Updating ESX 4.x to a newer released update (1016209).

Resolution

Note: Read the VMware vSphere 4.0 Release Notes for known installation issues.

On the vCenter Server

  1. Make sure your hardware requirements are compliant:
    • Processor – 2 CPUs 2.0GHz or higher Intel or AMD x86 processors. Processor requirements may be higher if the database runs on the same machine.
    • Memory – 3GB RAM. RAM requirements may be higher if your database runs on the same machine.
    • Disk storage – 2GB. Disk requirements may be higher if your database runs on the same machine.
    • Networking – 1Gbit recommended.
  2. Verify that your existing database is supported with vCenter Server 4.0. If it is not, upgrade your existing database to a supported type:
    • Supported Microsoft SQL Server Databases:
      • Microsoft SQL Server 2005 Express
        Note: Microsoft SQL Server 2005 Express is intended to be used for small deployments of up to 5 hosts and/or 50 virtual machines.
      • Microsoft SQL Server 2005 Standard edition (SP1, SP2, SP3)
      • Microsoft SQL Server 2005 Standard edition (SP2, SP3) 64bit
      • Microsoft SQL Server 2005 Enterprise edition (SP1, SP2, SP3)
      • Microsoft SQL Server 2005 Enterprise edition (SP2, SP3) 64bit
      • Microsoft SQL Server 2008 Standard Edition
      • Microsoft SQL Server 2008 Standard Edition 64bit
      • Microsoft SQL Server 2008 Enterprise Edition
      • Microsoft SQL Server 2008 Enterprise Edition 64bit
    • Oracle Database Support:
      • Oracle 10g Standard edition (Release 1 [10.1.0.3.0])
      • Oracle 10g Enterprise edition (Release 1 [10.1.0.3.0])
      • Oracle 10g Standard edition (Release 2 [10.2.0.1.0])
      • Oracle 10g Enterprise edition (Release 2 [10.2.0.1.0])
      • Oracle 10g Enterprise edition (Release 2 [10.2.0.1.0]) x64
      • Oracle 11g Standard edition
      • Oracle 11g Enterprise edition
  3. Make a full backup of the vCenter database.
  4. Make sure that you have the following permissions:
    • Microsoft SQL:
      • Grant the System DSN user of the vCenter Database db_owner privileges on the vCenter database.
      • Grant the System DSN user of the vCenter Database db_owner privileges on the MSDB database.
        Note: The db_owner privileges on the MSDB database are required for installation and upgrade only.
    • Oracle:
      • Grant dba permissions to the vCenter user.
  5. Ensure that your ODBC System DSN is using the proper driver. Microsoft SQL must use the SQL Native Client driver.
  6. Stop the vCenter service. This step is recommended, especially if the vCenter database is on a remote system. Click Start > Control Panel > Administrative Tools > Services > VMware VirtualCenter Server.
  7. Log in to your vCenter Server with a Local Administrator account on your Windows system to run the upgrade.
  8. Ensure no processes are running that conflict with the ports that vCenter uses.
  9. Configure new vSphere 4.0 licenses.

On the ESX Server

Note: When performing an upgrade on a host that has custom partitions created on ESX 3.x, ensure that the customized data on these partitions is backed up before performing the upgrade. For more information, see the Preparing for the Upgrade to ESX 4.0/ESXi 4.0 section of the vSphere Upgrade Guide.

  1. Make sure your hardware is listed in the Hardware Compatibility Guide. This includes:
    • System compatibility
    • I/O compatibility (Network and HBA cards)
    • Storage compatibility
    • Backup software compatibility
  2. Make sure your current ESX version is supported for upgrade:
    • Limited support for ESX 2.5.5. For more information, see the vSphere Upgrade Guide. Note: There is no upgrade support for ESX 2.5.4 and below.
    • Full upgrade support for ESX 3.0.1 and higher. Note: ESX 3.0.0 must be upgraded to ESX 3.0.1 or higher before upgrading to ESX 4.0.
  3. Server hardware for ESX 4.0 must be 64-bit compatible.
  4. Make sure Intel VT is enabled in the host's BIOS.
  5. If a SAN is connected to the ESX host, detach the fiber before continuing with the upgrade. Note: Do not disable HBA cards in the BIOS.
  6. Confirm that all virtual machines are migrated off or powered down on the ESX host.
  7. Ensure that there is sufficient disk space available on the ESX host for the upgrade. Note: VMware recommends selecting a datastore that is local to the ESX host for the service console. The service console VMDK requires a minimum of 8.4GB of available space. NFS and software iSCSI datastores are not supported as the destination for the ESX 4.0 service console VMDK. The service console must be installed on a VMFS datastore that resides on a host's local disk or on a SAN disk that is masked and zoned to that particular host only. The datastore cannot be shared between hosts. For information on determining the available disk space, see Investigating disk space on an ESX host (1003564).

Configuring networking from the ESX service console command line

Details

This article provides steps to configure networking for an ESX host when you only have access to the service console.

Solution

Note: ESX 4.0 Update 2 introduces a tool that simplifies the process of creating or restoring networking in the ESX service console. For more information, see Configuring or restoring networking from the ESX service console using console-setup (1022078).

To configure networking from the ESX service console command line:

  1. Ensure the network adapter you want to use is currently connected with the command: [root@server root]# esxcfg-nics -l

    The output appears similar to:

    Name PCI Driver Link Speed Duplex Description
    vmnic0 06:00.00 tg3 Up 1000Mbps Full Broadcom Corporation NetXtreme BCM5721 Gigabit Ethernet
    vmnic1 07:00.00 tg3 Up 1000Mbps Full Broadcom Corporation NetXtreme BCM5721 Gigabit Ethernet

    In the Link column, Up indicates that the network adapter is available and functioning.

  2. List the current virtual switches with the command: [root@server root]# esxcfg-vswitch -l

    The output appears similar to:

    Switch Name Num Ports Used Ports Configured Ports Uplinks
    vSwitch0 32 3 32 vmnic0

    PortGroup Name Internal ID VLAN ID Used Ports Uplinks
    VM Network portgroup2 0 0 vmnic0

    In the example output, there is a virtual machine network named VM Network but no Service Console portgroup. For illustration, the following steps show how to create a new virtual switch and place the Service Console portgroup on it.

  3. Create a new virtual switch with the command: [root@server root]# esxcfg-vswitch -a vSwitch1
  4. Create the Service Console portgroup on this new virtual switch: [root@server root]# esxcfg-vswitch -A "Service Console" vSwitch1

    Because there is a space in the name (Service Console), you must enclose it in quotation marks.

    Note: To create Service Consoles one at a time, you may need to delete all previous settings. For more information, see Recreating Service Console Networking from the command line (1000266).

  5. Up-link vmnic1 to the new virtual switch with the command: [root@server root]# esxcfg-vswitch -L vmnic1 vSwitch1
  6. If you need to assign a VLAN, use the command: [root@server root]# esxcfg-vswitch -v <VLAN> -p "Service Console" vSwitch1

    where <VLAN> is the VLAN number. A zero here specifies no VLAN.

  7. Verify the new virtual switch configuration with the command: [root@server root]# esxcfg-vswitch -l

    The output appears similar to:

    Switch Name Num Ports Used Ports Configured Ports Uplinks
    vSwitch0 32 3 32 vmnic0

    PortGroup Name Internal ID VLAN ID Used Ports Uplinks
    Service Console portgroup5 0 1 vmnic0

    Switch Name Num Ports Used Ports Configured Ports Uplinks
    vSwitch1 64 1 64 vmnic1

    PortGroup Name Internal ID VLAN ID Used Ports Uplinks
    Service Console portgroup14 0 1 vmnic1

  8. Create the vswif (Service Console) interface. For example, run the command: [root@server root]# esxcfg-vswif -a vswif0 -i 192.168.1.10 -n 255.255.255.0 -p "Service Console"
    ['Vnic' warning] Generated New Mac address, 00:50:xx:xx:xx:xx for vswif0

    Nothing to flush.

  9. Verify the configuration with the command: [root@esx]# esxcfg-vswif -l
    Name Port Group IP Address Netmask Broadcast Enabled DHCP
    vswif0 Service Console 192.168.1.10 255.255.255.0 192.168.1.255 true false
  10. Verify the networking configuration on the ESX host. See Verifying ESX host networking configuration on the service console (1003796).
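The steps above can be collected into one sketch, using the article's example addresses and names; substitute your own. Commands are only echoed by default; set DRYRUN=0 on the ESX service console to execute them.

```shell
#!/bin/sh
# Sketch of the service-console networking steps above, non-destructive by
# default: DRYRUN=1 prints each command instead of running it.
DRYRUN=${DRYRUN:-1}
run() { if [ "$DRYRUN" = "1" ]; then echo "+ $*"; else "$@"; fi; }

run esxcfg-nics -l                                 # confirm the NIC link is Up
run esxcfg-vswitch -a vSwitch1                     # create a new virtual switch
run esxcfg-vswitch -A "Service Console" vSwitch1   # portgroup (quoted: name has a space)
run esxcfg-vswitch -L vmnic1 vSwitch1              # uplink vmnic1 to the switch
run esxcfg-vswif -a vswif0 -i 192.168.1.10 -n 255.255.255.0 -p "Service Console"
run esxcfg-vswif -l                                # verify the vswif interface
```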

Access the ESXi Service Console

Here is a brief write-up on how to access the Service Console of VMware ESXi. As a disclaimer, this should only be done under the direct supervision of a VMware Support Engineer.

  1. From the ESXi console summary screen, press ALT-F1.
  2. Enter the word "unsupported" (without quotes).
  3. Enter the root password for your system.
  4. Be careful.

Now edit the inetd.conf file to enable remote SSH access to this console:

  1. Edit /etc/inetd.conf (vi /etc/inetd.conf).
  2. Remove the # sign in front of the SSH line.
  3. Kill and restart the inetd process:
    1. ps -ef | grep inetd
    2. kill -HUP <pid>
       # <pid> is the process ID, the first number displayed by ps -ef
  4. SSH into the IP of your ESXi server, using your root login/password.
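The inetd.conf edit above can also be done non-interactively instead of with vi. The sed pattern assumes the disabled line starts with "#ssh", and the awk column for the PID may differ under busybox ps; both are assumptions to verify. Commands are only echoed by default; set DRYRUN=0 in the Tech Support console to execute them.

```shell
#!/bin/sh
# Non-interactive sketch of the SSH-enable steps; dry-run by default.
DRYRUN=${DRYRUN:-1}
run() { if [ "$DRYRUN" = "1" ]; then echo "+ $*"; else "$@"; fi; }

run sed -i 's/^#ssh/ssh/' /etc/inetd.conf   # uncomment the ssh line (pattern assumed)
# signal inetd to reload its configuration; PID column may vary on busybox ps
run sh -c 'kill -HUP $(ps -ef | grep "[i]netd" | awk "{print \$1}")'
```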

Changing Service Console IP Address in ESX 3.5

Actually, this is not that difficult, but remember you will need console access to the server. Be sure to put the machine in Maintenance Mode, then disconnect it from VirtualCenter. Then connect to the console of the ESX host:

  1. First we need to remove the old IP; the easiest way is to delete the vswif interface:
    • esxcfg-vswif -d vswif0
      • replace vswif0 with the interface you’d like to remove
  2. Then we need to create a new vswif interface with our new IP address:
    • esxcfg-vswif -a vswif0 -p Service\ Console -i 10.1.1.1 -n 255.255.255.0 -b 10.1.1.255
      • replace vswif0 with the interface you’d like to use
      • replace Service\ Console with the name of your Service Console portgroup (this is the default)
      • -i reflects your new IP
      • -n reflects your subnet
      • -b reflects your broadcast
  3. Now we need to update our default gateway
    • This is a simple change to the /etc/sysconfig/network file
  4. One last thing you'll want to do after changing your gateway is reset the vswif interface; this ensures it is connected and picks up the new default gateway.
    • esxcfg-vswif -s vswif0 (this will disable the vswif0 interface)
    • esxcfg-vswif -e vswif0 (this will enable the vswif0 interface)
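The four steps above can be sketched as one script, using the example addresses from the text; the gateway value 10.1.1.254 is an assumed example, as is editing GATEWAY= in /etc/sysconfig/network with sed. Commands are only echoed by default; set DRYRUN=0 on the ESX console to execute them.

```shell
#!/bin/sh
# Sketch of changing the Service Console IP; dry-run by default.
DRYRUN=${DRYRUN:-1}
run() { if [ "$DRYRUN" = "1" ]; then echo "+ $*"; else "$@"; fi; }

run esxcfg-vswif -d vswif0          # 1. delete the old interface
run esxcfg-vswif -a vswif0 -p "Service Console" -i 10.1.1.1 -n 255.255.255.0 -b 10.1.1.255
run sed -i 's/^GATEWAY=.*/GATEWAY=10.1.1.254/' /etc/sysconfig/network   # 3. assumed gateway
run esxcfg-vswif -s vswif0          # 4. disable...
run esxcfg-vswif -e vswif0          #    ...and re-enable to pick up the new gateway
```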

Advanced Settings for VMware HA

After discussing the issue(s) with NFS.LockDisable and how those problems may be generated (i.e., VMware HA initiation), a few comments came in about VMware HA and how the default settings may not fit specific needs.

Well, here are some Advanced Settings that you can change from within your Cluster.

das.AllowNetwork By default, the Service Console portgroup is used for failure detection. By entering das.AllowNetwork you can specify an additional portgroup to use.

das.isolationAddress By default, VMware HA nodes check heartbeats of other nodes. When this heartbeat is lost, the suspected isolated node pings its Service Console gateway to check whether it is truly isolated. By using the das.isolationAddress option, you can add additional IP addresses for the server to check. These IPs must be on the Service Console portgroup, or in the portgroup you've added for das.AllowNetwork.

das.failureDetectionTime
This is the time required before VMware HA considers a host to be isolated. The default setting is 14 seconds; this can be changed to a setting that better fits your needs.

das.failureDetectionInterval
This is the rate of monitoring between VMware HA nodes; think of this as a heartbeat check.
The default setting is 1 second; if you feel that this is just too often, you can change it.

VMware HA events are captured in the vpxa.log file on each VMware ESX host. DasHostIsolatedEvent is the syntax left in the log file when an Isolation happens.

If you’re having problems with configuring VMware HA, you can check the /var/log/vmware/aam/aam_config_util_*.log files. These logs include all your necessary information for installing, configuring and connecting to other HA nodes.
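Two quick log checks follow from the paragraphs above; the vpxa.log path is assumed to be /var/log/vmware/vpx/vpxa.log on the ESX host. Commands are only echoed by default; set DRYRUN=0 on the host to run them.

```shell
#!/bin/sh
# Hedged sketch: grep for isolation events and list the HA config logs.
DRYRUN=${DRYRUN:-1}
run() { if [ "$DRYRUN" = "1" ]; then echo "+ $*"; else "$@"; fi; }

run grep DasHostIsolatedEvent /var/log/vmware/vpx/vpxa.log   # isolation events (path assumed)
run ls /var/log/vmware/aam/aam_config_util_*.log             # HA install/config logs
```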

If it comes down to it, you can remove the HA software from the ESX node by doing the following:

[root@dpcrc_vmdevsap1 aam]# rpm -qa |grep aam
VMware-aam-haa-2.2.0-1 <<< should be listed if installed
VMware-aam-vcint-2.2.0-1 <<< should be listed if installed
[root@dpcrc_vmdevsap1 aam]# rpm -e VMware-aam-haa-2.2.0-1 VMware-aam-vcint-2.2.0-1

This will do a clean uninstall of the VMware HA software; you can then Reconfigure for VMware HA from within VirtualCenter.

Also, Duncan Epping over at yellow-bricks.com has a complete listing of Advanced VMware HA settings, you can check those out at http://www.yellow-bricks.com/2008/10/06/update-ha-advanced-options-2/


vmware 2011 Mega Launch

It is 9am Pacific Time on Tuesday, July 12th 2011 and I sure hope you’re tuned into the vmware Mega Launch so greatly titled “Raising the Bar, Part V”. If you’re not watching the live broadcast, stop right here and tune into it by clicking this link, then come back and read this post.

Spoiler alert… reading beyond this point talks about amazing updates and new features from vmware!

This by far has to be the most exciting launch in the history of vmware: not only are we getting an update to the vSphere product suite that has hundreds if not thousands of enhancements and new features, we're also getting updates to other great products like vCloud Director, vShield and SRM.

In fact, there are so many changes and so many great new things to talk about that I can't do it all in one post. So I've decided to break these up into multiple posts, each with deep detail. I'll release these posts as quickly as I can write them, but until I have them completed I want to provide you with some of the great core details from this mega launch.

So first off, get ready for another new term from vmware: Cloud Infrastructure and Management. To sum it up, CIM basically includes vSphere (ESXi), vCenter, vShield and vCloud Director as a single package/methodology. These are all of the building blocks necessary to build a robust, elastic and efficient hybrid cloud. I have a feeling we're going to hear a lot about how vSphere 5, along with the other above-mentioned products, is the industry's best set of pieces for running a Cloud Infrastructure.

On a side tangent, there is so much discussion about the cloud you wouldn't believe it. On an almost daily basis I'm meeting with customers to discuss their "Cloud Strategy". Customers want Hybrid Cloud computing, and with these latest updates I think we're finally at a place where we truly can have application and data mobility: moving our workloads fluidly across our own data-centers in an automated, load-balanced fashion, from compute to now storage, as well as pushing them out to external hosting (cloud) providers for extreme elasticity and fault-tolerant (BC/DR) infrastructure.

Ok, so let's get started on all of these updates!

vSphere 5 (including ESXi 5.0)
First off, everyone should already know, but if you do not: there is no longer a Classic ESX with the traditional Service Console. vmware stated that version 4.1 would be the last release of the Classic ESX install, and now with version 5.0 there is only ESXi.

Performance – There have been a number of enhancements to the core vmware enterprise hypervisor; in this latest release we'll see huge performance improvements in the vmkernel as well as in Virtual Machine density. ESXi hosts can support up to 512 virtual machines on 160 logical CPUs with up to 2TB of RAM, while Virtual Machines can now scale to 32 vCPUs with 1000GB of Memory and have been tested to push 1,000,000 IOPs. What this basically means is there shouldn't be any performance-related reason why you cannot virtualize any workload. The most demanding workloads are being virtualized, such as Oracle RAC, Microsoft SQL, SAP and Exchange 2010.

Image Builder - this is a new utility built upon PowerCLI that allows you to create custom ESXi builds; it allows you to inject ESXi VIBs, driver VIBs and OEM VIBs to create an installable or PXE-bootable (I'll explain why shortly) ESXi image. If you're unaware of what a VIB is, it stands for vmware Infrastructure Bundle, and you can think of it almost as an RPM bundle.

Auto Deploy - Think UCS Service Profile but at the O/S level. There isn't any hardware abstraction for moving an existing ESXi image between different hardware, but with Auto Deploy you can quickly and easily create stateless ESXi servers with no disk dependency. To sum it up: you PXE boot your server, the ESXi image is loaded into host memory from the Auto Deploy server, its configuration is applied using an answer file as well as a host profile, and that host is then connected/placed into vCenter. Hose something? A simple reboot will give you a fresh ESXi image in a matter of minutes. Need to expand your cluster? Bring up another host and add it to the cluster within minutes.

vCenter Virtual Appliance (VCVA) – Whoo Hoo! Looks like that Tech Preview of vCenter Server on Linux finally hit GA! vmware has released with vSphere 5 a virtual appliance of vCenter Server that is based on Linux! This also includes a feature-rich browser-based vSphere Client completely built on Adobe Flex; this is not a replacement for the traditional installed vSphere Client, but it is a nice move forward in vSphere management. Ahhh, do you remember the MUI? :)

High Availability (HA) Completely Rewritten – Way too much to discuss here, but a complete rewrite of the core HA functionality has happened. HA now leverages multiple communication paths between agents (referred to as FDM or Fault Domain Manager), including network and storage (datastore). HA agents no longer use a Primary/Secondary methodology; during cluster creation a single Master is chosen and each remaining host is a Slave.

VMFS5 – Oh my! 64TB datastores anyone, with a single easy-to-use 1MB block size? You got it! Along with VAAI 2.0, which includes two new block primitives: Thin Provision Stun (finally!) and Space Reclaim. NFS also doesn't need to feel left out, because we now have Full Clone, Extended Stats and Space Reservation for NFS datastores. We also have a new API called VASA (vStorage APIs for Storage Awareness), which will provide a number of enhancements such as profile-driven storage (think EMC FAST-VP being integrated with vSphere). Quickly back to VAAI 2.0: Thin Provision Stun will protect your virtual machines if your datastore runs out of space, and Space Reclaim will use SCSI UNMAP instead of WRITE ZERO to remove space, allowing the array to release those blocks of data back to the free pool.

Storage DRS (SDRS) – What DRS does by load balancing Virtual Machines across hosts, SDRS does by performing Storage vMotion on VMDKs for better performance, capacity utilization, etc. This also includes initial placement, as well as allowing affinity-based rules for VMDKs. SDRS can monitor capacity utilization as well as I/O metrics (latency) and dynamically balance your VMDKs across multiple datastores.

Storage vMotion – Snapshot support!  As well as being able to move around Linked Clones. There has also been some core enhancements to make things faster and more consistent.

vSphere Storage Appliance (VSA) – It is what it sounds like, a virtual storage appliance that allows SMB customers to use local disk on the ESXi host presented out as an NFS datastore to the vSphere Cluster. There is replication technology behind it so if you do lose an ESXi host you will not lose data nor will you lose connectivity to your virtual machines. This is meant for up to 3 ESXi hosts and is really tailored for the SMB or ROBO market.

There is so much more in vSphere 5, but like I said I wanted to just give a brief overview at this time.

Site Recovery Manager 5
Host Based Replication – A new feature in SRM 5: SAN storage/replication is no longer required for SRM. You can now replicate your data host-based for disaster recovery scenarios in your virtual environment. Key takeaways: replication works between heterogeneous datastores and is managed as a property of the virtual machine. Powered-off VMs are not replicated, and non-critical data (logs, etc.) is not replicated. Physical RDMs are not supported. Snapshots work: the snapshot is replicated, but the VM is recovered with collapsed snapshots. Fault Tolerant VMs, Linked Clones and VM Templates are not supported.

Automated Failback – Replication is automatically reversed and with a single click you can failback your virtual machines from your disaster site to your production site. This is huge! You have no idea how much of a pain it is to failback a site with SRM, unless you’re using the EMC plug-in :)

Misc – Completely new interface, still within the vSphere Client as a plug-in but now you can manage it all from a single UI, no need to use two clients or a linked mode vCenter.

vCloud Director 1.5
Tons of new APIs arrive in vCloud Director 1.5, including vCloud Orchestration via a vCenter Orchestrator module. Support for Linked Clones is a huge leap forward: you can now deploy vApps in a matter of seconds with minimal storage consumption. Microsoft SQL is now supported as a back-end database, which will make standing up a vCD instance in your lab a lot easier because you won't need to worry about an Oracle database :). There is also support for federated multi-vClouds by linking vCD instances, as well as enhanced vShield integration, specifically around IPSec VPN.

Are you still awake? 1170+ words into this post and I’m still not complete….and this is just the brief overview! Whew!!  vmware you really outdid yourself!

vShield 5
vShield Edge – provides us with true multi-tenant site separation complete with VPN capabilities, DHCP, Stateful Firewall and now Static Routing within vShield Edge 5.0.

vShield App – gives us layer2/3 protection with VM-level enforcement now with group based policies found in vShield App 5.0 as well as enabling multiple trust zones on the same vSphere cluster. Layer 2 protection coupled with APIs enable automatic quarantining of compromised VMs.

vShield Data Security - a new member of the vShield family that allows you to monitor virtual machines continuously, completely transparently to the VM, for compliance standards such as PCI, PHI, PII and HIPAA, to name a few.

vShield Manager – Enterprise roles found in Manager 5.0 now provide the separation of duties required by some security and compliance standards.

So there you have it…. a brief 1706 word blog post covering just the high-level details of the vmware mega launch. Like I said earlier, I’m going to try to focus in on some deep-dive details on some of the major topics above. But until then, read up as much as you can on the vmware website and hopefully relatively soon the bits will be available for public consumption so you can get all of this great fresh new code in your lab!

Posted under Cloud, SRM, Security, Storage, VMware HA, vCenter, vSphere

This post was written by Rick Scherer on July 12, 2011

Tags: sdrs, site recovery manager, srm, storage drs, vaai, vasa, vcd, vcloud director, VMware, vsa, vshield, vsphere, vsphere 5

EMC VNX Replicator now supported by VMware SRM 4.x

Just received notice this morning that EMC VNX Replicator has been approved for support for VMware Site Recovery Manager 4.0.x and 4.1.x. An excerpt of the message is below:

VMware ESX 4.1 Patch ESX410-201010401-SG: Updates vmkernel64, VMX, CIM

Details

Release date: November 15, 2010

Patch Classification Security
Build For build information, see KB 1027027.
Host Reboot Required Yes
Virtual Machine Migration or Shutdown Required Yes
PRs Fixed 600953, 554166, 514442, 583503, 582904, 582445, 586758, 612450, 579077, 601775, 606217, and 581205
Affected Hardware N/A
Affected Software N/A
VIBs Included vmware-esx-apps, vmware-esx-backuptools, vmware-esx-cim, vmware-esx-esxcli, vmware-esx-iscsi, vmware-esx-lsi, vmware-esx-nmp, vmware-esx-perftools, vmware-esx-scripts, vmware-esx-srvrmgmt, vmware-esx-uwlibs, vmware-esx-vmkctl, vmware-esx-vmkernel64, vmware-esx-vmnixmod, vmware-esx-vmwauth, vmware-esx-vmx, vmware-hostd-esx, kernel, omc, and vmwprovider
Related CVE numbers CVE-2010-0415, CVE-2010-0307, CVE-2010-0291, CVE-2010-0622, CVE-2010-1087, CVE-2010-1437, and CVE-2010-1088

Solution

Summaries and Symptoms

This patch updates the service console kernel to fix multiple security issues. The Common Vulnerabilities and Exposures project (cve.mitre.org) has assigned the names CVE-2010-0415, CVE-2010-0307, CVE-2010-0291, CVE-2010-0622, CVE-2010-1087, CVE-2010-1437, and CVE-2010-1088 to these issues.

In addition, this patch fixes the following issues:

  • When a user who is a member of more than 32 groups attempts to log in to the service console of an ESX host by using KVM or SSH, any one of the following issues might occur:
    • ESX host restarts
    • ESX host becomes unresponsive
    • ESX host displays a purple screen

Note: With this patch, a user who is a member of more than 128 groups can access the console, but loses any group information beyond the 128th group.

  • The Health Status tab shows false alerts and zero readings for voltage and temperature sensors when you connect a vSphere Client to the vCenter Server that manages ESX hosts running on Unisys ES7000 Model 7600R Enterprise Server or NEC Express5800/A1040.
  • When a storage controller fails, an ESX software iSCSI initiator instance with default settings takes about 45 seconds to detect the problem and inform the ESX storage stack to initiate storage failover. For some array models, such as those that use LSI controllers, this problem results in storage failover taking more than 60 seconds to complete after the I/O is sent from the virtual machines. This can cause I/O errors to be reported by applications and the guest operating system running in the virtual machine. This patch resolves this issue.
    After installing this patch, you can configure the parameters Noop Interval and Noop Timeout by using either vCenter Server or the vmkiscsi-tool in the service console. These parameters enable you to reduce the timeout value based on the storage arrays, so that the software iSCSI initiator can detect changes in the path state faster and initiate storage failovers. The default values are 40 for Noop Interval and 10 for Noop Timeout.
  • Virtual machines are suspended and multiple storage alerts are raised on multiple ESX hosts resulting in an all-paths-down state. The VMkernel log file contains the following message:
    NMP: nmp_DeviceAttemptFailover: Retry world failover device. “naa.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx” – failed to issue command due to Not found (APD), try again…
    This issue occurs because ESX does not accept new paths that are exported from EMC Symmetrix Storage after an ESX host boot-up or an all-paths-down state.
  • The esxcfg-volume utility might fail to mount VMFS volumes with snapshots, and displays an error message similar to the following:
    Error: Unable to resignature this VMFS3 volume due to duplicate extents found
    If you dynamically add capacity from the storage device to the VMFS datastore and perform a VMFS rescan operation, the VMFS volumes on ESX hosts might not mount under /vmfs/volumes/ when you use the esxcfg-volume utility. This issue might occur due to dynamic expansion of snapshot volumes from storage having multiple extents. With this patch, ESX improves handling for multiple partitions or devices in VMFS volumes.
  • The Network window in the vSphere 4 esxtop tool incorrectly reports a large number of dropped receive-packets (%DRPRX) for virtual machines that are using the E1000 virtual NIC with multicast traffic.
  • After an ESX host is upgraded to ESX 4.1 or after ESX 4.1 is installed, you might experience the following symptoms:
    • vSphere Client might display incorrect values for the number of processor sockets and cores per socket available in the ESX system. For example, for an ESX system that has 2 processor sockets and 6 cores per socket, the vSphere Client might display that the ESX system has 4 processor sockets and 3 cores per socket.
    • Some ESX 4.1 systems might use double the number of licenses required.
    • Some of ESX/ESXi hosts might lose their license.
  • A potential race condition between the destroying slowpath agent function and the socket wakeup callback function might cause an ESX host to stop responding and display a purple screen.
  • If VMware Tools from ESX 3.5 or later is installed on Windows virtual machines that are configured with the automatic VMware Tools upgrade option, the automatic upgrade to ESX 4.1 VMware Tools on these virtual machines might fail with an error message similar to the following:
    Error upgrading VMware Tools.
  • If you start a Microsoft Windows Server 2003 32-bit virtual machine with the /3GB switch defined in the boot.ini file on VMware ESX 4.1, you might see the following symptoms:
    • Read or Write memory errors occur in the guest operating system.
    • A Remote Procedure Call (RPC) error is reported and the virtual machine is forced to reboot often.
    • A stop code of type 0x000000F4 occurs.
    • Microsoft .NET or Java applications might fail with memory errors.
    • The Microsoft Windows Event log might contain error messages similar to the following:
      Event Type: Error
      Event Source: .NET Runtime
      Event Category: None
      Description: .NET Runtime version 2.0.50727.3615 - Fatal Execution Engine Error (7A0979AE) (80131506)
  • For ESX running on BULL servers, vSphere Client displays the names of the Processor and Power sensors starting with 96 on the Configuration tab. For example, Processor 96 or PowerSupply 97.
    With this patch, the Processor and Power sensor names are displayed starting with Processor 0 or PowerSupply 1.

Deployment Considerations

The required patch bundles and reboot information are listed in the table above.

To configure the Noop Interval and Noop Timeout parameters in vCenter Server:

  1. Log in to the vCenter Server as administrator by using the vSphere Client.
  2. Select the Configuration tab.
  3. Click the Storage Adapters link in the Hardware panel.
  4. Select the vmhba for the iSCSI software adapter.
  5. Click the Properties link in the Details panel.
  6. Click Advanced in the iSCSI Initiator Properties window.
  7. Configure the required values for the Noop interval and Noop Timeout parameters.
    The default values are 40 for Noop Interval and 10 for Noop Timeout.

To configure the Noop Interval and Noop Timeout parameters by using the vmkiscsi-tool:

  1. Log in as root in the ESX service console.
  2. Run the following command to set the Noop interval:
    vmkiscsi-tool -W -a "noop_out_interval=15" vmhba<nn>
  3. Enter the following command to set the Noop Timeout:
    vmkiscsi-tool -W -a "noop_out_timeout=10" vmhba<nn>
  4. Enter the following command to view the updated values:
    vmkiscsi-tool -W -l vmhba<nn>
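As a sketch, the updated values can also be checked in a script by parsing the tool's output. The key=value lines below are an assumed sample format, not output captured from a real host:

```shell
# Parse Noop values from saved tool output. The two lines below are an
# assumed key=value format standing in for `vmkiscsi-tool -W -l vmhba<nn>`.
out="noop_out_interval=15
noop_out_timeout=10"
interval=$(echo "$out" | sed -n 's/^noop_out_interval=//p')
timeout=$(echo "$out" | sed -n 's/^noop_out_timeout=//p')
echo "interval=$interval timeout=$timeout"
```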


vMotion fails in ESX hosts with 10Gb uplinks

Symptoms

In ESX 4.1 hosts that use 10Gb uplinks for the vmkernel portgroup, you experience these symptoms:

  • vMotion fails when the host enters maintenance mode
  • vMotion fails when virtual machines are evacuated from the host
  • In the vmkernel log, you see entries similar to:
    Jun 16 11:27:45 mmcloudesx040 vmkernel: 52:12:11:45.651 cpu26:108033)WARNING: Heap: 2218: Heap tcpip already at its maximumSize. Cannot expand.
    Jun 16 11:27:45 mmcloudesx040 vmkernel: 52:12:11:45.651 cpu26:108033)WARNING: Heap: 2481: Heap_Align(tcpip, 256/256 bytes, 256 align) failed.  caller: 0x41801d066b0d
    Jun 16 11:27:45 mmcloudesx040 vmkernel: 52:12:11:45.651 cpu26:108033)WARNING: VMotionUtil: 1495: 1308237951758224 S: write function failed.
    Jun 16 11:27:45 mmcloudesx040 vmkernel: 52:12:11:45.652 cpu26:108033)WARNING: Migrate: 296: 1308237951758224 S: Failed: Out of memory (0xbad0014) @0x0

Cause

When 10Gb uplinks are used for the vmkernel portgroup for vMotion, the maximum concurrent vMotions allowed is eight. When many concurrent vMotion processes run simultaneously, the TCP/IP heap may run out of memory.

Resolution

This issue is resolved in VMware ESX 4.1 Update 2. To download ESX 4.1 Update 2, see the VMware Download Center.

If you are unable to update, try one of these options:

  • Set the DRS Automation Level to Partially Automated to prevent virtual machines from being automatically migrated to the host to be put in Maintenance Mode. Manually vMotion the virtual machines from the host (not more than four at a time) to other hosts in the cluster, put the host in the Maintenance Mode, and then set the DRS Automation Level back to Fully Automated.
  • Reduce the maximum number of concurrent vMotions allowed. To do this:
    1. Open the vpxd.cfg file using a text editor.
    2. Add the following entries within the <vpxd> and </vpxd> tags:

<ResourceManager>
<maxCostPerHost>16</maxCostPerHost>

</ResourceManager>

This allows only four simultaneous vMotions at a given time. If failures still occur, set the <maxCostPerHost> value to 12 (three simultaneous vMotions) or 8 (two simultaneous vMotions).
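The relationship between the setting and the resulting concurrency can be checked with a quick calculation. This sketch infers a per-vMotion cost of 4 from the value pairs above (16 → 4, 12 → 3, 8 → 2); the variable names are illustrative only:

```shell
# Per the values in this article, each vMotion costs 4 units, so
# maxCostPerHost / 4 = number of simultaneous vMotions allowed.
max_cost=16
vmotion_cost=4
concurrent=$((max_cost / vmotion_cost))
echo "maxCostPerHost=$max_cost allows $concurrent simultaneous vMotions"
```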

http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&docType=kc&externalId=2002354&sliceId=1&docTypeID=DT_KB_1_1&dialogID=250492976&stateId=0

After applying a host profile, starting or shutting down the ESX 4.1 host reports the error: /etc/sysconfig/network: line 1: .encoding: command not found

Symptoms

  • After applying a host profile to an ESX 4.1 host, the host reports this error while starting and shutting down:
    /etc/sysconfig/network: line 1: .encoding: command not found
  • The /etc/sysconfig/network configuration file contains the line:
    .encoding=UTF-8

Cause

This issue occurs because the configuration file /etc/sysconfig/network contains the entry .encoding=UTF-8.

The /etc/sysconfig/network configuration file contains key=value configuration for the ESX Service Console network and cannot understand the .encoding=UTF-8 line because it does not follow the key=value notation.

Resolution

This message does not impact the functioning of the host and can be safely ignored.

To stop this error from appearing during the host startup or shutdown:

  1. Connect to the ESX host using SSH. For more information, see Connecting to an ESX host using a SSH client (1019852).
  2. Open the /etc/sysconfig/network file using a text editor. For more information, see Editing configuration files in VMware ESX (1017022).
  3. Remove the .encoding=UTF-8 line.
  4. Save and close the file.
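The removal in step 3 can also be scripted. This is a sketch against a local sample copy rather than the live /etc/sysconfig/network file, and the non-.encoding entries in the sample are hypothetical:

```shell
# Build a sample copy of the file, then delete only the .encoding line,
# leaving the legitimate key=value entries untouched.
cfg=./network.sample
printf '.encoding=UTF-8\nHOSTNAME=esx01.example.com\nGATEWAY=192.168.0.254\n' > "$cfg"
sed -i '/^\.encoding=UTF-8$/d' "$cfg"
cat "$cfg"
```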

http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&docType=kc&externalId=1024805&sliceId=1&docTypeID=DT_KB_1_1&dialogID=250492976&stateId=0

Unable to upgrade host using esxupdate or import an ESX 4.1 upgrade bundle into Update Manager 4.1

Symptoms

  • You are unable to import/upload the bundle upgrade-from-ESX4.0-to-4.1.0-0.0.260247-release.zip into VMware Update Manager 4.1
  • You see the error:
    [2010-07-28 13:38:35:607 'HostUpdateDepotManager' 3652 ERROR] [metadataCache, 98] Metadata file contains invalid content: Error: Platforms changed for bulletin: ESX410-GA-esxupdate
    [2010-07-28 13:38:35:626 ‘ConfirmOfflinePatchTask.ConfirmOfflinePatchTask{12}’ 3652 ERROR] [confirmOfflinePatchTask, 203] confirm Offline Patch failed with exception Error: Platforms changed for bulletin: ESX410-GA-esxupdate
    [2010-07-28 13:38:35:632 ‘ConfirmOfflinePatchTask.ConfirmOfflinePatchTask{12}’ 3652 INFO] [vciTaskBase, 1278] SerializeToVimFault fault:
    (integrity.fault.FileUploadInvalidPackage) {
  • Upgrading an ESX 4.0.x host to ESX 4.1 via the esxupdate command as described in the vSphere Upgrade Guide may return the error:
    [root@host]# esxupdate update --bundle=upgrade-from-ESX4.0-to-4.1.0-0.0.260247-release.zip
    Encountered error MetadataFormatError:
    The error data is:
    Filename – None
    Message – Duplicate definitions of bulletin ESX410-GA-esxupdate with
    unequal attributes.
    Errno – 5
    Description – The format of the metadata is invalid.
  • You see the error:Failed to import data. Unable to persist upgrade package. Check if the package is already imported

Resolution

Note: If you have downloaded the Pre-Upgrade bundle after August 13, 2010, you should not encounter this issue. As of this date, the Pre-Upgrade package has been rebundled to prevent this issue.

Update Manager

This issue can be caused by using the pre-upgrade-from-ESX4.0-to-4.1.0-0.0.260247-release.zip bundle and uploading/importing it into VMware Update Manager 4.1. The improper application of the pre-upgrade bundle causes a conflict between the ESX410-GA-esxupdate bulletin in the pre-upgrade bundle and a similar bulletin in the upgrade bundle.

To avoid encountering this issue, do not import/upload the pre-upgrade bundle into VMware Update Manager 4.1. VMware Update Manager 4.1 does not require this bundle for upgrade operations.

If you are currently experiencing this issue, contact VMware Technical Support for assistance. Reference this article in your Support Request. For more information, see How to Submit a Support Request.

Esxupdate

The issue in esxupdate can be caused by applying the pre-upgrade-from-ESX4.0-to-4.1.0-0.0.260247-release.zip twice prior to installing the upgrade bundle. The improper application of the pre-upgrade bundle causes a conflict between the ESX410-GA-esxupdate bulletin in the pre-upgrade bundle and a similar bulletin in the upgrade bundle.

To avoid encountering this issue when using esxupdate, apply the pre-upgrade package only once, prior to the upgrade bundle.

Alternatively, do not apply the pre-upgrade package at all. Instead, apply the esxupdate bulletin that is included in the upgrade-from-ESX4.0-to-4.1.0-0.0.260247-release.zip bundle.

For example:

  1. Run # esxupdate --bundle <upgrade.zip> -b ESX410-GA-esxupdate update
  2. Run # esxupdate --bundle <upgrade.zip> update

If you are currently experiencing this issue, remove or rename the /etc/vmware/esxupdate/bulletins.zip file on the ESX host:

  1. Run cd /etc/vmware/esxupdate.
  2. Run mv bulletins.zip bulletins.old.
  3. Apply the upgrade bundle by running:

esxupdate --bundle <upgrade.zip> update

http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&docType=kc&externalId=1026094&sliceId=1&docTypeID=DT_KB_1_1&dialogID=250492976&stateId=0

Enabling NetQueue on Intel Gigabit Network Devices Using the igb Driver in ESX 4.x

Details

The asynchronous version of the ESX 4.x igb driver uses VMware’s NetQueue technology to enable Intel Virtual Machine Device Queues (VMDq) support for Ethernet devices based on the Intel® 82576 and 82580 Gigabit Ethernet Controllers. VMDq is optional and disabled by default.

Solution

Enabling VMDq

To enable VMDq:

  1. Ensure the correct version of the driver is installed and enabled to load automatically at boot:
     # esxcfg-module -e igb

    Note: Currently this is supported on Intel’s ESX and ESXi 4.x drivers (igb version 400.2.4.10 and higher) for 82576 and 82580 Gigabit Ethernet Controllers.

  2. For each port, set the optional VMDq load parameters for the igb module.
    • Configure IntMode=2. Setting a value of 2 for this option specifies using MSI-X, which enables the Ethernet controller to direct interrupt messages to multiple processor cores. MSI-X must be enabled in order for NetQueue to work with VMDq.
    • Set the VMDQ parameter to indicate the number of transmit and receive queues. The parameter value ranges from 1 to 8, because Intel 82576 and 82580 based network devices provide a maximum of 8 transmit queues and 8 receive queues per port. The value sets both the transmit and receive queues to the same number.
      For a quad-port adapter, the following configuration turns on VMDq in full on all four ports:

      # esxcfg-module -s "IntMode=2,2,2,2 VMDQ=8,8,8,8" igb

      The VMDq configuration is flexible. Systems with multiple ports are enabled and configured by comma-separated lists. The values are applied to the ports in the order they are enumerated on the PCI bus.

      For example:

      # esxcfg-module -s IntMode=0,0,2,2, … ,2,2 VMDQ=1,1,8,8, … ,4,4 igb

      Shows:

      • The values configured for ports 1 and 2 are: IntMode=0 and VMDQ=1
      • The values configured for ports 3 and 4 are: IntMode=2 and VMDQ=8
      • The values configured for the last two ports are: IntMode=2 and VMDQ=4
  3. Reboot the ESX host.
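The per-port option strings above are repetitive, so they can be generated rather than typed by hand. This sketch assumes every port gets the same IntMode=2 and VMDQ=8 values as in the quad-port example; the helper variables are illustrative only:

```shell
# Build the comma-separated option string for a given port count.
# Assumption: all ports use IntMode=2 and VMDQ=8, as in the example above.
ports=4
intmode=$(printf '2,%.0s' $(seq 1 "$ports")); intmode=${intmode%,}
vmdq=$(printf '8,%.0s' $(seq 1 "$ports")); vmdq=${vmdq%,}
opts="IntMode=$intmode VMDQ=$vmdq"
echo "$opts"
```

The resulting string is what would be passed to `esxcfg-module -s "..." igb`.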

Limitation Notes:

  • With standard sized Ethernet packets (MTU = 1500 or less), the maximum number of ports supported in VMDq mode is 8, with each port using 8 transmit and receive queues.
  • When using Jumbo Frames (MTU between 1500 and 9000) and VMDq, the maximum number of supported ports is 4, and the number of transmit and receive queues per port must be reduced to 4.

Verifying that VMDq is enabled

To verify that VMDq is enabled:

  1. Check the options configured for the igb module:
     # esxcfg-module -g igb
  2. The output should appear similar to:
     igb enabled = 1 options = 'IntMode=2,2,2,2,2,2,2,2 VMDQ=8,8,8,8,8,8,8,8'

The enabled value must equal 1, which indicates the igb module will load automatically. IntMode and VMDQ must be set for each port. The example above shows a configuration with 8 ports, where all interfaces are configured in full VMDq mode.
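The enabled flag can be checked in a script as well. This sketch parses the sample output line from the article (shortened to four ports here for brevity) rather than querying a live host:

```shell
# Check the enabled flag in a sample `esxcfg-module -g igb` output line.
line="igb enabled = 1 options = 'IntMode=2,2,2,2 VMDQ=8,8,8,8'"
enabled=$(echo "$line" | sed -n 's/.*enabled = \([0-9]\).*/\1/p')
echo "igb autoload enabled: $enabled"
```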

  3. Determine which ports use the igb driver using esxcfg-nics. Confirm the driver successfully claimed all supported devices present in the system (enumerate them using lspci and compare the list with the output of esxcfg-nics -l). Query the statistics on each interface using ethtool. If VMDq has been enabled successfully, statistics for multiple transmit and receive queues are shown (see tx_queue_0 through tx_queue_7 and rx_queue_0 through rx_queue_7 in the example below).
     # esxcfg-nics -l

    Name PCI Driver Link Speed Duplex MAC Address MTU Description
    vmnic0 04:00.00 bnx2 Up 1000Mbps Full 00:15:c5:f2:34:b0 1500 Broadcom Corporation Broadcom NetXtreme II BCM5708 1000Base-T
    vmnic1 08:00.00 bnx2 Down 0Mbps Half 00:15:c5:f2:34:b2 1500 Broadcom Corporation Broadcom NetXtreme II BCM5708 1000Base-T
    vmnic2 0d:00.00 igb Up 1000Mbps Full 00:1b:21:4d:10:98 1500 Intel Corporation 82576 Gigabit Network Connection
    vmnic3 0d:00.01 igb Up 1000Mbps Full 00:1b:21:4d:10:99 1500 Intel Corporation 82576 Gigabit Network Connection
    vmnic4 0e:00.00 igb Up 1000Mbps Full 00:1b:21:4d:10:9c 1500 Intel Corporation 82576 Gigabit Network Connection
    vmnic5 0e:00.01 igb Up 1000Mbps Full 00:1b:21:4d:10:9d 1500 Intel Corporation 82576 Gigabit Network Connection
    vmnic6 10:00.00 igb Up 1000Mbps Full 00:1b:21:56:2a:10 1500 Intel Corporation 82580 Gigabit Network Connection
    vmnic7 10:00.01 igb Up 1000Mbps Full 00:1b:21:56:2a:11 1500 Intel Corporation 82580 Gigabit Network Connection
    vmnic8 10:00.02 igb Up 1000Mbps Full 00:1b:21:56:2a:12 1500 Intel Corporation 82580 Gigabit Network Connection
    vmnic9 10:00.03 igb Up 1000Mbps Full 00:1b:21:56:2a:13 1500 Intel Corporation 82580 Gigabit Network Connection

    # lspci | grep -e 82576 -e 82580

    0d:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
    0d:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
    0e:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
    0e:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
    10:00.0 Ethernet controller: Intel Corporation 82580 Gigabit Network Connection (rev 01)
    10:00.1 Ethernet controller: Intel Corporation 82580 Gigabit Network Connection (rev 01)
    10:00.2 Ethernet controller: Intel Corporation 82580 Gigabit Network Connection (rev 01)
    10:00.3 Ethernet controller: Intel Corporation 82580 Gigabit Network Connection (rev 01)

    # ethtool -S vmnic6

    NIC statistics:
    rx_packets: 0
    tx_packets: 0
    rx_bytes: 0
    tx_bytes: 0
    rx_broadcast: 0
    tx_broadcast: 0
    rx_multicast: 0
    tx_multicast: 0
    multicast: 0
    collisions: 0
    rx_crc_errors: 0
    rx_no_buffer_count: 0
    rx_missed_errors: 0
    rx_aborted_errors: 0
    tx_carrier_errors: 0
    tx_window_errors: 0
    tx_abort_late_coll: 0
    tx_deferred_ok: 0
    tx_single_coll_ok: 0
    tx_multi_coll_ok: 0
    tx_timeout_count: 0
    rx_long_length_errors: 0
    rx_short_length_errors: 0
    rx_align_errors: 0
    tx_tcp_seg_good: 0
    tx_tcp_seg_failed: 0
    rx_flow_control_xon: 0
    rx_flow_control_xoff: 0
    tx_flow_control_xon: 0
    tx_flow_control_xoff: 0
    rx_long_byte_count: 0
    tx_dma_out_of_sync: 0
    tx_smbus: 0
    rx_smbus: 0
    dropped_smbus: 0
    rx_errors: 0
    tx_errors: 0
    tx_dropped: 0
    rx_length_errors: 0
    rx_over_errors: 0
    rx_frame_errors: 0
    rx_fifo_errors: 0
    tx_fifo_errors: 0
    tx_heartbeat_errors: 0
    tx_queue_0_packets: 0
    tx_queue_0_bytes: 0
    tx_queue_0_restart: 0
    tx_queue_1_packets: 0
    tx_queue_1_bytes: 0
    tx_queue_1_restart: 0
    tx_queue_2_packets: 0
    tx_queue_2_bytes: 0
    tx_queue_2_restart: 0
    tx_queue_3_packets: 0
    tx_queue_3_bytes: 0
    tx_queue_3_restart: 0
    tx_queue_4_packets: 0
    tx_queue_4_bytes: 0
    tx_queue_4_restart: 0
    tx_queue_5_packets: 0
    tx_queue_5_bytes: 0
    tx_queue_5_restart: 0
    tx_queue_6_packets: 0
    tx_queue_6_bytes: 0
    tx_queue_6_restart: 0
    tx_queue_7_packets: 0
    tx_queue_7_bytes: 0
    tx_queue_7_restart: 0
    rx_queue_0_packets: 0
    rx_queue_0_bytes: 0
    rx_queue_0_drops: 0
    rx_queue_0_csum_err: 0
    rx_queue_0_alloc_failed: 0
    rx_queue_1_packets: 0
    rx_queue_1_bytes: 0
    rx_queue_1_drops: 0
    rx_queue_1_csum_err: 0
    rx_queue_1_alloc_failed: 0
    rx_queue_2_packets: 0
    rx_queue_2_bytes: 0
    rx_queue_2_drops: 0
    rx_queue_2_csum_err: 0
    rx_queue_2_alloc_failed: 0
    rx_queue_3_packets: 0
    rx_queue_3_bytes: 0
    rx_queue_3_drops: 0
    rx_queue_3_csum_err: 0
    rx_queue_3_alloc_failed: 0
    rx_queue_4_packets: 0
    rx_queue_4_bytes: 0
    rx_queue_4_drops: 0
    rx_queue_4_csum_err: 0
    rx_queue_4_alloc_failed: 0
    rx_queue_5_packets: 0
    rx_queue_5_bytes: 0
    rx_queue_5_drops: 0
    rx_queue_5_csum_err: 0
    rx_queue_5_alloc_failed: 0
    rx_queue_6_packets: 0
    rx_queue_6_bytes: 0
    rx_queue_6_drops: 0
    rx_queue_6_csum_err: 0
    rx_queue_6_alloc_failed: 0
    rx_queue_7_packets: 0
    rx_queue_7_bytes: 0
    rx_queue_7_drops: 0
    rx_queue_7_csum_err: 0
    rx_queue_7_alloc_failed: 0
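Rather than eyeballing the long statistics listing, the per-queue lines can be counted from saved `ethtool -S` output. The sample file in this sketch contains only two queues in each direction for brevity; on a host with VMDq fully enabled you would expect counts of 8 and 8:

```shell
# Count per-queue statistics lines in saved `ethtool -S` output.
stats=./ethtool.sample
printf 'tx_queue_0_packets: 0\ntx_queue_1_packets: 0\nrx_queue_0_packets: 0\nrx_queue_1_packets: 0\n' > "$stats"
txq=$(grep -c '^tx_queue_[0-9]*_packets' "$stats")
rxq=$(grep -c '^rx_queue_[0-9]*_packets' "$stats")
echo "$txq tx queues, $rxq rx queues"
```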

Disabling VMDq

To disable VMDq:

  1. Return the igb driver to default (non-VMDq) mode by erasing the optional VMDq load parameters:
     # esxcfg-module -s "" igb
  2. Reboot the ESX host.

http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&docType=kc&externalId=1032909&sliceId=1&docTypeID=DT_KB_1_1&dialogID=250492976&stateId=0

esxcfg-scsidevs -a command does not locate HBA cards that support adp94xx and aic79xx drivers

Details

After an ESX host starts, you cannot locate the HBA cards that support the adp94xx and aic79xx drivers by running the esxcfg-scsidevs -a command from the service console of the ESX host. The issue occurs only when you enable Host RAID in the BIOS of HBAs that support the adp94xx or aic79xx driver.

Solution

To work around this issue, perform these steps to locate the HBA cards that support aic79xx or adp94xx drivers:

  1. Press Ctrl+A when the ESX host starts.
  2. Disable the Host RAID support in the HBA card BIOS.
  3. Restart the ESX host.

You can also locate the HBA cards by running the lspci command from the service console.

http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&docType=kc&externalId=1023976&sliceId=1&docTypeID=DT_KB_1_1&dialogID=250492976&stateId=0

Accessing USB storage and other USB devices from the service console

Details

The technology that supports USB device passthrough from an ESX/ESXi host to a virtual machine does not support simultaneous USB device connections from USB passthrough and from the service console.

Solution

To access USB storage and other devices from the console, disable USB passthrough.

  1. Start the ESX host and enter the following command at the Linux service console command line:
     chkconfig usbarbitrator off
  2. Reboot the host.

The devices are available to the service console.

Note: The service console is typically used only when you contact a VMware technical support representative. In earlier versions of ESX, the service console was one of the interfaces to ESX hosts. Many of the service console commands are now deprecated.

The vSphere SDK is used for scripted manipulation of vSphere. The vSphere Client is the primary interface to all nonscripted activities, including configuring, monitoring, and managing virtual machines and resources.

VMware ESX 4.1 Patch ESX410-201011401-BG: Updates vmware-esx-drivers-net-bnx2x

Details

Release date: November 29, 2010

Patch Classification Critical
Build For build information, see KB 1029400.
Host Reboot Required Yes
Virtual Machine Migration or Shutdown Required Yes
PRs Fixed 593607
Affected Hardware N/A
Affected Software N/A
VIBs Included vmware-esx-drivers-net-bnx2x
Related CVE numbers N/A

Solution

Summaries and Symptoms

This patch fixes an issue where, if you are using VMware ESX 4.1 with the Broadcom bnx2x driver (in-box driver version 1.54.1.v41.1-1vmw), you might see any of the following symptoms:

  • The ESX host might frequently disconnect from the network.
  • The ESX host might stop responding with a purple diagnostic screen that displays messages similar to the following:
    0:18:56:51.183 cpu10:4106)0x417f80057838:[0x4180016e7793]PktContainerGetPkt@vmkernel:nover+0xde stack: 0x1
    0:18:56:51.184 cpu10:4106)0x417f80057868:[0x4180016e78d2]Pkt_SlabAlloc@vmkernel:nover+0x81 stack: 0x417f800578d8
    0:18:56:51.184 cpu10:4106)0x417f80057888:[0x4180016e7acc]Pkt_AllocWithUseSizeNFlags@vmkernel:nover+0x17 stack: 0x417f800578b8
    0:18:56:51.185 cpu10:4106)0x417f800578b8:[0x41800175aa9d]vmk_PktAllocWithFlags@vmkernel:nover+0x6c stack: 0x1
    0:18:56:51.185 cpu10:4106)0x417f800578f8:[0x418001a63e45]vmklnx_dev_alloc_skb@esx:nover+0x9c stack: 0x4100aea1e988
    0:18:56:51.185 cpu10:4106)0x417f80057918:[0x418001a423da]__netdev_alloc_skb@esx:nover+0x1d stack: 0x417f800579a8
    0:18:56:51.186 cpu10:4106)0x417f80057b08:[0x418001b6c0cf]bnx2x_rx_int@esx:nover+0xf5e stack: 0x0
    0:18:56:51.186 cpu10:4106)0x417f80057b48:[0x418001b7e880]bnx2x_poll@esx:nover+0x1cf stack: 0x417f80057c64
    0:18:56:51.187 cpu10:4106)0x417f80057bc8:[0x418001a6513a]napi_poll@esx:nover+0x10d stack: 0x417fc1f0d078
  • The bnx2x driver or firmware generates panic messages and dumps a backtrace with the following messages in the /var/log/vmkernel log file:
    Jul 27 23:41:42 vmkernel: 0:00:34:23.762 cpu8:4401)<3>[ bnx2x_attn_int_deasserted3:3379(vmnic0)]MC assert!
    Jul 27 23:41:42 vmkernel: 0:00:34:23.762 cpu8:4401)<3>[bnx2x_attn_int_deasserted3:3384(vmnic0)]driver assert
    Jul 27 23:41:42 vmkernel: 0:00:34:23.762 cpu8:4401)<3>[ bnx2x_panic_dump:634(vmnic0)]begin crash dump

Deployment Considerations

None beyond the required patch bundles and reboot information listed in the table above.

Patch Download and Installation

See the VMware vCenter Update Manager Administration Guide for instructions on using Update Manager to download and install patches to automatically update ESX 4.1 hosts.

To update ESX 4.1 hosts without using Update Manager, download the patch ZIP file from http://support.vmware.com/selfsupport/download/ and install the bulletin by using esxupdate from the command line of the host. For more information, see the ESX 4.1 Patch Management Guide.

Driver CDs for ESX 4.0 that have not been re-released for ESX 4.1 fail to install

Symptoms

  • You are attempting to install drivers onto your ESX 4.1 machine from an installation media intended for ESX 4.0
  • You see the error:
    No bulletins for this platform could be found. Nothing to do. errno 13.

Resolution

If updated driver bundles are not available for download, the driver bundle needs to be slightly modified in order to allow installation.

Follow these steps to modify the bundle:

  1. Extract the file vmware.xml from the metadata.zip file within the bundle.
  2. Inside the .xml file there are sections that look similar to:
     <platforms>
     <softwarePlatform locale="" productLineID="esx" version="4.*" />
     <softwarePlatform locale="" productLineID="embeddedEsx" version="4.*" />
     </platforms>

    and

    <systemReqs>
    <swPlatform locale="" productLineID="esx" version="4.0.0" />
    <swPlatform locale="" productLineID="embeddedEsx" version="4.0.0" />
    <maintenanceMode>true</maintenanceMode>
    </systemReqs>

  3. Change the <systemReqs> section version values from 4.0.0 to 4.* so that the lines look like:
     <swPlatform locale="" productLineID="esx" version="4.*" />
     <swPlatform locale="" productLineID="embeddedEsx" version="4.*" />
  4. Save the file and use it to replace the existing file within the driver bundle.
  5. Attempt to install the bundle again.
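The edit in step 3 is mechanical, so it can be done with sed on an extracted copy of vmware.xml. This sketch operates on a one-line sample fragment copied from the article (the real file contains much more content, and the filename here is a stand-in):

```shell
# Rewrite the version attributes in a sample copy of vmware.xml.
xml=./vmware.sample.xml
printf '<swPlatform locale="" productLineID="esx" version="4.0.0" />\n' > "$xml"
sed -i 's/version="4\.0\.0"/version="4.*"/' "$xml"
cat "$xml"
```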

Load Based Teaming in vSphere 4.1

Purpose

This article provides information about the new Load Based Teaming policy in vSphere 4.1.

Resolution

Note: The Load Based Teaming policy is available only with a distributed virtual switch (vDS). It is not available with standard switches.

In vSphere 4.0, there are three policies:

  • Route based on the originating virtual Port ID
  • Route based on IP hash
  • Route based on source MAC hash

These three policies provide a static mapping from vSwitch port to pNIC adapter. It is possible for two virtual machines with heavy network load to be mapped to the same physical adapter, which becomes congested while the other adapters still have free bandwidth.

With Load Based Teaming in vSphere 4.1, after initial port based assignment, the load balancing algorithm regularly checks the load of all the teaming NICs. If one gets overloaded while another one has bandwidth available, it reassigns the port-NIC mapping to reach a balanced status. During the period until the next check is performed, the mapping is stable.

Note: Bandwidth is still limited to the maximum bandwidth a single pNIC provides.

Load Based Teaming has these advantages:

  • Dynamic adjustments to load
  • Different NIC speeds are taken into account. You can have a mix of 1Gbit, 10Gbit, and even 100Mbit NICs.

Note: If Load Based Teaming reassigns ports, the MAC address moves to a different pSwitch port. The pSwitch must allow for this.

Additional Information

For more information on virtual networking, see VMware Virtual Networking Concepts.

Tags

load-based-teaming

Request a Product Feature

To request a new product feature or to provide feedback on a VMware product, please visit the Request a Product Feature page.

http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&docType=kc&externalId=1022030&sliceId=1&docTypeID=DT_KB_1_1&dialogID=250492976&stateId=0

ALUA parameters in the output of ESX/ESXi 4.1 commands

Purpose

When the storage array of an ESX/ESXi 4.1 host supports Asymmetric Logical Unit Access (ALUA), the output of storage commands has some new parameters. This article provides information about these new parameters.

Resolution

About ALUA

ALUA devices can operate in two modes: implicit and/or explicit.

Explicit ALUA devices allow the host to use the Set Target Port Group task management command to set the Target Port Group’s state. In implicit ALUA, a device’s Target Port Group states are managed by the target device itself.

New parameters

If you run the command # esxcli nmp device list -d naa.60060160455025000aa724285e1ddf11, the output appears similar to:

naa.60060160455025000aa724285e1ddf11
Device Display Name: DGC Fibre Channel Disk (naa.60060160455025000aa724285e1ddf11)
Storage Array Type: VMW_SATP_ALUA_CX
Storage Array Type Device Config: {navireg=on, ipfilter=on}{implicit_support=on;explicit_support=on; explicit_allow=on;alua_followover=on;{TPG_id=1,TPG_state=AO}{TPG_id=2,TPG_state=ANO}}
Path Selection Policy: VMW_PSP_FIXED_AP
Path Selection Policy Device Config: {preferred=vmhba1:C0:T0:L0;current=vmhba1:C0:T0:L0}
Working Paths: vmhba1:C0:T0:L0

The output may contain these new device configuration parameters:

  • implicit_support=on: This parameter shows whether or not the device supports implicit ALUA. You cannot set this option, as it is a property of the LUN.
  • explicit_support: This parameter shows whether or not the device supports explicit ALUA. You cannot set this option, as it is a property of the LUN.
  • explicit_allow: This parameter shows whether or not the user allows the SATP to exercise its explicit ALUA capability if the need arises during path failure. This only matters if the device actually supports explicit ALUA (that is, explicit_support is on). This option is turned on using the esxcli command enable_explicit_alua and turned off using the esxcli command disable_explicit_alua.
  • alua_followover: This parameter shows whether or not the user allows the SATP to exercise the follow-over policy, which prevents path thrashing in multi-host setups. This option is turned on using the esxcli command enable_alua_followover and turned off using the esxcli command disable_alua_followover.
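The TPG states embedded in the device-config string can also be extracted in a script. This sketch works on the TPG portion of the sample `esxcli nmp device list` output shown earlier, copied into a variable rather than queried from a host:

```shell
# Extract TPG states from a device-config fragment (copied from the
# sample output above). AO = Active/Optimized, ANO = Active/Non-Optimized.
cfg='{TPG_id=1,TPG_state=AO}{TPG_id=2,TPG_state=ANO}'
states=$(echo "$cfg" | grep -o 'TPG_state=[A-Z]*' | sed 's/TPG_state=//' | tr '\n' ' ')
echo "TPG states: $states"
```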

If you run the command # esxcli nmp path list -d naa.60060160455025000aa724285e1ddf11, the output appears similar to:

fc.20000000c987f8c5:10000000c987f8c5-fc.50060160bce0383c:5006016e3ce0383c-naa.60060160455025000aa724285e1ddf11
Runtime Name: vmhba2:C0:T1:L0
Device: naa.60060160455025000aa724285e1ddf11
Device Display Name: DGC Fibre Channel Disk (naa.60060160455025000aa724285e1ddf11)
Group State: active unoptimized
Array Priority: 0
Storage Array Type Path Config: {TPG_id=2,TPG_state=ANO,RTP_id=18,RTP_health=UP}
Path Selection Policy Path Config: {current: no; preferred: no}

fc.20000000c987f8c5:10000000c987f8c5-fc.50060160bce0383c:500601663ce0383c-naa.60060160455025000aa724285e1ddf11
Runtime Name: vmhba2:C0:T0:L0
Device: naa.60060160455025000aa724285e1ddf11
Device Display Name: DGC Fibre Channel Disk (naa.60060160455025000aa724285e1ddf11)
Group State: active
Array Priority: 1
Storage Array Type Path Config: {TPG_id=1,TPG_state=AO,RTP_id=7,RTP_health=UP}
Path Selection Policy Path Config: {current: no; preferred: no}

In the output, TPG_state = ANO means Active/Non-Optimized and TPG_state = AO means Active/Optimized.

Notes:

  • Target Port Groups (TPG) allow path grouping and dynamic load balancing. Each port in the same TPG has the same port state, which can be one of these:
    • Active/Optimized
    • Active/Non-Optimized
    • Standby
    • Unavailable
    • In-transition
  • Active indicates that the LUN is active on a TPG. If the LUN is not active, a failover would have to be requested to the storage processor to make it active.
  • Optimized indicates the path with the best performance. This is reported by the storage processor as per its current internal organization of paths/cache.

Request a Product Feature

To request a new product feature or to provide feedback on a VMware product, please visit the Request a Product Feature page.

Feedback

http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&docType=kc&externalId=1015437&sliceId=1&docTypeID=DT_KB_1_1&dialogID=250492976&stateId=0

Troubleshooting network issues with the Cisco show tech-support command

Purpose

This article provides information about troubleshooting network issues using the Cisco show tech-support command.

Resolution

If you experience networking issues between vSwitch and physical switched environment, you can obtain information about the configuration of a Cisco router or switch by running the show tech-support command in privileged EXEC mode.

Note: This command does not alter the configuration of the router.

You can copy the collected information and submit to VMware support to aid in diagnosing networking issues. For more information, see How to Submit a Support Request.

Additional Information

For more information about the Cisco show tech-support command, see show tech-support in the Cisco IOS Configuration Fundamentals and Network Management Command Reference.

Note: The preceding link was correct as of February 17, 2010. If you find that the link is broken, provide feedback and a VMware employee will update it.


Insufficient virtual machine video RAM

Details

  • A virtual machine has run out of video RAM for the configured display setting.
  • You see an insufficient video RAM message:

    Insufficient video RAM. The maximum resolution of the virtual machine will be limited to 1176×885. To use the configured maximum resolution of 2560×1600, increase the amount of video RAM allocated to this virtual machine by setting svga.vramSize="16384000" in the virtual machine's configuration file.

Solution

This is an informational message stating that large screen resolutions are not supported. This message does not indicate a problem.

VMware recommends that you accept the current maximum settings. If, however, you require a larger screen resolution, you can increase the svga.vramSize setting.

To increase the svga.vramSize setting:

  1. Power off the virtual machine.
  2. Right-click on the virtual machine and choose Edit Settings.
  3. Click the Hardware tab.
  4. Click Video Card.
  5. Select Enter total video RAM and enter a higher value.
  6. Click OK.
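Alternatively, the message itself names the configuration-file equivalent of this change. With the virtual machine powered off, a line like the following (value taken from the example message above; 16384000 bytes is exactly 2560×1600 at 32 bits per pixel) can be added to the virtual machine's .vmx file:

```
svga.vramSize = "16384000"
```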

http://kb.vmware.com/kb/1012874

ESX host boot stops at the error: VSD mount/Bin/SH:cant access TTY job control turned off

Symptoms

  • An ESX 4.0 host becomes unresponsive when it is put into maintenance mode.
  • Rebooting the host fails.
  • On the ESX console, you see the error: VSD mount/bin/sh: can't access TTY; job control turned off.
  • The ESX host does not boot and drops into Troubleshooting (busy box) mode.
  • The last lines in the /var/log/messages log file are similar to:

    sysboot: Getting '/boot/cosvmdk' parameter from esx.conf
    sysboot: COS VMDK Specified in esx.conf: /vmfs/volumes/4b27ec62-93ec3816-0475-00215aaf882a/esxconsole-4b27e9e3-20ee-69d7-ae11-00215aaf882a/esxconsole.vmdk
    sysboot: 66.vsd-mount returned critical failure
    sysboot: Executing 'chvt 1'

Resolution

This issue is resolved in vSphere 4.0 Update 3. To upgrade to vSphere 4.0 Update 3, see the VMware vSphere Download Center.

This issue occurs if an ESX host cannot identify the esxconsole.vmdk file in which the service console resides.

Important: If your ESX host boots from SAN, the service console must be installed on a VMFS datastore that resides on the host's local disk or on a SAN disk that is masked and zoned to that particular host only. This datastore cannot be shared between hosts. For more information, see the ESX and vCenter Server Installation Guide. Verify these requirements are met before continuing with the troubleshooting.

To troubleshoot this issue:

  1. Go to the console of the ESX host. After the error message, ESX drops into Troubleshooting (busy box) mode.
  2. Find the .vmdk for the service console by running the command:

     grep "/boot/cosvmdk" /etc/vmware/esx.conf

    The output is similar to:

    /boot/cosvmdk = "/vmfs/volumes/<uuid>/<dir>/esxconsole.vmdk"

    For example:

    /boot/cosvmdk = "/vmfs/volumes/4a14d968-88bf7161-700f-00145ef48f76/esxconsole-4a14d906-2f96-7956-7284-00145ef48f74/esxconsole.vmdk"

  3. Make note of the <uuid> and the <dir> values in the output.
  4. Verify that the files exist by running the command:ls -al /vmfs/volumes/<uuid>/<dir>/*.vmdk

    Where <uuid> and <dir> are from the output of step 2.

    The output is similar to:

    total 7906560
    drwxr-xr-x 1 root root        840 May 21 00:45 .
    drwxr-xr-t 1 root root       2660 Oct 21 09:10 ..
    -rw------- 1 root root 8095006720 Oct 26 15:37 esxconsole-flat.vmdk
    -rw------- 1 root root        475 May 21 00:32 esxconsole.vmdk
    drwxr-xr-x 1 root root        980 May 21 00:45 logs

    Note: You may receive the error: ls: /vmfs/volumes/4a14d968-88bf7161-700f-00145ef48f76/esxconsole-4a14d906-2f96-7956-7284-045ef48f74/: No such file or directory. If the directory does not exist, see ESX fails to boot when the disk containing the datastore with esxconsole.vmdk is detected as a snapshot (1012142).

  5. Ensure that the esxconsole.vmdk descriptor file exists. If the esxconsole.vmdk file does not exist, see Recreating a missing virtual disk (VMDK) header/descriptor file (1002511) and ensure that the following settings are in place:
    • ddb.adapterType = "buslogic"
    • ddb.consoleOsDisk = "True"
  6. Ensure that the esxconsole-flat.vmdk file exists.

    Note: If the esxconsole-flat.vmdk file does not exist, you must reinstall the ESX host to recreate the service console.
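Steps 2 through 4 above can be combined into a small shell sketch. The sample esx.conf line below is taken from this article; on a real host you would read /etc/vmware/esx.conf directly:

```shell
# Sketch: derive the service console VMDK path and its -flat extent name
# from an esx.conf-style entry (sample data; use /etc/vmware/esx.conf on a host).
CONF='/boot/cosvmdk = "/vmfs/volumes/4a14d968-88bf7161-700f-00145ef48f76/esxconsole-4a14d906-2f96-7956-7284-00145ef48f74/esxconsole.vmdk"'
COSVMDK=$(printf '%s\n' "$CONF" | awk -F'"' '{print $2}')   # field 2 = quoted path
FLAT="${COSVMDK%.vmdk}-flat.vmdk"
echo "$COSVMDK"
echo "$FLAT"
# On the host, verify both files exist with:
# ls -al "$COSVMDK" "$FLAT"
```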

http://kb.vmware.com/kb/1009125

Growing a local datastore from the command line in vSphere ESX 4.x

Purpose

VMFS Datastores in vSphere 4.x can be increased in size by adding a new extent on a different storage device (spanning), or by increasing the size of the existing storage device and then growing the existing datastore extent to fill that available adjacent capacity.

VMFS Datastore extents may be contained within Primary or Logical partitions, following the MBR/EBR partitioning scheme. VMFS Datastores on the ESX boot device are contained within a Logical partition, and those on an ESXi boot device are contained within a Primary partition.

  • Datastore extents within a Primary partition on a non-Local storage device can be grown into adjacent space using the vSphere Client. For more information, see the Changing VMFS Datastore Properties section of the ESX/ESXi Server Configuration Guide for your version of vSphere.
  • Datastore extents within Primary partitions on a Local or Boot storage device cannot be grown into adjacent space using the vSphere Client. This is the default layout for an ESXi 4.x installation. For more information, see Growing a local datastore from the command-line in vSphere ESXi 4.x (2002461).
  • Datastore extents within Extended and Logical partitions on a Local or Boot storage device cannot be grown into adjacent space using the vSphere Client. This is the default layout for an ESX 4.x installation. This article provides steps for growing an existing VMFS Datastore in a Logical partition to fill available adjacent space on the local boot device.

Notes:

  1. This article assumes that the underlying storage volume has already had its capacity increased from the hardware perspective, possibly by adding additional disks to a RAID set. For more information, engage your hardware vendor.
  2. A Datastore on a LUN detected as a snapshot cannot be grown. For more information, see ESX/ESXi 4.x handling of LUNs detected as snapshot (1011387).
  3. A Datastore's partitions can only be grown into contiguous adjacent space on the disk. Ensure that the partitions in question are at the end of the disk.

Warning: Be very careful not to overlap any Primary and Logical partitions. Overlapping partitions can result in data loss.

Resolution

To increase the size of a Datastore on a local boot storage device, recreate the partition layout to accommodate the larger filesystem, and then grow the Datastore to fill the larger partition.

  1. Use the boot device hardware’s management tools to add additional disk capacity to the device. For more information, engage your hardware vendor.
  2. Open a console to the ESX host. For more information, see Unable to connect to an ESX host using Secure Shell (SSH) (1003807).
  3. Obtain the device identifier for the Datastore to be modified (e.g. naa, mpx, eui, or vml). For more information, see Identifying disks when working with VMware ESX (1014953).

     vmkfstools -P "/vmfs/volumes/DatastoreName/"

    Example output:

    VMFS-3.33 file system spanning 1 partitions.
    File system label (if any): DatastoreName
    Mode: public
    Capacity 145223581696 (138496 file blocks * 1048576), 43937431552 (41902 blocks) avail
    UUID: 4a14d968-88bf7161-700f-00145ef48f76
    Partitions spanned (on "lvm"):
        mpx.vmhba0:C0:T0:L0:5

    Note: To obtain this information for the Datastore containing the Service Console virtual disk (VMDK) on ESX, use the command: vmkfstools -P `esxcfg-init --cos-vmdk`

  4. Record the amount of free disk space on the Datastore. For more information, see Investigating disk space on an ESX or ESXi host (1003564).
  5. Equipped with the device identifier, identify the existing partitions on the device using the partedUtil command. For more information, see Using the partedUtil command line utility on ESX and ESXi (1036609).

     partedUtil get "/vmfs/devices/disks/mpx.vmhba0:C0:T0:L0"

    For example, a disk containing an ESX 4.x installation with 4 existing partitions:

    17834 255 63 286515200     - Geometry of the disk. Disk size in sectors is 286515200.
    1 63 2249099 131 128       - Primary #1, Type 131=0x83=Linux, Bootable, Sectors 63-2249099
    2 2249100 2474009 252 0  - Primary #2, Type 252=0xFC=VMKcore, Sectors 2249100-2474009
    3 2474010 286487144 5 0  - Primary #3, Type 5=Extended, Sectors 2474010-286487144
    5 2474073 286487144 251 0  - Logical #5, Type 251=0xFB=VMFS, Sectors 2474073-286487144
    | |       |         |   |
    | |       |         |   \--- attribute
    | |       |         \------- type
    | |       \----------------- ending sector
    | \------------------------- starting sector
    \--------------------------- partition number

  6. Identify the partitions which need to be resized, and the size of the space to be used. From the example in step 5, Logical Partition 5 is contained within Extended Partition 3, they are the last partitions on the disk, and there is empty free space between them and the end of the disk:

     [Diagram: Primary Partition 1 (sectors 63-2249099) and Primary Partition 2 (sectors 2249100-2474009) come first, followed by Extended Partition 3 (Type 5, starting at sector 2474010) containing the Datastore in Logical Partition 5 (Type 251, VMFS, sectors 2474073-286487114), then the empty space to be used, out to the end of the disk (sector 286515199).]
  7. Identify the desired ending sector number for the target VMFS Datastore's partitions. To use all space out to the end of the disk, subtract 1 from the disk size in sectors as reported in step 5 to obtain the last usable sector. For example, disk sector count 286515200 - 1 = 286515199 as the last usable sector.
  8. Resize the partition(s) containing the target VMFS Datastore using the partedUtil command, specifying the existing starting sector of the partition and the desired ending sector:

    partedUtil resize "/vmfs/devices/disks/device" PartitionNumber NewStartingSector NewEndingSector

    For example, to resize the Extended and Logical partitions from the example in step 5:

    partedUtil resize "/vmfs/devices/disks/mpx.vmhba0:C0:T0:L0" 3 2474010 286515199
    partedUtil resize "/vmfs/devices/disks/mpx.vmhba0:C0:T0:L0" 5 2474073 286515199

  9. During step 8, the partedUtil command may report the warning: The kernel was unable to re-read the partition table on /dev/device (Device or resource busy).

    If you receive this warning, reboot the host before proceeding with the next step. For more information, see Rebooting an ESX Server host (1003530).

  10. The partition tables have been adjusted, but the VMFS Datastore within the partitions is still the same size. There is now empty space within the partition in which the VMFS Datastore can be grown:

     [Diagram: Primary Partitions 1 and 2 are unchanged; Extended Partition 3 (Type 5) and Logical Partition 5 (Type 251, VMFS) are now larger, ending at sector 286515199, with the new empty space inside the partition beyond the existing Datastore.]
  11. Grow the VMFS Datastore into the new space using the vmkfstools --growfs command, specifying the partition containing the target VMFS Datastore twice:

    vmkfstools --growfs "/vmfs/devices/disks/device:partition" "/vmfs/devices/disks/device:partition"

    For example:

    vmkfstools --growfs "/vmfs/devices/disks/mpx.vmhba0:C0:T0:L0:5" "/vmfs/devices/disks/mpx.vmhba0:C0:T0:L0:5"

  12. Validate that the size of the VMFS Datastore has increased. For more information, see Investigating disk space on an ESX or ESXi host (1003564).

    Note: Click the Refresh button in the vSphere Client to update the Datastore capacity and usage.
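The end-of-disk arithmetic above can be scripted so the sector numbers are not typed by hand. A minimal sketch using the example values from this article — the device identifier, partition numbers, and starting sectors are the ones from the example, so substitute your own, and treat the commented-out commands as a template to verify before running:

```shell
# Sketch: compute the last usable sector from the disk geometry reported
# by 'partedUtil get' (sector count 286515200 from the example above).
DISK="/vmfs/devices/disks/mpx.vmhba0:C0:T0:L0"   # assumption: your device identifier
SECTORS=286515200                                # geometry line from 'partedUtil get'
LAST=$((SECTORS - 1))                            # last usable sector
echo "Last usable sector: $LAST"
# On the host, the resize and grow steps would then be (verify first, do not run blindly):
# partedUtil resize "$DISK" 3 2474010 "$LAST"
# partedUtil resize "$DISK" 5 2474073 "$LAST"
# vmkfstools --growfs "$DISK:5" "$DISK:5"
```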

See Also

http://kb.vmware.com/kb/1026321

ESX host reboots, becomes unresponsive, or experiences a purple diagnostic screen when logging into the service console

Symptoms

  • When an ESX host has been added to Active Directory and a domain user attempts to log on to the service console, the ESX host may:
    • Reboot
    • Become unresponsive
    • Experience a purple diagnostic (PSOD) screen similar to:

      0:13:21:15.531 cpu0:4096)VMware ESX 4.1.0 [Releasebuild-260247 X86_64]
      0:13:21:15.531 cpu0:4096)#GP Exception 13 in world 4096:console @ 0xffffffff8801ab88
      0:13:21:15.531 cpu0:4096)cr0=0x80050033 cr2=0xffffc200053260b2 cr3=0xcf600000 cr4=0x660
      0:13:21:15.531 cpu0:4096)frame=0x417fc7806ef8 ip=0xffffffff8801ab88 err=64 rflags=0x10006
      0:13:21:15.531 cpu0:4096)rax=0x100f9 rbx=0xffffffff8012bcd8 rcx=0xffffffff8000a1d4
      0:13:21:15.531 cpu0:4096)rdx=0xa1d4 rbp=0x417fc7806fd8 rsi=0x417fc79e9b60
      0:13:21:15.531 cpu0:4096)rdi=0xffffffff8012bcd8 r8=0xffff802f800042a0 r9=0xffff
      0:13:21:15.531 cpu0:4096)r10=0x0 r11=0xffff802f800042a0 r12=0xffffffff88041f42
      0:13:21:15.531 cpu0:4096)r13=0x6 r14=0xffffffff8012bcd8 r15=0x6
      0:13:21:15.531 cpu0:4096)pcpu:0 world:4096 name:"console" (S)
      0:13:21:15.531 cpu0:4096)pcpu:1 world:4097 name:"idle1" (I)
      0:13:21:15.531 cpu0:4096)pcpu:2 world:4098 name:"idle2" (I)
      0:13:21:15.531 cpu0:4096)pcpu:3 world:4099 name:"idle3" (I)
      0:13:21:15.531 cpu0:4096)pcpu:4 world:4100 name:"idle4" (I)
      0:13:21:15.531 cpu0:4096)pcpu:5 world:4101 name:"idle5" (I)
      0:13:21:15.531 cpu0:4096)pcpu:6 world:4102 name:"idle6" (I)
      0:13:21:15.531 cpu0:4096)pcpu:7 world:4103 name:"idle7" (I)
      0:13:21:15.531 cpu0:4096)pcpu:8 world:4104 name:"idle8" (I)
      0:13:21:15.531 cpu0:4096)pcpu:9 world:4234 name:"helper23-4" (S)
      0:13:21:15.531 cpu0:4096)pcpu:10 world:4106 name:"idle10" (I)
      0:13:21:15.531 cpu0:4096)pcpu:11 world:4107 name:"idle11" (I)
      0:13:21:15.531 cpu0:4096)pcpu:12 world:4108 name:"idle12" (I)
      0:13:21:15.531 cpu0:4096)pcpu:13 world:4109 name:"idle13" (I)
      0:13:21:15.531 cpu0:4096)pcpu:14 world:4110 name:"idle14" (I)
      0:13:21:15.531 cpu0:4096)pcpu:15 world:4111 name:"idle15" (I)
      @BlueScreen: #GP Exception 13 in world 4096:console @ 0xffffffff8801ab88
      0:13:21:15.531 cpu0:4096)Code start: 0x418007800000 VMK uptime: 0:13:21:15.531
      0:13:21:15.531 cpu0:4096)0x417fc7806fd8:[0xffffffff8801ab88]__vmk_versionInfo_str@esx:nover+0x7ff9a577 stack: 0xffffffff8012
      0:13:21:15.531 cpu0:4096)0x417fc7806fe8:[0xffffffff8801a4f9]__vmk_versionInfo_str@esx:nover+0x7ff99ee8 stack: 0x0
      0:13:21:15.532 cpu0:4096)0xffffffff8012bec8:[0xffffffff8012bcd8]__vmk_versionInfo_str@esx:nover+0x780ab6c7 stack: 0x0
      0:13:21:15.539 cpu0:4096)FSbase:0x0 GSbase:0x418040000000 kernelGSbase:0x0 Coredump to disk. Slot 1 of 1.
  • This issue occurs if the user account that is accessing the service console is a member of more than 32 groups.
  • This issue does not occur if domain users connect using the vSphere Client.

Resolution

This issue is resolved in patch:

http://kb.vmware.com/kb/1020893

Migrating to a new vCenter Server with the Cisco Nexus 1000v vDS

Purpose

This article provides steps to migrate to a new instance of vCenter Server with the Cisco Nexus 1000v vDS (vNetwork Distributed Switches).

Resolution

To migrate to a new instance of vCenter Server with the Cisco Nexus 1000v vDS:

  1. Ensure that nothing is connected to or using the vDS. If anything is running on the vDS, create Standard vSwitches and migrate everything off the vDS. This includes powered-off virtual machines and templates, as they reserve ports on the vDS.
  2. Back up the current configuration. On the Virtual Supervisor Module (VSM), type:

     # copy running-config location/filename

  3. Remove the ESX/ESXi hosts from the vDS. Confirm the VEM is no longer connected to the VSM. On the VSM, run this command:

     # show mod

    or

    # show run

    The VSM returns no VEM information after the ESX/ESXi hosts are disconnected.

  4. If you are using multiple VSMs, put them on the same host and migrate all virtual machines off the ESX/ESXi host running the VSMs.
  5. Disconnect the Virtual Supervisor Module (VSM) from vCenter Server with the commands:

    n1kv# conf t
    n1kv(config)# svs connection [connection_name]
    n1kv(config-svs-conn)# no connect

    Note: For information, see Disconnecting From the vCenter Server in the Cisco Nexus 1000V System Management Configuration Guide.

  6. Add the ESX/ESXi host running the VSM(s) to the new vCenter.
  7. Install extension.xml on vCenter Server.
    1. Download the extension.xml from http://vsm_ip_address/.
    2. Log in to vCenter Server with vSphere Client.
    3. From the Plug-ins menu, right-click the white space in the Plug-in Manager and choose New Plug-in.
    4. Browse to the extension.xml saved in step a and click Register Plug-in.
    5. Ignore the security warning and click OK to confirm that the plug-in is successfully registered.
  8. Reconnect the VSM to vCenter Server with the commands:

    n1kv# conf t
    n1kv(config)# svs conn VC
    n1kv(config-svs-conn)# protocol vmware-vim
    n1kv(config-svs-conn)# remote ip address [IP_address]
    n1kv(config-svs-conn)# vmware dvs datacenter-name [DC_name]
    n1kv(config-svs-conn)# connect
  9. Check the connectivity of the VSM:

    n1kv# sh svs conn
    connection VC:
    ip address: [IP_address]
    protocol: vmware-vim https
    certificate: default
    datacenter name: DC_name
    DVS uuid: ac 36 07 50 42 88 e9 ab-03 fe 4f dd d1 30 cc 5c
    config status: Enabled
    operational status: Connected
    n1kv#

    Note: For more information, see Connecting to the vCenter Server in the Cisco Nexus 1000V System Management Configuration Guide.

http://kb.vmware.com/kb/1027622

The /var/log partition of the ESX host runs out of space due to vmkiscsid.log growth

Symptoms

When running VMware ESX 4.x with a software iSCSI initiator, you see these symptoms:

  • Running the df command in the host terminal shows that the partition /var/log is 100% full. For more information, see Investigating disk space on an ESX or ESXi host (1003564).
  • The log file /var/log/vmkernel does not update.
  • The log file /var/log/vmkiscsid.log is very large.

Cause

On VMware ESX 4.x, the software iSCSI daemon log file at /var/log/vmkiscsid.log is not rotated or cleared, unlike other syslog-generated log files. Therefore, the vmkiscsid.log file grows to a very large size if the host has experienced, or is experiencing, ongoing communication issues with the iSCSI storage.

Resolution

Solution

This issue is resolved by ESX 4.0 Update 3, released 2011-05-05. After installing this update, 600KB of vmkiscsid logs are retained, across 6 files. For more information, see the release notes at https://www.vmware.com/support/vsphere4/doc/vsp_esx40_u3_rel_notes.html.

Workaround for ESX 4.0

  1. If you need the software iSCSI client logs for troubleshooting purposes, archive them prior to reboot. See the Additional Information section.
  2. Reboot the ESX 4.0 host.
  3. The file /var/log/vmkiscsid.log is erased during startup and the space used by the file is released.

Workaround for ESX 4.1

  1. Reboot the ESX 4.1 host.
  2. The file /var/log/vmkiscsid.log is rotated to /var/log/vmkiscsid.log.previous during startup.
  3. If you need the software iSCSI client logs for troubleshooting purposes, archive them following the reboot. See the Additional Information section.
  4. Delete the older file using a command similar to:

rm /var/log/vmkiscsid.log.previous

Additional Information

If there is rapid logging to the /var/log/vmkiscsid.log file, investigate the underlying iSCSI storage connectivity issue by reviewing your storage target configuration and system logs.

Make a compressed backup of the /var/log/vmkiscsid.log or vmkiscsid.log.previous files before deleting them or rebooting the host, in case this information is needed for troubleshooting purposes. Archive the file to a location with sufficient space, such as a datastore. For example, use a command similar to:

tar czvf /vmfs/volumes/DatastoreName/vmkiscsid.tgz /var/log/vmkiscsid.log*

Note:

  • To view this file, you can extract the file using an archive manager.
  • The archive can be relocated back to /var/log/ after the system reboots to keep the system logs in a single location.

Working with firewall rules in ESX 4.x

Purpose

This article provides information about firewall rules in ESX 4.0 and ESX 4.1.

Resolution

VMware supports opening and closing firewall ports only through the vSphere Client or the esxcfg-firewall command. Using any other methods or scripts to open and close firewall ports can lead to unexpected behavior.

Use of iptables or other Linux commands to modify firewall default rules is not supported.

If you modify firewall rules for the ESX service console using iptables or any utility other than the esxcfg-firewall command, those changes may be lost: operations that manage the firewall can cause it to revert to the default configuration maintained by esxcfg-firewall. For example, configuring VMware High Availability (HA) on a host causes the firewall to revert to the esxcfg-firewall defaults if you have modified the rules with the iptables command.

Opening and closing firewall ports with the esxcfg-firewall command

To open or close ports with the esxcfg-firewall command:

  1. Log in to the service console and acquire root privileges.
  2. Use this command to open the port:

     esxcfg-firewall --openPort <port_number>,tcp|udp,in|out,<port_name>

    Use this command to close the port:

    esxcfg-firewall --closePort <port_number>,tcp|udp,in|out,<port_name>

    Where:

    • <port_number> is the vendor-specified port number
    • tcp is for TCP traffic and udp is for UDP traffic
    • in opens the port for inbound traffic and out opens it for outbound traffic
    • <port_name> is a descriptive name to help identify the service or agent using the port. A unique name is not required.

      For example:

      # esxcfg-firewall --openPort 6380,tcp,in,Navisphere

      # esxcfg-firewall --closePort 6380,tcp,in

  3. Restart the vmware-hostd process. For more information, see Restarting the Management Agents on an ESX or ESXi Server (1003490).

Changing the default firewall rules by modifying default.xml

If you modify the defaults by using a Linux command, your changes are ignored and overwritten by the defaults specified for that service by the esxcfg-firewall command. If you want to change the defaults for a supported service, or define defaults for additional service types, you can modify or add to the rules in /etc/vmware/firewall/chains/default.xml. These rules follow the syntax of the iptables command. The default.xml file always uses the iptables -A option for the specified chain.

To change the default firewall rules using default.xml:

  1. Log in to the service console with administrator privileges.
  2. Edit the /etc/vmware/firewall/chains/default.xml file to correspond to your security policies.
  3. Restart the service console firewall with the command:

     service firewall restart

  4. Check that the specified services are correctly enabled or disabled with the command:

     esxcfg-firewall -e|-d SERVICE

  5. Verify that your modified rules are working correctly with the command:

     iptables -nL

    Note: Do not use the iptables command to modify any settings.

You can modify the firewall defaults for each of the service types according to your own security policies. For example, these rules in the /etc/vmware/firewall/chains/default.xml file determine the firewall rules for the INPUT chain:

<ConfigRoot>
<chain name="INPUT">
<rule>-p tcp --dport 80 -j ACCEPT</rule>
<rule>-p tcp --dport 110 -j ACCEPT</rule>
<rule>-p tcp --dport 25 -j ACCEPT</rule>
</chain>
...
</ConfigRoot>

The default.xml fragment above is equivalent to these iptables commands:

% iptables -A INPUT -p tcp --dport 80 -j ACCEPT
% iptables -A INPUT -p tcp --dport 110 -j ACCEPT
% iptables -A INPUT -p tcp --dport 25 -j ACCEPT

Changes in ESX 4.1

ESX 4.1 introduces these additional configuration files located in /etc/vmware/firewall/chains:

  • usercustom.xml
  • userdefault.xml

The default files custom.xml and default.xml are overridden by usercustom.xml and userdefault.xml. All configuration is saved in usercustom.xml and userdefault.xml.

Copy the original custom.xml and default.xml files and use them as a template for usercustom.xml and userdefault.xml.
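The copy suggested above amounts to two cp commands. A sketch, demonstrated against a scratch directory with placeholder file contents so it is safe to run anywhere; on an ESX 4.1 host, the directory would be /etc/vmware/firewall/chains and the templates would be the shipped files:

```shell
# Sketch: seed usercustom.xml and userdefault.xml from the shipped templates.
DIR=$(mktemp -d)                           # stand-in for /etc/vmware/firewall/chains
echo '<ConfigRoot/>' > "$DIR/custom.xml"   # placeholder template contents
echo '<ConfigRoot/>' > "$DIR/default.xml"
cp "$DIR/custom.xml"  "$DIR/usercustom.xml"
cp "$DIR/default.xml" "$DIR/userdefault.xml"
ls "$DIR"
```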

http://kb.vmware.com/kb/1017523

The vmkernel logs report a warning containing LinChar and LinuxCharWrite

Symptoms

You may observe a warning similar to the following within the /var/log/vmkernel logs of a VMware ESX host running on Hewlett Packard hardware:

WARNING: LinChar: LinuxCharWrite: M=224 m=0 flags=0x1002 write 29004 bytes at offset 0x8a1ccc failed (-16)

Resolution

This is not a VMware issue.

This issue occurs when the Integrated Lights-Out 2 (iLO2) is reset. This issue has been observed in iLO2 version 1.80. For more information, see:

Note: The preceding links were correct as of August 09, 2011. If you find a link is broken, provide feedback and a VMware employee will update the link.

To work around this issue, restart the hpagents. For more information, contact HP.

http://kb.vmware.com/kb/1017162

ESX 4.x host fails to boot after power operation with the error: fsck.ext3: Unable to resolve UUID

Symptoms

  • After power-cycling or rebooting an ESX 4.x server, this error is produced during boot:

    fsck.ext3: Unable to resolve 'UUID=34d192db-17eb-442e-9613-c5c24c6fa9fa'

    And

    *** An error occurred during the file system check.
    *** Dropping you to a shell; the system will reboot
    *** when you leave the shell.

  • After encountering this error, you are unable to boot into ESX or Troubleshooting mode.
  • The unresolvable EXT file systems or partitions most commonly turn out to have mount points such as /var, /opt, and /tmp.

Resolution

This issue occurs when the boot-time file system check utility (fsck) for EXT3 file systems cannot resolve a file system (by UUID) defined in /etc/fstab.
Issues that can result in this include:

  • The default roll-back option is left enabled when a subsequent upgrade is being performed.
  • The device is not present during system boot.
  • The unresolvable EXT file systems appear to reside on disks/devices that are initialized later during system boot (e.g. the last LUN).

Note: If you are experiencing an outage with virtual machines down, consider resolving the situation quickly by reinstalling VMware ESX. Troubleshooting may take more time than a reinstallation, which takes approximately 20 minutes.

Otherwise, refer to the instructions below for submitting information to VMware Technical Support for analysis.

Further troubleshooting is available in the shell:

  1. Confirm the UUIDs which were not resolvable, and remain so, by running fsck again without additional arguments. Information similar to the following is displayed:

     # fsck

    fsck 1.39 (29-May-2006)
    e2fsck 1.39 (29-May-2006)
    esx-root: clean, 32953/641280 files, 414801/1281175 blocks
    e2fsck 1.39 (29-May-2006)
    /dev/sdt1: clean, 35/140832 files, 25323/281596 blocks
    fsck.ext3: Unable to resolve 'UUID=34d192db-17eb-442e-9613-c5c24c6fa9fa'
    e2fsck 1.39 (29-May-2006)
    /dev/sdt6: clean, 31/250368 files, 27851/500220 blocks
    e2fsck 1.39 (29-May-2006)
    /dev/sdt7: clean, 22/250368 files, 16815/500220 blocks

  2. Record the UUID or UUIDs which failed to resolve. You may take a screen shot of your System Management Interface, take a picture, or write the values down.
  3. Confirm these same values in the /etc/fstab file:

     # cat /etc/fstab

    UUID=79815890-f11c-4907-80fe-d1cd6bf061f8 /        ext3    defaults                  1 1
    UUID=45460133-027b-40b6-8b4d-e52aaf4c417f /boot    ext3    defaults                  1 2
    None                    /dev/pts                   devpts  defaults                  0 0
    /dev/cdrom              /mnt/cdrom                 udf,iso9660 noauto,owner,kudzu,ro 0 0
    /dev/fd0                /mnt/floppy                auto    noauto,owner,kudzu        0 0
    None                    /proc                      proc    defaults                  0 0
    None                    /sys                       sysfs   defaults                  0 0
    UUID=34d192db-17eb-442e-9613-c5c24c6fa9fa /var/log ext3    defaults,errors=panic     1 2
    UUID=e32ec5f4-d795-414a-8d73-a2bb3ea86342 swap     swap    defaults                  0 0

    Note: The /var/log line contains the unresolvable UUID from step 1; that is the affected mount point.

  4. Verify what UUIDs the system is currently aware of by running the command:

     # ls -l /dev/disk/by-uuid

    total 0
    lrwxrwxrwx 1 root root 10 Nov  9 14:36 45460133-027b-40b6-8b4d-e52aaf4c417f -> ../../sdm1
    lrwxrwxrwx 1 root root 10 Nov  9 14:36 e32ec5f4-d795-414a-8d73-a2bb3ea86342 -> ../../sdr1
    lrwxrwxrwx 1 root root 10 Nov  9 14:36 34d192db-17eb-442e-9613-c5c24c6fa9fa -> ../../sdr2
    lrwxrwxrwx 1 root root 10 Nov  9 14:36 79815890-f11c-4907-80fe-d1cd6bf061f8 -> ../../sdr5

    Notes:

    • This output reveals the UUID-to-partition relationship for all discovered EXT partitions in the system. Affected mount points or content can be associated using the previous step.
    • It is possible in some environments that none of the known partitions reported by listing /dev/disk/by-uuid match the unresolved UUID. This is correctable; for additional instructions, proceed to the following sections and correct the content of the /etc/fstab file.
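The comparison in steps 2 through 4 can be automated. A sketch using the sample UUIDs from this article; on a host, the two lists would come from /etc/fstab and /dev/disk/by-uuid as shown in the commented commands:

```shell
# Sketch: report fstab UUIDs that do not appear among the discovered UUIDs.
# Sample data from the article; on a host, populate the lists with:
#   grep -o 'UUID=[0-9a-f-]*' /etc/fstab     (strip the UUID= prefix)
#   ls /dev/disk/by-uuid
FSTAB_UUIDS="79815890-f11c-4907-80fe-d1cd6bf061f8
34d192db-17eb-442e-9613-c5c24c6fa9fa"
KNOWN_UUIDS="79815890-f11c-4907-80fe-d1cd6bf061f8"
for u in $FSTAB_UUIDS; do
  printf '%s\n' "$KNOWN_UUIDS" | grep -q "$u" || echo "unresolved: $u"
done
```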

Solution

VMware is currently investigating further for a full root-cause and solution.

If you are able to reproduce this issue while maintaining production via alternate servers, contact VMware Technical Support after completing these steps:

  1. Log into the terminal of the affected ESX host.
  2. Remount the root partition in read-write mode:

     # mount / -o remount,rw
  3. Capture logs during startup via the serial port. For more information, see Enabling serial-line logging for an ESX and ESXi host (1003900).
  4. Reboot the ESX host and log the results via your listening serial terminal.
  5. Contact VMware Technical Support and file a Support Request. For additional information, see Filing a Support Request (1021619).

Workarounds

There are two recommended workarounds. Both involve modifying the /etc/fstab file. You can either:

  • Generate a new UUID for the affected file system(s) and update /etc/fstab to match the new value(s).
  • Update /etc/fstab to incorporate the correct UUID from the file system.

Applying a new UUID

Apply a new UUID to the EXT-3 file systems which fail to resolve and update the /etc/fstab file.

  1. Run tune2fs against each Linux partition on the suspected disk device. For example:

    # tune2fs -l /dev/sdr2 | grep UUID
    Filesystem UUID:          34d192db-17eb-442e-9613-c5c24c6fa9fa

    # tune2fs -U random /dev/sdr2
    tune2fs 1.39 (29-May-2006)

    # tune2fs -l /dev/sdr2 | grep UUID
    Filesystem UUID:          25a18c70-ffcb-4b15-9d2d-1cfab1754d86

  2. Update /etc/fstab with the updated UUID. In the earlier steps, the /dev/sdr2 partition was determined to be the /var/log mount point:
    1. Remount the root partition in read-write mode:# mount / -o remount,rw
    2. Open the /etc/fstab file for re-writing. For more information, see Editing configuration files in VMware ESX (1017022).
    3. Search for, and change, the original UUID to the newly-generated UUID from earlier steps, above.
    4. Save the file and remount the root partition in read-only mode:# mount / -o remount,ro
    5. Reboot the server:

# shutdown -r now
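The substitution in step 2 can be sketched as a sed one-liner. This is a demonstration on a copy, using the example UUIDs from step 1; on the host, remount read-write and edit /etc/fstab itself:

```shell
OLD=34d192db-17eb-442e-9613-c5c24c6fa9fa   # UUID reported before tune2fs -U random
NEW=25a18c70-ffcb-4b15-9d2d-1cfab1754d86   # UUID reported afterward

# Sample /var/log entry as it might appear in /etc/fstab:
printf 'UUID=%s /var/log ext3 defaults 1 2\n' "$OLD" > /tmp/fstab.demo

# Replace the old UUID with the newly generated one.
sed -i "s/$OLD/$NEW/" /tmp/fstab.demo
cat /tmp/fstab.demo
```

Diff the edited copy against the original before overwriting the real /etc/fstab.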

Applying a UUID to /etc/fstab

  1. Remount the filesystem to a read/write state with the command:# mount / -o remount,rw
  2. Open the /etc/fstab file in an editor. For more information, see Editing configuration files in VMware ESX (1017022).
  3. Locate the entry for the affected mount point and replace its unresolved UUID with the correct UUID reported by tune2fs -l for that partition.
  4. Save the file and exit the editor.
  5. Reboot the ESX host.
  6. To prevent this issue from occurring again, after the system is booted, follow the steps in Clean Up the ESX Bootloader Menu After Upgrade in the vSphere Upgrade Guide.

Removing residual roll-back installation UUIDs in /etc/fstab

Perform these steps to remove past installations from the /etc/fstab file. This may be necessary if roll-back data is preventing the ESX host from starting successfully.

  1. Log into the server using the root password.
  2. Remount the root filesystem in read-write mode with the command:# mount / -o remount,rw
  3. Open the /etc/fstab file in a text editor. For more information, see Editing configuration files in VMware ESX (1017022).
  4. Comment out or remove the line referring to the previous ESX installation by inserting a hash symbol (#) at the beginning of the line.
  5. Save the file and exit the editor.
  6. Reboot the ESX host.
  7. To prevent this issue from recurring, follow the steps in Clean Up the ESX Bootloader Menu After Upgrade in the vSphere Upgrade Guide after rebooting.
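The comment-out in step 4 can be sketched with sed. This demonstration works on a sample copy (the stale UUID is an example value); on the host, edit /etc/fstab after remounting read-write:

```shell
STALE=e32ec5f4-d795-414a-8d73-a2bb3ea86342   # example UUID left over from a rolled-back install

# Sample fstab with one stale entry and one current entry:
printf 'UUID=%s / ext3 defaults 1 1\nUUID=other-uuid /var/log ext3 defaults 1 2\n' "$STALE" > /tmp/fstab.demo

# Prefix a '#' to any line that still references the stale UUID.
sed -i "/^UUID=$STALE/s/^/#/" /tmp/fstab.demo
cat /tmp/fstab.demo
```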

Additional Information

To list partitions in the system, run:

# fdisk -l | less

Note: All known partitions are shown. In many cases where this issue has occurred, the resolved and unresolved partitions reside on the same block or disk device. Specifically, they most commonly reside on the same device as the ESX Console OS or the ESX host’s system disk. For example:

System Disk

Disk /dev/sdm: 64.4 GB, 64424509440 bytes
64 heads, 32 sectors/track, 61440 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes

Device Boot      Start         End      Blocks   Id  System
/dev/sdm1   *           1        1100     1126384   83  Linux
/dev/sdm2            1101        1210      112640   fc  VMware VMKCORE
/dev/sdm3            1211       61440    61675520    5  Extended
/dev/sdm5            1211       40960    40703984   fb  VMware VMFS

Console OS Disk
Disk /dev/sdr: 7973 MB, 7973371904 bytes
255 heads, 63 sectors/track, 969 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot      Start         End      Blocks   Id  System
/dev/sdr1               1          76      610438+  82  Linux swap / Solaris
/dev/sdr2              77         331     2048287+  83  Linux
/dev/sdr3             332         969     5124735    5  Extended
/dev/sdr5             332         969     5124703+  83  Linux
Notes:

  • In these steps, the chosen disks/partitions should typically be ones known to be utilized by the ESX host operating system; RDMs used by Linux VMs can be skipped.
  • ESX host system disks typically contain type FB (VMFS) or FC (VMKCore) partitions.
  • ESX Console OS disks usually appear relatively small, on the order of 8GB or slightly larger.
  • Press q to quit the less utility when you are done reviewing partitioning information.
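As a rough aid to spotting the ESX system disk in the fdisk output, the fb (VMFS) and fc (VMKCORE) partition Ids can be filtered with grep. A sketch using sample lines like those above; on a host, pipe fdisk -l instead of echoing:

```shell
# Sample fdisk -l lines; on the host use: fdisk -l 2>/dev/null | grep -E ' f[bc] '
fdisk_out='/dev/sdm2            1101        1210      112640   fc  VMware VMKCORE
/dev/sdm5            1211       40960    40703984   fb  VMware VMFS
/dev/sdr2              77         331     2048287+  83  Linux'

# System-disk candidates carry partition Id fb (VMFS) or fc (VMKCORE).
echo "$fdisk_out" | grep -E ' f[bc] '
```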

Tags

esx-boot  esx-does-not-boot  unable-to-boot  cannot-boot-esx

See Also

http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&docType=kc&externalId=1012142&sliceId=1&docTypeID=DT_KB_1_1&dialogID=250492976&stateId=0

ESX fails to boot when the disk containing the datastore with esxconsole.vmdk is detected as a snapshot

Symptoms

You may encounter issues while booting an ESX 4.x machine if:

  • VMware ESX 4.x is installed on a local disk or boots from SAN.
  • There was a recent change to the disk or LUN on which ESX 4.x is installed.
  • There was a recent change to the controller connecting to the disk or LUN on which ESX 4.x is installed.
  • The ESX host is attempting to boot from a LUN that has been replicated to another SAN.

Symptoms may include:

  • The system does not boot and drops into Troubleshooting (busy box) mode.
  • The boot process fails.
  • You see errors similar to:vsd-mount .......................................... [FAILED]
    /bin/sh: can't access tty; job control turned off

Cause

VMware ESX 4.x uses a virtual disk to store the service console. The virtual disk is created during installation on a VMFS volume residing on the boot disk or LUN. The VMFS volume containing the service console files (also referred to as esxconsole.vmdk) may be detected as a snapshot LUN. If this happens, the ESX host may be unable to access the service console files during boot.

For more information about snapshot LUNs, see VMFS Volume Can Be Erroneously Recognized as a Snapshot (6482648) and Snapshot LUN detection in ESX 3.x and ESX 4 (1011385).

Note: This article assumes that the disk, LUN, and storage controller are functioning properly. If you are having issues with local storage such as boot LUN presentation issues or a failed storage RAID, these issues must be identified before implementing this resolution provided in the article. Depending on the type of issue, you may need to engage the appropriate storage vendors for assistance.

Resolution

If the ESX host has detected the VMFS volume containing the esxconsole.vmdk file as a snapshot LUN, the ESX host drops into Troubleshooting (busy box) mode during boot.

Note: Depending on the installation type (for example, interactive mode or scripted mode), the service console files may be named differently. To verify the name of the service console file (esxconsole.vmdk, default-cos.vmdk, etc), run this command on the ESX command-line:

grep -i cosvmdk /etc/vmware/esx.conf

This procedure uses esxconsole.vmdk. Replace with the name of your service console file as necessary.

To allow your ESX host to boot successfully:

  1. Provide the necessary credentials to access the busy box.
  2. Run this command to enable resignaturing on the VMware ESX machine:esxcfg-advcfg -s 1 /LVM/EnableResignature

    You should see output similar to:

    Value of EnableResignature is 1 .

    Note: If the root is mounted as read only, run the command mount -o remount,rw / to remount the volume in a writable state.

  3. Run this command to unload the VMFS drivers:vmkload_mod -u vmfs3
  4. Run this command to load the VMFS drivers:vmkload_mod vmfs3
  5. Run this command to detect new VMFS volumes and resignature the volume:vmkfstools -V
  6. Run this command to identify the full path of the service console file:find /vmfs/volumes/ -name esxconsole.vmdk

    The output appears similar to:

    /vmfs/volumes/4a14d968-88bf7161-700f-00145ef48f76/esxconsole-4a14d906-2f96-7956-7284-00145ef48f74/esxconsole.vmdk

    Note: Make a note of this full path.

  7. Restart the VMware ESX machine. You see a menu provided by the grub boot loader.
  8. Press e to edit the grub entries manually.
  9. Scroll down to the line that starts with kernel /vmlinuz (it is indented under the VMware ESX 4.0 heading).
  10. Go to the end of the line and add a space followed by this option (all on one line), specifying the full path from step 6:/boot/cosvmdk=/vmfs/volumes/<path>/esxconsole.vmdk
  11. Press Enter to accept the changes.
  12. Press b to boot using the modified settings. The ESX host successfully boots.Note: The changes made to the boot options are not saved. They only apply to the current boot process. The changes need to be made to the boot configuration files as described in these steps.
  13. Log into the console as root.
  14. Edit the /etc/vmware/esx.conf file with a text editor and modify these two lines, specifying the full path from step 6:/adv/Misc/CosCorefile = "/vmfs/volumes/<path>/core-dumps/cos-core"
    /boot/cosvmdk = "/vmfs/volumes/<path>/esxconsole.vmdk"
  15. Run this command to update the boot configuration files:esxcfg-boot -b
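Both the one-shot grub option (step 10) and the persistent esx.conf lines (step 14) can be generated from the path recorded in step 6. A sketch using the example path from the article, and assuming the usual layout where the core-dumps directory sits on the same VMFS volume as the esxconsole directory:

```shell
# Example path from step 6 of the article:
COS_VMDK=/vmfs/volumes/4a14d968-88bf7161-700f-00145ef48f76/esxconsole-4a14d906-2f96-7956-7284-00145ef48f74/esxconsole.vmdk

# One-shot boot option typed at the grub edit prompt:
echo "/boot/cosvmdk=$COS_VMDK"

# Persistent esx.conf lines for step 14:
VOL=${COS_VMDK%/*/*}   # strip the /esxconsole-*/esxconsole.vmdk suffix to get the volume path
echo "/adv/Misc/CosCorefile = \"$VOL/core-dumps/cos-core\""
echo "/boot/cosvmdk = \"$COS_VMDK\""
```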

Additional Information

For more information regarding the grub boot loader, see the Grub’s User Interface documentation.

Note: The preceding link was correct as of November 25, 2009. If you find the link is broken, provide feedback and a VMware employee will update the link.

For more information about snapshot LUNs, see VMFS Volume Can Be Erroneously Recognized as a Snapshot (6482648) and Snapshot LUN detection in ESX 3.x and ESX 4 (1011385).

http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&docType=kc&externalId=1016297&sliceId=1&docTypeID=DT_KB_1_1&dialogID=250492976&stateId=0

Virtual machines fail to power on after upgrading to vSphere 4.0 Update 1 with the error: Insufficient resources, No COS swap configured

Symptoms

  • Virtual machines fail to power on after upgrading to vSphere 4.0 Update 1
  • The ESX console reports the error:COS Swap not configured
  • The vmkernel logs contains the events:ESX2 vmkernel: 0:00:03:09.920 cpu4:4110)World: vm 4255: 1098: Starting world vmware-vmkauthd with flags 4
    ESX2 vmkernel: 0:00:03:14.954 cpu0:4096)VMNIX: VGA: 462: 2
    ESX2 vmkernel: 0:00:03:14.956 cpu0:4096)VMNIX: VGA: 462: 1
    ESX2 vmkernel: 0:00:03:14.961 cpu0:4096)VMNIX: VGA: 462: 3
    ESX2 vmkernel: 0:00:03:14.964 cpu0:4096)VMNIX: VGA: 462: 4
    ESX2 vmkernel: 0:00:03:14.968 cpu0:4096)VMNIX: VGA: 462: 5
    ESX2 vmkernel: 0:00:03:33.338 cpu0:4096)VMNIX: Logger: 475: sysalert Service Console has no configured swap space.
  • free -m output indicates a swap space of zero
  • /etc/fstab has the wrong UUID for the swap partition

Resolution

To resolve this issue:

  1. Find the device for the swap partition using the command fdisk -l. The device name looks similar to /dev/sdX, where X is the device letter. For example: /dev/sda
  2. Activate the swap on the partition with the command:swapon /dev/sdX
  3. To save the settings permanently, find the UUID for the swap partition using the command:blkid  /dev/sdX
  4. Edit /etc/fstab so that the swap entry uses the UUID reported in step 3.
  5. Save and close the file.
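Steps 3 and 4 can be sketched as follows; the blkid output line, device name, and UUID are illustrative examples, not values from your host:

```shell
# Hypothetical blkid output for the swap partition:
blkid_line='/dev/sda2: UUID="9c2d1f0e-0b1c-4d2e-8f3a-5b6c7d8e9f00" TYPE="swap"'

# Extract the UUID and build the corresponding fstab swap entry.
uuid=$(echo "$blkid_line" | sed 's/.*UUID="\([^"]*\)".*/\1/')
echo "UUID=$uuid swap swap defaults 0 0"
```

The printed line is what the swap entry in /etc/fstab should look like after the edit.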

Request a Product Feature

To request a new product feature or to provide feedback on a VMware product, please visit the Request a Product Feature page.

http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&docType=kc&externalId=1012683&sliceId=1&docTypeID=DT_KB_1_1&dialogID=250492976&stateId=0

Increasing the block size of local VMFS storage in ESX 4.x during installation

Purpose

This article provides instructions on how to specify a block size when VMware ESX 4.x formats VMFS volumes during installation. By default, the installer uses a 1MB block size, allowing for storage of files 256GB and under. The default VMFS datastore cannot be easily reformatted with a larger block size because it contains the Service Console VMDK file.

Notes for ESXi users:

  • ESXi 4.1 U1 and earlier: This article is not applicable as there is no Service Console or associated VMDK file. The default VMFS Datastore is blank, and can be deleted and reformatted post-install. It is not necessary to change the VMFS Datastore block size during the installation.
  • ESXi 4.1 U2: You can specify the VMFS block size during GUI, text, and kickstart installation. For more information, see the Resolved Issues: Upgrade and Installation section of the VMware ESXi 4.1 Update 2 Release Notes.

Resolution

The ESX 4.x installer utilizes the default block size of 1MB when creating VMFS volumes. For additional information on the VMFS block size, see the Additional Information section, below.

The service console in ESX 4.x is stored in a VMDK file on a VMFS partition, typically located on local storage. While VMware ESX 4.x is running, you cannot reformat the volume with the intended block size. If you have already completed an installation of VMware ESX 4.x, are unable to reinstall the product with the intended block size using steps provided below, and you require a larger block size on the local VMFS partition, see Reformatting the local VMFS partition’s block size in ESX 4.x (post-installation) (1013210).

New Installations: Formatting with a VMFS block size larger than 1MB

While the installer can be modified to format any new VMFS partitions with a specified block size, the following workarounds are also available:

  • Re-install the ESX host on a different drive (for example, a second RAID set or boot from SAN), and leave the original disk for the VMFS volume. You can then choose your blocksize when creating the second datastore at a later time.
  • Alternatively, install ESX 3.5, create the volume with the desired block size (or re-format the volume accordingly), then upgrade to ESX 4.x. Specify that the existing VMFS volume is to be used to store your Console OS VMDK.
  • Create a second RAID set, forming a discrete device or volume which can be formatted with the intended block size, post-installation.
  • Carve out a new LUN or volume on the local controller. This, too, can be formatted with the intended block size post-installation, but additional procedures are required:Note: You cannot create a second datastore (via another partition) on the same drive via the ESX GUI. You must use the vmkfstools command. You may also need to create a partition on the free space first with the fdisk command:

    vmkfstools -C vmfs -b Xm -S local2mBS /vmfs/devices/disks/naa.xxxxxxxxxx:y

    where:

    • Xm is the blocksize (1m, 2m, 4m, or 8m).
    • local2mBS is your volume name. If the volume name contains a space, enclose it in quotation marks (for example, "volume name").
    • naa is the naa identifier, and y is the partition number. To determine this, run ls -la in the /vmfs/devices/disks folder.

Note: Depending on your disk controller type, naa. may be replaced with eui., t10., or mpx.. For more information, see Identifying disks when working with VMware ESX (1014953).

Reconfiguring the ESX 4.x installer and formatting new VMFS volumes with a specific block size

To reconfigure the installer to format VMFS partitions with a specified block size:

  1. Boot the ESX installation DVD and choose Install in graphical mode.
  2. Press Ctrl+Alt+F2 to switch to the shell.
  3. Run this command:ps | grep Xorg
  4. Kill the PID shown for Xorg -br -logfile .... For example, if this PID is 590, enter this command:

    kill 590

    Notes:

    • If you specified a GUI mode installation, killing the process identified as Xorg may switch you back to another console. If this occurs, press Ctrl+Alt+F2 to return to the previous console.
    • If after killing the Xorg process you see the message Press <return> to reboot, press Ctrl+Alt+F3 to go to another console and continue working there without rebooting.
  5. Switch to the configuration directory. Enter this command:cd /usr/lib/vmware/weasel
  6. Edit the configuration script. Enter this command:vi fsset.py

    Note: For more information on editing files, see Editing configuration files in VMware ESXi and ESX (1017022).

  7. Locate class vmfs3FileSystem(FileSystemType).
  8. Edit the blockSizeMB parameter to the block size that you want. It is currently set to 1. The only values that work correctly are 1, 2, 4, and 8.Note: Press i to enter insert mode.
  9. To save and close the file, press Esc, type :wq!, and press Enter. The exclamation mark is needed to force the save because the file is read-only.
  10. Verify that the content has been changed. Enter this command:grep -i blockSizeMB fsset.py
  11. Switch back to the root directory. Enter this command:cd /
  12. Launch the installer with the new configuration. Enter this command:/bin/weasel
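If you prefer a non-interactive edit in step 8, sed can change the parameter in place. This demonstration uses a sample copy of the line; the exact spacing and indentation of blockSizeMB in fsset.py on your installation media may differ, so verify the result with grep as in step 10:

```shell
# Sample copy of the relevant line; on the installer shell, run sed against fsset.py itself.
printf 'blockSizeMB = 1\n' > /tmp/fsset.demo

# Change the default block size from 1 MB to 8 MB.
sed -i 's/blockSizeMB = 1/blockSizeMB = 8/' /tmp/fsset.demo
grep -i blockSizeMB /tmp/fsset.demo
```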

Additional Information

The largest file that can be created with a 1MB block size is 256GB in size. For more information about the maximum file size, see the Configuration Maximums for your version of ESX.

To create a file bigger than 256GB, the VMFS filesystem needs to have a block size larger than 1MB. These are the maximums:

Block Size    Maximum File Size
1 MB          256 GB
2 MB          512 GB
4 MB          1 TB
8 MB          2 TB
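The table's pattern is linear: each 1 MB of block size allows 256 GB of maximum file size. A quick check:

```shell
# Maximum VMFS-3 file size scales linearly: 256 GB per 1 MB of block size.
for bs in 1 2 4 8; do
  echo "${bs} MB block size -> $(( bs * 256 )) GB max file"
done
```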

For more information about block sizes, see Block size limitations of a VMFS datastore (1003565).

http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&docType=kc&externalId=1033570&sliceId=1&docTypeID=DT_KB_1_1&dialogID=250492976&stateId=0

Virtual machine fails to power on with the error: Thin/TBZ disks cannot be opened in multiwriter mode. VMware ESX cannot open the virtual disk for clustering.

Symptoms

Summary

A virtual machine fails to power on with the error:

Thin/TBZ disks cannot be opened in multiwriter mode. VMware ESX cannot open the virtual disk for clustering.

Example

Reason: Thin/TBZ disks cannot be opened in multiwriter mode..
Cannot open the disk '/vmfs/volumes/4c549ecd-66066010-e610-002354a2261b/Windows Server 2008/Windows Server 2008.vmdk' or one of the snapshot disks it depends on.
VMware ESX cannot open the virtual disk, "/vmfs/volumes/4c549ecd-66066010-e610-002354a2261b/Windows Server 2008/Windows Server 2008.vmdk" for clustering. Please verify that the virtual disk was created using the 'thick' option.

Impact

The virtual machine fails to power on.

Resolution

This issue usually occurs in configurations where a virtual disk is shared across multiple virtual machines.

The shared virtual disks must be in the eagerzeroedthick disk format to facilitate clustering configurations, such as Microsoft Clustering service and VMware Fault Tolerance. This issue occurs if the disk is not in the correct format. To verify if a virtual disk is zeroedthick/eagerzeroedthick, see Determining if a VMDK is zeroedthick or eagerzeroedthick (1011170).

To resolve this issue, see:

Tags

cannot-power-on-vm  virtual-disk-type  vm-power-on-fails

See Also

http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&docType=kc&externalId=1025642&sliceId=1&docTypeID=DT_KB_1_1&dialogID=250492976&stateId=0

Changing the default password expiration policy in ESX 4.x

Purpose

This article provides information to change the default password aging policies in ESX.

Resolution

To change the default password aging setting you must edit the /etc/login.defs file and change the values of some of the parameters.

To change the default password aging setting:

  1. Open the /etc/login.defs file using a text editor.
  2. Edit the file and change the value of these parameters:PASS_MAX_DAYS   99999
    PASS_MIN_DAYS   0
    PASS_MIN_LEN    5
    PASS_WARN_AGE   7

    Where

    • PASS_MAX_DAYS is the maximum number of days a password may be used
    • PASS_MIN_DAYS is the minimum number of days allowed between password changes
    • PASS_MIN_LEN is the minimum acceptable password length
    • PASS_WARN_AGE is the number of days of warning given before a password expires
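These parameters can also be changed non-interactively with sed. A demonstration on a sample copy of the file, using example policy values (90-day expiry, 14-day warning); on the host, run the same sed commands against /etc/login.defs itself:

```shell
# Sample copy of the relevant lines; replace /tmp/login.defs.demo with /etc/login.defs on the host.
printf 'PASS_MAX_DAYS   99999\nPASS_WARN_AGE   7\n' > /tmp/login.defs.demo

# Example policy: expire passwords after 90 days and warn 14 days ahead.
sed -i -e 's/^PASS_MAX_DAYS.*/PASS_MAX_DAYS   90/' \
       -e 's/^PASS_WARN_AGE.*/PASS_WARN_AGE   14/' /tmp/login.defs.demo
cat /tmp/login.defs.demo
```

Note that login.defs changes apply to newly set passwords; existing accounts keep their current aging values.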

Additional Information

For other password policy options and to securely deploy vSphere 4.x in a production environment, see the vSphere 4.0 Security Hardening Guide.

Request a Product Feature

To request a new product feature or to provide feedback on a VMware product, please visit the Request a Product Feature page.

http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&docType=kc&externalId=1010789&sliceId=1&docTypeID=DT_KB_1_1&dialogID=250492976&stateId=0

Configuring VMDirectPath I/O pass-through devices on an ESX host

Purpose

This article provides steps for configuring VMDirectPath I/O direct PCI device connections for virtual machines running on Intel Weybridge and Stoakley platforms.

Resolution

About VMDirectPath I/O pass-through devices

You can connect up to two pass-through devices (ESX 4.0) or up to four pass-through devices (ESX 4.1) to a virtual machine. When selecting the devices, keep in mind these restrictions:

Configuring pass-through devices

To configure pass-through devices on an ESX host:

  1. Select an ESX host from the Inventory of VMware vSphere Client.
  2. On the Configuration tab, click Advanced Settings. The Pass-through Configuration page lists all available pass-through devices.Note: A green icon indicates that a device is enabled and active. An orange icon indicates that the state of the device has changed and the host must be rebooted before the device can be used.
  3. Click Edit.
  4. Select the devices and click OK.Note: If you have a chipset with VT-d, when you click Advanced Settings in vSphere Client, you can select what devices are dedicated to the VMDirectPath I/O.
  5. When the devices are selected, they are marked with an orange icon. Reboot for the change to take effect. After rebooting, the devices are marked with a green icon and are enabled.Note: The configuration changes are saved in the /etc/vmware/esx.conf file. Entries are recorded against the parent PCI bridge; if two devices are under the same PCI bridge, only one entry is recorded.

    The PCI slot number where the device was connected is 00:0b:0. It is recorded as:

    /device/000:11.0/owner = “passthru”

    Note: 11 is the decimal equivalent of the hexadecimal 0b.
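The hexadecimal-to-decimal conversion in the note can be reproduced with printf:

```shell
# The PCI slot 0x0b in hexadecimal is 11 in decimal, matching the /device/000:11.0 entry.
printf '%d\n' 0x0b
```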

To configure a PCI device on a virtual machine:

  1. From the Inventory in vSphere Client, right-click the virtual machine and choose Edit Settings.
  2. Click the Hardware tab.
  3. Click Add.
  4. Choose the PCI Device.
  5. Click Next.Note: When the device is assigned, the virtual machine must have a memory reservation for the full configured memory size.

http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&docType=kc&externalId=1012270&sliceId=1&docTypeID=DT_KB_1_1&dialogID=250492976&stateId=0

A LUN fails to appear as a usable device during a rescan with the VMkernel log error: Peripheral qualifier 0x1 not supported

Symptoms

  • During a rescan operation, the LUN might not appear as a usable device in the VMware Infrastructure/vSphere Client or in esxcfg-mpath.
  • You may observe a series of events similar to the following within the VMkernel logs:

ScsiScan: 839: Path 'vmhba2:C0:T1:L0': Vendor: 'Vendor' Model: 'Model' Rev: '0000'
ScsiScan: 842: Path 'vmhba2:C0:T1:L0': Type: 0x0, ANSI rev: 5, TPGS: 0 (none)
ScsiScan: 105: Path 'vmhba2:C0:T1:L0': Peripheral qualifier 0x1 not supported

Resolution

The issue is resolved in ESX 4.1 Update 1 and ESX 4.0 Update 3; after updating, these messages no longer appear in the VMkernel logs.

These messages indicate that the storage array is providing a non-zero Peripheral Qualifier in response to a SCSI Inquiry command for a given LUN. In the example from the Symptoms section, the storage array provides a non-zero Peripheral Qualifier for LUN 0.

There are two situations in which you may receive this message:

  • This message is returned only for LUN 0, at 5-minute intervals, and no LUN 0 is presented to the ESX host.

    Note: Management commands (such as REPORT LUNS) are sent to LUN 0, irrespective of whether it is available or not.

In this case, the message can be safely ignored because the array correctly reports that the ESX host does not have access to LUN 0. However, it causes log spew and may make troubleshooting other potential problems difficult. VMware recommends that you present a LUN 0 to the ESX host to stop this spew.

  • This message is returned for LUNs besides LUN 0. This can be the result of incorrect storage-side LUN presentation, invalid hostmode settings, incorrect or incompatible firmware versions, or other vendor-specific configuration parameters. Contact your vendor and reference the VMware Hardware Compatibility Guide and the VMware SAN Configuration Guide or iSCSI Configuration Guide to ensure that your storage array is properly configured for use with your version of VMware ESX.


Possible return codes from SCSI SPC-3:

  • 0x0 = 000b – The specified peripheral device type is currently connected to this logical unit. However, it does not indicate that the device is ready for access by the initiator. If the device server is unable to determine whether or not a physical device is currently connected, it may use this peripheral qualifier when returning the INQUIRY data.
  • 0x1 = 001b – The device server is capable of supporting the specified peripheral device type on this logical unit. However, the physical device is not currently connected to this logical unit.
  • 0x2 = 010b – Reserved
  • 0x3 = 011b – The device server is not capable of supporting a physical device on this logical unit. For this peripheral qualifier, the peripheral device type should be set to 1Fh to provide compatibility with previous versions of SCSI. All other peripheral device type values are reserved for this peripheral qualifier.
  • 0x4 – 0x7 = 1xxb – Vendor specific
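The qualifier values above come from byte 0 of the SCSI INQUIRY response, which packs the peripheral qualifier into the top three bits and the peripheral device type into the low five. A sketch decoding an example byte:

```shell
# SPC-3 INQUIRY byte 0 layout: (peripheral qualifier << 5) | peripheral device type
byte0=$(( 0x20 ))   # example byte: qualifier 0x1 (device not currently connected), device type 0x00

echo "peripheral qualifier:   $(( (byte0 >> 5) & 0x7 ))"
echo "peripheral device type: $(( byte0 & 0x1F ))"
```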

For further detailed information, see SCSI Standards Architecture from T10.

Note: The preceding link was correct as of May 20, 2010. If you find the link is broken, provide feedback and a VMware employee will update the link.

http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&docType=kc&externalId=1013210&sliceId=1&docTypeID=DT_KB_1_1&dialogID=250492976&stateId=0

Reformatting the local VMFS partition’s block size in ESX 4.x (post-installation)

Purpose

This article provides steps to reformat the local VMFS partition’s block size in ESX 4.x, when there already is an existing ESX installation.

Note: For ESXi servers, reformat the local VMFS3 file system using the VMware vSphere client, or the vmkfstools command. For additional instructions, see:

Resolution

The VMware ESX 4.x installer formats newly created VMFS-3 partitions with a 1MB block size. If you have not completed an installation yet, see Increasing block size of local storage in ESX 4.x during installation (1012683).

Post-installation, re-formatting local storage with the intended block size requires relocating the Console OS VMDK, which is not possible while ESX 4.x is booted.

Note: There are a number of workarounds available to utilize a larger block size for the local VMFS file system. These options are the simplest:

  • Create two logical units on the local storage device(s) in the server. Use the first for the ESX 4.x installation with the default VMFS block size, and create a second VMFS volume with, for now, the default block size; the second may be re-formatted later with a larger block size.
  • Alternatively, if you are going to be upgrading to ESX 4.x, format the local VMFS partition on ESX 3.5 with a larger block size first, then begin your upgrade to ESX 4.x.

If these methods are not suitable or you require retaining the existing ESX installation, you can reformat the existing ESX 4.x local VMFS partition with a larger block size using the ESX 4.x installer environment. Please note, however, that this process will take considerably more time than reinstalling the product and using the steps provided in Increasing block size of local storage in ESX 4.x during installation (1012683), due to Console OS file transfers.

Existing installations: Reformatting the local VMFS partition with a larger block size

To reformat the ESX 4.x local VMFS partition with a larger block size while retaining the original installation of ESX, you require:

  • The ESX 4.x installation CD.
  • All of your locally-stored virtual machines already migrated to alternative storage or datastores.
  • A storage device (8.5GB or larger) to temporarily store the Console OS. This device can be a USB flash/hard disk, a SAN LUN, a secondary SCSI disk, logical unit, or volume, or a remote SSH server.Note: If you are using a flash/hard drive, it is reformatted in the steps below.

To reformat the local VMFS partition on your VMware ESX 4.x server after an installation has already been completed:

  1. Prepare the server for booting into the installer CD’s live environment:
    1. Shut down the ESX host.
    2. Insert, attach, or present the temporary storage device to the ESX host. If you are using an SSH server, disregard this step. This storage is used to store your Console VMDK file.
    3. Power on the ESX host.
    4. Boot into the ESX 4.x installer and select either the graphical or text installer.
    5. Complete the driver loading stage within the chosen installer. This loads the VMkernel drivers required to access the existing VMFS partition.
    6. When you are prompted about the license, choose Evaluation mode. You do not need to configure licensing.
    7. Complete the network configuration steps. After opting for DHCP or manual configuration, click the Test button. This commits the networking changes.
  2. Prepare the USB flash/hard drive or alternate block device:
    1. Open an alternate TTY session by pressing Ctrl+Alt+F2. Press Enter to start the login session.
    2. Run fdisk <device> to modify the partitions for the newly-added device.Note: Representation of attached devices varies between servers. A USB device used for the purposes of this article is recognised as /vmfs/devices/disks/mpx.vmhba32:C0:T0:L0.
    3. Print the current partition table to ensure that the correct drive is selected by pressing p in fdisk.Notes:
      • If you have selected an incorrect device, press q to quit, then try another device.
      • If you are using an old drive or it contains partitions you need to remove, you can destroy the partition table entries by pressing d in fdisk.
    4. Press n to create a new Linux partition on this device. You may use the default parameters to partition the whole device.
    5. When you are finished, press w to write the changes and quit fdisk.
    6. After partitioning, format the new partition as EXT3. For example:mkfs.ext3 /vmfs/devices/disks/mpx.vmhba32:C0:T0:L0:1

      Note: If you see a prompt indicating that the partition is not a block special device, confirm that you want to proceed.

    7. Create a directory or mount point for the device. For example:mkdir /mnt/drive
    8. Mount the filesystem with the command:mount -t ext3 <device path> <mount point>

      For example:

      mount -t ext3 /vmfs/devices/disks/mpx.vmhba32:C0:T0:L0:1 /mnt/drive

    9. Copy the console OS directory to the mount point. For example:cp -R /vmfs/volumes/Storage1/esxconsole* /mnt/drive

      Note: This process can take some time as the Console OS VMDK is approximately 8.5 GB or larger.

      Alternatively, you can secure-copy the Console OS to a network location:

      1. Verify that you can reach your designated SSH server from the console using ping.
      2. SCP the contents of the Console OS directory to the SSH server. Adjust the following command as required:scp -r /vmfs/volumes/Storage1/esxconsole* username@server:/destination_directory

        Note: The specified SSH server and its storage must have capacity and support for files of 8GB in size and larger.

  3. Reformat the original VMFS volume with your desired block size. For example, to format it with an 8MB block size and name it Storage1, run the command:

    vmkfstools -C vmfs3 -b 8m -S Storage1 /vmfs/devices/disks/mpx.vmhba0:C0:T0:L0:5

    Note: The local storage device address may be different on your server. Ensure that you have the correct device and partition selected before formatting with the vmkfstools command. In the above example, the VMFS3 volume was in the fifth partition (the trailing :5). While this is the default configuration, your configuration may still be different; for example, it may have a different vmhba, controller (C), target (T), or LUN (L) number.

  2. Copy the Console OS directory back from the mount point:
    • If you used a USB flash/hard drive or other block device, copy the Console OS directory from your secondary drive back to the VMFS volume. For example:

      cp -R /mnt/mydrive/* /vmfs/volumes/Storage1

    • If you used an SSH server, use scp to copy the Console OS back from the SSH server to the new VMFS volume. For example:

      scp -r username@server:/copied-directory /vmfs/volumes/Storage1
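The mpx.vmhbaA:C#:T#:L#:partition naming used throughout these commands can be split apart mechanically before you run a destructive command such as vmkfstools -C. A hedged sketch using the example path from the text (the variable names are for illustration only):

```shell
#!/bin/sh
# Sketch: split an ESX device path of the form mpx.vmhbaA:Cc:Tt:Ll:p into
# its adapter, controller, target, LUN, and partition fields, so each one
# can be double-checked against your hardware before formatting.
dev="mpx.vmhba0:C0:T0:L0:5"   # example device from the article

base=${dev##*/}               # strip any /vmfs/devices/disks/ prefix
adapter=$(echo "$base" | cut -d: -f1)     # mpx.vmhba0
controller=$(echo "$base" | cut -d: -f2)  # C0
target=$(echo "$base" | cut -d: -f3)      # T0
lun=$(echo "$base" | cut -d: -f4)         # L0
part=$(echo "$base" | cut -d: -f5)        # 5 (the VMFS partition)

echo "adapter=$adapter controller=$controller target=$target lun=$lun partition=$part"
```

If any field does not match what you expect from your storage layout, stop and re-check the device before formatting.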
  3. Reconfigure the ESX host:
    1. Navigate to the /vmfs/volumes/Storage1/<cos-location>/ directory. Run pwd and record the absolute path to the Console OS VMDK file. For example:

      # pwd

      /vmfs/volumes/4a79e784-066e4fef-9d4b-005056ab7e20/esxconsole-4a785116-c442-9826-6f60-005056ab7e20/

      Note: The VMDK filename is esxconsole.vmdk.

    2. Eject the installation CD, then reboot the ESX host.
    3. Interrupt the GRUB Bootloader. During startup, you have a few seconds to select ESX Server 4.0 or Troubleshooting Mode. Press a cursor key or a meta key to interrupt the countdown.
    4. Highlight ESX Server 4.0 (or ESX Server 4.1). Edit the kernel boot parameter for the first menu item by pressing a.
    5. Add the required /boot/cosvmdk parameter to your kernel boot line. It must be at the beginning of the line, before the values that are already there (do not erase them). For example:
      • Before:grub append> ro root=UUID=d01bc3a8-1e83-47ea-8250-a77cd15fc54 mem=300M quiet
      • After:grub append> /boot/cosvmdk=/vmfs/volumes/4a79e784-066e4fef-9d4b-005056ab7e20/esxconsole-4a785116-c442-9826-6f60-005056ab7e20/esxconsole.vmdk ro root=UUID=d01bc3a8-1e83-47ea-8250-a77cd15fc54 mem=300M quiet

        Note: Double-check your input. Typographical errors may result in a failed boot-up.

    6. Press Enter to save the change. This resumes booting ESX.

      Note: If you are returned to a recovery shell, there may have been a typographical error in step 5, or there may be other issues reaching the file or starting the host. You can retry the steps in this section (Reconfiguring the ESX host), or contact VMware Technical Support for assistance. For more information on contacting VMware Technical Support, see How to Submit a Support Request.

    7. When ESX is online at the status screen, log in to the console. You can log in using an SSH client.
    8. Edit the /etc/vmware/esx.conf file using a text editor.
    9. Locate the line that specifies the Console OS VMDK path and replace the path with the value recorded in step 1. For example:

      /boot/cosvmdk = "/vmfs/volumes/4a79e784-066e4fef-9d4b-005056ab7e20/esxconsole-4a785116-c442-9826-6f60-005056ab7e20/esxconsole.vmdk"

    10. Save your changes and return to the prompt.
    11. Run the command esxcfg-boot -b to update the ESX host boot configuration and initial RAM disk image.
    12. (Optional) Restart the ESX host using the reboot command.
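The esx.conf edit described above could, in principle, be scripted rather than done by hand. A sketch that edits a temporary copy instead of the real /etc/vmware/esx.conf (the "old-uuid" path in the sample file is invented; the new path is the example UUID path from this article, standing in for whatever pwd reported on your host):

```shell
#!/bin/sh
# Sketch: rewrite the /boot/cosvmdk entry in an esx.conf-style file.
# In real use you would back up /etc/vmware/esx.conf first and then run
# esxcfg-boot -b afterward, as the article describes.
new_vmdk='/vmfs/volumes/4a79e784-066e4fef-9d4b-005056ab7e20/esxconsole-4a785116-c442-9826-6f60-005056ab7e20/esxconsole.vmdk'

conf=$(mktemp)
cat > "$conf" <<'EOF'
/boot/cosvmdk = "/vmfs/volumes/old-uuid/esxconsole/esxconsole.vmdk"
/adv/Misc/HostName = "esx01"
EOF

# Replace only the cosvmdk line; '|' is used as the sed delimiter because
# the replacement value itself contains '/' characters.
sed -i "s|^/boot/cosvmdk = .*|/boot/cosvmdk = \"$new_vmdk\"|" "$conf"

grep '^/boot/cosvmdk' "$conf"
```

Note that only the cosvmdk line is touched; all other esx.conf entries are left as-is, which is why a targeted sed is safer here than regenerating the file.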

Reference taken from VMware.

 

For more, visit our complete blog site!
