VMware vSphere and ESXi have a very solid storage system. You might be interested to know that there is a tool, similar to Windows chkdsk, that can check for and identify metadata corruption affecting file systems or the underlying logical volumes. In such situations, you might want to run the VMware vSphere On-Disk Metadata Analyzer (VOMA).
If you're having problems with VMFS datastores or a virtual flash resource (storage outages, issues after rebuilding a RAID or replacing a disk, metadata consistency errors in the vmkernel.log file), you can run VOMA from the CLI of an ESXi host to check for and fix minor consistency issues on the VMFS datastore or virtual flash resource.
To use the tool the right way, check the KB articles listed at the end of this post. Make sure that the VMFS datastore does not span multiple extents (VOMA can be run only against a single extent). You also need to evacuate running VMs to another datastore or shut them down. However, if you doubt that a VM will restart after being shut down, make sure that you have a proper backup of that VM before shutting it down.
How to use VMware vSphere On-Disk Metadata Analyzer?
If you're not sure, file a support request with VMware before doing any of this. Check both KB articles for best practices.
Basically you should make sure that:
- There are no VMs on the datastore you wish to analyze.
- For VMFS5 datastores, the datastore is unmounted on all ESXi hosts (I haven't done that, so I got a message saying that one ESXi host uses this datastore for heartbeating).
- For VMFS3, LUN masking has to be in place, set up with claim rules.
- The volume does not have several extents.
Step 1. Connect via SSH and enter this command to obtain the name and partition number of the device which has the VMFS datastore.
esxcli storage vmfs extent list
The output lists each VMFS volume together with its device name and partition number.
Step 2. Run VOMA with the absolute path to the device partition you want to check. The partition number must be appended to the device name after a colon. Example below:
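The extent listing from Step 1 can also be used to confirm that a datastore sits on a single extent before VOMA is run. The function below is only a sketch: count_extents is a hypothetical helper name, and it assumes the listing is a table with a header row, a dashed separator row, and then one row per extent whose last four columns are the VMFS UUID, extent number, device name and partition. It also assumes awk is available in the ESXi shell.

```shell
#!/bin/sh
# Hypothetical helper: count the extents that back one VMFS volume.
# Reads the output of `esxcli storage vmfs extent list` on stdin.
# Assumed layout: header row, dashed separator row, then one row per extent
# ending in: VMFS UUID, Extent Number, Device Name, Partition.
count_extents() {
    volume="$1"
    awk -v vol="$volume" '
        NR <= 2 { next }    # skip the header and separator rows
        NF >= 5 {
            # Volume names may contain spaces, so rebuild the name from
            # every field except the last four columns.
            name = $1
            for (i = 2; i <= NF - 4; i++) name = name " " $i
            if (name == vol) n++
        }
        END { print n + 0 }
    '
}

# Intended use on an ESXi host (datastore1 is a placeholder name):
#   esxcli storage vmfs extent list | count_extents datastore1
```

A result greater than 1 would mean the volume spans several extents, which is the case VOMA cannot handle according to the requirements above.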
voma -m vmfs -f check -d /vmfs/devices/disks/t10.ATA_ST31000528AS__6VPB9053:1
In our case, the check returns only a notification that datastore heartbeating is taking place on this datastore, but no errors.
The options:
You can list all options, as usual, by typing:
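Typing the device path by hand is error-prone, so the path passed to -d can be derived from the Step 1 listing instead. This is a rough sketch under the same assumptions about the table layout (header row, separator row, device name and partition as the last two columns); extent_paths is a hypothetical helper name, not part of esxcli or VOMA.

```shell
#!/bin/sh
# Hypothetical helper: print /vmfs/devices/disks/<device>:<partition>
# for every extent of the named volume.
# Reads the output of `esxcli storage vmfs extent list` on stdin.
extent_paths() {
    volume="$1"
    awk -v vol="$volume" '
        NR <= 2 { next }    # skip the header and separator rows
        NF >= 5 {
            # Rebuild the volume name from all but the last four columns.
            name = $1
            for (i = 2; i <= NF - 4; i++) name = name " " $i
            if (name == vol)
                printf "/vmfs/devices/disks/%s:%s\n", $(NF - 1), $NF
        }
    '
}

# Intended use on an ESXi host (datastore1 is a placeholder name):
#   dev=$(esxcli storage vmfs extent list | extent_paths datastore1)
#   voma -m vmfs -f check -d "$dev"
```

If the function prints more than one line, the volume has several extents and the check should not be run against it as a whole.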
voma -h
The output looks like this:
You can log the output to a file with the -s option, or display the help message for more detail on each VOMA option.
VOMA on our test system (ESXi 6.5) supports 4 functions, selected with the -f option:
query – list functions supported by module
check – check for Errors
fix – check & fix
dump – collect metadata dump
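A small wrapper can guard against typos in the function name and add the -s log file in one place. The sketch below only prints the command instead of running it, so it can be reviewed first. voma_command is a hypothetical name, and the wrapper deliberately accepts only check and fix, on the assumption that both take the same -m, -f and -d arguments as the Step 2 example; query and dump are left out because they may need other options (see voma -h).

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) a voma command line.
# Arguments: <function> <device-path> [logfile]
voma_command() {
    func="$1"
    device="$2"
    logfile="$3"

    # Only the two functions that follow the Step 2 pattern are accepted.
    case "$func" in
        check|fix) ;;
        *) echo "unsupported VOMA function: $func" >&2; return 1 ;;
    esac

    if [ -z "$device" ]; then
        echo "device path required" >&2
        return 1
    fi

    cmd="voma -m vmfs -f $func -d $device"
    # -s writes the output to the given log file.
    if [ -n "$logfile" ]; then
        cmd="$cmd -s $logfile"
    fi
    echo "$cmd"
}

# Example (device path and log file are placeholders):
#   voma_command check /vmfs/devices/disks/naa.example:1 /tmp/voma.log
```

Printing rather than executing is a deliberate choice here: a fix run changes on-disk metadata, so it seems safer to read the command back before pasting it into the ESXi shell.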
To be honest, I wasn't aware of this utility until I found it mentioned in the VMware vSphere troubleshooting guide (PDF). It is certainly a good find, as it can detect errors when you have doubts about your datastore.
vSphere On-disk Metadata Analyzer (VOMA) can help in situations where you need to isolate certain problems or clear up doubts. You might have intermittent access to files on a datastore, but the cause could be network issues or storage issues. Having a tool that can clear up such doubts is a good thing. Similar to Microsoft's check disk, VOMA can be useful when troubleshooting VMware ESXi or vSphere storage problems.
Two VMware KB articles are worth reading. One of them shows how to recreate missing partition tables with the help of VOMA.
The other one gives further guidance on the VOMA tool, along with some cautions.
Quote:
Shutting down a virtual machine running on files having certain types of corrupt metadata may make the virtual machine and its data permanently unavailable. Because of this it is always advisable to have current backups of the virtual machines in the environment. If you suspect that the virtual machine may become unavailable, because for example, there are read/write errors in the guest OS, or the virtual machine is unresponsive, you should open a support request.
Here is the link:
If you're having doubts or experiencing problems and have VMware support, it's always preferable to file a support request. One never knows. Also, back up any VMs that you suspect of any kind of data corruption.
Yuriy Andamasov says
Thank you Vladan!
Definitely should be on list of troubleshooting tools
Joshua says
If all VMs need to be off the datastore and the datastore needs to be offline to use VOMA, why not just delete the datastore and recreate it from scratch and format it? Wouldn’t that clear up any errors, or is this for validating the disks behind the VMFS datastore?
Vladan SEGET says
You might not be able to move all the VMs off (large datastore, inaccessible VMDKs...). It can fix metadata issues only, not the underlying RAID groups.
JayST says
so how would you check if you can’t move the VMs?
Vladan SEGET says
Do you experience particular problems? You can first use the tool to check if there are any metadata problems. If yes, you should back up your VMs first (if not already done) before trying anything else.
If you can’t access some VMDKs then obviously you can’t move them, along with the VMs.
With a good backup, you can just delete and recreate the datastore to clean things out.