VSAN: Fixing inaccessible objects

My recent posts have covered an upgrade I completed at my workplace on a pre-production VDI on VSAN environment. The upgrade took us from 5.5 U2 to 6.0 U1 on vSphere and from 6.0.1 to 6.2 on View.

One error that stopped me in my tracks for quite a while occurred during the upgrade of the VSAN on-disk file format from v1 to v2. Before the upgrade I read the documentation and supplemented it with a few blog posts from people in the community, including some great ones from VMware Blogs, Cormac Hogan and Sam McGeown. These posts covered my environment (3 node VSAN) and also larger cluster configurations, which are inherently easier.

Before I start with the problem, I’d also like to mention this article from Florian Grehl, which was really useful in helping me get the most out of the RVC, especially creating marks for objects so you don’t have to keep referencing them the long-winded way.

The Issue

Before I had even moved from 5.5 to 6.0 U1, running the cluster health check command showed inaccessible objects listed against one specific host:

VSAN1

The Fix

After some reading, it turns out that this can be caused by .vswp files that have become obsolete. There is a fix for this but not in the command set for RVC 5.5. Therefore, I continued the upgrade of the environment to get to 6.0 U1 where the command becomes available.

Upon running the command when the environment was all up to 6.0 U1, I saw the following:

VSAN2

At this point I accepted the prompts to delete the files that had been found through running the script.

VSAN3

VSAN4
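
The screenshots in this section do not carry over as text, so here is a minimal sketch of the sequence in written form. It assumes the "cluster health check command" is RVC's `vsan.check_state` and the purge command is `vsan.purge_inaccessible_vswp_objects`; the post does not spell out either name in full, so treat both as assumptions. The cluster path is a hypothetical placeholder. The Python helper only builds the lines to type into RVC; it does not connect to vCenter.

```python
from typing import List


def rvc_purge_sequence(cluster_path: str) -> List[str]:
    """Return the RVC command lines, in order, for one cluster.

    Assumed command names (not confirmed by the post itself):
    vsan.check_state and vsan.purge_inaccessible_vswp_objects.
    """
    path = cluster_path.strip()
    if not path:
        raise ValueError("cluster_path must not be empty")
    return [
        f"vsan.check_state {path}",                      # see what is inaccessible
        f"vsan.purge_inaccessible_vswp_objects {path}",  # prompts before deleting
        f"vsan.check_state {path}",                      # confirm what is left
    ]


if __name__ == "__main__":
    # Hypothetical inventory path; substitute your own cluster (or an RVC mark).
    for line in rvc_purge_sequence("/vcenter.example/DC/computers/Cluster"):
        print(line)
```

Running the check before and after the purge makes it easy to see which objects the purge could not handle.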

Running this command gave a good reduction in the number of inaccessible objects, although I still had around six left. I ran the purge command again, but it was unable to remove any more files. I had read on the forums that these objects can be removed manually, but that doing so carries the risk of seriously damaging the VSAN environment if you are not careful, so I raised an SR with GSS before continuing.

Disclaimer: GSS assisted me with the following and did a great job helping me out. Please use the following commands with extreme caution and if in doubt always raise an SR first!

With that out of the way, I was able to use the VMware object tool to understand what these inaccessible objects were associated with:

VSAN5

As we can see from the above, the objects relate to some .vmdk files. Luckily for me, I was aware of these VMs, as they live in the VDI environment that we use for pre-production; they relate to our Windows 8.1 desktops. Further investigation showed that these VMs were old and had already been removed from the system (due to VDI recomposes, etc.). This gave GSS and me the confidence to move on to manually removing these objects.

VSAN6

After repeating this process for the remaining inaccessible objects reported, I was then able to run the check_state command (as above) to ensure everything was good to go.

VSAN7
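
Because the object tool commands only appear in screenshots, the following is a hedged sketch of how the per-object commands could be written out ahead of time for review. The `objtool` path and flags are taken from the reader comments further down (`getAttr --bypassDom -u <uuid> -c` and `delete -u <uuid> -c -f`), not from the steps GSS walked me through, and the double hyphen in `--bypassDom` is an assumption about how the flag is spelled. The function name and the `include_delete` switch are hypothetical. The script only prints text; nothing is run against a host.

```python
import re
from typing import List

OBJTOOL = "/usr/lib/vmware/osfs/bin/objtool"

# Object UUIDs in the RVC output look like 67160e57-545a-fa16-db96-3440b59368bc.
UUID_RE = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$"
)


def objtool_commands(uuids: List[str], include_delete: bool = False) -> List[str]:
    """Build (but never execute) objtool command lines for each object UUID.

    Delete lines are only emitted when include_delete is True, so the
    default output is read-only inspection commands.
    """
    lines = []
    for raw in uuids:
        uuid = raw.strip().lower()
        if not UUID_RE.match(uuid):
            raise ValueError(f"not a well-formed object UUID: {raw!r}")
        # Flags as quoted in the comments; verify on your own build first.
        lines.append(f"{OBJTOOL} getAttr --bypassDom -u {uuid} -c")
        if include_delete:
            lines.append(f"{OBJTOOL} delete -u {uuid} -c -f")
    return lines


if __name__ == "__main__":
    for line in objtool_commands(["67160e57-545a-fa16-db96-3440b59368bc"]):
        print(line)
```

Validating the UUID format up front guards against a pasted typo ending up in a delete command, and keeping deletion opt-in means the inspection step is always reviewed first. As above, raise an SR before running anything destructive.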

From this point on I was good to continue with the planned upgrade of the VSAN on-disk file format. This will be documented in my VDI on VSAN series of posts.

13 comments

  1. Erik Verbruggn

    Thank you very much for this article. We ran into the same problems when our lab environment needed to be rebuilt but not all disks were formatted.

    It’s also possible to remove the disk / disk group from the VSAN configuration, format the disks and re-add the disks. It’s very important to format the capacity disk AND the flash disk otherwise it will not work. You can use the remove partition function in the Web Client (6.0 U1) to format the disks.

    If it’s only a few objects or you do not want to delete all the objects on the disk group, it’s easier to use the steps in this article.

  2. Ryan Harris

    Hi Erik,

    Sorry for the slow reply getting back to you. I thought I replied already but must not have saved it!

    It is possible to remove the disks from the configuration manually, but I wanted to try to fix the issue directly rather than effectively working around it and rebuilding. The Web Client is great for the new features for upgrading/formatting disks, but I like to do it the old-fashioned way to see what’s really going on!

    Thank you for your comments, I’m glad this article helped!

  3. philzy

    Hello!
    I have this problem:
    > vsan.object_info /192.168.1.30/DC-LV-01/computers/HA-DRS-01/ 67160e57-545a-fa16-db96-3440b59368bc
    DOM Object: 67160e57-545a-fa16-db96-3440b59368bc (v3, owner: host-02., policy: No POLICY entry found in CMMDS)
    RAID_1
    Component: 67160e57-4aab-6017-f02d-3440b59368bc (state: ABSENT (6), host: host-01., md: naa.600605b0048058f01e231593191e6ac9, ssd: naa.600605b0048058f01e6f54cf401ef387, note: LSOM object not found,
    votes: 1)
    Component: 67160e57-c240-6217-9324-3440b59368bc (state: ABSENT (6), host: host-02., md: naa.600605b0047fea401ea006ff39e3d501, ssd: naa.600605b0047fea401ea0086a4f933a3e,
    votes: 1, usage: 2.0 GB)
    Witness: 67160e57-aa86-6317-5d5d-3440b59368bc (state: ABSENT (6), host: host-wtnss-01-02., md: mpx.vmhba1:C0:T1:L0, ssd: mpx.vmhba1:C0:T2:L0, note: LSOM object not found,
    votes: 1)

    I tried your solution but get this:
    [root@host-02:~] /usr/lib/vmware/osfs/bin/objtool getAttr -u 60160e57-5015-74e0-5cfe-3440b59368bc -c
    Failed to find lsom object: Not found
    Failed to get disk uuid
    object getAttr error: Failure

    Any ideas?

    • Ryan Harris

      Hi,

      Do you still have this problem? I’ve been slow to reply recently due to personal commitments.

      I notice that you are not using the full getAttr --bypassDom -u **INACCESSIBLE_GUID** -c command in your output. Does this return the same value?

    • Marcel

      For anyone who may be reading this more recently: I also had a few inaccessible items and also got failed returns for the above command, but not when I dropped the -c at the end:

      /usr/lib/vmware/osfs/bin/objtool getAttr --bypassDom -u *uuid*

  4. Wanly

    I have the same problem. The vmdk and folders had been deleted. The objtool cannot find and delete the file.

  5. Sam

    Same as above. VCSA reported that the object is on this host, but after checking that host, the object could not be found.

  6. Noel de Gabriele

    For those of you who are getting "lsom object not found", I had the same issue. I managed to delete the object by using the following (ForceDelete):

    /usr/lib/vmware/osfs/bin/objtool delete -u **INACCESSIBLE_GUID** -c -f

    Obviously, backup first 🙂

