As explained in the Elastic Dedupe Engine section of the Nutanix Bible, Nutanix uses a fingerprint (SHA-1 hash) to find and remove duplicate data.
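
To make the fingerprinting idea concrete, here is a minimal Python sketch of hash-based deduplication. It is only an illustration of the general technique, not Nutanix's implementation: the 4 KB chunk size, the in-memory dictionary, and the function names (`fingerprint`, `dedupe`, `rebuild`) are all assumptions made for the example.

```python
import hashlib

# Illustrative chunk size only; this is an assumption, not a Nutanix setting.
CHUNK_SIZE = 4096


def fingerprint(chunk: bytes) -> str:
    """Return the SHA-1 fingerprint of a chunk as a hex string."""
    return hashlib.sha1(chunk).hexdigest()


def dedupe(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Split data into fixed-size chunks and keep each unique chunk once."""
    store = {}   # fingerprint -> chunk bytes (each unique chunk stored once)
    recipe = []  # ordered fingerprints needed to rebuild the original data
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        fp = fingerprint(chunk)
        store.setdefault(fp, chunk)  # a duplicate chunk adds no new data
        recipe.append(fp)
    return store, recipe


def rebuild(store, recipe) -> bytes:
    """Reassemble the original data from the stored chunks."""
    return b"".join(store[fp] for fp in recipe)


# Simulate two cloned disks that share most of their content.
base = b"A" * CHUNK_SIZE + b"B" * CHUNK_SIZE
data = base + base + b"C" * CHUNK_SIZE
store, recipe = dedupe(data)
```

In this toy example the data spans five chunks but only three unique chunks need to be stored, which is the same effect that fingerprinting cloned VMs is meant to achieve.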

In this post I provide a sample script showing how you can use the Nutanix PowerShell cmdlets and remote SSH to manually fingerprint vDisks matching a specific search term. This comes in handy when you want to fingerprint / dedupe VMs that were provisioned before dedupe was enabled on the container.

Here’s an example script (inputs will need to be modified to match your environment):

This is the first of many more Nutanix PowerShell examples to come!

  • quinney_david

    Thanks for the script! I’m currently running it on a datastore that holds multiple clones of VMs created specifically for testing dedupe, so it will be interesting to see how much space can be saved, since on-disk dedupe was enabled after our NOS 4.0.1 upgrade.

    One thing: while the script is running, it mentions “Logs will be prefixed with vdisk_manipulator-18019”. Whereabouts are these logs stored on the CVM?

    Also, a closing “)” is missing from SecureStringToBSTR($password))

    • stevenpoitras

      That is awesome! Definitely would be interested in seeing how much space is saved. Were they all provisioned via VAAI? If so they should already be “thin” but will still be great data!

      You can find the vdisk_manipulator logs in ~/data/logs on the CVM.

      Thanks for spotting the typo, fixed! :P

  • Guest

    So far it’s saved approx. 118GB out of 432GB used. The dedupe takes some time; I notice the space saved goes up a few GB every day :D

    In this use case I specifically cloned these VMs from another datastore to avoid a VAAI clone. Not ideal, but this was to simulate what would happen if multiple VMs were storage vMotioned from another storage array to Nutanix in a migration scenario.

    So far I am seeing the most benefit from post-process compression, especially for infrastructure workloads. The savings are quite substantial.

