The Nutanix Bible

  1. Intro

  2. A Brief Lesson in History - beta!

  3. Book of Web-Scale 

  4. Book of Nutanix

  5. Book of vSphere

  6. Book of Hyper-V

  7. Revisions


Welcome to The Nutanix Bible!  I work with the Nutanix platform on a daily basis – trying to find issues, pushing its limits, and administering it for my production benchmarking lab.  This page serves as a living document outlining tips and tricks used every day by myself and a variety of engineers at Nutanix.  It will also include summary items discussed as part of the Advanced Nutanix series.

NOTE: What you see here is an under-the-covers look at how things work.  With that said, all topics discussed are abstracted by NDFS, and knowledge of them isn't required to successfully operate a Nutanix environment!



A Brief Lesson in History – beta!

Before we get started I wanted to go through a brief history and some key drivers which have led us to where we are today.

Evolution of the datacenter

In the beginning everything was natively converged…

The era of the mainframe…

The mainframe ruled for many years and laid the core foundation of where we are today.  It allowed companies to consolidate their critical workloads onto a single, highly reliable platform.

Key characteristics:

  • Natively converged CPU, main memory and storage
  • Engineered internal redundancy


  • $$$
  • Inherent complexity
  • Lack of flexibility and highly siloed

Attack of the pizza boxes…

With mainframes, it was very difficult for organizations within a business to leverage these capabilities, which partly led to the entrance of pizza boxes, or stand-alone servers.

Key characteristics:

  • CPU, main memory and DAS storage
  • Higher flexibility than the mainframe
  • Accessed over the network

Downsides:

  • Increased number of silos
  • Low, or unequal, resource utilization
  • Server became a single point of failure (SPOF) for compute AND storage

In comes centralized storage…

Businesses need to make money, and data is a key piece of that puzzle.  With DAS, orgs either A) needed more space than was locally available, or B) needed data HA so that a server failure wouldn't cause data unavailability.

Centralized storage gave them both: sharable, larger pools of storage which also provided data protection.

Key characteristics:

  • Pooled storage resources → better storage utilization
  • Centralized data protection via RAID → server loss didn't cause data loss
  • Storage I/Os are performed over the network

Downsides:

  • Potentially more expensive, however data is $$$
  • Complexity (SAN Fabric, WWPNs, RAID groups, volumes, spindle counts, etc.)
  • Another management tool / team

The birth of virtualization…

In this age compute utilization is low and resource efficiency impacts the bottom line.  Virtualization comes into the picture and allows multiple workloads / OS to run as VMs on a single piece of hardware.

Virtualization allows the business to increase utilization of the pizza boxes, but increases the number of silos and the impact of an outage.

Key characteristics:

  • Abstracting the OS from hardware (VM)
  • Very efficient compute utilization → workload consolidation

Downsides:

  • More silos and management complexity
  • Lack of VM high-availability → if a compute node fails the impact is much larger
  • Lack of pooled resources
  • And, another management tool / team

Virtualization becomes a teenager (…and starts dating)

The hypervisor has become a very efficient and feature-filled solution.  With the advent of things like VMware's vMotion, HA and DRS, users gained the ability to provide VM high-availability and migrate compute workloads dynamically.

The only caveat was the reliance on centralized storage, causing the compute and storage paths to merge.

The downside: the load on the storage array became much higher than before, and VM sprawl led to contention for storage I/O.

Key characteristics:

  • Clustering → pooled compute resources
  • Ability to dynamically migrate workloads between compute nodes (DRS / vMotion)
  • VM High-availability (HA) in the case of a compute node failure
  • Requirement for centralized storage

Downsides:

  • Higher demand on storage due to VM sprawl
  • Requirements to scale out more arrays creating more silos and more complexity
  • Higher $ / GB due to requirement of an array
  • Possibility of resource contention on array
  • Makes storage configuration much more complex due to the need to manage:
    • VM to datastore / LUN ratios
    • Spindle counts to facilitate I/O requirements

SSDs to the rescue! (…well we thought)

SSDs help alleviate this I/O bottleneck by providing much higher I/O performance without the need for tons of disk enclosures.

However, given the extreme advances in performance, the controllers and network haven't yet evolved to handle the vast I/O now available.

Key characteristics:

  • Much higher I/O characteristics than traditional HDD
  • Essentially eliminates seek times

Downsides:

  • Bottleneck shifted from storage I/O on disk to the controller / network
  • Silos still remain
  • Array configuration complexity still remains

A look at the importance of latency

Many of you may already be familiar with the graphic below, which characterizes the various latencies for specific types of I/O:


From the above we can see that the CPU can access its caches at anywhere from ~0.5-7ns (L1 vs. L2).  For main memory these accesses occur at ~100ns, whereas a local 4K SSD read is ~150,000ns or 0.15ms.

Now let’s do some math…

If we take a typical enterprise class SSD (in this case the Intel S3700 – SPEC), this device is capable of the following:

  • Random I/O performance:
    • Random 4K Reads: Up to 75,000 IOPS
    • Random 4K Writes: Up to 36,000 IOPS
  • Sequential bandwidth:
    • Sustained Sequential Read: Up to 500MB/s
    • Sustained Sequential Write: Up to 460MB/s
  • Latency:
    • Read: 50us
    • Write: 65us

First, let’s take a look at the bandwidth…

For traditional storage there are a few main types of media for I/O:

  • Fibre Channel (FC)
    • 4-, 8-, and 16-Gb
  • Ethernet (including FCoE)
    • 1-, 10-Gb, (40-Gb IB), etc.

For the calculation below, we are using the 500MB/s Read and 460MB/s Write BW available from the Intel S3700.

The calculation is done as follows:

numSSD = ROUNDUP((numConnections * connBW (in GB/s))/ ssdBW (R or W))

NOTE: numbers were rounded up as a partial SSD isn’t possible.  This also does not account for the necessary CPU required to handle all of the I/O and assumes unlimited controller CPU power.

                                                 SSDs required to max network BW
Controller Connectivity    Available Network BW          Read          Write
Dual 4Gb FC                8Gb == 1GB/s                     2              3
Dual 8Gb FC                16Gb == 2GB/s                    4              5
Dual 16Gb FC               32Gb == 4GB/s                    8              9
Dual 1Gb Eth               2Gb == 0.25GB/s                  1              1
Dual 10Gb Eth              20Gb == 2.5GB/s                  5              6

As you can see, if you wanted to leverage the theoretical maximum performance an SSD could offer, the network can become a bottleneck with anywhere from 1 to 9 SSDs depending on the type of networking leveraged.
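The table above can be reproduced with a short script.  A minimal sketch in Python – the function and variable names are mine; only the bandwidth figures come from the S3700 spec and the link speeds above:

```python
import math

# Sustained bandwidth of the Intel S3700 (from the spec above), in GB/s
SSD_READ_GBPS = 0.5    # 500 MB/s
SSD_WRITE_GBPS = 0.46  # 460 MB/s

# (label, available network bandwidth in GB/s)
links = [
    ("Dual 4Gb FC",   1.0),
    ("Dual 8Gb FC",   2.0),
    ("Dual 16Gb FC",  4.0),
    ("Dual 1Gb Eth",  0.25),
    ("Dual 10Gb Eth", 2.5),
]

def ssds_to_saturate(link_gbps, ssd_gbps):
    # numSSD = ROUNDUP(networkBW / ssdBW) -- a partial SSD isn't possible
    return math.ceil(link_gbps / ssd_gbps)

for label, bw in links:
    print(f"{label:15} read={ssds_to_saturate(bw, SSD_READ_GBPS)} "
          f"write={ssds_to_saturate(bw, SSD_WRITE_GBPS)}")
```

Running this prints the Read/Write columns of the table above.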

Now let’s look at the impact to memory latency…

Typical main memory latency is ~100ns (it will vary); from that we can perform the following calculations:

Local memory read latency = 100ns + [OS / hypervisor overhead]

Network memory read latency = 100ns + NW RTT latency + [2 x OS / hypervisor overhead]

If we assume a typical network RTT is ~0.5ms (will vary by switch vendor) which is ~500,000ns that would come down to:

Network memory read latency = 100ns + 500,000ns + [2 x OS / hypervisor overhead]

If we theoretically assume a very fast network with a 10,000ns RTT:

Network memory read latency = 100ns + 10,000ns + [2 x OS / hypervisor overhead]

What that means is that even with a theoretically fast network, there is a 10,000% overhead when compared to a non-network memory access.  With a slow network this can be upwards of a 500,000% latency overhead.
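The overhead percentages above fall directly out of the RTT-to-local-latency ratio.  A quick sketch of the arithmetic (ignoring the OS / hypervisor overhead terms, as the text does for the comparison):

```python
LOCAL_NS = 100  # main-memory read latency (~100ns, will vary)

def network_overhead_pct(rtt_ns, local_ns=LOCAL_NS):
    # Percentage overhead of a network memory access vs. a local one,
    # ignoring the OS / hypervisor overhead on both ends
    return rtt_ns / local_ns * 100

print(network_overhead_pct(500_000))  # typical ~0.5ms RTT -> 500000.0 (%)
print(network_overhead_pct(10_000))   # very fast network  -> 10000.0 (%)
```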

In order to alleviate this overhead, server-side caching technologies were introduced.

Making sense of it all…

Hopefully the above wasn’t a boring, dry math session, but rather objectively quantified some of the key drivers to where we are today.

In summary:

  1. Inefficient compute utilization led to the move to virtualization
  2. Features like vMotion, HA, and DRS led to the requirement of centralized storage
  3. VM sprawl led to increased load on and contention for storage
  4. SSDs came in to alleviate the issues but changed the bottleneck to the network / controllers
  5. Cache / memory accesses over the network face large overheads, minimizing their benefits
  6. Array configuration complexity still remains the same
  7. Server-side caches were introduced to alleviate the load on the array / impact of the network, however they introduce another component to the solution
  8. Nutanix arrives to provide the benefits of features like HA, vMotion and DRS while using local attached SSD and HDD
  9. Locality helps alleviate the bottlenecks / overheads traditionally faced when going over the network
  10. Shifts the focus from infrastructure to ease of management and simplifying the stack

Hopefully that helped shed some light on where we came from and a few of the drivers for the decisions we've made below.  In comes web-scale…

Book of Web-Scale

This section will be a primer to allow you to understand some of the core concepts behind “web-scale” infrastructure and why we leverage them.  Before I get started, I just want to clearly state that web-scale doesn’t mean you need to be “web-scale” (e.g. Google, Facebook, Microsoft).  These constructs are applicable and beneficial at any scale (3 nodes or thousands of nodes).

Before we get started, let’s put some of the historical challenges out on the table:

  • Complexity, complexity, complexity
  • Desire for incremental based growth
  • The want to be agile

There are a few key constructs used when talking about “Web-scale” infrastructure:

  • Hyper-convergence
  • Software defined intelligence
  • Distributed autonomous systems
  • Incremental and linear scale out

Other related items:

  • API-based automation and rich analytics
  • Self-healing

Now that we know all of the buzz words, we’ll break down these items and give a technical perspective on what they actually mean below.


Hyper-Convergence

No matter where you go you’ll hear differing opinions on what hyper-convergence actually is.  It will also vary based upon the scope of components (e.g. virtualization, networking, etc.).  However, the core concept comes down to the following: natively combining two or more components into a single unit.  Natively is the key word here; to be most effective, the components must be natively integrated and not just bundled together.  In the case of Nutanix, we natively converge compute + storage to form a single node used in our appliance.  For others this might be converging storage with the network, etc.

What it really means:

  • Natively integrating two or more components into a single unit which can be easily scaled

Benefits:

  • Single unit to scale
  • Localized I/O
  • Eliminates traditional compute / storage silos by converging them

Software-defined Intelligence

Software-defined intelligence is taking the core logic from normally proprietary or specialized hardware (e.g. ASIC / FPGA) and doing it in software on commodity hardware.  For Nutanix, we take the traditional storage logic (e.g. RAID, deduplication, compression, etc.) and put it into software which runs in each of the Nutanix CVMs on standard x86 hardware.

What it really means:

  • Pulling key logic from hardware and doing it in software on commodity hardware

Benefits:

  • Rapid release cycles
  • Elimination of proprietary hardware reliance
  • Utilization of commodity hardware for better economics

Distributed Autonomous Systems

This really comes down to moving away from the traditional concept of having a single unit responsible for doing something, and instead distributing that role among all nodes within the cluster.  You can think of this as creating a purely distributed system.  Traditionally, vendors have assumed that hardware will be reliable, which, in most cases, can be true.  However, core to distributed systems is the idea that hardware will eventually fail, and handling that fault in an elegant and non-disruptive way is key.

These distributed systems are designed to accommodate and remediate failure, to form something that is self-healing and autonomous.  In the event of a component failure, the system will transparently handle and remediate the failure, continuing to operate as expected.  Alerting will make the user aware, but rather than being a critical time-sensitive item, any remediation (e.g. replace a failed node) can be done on the admin’s schedule.  Another way to put it is fail in-place (rebuild without replace).

For items where a “master” is needed, an election process is utilized; in the event this master fails, a new master is elected.  To distribute the processing of tasks, MapReduce concepts are leveraged.

What it really means:

  • Distributing roles and responsibilities to all nodes within the system
  • Utilizing concepts like MapReduce to perform distributed processing of tasks
  • Using an election process in the case where a “master” is needed

Benefits:

  • Eliminates any single points of failure (SPOF)
  • Distributes workload to eliminate any bottlenecks

Incremental and linear scale out

Incremental and linear scale out relates to the ability to start with a certain set of resources and, as needed, scale them out while linearly increasing the performance of the system.  All of the constructs mentioned above are critical enablers in making this a reality.

For example, traditionally you’d have 3 layers of components for running virtual workloads: servers, storage, and network – all of which are scaled independently.  As an example, when you scale out the number of servers you’re not scaling out your storage performance.

With a hyper-converged platform like Nutanix, when you scale out with new node(s) you’re scaling out:

  • The number of hypervisor / compute nodes
  • The number of storage controllers
  • The compute & storage performance / capacity
  • The number of nodes participating in cluster wide operations

What it really means:

  • The ability to incrementally scale storage / compute with linear increases to performance / ability

Benefits:

  • Ability to start small and scale
  • Elimination of any bottlenecks
  • Uniform and consistent performance at any scale

Book of Nutanix


Converged Platform

For a video explanation you can watch the following video:

The Nutanix solution is a converged storage + compute solution which leverages local components and creates a distributed platform for virtualization, aka a virtual computing platform.  The solution is a bundled hardware + software appliance which houses 2 nodes (6000/7000 series) or 4 nodes (1000/2000/3000/3050 series) in a 2U footprint.

Each node runs an industry standard hypervisor (currently ESXi, KVM, or Hyper-V) and the Nutanix Controller VM (CVM).  The Nutanix CVM is what runs the Nutanix software and serves all of the I/O operations for the hypervisor and all VMs running on that host.  For the Nutanix units running VMware vSphere, the SCSI controller, which manages the SSD and HDD devices, is directly passed to the CVM leveraging VM-Direct Path (Intel VT-d).  In the case of Hyper-V the storage devices are passed through to the CVM.

Below is an example of what a typical node logically looks like:

[Image: NDFS_NodeDetail2]

Together, a group of Nutanix nodes forms a distributed platform called the Nutanix Distributed Filesystem (NDFS).  NDFS appears to the hypervisor like any centralized storage array, however all of the I/Os are handled locally to provide the highest performance.  More detail on how these nodes form a distributed system can be found below.

Below is an example of how these Nutanix nodes form NDFS:

[Image: CVM_Dist]


Cluster Components

For a video explanation you can watch the following video:

The Nutanix platform is composed of the following high-level components:

Cassandra
  • Key Role: Distributed metadata store
  • Description: Cassandra stores and manages all of the cluster metadata in a distributed ring-like manner, based upon a heavily modified Apache Cassandra.  The Paxos algorithm is utilized to enforce strict consistency.  This service runs on every node in the cluster.  Cassandra is accessed via an interface called Medusa.

Zeus
  • Key Role: Cluster configuration manager
  • Description: Zeus stores all of the cluster configuration including hosts, IPs, state, etc. and is based upon Apache Zookeeper.  This service runs on three nodes in the cluster, one of which is elected as a leader.  The leader receives all requests and forwards them to its peers.  If the leader fails to respond, a new leader is automatically elected.  Zookeeper is accessed via an interface called Zeus.

Stargate
  • Key Role: Data I/O manager
  • Description: Stargate is responsible for all data management and I/O operations and is the main interface from the hypervisor (via NFS, iSCSI or SMB).  This service runs on every node in the cluster in order to serve localized I/O.

Curator
  • Key Role: MapReduce cluster management and cleanup
  • Description: Curator is responsible for managing and distributing tasks throughout the cluster, including disk balancing, proactive scrubbing, and many more items.  Curator runs on every node and is controlled by an elected Curator Master who is responsible for task and job delegation.  There are two scan types for Curator: a full scan which occurs around every 6 hours and a partial scan which occurs every hour.

Prism
  • Key Role: UI and API
  • Description: Prism is the management gateway for components and administrators to configure and monitor the Nutanix cluster.  This includes Ncli, the HTML5 UI and the REST API.  Prism runs on every node in the cluster and uses an elected leader like all components in the cluster.

Genesis
  • Key Role: Cluster component & service manager
  • Description: Genesis is a process which runs on each node and is responsible for any service interactions (start/stop/etc.) as well as for the initial configuration.  Genesis runs independently of the cluster and does not require the cluster to be configured/running.  The only requirement for Genesis to be running is that Zookeeper is up and running.  The cluster_init and cluster_status pages are displayed by the Genesis process.

Chronos
  • Key Role: Job and task scheduler
  • Description: Chronos is responsible for taking the jobs and tasks resulting from a Curator scan and scheduling/throttling tasks among nodes.  Chronos runs on every node and is controlled by an elected Chronos Master, who is responsible for task and job delegation and runs on the same node as the Curator Master.

Cerebro
  • Key Role: Replication/DR manager
  • Description: Cerebro is responsible for the replication and DR capabilities of NDFS.  This includes the scheduling of snapshots, the replication to remote sites, and the site migration/failover.  Cerebro runs on every node in the Nutanix cluster and all nodes participate in replication to remote clusters/sites.

Pithos
  • Key Role: vDisk configuration manager
  • Description: Pithos is responsible for vDisk (NDFS file) configuration data.  Pithos runs on every node and is built on top of Cassandra.

Data Structure

The Nutanix Distributed Filesystem is composed of the following high-level structs:

Storage Pool
  • Key Role: Group of physical devices
  • Description: A storage pool is a group of physical storage devices including PCIe SSD, SSD, and HDD devices for the cluster.  The storage pool can span multiple Nutanix nodes and is expanded as the cluster scales.  In most configurations only a single storage pool is leveraged.
Container
  • Key Role: Group of VMs/files
  • Description: A container is a logical segmentation of the Storage Pool and contains a group of VMs or files (vDisks).  Some configuration options (e.g. RF) are configured at the container level, however they are applied at the individual VM/file level.  Containers typically have a 1 to 1 mapping with a datastore (in the case of NFS/SMB).

vDisk
  • Key Role: vDisk
  • Description: A vDisk is any file over 512KB on NDFS, including .vmdks and VM hard disks.  vDisks are composed of extents which are grouped and stored on disk as an extent group.

Below we show how these map between NDFS and the hypervisor:

[Image: SP_structure]

Extent
  • Key Role: Logically contiguous data
  • Description: An extent is a 1MB piece of logically contiguous data which consists of n number of contiguous blocks (varies depending on guest OS block size).  Extents are written/read/modified on a sub-extent basis (aka slice) for granularity and efficiency.  An extent’s slice may be trimmed when moving into the cache depending on the amount of data being read/cached.
Extent Group
  • Key Role: Physically contiguous stored data
  • Description: An extent group is a 1MB or 4MB piece of physically contiguous stored data.  This data is stored as a file on the storage device owned by the CVM.  Extents are dynamically distributed among extent groups to provide data striping across nodes/disks to improve performance.  NOTE: as of 4.0, extent groups can be either 1MB or 4MB depending on dedupe.

Below we show how these structs relate between the various filesystems:

[Image: NDFS_DataLayout_Text]

Here is another graphical representation of how these units are logically related:

[Image: NDFS_DataStructure3]
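The vDisk → extent → extent group layout described above lends itself to a small illustration.  A toy sketch, assuming 1MB extents naively packed in order into 4MB extent groups – in reality extents are dynamically distributed among extent groups, so the slot calculation here is purely illustrative:

```python
EXTENT_SIZE = 1 * 1024 * 1024        # extents: 1MB of logically contiguous data
EXTENT_GROUP_SIZE = 4 * 1024 * 1024  # extent groups: 1MB or 4MB on disk (4MB here)

def locate(vdisk_offset):
    # Map a byte offset within a vDisk to (extent index, offset within that
    # extent, slot within a hypothetical naively-packed extent group)
    extent_id = vdisk_offset // EXTENT_SIZE
    offset_in_extent = vdisk_offset % EXTENT_SIZE
    egroup_slot = extent_id % (EXTENT_GROUP_SIZE // EXTENT_SIZE)
    return extent_id, offset_in_extent, egroup_slot

# e.g. a read 5MB + 10 bytes into a vDisk touches extent 5, 10 bytes in
print(locate(5 * 1024 * 1024 + 10))
```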

I/O Path Overview

For a video explanation you can watch the following video:

The Nutanix I/O path is composed of the following high-level components:

[Image: NDFS_IO_basev5]

OpLog
  • Key Role: Persistent write buffer
  • Description: The OpLog is similar to a filesystem journal and is built to handle bursty writes, coalesce them, and then sequentially drain the data to the extent store.  Upon a write, the OpLog is synchronously replicated to another n number of CVMs’ OpLogs before the write is acknowledged, for data availability purposes.  All CVM OpLogs partake in the replication and are dynamically chosen based upon load.  The OpLog is stored on the SSD tier of the CVM to provide extremely fast write I/O performance, especially for random I/O workloads.  For sequential workloads the OpLog is bypassed and the writes go directly to the extent store.  If data is currently sitting in the OpLog and has not been drained, all read requests will be fulfilled directly from the OpLog until it has been drained, at which point they are served by the extent store/content cache.  For containers where fingerprinting (aka dedupe) has been enabled, all write I/Os will be fingerprinted using a hashing scheme, allowing them to be deduped based upon fingerprint in the content cache.
Extent Store
  • Key Role: Persistent data storage
  • Description: The Extent Store is the persistent bulk storage of NDFS and spans SSD and HDD and is extensible to facilitate additional devices/tiers.  Data entering the extent store is either being A) drained from the OpLog or B) is sequential in nature and has bypassed the OpLog directly.  Nutanix ILM will determine tier placement dynamically based upon I/O patterns and will move data between tiers.
Content Cache
  • Key Role: Dynamic read cache
  • Description: The Content Cache (aka “Elastic Dedupe Engine”) is a deduped read cache which spans both the CVM’s memory and SSD.  Upon a read request for data not in the cache (or based upon a particular fingerprint), the data will be placed into the single-touch pool of the content cache, which sits completely in memory, where it will use LRU until it is ejected from the cache.  Any subsequent read request will “move” (no data is actually moved, just cache metadata) the data into the memory portion of the multi-touch pool, which consists of both memory and SSD.  From here there are two LRU cycles: one for the in-memory piece, upon which eviction will move the data to the SSD section of the multi-touch pool where a new LRU counter is assigned.  Any read request for data in the multi-touch pool will cause the data to go to the peak of the multi-touch pool where it will be given a new LRU counter.  Fingerprinting is configured at the container level and can be configured via the UI.  By default fingerprinting is disabled.

Below we show a high-level overview of the Content Cache:


Extent Cache
  • Key Role: In-memory read cache
  • Description: The Extent Cache is an in-memory read cache that sits completely in the CVM’s memory.  It stores non-fingerprinted extents for containers where fingerprinting and dedupe are disabled.  As of version 3.5 this is separate from the Content Cache, however the two will be merged in a subsequent release.
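The single-touch / multi-touch pool behavior described for the Content Cache can be sketched with two LRU lists.  This is a toy model of the idea (promotion on a second touch, LRU eviction per pool), not the actual Content Cache implementation; the class and pool sizes are mine:

```python
from collections import OrderedDict

class TwoPoolCache:
    """Toy sketch of a single-touch / multi-touch read cache."""
    def __init__(self, single_cap=4, multi_cap=4):
        self.single = OrderedDict()  # first-touch data, LRU-evicted
        self.multi = OrderedDict()   # promoted here on a subsequent touch
        self.single_cap, self.multi_cap = single_cap, multi_cap

    def read(self, fingerprint, fetch):
        if fingerprint in self.multi:
            # repeat hit: move to the MRU end ("peak") of the multi-touch pool
            self.multi.move_to_end(fingerprint)
            return self.multi[fingerprint]
        if fingerprint in self.single:
            # second touch: promote out of the single-touch pool
            data = self.single.pop(fingerprint)
            self.multi[fingerprint] = data
            if len(self.multi) > self.multi_cap:
                self.multi.popitem(last=False)  # evict LRU entry
            return data
        data = fetch()                          # cache miss: read from store
        self.single[fingerprint] = data
        if len(self.single) > self.single_cap:
            self.single.popitem(last=False)
        return data
```

Note the real cache also splits the multi-touch pool across memory and SSD with separate LRU cycles, which this sketch collapses into one list.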

Drive Breakdown

In this section I’ll cover how the various storage devices (SSD / HDD) are broken down, partitioned and utilized by the Nutanix platform. NOTE: All of the capacities used are in Base2 Gibibyte (GiB) instead of the Base10 Gigabyte (GB).  Formatting of the drives with a filesystem and associated overheads has also been taken into account.
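Since drives are marketed in Base10 gigabytes while the platform accounts in Base2 gibibytes, a quick conversion helps when checking the figures below.  A minimal sketch (the function name is mine, and this ignores the filesystem formatting overhead the text mentions):

```python
def gb_to_gib(gb):
    # Vendors market drives in base-10 GB; the platform accounts in
    # base-2 GiB, so a "400GB" SSD is only ~372.5 GiB before formatting
    return gb * 10**9 / 2**30

print(round(gb_to_gib(400), 1))   # a 400GB SSD in GiB
print(round(gb_to_gib(1000), 1))  # a 1TB HDD in GiB
```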

SSD Devices

SSD devices store a few key items which are explained in greater detail above:

  • Nutanix Home (CVM core)
  • Cassandra (metadata storage) – MORE
  • OpLog (persistent write buffer) – MORE
  • Extent Store (persistent storage) – MORE

Below we show an example of the storage breakdown for a Nutanix node’s SSD(s):


NOTE: The sizing for the OpLog is done dynamically as of release 4.0.1, which allows the extent store portion to grow dynamically.  The values used assume a completely utilized OpLog.  Graphics and proportions aren’t drawn to scale.  When evaluating the Remaining GiB capacities, do so from the top down.  For example, the Remaining GiB to be used for the OpLog calculation would be after Nutanix Home and Cassandra have been subtracted from the formatted SSD capacity.

Most models ship with 1 or 2 SSDs, however the same construct applies for models shipping with more SSD devices.

For example, if we apply this to an example 3060 or 6060 node which has 2 x 400GB SSDs, this would give us 100GiB of OpLog, 40GiB of Content Cache and ~440GiB of Extent Store SSD capacity per node.  Storage for Cassandra is a minimum reservation and may be larger depending on the quantity of data.


For a 3061 node which has 2 x 800GB SSDs this would give us 100GiB of OpLog, 40GiB of Content Cache and ~1.1TiB of Extent Store SSD capacity per node.


HDD Devices

Since HDD devices are primarily used for bulk storage, their breakdown is much simpler:

  • Curator Reservation (Curator storage) – MORE
  • Extent Store (persistent storage)

[Image: NDFS_HDD_breakdown]

For example, if we apply this to an example 3060 node which has 4 x 1TB HDDs, this would give us 80GiB reserved for Curator and ~3.4TiB of Extent Store HDD capacity per node.

[Image: NDFS_HDD_3060]

For a 6060 node which has 4 x 4TB HDDs, this would give us 80GiB reserved for Curator and ~14TiB of Extent Store HDD capacity per node.

[Image: NDFS_HDD_6060]

NOTE: the above values are accurate as of 4.0.1 and may vary by release.

How It Works

Data Protection

For a video explanation you can watch the following video:

The Nutanix platform currently uses a resiliency factor, aka replication factor (RF), and checksums to ensure data redundancy and availability in the case of a node or disk failure or corruption.  As explained above, the OpLog acts as a staging area to absorb incoming writes onto a low-latency SSD tier.  Upon being written to the local OpLog, the data is synchronously replicated to another one or two Nutanix CVMs’ OpLogs (dependent on RF) before being acknowledged (Ack) as a successful write to the host.  This ensures that the data exists in at least two or three independent locations and is fault tolerant.

NOTE: For RF3 a minimum of 5 nodes is required since metadata will be RF5.  Data RF is configured via Prism and is done at the container level.

All nodes participate in OpLog replication to eliminate any “hot nodes” and ensure linear performance at scale.  While the data is being written, a checksum is computed and stored as part of its metadata.  Data is then asynchronously drained to the extent store where the RF is implicitly maintained.  In the case of a node or disk failure, the data is re-replicated among all nodes in the cluster to maintain the RF.  Any time the data is read, the checksum is computed to ensure the data is valid.  If the checksum and data don’t match, the replica of the data will be read and will replace the non-valid copy.

Below we show an example of what this logically looks like:

[Image: NDFS_OplogReplication]
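The write path above – synchronous replication to peer OpLogs plus a stored checksum before the Ack – can be modeled in a few lines.  A toy sketch under loose assumptions: the CVM structures, the use of SHA-1 for the checksum, and random peer selection are all stand-ins, not the real implementation:

```python
import hashlib
import random

def write_with_rf(data, local_cvm, peer_cvms, rf=2):
    """Toy model of an RF write: replicate to (rf - 1) peer OpLogs,
    store a checksum with the metadata, then acknowledge."""
    checksum = hashlib.sha1(data).hexdigest()  # later reads re-verify this
    replicas = [local_cvm]
    # peers are chosen dynamically based on load; random stands in here
    replicas += random.sample(peer_cvms, rf - 1)
    for cvm in replicas:
        cvm["oplog"].append((checksum, data))  # synchronous replication
    # only after all replicas hold the data is the write acknowledged
    return {"ack": True, "checksum": checksum, "replicas": len(replicas)}
```

With rf=2 the data lands in two independent OpLogs before the host sees the Ack; with rf=3, three.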

Data Locality

For a video explanation you can watch the following video:

Being a converged (compute + storage) platform, I/O and data locality are key to cluster and VM performance with Nutanix.  As explained above in the I/O path, all read/write I/Os are served by the local Controller VM (CVM) which is on each hypervisor host, adjacent to the normal VMs.  A VM’s data is served locally from the CVM and sits on local disks under the CVM’s control.

When a VM is moved from one hypervisor node to another (or during an HA event), the newly migrated VM’s data will be served by the now local CVM.  When reading old data (stored on the now remote node/CVM), the I/O will be forwarded by the local CVM to the remote CVM.  All write I/Os will occur locally right away.  NDFS will detect that the I/Os are occurring from a different node and will migrate the data locally in the background, allowing all read I/Os to then be served locally.  The data will only be migrated on a read as to not flood the network.

Below we show an example of how data will “follow” the VM as it moves between hypervisor nodes:

[Image: NDFS_Locality3]

Scalable Metadata

For a video explanation you can watch the following video:

Metadata is at the core of any intelligent system and is even more critical for any filesystem or storage array.  In terms of NDFS, there are a few key properties that are critical for its success: it has to be right 100% of the time (aka “strictly consistent”), it has to be scalable, and it has to perform at massive scale.  As mentioned in the architecture section above, NDFS utilizes a “ring-like” structure as a key-value store which stores essential metadata as well as other platform data (e.g. stats, etc.).

In order to ensure metadata availability and redundancy, a RF is utilized among an odd number of nodes (e.g. 3, 5, etc.).  Upon a metadata write or update, the row is written to a node in the ring and then replicated to n number of peers (where n is dependent on cluster size).  A majority of nodes must agree before anything is committed, which is enforced using the Paxos algorithm.  This ensures strict consistency for all data and metadata stored as part of the platform.

Below we show an example of a metadata insert/update for a 4 node cluster:


Performance at scale is another important property of NDFS metadata.  Contrary to traditional dual-controller or “master” models, each Nutanix node is responsible for a subset of the overall platform’s metadata.  This eliminates the traditional bottlenecks by allowing metadata to be served and manipulated by all nodes in the cluster.  A consistent hashing scheme is utilized to minimize the redistribution of keys during cluster size modifications (aka “add/remove node”).  When the cluster scales (e.g. from 4 to 8 nodes), the nodes are inserted throughout the ring between nodes for “block awareness” and reliability.

Below we show an example of the metadata “ring” and how it scales:

[Image: Cassandra_ring]
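The consistent hashing behavior described above can be illustrated with a minimal ring: each key maps to the next node clockwise, so adding a node only takes over the keys between it and its predecessor.  This is a generic sketch (one point per node, MD5 as the hash), not Cassandra’s actual implementation:

```python
import bisect
import hashlib

def _h(key):
    # Hash a string onto the ring's keyspace
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class Ring:
    """Minimal consistent-hash ring: a key is owned by the first node
    at or after its hash position (wrapping around)."""
    def __init__(self, nodes):
        self.points = sorted((_h(n), n) for n in nodes)

    def owner(self, key):
        hashes = [p[0] for p in self.points]
        i = bisect.bisect(hashes, _h(key)) % len(self.points)
        return self.points[i][1]

    def add(self, node):
        # Only keys between the new node and its predecessor move
        bisect.insort(self.points, (_h(node), node))
```

Adding a fifth node to a four-node ring relocates only the keys the new node takes over; every other key keeps its owner, which is exactly the "minimize redistribution" property the text describes.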

Shadow Clones

For a video explanation you can watch the following video:

The Nutanix Distributed Filesystem has a feature called ‘Shadow Clones’ which allows for distributed caching of particular vDisks or VM data which is in a ‘multi-reader’ scenario.  A great example of this is a VDI deployment, where many ‘linked clones’ forward read requests to a central master or ‘Base VM’.  In the case of VMware View this is called the replica disk and is read by all linked clones; in XenDesktop this is called the MCS Master VM.  This will also work in any other multi-reader scenario (e.g. deployment servers, repositories, etc.).

Data or I/O locality is critical for the highest possible VM performance and a key construct of NDFS.  With Shadow Clones, NDFS will monitor vDisk access trends similar to what it does for data locality.  However, in the case where requests are occurring from more than two remote CVMs (as well as the local CVM), and all of the requests are read I/O, the vDisk will be marked as immutable.  Once the disk has been marked as immutable, the vDisk can then be cached locally by each CVM making read requests to it (aka Shadow Clones of the base vDisk).  This allows VMs on each node to read the Base VM’s vDisk locally.

In the case of VDI, this means the replica disk can be cached by each node and all read requests for the base will be served locally.  NOTE: The data will only be migrated on a read as to not flood the network and to allow for efficient cache utilization.  In the case where the Base VM is modified, the Shadow Clones will be dropped and the process will start over.  Shadow Clones are enabled by default (as of 4.0.2) and can be enabled/disabled using the following NCLI command: ncli cluster edit-params enable-shadow-clones=<true/false>.

Below we show an example of how Shadow Clones work and allow for distributed caching:

[Image: ndfs_shadowclone_14pt]
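The immutability trigger described above – more than two remote CVMs issuing requests, and every request being a read – can be sketched as a simple predicate.  The access-log shape here is hypothetical, purely to make the rule concrete:

```python
def should_shadow_clone(access_log, local_cvm):
    """Sketch of the Shadow Clone trigger: mark a vDisk immutable when
    more than two remote CVMs (beyond the local one) are accessing it
    and every recorded operation is a read."""
    # a single write disqualifies the vDisk (and would drop existing clones)
    if any(op != "read" for _, op in access_log):
        return False
    remote = {cvm for cvm, _ in access_log if cvm != local_cvm}
    return len(remote) > 2
```

Once the predicate holds, each remote CVM could cache the now-immutable vDisk locally and serve its reads without crossing the network.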

Elastic Dedupe Engine

For a video explanation you can watch the following video below:

The Elastic Dedupe Engine is a software-based feature of NDFS which allows for data deduplication in the capacity (HDD) and performance (SSD/Memory) tiers.  Streams of data are fingerprinted during ingest using a SHA-1 hash at a 16K granularity.  This fingerprinting is only done on data ingest and is then stored persistently as part of the written block’s metadata.  NOTE: Initially a 4K granularity was used for fingerprinting, however after testing 16K offered the best blend of dedupability and reduced metadata overhead.  When deduped data is pulled into the cache this is done at 4K.

Contrary to traditional approaches which utilize background scans requiring the data to be re-read, Nutanix performs the fingerprinting in-line on ingest.  For duplicate data in the capacity tier the data does not need to be scanned or re-read; the duplicate copies can simply be removed. Below we show an example of how the Elastic Dedupe Engine scales and handles local VM I/O requests: NDFS_EDE_OnDisk2

Fingerprinting is done during ingest of data with an I/O size of 64K or greater.  Intel acceleration is leveraged for the SHA-1 computation, which accounts for very minimal CPU overhead.  In cases where fingerprinting is not done during ingest (eg. smaller I/O sizes), it can be done as a background process. The Elastic Dedupe Engine spans both the capacity tier (HDD) and the performance tier (SSD/Memory).  As duplicate data is determined, based upon multiple copies of the same fingerprints, a background process will remove the duplicate data using the NDFS MapReduce framework (Curator). For data that is being read, the data will be pulled into the NDFS Content Cache which is a multi-tier/pool cache.  Any subsequent requests for data having the same fingerprint will be pulled directly from the cache.  
To learn more about the Content Cache and pool structure, please refer to the ‘Content Cache’ sub-section in the I/O path overview, or click HERE. Below we show an example of how the Elastic Dedupe Engine interacts with the NDFS I/O path: NDFS_IO_FPv3 You can view the current deduplication rates via Prism on the Storage > Dashboard page.
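The fingerprint-and-dedupe idea above can be shown in a few lines.  This is a minimal sketch, not the NDFS implementation: SHA-1 fingerprints are computed once per 16K chunk on ingest, and chunks with an already-seen fingerprint are stored only as a metadata reference:

```python
import hashlib

CHUNK = 16 * 1024  # fingerprint granularity described above (16K)

def fingerprints(data: bytes):
    # SHA-1 fingerprint per 16K chunk, computed once on ingest.
    return [hashlib.sha1(data[i:i + CHUNK]).hexdigest()
            for i in range(0, len(data), CHUNK)]

def dedupe(streams):
    # Map fingerprint -> single stored copy; duplicates add only metadata.
    store, refs = {}, []
    for data in streams:
        for i, fp in enumerate(fingerprints(data)):
            store.setdefault(fp, data[i * CHUNK:(i + 1) * CHUNK])
            refs.append(fp)
    return store, refs

stream_a = b"x" * CHUNK + b"y" * CHUNK
stream_b = b"y" * CHUNK + b"z" * CHUNK   # shares the "y" chunk with stream_a
store, refs = dedupe([stream_a, stream_b])
print(len(refs), "logical chunks ->", len(store), "stored chunks")  # 4 -> 3
```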


Compression

For a video explanation you can watch the following video below:

The Nutanix Capacity Optimization Engine (COE) is responsible for performing data transformations to increase data efficiency on disk.  Currently, compression is one of the key features of the COE used to perform data optimization. NDFS provides both in-line and post-process flavors of compression to best suit the customer’s needs and type of data.  In-line compression will compress sequential streams of data or large I/O sizes in memory before they are written to disk, while post-process compression will initially write the data as normal (in an un-compressed state) and then leverage the Curator framework to compress the data cluster wide. When in-line compression is enabled but the I/Os are random in nature, the data will be written un-compressed in the OpLog, coalesced, and then compressed in memory before being written to the Extent Store. The Google Snappy compression library is leveraged, which provides good compression ratios with minimal computational overhead and extremely fast compression/decompression rates. Below we show an example of how in-line compression interacts with the NDFS write I/O path: NDFS_InComp

For post-process compression all new write I/O is written in an un-compressed state and follows the normal NDFS I/O path.  After the compression delay (configurable) is met and the data has become cold (down-migrated to the HDD tier via ILM), the data is eligible to be compressed. Post-process compression uses the Curator MapReduce framework and all nodes will perform compression tasks.  Compression tasks will be throttled by Chronos. Below we show an example of how post-process compression interacts with the NDFS write I/O path: NDFS_PpCompv2

For read I/O the data is first decompressed in memory and then the I/O is served.  For data that is heavily accessed, the data will become decompressed in the HDD tier and can then leverage ILM to move up to the SSD tier as well as be stored in the cache.
Below we show an example of how decompression interacts with the NDFS I/O path during read: NDFS_Comp_Readv2   You can view the current compression rates via Prism on the Storage > Dashboard page.
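The in-line vs. deferred decision above can be sketched as a simple write path.  NDFS uses Snappy; zlib is substituted here only so the sketch is stdlib-runnable, and the 64K "large I/O" cutoff is an assumption for illustration:

```python
import zlib

# Assumed cutoff for "large" I/O in this sketch; NDFS uses Snappy, not zlib.
INLINE_MIN_IO = 64 * 1024

def write(data: bytes, sequential: bool):
    if sequential or len(data) >= INLINE_MIN_IO:
        # In-line path: compress in memory before the write hits disk.
        return {"compressed": True, "payload": zlib.compress(data)}
    # Random/small I/O lands uncompressed (OpLog) and is compressed later
    # by the post-process (Curator) pass once it has been coalesced.
    return {"compressed": False, "payload": data}

large = write(b"A" * (128 * 1024), sequential=True)
small = write(b"B" * 4096, sequential=False)
print(large["compressed"], len(large["payload"]) < 128 * 1024)  # True True
print(small["compressed"])                                      # False
```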

Networking and I/O

For a video explanation you can watch the following video:

The Nutanix platform does not leverage any backplane for inter-node communication and only relies on a standard 10GbE network.  All storage I/O for VMs running on a Nutanix node is handled by the hypervisor on a dedicated private network.  The hypervisor will forward the I/O request to the private IP of the local CVM.  The CVM will then perform the remote replication with other Nutanix nodes using its external IP over the public 10GbE network. Read requests will, in most cases, be served completely locally and never touch the 10GbE network. This means that the only traffic touching the public 10GbE network will be NDFS remote replication traffic and VM network I/O.  There will, however, be cases where the CVM will forward requests to other CVMs in the cluster, such as a CVM being down or data being remote.  Also, cluster-wide tasks such as disk balancing will temporarily generate I/O on the 10GbE network. Below we show an example of how the VM’s I/O path interacts with the private and public 10GbE network: NDFS_Network

Data Path Resiliency

For a video explanation you can watch the following video:

Reliability and resiliency are key, if not the most important, concepts within NDFS or any primary storage platform.

Contrary to traditional architectures which are built around the idea that hardware will be reliable; Nutanix takes a different approach: it expects hardware will eventually fail.  By doing so, the system is designed to handle these failures in an elegant and non-disruptive manner.

NOTE: that doesn’t mean the hardware quality isn’t there, just a concept shift.  The Nutanix hardware and QA teams undergo an exhaustive qualification and vetting process.

Potential levels of failure

Being a distributed system, NDFS is built to handle component, service and CVM failures, which can be characterized at a few levels:

  • Disk Failure
  • CVM “Failure”
  • Node Failure
Disk Failure

A disk failure can be characterized as just that: a disk which has either been removed, had a die failure, or is experiencing I/O errors and has been proactively removed.

VM impact:

  • HA event: No
  • Failed I/Os: No
  • Latency: No impact

In the event of a disk failure, a Curator scan (MapReduce Framework) will occur immediately.  It will scan the metadata (Cassandra) to find the data previously hosted on the failed disk and the nodes / disks hosting the replicas.

Once it has found the data that needs to be “re-replicated”, it will distribute the replication tasks to the nodes throughout the cluster.

An important thing to highlight here is that, given how Nutanix distributes data and replicas across all nodes / CVMs / disks, all nodes / CVMs / disks will participate in the re-replication.

This substantially reduces the time required for re-protection, as the power of the full cluster can be utilized; the larger the cluster, the faster the re-protection.

CVM “Failure”

A “CVM failure” can be characterized as a CVM power action causing the CVM to be temporarily unavailable.  The system is designed to transparently handle these gracefully.  In the event of a failure, I/Os will be re-directed to other CVMs within the cluster.  The mechanism for this will vary by hypervisor.

The rolling upgrade process actually leverages this capability as it will upgrade one CVM at a time, iterating through the cluster.

VM impact:

  • HA event: No
  • Failed I/Os: No
  • Latency: Potentially higher given I/Os over the network

In the event of a “CVM failure”, the I/Os which were previously being served from the down CVM will be forwarded to other CVMs throughout the cluster.  ESXi and Hyper-V handle this via a process called CVM Autopathing, which leverages HA.py (like “happy”): it will modify the routes to forward traffic going to the internal CVM address (192.168.5.2) to the external IP of other CVMs throughout the cluster.  This way the datastore remains intact, just the CVM responsible for serving the I/Os is remote.

Once the local CVM comes back up and is stable the route would be removed and the local CVM would take over all new I/Os.

In the case of KVM, iSCSI multi-pathing is leveraged where the primary path is the local CVM and the 2 other paths would be remote.  In the event where the primary path fails one of the other paths will become active.

Similar to Autopathing with ESXi and Hyper-V, when the local CVM comes back online, it’ll take over as the primary path.

Node Failure

VM Impact:

  • HA event: Yes
  • Failed I/Os: No
  • Latency: No impact

In the event of a node failure a VM HA event will occur restarting the VMs on other nodes throughout the virtualization cluster.  Once restarted the VMs will continue to perform I/Os as usual which will be handled by their local CVMs.

Similar to the disk failure scenario above, the same process will take place to re-protect the data, just for the full node (all associated disks).

In the event where the node remains down for a prolonged period of time, the down CVM will be removed from the metadata ring.  It will be joined back into the ring after it has been up and stable for a duration of time.

Disk Balancing

For a video explanation you can watch the following video below:

NDFS is designed to be a very dynamic platform which can react to various workloads and allows heterogeneous node types, compute heavy (3050, etc.) and storage heavy (60X0, etc.), to be mixed in a single cluster.  Ensuring uniform distribution of data is important when mixing nodes with larger storage capacities. NDFS has a native feature called disk balancing which is used to ensure uniform distribution of data throughout the cluster.  Disk balancing works on a node’s utilization of its local storage capacity and is integrated with NDFS ILM.  Its goal is to keep utilization uniform among nodes once the utilization has breached a certain threshold. Below we show an example of a mixed cluster (3050 + 6050) in an “unbalanced” state: NDFS_Diskbalancing_unbalanced

Disk balancing leverages the NDFS Curator framework and is run as a scheduled process as well as when a threshold has been breached (eg. local node capacity utilization > n %).  In the case where the data is not balanced, Curator will determine which data needs to be moved and will distribute the tasks to nodes in the cluster. In the case where the node types are homogeneous (eg. 3050), utilization should be fairly uniform. However, if certain VMs running on a node are writing much more data than others, a skew in the per-node capacity utilization can develop.  In this case disk balancing would run and move the coldest data on that node to other nodes in the cluster. In the case where the node types are heterogeneous (eg. 3050 + 6020/50/70), or where a node may be used in a “storage only” mode (not running any VMs), there will likely be a requirement to move data. Below we show an example of the mixed cluster in a “balanced” state after disk balancing has been run: NDFS_Diskbalancing_balanced

In some scenarios customers might run some nodes in a “storage only” state where only the CVM runs on the node, whose primary purpose is bulk storage capacity.
In this case the full node’s memory can be added to the CVM to provide a much larger read cache. Below we show an example of how a storage only node would look in a mixed cluster with disk balancing moving data to it from the active VM nodes: NDFS_Diskbalancing_storage
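A toy balancing pass can make the behavior above concrete.  The numbers, node names, threshold and 10GB move size below are made up for the sketch; they are not NDFS parameters:

```python
# Illustrative disk-balancing pass: when a node breaches the utilization
# threshold, its coldest data moves to the least-utilized node.
THRESHOLD = 0.75

def balance(nodes):
    # nodes: {name: {"capacity": GB, "used": GB}}
    while True:
        util = {n: d["used"] / d["capacity"] for n, d in nodes.items()}
        hot = max(util, key=util.get)
        cold = min(util, key=util.get)
        if util[hot] <= THRESHOLD or hot == cold:
            return nodes
        # Move a fixed-size slice of the coldest data off the hot node.
        move = min(10, nodes[hot]["used"])
        nodes[hot]["used"] -= move
        nodes[cold]["used"] += move

cluster = {"nx-3050-a": {"capacity": 1000, "used": 900},
           "nx-6050-a": {"capacity": 4000, "used": 400}}
balance(cluster)
print({n: round(d["used"] / d["capacity"], 2) for n, d in cluster.items()})
```

The loop stops as soon as the hottest node is back under the threshold; total stored data is unchanged, only its placement moves.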

Software-Defined Controller Architecture

As mentioned above (likely numerous times), the Nutanix platform is a software-based solution which ships as a bundled software + hardware appliance.  The Controller VM is where the vast majority of the Nutanix software and logic sits and was designed from the beginning to be an extensible and pluggable architecture. A key benefit of being software defined and not relying upon any hardware offloads or constructs is extensibility.  As with any product life cycle, there will always be advancements and new features which are introduced.  By not relying on any custom ASIC/FPGA or hardware capabilities, Nutanix can develop and deploy these new features through a simple software update.  This means that a new feature (say, deduplication) can be deployed by upgrading the current version of the Nutanix software.  This also allows newer generation features to be deployed on legacy hardware models.

For example, say you’re running a workload on an older version of Nutanix software on a prior generation hardware platform (eg. 2400).  The running software version doesn’t provide deduplication capabilities, which your workload could benefit greatly from.  To get these features you perform a rolling upgrade of the Nutanix software version while the workload is running, and voilà, you now have deduplication.  It’s really that easy.

Similar to features, the ability to create new “adapters” or interfaces into NDFS is another key capability.  When the product first shipped it solely supported iSCSI for I/O from the hypervisor; this has now grown to include NFS and SMB.  In the future there is the ability to create new adapters for various workloads and hypervisors (HDFS, etc.).  And again, all deployed via a software update. This is contrary to nearly all legacy infrastructures, where a hardware upgrade or software purchase was normally required to get the “latest and greatest” features.
With Nutanix it’s different: since all features are deployed in software, they can run on any hardware platform, on any hypervisor, and be deployed through simple software upgrades. Below we show a logical representation of what this software-defined controller framework looks like: SD_Controller_Arch2

Storage Tiering and Prioritization

The Disk Balancing section above talked about how storage capacity is pooled among all nodes in a Nutanix cluster and how ILM is used to keep hot data local.  A similar concept applies to disk tiering, in which the cluster’s SSD and HDD tiers are cluster wide and NDFS ILM is responsible for triggering data movement events. A local node’s SSD tier is always the highest priority tier for all I/O generated by VMs running on that node, however all of the cluster’s SSD resources are made available to all nodes within the cluster.  The SSD tier will always offer the highest performance and is a very important thing to manage for hybrid arrays. The tier prioritization can be classified at a high level by the following: NDFS_Tier_HighLevel2 Specific types of resources (eg. SSD, HDD, etc.) are pooled together and form a cluster wide storage tier.  This means that any node within the cluster can leverage the full tier capacity, regardless of whether it is local or not. Below we show a high level example of how this pooled tiering looks: NDFS_Tier_Pooling A common question is: what happens when a local node’s SSD becomes full?  As mentioned in the Disk Balancing section, a key concept is trying to keep uniform utilization of devices within disk tiers.  In the case where a local node’s SSD utilization is high, disk balancing will kick in to move the coldest data on the local SSDs to the other SSDs throughout the cluster.  This will free up space on the local SSD to allow the local node to write to SSD locally instead of going over the network.  A key point to mention is that all CVMs and SSDs are used for this remote I/O to eliminate any potential bottlenecks and mitigate some of the hit of performing I/O over the network. 
NDFS_Tier_Utilization2 The other case is when the overall tier utilization breaches a specific threshold [curator_tier_usage_ilm_threshold_percent (Default=75)] where NDFS ILM will kick in and as part of a Curator job will down-migrate data from the SSD tier to the HDD tier.  This will bring utilization within the threshold mentioned above or free up space by the following amount [curator_tier_free_up_percent_by_ilm (Default=15)], whichever is greater. The data for down-migration is chosen using last access time. In the case where the SSD tier utilization is 95%, 20% of the data in the SSD tier will be moved to the HDD tier (95% –> 75%).  However, if the utilization was 80% only 15% of the data would be moved to the HDD tier using the minimum tier free up amount. NDFS_Tier_DownMigration NDFS ILM will constantly monitor the I/O patterns and (down/up)-migrate data as necessary as well as bring the hottest data local regardless of tier.
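The down-migration arithmetic above (threshold breach moves whichever is greater: back under the threshold, or the minimum free-up amount) can be reproduced directly from the two gflags mentioned:

```python
# curator_tier_usage_ilm_threshold_percent (Default=75)
# curator_tier_free_up_percent_by_ilm (Default=15)
def pct_to_down_migrate(ssd_util_pct, threshold=75, min_free_up=15):
    if ssd_util_pct <= threshold:
        return 0
    # Move whichever is greater: enough to get back under the
    # threshold, or the minimum free-up amount.
    return max(ssd_util_pct - threshold, min_free_up)

print(pct_to_down_migrate(95))  # 20 -> 95% down to 75%
print(pct_to_down_migrate(80))  # 15 -> minimum free-up amount wins
```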

Storage Layers and Monitoring

The Nutanix platform monitors storage at multiple layers throughout the stack, ranging from the VM/Guest OS all the way down to the physical disk devices.  Knowing the various tiers and how they relate is important when monitoring the solution and allows you to get full visibility of how the ops relate. Below we show the various layers at which operations are monitored and the relative granularity, which are explained below: NDFS_MetricsTiers3

Virtual Machine Layer
  • Key Role: Metrics reported by the hypervisor for the VM
  • Description: Virtual Machine or guest level metrics are pulled directly from the hypervisor and represent the performance the VM is seeing and is indicative of the I/O performance the application is seeing.
  • When to use: When troubleshooting or looking for VM level detail
Hypervisor Layer
  • Key Role: Metrics reported by the Hypervisor(s)
  • Description: Hypervisor level metrics are pulled directly from the hypervisor and represent the most accurate metrics the hypervisor(s) are seeing.  This data can be viewed for one or more hypervisor node(s) or the aggregate cluster.  This layer will provide the most accurate data in terms of what performance the platform is seeing and should be leveraged in most cases.  In certain scenarios the hypervisor may combine or split operations coming from VMs, which can cause differences between the metrics reported by the VM and the hypervisor.  These numbers will also include cache hits served by the Nutanix CVMs.
  • When to use: Most common cases as this will provide the most detailed and valuable metrics
Controller Layer
  • Key Role: Metrics reported by the Nutanix Controller(s)
  • Description: Controller level metrics are pulled directly from the Nutanix Controller VMs (eg. Stargate 2009 page) and represent what the Nutanix front-end is seeing from NFS/SMB/iSCSI or any back-end operations (eg. ILM, disk balancing, etc.).  This data can be viewed for one or more Controller VM(s) or the aggregate cluster.  The metrics seen by the Controller Layer should normally match those seen by the hypervisor layer, however they will also include any backend operations (eg. ILM, disk balancing). These numbers will also include cache hits served by memory.  In certain cases metrics like IOPS might not match, as the NFS / SMB / iSCSI client might split a large I/O into multiple smaller I/Os.  However, metrics like bandwidth should match.
  • When to use: Similar to the hypervisor layer, can be used to show how much backend operation is taking place
Disk Layer
  • Key Role: Metrics reported by the Disk Device(s)
  • Description: Disk level metrics are pulled directly from the physical disk devices (via the CVM) and represent what the back-end is seeing.  This includes data hitting the OpLog or Extent Store where an I/O is performed on the disk.  This data can be viewed for one or more disk(s), the disk(s) for a particular node, or the aggregate disks in the cluster.  In common cases it is expected that the disk ops should match the number of incoming writes as well as reads not served from the memory portion of the cache.  Any reads being served by the memory portion of the cache will not be counted here as the op is not hitting the disk device.
  • When to use: When looking to see how many ops are served from cache or hitting the disks

APIs and Interfaces

Core to any dynamic or “Software Defined” environment, Nutanix provides a vast array of interfaces allowing for simple programmability and interfacing. Here are the main interfaces:

  • NCLI
  • Scripting interfaces – more coming here soon :)

Core to this is the REST API, which exposes every capability and data point of the Prism UI and allows orchestration or automation tools to easily drive Nutanix actions.  This enables tools like VMware’s vCAC or Microsoft’s System Center Orchestrator to easily create custom workflows for Nutanix. Also, this means that any 3rd party developer could create their own custom UI and pull in Nutanix data via REST. Below we show a small snippet of the Nutanix REST API explorer which allows developers to see the API and format: RestAPI Operations can be expanded to display details and examples of the REST call: RestAPI2
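As a hedged example of driving the API, the sketch below pulls cluster information over HTTPS with basic auth.  The host name and credentials are hypothetical; verify the exact endpoint path and port against the REST API explorer shown above before relying on them:

```python
import base64
import json
import urllib.request

PRISM = "https://prism.example.local:9440"   # hypothetical Prism address
URL = PRISM + "/PrismGateway/services/rest/v1/cluster"

def cluster_info(user, password):
    # Basic-auth GET against the Prism gateway; returns parsed JSON.
    req = urllib.request.Request(URL)
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req.add_header("Authorization", "Basic " + token)
    with urllib.request.urlopen(req) as resp:   # network call
        return json.load(resp)

# cluster_info("admin", "secret")  # cluster name, version, stats, ...
print(URL)
```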

Availability Domains

For a video explanation you can watch the following video below:

Availability Domains (aka node/block/rack awareness) is a key construct for distributed systems to abide by for determining component and data placement.  NDFS is currently node and block aware; this will increase to rack awareness as cluster sizes grow.  Nutanix refers to a “block” as the chassis which contains either one, two or four server “nodes”. NOTE: A minimum of 3 blocks must be utilized for block awareness to be activated, otherwise node awareness will be defaulted to.  It is recommended to utilize uniformly populated blocks to ensure block awareness is enabled.  Common scenarios and the awareness level utilized can be found at the bottom of this section.  The 3-block requirement is to ensure quorum. For example, a 3450 would be a block which holds 4 nodes.  The reason for distributing roles or data across blocks is to ensure that if a block fails or needs maintenance, the system can continue to run without interruption.  NOTE: Within a block, the redundant PSUs and fans are the only shared components. Awareness can be broken into a few key focus areas:

  • Data (The VM data)
  • Metadata (Cassandra)
  • Configuration Data (Zookeeper)

With NDFS, data replicas will be written to other blocks in the cluster to ensure that in the case of a block failure or planned downtime, the data remains available.  This is true for both RF2 and RF3 scenarios. An easy comparison is “node awareness”, where a replica would need to be written to another node, providing protection in the case of a node failure.  Block awareness further enhances this by providing data availability assurances in the case of block outages. Below we show how the replica placement would work in a 3 block deployment: NDFS_BlockAwareness_DataNorm In the case of a block failure, block awareness will be maintained and the re-replicated data will be placed on other blocks within the cluster: NDFS_BlockAwareness_DataFail2


Metadata

As mentioned in the Scalable Metadata section above, Nutanix leverages a heavily modified Cassandra platform to store metadata and other essential information.  Cassandra leverages a ring-like structure and replicates to n number of peers within the ring to ensure data consistency and availability. Below we show an example of the Cassandra ring for a 12 node cluster: NDFS_CassandraRing_12Node3

Cassandra peer replication iterates through nodes in a clockwise manner throughout the ring.  With block awareness the peers are distributed among the blocks to ensure no two peers are on the same block. Below we show an example node layout translating the ring above into the block based layout: NDFS_CassandraRing_BlockLayout_Write2

With this block aware nature, in the event of a block failure there will still be at least two copies of the data (with Metadata RF3 – in larger clusters RF5 can be leveraged). Below we show an example of all of the nodes’ replication topology forming the ring (yes – it’s a little busy): NDFS_CassandraRing_BlockLayout_Full

Configuration Data

Nutanix leverages Zookeeper to store essential configuration data for the cluster.  This role is also distributed in a block aware manner to ensure availability in the case of a block failure. Below we show an example layout showing 3 Zookeeper nodes distributed in a block aware manner: NDFS_Zookeeper_BlockLayout In the event of a block outage, meaning one of the Zookeeper nodes will be gone, the Zookeeper role would be transferred to another node in the cluster as shown below: NDFS_Zookeeper_BlockLayout_Fail   Below we break down some common scenarios and what level of awareness will be utilized:

  • < 3 blocks –> NODE awareness
  • 3+ blocks uniformly populated –> BLOCK + NODE awareness
  • 3+ blocks not uniformly populated
    • If SSD or HDD tier variance between blocks is > max variance –> NODE awareness
      • Example: 2 x 3450 + 1 x 3150
    • If SSD and HDD tier variance between blocks is < max variance  –> BLOCK + NODE awareness
      • Example: 2 x 3450 + 1 x 3350
    • NOTE: max tier variance is calculated as: 100 / (RF+1)
      • Eg. 33% for RF2 or 25% for RF3
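The max tier variance rule from the list above is a one-line calculation:

```python
# max tier variance = 100 / (RF + 1), per the awareness rules above
def max_tier_variance(rf: int) -> float:
    return 100 / (rf + 1)

print(round(max_tier_variance(2), 1))  # 33.3 for RF2
print(max_tier_variance(3))            # 25.0 for RF3
```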

Snapshots & Clones

For a video explanation you can watch the following video below:

NDFS provides native support for offloaded snapshots and clones which can be leveraged via VAAI, ODX, ncli, REST, Prism, etc.  Both the snapshots and clones leverage the redirect-on-write algorithm, which is the most effective and efficient. As explained in the Data Structure section above, a virtual machine consists of files (vmdk/vhdx) which are vDisks on the Nutanix platform.  A vDisk is composed of extents, which are logically contiguous chunks of data, stored within extent groups, which are physically contiguous data stored as files on the storage devices.

When a snapshot or clone is taken, the base vDisk is marked immutable and another vDisk is created as read/write.  At this point both vDisks have the same block map, which is a metadata mapping of the vDisk to its corresponding extents. Contrary to traditional approaches which require traversal of the snapshot chain (which can add read latency), each vDisk has its own block map.  This eliminates the overhead normally seen with large snapshot chain depths and allows you to take continuous snapshots without any performance impact. Below we show an example of how this works when a snapshot is taken (NOTE: I need to give some credit to NTAP as a base for these diagrams as I thought their representation was the clearest): NDFS_Snap3Stagev2   The same method applies when a snapshot or clone of a previously snapped or cloned vDisk is performed: NDFS_SnapMulti

The same methods are used for both snapshots and/or clones of a VM or vDisk(s).  When a VM or vDisk is cloned, the current block map is locked and the clones are created.  These updates are metadata only, so no I/O actually takes place.  The same method applies for clones of clones; essentially the previously cloned VM acts as the “Base vDisk” and upon cloning that block map is locked and two “clones” are created: one for the VM being cloned and another for the new clone.
They both inherit the prior block map and any new writes/updates take place on their individual block maps. NDFS_CloneBase   As mentioned prior, each VM/vDisk has its own individual block map.  So in the above example, all of the clones from the base VM would now own their block map and any write/update would occur there.  Below we show an example of what this looks like: NDFS_CloneWriteIO   Any subsequent clones or snapshots of a VM/vDisk would cause the original block map to be locked and would create a new one for R/W access.
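The block-map behavior described above can be sketched in a few lines.  The structures are illustrative only (not NDFS data structures): the base's map is frozen on snapshot, the writable vDisk starts with a copy of the same map, and new writes redirect only in its own map, so no chain traversal is ever needed on read:

```python
class VDisk:
    """Illustrative vDisk with its own block map (logical block -> extent id)."""

    def __init__(self, block_map=None):
        self.block_map = dict(block_map or {})
        self.immutable = False

def snapshot(base: VDisk) -> VDisk:
    base.immutable = True            # base vDisk is frozen (redirect-on-write)
    return VDisk(base.block_map)     # new R/W vDisk starts with the same map

base = VDisk({0: "eA", 1: "eB"})
clone = snapshot(base)
clone.block_map[1] = "eC"            # a new write redirects in the clone only

print(base.block_map[1], clone.block_map[1])  # eB eC
```

Because every vDisk carries its full map, a read consults one map directly; depth of the snapshot chain never enters the read path.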

Multi-Site Disaster Recovery

For a video explanation you can watch the following video below:

Nutanix provides native DR and replication capabilities which build upon the same features explained in the Snapshots & Clones section.  Cerebro is the component responsible for managing the DR and replication in NDFS.  Cerebro runs on every node and a Cerebro master is elected (similar to NFS master) and is responsible for managing replication tasks.  In the event the CVM acting as Cerebro master fails, another is elected and assumes the role.  The Cerebro page can be found on <CVM IP>:2020. The DR function can be broken down into a few key focus areas:

  • Replication Topologies
  • Replication Lifecycle
  • Implementation Constructs
  • Global Deduplication
Replication Topologies

Traditionally there are a few key replication topologies: site-to-site, hub-and-spoke, and full and/or partial mesh.  Contrary to traditional solutions which only allow for site-to-site or hub-and-spoke, Nutanix provides a fully meshed and flexible many-to-many model. NDFS_DR_topo Essentially this allows the admin to determine a replication topology that meets their needs.

Replication Lifecycle – new!

Nutanix replication leverages the Cerebro service mentioned above.  The Cerebro service is broken into a “Cerebro Master” which is a dynamically elected CVM and Cerebro Slaves which run on every CVM.  In the event where the CVM acting as the “Cerebro Master” fails a new “Master” is elected.

The Cerebro Master is responsible for managing task delegation to the local Cerebro Slaves as well as coordinating with remote Cerebro Master(s) when remote replication is occurring.

During a replication the Cerebro Master will figure out which data needs to be replicated, and delegate the replication tasks to the Cerebro Slaves which will then tell Stargate which data to replicate and to where.

Below we show a representation of this architecture:



It is also possible to configure a remote site with a proxy which will be used as a bridgehead for all coordination and replication traffic coming from a cluster.

Pro tip: When using a remote site configured with a proxy, always utilize the cluster IP as that will always be hosted by the Prism Leader and available even if CVM(s) go down.

Below we show a representation of the replication architecture using a proxy:



In certain scenarios it is also possible to configure a remote site using a SSH tunnel where all traffic will flow between two CVMs.

Pro tip: This should only be used for non-production scenarios and the cluster IPs should be used to ensure availability.

Below we show a representation of the replication architecture using a SSH tunnel:



Implementation Constructs

Within Nutanix DR there are a few key constructs which are explained below:

Remote Site
  • Key Role: A remote Nutanix cluster
  • Description: A remote Nutanix cluster which can be leveraged as a target for backup or DR purposes.
  • Pro tip: Ensure the target site has ample capacity (compute/storage) to handle a full site failure.  In certain cases replication/DR between racks within a single site can also make sense.
Protection Domain (PD)
  • Key Role: Macro group of VMs and/or files to protect
  • Description: A group of VMs and/or files to be replicated together on a desired schedule.  A PD can protect a full container or you can select individual VMs and/or files
  • Pro tip: Create multiple PDs for various service tiers driven by a desired RPO/RTO.  For file distribution (eg. golden images, ISOs, etc.) you can create a PD with the files to replicate.
Consistency Group (CG)
  • Key Role: Subset of VMs/files in a PD to be crash consistent
  • Description: VMs and/or files which are part of a Protection Domain which need to be snapshotted in a crash consistent manner.  This ensures that when VMs/files are recovered they come up in a consistent state.  A protection domain can have multiple consistency groups.
  • Pro tip: Group dependent application or service VMs in a consistency group to ensure they are recovered in a consistent state (eg. App and DB)
Replication Schedule
  • Key Role: Snapshot and replication schedule
  • Description: Snapshot and replication schedule for VMs in a particular PD and CG
  • Pro tip: The snapshot schedule should be equal to your desired RPO
Retention Policy
  • Key Role: Number of local and remote snapshots to keep
  • Description: The retention policy defines the number of local and remote snapshots to retain.  NOTE: A remote site must be configured for a remote retention/replication policy to be configured.
  • Pro tip: The retention policy should equal the number of restore points required per VM/file

Below we show a logical representation of the relationship between a PD, CG and VMs/Files for a single site. It’s important to mention that a full container can be protected for simplicity; however, the platform provides the ability to protect down to the granularity of a single VM and/or file.
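To make the PD, CG and schedule/retention relationships concrete, here is a minimal sketch in Python. The class and field names are purely illustrative (they are not the NDFS implementation or API); the point is the containment hierarchy and the rule-of-thumb that the snapshot interval equals your RPO:

```python
from dataclasses import dataclass, field

@dataclass
class ConsistencyGroup:
    """VMs/files snapshotted together so they recover in a consistent state."""
    name: str
    vms: list

@dataclass
class ProtectionDomain:
    """Macro group of VMs/files replicated together on a schedule."""
    name: str
    schedule_minutes: int   # snapshot interval; should equal the desired RPO
    local_retention: int    # number of local snapshots (restore points) to keep
    remote_retention: int   # remote snapshots; requires a configured remote site
    groups: list = field(default_factory=list)

    def all_vms(self):
        # A PD protects everything its consistency groups contain
        return [vm for cg in self.groups for vm in cg.vms]

# Group the dependent app and DB VMs so they are recovered together
pd = ProtectionDomain("tier1-pd", schedule_minutes=60,
                      local_retention=24, remote_retention=24)
pd.groups.append(ConsistencyGroup("app-db-cg", ["app01", "db01"]))
print(pd.all_vms())   # → ['app01', 'db01']
```

With hourly snapshots and a retention of 24, this hypothetical PD keeps one day of hourly restore points locally and remotely.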

Global Deduplication

As explained in the Elastic Dedup Engine section above, NDFS has the ability to deduplicate data by just updating metadata pointers. The same concept is applied to the DR and replication feature.  Before sending data over the wire, NDFS will query the remote site and check whether or not the fingerprint(s) already exist on the target (meaning the data already exists).  If so, no data will be shipped over the wire and only a metadata update will occur. For data which doesn’t exist on the target, the data will be compressed and sent to the target site.  At this point the data exists on both sites and is usable for deduplication. Below we show an example three-site deployment where each site contains one or more protection domains (PDs):
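The check-then-ship decision above can be sketched in a few lines of Python. This is an illustrative model, not NDFS code: the `send` callback and the fingerprint set stand in for the real replication transport and the target’s metadata:

```python
import hashlib
import zlib

def replicate(extents, remote_fingerprints, send):
    """Dedup-aware replication sketch: for each extent, ship compressed data
    only if its fingerprint is unknown at the target; otherwise the data
    already exists there and a metadata-only update suffices."""
    for data in extents:
        fp = hashlib.sha1(data).hexdigest()   # SHA-1 fingerprint, as in the dedup engine
        if fp in remote_fingerprints:
            continue                          # metadata pointer update only
        send(fp, zlib.compress(data))         # compress before the wire
        remote_fingerprints.add(fp)           # now usable for future dedup

# Hypothetical example: the target already holds the golden image's fingerprint
sent = []
remote = {hashlib.sha1(b"golden-image").hexdigest()}
replicate([b"golden-image", b"new-data"], remote,
          lambda fp, blob: sent.append(fp))
print(len(sent))   # → 1 (only the extent missing at the target is shipped)
```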

Cloud Connect

Building upon the native DR / replication capabilities of NDFS, Cloud Connect extends this capability into cloud providers (currently AWS).  NOTE: this feature is currently limited to just backup / replication.

Creating a “cloud remote site” is very similar to creating a remote site to be used for native DR / replication.  When a new cloud remote site is created, Nutanix will automatically spin up an instance in EC2 (currently m1.xlarge) to be used as the endpoint.

The Amazon Machine Image (AMI) running in AWS is based upon the same NOS code-base leveraged for locally running clusters.  This means that all of the native replication capabilities (e.g. global deduplication, delta based replications, etc.) can be leveraged.

In the case where multiple Nutanix clusters are leveraging Cloud Connect, they can either A) share the same AMI instance running in the region or B) spin up a new instance.

Below we show a logical representation of an AWS based “remote site” used for Cloud Connect:


Since an AWS based remote site is similar to any other Nutanix remote site, a cluster can replicate to multiple regions if higher availability is required (e.g. data availability in the case of a full region outage):


The same replication / retention policies are leveraged for data replicated using Cloud Connect.  As data / snapshots become stale, or expire, the Nutanix CVM in AWS will clean up data as necessary.

If replication isn’t occurring frequently (e.g. daily or weekly), the platform can be configured to power up the AWS CVM(s) prior to a scheduled replication and power them down after the replication has completed.
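The power-up/replicate/power-down window can be thought of as a simple bracketing pattern. Below is a hedged sketch of that idea; the `power_on`/`power_off` callables are placeholders for whatever cloud tooling you use (this is not platform code):

```python
from contextlib import contextmanager

@contextmanager
def cloud_cvm(power_on, power_off):
    """Keep the AWS-hosted CVM instance powered on only for the replication
    window - useful for infrequent (e.g. daily or weekly) schedules.
    The instance is powered off even if replication fails."""
    power_on()
    try:
        yield
    finally:
        power_off()

# Illustrative run with stub callables that just record the ordering
events = []
with cloud_cvm(lambda: events.append("on"), lambda: events.append("off")):
    events.append("replicate")
print(events)   # → ['on', 'replicate', 'off']
```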

Data that is replicated to any AWS region can also be pulled down and restored to any existing, or newly created, Nutanix cluster which has the AWS remote site(s) configured:



Important Pages

These are advanced Nutanix pages, beyond the standard user interface, that allow you to monitor detailed stats and metrics.  The URLs are formatted in the following way: http://&lt;Nutanix CVM IP/DNS&gt;:&lt;Port/path (mentioned below)&gt;  Example: http://MyCVM-A:2009  NOTE: if you’re on a different subnet, iptables will need to be disabled on the CVM to access the pages.

# 2009 Page
  • This is a Stargate page used to monitor the back end storage system and should only be used by advanced users.  I’ll have a post that explains the 2009 pages and things to look for.
# 2009/latency Page
  • This is a Stargate page used to monitor the back end latency
# 2009/h/traces Page
  • This is the Stargate page used to monitor activity traces for operations
# 2009/h/vars Page
  • This is the Stargate page used to monitor various counters
# 2010 Page
  • This is the Curator page which is used for monitoring curator runs
# 2010/master/control Page
  • This is the Curator control page which is used to manually start Curator jobs
# 2011 Page
  • This is the Chronos page which monitors jobs and tasks scheduled by Curator
# 2020 Page
  • This is the Cerebro page which monitors the protection domains, replication status and DR
# 2020/h/traces Page
  • This is the Cerebro page used to monitor activity traces for PD operations and replication
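Since these pages all follow the same URL pattern, they are easy to script against. A small hypothetical helper (the `PAGES` mapping just restates the ports listed above):

```python
# Diagnostic services and their ports, taken from the list above
PAGES = {"stargate": 2009, "curator": 2010, "chronos": 2011, "cerebro": 2020}

def diag_url(cvm, service, path=""):
    """Build http://<Nutanix CVM IP/DNS>:<port>[/<path>] for a service page."""
    base = f"http://{cvm}:{PAGES[service]}"
    return f"{base}/{path}" if path else base

print(diag_url("MyCVM-A", "stargate"))              # → http://MyCVM-A:2009
print(diag_url("MyCVM-A", "stargate", "h/traces"))  # → http://MyCVM-A:2009/h/traces
print(diag_url("MyCVM-A", "cerebro"))               # → http://MyCVM-A:2020
```

Remember that from a different subnet these URLs will only respond once iptables is disabled on the CVM.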

Cluster Commands

# Check cluster status

# Check local CVM service status

# Nutanix cluster upgrade

# Restart cluster service from CLI

# Start cluster service from CLI

# Restart local service from CLI

# Start local service from CLI

# Cluster add node from cmdline

# Find number of vDisks

# Find cluster id

# Open port

# Check for Shadow Clones

# Reset Latency Page Stats

# Find Number of vDisks

# Start Curator scan from CLI

# Compact ring

# Find NOS version

# Find CVM version

# Manually fingerprint vDisk(s)

# Echo Factory_Config.json for all cluster nodes

# Upgrade a single Nutanix node’s NOS version

# Install Nutanix Cluster Check (NCC)

# Run Nutanix Cluster Check (NCC)


NOTE: All of these actions can be performed via the HTML5 GUI.  I just use these commands as part of my bash scripting to automate tasks.

# Add subnet to NFS whitelist

# Display Nutanix Version

# Display hidden NCLI options

# List Storage Pools

# List containers

# Create container

# List VMs

# List public keys

# Add public key

# Remove public key

# Create protection domain

# Create remote site

# Create protection domain for all VMs in container

# Create protection domain with specified VMs

# Create protection domain for NDFS files (aka vDisk)

# Create snapshot of protection domain

# Create snapshot and replication schedule to remote site

# List replication status

# Migrate protection domain to remote site

# Activate protection domain

# Enable NDFS Shadow Clones

# Enable Dedup for vDisk

PowerShell CMDlets

The below will cover the Nutanix PowerShell CMDlets, how to use them and some general background on Windows PowerShell.


Windows PowerShell is a powerful shell (hence the name ;P) and scripting language built on the .NET framework.  It is a simple language to use, built to be intuitive and interactive.  Within PowerShell there are a few key constructs/items:


CMDlets are commands or .NET classes which perform a particular operation.  They usually conform to the Getter/Setter methodology and typically use a &lt;Verb&gt;-&lt;Noun&gt; based structure.  For example: Get-Process, Set-Partition, etc.

Piping or Pipelining

Piping is an important construct in PowerShell (similar to its use in Linux) and can greatly simplify things when used correctly.  With piping you’re essentially taking the output of one section of the pipeline and using it as input to the next section.  The pipeline can be as long as required (assuming there remains output being fed to the next section of the pipe). A very simple example could be getting the current processes, finding those that match a particular trait or filter, and then sorting them:
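The same get → filter → sort pattern can be illustrated outside of PowerShell as well. Here it is in Python, with a made-up process list standing in for Get-Process; each stage feeds its output into the next, just like a pipeline:

```python
# Hypothetical process data (in PowerShell this would come from Get-Process)
processes = [{"name": "stargate", "cpu": 42},
             {"name": "curator",  "cpu": 7},
             {"name": "cerebro",  "cpu": 12},
             {"name": "notepad",  "cpu": 1}]

# Stage 1: filter - keep processes matching a trait (cpu above a threshold)
matching = (p for p in processes if p["cpu"] > 5)

# Stage 2: sort - order the survivors by cpu, highest first
by_cpu = sorted(matching, key=lambda p: p["cpu"], reverse=True)

print([p["name"] for p in by_cpu])   # → ['stargate', 'cerebro', 'curator']
```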

Piping can also be used in place of for-each, for example:

More information on piping can be found HERE

Key Object Types

Below are a few of the key object types in PowerShell.  You can easily get the object type by using the .getType() method; for example, $someVariable.getType() will return the object’s type.

Variable


Hash Table

Useful commands

Core Nutanix CMDlets and Usage

# Download Nutanix CMDlets Installer

The Nutanix CMDlets can be downloaded directly from the Prism UI (post 4.0.1) and can be found on the drop down in the upper right hand corner.

# Install Nutanix Snapin (Note: needs to be run after the CMDlets installer has been run; we’re working to make this part of the installer)

# Load Nutanix Snapin

# List Nutanix CMDlets

# Connect to a Nutanix Cluster

# Connect to Multiple Nutanix Clusters

# Get Nutanix VMs matching a certain search string

# Get Nutanix vDisks

# Get Nutanix Containers

# Get Nutanix Protection Domains

# Get Nutanix Consistency Groups

Example Scripts:

Much more to come here! :)

Metrics & Thresholds

The below will cover specific metrics and thresholds on the Nutanix back end.  More updates to these coming shortly!

2009 Stargate – Overview


  • Start time: The start time of the Stargate service
  • Build version: The build version currently running
  • Build last commit date: The last commit date of the build
  • Stargate handle: The Stargate handle
  • iSCSI handle: The iSCSI handle
  • SVM id: The SVM id of Stargate
  • Incarnation id
  • Highest allocated opid
  • Highest contiguous completed opid
  • Extent cache hits: The % of read requests served directly from the in-memory extent cache
  • Extent cache usage: The MB size of the extent cache
  • Content cache hits: The % of read requests served directly from the content cache
  • Content cache flash pagein pct
  • Content cache memory usage: The MB size of the in-memory content cache
  • Content cache flash usage: The MB size of the SSD content cache
  • QoS Queue (size/admitted): The admission control queue size and number of admitted ops
  • Oplog QoS queue (size/admitted): The oplog queue size and number of admitted ops
  • NFS Flush Queue (size/admitted)
  • NFS cache usage
2009 Stargate – Cluster State


  • SVM Id: The Id of the Controller
  • IP:port: The IP:port of the Stargate handle
  • SSD-PCIe: The SSD-PCIe devices and size/utilization
  • SSD-SATA: The SSD-SATA devices and size/utilization
  • DAS-SATA: The HDD-SATA devices and size/utilization
  • Container Id: The Id of the container
  • Container Name: The name of the container
  • Max capacity (GB) – Storage pool: The max capacity of the storage pool
  • Max capacity (GB) – Container: The max capacity of the container (will normally match the storage pool size)
  • Reservation (GB) – Total across vdisks: The reservation in GB across vdisks
  • Reservation (GB) – Admin provisioned
  • Container usage (GB) – Total: The total usage in GB per container
  • Container usage (GB) – Reserved: The reservation used in GB per container
  • Container usage (GB) – Garbage
  • Unreserved available (GB) – Container: The available capacity in GB per container
  • Unreserved available (GB) – Storage pool: The available capacity in GB for the storage pool
2009 Stargate – NFS Slave


  • Vdisk Name: The name of the Vdisk on NDFS
  • Unstable data – KB
  • Unstable data – Ops/s
  • Unstable data – KB/s
  • Outstanding Ops – Read: The number of outstanding read ops for the Vdisk
  • Outstanding Ops – Write: The number of outstanding write ops for the Vdisk
  • Ops/s – Read: The number of current read operations per second for the Vdisk
  • Ops/s – Write: The number of current write operations per second for the Vdisk
  • Ops/s – Error: The number of current error (failed) operations per second for the Vdisk
  • KB/s – Read: The read throughput in KB/s for the Vdisk
  • KB/s – Write: The write throughput in KB/s for the Vdisk
  • Avg latency (usec) – Read: The average read op latency in microseconds for the Vdisk
  • Avg latency (usec) – Write: The average write op latency in microseconds for the Vdisk
  • Avg op size: The average op size in bytes for the Vdisk
  • Avg outstanding: The average outstanding ops for the Vdisk
  • % busy: The % busy of the Vdisk
  • Container Name: The name of the container
  • Outstanding Ops – Read: The number of outstanding read ops for the container
  • Outstanding Ops – Write: The number of outstanding write ops for the container
  • Outstanding Ops – NS lookup: The number of outstanding NFS lookup ops for the container
  • Outstanding Ops – NS update: The number of outstanding NFS update ops for the container
  • Ops/s – Read: The number of current read operations per second for the container
  • Ops/s – Write: The number of current write operations per second for the container
  • Ops/s – NS lookup: The number of current NFS lookup ops for the container
  • Ops/s – NS update: The number of current NFS update ops for the container
  • Ops/s – Error: The number of current error (failed) operations per second for the container
  • KB/s – Read: The read throughput in KB/s for the container
  • KB/s – Write: The write throughput in KB/s for the container
  • Avg latency (usec) – Read: The average read op latency in microseconds for the container
  • Avg latency (usec) – Write: The average write op latency in microseconds for the container
  • Avg latency (usec) – NS lookup: The average NFS lookup latency in microseconds for the container
  • Avg latency (usec) – NS update: The average NFS update latency in microseconds for the container
  • Avg op size: The average op size in bytes for the container
  • Avg outstanding: The average outstanding ops for the container
  • % busy: The % busy of the container
2009 Stargate – Hosted VDisks


  • Vdisk Id: The Id of the Vdisk on NDFS
  • Vdisk Name: The name of the Vdisk on NDFS
  • Usage (GB): The usage in GB per Vdisk
  • Dedup (GB)
  • Oplog – KB: The size of the Oplog for the Vdisk
  • Oplog – Fragments: The number of fragments of the Oplog for the Vdisk
  • Oplog – Ops/s: The number of current operations per second for the Vdisk
  • Oplog – KB/s: The throughput in KB/s for the Vdisk
  • Outstanding Ops – Read: The number of outstanding read ops for the Vdisk
  • Outstanding Ops – Write: The number of outstanding write ops for the Vdisk
  • Outstanding Ops – Estore: The number of outstanding ops to the extent store for the Vdisk
  • Ops/s – Read: The number of current read operations per second for the Vdisk
  • Ops/s – Write: The number of current write operations per second for the Vdisk
  • Ops/s – Error: The number of current error (failed) operations per second for the Vdisk
  • Ops/s – Random
  • KB/s – Read: The read throughput in KB/s for the Vdisk
  • KB/s – Write: The write throughput in KB/s for the Vdisk
  • Avg latency (usec): The average op latency in microseconds for the Vdisk
  • Avg op size: The average op size in bytes for the Vdisk
  • Avg qlen: The average queue length for the Vdisk
  • % busy
2009 Stargate – Extent Store


  • Disk Id: The disk id of the physical device
  • Mount point: The mount point of the physical device
  • Outstanding Ops – QoS Queue: The number of (primary/secondary) ops for the device
  • Outstanding Ops – Read: The number of outstanding read ops for the device
  • Outstanding Ops – Write: The number of outstanding write ops for the device
  • Outstanding Ops – Replicate
  • Outstanding Ops – Read Replica
  • Ops/s – Read: The number of current read operations per second for the device
  • Ops/s – Write: The number of current write operations per second for the device
  • Ops/s – Error: The number of current error (failed) operations per second for the device
  • Ops/s – Random
  • KB/s – Read: The read throughput in KB/s for the device
  • KB/s – Write: The write throughput in KB/s for the device
  • Avg latency (usec): The average op latency in microseconds for the device
  • Avg op size: The average op size in bytes for the device
  • Avg qlen: The average queue length for the device
  • Avg qdelay: The average queue delay for the device
  • % busy
  • Size (GB)
  • Total usage (GB): The total usage in GB for the device
  • Unshared usage (GB)
  • Dedup usage (GB)
  • Garbage (GB)
  • Egroups: The number of extent groups for the device
  • Corrupt Egroups: The number of corrupt (bad) extent groups for the device


Coming soon :)


# Find cluster error logs

# Find cluster fatal logs

Book of vSphere


To be input

How It Works

Array Offloads – VAAI

The Nutanix platform supports the VMware APIs for Array Integration (VAAI), which allow the hypervisor to offload certain tasks to the array.  This is much more efficient as the hypervisor doesn’t need to be the “man in the middle”. Nutanix currently supports the VAAI primitives for NAS, including the ‘full file clone’, ‘fast file clone’ and ‘reserve space’ primitives.  Here’s a good article explaining the various primitives: LINK.  For both the full and fast file clones, an NDFS “fast clone” is done, meaning a writable snapshot (using redirect-on-write) is created for each clone.  Each of these clones has its own block map, meaning that chain depth isn’t anything to worry about. The following will determine whether or not VAAI will be used for specific scenarios:

  • Clone VM with Snapshot –> VAAI will NOT be used
  • Clone VM without Snapshot which is Powered Off –> VAAI WILL be used
  • Clone VM to a different Datastore/Container –> VAAI will NOT be used
  • Clone VM which is Powered On –> VAAI will NOT be used

These scenarios apply to VMware View:

  • View Full Clone (Template with Snapshot) –> VAAI will NOT be used
  • View Full Clone (Template w/o Snapshot) –> VAAI WILL be used
  • View Linked Clone (VCAI) –> VAAI WILL be used
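The clone scenarios above boil down to a single rule: VAAI is only used for a powered-off clone of a snapshot-free VM that stays in the same datastore/container. A sketch of that rule as a predicate (parameter names are illustrative, not an API):

```python
def vaai_offload_used(has_snapshot, powered_on, same_container):
    """Whether an ESXi clone operation is offloaded via VAAI on NDFS,
    per the scenarios listed above: only a powered-off clone of a VM
    without snapshots, staying within the same datastore/container,
    is offloaded."""
    return (not has_snapshot) and (not powered_on) and same_container

# Powered-off, snapshot-free, same container → offloaded
print(vaai_offload_used(False, False, True))    # → True
# Any of: snapshot present, powered on, or cross-container → not offloaded
print(vaai_offload_used(True, False, True))     # → False
print(vaai_offload_used(False, True, True))     # → False
print(vaai_offload_used(False, False, False))   # → False
```

The View scenarios follow the same rule: a full clone from a template with a snapshot falls into the first negative case, while a template without a snapshot is offloaded.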

You can validate VAAI operations are taking place by using the ‘NFS Adapter’ Activity Traces page.

CVM Autopathing

Reliability and resiliency are a key, if not the most important, piece of NDFS.  Being a distributed system, NDFS is built to handle component, service and CVM failures.  In this section I’ll cover how CVM “failures” are handled (I’ll cover how we handle component failures in a future update).  A CVM “failure” could include a user powering down the CVM, a CVM rolling upgrade, or any event which might bring down the CVM. NDFS has a feature called autopathing: when a local CVM becomes unavailable, the I/Os are transparently handled by other CVMs in the cluster. The hypervisor and CVM communicate using a private network on a dedicated vSwitch (more on this above).  This means that all storage I/Os happen to the internal IP addresses on the CVM.  The external IP address of the CVM is used for remote replication and for CVM communication.

Below we show an example of what this looks like. In the event of a local CVM failure, the local addresses previously hosted by the local CVM are unavailable.  NDFS will automatically detect this outage and redirect these I/Os to another CVM in the cluster over 10GbE.  The re-routing is done transparently to the hypervisor and VMs running on the host.  This means that even if a CVM is powered down, the VMs will still be able to perform I/Os to NDFS.  NDFS is also self-healing, meaning it will detect that the CVM has been powered off and will automatically reboot or power on the local CVM.  Once the local CVM is back up and available, traffic will seamlessly be transferred back and served by the local CVM. Below we show a graphical representation of how this looks for a failed CVM:
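The routing decision at the heart of autopathing can be sketched as a small function. This is a conceptual model only (NDFS implements this inside the storage path, not as user code); the health check and CVM names are hypothetical:

```python
def route_io(local_cvm, peers, is_healthy):
    """Sketch of the autopathing decision: prefer the local CVM's internal
    address; if it is down, transparently redirect I/O to a healthy peer
    CVM over 10GbE until the local CVM heals and traffic moves back."""
    if is_healthy(local_cvm):
        return local_cvm
    for peer in peers:
        if is_healthy(peer):
            return peer
    raise RuntimeError("no healthy CVM available")

# Hypothetical cluster state: the local CVM-A is powered down
up = {"CVM-B", "CVM-C"}
target = route_io("CVM-A", ["CVM-B", "CVM-C"], lambda c: c in up)
print(target)   # → CVM-B (I/O redirected to a healthy peer)
```

Once `CVM-A` reports healthy again, the same decision naturally routes I/O back to the local CVM, matching the seamless fail-back described above.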


To be input

Important Pages

To be input

Command Reference

# ESXi cluster upgrade

Performing a rolling reboot of ESXi hosts: for PowerCLI-automated host reboots, SEE HERE

# Restart ESXi host services

# Display ESXi host nics in ‘Up’ state

# Display ESXi host 10GbE nics and status

# Display ESXi host active adapters

# Display ESXi host routing tables

# Check if VAAI is enabled on datastore

# Set VIB acceptance level to community supported

# Install VIB

# Check ESXi ramdisk space

# Clear pynfs logs

Metrics & Thresholds

To be input


To be input

Book of Hyper-V


To be input

How It Works

Array Offloads – ODX

The Nutanix platform supports Microsoft Offloaded Data Transfers (ODX), which allow the hypervisor to offload certain tasks to the array.  This is much more efficient as the hypervisor doesn’t need to be the “man in the middle”. Nutanix currently supports the ODX primitives for SMB, which include full copy and zeroing operations.  However, contrary to VAAI, which has a “fast file” clone operation (using writable snapshots), the ODX primitives do not have an equivalent and perform a full copy.  Given this, it is more efficient to rely on the native NDFS clones, which can currently be invoked via nCLI, REST, or PowerShell CMDlets. Currently ODX IS invoked for the following operations:

  • In VM or VM to VM file copy on NDFS SMB share
  • SMB share file copy
  • Deploy template from SCVMM Library (NDFS SMB share) – NOTE: Shares must be added to the SCVMM cluster using short names (eg. not FQDN).  An easy way to force this is to add an entry into the hosts file for the cluster (eg.     nutanix-130).

ODX is NOT invoked for the following operations:

  • Clone VM through SCVMM
  • Deploy template from SCVMM Library (non-NDFS SMB Share)
  • XenDesktop Clone Deployment

You can validate ODX operations are taking place by using the ‘NFS Adapter’ Activity Traces page (yes, I said NFS, even though this is being performed via SMB).  The operations activity shown will be ‘NfsSlaveVaaiCopyDataOp‘ when copying a vDisk and ‘NfsSlaveVaaiWriteZerosOp‘ when zeroing out a disk.


To be input

Important Pages

To be input

Command Reference

# Execute command on multiple remote hosts

# Check available VMQ Offloads

# Disable VMQ for VMs matching a specific prefix

# Enable VMQ for VMs matching a certain prefix

# Power-On VMs matching a certain prefix

# Shutdown VMs matching a certain prefix

# Stop VMs matching a certain prefix

# Get Hyper-V host RSS settings

# Check Winsh and WinRM connectivity


Metrics & Thresholds

To be input


To be input


  1. 09-04-2013 | Initial Version
  2. 09-04-2013 | Updated with components section
  3. 09-05-2013 | Updated with I/O path overview section
  4. 09-09-2013 | Updated with converged architecture section
  5. 09-11-2013 | Updated with data structure section
  6. 09-24-2013 | Updated with data protection section
  7. 09-30-2013 | Updated with data locality section
  8. 10-01-2013 | Updated with shadow clones section
  9. 10-07-2013 | Updated with scalable metadata section
  10. 10-11-2013 | Updated with elastic dedupe engine
  11. 11-01-2013 | Updated with networking and I/O section
  12. 11-07-2013 | Updated with CVM autopathing section
  13. 01-23-2014 | Updated with new content structure and layout
  14. 02-10-2014 | Updated with storage layers and monitoring
  15. 02-18-2014 | Updated with array offloads sections
  16. 03-12-2014 | Updated with genesis
  17. 03-17-2014 | Updated spelling and grammar
  18. 03-19-2014 | Updated with apis and interfaces section
  19. 03-25-2014 | Updated script block formatting
  20. 03-26-2014 | Updated with dr & protection domain NCLI commands
  21. 04-15-2014 | Updated with failure domains section and 4.0 updates
  22. 04-23-2014 | Updated with command to echo factory_config.json
  23. 05-28-2014 | Updated with snapshots and clones section
  24. 06-09-2014 | Updated with multi-site disaster recovery section
  25. 06-10-2014 | Updated cluster components graphic
  26. 07-23-2014 | Updated with new PowerShell CMDlets section
  27. 08-05-2014 | Updated with new drive breakdown section
  28. 09-08-2014 | Updated storage layers and monitoring section
  29. 09-15-2014 | Updated with new compression section
  30. 09-18-2014 | Updated with book of web-scale section
  31. 10-21-2014 | Updated with ordered pd restore script
  32. 11-13-2014 | Updated with videos for multiple sections
  33. 11-18-2014 | Updated with videos for snapshots, DR and availability zones
  34. 11-21-2014 | Updated with compression and deduplication videos
  35. 01-23-2015 | Updated networking and I/O video
  36. 01-26-2015 | Updated shadow clones video
  37. 03-06-2015 | Updated with data path resiliency section
  38. 04-03-2015 | Updated content cache for disk breakdown
  39. 04-14-2015 | Updated with cloud connect section
  40. 04-21-2015 | Updated DR section with replication section

Legal Mumbo Jumbo

Copyright © Steven Poitras, The Nutanix Bible and, 2014. Unauthorized use and/or duplication of this material without express and written permission from this blog’s author and/or owner is strictly prohibited. Excerpts and links may be used, provided that full and clear credit is given to Steven Poitras and with appropriate and specific direction to the original content.