Thursday, 26 December 2019

vSAN 6.x Cluster Rebalancing

vSAN Cluster rebalancing evenly distributes resources across the cluster to maintain consistent performance and availability.

If you add or remove hosts or capacity devices in the vSAN cluster, the disk groups might become unbalanced. Rebalancing also occurs when you place a vSAN host in maintenance mode.

vSAN cluster rebalancing can be done in two ways:
  • Automatic Rebalance 
  • Manual Rebalance 
Automatic Rebalance:

When any capacity device in your cluster reaches 80 percent full, vSAN automatically rebalances the cluster.
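The trigger itself is a simple threshold on per-device fullness. A minimal sketch with made-up capacity numbers (not real vSAN output):

```shell
# Hypothetical used/total capacity for one capacity device, in GB.
used_gb=850
total_gb=1000
pct=$(( used_gb * 100 / total_gb ))

# vSAN begins automatic rebalancing once a device crosses 80 percent full.
if [ "$pct" -ge 80 ]; then
  msg="device at ${pct}% - automatic rebalance would be triggered"
else
  msg="device at ${pct}% - within threshold"
fi
echo "$msg"
```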

Run the following RVC commands to monitor the rebalance operation in the cluster:
  • vsan.check_limits
Verifies whether the disk space use is balanced in the cluster.
  • vsan.whatif_host_failures
Analyzes the current capacity use per host, determines whether a single host failure would force the cluster to run out of space for reprotection, and shows how a host failure might affect cluster capacity, cache reservation, and cluster components. The physical capacity use shown in the command output is the average use across all devices in the vSAN cluster.
  • vsan.resync_dashboard
Monitors any rebuild tasks in the cluster.
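Put together, a monitoring pass might look like the following sketch. The cluster path /localhost/DC/computers/vSAN-Cluster is a placeholder for your own datacenter and cluster names, and the snippet only prints the commands, since they are entered at the interactive RVC prompt rather than in a normal shell:

```shell
# RVC console commands for monitoring a rebalance (placeholder cluster path).
rvc_session='vsan.check_limits /localhost/DC/computers/vSAN-Cluster
vsan.whatif_host_failures /localhost/DC/computers/vSAN-Cluster
vsan.resync_dashboard /localhost/DC/computers/vSAN-Cluster'
printf '%s\n' "$rvc_session"
```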

Manual Rebalance:

You can manually rebalance through the cluster health check, or by using RVC commands.

If the vSAN disk balance health check fails, you can initiate a manual rebalance. Under Cluster health, access the vSAN Disk Balance health check, and click the Rebalance Disks button.

Use the following RVC commands to manually rebalance the cluster:

  • vsan.check_limits
Verifies whether any capacity device in the vSAN cluster is approaching the 80 percent threshold limit.
  • vsan.proactive_rebalance [opts] <cluster>
Manually starts the rebalance operation. When you run the command, vSAN scans the cluster for the current distribution of components, and begins to balance the distribution of components in the cluster. Use the command options to specify how long to run the rebalance operation in the cluster, and how much data to move each hour for each vSAN host. For more information about the command options for managing the rebalance operation, see the RVC Command Reference Guide.


When you manually rebalance the disks, the operation runs for the selected time period, or ends early if no more data needs to be moved. The default time period is 24 hours.
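As a sketch, a manual rebalance driven entirely from RVC could look like this. The --time-span value is in seconds (86400 matches the 24-hour default), the cluster path is a placeholder, and the option names should be verified against vsan.proactive_rebalance --help in your RVC build, since they can vary between versions. The snippet only prints the commands, because they run inside the interactive RVC console:

```shell
# RVC console commands for a manual (proactive) rebalance - printed only,
# since they are entered at the RVC prompt, not a normal shell.
rebalance_session='vsan.proactive_rebalance --start --time-span 86400 /localhost/DC/computers/vSAN-Cluster
vsan.proactive_rebalance_info /localhost/DC/computers/vSAN-Cluster
vsan.proactive_rebalance --stop /localhost/DC/computers/vSAN-Cluster'
printf '%s\n' "$rebalance_session"
```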

Using the vSphere Client or vSphere Web Client

Navigate to the vSAN cluster.
Click the Monitor tab.

vSphere Client:
  • Under vSAN, select Health. 
  • Expand the Cluster health category, and select vSAN Disk Balance. You can review the disk balance of the hosts. 
  • In the vSAN Disk Balance pane, click Rebalance Disks. 

vSphere Web Client:
  • Click vSAN. 
  • Under vSAN, click Health. 
  • Expand the Cluster health category, and select vSAN Disk Balance. 
  • You can review the disk balance of the hosts. 
  • Click the Proactive Rebalance Disks button to rebalance your cluster.

Tuesday, 26 February 2019

vSAN Architecture Components

We are going to discuss the vSAN architecture components in this post. vSAN is generally easy to work with through management tools such as the vSphere Client, the vSphere Web Client, and the Ruby vSphere Console (RVC), and it greatly simplifies administrative tasks. From a troubleshooting point of view, though, it is important to know how things work under the hood. Behind the scenes, vSAN operations are managed by the following components.

vSAN architecture Components:

Image: VMware

  • Cluster Membership, Monitoring, and Directory Services (CMMDS):
    • CMMDS provides the topology and object configuration information to the CLOM and DOM.
    • CMMDS discovers, maintains, and establishes a cluster of networked node members
    • CMMDS defines the cluster roles: Master, Backup, and Agent
    • CMMDS selects the owners of the objects.
    • CMMDS inventories all items, such as hosts, networks and devices
    • CMMDS stores object metadata information, such as policy-related information on an in-memory database
  • Cluster-Level Object Manager (CLOM):
    • The CLOM process runs on each ESXi host in the vSAN cluster.
    • The CLOM process validates whether objects can be created, based on the assigned policies and the resources available in the vSAN cluster.
    • CLOM also defines the creation and migration of objects.
    • CLOM distributes loads evenly across vSAN nodes.
    • CLOM also manages proactive and reactive re-balance.
    • CLOM is responsible for object compliance.
  • Distributed Object Manager (DOM):
    • DOM runs on each ESXi host in the cluster. 
    • DOM receives instructions from the CLOM and other DOMs running on other hosts in the cluster.
    • DOM communicates with LSOM and instructs it to create local components of an object.
    • Each object in a vSAN cluster has a DOM owner and a DOM client.
    • There is one DOM owner that exists per object and it determines which processes are allowed to send I/O to the object.
    • The DOM client performs the I/O to an object on behalf of a particular virtual machine and runs on every node that contains components.
    • DOM services on ESXi hosts in vSAN cluster communicate with each other to co-ordinate the creation of components.
    • DOM resynchronizes objects during recovery.
  • Local Log Structured Object Manager (LSOM):
    • LSOM creates the local components as instructed by the DOM.
    • LSOM performs the encryption process for the vSAN datastore when enabled.
    • LSOM interacts directly with the solid-state and magnetic devices.
    • LSOM performs solid-state drive log recovery when the vSAN node boots up.
    • LSOM reports unhealthy storage and network devices.
    • LSOM performs I/O retries on failing devices.
    • LSOM provides read and write buffering.
  • Reliable Datagram Transport (RDT):
    • RDT is the network protocol used for the transmission of vSAN traffic over the vSAN network.
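One quick way to see CMMDS state for yourself: each ESXi host reports its current cluster role (Master, Backup, or Agent). The sketch below only prints the command, since it must be run in an ESXi shell on a cluster member:

```shell
# esxcli command that reports vSAN cluster membership and the CMMDS role.
cmd='esxcli vsan cluster get'
printf '%s\n' "$cmd"
# Among other fields, its output includes a line such as:
#   Local Node State: MASTER
```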

Tuesday, 8 January 2019

vROPS 6.6 architecture components

In vROPS 6.0, VMware introduced a new platform design to meet the goals listed below.

  • Treat all solutions equally and manage both VMware and third party solutions.
  • Highly scalable platform with minimal reconfigurations and redesign requirements
  • Monitoring solution with native high availability
The following diagram shows the components of vRealize Operations Manager 6.6.

Let's talk about each of these components in detail.

Watchdog:
  • Watchdog maintains the vROPS services/daemons.
  • Watchdog attempts to restart any vROPS daemon that is in a failed state.
  • The vcops-watchdog Python script runs every 5 minutes to check the vROPS services.
  • Watchdog service checks include:
    • PID file of service
    • Service status
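The two checks boil down to: does the service's PID file exist, and is the process it names still alive. A rough stand-in, using the current shell as the "service" and a temporary PID file (the real watchdog's paths and restart logic differ):

```shell
# Simulate a running service by recording the current shell's PID.
pidfile=$(mktemp)
echo $$ > "$pidfile"

# Watchdog-style check: PID file present, and the recorded process alive?
if [ -s "$pidfile" ] && kill -0 "$(cat "$pidfile")" 2>/dev/null; then
  status="running"
else
  status="failed - would restart the service"
fi
echo "service status: $status"
rm -f "$pidfile"
```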
Apache2 HTTPD:
  • Provides the backend platform for the Tomcat instances that serve the vROPS UIs.
User Interfaces:
  • In vROPS 6.6, the user interface is split into two components:
    • Product UI
    • Admin UI
Product UI:
  • Hosted by Pivotal tc Server (based on Apache Tomcat)
  • Can be accessed using https://<Nodename>/ui/login.action
  • This UI is present in all node roles (Master, Replica, Data) except the Remote Collector role
  • The primary purpose of the Product UI is to make GemFIRE calls to the Controller API to access data and to create views, dashboards, and reports.
Admin UI:
  • Hosted by Pivotal tc Server (based on Apache Tomcat)
  • Used to perform administrative tasks using HTTP REST calls to admin API
  • Can be accessed using https://<NodeName>/admin 
Suite API:
  • It is a public-facing API
  • Used for automation and scripting of common tasks.
  • Also used by vROPS for administrative tasks.
Collector:
  • Responsible for pulling inventory and metric data from configured sources using data adapters.
  • After collecting data, collector contacts GemFIRE locator to locate one or more Controller Cache servers.
  • Then Collector connects to one or more Controller cache servers and sends collected data.
  • The collector sends a heartbeat to the controller every 30 seconds via the HeartbeatThread process that runs on the collector (the collector runs a maximum of 25 data collection threads).
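The liveness logic behind that heartbeat is a timeout check. A toy version using the 30-second interval above and a made-up last-seen timestamp; the three-missed-intervals policy is an assumption for illustration, not documented vROPS behavior:

```shell
heartbeat_interval=30          # collector heartbeat period, in seconds
now=$(date +%s)
last_seen=$(( now - 95 ))      # pretend the last heartbeat arrived 95 seconds ago

# Assumed policy: flag the collector after three missed heartbeat intervals.
if [ $(( now - last_seen )) -gt $(( 3 * heartbeat_interval )) ]; then
  collector_state="down"
else
  collector_state="up"
fi
echo "collector: $collector_state"
```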
GemFIRE Locator:
  • Runs on Master node and Replica node.
  • On data collector and remote collector, GemFIRE runs as client process.
GemFIRE:
  • VMware vFabric GemFIRE is an in-memory, low-latency data grid.
  • Runs in same JVM as the controller and analytics.
  • Scales as needed when nodes are added to cluster.
  • GemFIRE allows caching, processing, and retrieval of metrics
  • Dependent on GemFIRE locator
Controller:
  • The Controller is a sub-process of the analytics process.
  • Monitors collector status every 1 minute.
  • Controller node runs HeartbeatServer thread which processes heartbeats from collector.
  • Responsible for co-ordinating activities between cluster members.
  • Manages storage and retrieval of inventory objects within system.
  • Leverages the MapReduce model (the same model Google uses for search) for selective queries and faster results.
Analytics:
  • The analytics layer is responsible for:
    • Metric calculations
    • Dynamic threshold
    • Alerts and Alarms
    • Storage and retrieval of metrics from Persistence layer
    • Root Cause Analysis
    • HIS metadata calculations and object relationship data.
  • Analytics works with GemFIRE, Controller and persistence layer 
  • Responsible for generating SMTP/SNMP alerts on Master and replica node.
Persistence layer:
  • Also known as the database layer.
  • This layer consists of a series of databases, each performing a different function depending on the node role.
  • There are five primary database services:
    • Cassandra DB:
      • Introduced in vROPS 6.1
      • It is an Apache Cassandra DB
      • Replaces Global xDB in earlier versions
      • Stores all settings that are applied globally (CONTENT folder).
      • Designed to handle large structured data across multiple nodes.
      • Provides HA with no single point of failure
      • Highly scalable.
      • Sharding is not used by this DB
      • Stores below content
        • User preferences and configuration
        • Alerts Definition
        • Customizations
        • Dashboards, Policies and view
        • Reports, licensing
        • Shard maps
        • Activities
    • Central DB:
      • It is a PostgreSQL DB
      • Also called repl
      • Sharding is not used by this DB
      • Exists only on the Master node, and on the Replica node when HA is enabled
      • Accessible via port 5433
      • Located at /storage/db/vcops/vpostgres/repl
      • Stores resource inventory information only.
    • Alerts/HIS DB:
      • Also called data
      • It is a PostgreSQL DB
      • Stores Alerts and Alarms history, history of resource property data, and history of resource relationship. 
      • Exists on Master, Replica and Data nodes
      • Accessible via port 5432
      • Sharding is used by this DB.
      • Located at /storage/db/vcops/vpostgres/data
    • FSDB:
      • FSDB is a GemFIRE Server which runs inside analytics JVM.
      • FSDB contains all raw time series metrics and super metrics data for resources.
      • It stores data collected by data adapters.
      • Also stores data calculated or generated after analysis
      • FSDB uses sharding to distribute data for new objects
      • FSDB is available on the Master, Replica, and Data nodes in vROPS
      • Sharding is used by this DB
    • CaSa DB:
      • Also called HSQL
      • It is a small, flat, JSON-Based, in-memory DB
      • Used by CaSA for cluster administration 
      • Sharding is not used by CaSA DB
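"Sharding" above means each object's data is kept on one node, chosen deterministically, with the shard map recording which node owns which object. A toy hash-based placement over three data nodes (illustrative only, with hypothetical object names; not vROPS's actual algorithm):

```shell
nodes=3
shard_map=""
# cksum gives a stable numeric hash of each object name.
for obj in vm-101 vm-102 host-7; do
  h=$(printf '%s' "$obj" | cksum | cut -d' ' -f1)
  shard_map="${shard_map}${obj}:node$(( h % nodes )) "
done
echo "$shard_map"
```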
