percona · catalinaadam · Jan 10, 2025 · Jan 13, 2025
@@ -1,35 +1,61 @@
-# MongoDB Cluster Summary
+# MongoDB Sharded Cluster Summary
 
 ![!image](../../images/PMM_MongoDB_Cluster_Summary.jpg)
 
-## Current Connections Per Shard
+## Overview
 
-TCP connections (Incoming) in mongod processes.
+Displays essential data for individual nodes, such as their role, CPU usage, memory consumption, disk space, network traffic, uptime, and the current MongoDB version.
 
-## Total Connections
+## Node States
+Shows the state timeline of MongoDB replica set members during the selected time range. Each node's state (PRIMARY, SECONDARY, ARBITER, etc.) is color-coded for easy monitoring, with green indicating healthy states and red showing potential issues. 
 
-Incoming connections to mongos nodes.
+Use this to track role changes and identify stability problems across your replica set.
 
-## Cursors Per Shard
+## Collection Details
 
-The Cursor is a MongoDB Collection of the document which is returned upon the find method execution.
+### Size of Collections in Shards
+Visualizes the storage size distribution across MongoDB collections in different shards, excluding system databases. Use this metric to monitor space utilization across collections and plan capacity based on storage growth patterns in your MongoDB cluster.
 
-## Mongos Cursors
+### Number of Collections in Shards
+Displays the total number of collections per database across different shards in your MongoDB cluster, excluding system databases. 
 
-The Cursor is a MongoDB Collection of the document which is returned upon the find method execution.
+Use this to track collection growth and identify databases that may need optimization based on their collection count.
 
-## Operations Per Shard
+## Connections
 
-Ops/sec, classified by legacy wire protocol type (`query`, `insert`, `update`, `delete`, `getmore`).
+### Current Connections Per Shard
+Displays the current number of incoming TCP connections for each MongoDB shard, showing trends over time with mean, maximum, and minimum values.
 
-## Total Mongos Operations
+Use this to monitor connection patterns and ensure your MongoDB cluster maintains healthy connection levels across all shards.
 
-Ops/sec, classified by legacy wire protocol type (`query`, `insert`, `update`, `delete`, `getmore`).
+### Available Connections
+Tracks the number of available MongoDB connections across your replica sets over time, with statistical breakdowns. 
 
-## Change Log Events
+Use this metric to monitor connection capacity and ensure your MongoDB cluster maintains sufficient connection availability for client requests.
 
-Count, over last 10 minutes, of all types of configuration db changelog events.
+## Chunks in Shards
 
-## Oplog Range by Set
+### Amount of Chunks in Shards
+Displays the number of chunks on each shard in your MongoDB cluster, excluding system databases. Use this to monitor data distribution and identify potential balancing needs across your sharded cluster.
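As a rough illustration of reading this panel, the sketch below compares per-shard chunk counts and reports the gap between the fullest and emptiest shard. The function name `chunk_spread` and the sample counts are hypothetical and do not reflect how PMM computes the panel.

```python
def chunk_spread(chunks_per_shard):
    """Return (fullest shard, emptiest shard, difference in chunk count)."""
    most = max(chunks_per_shard, key=chunks_per_shard.get)
    least = min(chunks_per_shard, key=chunks_per_shard.get)
    return most, least, chunks_per_shard[most] - chunks_per_shard[least]


# Hypothetical chunk counts read off the panel for three shards.
counts = {"shard0": 120, "shard1": 118, "shard2": 64}
print(chunk_spread(counts))  # ('shard0', 'shard2', 56)
```

What counts as a worrying gap depends on the deployment, so treat the number as a prompt for a closer look rather than a fixed rule.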
 
-Timespan 'window' between oldest and newest ops in the Oplog collection.
+### Dynamic of Chunks
+Shows the rate of change in chunk distribution across MongoDB shards over time, with statistical breakdowns for each shard. Use this to monitor chunk migration patterns and ensure proper data balancing across your sharded cluster.
+
+### Chunks Move Events
+Displays the frequency of chunk movement operations between shards in your MongoDB cluster over time. Use this metric to track balancing activity and identify periods of high chunk migration that might impact cluster performance.
+
+### Chunks Split Events
+Shows the rate at which chunks are being split across your MongoDB sharded cluster due to size growth. Use this metric to identify when collections grow rapidly and determine if you need to rebalance or optimize shard keys.
+
+## Replication
+
+### Replication Lag by Shard
+Tracks the maximum replication delay (in seconds) between primary and secondary nodes for each shard in your MongoDB cluster. 
+
+Use this to monitor replication health and detect when secondaries fall too far behind their primary nodes.
+
+### Oplog Range by Shard
+Shows the time window between the oldest and newest operations in the MongoDB oplog for each shard. Use this to monitor oplog capacity and ensure there's enough history for replica set members to sync after maintenance or failures.
+
+### Oplog GB/Hour
+Shows the hourly oplog data volume written to cache by the MongoDB primary server. Use this metric to monitor write intensity, plan storage capacity, and identify periods of high write activity.
@@ -2,26 +2,86 @@
 
 ![!image](../../images/PMM_MongoDB_ReplSet_Summary.jpg)
 
-## Replication Lag
+## Overview
+Displays essential data for individual nodes, such as their role, CPU usage, memory consumption, disk space, network traffic, uptime, and the current MongoDB version.
 
-MongoDB replication lag occurs when the secondary node cannot replicate data fast enough to keep up with the rate that data is being written to the primary node. It could be caused by something as simple as network latency, packet loss within your network, or a routing issue.
+## Node States
+Shows the state timeline of MongoDB replica set members during the selected time range. Each node's state (PRIMARY, SECONDARY, ARBITER, etc.) is color-coded for easy monitoring, with green indicating healthy states and red showing potential issues. Use this to track role changes and identify stability problems across your replica set.
 
-## Operations - by service name
+## Details
 
-Operations are classified by legacy wire protocol type (insert, update, and delete only).
+### Command Operations
+Shows the rate of MongoDB operations per second, including both regular and replicated operations (query, insert, update, delete, getmore), as well as document deletions by TTL indexes. Use this metric to monitor database activity patterns and identify potential performance bottlenecks.
 
-## Max Member Ping Time - by service name
+### Top Hottest Collections by Read
+Shows the five MongoDB collections with the highest read operations per second. Use this to identify your most frequently accessed collections and optimize their performance.
 
-This metric can show a correlation with the replication lag value.
+### Top Hottest Collections by Write
+Shows the five MongoDB collections with the highest write operations (inserts, updates, and deletes) per second. Use this to identify your most frequently modified collections and optimize their write performance.
 
-## Max Heartbeat Time
+### Query Efficiency
+Shows the ratio of documents or index entries scanned to documents returned. A ratio of 1 indicates optimal query performance, where each scanned document matches the query criteria.
 
-Time span between now and last heartbeat from replicaset members.
+Higher values suggest less efficient queries that scan many documents to find matches. Use this to identify queries that might need index optimization.
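The ratio described above can be sketched as a simple division. This is a minimal illustration with a hypothetical function name and made-up counter values, not the query PMM uses for the panel.

```python
from typing import Optional


def query_efficiency(scanned: int, returned: int) -> Optional[float]:
    """Return the scanned-to-returned ratio, or None when nothing was returned."""
    if returned <= 0:
        return None  # the ratio is undefined without returned documents
    return scanned / returned


# Every scanned document was returned: the optimal ratio of 1.
print(query_efficiency(500, 500))  # 1.0
# Many documents scanned per match: a candidate for index optimization.
print(query_efficiency(12000, 40))  # 300.0
```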
 
-## Elections
+### Queued Operations
+Shows the number of operations queued while the database is busy with other operations. Use this to identify when operations are being delayed by resource contention.
 
-Count of elections. Usually zero; 1 count by each healthy node will appear in each election. Happens when the primary role changes due to either normal maintenance or trouble events.
+### Reads & Writes
+Shows both active and queued read/write operations in your MongoDB deployment. Use this to monitor database activity and identify when operations are being delayed due to high load.
 
-## Oplog Recovery Window - by service name
+### Connections
+Shows the number of current and available MongoDB connections. Use this to monitor connection usage and ensure your deployment has sufficient capacity for new client connections.
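One way to read the two values together is as a usage percentage. The sketch below assumes the connection limit is the sum of current and available connections. The function name and sample numbers are hypothetical.

```python
def connection_usage(current: int, available: int) -> float:
    """Return the share of the connection limit in use, as a percentage."""
    total = current + available
    if total == 0:
        return 0.0
    return 100.0 * current / total


# Hypothetical reading: 150 connections in use, 850 still available.
print(connection_usage(150, 850))  # 15.0
```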
 
-Timespan 'window' between newest and the oldest op in the Oplog collection.
+### Query Execution Times
+Shows the average latency in microseconds (µs) for read, write, and command operations. Use this metric to monitor query performance and identify slow operations that may need optimization.
+
+## Collection Details
+
+### Size of Collections
+Shows the storage size of MongoDB collections across different databases. Use this to monitor database growth and plan storage capacity needs.
+
+### Number of Collections
+Shows the total number of collections in each MongoDB database. Use this to track database organization and growth patterns.
+
+## Replication
+
+### Replication Lag
+Shows how many seconds secondary nodes lag behind the primary in replicating data. Higher values indicate potential issues with network latency or system resources. The red threshold line at 10 seconds helps identify when lag requires attention.
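To make the lag figure concrete, the sketch below derives per-secondary lag from the last applied operation time of each member and flags anything above the 10-second threshold mentioned above. The member dictionaries loosely mirror the `name`, `stateStr`, and `optimeDate` fields of `replSetGetStatus`, but the host names, timestamps, and function name are assumptions for illustration.

```python
from datetime import datetime, timezone

LAG_THRESHOLD_SECONDS = 10  # the red threshold line on the panel


def replication_lag(members):
    """Return seconds of lag behind the primary for each secondary."""
    primary = next(m for m in members if m["stateStr"] == "PRIMARY")
    lags = {}
    for m in members:
        if m["stateStr"] == "SECONDARY":
            delta = primary["optimeDate"] - m["optimeDate"]
            # Clamp at zero in case a secondary reports a slightly newer time.
            lags[m["name"]] = max(delta.total_seconds(), 0.0)
    return lags


def at(second):
    """Build a sample UTC timestamp within the same minute."""
    return datetime(2025, 1, 13, 12, 0, second, tzinfo=timezone.utc)


# Hypothetical replica set members.
members = [
    {"name": "node1:27017", "stateStr": "PRIMARY", "optimeDate": at(30)},
    {"name": "node2:27017", "stateStr": "SECONDARY", "optimeDate": at(28)},
    {"name": "node3:27017", "stateStr": "SECONDARY", "optimeDate": at(5)},
]

lags = replication_lag(members)
lagging = [name for name, lag in lags.items() if lag > LAG_THRESHOLD_SECONDS]
print(lags)     # {'node2:27017': 2.0, 'node3:27017': 25.0}
print(lagging)  # ['node3:27017']
```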
+
+### Oplog Recovery Window
+Shows the time range (in seconds) between the newest and oldest operations in the oplog. Use this to ensure sufficient history is maintained for recovery and secondary synchronization.
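The window is the difference between the newest and oldest oplog timestamps. A minimal sketch, assuming both timestamps are already available as timezone-aware datetimes (the values and function name below are made up):

```python
from datetime import datetime, timezone


def oplog_window_seconds(oldest: datetime, newest: datetime) -> float:
    """Return the time span covered by the oplog, in seconds."""
    return (newest - oldest).total_seconds()


# Hypothetical oldest and newest oplog entry times.
oldest = datetime(2025, 1, 10, 8, 0, 0, tzinfo=timezone.utc)
newest = datetime(2025, 1, 11, 14, 0, 0, tzinfo=timezone.utc)

window = oplog_window_seconds(oldest, newest)
print(window)         # 108000.0
print(window / 3600)  # 30.0
```

In this example a secondary that stays offline for longer than 30 hours could no longer catch up from the oplog alone.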
+
+### Oplog GB/Hour
+Shows the hourly volume of oplog data written to cache by the primary server. Use this to track oplog growth, plan storage needs, and detect high-write periods. Values are displayed in bytes at hourly intervals.
+
+## Performance
+
+### Flow Control
+Shows the frequency and duration (in microseconds) of MongoDB write throttling. Use this to understand when your deployment is slowing down writes to keep replication lag under control.
+
+### WiredTiger Concurrency Tickets Available
+Shows how many more read and write operations your MongoDB deployment can handle simultaneously. Use this to monitor database concurrency limits and potential bottlenecks.
+
+## Nodes Summary
+
+### Nodes Overview
+Shows key system metrics for each node: uptime, load average, memory usage, disk space, and more. Use this table to monitor the health and resource utilization of your infrastructure at a glance.
+
+## CPU Usage
+Shows CPU utilization as a percentage of total capacity, broken down by user and system activity. Use this to monitor CPU load and identify potential performance bottlenecks.
+
+## CPU Saturation
+
+### CPU Saturation and Max Core Usage
+Shows how many processes are waiting for CPU time and how heavily the busiest core is utilized. Use this to identify when your system needs more CPU capacity or when processes are competing for CPU time.
+
+## Disk I/O and Swap Activity
+Shows disk I/O operations (reads/writes) and memory swap activity for each MongoDB node, measuring data flow between storage and RAM. 
+
+Use this metric to monitor storage performance, detect memory pressure, and identify when MongoDB's working set may exceed available RAM.
+
+## Network Traffic
+Shows inbound and outbound network traffic for each MongoDB node, measuring data flow in bytes per second. 
+
+Use this metric to monitor bandwidth usage, identify unusual traffic patterns, and detect potential network bottlenecks that could affect replication performance.