
Releases: scylladb/scylla-monitoring

Branch 2.4

02 Jul 13:18

New

  • Adding Scylla 3.1 dashboards #594
  • Adding the error Dashboard #576
  • Add a graph for ad-hoc queries #610
  • Allow changing aggregation modes #520
  • State which metrics are coordinators and which replicas #521
  • Allow external directory for grafana #597
  • Improved genconfig using nodetool #613
  • Version per node table #151
  • Add a warning on low disk space on the root partition #627
  • Add alert when a node is leaving the cluster #623
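The last two items are Prometheus alert rules. As an illustrative sketch only (the alert name, threshold, and labels below are assumptions, not the rules shipped in the release), a low-root-disk alert of this kind could look like:

```yaml
# Hypothetical sketch of a low-disk-space alert; names and threshold are
# assumptions, not the shipped prometheus rules file.
groups:
  - name: scylla.rules
    rules:
      - alert: RootDiskLow
        # node_exporter (>= 0.16) exposes free/total bytes per mounted filesystem;
        # older versions use node_filesystem_avail / node_filesystem_size instead
        expr: node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"} < 0.1
        for: 5m
        labels:
          severity: warning
        annotations:
          description: "Root partition on {{ $labels.instance }} has less than 10% free space"
```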

Bug Fixes

  • cql-optimization shows N/A for cross dc when there are no reads in a DC #608
  • wrong metrics for compaction shares #604
  • the group by shard is incorrect #337
  • scylla manager metric scylla_manager_cluster_cql_rtt_ms renamed #637
  • Manager 1.4 dashboard is broken because of cql metrics rename #661
  • cql optimization cross shard is off and should be removed #659

Branch 2.3.1

19 May 12:51

Bug Fixes

  • Compactions I/O Queue delay by Shard should be filtered by mountpoint as well #588
  • scylla manager panels not showing information when there are multiple keyspaces #602
  • wrong metrics for compaction shares #604

Branch 2.3

29 Apr 06:31

New in Scylla Monitoring Stack 2.3

  • Scylla Enterprise dashboards for 2019.1 (#538)
  • Scylla Manager dashboard for 1.4 (#557)
  • Add cross_shard_ops panel to cql optimization (#553)
  • Dashboards are precompiled in the release
  • Cluster name in use is shown in the dashboard (#533)
  • genconfig.py with multi dc support (#513)
  • Add a storage usage over time panel (#466)
  • Upgrade prometheus to 2.7.2 (#456)
  • Show more information about compaction (#491)
  • Alertmanager and Prometheus alerts can be configured from the command line
  • Warn users when starting docker as root and make grafana volume sharable
  • Add a disk usage over time graph (#466)
  • Prometheus data directory accepts a relative path (#527)
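As a usage sketch for the command-line items above (the exact flag names are assumptions, not verified against this release; check `./start-all.sh --help`), generating targets and starting the stack with a relative Prometheus data directory might look like:

```shell
# Sketch only -- flag names are assumptions, not verified against this release.
# Generate Prometheus target files from nodetool output (now multi-DC aware, #513):
nodetool status | ./genconfig.py

# Start the stack, pointing Prometheus at a relative data directory (#527):
./start-all.sh -d ./prometheus-data
```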

Bug Fixes

  • Prometheus.rules.yaml NoCQL rule looks for cql_status metric (#541)
  • not all 2018.1 use cluster and dc (#540)

Branch 2.2

19 Mar 08:44

New In 2.2

  • CQL optimization dashboard (#471)
  • Unified target files for Scylla and node_exporter (#378)
  • Per machine (node_exporter related) dashboard added to Enterprise (#495)
  • Prometheus container uses the current user ID and group (#487)
  • Kill-all kills Prometheus instances gracefully (#438)
  • Start-all.sh now supports --version flag (#374)
  • Remove the version from the dashboard names (#486)
  • Dashboard loaded from API should have overwrite true (#474)
  • Update alertmanager to 0.16 (#478)
Bug Fixes

  • Moved the node_exporter relabeling to metric_relabeling (#497)
  • Fixed units in foreground writes (#463)
  • Manager dashboard was missing UUID (#505)
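The --version flag added in #374 gives a quick way to check which monitoring stack version is deployed without starting the containers; a minimal usage sketch:

```shell
# Print the monitoring stack version instead of starting the stack
./start-all.sh --version
```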

Branch 2.1

10 Feb 11:08

Main changes:

  • Move to Grafana 5
  • Use local files for configuration and provisioning
  • Minor bug fixes
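Grafana 5 introduced file-based provisioning, which is what the local-files item refers to. A minimal dashboard-provider sketch in Grafana 5's provisioning format (the provider name and path here are assumptions, not the files shipped with the release):

```yaml
# Hypothetical Grafana 5 dashboard provisioning file; name and path are assumptions.
apiVersion: 1
providers:
  - name: 'scylla-dashboards'
    orgId: 1
    type: file
    disableDeletion: false
    options:
      # Directory Grafana scans for dashboard JSON files
      path: /var/lib/grafana/dashboards
```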

Branch 2.0

26 Dec 08:07
scylla-monitoring-2.0

Fixed a missing closing bracket in dropped view updates

Branch 1.1.0

12 Aug 08:41
Pre-release
disk usage should be per node (#360)

This series sets the disk-usage pie chart to be per node, so that the repeated
panel shows per-server usage.

Signed-off-by: Amnon Heiman <[email protected]>

scylla-monitoring-1.0.0

05 Jul 14:37
Adding a new cpu dashboard (#336)


Replaces: enhance per server dashboard with useful metrics

Adding a new dashboard that specializes in CPU load
 - Adding a graph with foreground CPU utilization, that is, the CPU used by
   request processing, excluding compaction, flushes, and other background work.
   The reason for this is that users are usually scared of spikes. Even if we tell
   them that spikes are fine because they are the result of isolatable background
   processes, it is hard to *prove* that without further analysis. This graph will help.

 - Time spent in violations: a lot of the latency issues we see, especially in
   the higher percentiles, come from task quota violations. We now have a metric
   for this, and it will help us correlate latency spikes in time.

 - Client connections: in the past few months, this has been *THE* top metric we
   have been looking at to detect problems. It hurts us a lot that it is not
   part of the main dashboard.

In the process of doing the above, I am also doing my best to document the new
graphs. The text will appear in the tooltip in the top left corner of the graph.