Commit 31fa429

committed Mar 20, 2023
Convert some RST pages to markdown
1 parent 7f947f7 commit 31fa429

6 files changed: +155 -192 lines changed


‎README.md

+1
@@ -32,6 +32,7 @@ References
 * Alabaster https://github.com/bitprophet/alabaster
 * Pygments syntax highlighter https://pygments.org/languages/
 * MyST Parser https://myst-parser.readthedocs.io/en/latest/intro.html
+* https://raw.githubusercontent.com/executablebooks/MyST-Parser/602470ebdaf81fbea999fcc0f0cf1b8e7784ec15/tests/test_renderers/fixtures/sphinx_roles.md

 License
 -------

‎source/about.md

+37
@@ -0,0 +1,37 @@
## About

Alerta started at [The Guardian](https://www.theguardian.com/media/2011/jan/28/wikileaks-julian-assange-alan-rusbridger)
out of necessity as a replacement for a [legacy monitoring tool](https://www.quest.com/foglight/),
but only after exhaustively evaluating [all](https://www.solarwinds.com/)
[credible](https://www.nagios.org/) [alternatives](https://www.zabbix.com/)
first.

Initially all we wanted was to be able to create alert thresholds against the
hundreds of thousands of [Ganglia metrics](https://github.com/ganglia/monitor-core/wiki)
collected for the website and view the alerts in a web console, i.e. a Ganglia
"alerter". Not having a proper name for this
[metrics and monitoring system](https://www.theguardian.com/info/developer-blog/2012/oct/04/winning-the-metrics-battle),
the working name of "an alerter" stuck and a simple homophone was chosen
to aid future Google searches.

In the end, thresholding metrics proved very difficult to scale, so we
eventually split the project in two: metric thresholding was handed to Riemann
(see [riemann-config](https://github.com/guardian/riemann-config)), while
alert correlation, de-duplication and visualisation became the "Alerta" project.

Over the years the project has evolved to meet the constantly changing needs of
the [Guardian developer teams](https://www.theguardian.com/info/2021/oct/29/running-a-post-decade-innovation-retrospective)
as they moved to a more agile, dynamic, "[swimlaned](http://akfpartners.com/growth-blog/fault-isolative-architectures-or-swimlaning)"
architecture. For the operations team this has meant a shift from static,
self-hosted infrastructure to an internal OpenStack cloud and then, finally, to an
external cloud service.

In that time certain key features of Alerta have been deprecated as requirements
changed (e.g. the message bus, Ganglia, Riemann) and others added (e.g. OAuth2 login,
CloudWatch, Pingdom, PagerDuty integration). In the process it has been slimmed
down to fewer core components, making it easier to understand, deploy and manage.

As a result, Alerta is now quite different to the somewhat "over-engineered" initial
solution, but the core concepts of being a flexible, easy-to-use tool remain and
it is now a "cloud-ready" solution adapted to the challenges of a fast-changing
environment.

‎source/about.rst

-45
This file was deleted.

‎source/conventions.md

+99
@@ -0,0 +1,99 @@
# Conventions

Always favour convention over configuration, and give any configuration
sensible defaults.

## Naming Conventions

### Resources

The key alert attribute name of `resource` was specifically chosen
so as not to be host-centric. A resource *can* be a hostname, but it
might also be an EC2 instance ID, a Docker container ID or some other
type of non-host unique identifier.

### Environments & Services

The environment attribute is used to [namespace](https://en.wikipedia.org/wiki/Namespace)
the alert resource. This allows you to have two resources with the same
name (e.g. `web01`) that are differentiated by their environments
(e.g. `Production` and `Development`).

Choose a set of environments and enforce them, i.e. `PROD`, `DEV`
or `Production`, `Development`, but not both. The same goes for services:
`MobileAPI`, `Mobile-API` and `mobile api` are all valid
but needlessly different, which makes them impossible to query consistently
or to generate aggregate metrics for.

Note that the **_service attribute is a list_** because it is common
for infrastructure (i.e. a resource) to be used by more than one service.
That is, a single component failure could cause an
outage in multiple services.
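
As a rough illustration of why inconsistent service names are hard to aggregate, the sketch below groups names that differ only by case or separators. The helper names `canonical` and `find_variants` are hypothetical and are not part of Alerta; the sample names are the ones used above.

```python
import re
from collections import defaultdict


def canonical(name: str) -> str:
    """Drop whitespace, hyphens and underscores, then lowercase the name."""
    return re.sub(r"[\s_\-]+", "", name).lower()


def find_variants(names):
    """Return groups of names that collapse to the same canonical form."""
    groups = defaultdict(set)
    for name in names:
        groups[canonical(name)].add(name)
    # Only report groups with more than one spelling.
    return {key: sorted(variants) for key, variants in groups.items() if len(variants) > 1}


services = ["MobileAPI", "Mobile-API", "mobile api", "Frontend"]
variants = find_variants(services)
```

Running something like this over the service names already in use is one way to spot needless variants before agreeing on a single spelling.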

### Event Names

It can be useful to define a convention when it comes to naming
events. Possible options are:

* Camel case - `DiskUtilHigh`
* Hierarchy - `NW:INTERFACE:DOWN`
* SNMP - `cpuAlarmHigh`

Querying for all disk utilisation alerts using the `alerta` CLI
is then relatively straightforward:

    $ alerta query --filter event=~DiskUtil

### Event Groups

Another consideration is to make use of the event group,
which gives you the ability to group related alerts.

Some suggested event groups with possible events are listed below.

| Event Groups    | Events (examples)                           |
|-----------------|---------------------------------------------|
| `Service`       | failures with entire services               |
| `Application`   | errors from application logs                |
| `OS`            | disk space, time sync failing               |
| `Performance`   | system load, swap utilisation high          |
| `Configuration` | config mgmt tool alerts e.g. Puppet or Chef |
| `Web`           | web server errors                           |
| `Syslog`        | unix system log messages                    |
| `Hardware`      | hardware errors                             |
| `Storage`       | NFS, SAN, NAS storage infrastructure        |
| `Database`      | database errors, table space utilisation    |
| `Security`      | security/authorization messages             |
| `Network`       | network devices and infrastructure          |
| `Cloud`         | cloud-based services or infrastructure      |

Querying for all performance-related alerts using the `alerta` CLI
could then become:

    $ alerta query --filter group=Performance

### Severity Levels

Agree on a subset of [severity levels](api/alert.rst#alert-severities) and
be consistent with what they mean. For example, if severity levels are used
consistently then integrating with a paging or email system becomes easier.

| Severity   | Service Level                        | Notification                 |
|------------|--------------------------------------|------------------------------|
| `critical` | service unavailable                  | immediate page out           |
| `major`    | service impaired but still available | page during business hours   |
| `minor`    | component failure                    | email only                   |
| `warning`  | everything else                      | consolidate into daily email |
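
A severity table like the one above could be turned into a routing rule along the following lines. This is only a sketch: the channel labels (`page`, `page-business-hours`, `email`, `daily-digest`, `none`) and the function name are made up for illustration and do not correspond to any Alerta setting.

```python
# Hypothetical channel labels, one per row of the severity table.
NOTIFICATION_POLICY = {
    "critical": "page",                 # immediate page out
    "major": "page-business-hours",     # page during business hours
    "minor": "email",                   # email only
    "warning": "daily-digest",          # consolidate into daily email
}


def notification_for(severity: str) -> str:
    """Look up the agreed channel; severities outside the subset get no notification."""
    return NOTIFICATION_POLICY.get(severity.lower(), "none")
```

Keeping the mapping in one place means a paging or email integration only has to agree on the severity names, not on what each team thinks they mean.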

## Enforcing Conventions

Once a set of naming conventions is agreed, it can be enforced by
using a simple "pre-receive" plugin, similar to a [`git` hook](https://git-scm.com/book/en/v2/Customizing-Git-Git-Hooks).

A full working example called [reject][reject] can be found in the plugins
directory of the project code repository and is installed by default.
The server configuration settings {envvar}`ORIGIN_BLACKLIST` and
{envvar}`ALLOWED_ENVIRONMENTS` can be used to tailor it for your
circumstances, or it can be disabled completely.

[reject]: https://github.com/alerta/alerta/blob/master/alerta/plugins/reject.py
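
To give a feel for the kind of check a pre-receive step might perform, here is a minimal standalone sketch. It deliberately does not use the Alerta plugin API or reproduce the linked `reject` plugin: the alert is modelled as a plain dictionary, `check_alert` and `ConventionError` are hypothetical names, and the allowed environments and the camel-case rule are assumptions taken from the examples on this page.

```python
import re

# Assumed values, taken from the examples above rather than from any real config.
ALLOWED_ENVIRONMENTS = {"Production", "Development"}
# Upper camel case such as "DiskUtilHigh": one or more capitalised words.
CAMEL_CASE_EVENT = re.compile(r"^(?:[A-Z][a-z0-9]+)+$")


class ConventionError(ValueError):
    """Raised when an alert breaks an agreed naming convention."""


def check_alert(alert: dict) -> dict:
    """Return the alert unchanged if it follows the conventions, else raise."""
    environment = alert.get("environment")
    if environment not in ALLOWED_ENVIRONMENTS:
        raise ConventionError(f"environment {environment!r} is not allowed")

    service = alert.get("service")
    if not isinstance(service, list) or not service:
        raise ConventionError("service must be a non-empty list")

    event = alert.get("event") or ""
    if not CAMEL_CASE_EVENT.match(event):
        raise ConventionError(f"event {event!r} is not camel case")

    return alert
```

Rejecting at receive time, rather than cleaning up afterwards, keeps badly named alerts out of the database in the first place.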

‎source/conventions.rst

-130
This file was deleted.

‎source/index.rst

+18-17
@@ -45,15 +45,14 @@ The ``alerta`` command-line tool can also be used to generate alerts.
 The required API key is ``demo-key``.

 .. toctree::
-   :caption: Introduction
+   :caption: Basics
    :maxdepth: 2
    :hidden:

    quick-start
-   design
    server
-   webui
    cli
+   webui
    configuration
    authentication
    authorization
@@ -76,16 +75,32 @@ The required API key is ``demo-key``.
    :maxdepth: 2
    :hidden:

+   design
    conventions
    development
    Tutorials <tutorials>
    resources
+   faq
+   release-notes
+
+.. toctree::
+   :caption: API
+   :glob:
+   :maxdepth: 2
+   :hidden:

    api/reference
    api/query-syntax
    api/alert
    api/heartbeat

+.. toctree::
+   :caption: More
+   :maxdepth: 2
+   :hidden:
+
+   about
+
 Contribute
 ----------

@@ -102,25 +117,11 @@ Support
 * :ref:`Frequently Asked Questions <faq>`
 * Issue Tracker: https://github.com/alerta/alerta/issues

-.. toctree::
-   :caption: More
-   :maxdepth: 1
-   :hidden:
-
-   faq
-
 License
 -------

 This project is licensed under the Apache license, Version 2.0 .

-.. toctree::
-   :maxdepth: 2
-   :hidden:
-
-   release-notes
-   about
-
 Indices and tables
 ==================
