Commit 8402746

Add more dev tools and update readme (#27)
This PR adds:

1. A gRPC code generation script (`make gen`)
2. A cleaner script (`make clean`)
3. A sample data downloader (`make get-dataset`)
4. A major update of README.md after the OSPP and GSOC submissions
5. Log-analysis architecture/design diagrams

Signed-off-by: Superskyyy <[email protected]>
1 parent eae743d commit 8402746

14 files changed: +466 -60 lines

Diff for: Makefile (+3 -9)

@@ -24,14 +24,14 @@ endif
 $(VENV):
 	python3 -m venv $(VENV_DIR)
 	poetry run python -m pip install --upgrade pip
+	poetry install --sync

 all: gen get-dataset prune-dataset lint license clean

 .PHONY: all

 gen:
-	poetry run python -m pip install grpcio-tools
-	poetry run python -m tools.grpc_code_gen
+	poetry run python -m tools.grpc_gen

 #argument indicates a dataset name defined in sample_data_manager.py
 get-dataset:
@@ -53,11 +53,5 @@ lint-fix: lint-setup
 	$(VENV)/unify -r --in-place .
 	$(VENV)/flynt -tc -v .

-# todo make this work on windows
 clean:
-	find . -name "*.egg-info" -exec rm -r {} +
-	find . -name "dist" -exec rm -r {} +
-	find . -name "build" -exec rm -r {} +
-	find . -name "__pycache__" -exec rm -r {} +
-	find . -name ".pytest_cache" -exec rm -r {} +
-	find . -name "*.pyc" -exec rm -r {} +
+	poetry run python -m tools.cleaner
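The rewritten `gen` target delegates to `tools/grpc_gen`, whose source is not part of this diff. Below is a minimal, hypothetical sketch of such a generator, assuming it shells out to `grpc_tools.protoc` from the `grpcio-tools` dependency this commit pins in pyproject.toml; the `protos/` and `generated/` paths and all function names are illustrative assumptions, not the project's actual code:

```python
# Hypothetical sketch only: the real tools/grpc_gen.py is not shown in this diff.
# Assumes .proto files live under protos/ and generated code goes to generated/.
import subprocess
import sys
from pathlib import Path


def protoc_args(proto: Path, proto_root: Path, out_dir: Path) -> list:
    """Build one `python -m grpc_tools.protoc` invocation for a .proto file."""
    return [
        sys.executable, '-m', 'grpc_tools.protoc',
        f'-I{proto_root}',                # import root for proto resolution
        f'--python_out={out_dir}',        # generated message classes
        f'--grpc_python_out={out_dir}',   # generated service stubs
        str(proto),
    ]


def generate(proto_root: Path = Path('protos'), out_dir: Path = Path('generated')) -> None:
    out_dir.mkdir(parents=True, exist_ok=True)
    # rglob walks any depth, matching the "proto files at any depth"
    # description in tools/README.md
    for proto in sorted(proto_root.rglob('*.proto')):
        subprocess.run(protoc_args(proto, proto_root, out_dir), check=True)
```

Splitting argument construction from the subprocess call keeps the invocation testable without `protoc` installed.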

Diff for: README.md (+55 -47)

@@ -1,80 +1,88 @@
 # SkyWalking AIOps Engine
-**An AIOps Engine for Observability.**

-A usable open-source AIOps framework for the domain of cloud computing observability.
+*A practical open-source AIOps engine for the
+era of cloud computing.*

-### Why this project matters?
-We could answer this from the following progressive questions:
-1. Are there existing algorithms for telemetry data?
+## Why do we build this project?
+
+**We strongly believe that this project will bring value
+to AIOps practitioners and researchers.**
+<details>
+<summary>Towards better Observability</summary>
+We could reason this from the following progressive questions:
+
+1. Are there existing algorithms for telemetry data?
    - **Abundant.**

-2. Are the existing algorithms empirically verified?
-
-   - **Most proposed algorithms are not empirically verified**

-3. Are there AIOps tools that embed machine learning algorithms?
+2. Are the existing algorithms empirically verified?
+
+   - **Most algorithms are not verified in production**
+
+
+3. Are there practical AIOps frameworks?
    - **Limited, often out of maintenance or commercialized.**
-
-4. Are there open-source AIOps solutions that integrates with popular backends?
+
+
+4. Are there open-source AIOps solutions that offers Out-of-Box integrations?
    - **Hardly any.**

+
 5. Why would I need that?
    1. For developers & organizations curious for AIOps:
-      - a. Just install and start using it, saves budget, saves head-scratching.
+      - a. Just install and start using it, saves budget, prevents head-scratching.
       - b. Treat this project as a good (or bad) reference for your own AIOps pipeline.
    2. For researchers in the AIOps domain:
       - a. For software engineering researchers - sample for AIOps evolution and empirical study.
       - b. For algorithm researchers - playground for new algorithms, solid case studies.
-

-The above is where we place the value of this project, though our current aim is to become the official AIOps engine
-of [Apache SkyWalking](https://github.com/apache/skywalking), each component could be easily swapped given its
-plugable design.
+</details>
+
+
+Click the above section to find out where we place the value of this project,
+though our current aim is to become the official AIOps engine
+of [Apache SkyWalking](https://github.com/apache/skywalking),
+each component could be easily swapped, extended and scaled to fit your own needs.

 ### Current Goal

-At the current stage, it serves as an **anomaly detection** engine, in the future, we will also explore root cause analysis and
-automatic problem recovery.
+At the current stage, it targets at Logs and Metrics analysis,
+in the future, we will also explore root cause analysis and
+automatic problem recovery based on Traces.

-This is also the tentative repository for OSPP 2022 and GSOC 2022 student project outcomes.
+This is also the repository for
+OSPP 2022 and GSOC 2022 student research outcomes.

-Project `Exploration of Advanced Metrics Anomaly Detection & Alerts with Machine Learning in Apache SkyWalking`
+1. `Exploration of Advanced Metrics Anomaly Detection & Alerts with Machine Learning in Apache SkyWalking`

-Project `Log Outlier Detection in Apache SkyWalking`
+2. `Log Outlier Detection in Apache SkyWalking`

 ### Architecture

-**TBA**
+**Log Clustering and Log Trend Analysis**

-**Data pulling:**
+![img.png](docs/static/log-clustering-arch.png)

-The current data pulling and retention rely on a common set of ingestion methods, with a
-first focus on SkyWalking OAP GraphQL and static file loader. We maintain a local storage for processed data.
+![img_1.png](docs/static/log-trend-analysis-arch.png)

-**Alert component:**
+**Metric Anomaly Detection and Visualizations**

-An anomaly does not directly trigger an alert, it
-goes through a tolerance mechanism.
+TBD - Soon to be added

 ### Roadmap

-Phase 0 (current)
-1. [ ] Implement essential development infrastructure.
-2. [ ] Implement naive algorithms as baseline & pipline POC (on existing datasets).
-3. [ ] Implement a SkyWalking `GraphQLDataLoaderProvider` to test data pulling.
-
-Phase 1 (summer -> fall 2022, OSPP & GSOC period)
-1. [ ] Implement the remaining core default providers.
-2. [ ] **Research and implement algorithms with OSPP & GSOC students.**
-3. [ ] Integrate with Apache Airflow for orchestration.
-5. [ ] Evaluation based on benchmark microservices systems (anomaly injection).
-6. [ ] MVP ready without UI-side changes.
-
-Phase 2 (fall -> end of 2022)
-1. [ ] Join as an Apache SkyWalking subproject.
-2. [ ] Integrate with SkyWalking Backend & rule-based alert module.
-3. [ ] Propose and request SkyWalking UI-side changes.
-4. [ ] First release for end-user testing.
-
-Phase Next
+For the details of our progress, please refer to our project dashboard
+[Here](https://github.com/SkyAPM/aiops-engine-for-skywalking/projects?query=is%3Aopen).
+
+Phase Current (fall -> end of 2022)
+
+0. [ ] Finish POC stage and start implementing dashboards for first stage users. (demo purposes)
+1. [ ] Real-world data testing and chaos engineering benchmark experiments.
+2. [ ] Join Apache Software Foundation as an Apache SkyWalking subproject.
+3. [ ] Integrate with SkyWalking Backend (Export analytics results to SkyWalking)
+4. [ ] Propose and request SkyWalking UI-side changes.
+5. [ ] First release for SkyWalking end-user testing.
+
+Phase Next
+
 1.[ ] Towards production-ready.

Diff for: sample_data_gaia/README.md renamed to assets/README.md (+5 -2)

@@ -1,12 +1,15 @@
 # GAIA Dataset
+
 **Sample dataset can be downloaded from**

 https://github.com/CloudWise-OpenSource/GAIA-DataSet

 **We don't host dataset in this repo because of its size and GPL2.0 license.**

-To evaluate the models, please download the dataset and unzip each subset of
-`Companion_Data` into this directory.
+**To get the data from source, simply run `make get-data` in the root directory.**
+
+To evaluate the models, the above command will download the dataset and populate each subset of
+`Companion_Data` into this directory.

 After that, this folder should be the following exact structure:

Diff for: docs/en/contribute/Datasets.md (new file, +33)

@@ -0,0 +1,33 @@
+# Get a range of common datasets for testing and development
+
+At the root of project, run `make get-dataset name=<name>` to get them,
+the datasets will be extracted to the `assets/datasets` folder.
+
+Use the following `names` to download the batch of the datasets you need:
+
+1. `gaia`: the [GAIA](https://github.com/CloudWise-OpenSource/GAIA-DataSet) dataset.
+   - 4+ GB with log, trace and metric data.
+2. `log_s`: small [LogHub](https://github.com/logpai/loghub) datasets.
+   1. SSH.tar.gz: (Server)
+   2. Hadoop.tar.gz: (Distributed system)
+   3. Apache.tar.gz: (Server)
+   4. HealthApp.tar.gz: (Mobile application)
+   5. Zookeeper.tar.gz: (Distributed system)
+   6. HPC.tar.gz: (Supercomputer)
+3. `log_m`: medium [LogHub](https://github.com/logpai/loghub) datasets.
+   1. Android.tar.gz = 1,555,005 logs (183MB Mobile system)
+   2. BGL.tar.gz = 4,747,963 logs (700MB Supercomputer)
+   3. Spark.tar.gz = 33,236,604 logs (2.7GB Distributed system)
+4. `log_l`: large [LogHub](https://github.com/logpai/loghub) datasets.
+   1. HDFS_2.tar.gz = 71,118,073 logs (16GB Distributed system)
+   2. Thunderbird.tar.gz = 211,212,192 logs (30GB Supercomputer)
+
+**Note large dataset require substantial disk space and memory to extract**
+
+## To remove the datasets/zip/tar files
+
+If you want to keep all zip/tar files after extracting, pass additional `save=True`
+to `make get-dataset name=log_m save=True` .
+
+If you want to remove all datasets, run `make prune-dataset`
+
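The downloader behind `make get-dataset` (`tools/get_data.py`) is not shown in this commit. The following is a hedged sketch of the two behaviors documented above: resolving a batch name to its archives, and the `save=True` flag that keeps archives after extraction. The archive lists come from the table above, but every function name and the tarfile-based layout are assumptions for illustration:

```python
# Hypothetical sketch; the real tools/get_data.py is not shown in this commit.
import tarfile
from pathlib import Path

# Batch names documented in docs/en/contribute/Datasets.md
DATASETS = {
    'gaia': ['Companion_Data'],  # illustrative; the real GAIA file list may differ
    'log_s': ['SSH.tar.gz', 'Hadoop.tar.gz', 'Apache.tar.gz',
              'HealthApp.tar.gz', 'Zookeeper.tar.gz', 'HPC.tar.gz'],
    'log_m': ['Android.tar.gz', 'BGL.tar.gz', 'Spark.tar.gz'],
    'log_l': ['HDFS_2.tar.gz', 'Thunderbird.tar.gz'],
}


def select_archives(name: str) -> list:
    """Resolve `make get-dataset name=<name>` to its archive batch, failing loudly on typos."""
    if name not in DATASETS:
        raise SystemExit(f'Unknown dataset {name!r}; choose from {sorted(DATASETS)}')
    return DATASETS[name]


def extract(archive: Path, dest: Path, save: bool = False) -> None:
    """Extract one tarball into dest; drop the archive afterwards unless save=True."""
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive) as tar:
        tar.extractall(dest)
    if not save:
        archive.unlink()
```

In this sketch, `save=True` simply skips the `unlink`, matching the documented make variable.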

Diff for: docs/static/log-clustering-arch.png (binary, 109 KB)

Diff for: docs/static/log-trend-analysis-arch.png (binary, 49.3 KB)

Diff for: poetry.lock (+61 -2, generated file not rendered)

Diff for: pyproject.toml (+1)

@@ -51,6 +51,7 @@ redis = { extras = ["hiredis"], version = "^4.3.4" }
 pyzstd = "^0.15.3"
 dynaconf = "^3.1.9"
 cachetools = "^5.2.0"
+grpcio-tools = ">=1.42.0"

 [tool.poetry.dev-dependencies]
 PySnooper = "^1.1.1"

Diff for: tools/README.md (new file, +11)

@@ -0,0 +1,11 @@
+# Convenient developer tools
+
+Tools in this folder should only be run via the `make <target>` command.
+
+1. grpc_gen.py => generate grpc code from proto files at any depth.
+2. get_data.py => download and extract some sample datasets from the web.
+3. cleaner.py => cleans up things like pycache and local installation manifests.
+
+## Future
+
+The above tools will be replaced with Poetry-based scripts in the future (a dev CLI).

Diff for: tools/__init__.py (new file, +13)

@@ -0,0 +1,13 @@
+# Copyright 2022 SkyAPM org
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.

Diff for: tools/cleaner.py (new file, +36)

@@ -0,0 +1,36 @@
+# Copyright 2022 SkyAPM org
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+import os
+import shutil
+
+
+def find_and_clean(folders_to_remove: list, root='.') -> None:
+    """
+    Find and clean all files in the given folder list
+    :param folders_to_remove: list of directories to remove
+    :param root: from which folder to start searching, default current
+    :return:
+    """
+    exclude: set = {'.venv'}
+    for path, dirs, _ in os.walk(root):
+        dirs[:] = [d for d in dirs if d not in exclude]
+        for folder in folders_to_remove:
+            if any(folder in d for d in dirs):
+                shutil.rmtree(removed := os.path.join(path, folder))
+                print(f'Removed {removed}')
+
+
+if __name__ == '__main__':
+    find_and_clean(folders_to_remove=['__pycache__', 'generated', 'build', 'dist', 'egg-info', 'pytest_cache', '.pyc'],
+                   root='.')
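The `dirs[:] = ...` slice assignment in `find_and_clean` is what keeps `os.walk` out of `.venv`: mutating the directory list in place changes which subdirectories the walk descends into next. A self-contained demonstration of that pruning idiom, simplified to a single target folder name (this is an illustration, not the project's actual cleaner):

```python
# Demonstrates the in-place pruning idiom used in tools/cleaner.py:
# assigning to dirs[:] inside os.walk stops descent into excluded folders,
# so a __pycache__ nested under .venv is deliberately left alone.
import os
import shutil


def clean_pycache(root: str, exclude=('.venv',)) -> list:
    """Remove every __pycache__ under root, skipping excluded trees entirely."""
    removed = []
    for path, dirs, _ in os.walk(root):
        dirs[:] = [d for d in dirs if d not in exclude]  # prune the walk in place
        if '__pycache__' in dirs:
            target = os.path.join(path, '__pycache__')
            shutil.rmtree(target)
            removed.append(target)
            dirs.remove('__pycache__')  # don't try to walk a deleted folder
    return removed
```

Note that a plain `dirs = [...]` rebinding would not prune anything; only mutating the list object that `os.walk` yielded affects the traversal.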

0 commit comments