Merge pull request #77 from pygod-team/dev
major refactor for 0.4.0
YingtongDou authored May 12, 2023
2 parents d72fec7 + 2334b7c commit 4d9b473
Showing 122 changed files with 7,349 additions and 9,182 deletions.
7 changes: 4 additions & 3 deletions .github/workflows/testing-cron.yml
@@ -16,7 +16,7 @@ jobs:
 fail-fast: false
 matrix:
 os: [ubuntu-latest, windows-latest, macos-latest]
-python-version: ["3.7", "3.8", "3.9", "3.10"]
+python-version: ["3.8", "3.9", "3.10"]

 steps:
 - uses: actions/checkout@v3
@@ -28,8 +28,9 @@ jobs:
 - name: Install dependencies
 run: |
 python -m pip install --upgrade pip
-pip install -r requirements_ci.txt
-pip install torch-scatter torch-sparse torch-cluster torch-spline-conv torch-geometric -f https://data.pyg.org/whl/torch-1.13.0+cpu.html
+pip install torch --index-url https://download.pytorch.org/whl/cpu
+pip install torch_geometric
+pip install torch_scatter torch_sparse torch_cluster torch_spline_conv -f https://data.pyg.org/whl/torch-2.0.0+cpu.html
 pip install pytest
 pip install coverage
 pip install coveralls
7 changes: 4 additions & 3 deletions .github/workflows/testing.yml
@@ -21,7 +21,7 @@ jobs:
 fail-fast: false
 matrix:
 os: [ubuntu-latest, windows-latest, macos-latest]
-python-version: ["3.7", "3.8", "3.9", "3.10"]
+python-version: ["3.8", "3.9", "3.10"]

 steps:
 - uses: actions/checkout@v3
@@ -33,8 +33,9 @@ jobs:
 - name: Install dependencies
 run: |
 python -m pip install --upgrade pip
-pip install -r requirements_ci.txt
-pip install torch-scatter torch-sparse torch-cluster torch-spline-conv torch-geometric -f https://data.pyg.org/whl/torch-1.13.0+cpu.html
+pip install torch --index-url https://download.pytorch.org/whl/cpu
+pip install torch_geometric
+pip install torch_scatter torch_sparse torch_cluster torch_spline_conv -f https://data.pyg.org/whl/torch-2.0.0+cpu.html
 pip install pytest
 pip install coverage
 pip install coveralls
2 changes: 2 additions & 0 deletions .gitignore
@@ -82,6 +82,8 @@ instance/
 # Sphinx documentation
 docs/_build/
 docs/tutorials/
+docs/html/
+generated/

 # PyBuilder
 .pybuilder/
6 changes: 5 additions & 1 deletion .readthedocs.yaml
@@ -3,7 +3,11 @@ version: 2
 sphinx:
 configuration: docs/conf.py

+build:
+os: ubuntu-22.04
+tools:
+python: "3.8"
+
 python:
-version: 3.8
 install:
 - requirements: docs/requirements.txt
26 changes: 4 additions & 22 deletions CONTRIBUTING.rst
@@ -1,7 +1,7 @@
 Contribute to PyGOD
 ===================

-This guide will tell how to contribute to pyGOD at the beginning stage.
+This guide will tell how to contribute to PyGOD at the beginning stage.
 This guide may change subject to the development process.


@@ -45,26 +45,8 @@ Development Environment

 To prevent the problems induced by inconsistent versions of dependencies, following requirements are suggested.

-- python>=3.6
-- torch>=1.10.1
-- torch_geometry>=2.0.3
+- python>=3.8
+- torch>=2.0.0
+- torch_geometry>=2.3.0

 Please follow the `installation guide <https://docs.pygod.org/en/latest/install.html>`_ for more details.
-
-
-Contributing New Models
------------------------
-
-To contribute a new model, simply
-
-1. Make a new file with the name of your model (say ``awesome-gnn.py``) within the directory ``pygod/models``.
-
-2. Populate it with your work, a minimal example file to demonstrate its effectiveness, such like `dominant example <https://docs.pygod.org/en/latest/tutorials/intro.html#sphx-glr-tutorials-intro-py>`_.
-
-3. Add a corresponding test file. See `test repo <https://github.com/pygod-team/pygod/tree/main/pygod/test>`_ for example.
-
-4. Run the entire test folder to make sure nothing is broken locally.
-
-5. Make a pull request once you are done to the **dev branch**. Brief explain your development.
-
-6. We will review your PR if the tests are successful :)
2 changes: 1 addition & 1 deletion LICENSE
@@ -1,6 +1,6 @@
 BSD 2-Clause License

-Copyright (c) 2021, pygod-team
+Copyright (c) 2023, pygod-team
 All rights reserved.

 Redistribution and use in source and binary forms, with or without
135 changes: 51 additions & 84 deletions README.rst

Large diffs are not rendered by default.

22 changes: 12 additions & 10 deletions benchmark/README.md
@@ -1,6 +1,6 @@
 # PyGOD Benchmark

-Official implementation of paper [BOND: Benchmarking Unsupervised Outlier Node Detection on Static Attributed Graphs](https://arxiv.org/abs/2206.10071). Our datasets are publicly available in the [data repository](https://github.com/pygod-team/data). **Please star, watch, and fork us for the active updates!**
+Official implementation of paper [BOND: Benchmarking Unsupervised Outlier Node Detection on Static Attributed Graphs](https://proceedings.neurips.cc/paper_files/paper/2022/hash/acc1ec4a9c780006c9aafd595104816b-Abstract-Datasets_and_Benchmarks.html). Our datasets are publicly available in the [data repository](https://github.com/pygod-team/data). **Please star, watch, and fork us for the active updates!**

 ## Usage

@@ -52,21 +52,23 @@ optional arguments:

 For DGraph, we are not able to load the dataset automatically, because of the authors' restrictions. To reproduce the results, the dataset is publicly available [here](https://dgraph.xinye.com/dataset), and we detect the outliers on the whole graph and evaluate only on the test set. As for the GPU memory consumption experiments, we use pytorch_memlab to measure the peak of the active bytes. See [pytorch_memlab](https://github.com/Stonesjtu/pytorch_memlab) for more details.

-## Citing us
+## Cite us

-Our [paper](https://arxiv.org/abs/2206.10071) is available on arxiv. If you use PyGOD in a scientific publication, we would appreciate citations to the following paper:
+Our [benchmark paper](https://proceedings.neurips.cc/paper_files/paper/2022/hash/acc1ec4a9c780006c9aafd595104816b-Abstract-Datasets_and_Benchmarks.html) is publicly available. If you use BOND in a scientific publication, we would appreciate citations to the following paper:

 ```
-@article{liu2022bond,
-author = {Liu, Kay and Dou, Yingtong and Zhao, Yue and Ding, Xueying and Hu, Xiyang and Zhang, Ruitong and Ding, Kaize and Chen, Canyu and Peng, Hao and Shu, Kai and Sun, Lichao and Li, Jundong and Chen, George H. and Jia, Zhihao and Yu, Philip S.},
-title = {BOND: Benchmarking Unsupervised Outlier Node Detection on Static Attributed Graphs},
-journal = {arXiv preprint arXiv:2206.10071},
-year = {2022},
-}
+@article{liu2022bond,
+title={Bond: Benchmarking unsupervised outlier node detection on static attributed graphs},
+author={Liu, Kay and Dou, Yingtong and Zhao, Yue and Ding, Xueying and Hu, Xiyang and Zhang, Ruitong and Ding, Kaize and Chen, Canyu and Peng, Hao and Shu, Kai and Sun, Lichao and Li, Jundong and Chen, George H. and Jia, Zhihao and Yu, Philip S.},
+journal={Advances in Neural Information Processing Systems},
+volume={35},
+pages={27021--27035},
+year={2022}
+}
 ```

 or:

 ```
-Liu, K., Dou, Y., Zhao, Y., Ding, X., Hu, X., Zhang, R., Ding, K., Chen, C., Peng, H., Shu, K., Sun, L., Li, J., Chen, G.H., Jia, Z., and Yu, P.S. 2022. BOND: Benchmarking Unsupervised Outlier Node Detection on Static Attributed Graphs. arXiv preprint arXiv:2206.10071.
+Liu, K., Dou, Y., Zhao, Y., Ding, X., Hu, X., Zhang, R., Ding, K., Chen, C., Peng, H., Shu, K. and Sun, L., Li, J., Chen, G.H., Jia, Z., and Yu, P.S. 2022. Bond: Benchmarking unsupervised outlier node detection on static attributed graphs. Advances in Neural Information Processing Systems, 35, pp.27021-27035.
 ```
18 changes: 11 additions & 7 deletions benchmark/main.py
@@ -2,7 +2,7 @@
 import torch
 import argparse
 import warnings
-from pygod.metrics import *
+from pygod.metric import *
 from pygod.utils.utility import load_data
 from utils import init_model

@@ -24,7 +24,7 @@ def main(args):
 y = data.y.bool()
 k = sum(y)

-if np.isnan(score).any():
+if torch.isnan(score).any():
 warnings.warn('contains NaN, skip one trial.')
 continue

@@ -35,11 +35,15 @@ def main(args):
 print(args.dataset + " " + model.__class__.__name__ + " " +
 "AUC: {:.4f}±{:.4f} ({:.4f})\t"
 "AP: {:.4f}±{:.4f} ({:.4f})\t"
-"Recall: {:.4f}±{:.4f} ({:.4f})".format(np.mean(auc), np.std(auc),
-np.max(auc), np.mean(ap),
-np.std(ap), np.max(ap),
-np.mean(rec), np.std(rec),
-np.max(rec)))
+"Recall: {:.4f}±{:.4f} ({:.4f})".format(torch.mean(auc),
+torch.std(auc),
+torch.max(auc),
+torch.mean(ap),
+torch.std(ap),
+torch.max(ap),
+torch.mean(rec),
+torch.std(rec),
+torch.max(rec)))


 if __name__ == '__main__':
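
A note on the `np.*` to `torch.*` swap in the hunk above: `np.std` defaults to the population standard deviation (`ddof=0`), whereas `torch.std` applies Bessel's correction by default, so the reported ± values may shift slightly when the number of trials is small. The sketch below illustrates the difference with the standard library only; the AUC values are made up for illustration and are not benchmark results.

```python
import statistics

# Hypothetical AUC values from five trials (illustrative only).
auc = [0.60, 0.62, 0.58, 0.61, 0.59]

mean = statistics.fmean(auc)
pop_std = statistics.pstdev(auc)    # divides by n, like np.std's default (ddof=0)
sample_std = statistics.stdev(auc)  # divides by n - 1, like torch.std's default

# Same "mean±std (max)" layout as the benchmark's print statement.
print("AUC: {:.4f}±{:.4f} ({:.4f})  population std".format(mean, pop_std, max(auc)))
print("AUC: {:.4f}±{:.4f} ({:.4f})  sample std".format(mean, sample_std, max(auc)))
```

If the earlier numbers need to be reproduced exactly, passing `unbiased=False` (or `correction=0` in newer PyTorch releases) to `torch.std` should recover the population form.
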
5 changes: 2 additions & 3 deletions benchmark/time.py
@@ -4,9 +4,8 @@
 import shutil
 import argparse
 import warnings
-import numpy as np
 from utils import init_model
-from pygod.metrics import eval_roc_auc
+from pygod.metric import eval_roc_auc
 from pygod.utils import load_data
 from torch_geometric.utils import remove_isolated_nodes

@@ -37,7 +36,7 @@ def main(args):
 y = data.y.bool()[mask]
 auc = eval_roc_auc(y, score)

-if np.isnan(score).any():
+if torch.isnan(score).any():
 warnings.warn('contains NaN, skip one trial.')
 continue

16 changes: 8 additions & 8 deletions benchmark/type.py
@@ -2,7 +2,7 @@
 import torch
 import argparse
 import warnings
-from pygod.metrics import *
+from pygod.metric import *
 from pygod.utils import load_data
 from utils import init_model

@@ -27,7 +27,7 @@ def main(args):
 ys = data.y >> 1 & 1
 kc, ks = sum(yc), sum(ys)

-if np.isnan(score).any():
+if torch.isnan(score).any():
 warnings.warn('contains NaN, skip one trial.')
 continue

@@ -44,12 +44,12 @@ def main(args):
 "AP: {:.4f}±{:.4f} ({:.4f})\tRecall: {:.4f}±{:.4f} ({:.4f})\n"
 "Structural: AUC: {:.4f}±{:.4f} ({:.4f})\t"
 "AP: {:.4f}±{:.4f} ({:.4f})\tRecall: {:.4f}±{:.4f} ({:.4f})"
-.format(np.mean(aucc), np.std(aucc), np.max(aucc),
-np.mean(apc), np.std(apc), np.max(apc),
-np.mean(recc), np.std(recc), np.max(recc),
-np.mean(aucs), np.std(aucs), np.max(aucs),
-np.mean(aps), np.std(aps), np.max(aps),
-np.mean(recs), np.std(recs), np.max(recs)))
+.format(torch.mean(aucc), torch.std(aucc), torch.max(aucc),
+torch.mean(apc), torch.std(apc), torch.max(apc),
+torch.mean(recc), torch.std(recc), torch.max(recc),
+torch.mean(aucs), torch.std(aucs), torch.max(aucs),
+torch.mean(aps), torch.std(aps), torch.max(aps),
+torch.mean(recs), torch.std(recs), torch.max(recs)))


 if __name__ == '__main__':
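
The context line `ys = data.y >> 1 & 1` in the second hunk above relies on Python's operator precedence: `>>` binds tighter than `&`, so the expression reads as `(data.y >> 1) & 1` and extracts bit 1 of each label. The pure-Python sketch below shows the idea on a plain list. It assumes, without confirmation from this diff, that bit 0 flags contextual outliers and bit 1 flags structural outliers; the label values are hypothetical.

```python
# Hypothetical per-node labels encoded as small bit fields (illustrative only):
# 0 = inlier, 1 = contextual, 2 = structural, 3 = both.
labels = [0, 1, 2, 3, 0, 2]

yc = [y & 1 for y in labels]       # bit 0: assumed contextual flag
ys = [y >> 1 & 1 for y in labels]  # bit 1: parsed as (y >> 1) & 1

kc, ks = sum(yc), sum(ys)          # number of outliers of each type
print(yc, ys, kc, ks)
```

The counts `kc` and `ks` mirror the names in the diff, where they presumably serve as the `k` for per-type recall.
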
34 changes: 18 additions & 16 deletions benchmark/utils.py
@@ -1,6 +1,7 @@
 from random import choice
-from pygod.models import *
+from pygod.detector import *
 from pyod.models.lof import LOF
+from torch_geometric.nn import MLP
 from sklearn.ensemble import IsolationForest


@@ -103,14 +104,14 @@ def init_model(args):
 batch_size=batch_size,
 num_neigh=num_neigh)
 elif model_name == 'gcnae':
-return GCNAE(hid_dim=choice(hid_dim),
-weight_decay=weight_decay,
-dropout=choice(dropout),
-lr=choice(lr),
-epoch=epoch,
-gpu=gpu,
-batch_size=batch_size,
-num_neigh=num_neigh)
+return GAE(hid_dim=choice(hid_dim),
+weight_decay=weight_decay,
+dropout=choice(dropout),
+lr=choice(lr),
+epoch=epoch,
+gpu=gpu,
+batch_size=batch_size,
+num_neigh=num_neigh)
 elif model_name == 'guide':
 return GUIDE(a_hid=choice(hid_dim),
 s_hid=choice([4, 5, 6]),
@@ -124,13 +125,14 @@ def init_model(args):
 num_neigh=num_neigh,
 cache_dir='./tmp')
 elif model_name == "mlpae":
-return MLPAE(hid_dim=choice(hid_dim),
-weight_decay=weight_decay,
-dropout=choice(dropout),
-lr=choice(lr),
-epoch=epoch,
-gpu=gpu,
-batch_size=batch_size)
+return GAE(hid_dim=choice(hid_dim),
+weight_decay=weight_decay,
+dropout=choice(dropout),
+lr=choice(lr),
+epoch=epoch,
+gpu=gpu,
+batch_size=batch_size,
+backbone=MLP)
 elif model_name == 'lof':
 return LOF()
 elif model_name == 'if':
15 changes: 15 additions & 0 deletions docs/_templates/detector.rst
@@ -0,0 +1,15 @@
+.. role:: hidden
+:class: hidden-section
+.. currentmodule:: {{ module }}
+
+{{ name | underline}}
+
+{% if objname == "ANOMALOUS" or objname == "ONE" or objname == "Radar" or objname == "SCAN"%}
+.. autoclass:: {{ name }}
+:show-inheritance:
+:members: fit, predict
+{% else %}
+.. autoclass:: {{ name }}
+:show-inheritance:
+:members: fit, predict, emb
+{% endif %}
9 changes: 9 additions & 0 deletions docs/_templates/nn.rst
@@ -0,0 +1,9 @@
+.. role:: hidden
+:class: hidden-section
+.. currentmodule:: {{ module }}
+
+{{ name | underline}}
+
+.. autoclass:: {{ name }}
+:show-inheritance:
+:members: forward, loss_func, process_graph
39 changes: 21 additions & 18 deletions docs/api_cc.rst
@@ -1,37 +1,40 @@
 API CheatSheet
 ==============

-The following APIs are applicable for all detector models for easy use.
+The following APIs are applicable for all detectors for easy use.

-* :func:`pygod.models.base.BaseDetector.fit`: Fit detector. y is ignored in unsupervised methods.
-* :func:`pygod.models.base.BaseDetector.decision_function`: Predict raw anomaly scores of PyG Graph G using the fitted detector
+* :func:`pygod.detector.Detector.fit`: Fit detector.
+* :func:`pygod.detector.Detector.decision_function`: Predict raw anomaly scores of PyG data using the fitted detector

-Key Attributes of a fitted model:
+Key Attributes of a fitted detector:

-* :attr:`pygod.models.base.BaseDetector.decision_scores_`: The outlier scores of the training data. The higher, the more abnormal.
-Outliers tend to have higher scores.
-* :attr:`pygod.models.base.BaseDetector.labels_`: The binary labels of the training data. 0 stands for inliers and 1 for outliers/anomalies.
+* :attr:`pygod.detector.Detector.decision_score_`: The outlier scores of the input data. Outliers tend to have higher scores.
+* :attr:`pygod.detector.Detector.label_`: The binary labels of the input data. 0 stands for inliers and 1 for outliers.

 For the inductive setting:

-* :func:`pygod.models.base.BaseDetector.predict`: Predict if a particular sample is an outlier or not using the fitted detector.
-* :func:`pygod.models.base.BaseDetector.predict_proba`: Predict the probability of a sample being outlier using the fitted detector.
-* :func:`pygod.models.base.BaseDetector.predict_confidence`: Predict the model's sample-wise confidence (available in predict and predict_proba).
-
+* :func:`pygod.detector.Detector.predict`: Predict if a particular sample is an outlier or not using the fitted detector.

 **Input of PyGOD**: Please pass in a `PyTorch Geometric (PyG) <https://www.pyg.org/>`_ data object.
 See `PyG data processing examples <https://pytorch-geometric.readthedocs.io/en/latest/notes/introduction.html#data-handling-of-graphs>`_.

-* :func:`pygod.models.base.BaseDetector.process_graph` (you do not need to call this explicitly): Process the raw PyG data object into a tuple of sub data objects needed for the underlying model.
-

-See base class definition below:
+Base Detector
+-------------

-pygod.models.base module
-------------------------
+``Detector`` is the abstract class for all detectors:

-.. automodule:: pygod.models.base
+.. autoclass:: pygod.detector.Detector
 :members:
 :undoc-members:
 :show-inheritance:
-:inherited-members:
+:inherited-members:
+
+Deep Detector
+-------------
+
+By inherit ``Detector`` class, we also provide base deep detector class for deep learning based detectors to ease the implementation.
+
+.. autoclass:: pygod.detector.DeepDetector
+:members: emb, init_model, forward_model, process_graph
+:undoc-members: fit, decision_function, predict
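
To make the cheat sheet in this diff concrete, here is a toy, dependency-free sketch of the `fit` / `decision_function` / `decision_score_` / `label_` / `predict` contract it describes. `ToyDetector`, its `contamination` parameter, the `threshold_` attribute, and the distance-from-mean score are hypothetical stand-ins chosen for illustration. This is not PyGOD's implementation, and real detectors take a PyG data object rather than a list of floats.

```python
class ToyDetector:
    """Hypothetical stand-in that mimics the cheat-sheet interface."""

    def __init__(self, contamination=0.2):
        self.contamination = contamination  # assumed fraction of outliers

    def fit(self, data):
        self.mean_ = sum(data) / len(data)
        self.decision_score_ = self.decision_function(data)
        # Flag roughly `contamination` of the inputs: the threshold is the
        # smallest score among the top-scoring n_out items.
        n_out = max(1, round(len(data) * self.contamination))
        self.threshold_ = sorted(self.decision_score_)[-n_out]
        self.label_ = [int(s >= self.threshold_) for s in self.decision_score_]
        return self

    def decision_function(self, data):
        # Raw outlier score: higher means more abnormal.
        return [abs(x - self.mean_) for x in data]

    def predict(self, data):
        # 0 = inlier, 1 = outlier, using the threshold learned in fit().
        return [int(s >= self.threshold_) for s in self.decision_function(data)]


detector = ToyDetector(contamination=0.2).fit([1.0, 1.1, 0.9, 1.0, 5.0])
print(detector.decision_score_)       # per-item scores on the input data
print(detector.label_)                # binary labels on the input data
print(detector.predict([1.05, 9.0]))  # inductive prediction on new data
```

The point of the sketch is the split between fitted attributes (trailing underscore, computed on the input data) and `predict`, which reuses the fitted threshold on unseen data.
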