data
1-partition.ipynb
2-metapaths.ipynb
3-extract.ipynb
4-matrixfy.ipynb
5-primary-aucs.ipynb
5.5-transplit-DWPCs.ipynb
5.6-model.ipynb
6-pyvisualize.ipynb
6-rvisualize.ipynb
7-transform.ipynb
8-model-performances.ipynb
99-permuted-DWPC-distributions.ipynb
README.md
pipeline.sh
servers.json

README.md

Stage 1: all features on a subset of observations

Here's a description of this stage, quoted from the project report:

The all-features stage assesses feature performance and does not require computing features for all negatives. Here we selected a random subset of 3,020 (4 × 755) negatives. Little error was introduced by this optimization, since the predominant limitation to performance assessment was the small number of positives (755) rather than negatives.
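The subsampling described in the quote can be sketched as follows. This is only an illustration, not the code used in the notebooks: the function name `sample_negatives`, the fixed seed, and the integer stand-in IDs are all assumptions. It draws 4 negatives per positive, without replacement.

```python
import random

def sample_negatives(negatives, n_positives, ratio=4, seed=0):
    """Return a random subset of `ratio * n_positives` negatives, without replacement."""
    n_sample = ratio * n_positives
    if n_sample > len(negatives):
        raise ValueError("not enough negatives to sample from")
    rng = random.Random(seed)  # seeded so the subset is reproducible
    return rng.sample(negatives, n_sample)

# Hypothetical stand-in data: integer IDs instead of real observations.
negatives = list(range(10_000))
subset = sample_negatives(negatives, n_positives=755)
print(len(subset))  # 3020
```

With 755 positives and a ratio of 4, the subset contains 3,020 negatives, matching the figure in the quote.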

Datasets

Here are some of the notable datasets:

data/metapaths.json contains information on and queries for each metapath used to generate features.
data/dwpc.tsv.bz2 is a tidy (long) TSV of the output from each DWPC query performed, including path count (PC), degree-weighted path count (DWPC), and query runtime.
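A bzip2-compressed TSV such as data/dwpc.tsv.bz2 can be read with the Python standard library alone. The sketch below is an assumption-laden example: the helper name `read_tidy_tsv` is made up, and because the real column names are not documented here, it runs against a tiny stand-in file with invented columns.

```python
import bz2
import csv
import os
import tempfile

def read_tidy_tsv(path):
    """Read a bzip2-compressed TSV into a list of dicts, one per row."""
    with bz2.open(path, mode="rt", newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))

# Tiny stand-in file; the column names and values are made up for illustration
# and are not taken from the real data/dwpc.tsv.bz2.
sample = "metapath\tPC\tDWPC\tseconds\nABC\t2\t0.5\t0.01\n"
tmp_path = os.path.join(tempfile.mkdtemp(), "example.tsv.bz2")
with bz2.open(tmp_path, mode="wt", newline="") as handle:
    handle.write(sample)

rows = read_tidy_tsv(tmp_path)
print(rows[0]["DWPC"])  # values are read as strings
```

To inspect the real file, pass `"data/dwpc.tsv.bz2"` to `read_tidy_tsv` and check `rows[0].keys()` for the actual column names.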

For documentation requests, open a GitHub Issue. Documentation pull requests are also welcome.
