Here's a description of this stage, quoted from the project report:
The all-features stage assesses feature performance and does not require computing features for all negatives. Here we selected a random subset of 3,020 (4 × 755) negatives. Little error was introduced by this optimization, since the predominant limitation to performance assessment was the small number of positives (755) rather than negatives.
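The subsampling described in the quote can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function name `subsample_negatives`, the `ratio` and `seed` parameters, and the integer stand-in IDs are all assumptions made for the example.

```python
import random


def subsample_negatives(negatives, n_positives, ratio=4, seed=0):
    """Return a random subset of ratio * n_positives negatives (no replacement).

    Hypothetical helper: the name, parameters, and seed are illustrative only.
    """
    # Never request more items than are available.
    n_keep = min(len(negatives), ratio * n_positives)
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    return rng.sample(negatives, n_keep)


# Stand-in negative IDs; the real negatives are not integers from a range.
negatives = list(range(10_000))
subset = subsample_negatives(negatives, n_positives=755)
print(len(subset))  # 4 * 755 = 3020
```

Sampling without replacement keeps every selected negative distinct, so the subset size is exactly four times the number of positives whenever enough negatives exist.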
Here are some of the notable datasets:
data/metapaths.json
contains information on, and queries for, each metapath used to generate features.
data/dwpc.tsv.bz2
is a tidy (long) TSV of the output from each DWPC query performed, including path count (PC), degree-weighted path count (DWPC), and query runtime.
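As one possible way to inspect the compressed TSV without decompressing it first, the sketch below streams rows using only the Python standard library. The helper name `read_dwpc_rows` is hypothetical, and no column names are assumed: the keys of each row come from whatever header the file contains.

```python
import bz2
import csv


def read_dwpc_rows(path):
    """Yield each row of a bz2-compressed TSV as a dict keyed by the header row.

    Hypothetical helper for illustration; it makes no assumption about
    which columns the file contains.
    """
    # Text mode ("rt") decompresses on the fly; newline="" lets csv handle line endings.
    with bz2.open(path, mode="rt", newline="") as handle:
        yield from csv.DictReader(handle, delimiter="\t")
```

For example, `next(read_dwpc_rows("data/dwpc.tsv.bz2"))` would return the first data row as a dict. Because the function is a generator, the file is read lazily and never has to fit in memory at once.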
For documentation requests, open a GitHub Issue. Documentation pull requests are also welcome.