TensorFlow-based implementation of Ridle (Relation - Instance Distribution Learning) for supervised node classification on (directed) relational graphs, as published in CIKM 2021.
PCA projections for the learned entity representations. Popular classes from cross-domain KGs were selected for visualization. Ridle (top) allows for a better separation of the instances into their respective classes in comparison to the state-of-the-art RDF2Vec approach (bottom).
To use this package, you must install the following dependencies first:
- Python (>=3.7)
- TensorFlow (>=2.1)
- numpy (>=1.18)
- pandas (>=1.2.3)
To use Ridle, provide your graph data as a pkl file of triples in S-P-O (subject-predicate-object) format in the folder dataset. Examples are given in that folder. To learn representations on umls with Ridle, run the following command, specifying the dataset with the argument --dataset. The script loads the umls knowledge graph and learns a representation using Ridle, exploiting a target distribution over the usage of relations. The representations are saved as a csv in the same folder as the dataset.
python learn_representation.py --dataset umls
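As an illustration of the input format and of what a distribution over relation usage can look like, here is a minimal sketch using pandas. The column names S, P and O, the toy triples, and the file name are assumptions made for this example; check the files in the folder dataset for the exact layout the scripts expect. The sketch only counts relations in which an entity appears as subject, which may differ from what learn_representation.py does internally.

```python
import os
import tempfile

import pandas as pd

# Toy triples. The column names "S", "P", "O" are an assumption for illustration.
triples = pd.DataFrame(
    [
        ("aspirin", "treats", "headache"),
        ("aspirin", "interacts_with", "warfarin"),
        ("headache", "symptom_of", "migraine"),
        ("warfarin", "treats", "thrombosis"),
    ],
    columns=["S", "P", "O"],
)

# Round-trip through a pickle file, the storage format named above.
# A temporary directory is used so the example leaves no files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "toy_graph.pkl")  # hypothetical file name
    triples.to_pickle(path)
    loaded = pd.read_pickle(path)

# Count how often each subject uses each relation ...
counts = pd.crosstab(loaded["S"], loaded["P"])

# ... and normalise every row so it sums to 1, giving one
# relation-usage distribution per entity.
relation_usage = counts.div(counts.sum(axis=1), axis=0)

print(relation_usage)
```

Each row of relation_usage describes one entity by the relations it uses, which is the kind of signal a relation-based representation can build on.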
Afterwards you can evaluate the representations learned by Ridle to predict instance types. The evaluation is based on 10-fold cross-validation. The results are saved in a csv.
python evaluate_instance_type.py --dataset umls
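The evaluation protocol described above (10-fold cross-validation, scored with F1-macro) can be sketched with numpy alone. This is not the code in evaluate_instance_type.py: the nearest-centroid classifier, the synthetic embeddings, and all function names below are assumptions chosen to keep the example self-contained.

```python
import numpy as np


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and split them into k roughly equal folds."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_samples), k)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores over classes in y_true."""
    scores = []
    for label in np.unique(y_true):
        tp = np.sum((y_pred == label) & (y_true == label))
        fp = np.sum((y_pred == label) & (y_true != label))
        fn = np.sum((y_pred != label) & (y_true == label))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return float(np.mean(scores))


def nearest_centroid_predict(X_train, y_train, X_test):
    """Stand-in classifier: assign each test point to the closest class mean."""
    labels = np.unique(y_train)
    centroids = np.stack([X_train[y_train == l].mean(axis=0) for l in labels])
    dists = np.linalg.norm(X_test[:, None, :] - centroids[None, :, :], axis=2)
    return labels[np.argmin(dists, axis=1)]


def cross_validate(X, y, k=10, seed=0):
    """Return the mean macro-F1 over k train/test splits."""
    folds = k_fold_indices(len(X), k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        y_pred = nearest_centroid_predict(X[train_idx], y[train_idx], X[test_idx])
        scores.append(macro_f1(y[test_idx], y_pred))
    return float(np.mean(scores))


# Synthetic "embeddings": two well-separated classes of 50 points each.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0, 0.1, (50, 4)), rng.normal(5, 0.1, (50, 4))])
y = np.array([0] * 50 + [1] * 50)

mean_f1 = cross_validate(X, y)
print(f"mean macro-F1 over 10 folds: {mean_f1:.3f}")
```

F1-macro averages the per-class scores without weighting by class size, so rare instance types count as much as frequent ones.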
The file run.sh combines both commands so that the representations are evaluated immediately after they are learned. It applies the method to the knowledge graph umls and stores the embeddings in a csv. Afterwards, it uses the learned representations for instance type prediction; the results are saved in a csv. Due to the size of DBpedia and Wikidata and the limited space for uploading datasets, we uploaded only subsets, including Books_Wikidata, umls and Songs_DBpedia, for which experiments were conducted.
sh run.sh
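If a shell is not available, the same two steps can be chained from Python. The sketch below is not the content of run.sh; it simply issues the two commands documented above in order, and the helper names pipeline_commands and run_pipeline are hypothetical.

```python
import subprocess
import sys


def pipeline_commands(dataset):
    """Build the two documented commands for the given dataset name."""
    return [
        [sys.executable, "learn_representation.py", "--dataset", dataset],
        [sys.executable, "evaluate_instance_type.py", "--dataset", dataset],
    ]


def run_pipeline(dataset):
    """Run both steps in order; stop with an error if either step fails."""
    for command in pipeline_commands(dataset):
        subprocess.run(command, check=True)


# Usage, from the repository root:
#   run_pipeline("umls")
```

Passing check=True makes the evaluation step run only if learning the representations succeeded.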
The following image shows the results reported in the paper. Considering the cross-domain KGs (cf. Table 2a), Ridle significantly outperforms the state-of-the-art methods with respect to the metric F1-macro. For the category-specific KGs (cf. Table 2b), Ridle achieves competitive performance in comparison to the best approaches. The experimental results show that, on average, Ridle outperforms current state-of-the-art models in several KGs, which sets a new baseline for the task of predicting instance type assertions.
If you use this code for predicting instance type assertions in Knowledge Graphs as part of your project or paper, please cite the following work:
@inproceedings{weller2021ridle,
author = {Weller, Tobias and Acosta, Maribel},
title = {Predicting Instance Type Assertions in Knowledge Graphs Using Stochastic Neural Networks},
year = {2021},
isbn = {9781450384469},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3459637.3482377},
doi = {10.1145/3459637.3482377},
booktitle = {Proceedings of the 30th ACM International Conference on Information & Knowledge Management},
pages = {2111–2118},
numpages = {8},
keywords = {knowledge graphs, stochastic networks, entity classification, entity type prediction},
location = {Virtual Event, Queensland, Australia},
series = {CIKM '21}
}
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.