PhrasIS-baselines

Repository for the paper "PhrasIS: phrase inference and similarity benchmark".

PhrasIS is a dataset of Phrase pairs with Inference and Similarity annotations for the evaluation of semantic representations. The paper analyzes the dataset, showing the relation between inference labels and similarity scores, and evaluates it with several well-known techniques, which obtain satisfactory performance.

Requirements

Python 3
NumPy
SciPy
Pandas
NLTK
Seaborn
heatmapz

All dependencies except heatmapz are installed when setting up the conda environment.

To install heatmapz, run pip3 install heatmapz

Usage

Experiments can be run either in Python 3 using the provided conda environment or by launching the 00.launchColab IPython notebook.

Running the following command will create an environment called phrasis with all required dependencies:

conda env create -f environment.yml

To activate the environment run the following command:

conda activate phrasis

To deactivate the environment run the following command:

conda deactivate

To run the IPython notebook, clone or import the GitHub repository into Google Colab or a local Jupyter Notebook installation and run 00.launchColab.ipynb.

Dataset

The PhrasIS dataset can be found in ./dataset

Features

We compute a set of lexical and ontology-based features, including:

Jaccard overlap
length differences
WordNet similarity features: lch, jcn, wup, ...
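
As an illustration of the two lexical features listed above, here is a minimal sketch. It assumes whitespace tokenization and lowercasing, which may differ from the preprocessing in this repository. The function names jaccard_overlap and length_difference are hypothetical and are not taken from src.

```python
def jaccard_overlap(phrase_a: str, phrase_b: str) -> float:
    """Jaccard overlap between the token sets of two phrases."""
    # Assumption: lowercase + whitespace split; the repository may tokenize differently.
    tokens_a = set(phrase_a.lower().split())
    tokens_b = set(phrase_b.lower().split())
    union = tokens_a | tokens_b
    if not union:
        return 0.0  # convention chosen here for two empty phrases
    return len(tokens_a & tokens_b) / len(union)


def length_difference(phrase_a: str, phrase_b: str) -> int:
    """Absolute difference in token count between two phrases."""
    return abs(len(phrase_a.split()) - len(phrase_b.split()))


# Made-up example pair, not taken from the dataset.
pair = ("a man is playing a guitar", "a man plays the guitar")
print(jaccard_overlap(*pair))    # 3 shared tokens out of 7 distinct ones
print(length_difference(*pair))  # 6 tokens vs. 5 tokens
```

Both values can then be used as columns of a feature matrix alongside the WordNet-based similarities.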

Models

We train and evaluate several models, including:

Machine Learning models: DecisionTree, LogisticRegression, SVM, ...
Ensemble methods: Random Forest, ExtraTree, Bagging, ...
Kernel Ridge
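
To make the last entry concrete, the following is a minimal NumPy-only sketch of kernel ridge regression with an RBF kernel. It is not the implementation used by the launch scripts, which may rely on a library version. The function names, the hyperparameters alpha and gamma, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def rbf_kernel(X, Y, gamma=1.0):
    """RBF kernel matrix: exp(-gamma * ||x - y||^2) for every row pair."""
    sq_dists = (
        np.sum(X ** 2, axis=1)[:, None]
        + np.sum(Y ** 2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def fit_kernel_ridge(X, y, alpha=1.0, gamma=1.0):
    """Solve (K + alpha * I) c = y for the dual coefficients c."""
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + alpha * np.eye(len(X)), y)


def predict_kernel_ridge(X_train, dual_coef, X_new, gamma=1.0):
    """Predict as a kernel-weighted sum over the training points."""
    return rbf_kernel(X_new, X_train, gamma) @ dual_coef


# Synthetic stand-ins: two feature columns and a similarity-like target.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(50, 2))
y = 5.0 * X[:, 0]

coef = fit_kernel_ridge(X, y, alpha=1e-3, gamma=1.0)
pred = predict_kernel_ridge(X, coef, X, gamma=1.0)
print("training MSE:", np.mean((pred - y) ** 2))
```

In practice the rows of X would be the features described above and y the annotated similarity scores, with alpha and gamma chosen by cross-validation.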
