Word-Vectors

Overview

This repository implements several architectures for training word embeddings: Continuous Bag-of-Words (CBOW), skip-gram, and Global Vectors for Word Representation (GloVe). Wikipedia articles are used as training data, while the Google Analogy and WordSim353 datasets are used to validate the word embeddings.
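
The actual training code lives under `source/` and its details may differ; as a rough illustration of what skip-gram training with negative sampling involves, here is a minimal, self-contained NumPy sketch. The function names, hyperparameters, and toy corpus below are illustrative only and are not taken from this repository.

```python
# Minimal skip-gram sketch with negative sampling (illustrative only;
# the repository's real training code is in source/ and may differ).
import numpy as np

def build_vocab(tokens):
    """Map each unique token to an integer id."""
    return {tok: i for i, tok in enumerate(sorted(set(tokens)))}

def skipgram_pairs(token_ids, window=2):
    """Yield (center, context) id pairs within the given window."""
    for i, center in enumerate(token_ids):
        lo, hi = max(0, i - window), min(len(token_ids), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                yield center, token_ids[j]

def train_skipgram(token_ids, vocab_size, dim=50, lr=0.025,
                   negatives=5, epochs=5, seed=0):
    """Train center/context embeddings with SGD and negative sampling."""
    rng = np.random.default_rng(seed)
    W_in = rng.normal(scale=0.1, size=(vocab_size, dim))   # center-word vectors
    W_out = rng.normal(scale=0.1, size=(vocab_size, dim))  # context-word vectors
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
    for _ in range(epochs):
        for center, context in skipgram_pairs(token_ids):
            # One positive context word plus `negatives` random negatives
            # (a real implementation would skip negatives equal to the context).
            samples = [(context, 1.0)] + [(int(rng.integers(vocab_size)), 0.0)
                                          for _ in range(negatives)]
            v = W_in[center]
            grad_v = np.zeros_like(v)
            for idx, label in samples:
                u = W_out[idx]
                g = sigmoid(v @ u) - label     # gradient of the log-loss w.r.t. the score
                grad_v += g * u
                W_out[idx] -= lr * g * v
            W_in[center] -= lr * grad_v
    return W_in  # the center-word vectors are usually kept as the embeddings

# Toy usage: train on a tiny corpus and inspect one vector.
tokens = "the king rules the kingdom and the queen rules the realm".split()
vocab = build_vocab(tokens)
ids = [vocab[t] for t in tokens]
embeddings = train_skipgram(ids, vocab_size=len(vocab), dim=16, epochs=50)
print(embeddings[vocab["king"]][:5])
```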

Setup

  1. Install Python 3.11.
  2. Install the required packages: `pip install -r source/requirements.txt` (we recommend using a virtual environment instead; follow the guide under Virtual Environment Setup below and skip this step).
  3. Run the program: `python source/main.py`

Virtual Environment Setup

Windows

  1. Install the virtualenv package: `pip install virtualenv`
  2. Create a new, empty Python environment: `py -3.11 -m venv ./.venv`
  3. Activate the environment: `source .venv/Scripts/activate` (from Git Bash; in PowerShell, run `.venv\Scripts\Activate.ps1` instead)
  4. Install the packages required by this project: `pip install -r source/requirements.txt`

Linux

  1. Install the virtualenv package: `pip install virtualenv`
  2. Create a new, empty Python environment: `python -m venv ./.venv`
  3. Activate the environment: `source .venv/bin/activate`
  4. Install the packages required by this project: `pip install -r source/requirements.txt`