POS-Tagging

Introduction

Part-of-speech (POS) tagging is the process of assigning a part-of-speech tag (noun, verb, adjective, etc.) to each word in an input text. In other words, the objective is to identify which grammatical category each word in the given text belongs to. POS tagging is difficult because some words can represent more than one part of speech depending on context, i.e. they are ambiguous. Consider the following examples:

The whole team played well. (adverb)

You are doing well for yourself. (adjective)

Well, this is a lot of work. (interjection)

The well is dry. (noun)

Tears were beginning to well in her eyes. (verb)

In all of these sentences, the same word, well, takes a different part of speech. To resolve this ambiguity, we use a Hidden Markov Model (HMM), which is a probabilistic model, together with the Viterbi algorithm to assign part-of-speech tags.
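To make the idea concrete, here is a minimal sketch of Viterbi decoding over a tiny HMM. It is not the code in components/viterbi.cpp: the `ToyHmm` struct, the helper names, the three-tag set and every probability below are hypothetical values chosen only for illustration. The sketch works in log space and floors zero probabilities at a small constant, which is one common way to handle unseen words and is an assumption here, not something taken from this project.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical toy HMM; names and numbers are illustrative assumptions.
struct ToyHmm {
    std::vector<std::string> tags;                // tag set
    std::vector<double> initial;                  // initial[j] = P(first tag is j)
    std::vector<std::vector<double>> transition;  // transition[i][j] = P(tag j | tag i)
    std::vector<std::unordered_map<std::string, double>> emission;  // emission[j][w] = P(w | tag j)
};

// Log of a probability, floored so that unseen events do not give log(0).
double safeLog(double p) { return std::log(p > 0.0 ? p : 1e-12); }

// Emission probability of `word` under `tag`, or 0 if the pair was never seen.
double emissionProb(const ToyHmm& hmm, std::size_t tag, const std::string& word) {
    const auto it = hmm.emission[tag].find(word);
    return it == hmm.emission[tag].end() ? 0.0 : it->second;
}

// Returns the most probable tag sequence for `words` under `hmm`.
std::vector<std::string> viterbi(const ToyHmm& hmm, const std::vector<std::string>& words) {
    const std::size_t T = hmm.tags.size();
    const std::size_t N = words.size();
    assert(T > 0 && hmm.initial.size() == T && hmm.transition.size() == T && hmm.emission.size() == T);
    if (N == 0) return {};

    const double negInf = -std::numeric_limits<double>::infinity();
    // best[k][j]: log-probability of the best path ending in tag j at word k.
    std::vector<std::vector<double>> best(N, std::vector<double>(T, negInf));
    // back[k][j]: the previous tag on that best path.
    std::vector<std::vector<std::size_t>> back(N, std::vector<std::size_t>(T, 0));

    // Initialisation: first word.
    for (std::size_t j = 0; j < T; ++j)
        best[0][j] = safeLog(hmm.initial[j]) + safeLog(emissionProb(hmm, j, words[0]));

    // Forward pass: extend the best path to every tag at every position.
    for (std::size_t k = 1; k < N; ++k) {
        for (std::size_t j = 0; j < T; ++j) {
            const double emit = safeLog(emissionProb(hmm, j, words[k]));
            for (std::size_t i = 0; i < T; ++i) {
                const double score = best[k - 1][i] + safeLog(hmm.transition[i][j]) + emit;
                if (score > best[k][j]) {
                    best[k][j] = score;
                    back[k][j] = i;
                }
            }
        }
    }

    // Pick the best final tag, then follow the back-pointers.
    std::size_t last = 0;
    for (std::size_t j = 1; j < T; ++j)
        if (best[N - 1][j] > best[N - 1][last]) last = j;

    std::vector<std::string> path(N);
    for (std::size_t k = N; k-- > 0;) {
        path[k] = hmm.tags[last];
        if (k > 0) last = back[k][last];
    }
    return path;
}

// Builds the illustrative model: determiner, noun, verb.
ToyHmm makeToyHmm() {
    ToyHmm hmm;
    hmm.tags = {"DT", "NN", "VB"};
    hmm.initial = {0.6, 0.3, 0.1};
    hmm.transition = {{0.05, 0.90, 0.05}, {0.10, 0.20, 0.70}, {0.50, 0.30, 0.20}};
    hmm.emission.resize(3);
    hmm.emission[0]["the"] = 1.0;
    hmm.emission[1]["well"] = 0.4;
    hmm.emission[1]["tears"] = 0.6;
    hmm.emission[2]["well"] = 0.7;
    hmm.emission[2]["tears"] = 0.3;
    return hmm;
}
```

With these toy numbers, `viterbi(makeToyHmm(), {"the", "well"})` yields DT NN while `viterbi(makeToyHmm(), {"tears", "well"})` yields NN VB, so the same word receives a different tag depending on its neighbour. In a real tagger the probabilities would be estimated from a tagged training corpus rather than written by hand.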

Domains Explored

Machine Learning, Natural Language Processing, Dynamic Programming

Results

The POS tagging model, decoded with the Viterbi algorithm, reaches an accuracy of 0.9531. Accuracy is measured by comparing the predicted tags with the true labels in /data/test.pos.
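As a rough illustration of how such a score can be computed, the sketch below takes token-level accuracy to be the fraction of positions where the predicted tag equals the true tag. The function name `taggingAccuracy` is hypothetical, and the sketch assumes the predicted and true tags have already been loaded into two vectors of equal length. It does not read /data/test.pos and is not the code in components/results.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Fraction of positions where the predicted tag matches the true tag.
// Returns 0.0 for empty input (an arbitrary convention for this sketch).
double taggingAccuracy(const std::vector<std::string>& predicted,
                       const std::vector<std::string>& gold) {
    assert(predicted.size() == gold.size());
    if (gold.empty()) return 0.0;

    std::size_t correct = 0;
    for (std::size_t i = 0; i < gold.size(); ++i) {
        if (predicted[i] == gold[i]) ++correct;
    }
    return static_cast<double>(correct) / static_cast<double>(gold.size());
}
```

For example, three matching tags out of four give an accuracy of 0.75.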

Click here for a detailed description of all part-of-speech tags.

Output 1

I have one apple and three oranges

Output 2

Who is the president of USA?

Output 3

India is my country of residence

Documentation

For documentation, click here or refer to /documentation/README.md

File Structure

👨‍💻POS-Tagging
 ┣ 📂assets                            // Contains all the reference gifs, images
 ┣ 📂components                        // Source and header files
 ┃ ┣ 📄data.cpp
 ┃ ┣ 📄data.hpp
 ┃ ┣ 📄tokenize.cpp
 ┃ ┣ 📄tokenize.hpp
 ┃ ┣ 📄viterbi.cpp
 ┃ ┣ 📄viterbi.hpp
 ┃ ┣ 📄results.cpp
 ┃ ┣ 📄results.hpp
 ┣ 📂data                              // Dataset
 ┃ ┣ 📄dataset.pos
 ┃ ┣ 📄sample.pos
 ┃ ┣ 📄test.pos
 ┣ 📂documentation                     // Notes & Documentation for project
 ┃ ┣ 📄notes.pdf
 ┃ ┣ 📄README.md
 ┣ 📂miscellaneous                     // .ipynb implementation
 ┃ ┣ 📄POS-Tagging-C2_W2_Assignment
 ┣ 📄main.cpp
 ┣ 📄README.md

Project Workflow

Getting Started

Prerequisites

To download and use this code, the minimum requirements are:

g++ (the GNU C++ compiler, part of the GNU Compiler Collection, GCC) or any other C++ compiler
Windows 7 or later (64-bit), or any modern Linux distribution (e.g., Ubuntu, Debian, Fedora, Arch Linux)
Microsoft VS Code or any other IDE

Installation

Clone the project by running the following command in your terminal or Command Prompt:

git clone https://github.com/PritK99/POS-Tagging.git

Navigate to the POS-Tagging folder:

cd POS-Tagging

Usage

Once the requirements are satisfied, you can build and run the project on your machine with the following commands.

Build the code:

g++ ./main.cpp ./components/data.cpp ./components/tokenize.cpp ./components/viterbi.cpp ./components/results.cpp

Run the executable:

./a.out (For Linux)

or

./a (For Windows)

Acknowledgements and References

Natural Language Processing with Probabilistic Models by DeepLearning.AI
YouTube video by Serrano.Academy explaining Hidden Markov Model and Viterbi Algorithm

License

MIT License
