Direct-Neighbor-VRD

This repository contains the official implementation of the paper "Information Extraction from Visually Rich Documents Using Directed Weighted Graph Neural Network".

Abstract

This paper introduces a novel approach to information extraction (IE) from visually rich documents (VRDs) based on a directed weighted graph representation. Rather than relying on Euclidean distance, as traditional methods do, the approach captures relationships among VRD components through directed, weighted edges, which improves performance. The IE task is treated as a node classification problem, with graph convolutional networks (GCNs) processing the VRD graphs. Evaluations on five real-world datasets demonstrate the approach's effectiveness and its alignment with established norms.
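
To make the idea of message passing over a directed weighted graph more concrete, here is a minimal sketch in plain Python. It is not the paper's model: the node names, the feature values, the edge weights, and the normalization rule are all assumptions chosen for illustration. A real GCN layer would also apply a learned linear transform and a nonlinearity after this aggregation step.

```python
# Illustrative sketch only: one round of weighted message passing on a small
# directed graph. All names and numbers here are hypothetical.


def propagate(features, edges):
    """Aggregate each node's incoming neighbor features, scaled by edge weight.

    features: dict mapping node -> list of floats
    edges: list of (source, target, weight) tuples; direction matters
    """
    dim = len(next(iter(features.values())))
    # Start from each node's own features (a self-loop with weight 1.0).
    totals = {node: list(vec) for node, vec in features.items()}
    weight_sums = {node: 1.0 for node in features}
    for source, target, weight in edges:
        for i in range(dim):
            totals[target][i] += weight * features[source][i]
        weight_sums[target] += weight
    # Normalize by the total weight so feature scales stay comparable.
    return {
        node: [value / weight_sums[node] for value in totals[node]]
        for node in features
    }


# Three text boxes with toy 2-dimensional features.
features = {
    "total_label": [1.0, 0.0],
    "total_value": [0.0, 1.0],
    "footer": [0.0, 0.0],
}
# The edge label -> value exists, but value -> label does not.
edges = [("total_label", "total_value", 1.0), ("total_value", "footer", 0.5)]

updated = propagate(features, edges)
```

Because the edges are directed, "total_value" receives information from "total_label" but not the other way around, so "total_label" keeps its original features after this step.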

Dependencies

To run the code, you need the following libraries:

DGL (Deep Graph Library)
PyTorch (Deep Learning Framework)
Python (Programming Language)
NetworkX (Graph Library)
OpenCV-Python (Computer Vision Library)

You can install these dependencies using pip:

pip install -r requirements.txt

Usage

Building the Graph-based Dataset

To build a graph-based dataset, use the following command:

python builder.py build -d <dataset>

This command builds the graph-based dataset used for node classification from the selected source dataset.

Optional Arguments:

-d DATASET, --dataset DATASET: Choose the dataset to use. Options are XFUND, FUNSD, SROIE, Wildreceipt, or CORD.
-n MAX_NODE, --max_node MAX_NODE: Maximum number of neighbors per node (edges per node). Default is 6.

Example:

python builder.py build -d CORD
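
The effect of a per-node edge limit such as --max_node can be sketched as follows. This is a hedged illustration and not the code in builder.py: the function name cap_edges, the (source, target, weight) edge format, and the rule of keeping the heaviest outgoing edges are all assumptions.

```python
# Illustrative sketch only: cap the number of outgoing edges per node.
# The edge format and the "keep the heaviest edges" rule are assumptions.
from collections import defaultdict


def cap_edges(edges, max_node=6):
    """Keep at most `max_node` outgoing edges per source node.

    edges: list of (source, target, weight) tuples
    Returns the kept edges, heaviest first within each source node.
    """
    by_source = defaultdict(list)
    for source, target, weight in edges:
        by_source[source].append((source, target, weight))
    kept = []
    for source in by_source:
        ranked = sorted(by_source[source], key=lambda edge: edge[2], reverse=True)
        kept.extend(ranked[:max_node])
    return kept


candidate_edges = [
    ("a", "b", 0.9),
    ("a", "c", 0.2),
    ("a", "d", 0.7),
    ("b", "a", 0.4),
]
capped = cap_edges(candidate_edges, max_node=2)
```

With max_node=2, node "a" keeps its two heaviest edges (to "b" and "d") and drops the edge to "c", while node "b" keeps its single edge.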

Training the Model

To train the model, run train.py with the arguments listed below. To print the full list of options, use:

python train.py -h

Arguments:

-d DATANAME, --dataname DATANAME: Select the dataset for model training. Options are FUNSD, SROIE, Wildreceipt, or CORD.
-p PATH, --path PATH: Path to the dataset for model training.
-hs HIDDEN_SIZE, --hidden_size HIDDEN_SIZE: GCN hidden size. Default is 32.
-hl HIDDEN_LAYERS, --hidden_layers HIDDEN_LAYERS: Number of GCN hidden layers. Default is 20.
-lr LEARNING_RATE, --learning_rate LEARNING_RATE: Learning rate. Default is 0.01.
-e EPOCHS, --epochs EPOCHS: Number of epochs. Default is 200.

Example:

python train.py -d CORD -hs 64 -hl 128
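
For readers who want to see how the documented flags and defaults fit together, the following is a minimal argparse sketch. It mirrors only the options listed above and is not the repository's args.py; the function name build_parser and the help strings are assumptions, as is leaving every argument optional.

```python
# Illustrative sketch only: an argparse parser mirroring the documented flags.
# It is not the repository's args.py.
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="Train a GCN for node classification.")
    parser.add_argument(
        "-d", "--dataname",
        choices=["FUNSD", "SROIE", "Wildreceipt", "CORD"],
        help="Dataset used for model training.",
    )
    parser.add_argument("-p", "--path", help="Path to the dataset.")
    parser.add_argument("-hs", "--hidden_size", type=int, default=32,
                        help="GCN hidden size.")
    parser.add_argument("-hl", "--hidden_layers", type=int, default=20,
                        help="Number of GCN hidden layers.")
    parser.add_argument("-lr", "--learning_rate", type=float, default=0.01,
                        help="Learning rate.")
    parser.add_argument("-e", "--epochs", type=int, default=200,
                        help="Number of epochs.")
    return parser


# Parse the same arguments as the example command above.
args = build_parser().parse_args(["-d", "CORD", "-hs", "64", "-hl", "128"])
```

Flags that are not passed, such as -lr and -e in this example, fall back to the documented defaults of 0.01 and 200.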

Acknowledgments

We acknowledge the contributions of the authors of the paper and the developers of the libraries used in this project.
