lawsuits-summarizer

A lawsuit is a long document (5-100 pages) detailing a legal case. The goal of this project is to train a summarization model and expose an API endpoint that, given a lawsuit document, returns a short summary (fewer than 500 words). The summary should contain all the important details and be written in language that a person without legal training can understand.

Challenges.

Document length: lawsuits far exceed the standard input length (512-16384 tokens) that a pretrained language model can process.
Domain specificity: the summarization model must not only handle long documents but also comprehend legal language.
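
To give a feel for the length problem, here is a rough back-of-the-envelope sketch. The figure of 500 words per page and the one-word-per-token approximation are assumptions made for illustration (subword tokenizers usually produce more tokens than words), and `count_windows` is a hypothetical helper, not part of this repository.

```python
WORDS_PER_PAGE = 500  # assumption: a typical page of dense legal text


def count_windows(num_tokens, window):
    """Number of consecutive windows of size `window` needed to cover num_tokens."""
    return -(-num_tokens // window)  # ceiling division


for pages in (5, 100):
    # Assumption: one word is treated as roughly one token.
    tokens = pages * WORDS_PER_PAGE
    for window in (512, 16384):
        print(f"{pages} pages ~ {tokens} tokens: "
              f"{count_windows(tokens, window)} window(s) of {window}")
```

Under these assumptions, even a short lawsuit overflows a 512-token model, and a long one does not fit in a single 16384-token window.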

Solutions.

Employing extractive summarization before abstractive summarization to reduce the document's length.
Intermediate training and fine-tuning on a domain-specific dataset to master the domain language.
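
The extract-then-abstract idea can be sketched as follows. This is a minimal illustration, not the project's actual implementation: the extractive step here is a simple word-frequency heuristic chosen for brevity, whitespace-separated words stand in for model tokens, and `abstractive_model` is a hypothetical callable representing the fine-tuned summarization model.

```python
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter: break after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extract_salient(text, max_tokens):
    """Keep the highest-scoring sentences that fit in max_tokens, in original order."""
    sentences = split_sentences(text)
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence):
        # Average document-level frequency of the words in the sentence.
        words = re.findall(r"[a-z']+", sentence.lower())
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    chosen, used = [], 0
    for i in ranked:
        n = len(sentences[i].split())  # assumption: one word ~ one token
        if used + n <= max_tokens:
            chosen.append(i)
            used += n
    # Restore document order so the reduced text stays readable.
    return " ".join(sentences[i] for i in sorted(chosen))


def summarize(text, abstractive_model, max_tokens=512):
    """Shrink the document extractively, then hand it to the abstractive model."""
    reduced = extract_salient(text, max_tokens)
    return abstractive_model(reduced)
```

In use, `summarize(lawsuit_text, model_fn, max_tokens=16384)` would pass the model only the sentences that fit its input window, where `model_fn` is whatever function wraps the trained model.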


Directorman9/lawsuits-summarizer
