claimex

A command-line tool for extracting structured information from argument-rich or claim-rich PDF documents.

Requirements

Go 1.25.4+
Python 3.11+
Docker 28.1.1+

How It Works

Search for any topic via SearXNG
Specify the number of files to aggregate
Process the files with a spaCy Span Categorizer (SpanCat) model, trained on ~1500 silver labels, to detect and extract claim spans
View the analysis for each document, returned in JSON format
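
The search step can be illustrated with a small sketch. It only builds the request URL for SearXNG's search endpoint, which accepts `q`, `format`, and `pageno` query parameters (JSON output has to be enabled in the SearXNG settings). The base URL `http://localhost:8080` and the helper name `build_search_url` are assumptions made for illustration; this README does not describe how claimex itself issues the request.

```python
from urllib.parse import urlencode


def build_search_url(base_url: str, topic: str, page: int = 1) -> str:
    """Return a SearXNG search URL that requests JSON results.

    base_url is wherever the SearXNG instance is reachable
    (hypothetical here, e.g. a local Docker container).
    """
    params = {"q": topic, "format": "json", "pageno": page}
    return f"{base_url.rstrip('/')}/search?{urlencode(params)}"


if __name__ == "__main__":
    # Assumed local instance; adjust to match your own setup.
    print(build_search_url("http://localhost:8080", "carbon tax"))
```

The JSON response could then be filtered for PDF links and truncated to the requested number of files before processing.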

Pipeline Results

Spans

Sources
- Who made the claim
Claim Verbs
- The verb used to make the claim
Claim Modifiers
- Modifier(s) that indicate the strength/degree with which the claim was made
Claim Contents
- The claim being made

Other Info

Origin Document
- The document the spans were extracted from
Origin Sentence
- The sentence that contains a given span
Claim Density Score
- A value representing how claim-heavy a given document is
Confidence Score
- How confident the model was in predicting a given span
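
To make the fields above concrete, here is a minimal sketch of how one document's analysis might be modeled and serialized to JSON. The class names, field names, label strings, and sample values are all hypothetical; the actual JSON schema produced by claimex is not documented in this README and may differ.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ClaimSpan:
    label: str            # hypothetical label names, one per span type listed above
    text: str             # the extracted span text
    origin_sentence: str  # the sentence that contains the span
    confidence: float     # model confidence for this span


@dataclass
class DocumentAnalysis:
    origin_document: str  # the document the spans were extracted from
    claim_density: float  # how claim-heavy the document is
    spans: list[ClaimSpan] = field(default_factory=list)


# Illustrative values only, not real tool output.
sentence = "Researchers strongly argue that carbon taxes reduce emissions."
analysis = DocumentAnalysis(
    origin_document="example.pdf",
    claim_density=0.4,
    spans=[
        ClaimSpan("SOURCE", "Researchers", sentence, 0.9),
        ClaimSpan("CLAIM_MODIFIER", "strongly", sentence, 0.8),
        ClaimSpan("CLAIM_VERB", "argue", sentence, 0.9),
        ClaimSpan("CLAIM_CONTENT", "carbon taxes reduce emissions", sentence, 0.7),
    ],
)

if __name__ == "__main__":
    print(json.dumps(asdict(analysis), indent=2))
```

Keeping the origin sentence on each span makes every extracted claim traceable back to its context without reopening the PDF.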

NOTE: For optimal performance, files of up to ~200 MB are recommended
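
A quick pre-check against that recommendation might look like the sketch below. The threshold constant and the helper name `within_recommended_size` are assumptions for illustration, as is the choice to read "~200MB" as binary megabytes; claimex itself may or may not perform such a check.

```python
import os
import sys

# Assumption: "~200MB" interpreted as 200 * 1024 * 1024 bytes.
RECOMMENDED_MAX_BYTES = 200 * 1024 * 1024


def within_recommended_size(path: str, limit: int = RECOMMENDED_MAX_BYTES) -> bool:
    """Return True if the file at path is no larger than limit bytes."""
    return os.path.getsize(path) <= limit


if __name__ == "__main__":
    # Usage: python check_size.py file1.pdf file2.pdf ...
    for pdf_path in sys.argv[1:]:
        status = "ok" if within_recommended_size(pdf_path) else "larger than recommended"
        print(f"{pdf_path}: {status}")
```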

Getting Started

Follow the steps in `SETUP.md` to ensure all tools, modules, and other dependencies are installed and set up correctly.

Powered By

spaCy for NLP

SearXNG for search

MinIO for object storage

Repository

sam8beard/claim-extraction
