Rajackar/Email-parcel-information-extractor

Mail-Tracker

An IMAP email watcher that automatically detects parcel tracking numbers in incoming emails. It connects to your mailbox over IMAP (designed for use with Proton Bridge, but it works with any IMAP server), scans emails for tracking codes, and outputs structured JSON events.

Focused on carriers commonly used in Europe / The Netherlands.

Supported Carriers

Carrier         Example Code          Priority
PostNL          3SDEVC1234567         95
UPS             1Z999AA10123456784    90
DHL Express     JD123456789012        85
DHL Parcel NL   JVGL12345678901234    80
DPD             01234567890123        75
FedEx           7489123456789012345   70
Budbee          BUD-ABC123456         60
GLS             GLS123456789          50

Carriers and their regex patterns are fully configurable via carriers.json.

How It Works

  1. Connects to your IMAP server on a configurable interval
  2. Searches for unread emails (or all emails if PROCESS_READ=true)
  3. Extracts visible text from plain-text or HTML email bodies (invisible elements like scripts and tracking pixels are stripped)
  4. Matches text against carrier-specific regex patterns from carriers.json
  5. Outputs a JSON event per detected tracking number to stdout and optionally to a JSONL file
  6. Tracks processed UIDs in a state file to avoid duplicate processing
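
Step 3 can be sketched with the standard library alone. This is a minimal illustration, not the code in watcher.py: the class name VisibleTextParser, the helper visible_text, and the list of hidden tags are assumptions made for the example.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collect text nodes, skipping anything inside non-visible elements."""

    # Assumed set of elements whose content a mail client would not display.
    HIDDEN_TAGS = {"script", "style", "head", "title", "noscript", "template"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._hidden_depth = 0  # > 0 while inside a hidden element
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.HIDDEN_TAGS:
            self._hidden_depth += 1

    def handle_endtag(self, tag):
        if tag in self.HIDDEN_TAGS and self._hidden_depth > 0:
            self._hidden_depth -= 1

    def handle_data(self, data):
        # Tracking pixels are <img> tags with no text, so they add nothing here.
        if self._hidden_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def visible_text(html):
    """Return the displayed text of an HTML body as one string."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return parser.text()
```

Stripping script and style content before matching keeps identifiers embedded in markup or JavaScript from being mistaken for tracking codes.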

Quick Start

Docker Compose (recommended)

  1. Copy .env.example to .env and fill in your credentials:
IMAP_SERVER=192.168.1.20
IMAP_PORT=1143
USE_SSL=false
USE_STARTTLS=true
EMAIL_ACCOUNT=your@email.com
EMAIL_PASSWORD=your-bridge-password
MAILBOXES=INBOX,Folders/Receipts
CHECK_INTERVAL=30
CONNECT_TIMEOUT=10
CARRIERS_FILE=/app/carriers.json
STATE_FILE=/app/state/processed_uids.json
OUTPUT_JSONL=/app/state/events.jsonl
MARK_SEEN=false
PROCESS_READ=false
PYTHONUNBUFFERED=1
TZ=Europe/Amsterdam
KEYWORDS=track,tracking,pakket,package,shipment,bezorging,levering,verzending,parcel
  2. Start the container:
docker compose up -d

Standalone

pip install -r requirements.txt
export EMAIL_ACCOUNT="your@email.com"
export EMAIL_PASSWORD="your-password"
python watcher.py

Configuration

All configuration is done via environment variables:

Connection

Variable          Default        Description
IMAP_SERVER       192.168.1.20   IMAP server hostname or IP
IMAP_PORT         1143           IMAP server port
USE_SSL           false          Use IMAP4_SSL (port 993)
USE_STARTTLS      true           Use STARTTLS (ignored if USE_SSL=true)
EMAIL_ACCOUNT     (none)         IMAP username
EMAIL_PASSWORD    (none)         IMAP password
CONNECT_TIMEOUT   10             Connection timeout in seconds
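
How these variables might map onto Python's imaplib is sketched below. The helper names env_flag, choose_transport, and connect are hypothetical, the accepted boolean spellings are an assumption, and the timeout argument requires Python 3.9 or newer.

```python
import imaplib
import os


def env_flag(name, default="false"):
    """Read a boolean environment variable (accepted spellings are an assumption)."""
    return os.environ.get(name, default).strip().lower() in {"1", "true", "yes", "on"}


def choose_transport(use_ssl, use_starttls):
    """USE_SSL takes precedence; USE_STARTTLS only applies when USE_SSL is false."""
    if use_ssl:
        return "ssl"
    return "starttls" if use_starttls else "plain"


def connect():
    """Open and authenticate an IMAP connection using the documented defaults."""
    host = os.environ.get("IMAP_SERVER", "192.168.1.20")
    port = int(os.environ.get("IMAP_PORT", "1143"))
    timeout = float(os.environ.get("CONNECT_TIMEOUT", "10"))
    transport = choose_transport(
        env_flag("USE_SSL", "false"), env_flag("USE_STARTTLS", "true")
    )

    if transport == "ssl":
        conn = imaplib.IMAP4_SSL(host, port, timeout=timeout)
    else:
        conn = imaplib.IMAP4(host, port, timeout=timeout)
        if transport == "starttls":
            conn.starttls()  # upgrade the plain connection before logging in

    # No defaults for credentials: a missing variable raises KeyError.
    conn.login(os.environ["EMAIL_ACCOUNT"], os.environ["EMAIL_PASSWORD"])
    return conn
```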

Scanning

Variable         Default                     Description
MAILBOXES        INBOX,Folders/Receipts      Comma-separated list of IMAP folders to scan
CHECK_INTERVAL   30                          Seconds between poll cycles
PROCESS_READ     false                       If true, process all emails (read and unread). If false, only unread.
MARK_SEEN        false                       If true, mark processed emails as read. If false, leave them as-is.
KEYWORDS         track,tracking,pakket,...   Comma-separated keywords that boost detection confidence
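
A minimal sketch of how the scanning settings could translate into IMAP requests follows. All function names are hypothetical. Using BODY.PEEK[] to leave the \Seen flag untouched is standard IMAP behaviour, but whether watcher.py fetches this way is an assumption. How matched keywords feed into the confidence score is not documented here, so the sketch only reports which keywords occur.

```python
def parse_csv(value):
    """Split a comma-separated setting such as MAILBOXES or KEYWORDS."""
    return [item.strip() for item in value.split(",") if item.strip()]


def search_criterion(process_read):
    """PROCESS_READ=true scans every message, otherwise only unread ones."""
    return "ALL" if process_read else "UNSEEN"


def fetch_item(mark_seen):
    """BODY[] sets the \\Seen flag on fetch; BODY.PEEK[] leaves flags unchanged."""
    return "(BODY[])" if mark_seen else "(BODY.PEEK[])"


def matched_keywords(text, keywords):
    """Return the keywords that occur in the text, case-insensitively."""
    lowered = text.lower()
    return [kw for kw in keywords if kw.lower() in lowered]


# With an authenticated imaplib connection `conn`, one poll cycle might run:
#   for mailbox in parse_csv("INBOX,Folders/Receipts"):
#       conn.select(mailbox)
#       status, data = conn.uid("SEARCH", None, search_criterion(False))
#       for uid in data[0].split():
#           status, parts = conn.uid("FETCH", uid, fetch_item(False))
```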

Output

Variable        Default                          Description
STATE_FILE      /app/state/processed_uids.json   Path to the processed UIDs state file
CARRIERS_FILE   /app/carriers.json               Path to the carrier patterns config
OUTPUT_JSONL    /app/state/events.jsonl          Path to the JSONL output file (set empty to disable)
LOG_LEVEL       INFO                             Logging level (DEBUG, INFO, WARNING, ERROR)
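
The state file's exact layout is not documented here. The sketch below assumes a JSON object that maps each mailbox name to a list of processed UIDs, since IMAP UIDs are only unique within a mailbox. The function names are hypothetical.

```python
import json
import os
import tempfile


def load_state(path):
    """Return {mailbox: set of processed UIDs}; a missing or corrupt file means empty state."""
    try:
        with open(path, encoding="utf-8") as fh:
            raw = json.load(fh)
    except (FileNotFoundError, json.JSONDecodeError):
        return {}
    return {mailbox: set(uids) for mailbox, uids in raw.items()}


def save_state(path, state):
    """Write the state atomically so an interrupted write cannot truncate the file."""
    directory = os.path.dirname(path) or "."
    os.makedirs(directory, exist_ok=True)
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump({m: sorted(uids) for m, uids in state.items()}, fh)
    os.replace(tmp_path, path)  # atomic rename within the same directory


def new_uids(state, mailbox, uids):
    """Filter out UIDs that were already processed for this mailbox."""
    done = state.get(mailbox, set())
    return [uid for uid in uids if uid not in done]
```

Keeping the state file on a mounted volume (the /app/state directory in the defaults) lets the processed UIDs survive container restarts.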

Output Format

Each detected tracking number produces a JSON event on stdout and in the JSONL file:

{
  "ts": "2026-02-06T15:42:00+01:00",
  "mailbox": "Folders/Receipts",
  "uid": "110",
  "subject": "Verzendbevestiging order 12345",
  "from": "\"Shop\" <info@shop.nl>",
  "carrier": "PostNL",
  "tracking_number": "3SDEVC1234567",
  "normalized": "3SDEVC1234567",
  "confidence": 1.0,
  "candidates": [
    {"carrier": "PostNL", "normalized": "3SDEVC1234567", "priority": 95, "confidence": 1.0}
  ]
}
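
Because every line of the JSONL file is a standalone JSON object, downstream tools can read it one line at a time. The following sketch groups normalized tracking numbers by carrier. The field names come from the event shown above, while the function names and the min_confidence threshold are assumptions for illustration.

```python
import json
from collections import defaultdict


def parse_events(lines):
    """Yield one event dict per non-empty line of a JSONL stream."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def tracking_by_carrier(events, min_confidence=0.0):
    """Group unique normalized tracking numbers by carrier."""
    grouped = defaultdict(list)
    for event in events:
        if event["confidence"] < min_confidence:
            continue
        numbers = grouped[event["carrier"]]
        if event["normalized"] not in numbers:
            numbers.append(event["normalized"])
    return dict(grouped)


# Typical use against the default output path:
#   with open("/app/state/events.jsonl", encoding="utf-8") as fh:
#       print(tracking_by_carrier(parse_events(fh), min_confidence=0.5))
```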

Customizing Carriers

Edit carriers.json to add, remove, or adjust carriers. Each carrier entry supports:

{
  "CarrierName": {
    "patterns": ["\\bREGEX_PATTERN\\b"],
    "normalize": {"strip_spaces": true, "uppercase": true},
    "priority": 85,
    "example": "EXAMPLE123"
  }
}
  • patterns: array of regex patterns that match tracking codes
  • normalize: normalization rules applied to matched values
  • priority: when several carriers match the same code, the carrier with the higher priority (0–100) wins
  • example: example tracking number, for reference
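
The interplay of patterns, normalize, and priority can be sketched as follows. The regexes below are illustrative guesses that fit the example codes, not the patterns shipped in carriers.json, and the case-insensitive matching and the detect function are assumptions. Note that a pattern stored in JSON needs doubled backslashes ("\\b"), whereas the Python raw strings here use a single one.

```python
import re

# Hypothetical entries in the carriers.json shape (patterns are NOT the real ones).
CARRIERS = {
    "PostNL": {
        "patterns": [r"\b3S[A-Z0-9]{11}\b"],
        "normalize": {"strip_spaces": True, "uppercase": True},
        "priority": 95,
    },
    "DPD": {"patterns": [r"\b\d{14}\b"], "normalize": {}, "priority": 75},
    "FedEx": {"patterns": [r"\b\d{12,19}\b"], "normalize": {}, "priority": 70},
}


def normalize(value, rules):
    """Apply the normalization rules to a matched value."""
    if rules.get("strip_spaces"):
        value = re.sub(r"\s+", "", value)
    if rules.get("uppercase"):
        value = value.upper()
    return value


def detect(text, carriers):
    """Return one candidate per code, keeping the highest-priority carrier."""
    best = {}
    for name, cfg in carriers.items():
        for pattern in cfg["patterns"]:
            for match in re.finditer(pattern, text, flags=re.IGNORECASE):
                code = normalize(match.group(0), cfg.get("normalize", {}))
                candidate = {"carrier": name, "normalized": code, "priority": cfg["priority"]}
                if code not in best or candidate["priority"] > best[code]["priority"]:
                    best[code] = candidate
    return sorted(best.values(), key=lambda c: c["priority"], reverse=True)
```

In this sketch the 14-digit code 01234567890123 matches both the DPD and the FedEx pattern, and DPD is reported because its priority (75) is higher than FedEx's (70).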

Project Structure

├── watcher.py           # Main IMAP watcher and tracking detection
├── carriers.json        # Carrier regex patterns configuration
├── docker-compose.yml   # Docker Compose setup
├── requirements.txt     # Python dependencies
└── state/
    ├── processed_uids.json   # Tracks which emails have been processed
    └── events.jsonl          # Detected tracking events output
