kusl/MyImapDownloader

MyImapDownloader

Build and Test .NET 10 License: AGPL v3

A high-performance, cross-platform command-line tool for archiving emails from IMAP servers. Built with .NET 10, it features SQLite-backed indexing, UID-based delta syncing, and retry and circuit-breaker resilience.


Notice: This project contains code generated by Large Language Models such as Claude and Gemini. All code is experimental whether explicitly stated or not.


Key Features

Feature               Description
Delta Sync            Uses IMAP UIDs and SQLite indexing to fetch only messages that arrived since the last run
Read-Only Operations  Opens IMAP folders in FolderAccess.ReadOnly mode; never modifies or deletes server data
Robust Deduplication  Message-ID based deduplication; an indexed SQLite primary-key lookup runs before any message body is downloaded
Self-Healing Index    Automatically detects database corruption and rebuilds the index from .meta.json sidecar files
Resilience Patterns   Exponential backoff (up to 5 minutes) and circuit breaker via Polly
OpenTelemetry Native  Distributed tracing, metrics, and structured logging exported to JSONL files
Cross-Platform        Runs natively on Windows, Linux, and macOS
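
The capped exponential backoff named under Resilience Patterns can be pictured with a few lines of Python. This is a minimal sketch, not the project's Polly configuration: the 1-second base delay and the doubling factor are assumptions, and only the 5-minute cap comes from this README.

```python
def backoff_delays(attempts, base_seconds=1.0, cap_seconds=300.0):
    """Return the wait before each retry: base * 2**n, capped at cap_seconds.

    base_seconds is a hypothetical value; the 300 s cap mirrors the
    "up to 5 minutes" limit stated in this README.
    """
    return [min(base_seconds * (2 ** n), cap_seconds) for n in range(attempts)]


if __name__ == "__main__":
    # Print the schedule for ten consecutive failures.
    print(backoff_delays(10))
```

With these assumed values the delay stops growing once doubling would exceed the cap, so a long outage results in one retry every five minutes rather than ever-longer waits.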

Safety Guarantees

This application never deletes emails. The codebase is designed purely for archival and backup:

  • IMAP folders are opened in read-only mode (FolderAccess.ReadOnly)
  • No delete, move, or flag-modification commands exist in the codebase
  • Local archives are append-only: existing .eml files are never overwritten or removed
  • Nothing the remote server sends can trigger a deletion, because the tool has no code path that removes mail
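
At the protocol level, a read-only open corresponds to the IMAP EXAMINE command (RFC 3501) rather than SELECT. The sketch below shows the same idea with Python's standard imaplib instead of MailKit; the function name open_read_only is hypothetical and the code is an illustration, not the project's implementation.

```python
import imaplib


def open_read_only(host, username, password, folder="INBOX", port=993):
    """Connect over TLS and open a folder without write access.

    select(readonly=True) makes imaplib send EXAMINE instead of SELECT,
    which opens the mailbox in read-only mode.
    """
    conn = imaplib.IMAP4_SSL(host, port)
    conn.login(username, password)
    status, _ = conn.select(folder, readonly=True)
    if status != "OK":
        conn.logout()
        raise RuntimeError(f"could not open {folder!r} read-only")
    return conn
```

Calling open_read_only("imap.example.com", "user", "app-password") would return a logged-in connection on which messages can be searched and fetched; the host and credentials here are placeholders.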

Installation

Prerequisites

  • .NET 10 SDK (the build output targets net10.0)
  • Git (to clone the repository)

Build from Source

git clone https://github.com/collabskus/MyImapDownloader.git
cd MyImapDownloader
dotnet build -c Release

Run

# Linux/macOS
./MyImapDownloader/bin/Release/net10.0/MyImapDownloader \
  -s imap.gmail.com -u [email protected] -p "app-password" -o ~/EmailArchive

# Windows (PowerShell; the backtick continues the line)
.\MyImapDownloader\bin\Release\net10.0\MyImapDownloader.exe `
  -s imap.gmail.com -u [email protected] -p "app-password" -o C:\EmailArchive

Usage

Command-Line Options

Option         Short  Default       Description
--server       -s     required      IMAP server address
--username     -u     required      Email account username
--password     -p     required      Account password or App Password
--port         -r     993           IMAP port
--output       -o     EmailArchive  Output directory for archived emails
--all-folders  -a     false         Sync all folders, not just INBOX
--start-date   -      -             Filter: download emails after this date (yyyy-MM-dd)
--end-date     -      -             Filter: download emails before this date (yyyy-MM-dd)
--verbose      -v     false         Enable verbose/debug logging
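
To make the defaults and required flags concrete, here is a small Python model of the option table built with argparse. It is only a sketch of how the options combine; the real tool parses its arguments with CommandLineParser in C#, and the helper names parse_date and build_parser are hypothetical.

```python
import argparse
from datetime import datetime


def parse_date(text):
    """Parse the yyyy-MM-dd format used by --start-date and --end-date."""
    return datetime.strptime(text, "%Y-%m-%d").date()


def build_parser():
    """Mirror the option table: three required options, the rest defaulted."""
    parser = argparse.ArgumentParser(prog="MyImapDownloader")
    parser.add_argument("-s", "--server", required=True, help="IMAP server address")
    parser.add_argument("-u", "--username", required=True, help="Email account username")
    parser.add_argument("-p", "--password", required=True, help="Account or app password")
    parser.add_argument("-r", "--port", type=int, default=993, help="IMAP port")
    parser.add_argument("-o", "--output", default="EmailArchive", help="Output directory")
    parser.add_argument("-a", "--all-folders", action="store_true", help="Sync all folders")
    parser.add_argument("--start-date", type=parse_date, default=None)
    parser.add_argument("--end-date", type=parse_date, default=None)
    parser.add_argument("-v", "--verbose", action="store_true", help="Verbose logging")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

Omitting -s, -u, or -p makes the parser exit with a usage error, while every other option falls back to the default listed in the table.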

Examples

# Download INBOX only
dotnet run -- -s imap.gmail.com -u [email protected] -p "app-password" -o ~/EmailArchive

# Download all folders, limited to emails after a start date
dotnet run -- -s imap.gmail.com -u [email protected] -p "app-password" \
  -o ~/EmailArchive --all-folders --start-date 2020-01-01

# Custom output directory with verbose logging
dotnet run -- -s imap.gmail.com -u [email protected] -p "app-password" \
  -o ~/Documents/hikingfan_at_gmail_dot_com -v

Configuration

Gmail Setup

  1. Enable 2-Step Verification
  2. Generate an App Password
  3. Use the 16-character app password with -p

Other IMAP Providers

Provider            Server                 Port  Notes
Gmail               imap.gmail.com         993   Requires App Password
Outlook/Office 365  outlook.office365.com  993   May require App Password
Yahoo Mail          imap.mail.yahoo.com    993   Requires App Password
Fastmail            imap.fastmail.com      993   Requires an app password for third-party IMAP clients
ProtonMail          127.0.0.1              1143  Via ProtonMail Bridge

Application Settings (appsettings.json)

{
  "Telemetry": {
    "ServiceName": "MyImapDownloader",
    "ServiceVersion": "1.0.0",
    "OutputDirectory": "telemetry",
    "MaxFileSizeMB": 25,
    "EnableTracing": true,
    "EnableMetrics": true,
    "EnableLogging": true,
    "FlushIntervalSeconds": 5,
    "MetricsExportIntervalSeconds": 15
  }
}

Architecture & Storage

Output Structure

EmailArchive/
├── index.v1.db                    # SQLite index (deduplication + sync state)
├── INBOX/
│   ├── cur/                       # Downloaded messages
│   │   ├── 1702900000.abc123.mypc:2,S.eml
│   │   ├── 1702900000.abc123.mypc:2,S.eml.meta.json
│   │   └── ...
│   ├── new/                       # (Reserved for future use)
│   └── tmp/                       # Atomic write staging area
├── Sent/
│   └── cur/
│       └── ...
└── Archive/
    └── cur/
        └── ...

SQLite Database Schema

The index.v1.db file contains two tables:

-- Tracks all archived messages for deduplication
CREATE TABLE Messages (
    MessageId TEXT PRIMARY KEY,
    Folder TEXT NOT NULL,
    ImportedAt TEXT NOT NULL
);

-- Tracks sync state for delta downloads
CREATE TABLE SyncState (
    Folder TEXT PRIMARY KEY,
    LastUid INTEGER NOT NULL,
    UidValidity INTEGER NOT NULL
);
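
The following sketch shows how the two tables might be used from client code: a primary-key lookup for deduplication and a one-row-per-folder checkpoint. It uses Python's built-in sqlite3 module against the schema above; the function names are hypothetical and the queries are assumptions about reasonable usage, not the project's actual SQL.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS Messages (
    MessageId TEXT PRIMARY KEY,
    Folder TEXT NOT NULL,
    ImportedAt TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS SyncState (
    Folder TEXT PRIMARY KEY,
    LastUid INTEGER NOT NULL,
    UidValidity INTEGER NOT NULL
);
"""


def open_index(path=":memory:"):
    """Open (or create) the index and make sure both tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def is_archived(conn, message_id):
    """Primary-key lookup: has this Message-ID been stored before?"""
    row = conn.execute(
        "SELECT 1 FROM Messages WHERE MessageId = ?", (message_id,)
    ).fetchone()
    return row is not None


def record_message(conn, message_id, folder, imported_at):
    """INSERT OR IGNORE keeps the first record if the Message-ID is already known."""
    conn.execute(
        "INSERT OR IGNORE INTO Messages (MessageId, Folder, ImportedAt) VALUES (?, ?, ?)",
        (message_id, folder, imported_at),
    )
    conn.commit()


def save_checkpoint(conn, folder, last_uid, uid_validity):
    """One row per folder, replaced on every checkpoint."""
    conn.execute(
        "INSERT OR REPLACE INTO SyncState (Folder, LastUid, UidValidity) VALUES (?, ?, ?)",
        (folder, last_uid, uid_validity),
    )
    conn.commit()
```

Because MessageId is the primary key, SQLite maintains an index on it automatically, so the lookup in is_archived does not scan the table.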

Delta Sync Strategy

  1. Checkpoint Loading: On startup, retrieves LastUid and UidValidity for each folder from SQLite
  2. UID Search: Queries the server only for UIDs greater than LastUid, skipping messages that are already archived
  3. Header-First Verification: Fetches envelope metadata before downloading body; checks Message-ID against index
  4. Streaming Download: Streams email body directly to disk (minimal RAM usage)
  5. Atomic Write: Writes to tmp/, then moves to cur/ for crash safety
  6. Checkpoint Update: Updates LastUid in database after each successful batch
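
The steps above can be sketched end to end in Python with an in-memory stand-in for the server. Several details are assumptions made for illustration: the dictionary shapes, the <uidvalidity>.<uid>.eml file name, checkpointing after every message rather than per batch, and restarting from UID 0 when UIDVALIDITY changes (RFC 3501 says stored UIDs are no longer valid in that case, but this README does not describe how the tool reacts).

```python
import os
import tempfile


def atomic_write(folder_dir, name, payload):
    """Write under tmp/, then rename into cur/ so a crash never leaves a partial file there."""
    tmp_dir = os.path.join(folder_dir, "tmp")
    cur_dir = os.path.join(folder_dir, "cur")
    os.makedirs(tmp_dir, exist_ok=True)
    os.makedirs(cur_dir, exist_ok=True)
    tmp_path = os.path.join(tmp_dir, name)
    with open(tmp_path, "wb") as handle:
        handle.write(payload)
        handle.flush()
        os.fsync(handle.fileno())
    final_path = os.path.join(cur_dir, name)
    if os.path.exists(final_path):
        os.remove(tmp_path)  # append-only archive: keep the existing file
        return final_path
    os.replace(tmp_path, final_path)  # atomic when tmp/ and cur/ share a filesystem
    return final_path


def delta_sync(server, state, seen_ids, folder_dir):
    """Download only messages newer than the checkpoint; return the UIDs written.

    server:   {"uid_validity": int, "messages": {uid: (message_id, raw_bytes)}}
    state:    {"last_uid": int, "uid_validity": int}  (the SyncState row)
    seen_ids: set of Message-IDs already in the index
    """
    # Assumption: a changed UIDVALIDITY invalidates stored UIDs, so restart from 0.
    if state["uid_validity"] != server["uid_validity"]:
        state["last_uid"] = 0
        state["uid_validity"] = server["uid_validity"]

    written = []
    for uid in sorted(u for u in server["messages"] if u > state["last_uid"]):
        message_id, raw = server["messages"][uid]
        if message_id not in seen_ids:  # Message-ID check before storing the body
            name = f"{state['uid_validity']}.{uid}.eml"  # hypothetical naming scheme
            atomic_write(folder_dir, name, raw)
            seen_ids.add(message_id)
            written.append(uid)
        state["last_uid"] = uid  # the checkpoint advances even for duplicates
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as inbox:
        demo_server = {"uid_validity": 7, "messages": {1: ("<a@example>", b"one")}}
        demo_state = {"last_uid": 0, "uid_validity": 7}
        print(delta_sync(demo_server, demo_state, set(), inbox), demo_state)
```

Running delta_sync a second time with the same state writes nothing, since no UID exceeds the stored checkpoint.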

Self-Healing Recovery

If the SQLite database is corrupted or missing:

  1. Corrupt database is moved to index.v1.db.corrupt.<timestamp>
  2. Fresh database is created with schema
  3. All existing .meta.json files are scanned
  4. Index is rebuilt from sidecar metadata
  5. Sync continues without re-downloading existing emails
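
A rough Python sketch of this recovery sequence follows. The corruption test (PRAGMA integrity_check), the Unix-time suffix on the quarantined file, and the "MessageId" key inside each sidecar are all assumptions for illustration; this README does not specify the sidecar layout or how corruption is detected.

```python
import json
import os
import sqlite3
import time

SCHEMA = """
CREATE TABLE IF NOT EXISTS Messages (
    MessageId TEXT PRIMARY KEY, Folder TEXT NOT NULL, ImportedAt TEXT NOT NULL);
CREATE TABLE IF NOT EXISTS SyncState (
    Folder TEXT PRIMARY KEY, LastUid INTEGER NOT NULL, UidValidity INTEGER NOT NULL);
"""


def is_healthy(db_path):
    """True if SQLite can read the file and PRAGMA integrity_check reports ok."""
    conn = sqlite3.connect(db_path)
    try:
        row = conn.execute("PRAGMA integrity_check").fetchone()
        return row is not None and row[0] == "ok"
    except sqlite3.DatabaseError:
        return False
    finally:
        conn.close()


def rebuild_from_sidecars(conn, archive_dir):
    """Re-insert one Messages row per readable .meta.json file; return the row count."""
    now = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime())
    for root, _dirs, files in os.walk(archive_dir):
        for name in files:
            if not name.endswith(".meta.json"):
                continue
            try:
                with open(os.path.join(root, name), encoding="utf-8") as handle:
                    message_id = json.load(handle)["MessageId"]  # hypothetical key
            except (OSError, ValueError, KeyError, TypeError):
                continue  # skip unreadable sidecars instead of aborting the rebuild
            if not isinstance(message_id, str):
                continue
            # <archive>/<Folder>/cur/<file>: the folder is the parent of cur/
            folder = os.path.relpath(os.path.dirname(root), archive_dir)
            conn.execute(
                "INSERT OR IGNORE INTO Messages VALUES (?, ?, ?)",
                (message_id, folder, now),
            )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM Messages").fetchone()[0]


def open_or_rebuild(archive_dir, db_name="index.v1.db"):
    """Return (connection, rebuilt), where rebuilt tells whether recovery ran."""
    db_path = os.path.join(archive_dir, db_name)
    if os.path.exists(db_path):
        if is_healthy(db_path):
            conn = sqlite3.connect(db_path)
            conn.executescript(SCHEMA)
            return conn, False
        os.replace(db_path, f"{db_path}.corrupt.{int(time.time())}")
    conn = sqlite3.connect(db_path)
    conn.executescript(SCHEMA)
    rebuild_from_sidecars(conn, archive_dir)
    return conn, True
```

Note that this sketch restores only the Messages table. The SyncState checkpoints are not recoverable from the assumed sidecar contents, so the next run would rely on Message-ID deduplication to avoid downloading bodies again.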

Telemetry & Observability

Telemetry is written to XDG-compliant directories (or fallback locations) in JSONL format:

~/.local/share/MyImapDownloader/telemetry/
├── traces/
│   └── traces_2025-12-24_0001.jsonl
├── metrics/
│   └── metrics_2025-12-24_0001.jsonl
└── logs/
    └── logs_2025-12-24_0001.jsonl
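
The directory and file names in this layout can be reproduced with a short Python sketch. It follows the XDG Base Directory rule of using $XDG_DATA_HOME when it holds an absolute path and ~/.local/share otherwise. The function names are hypothetical, and reading the trailing 0001 as a per-day sequence number is an assumption based on the file names shown.

```python
import os
from datetime import date


def telemetry_root(app_name="MyImapDownloader", env=None):
    """Resolve <data home>/<app>/telemetry following the XDG Base Directory rules."""
    env = os.environ if env is None else env
    base = env.get("XDG_DATA_HOME", "")
    if not os.path.isabs(base):
        # Unset, empty, or relative values are ignored in favour of the default.
        base = os.path.join(os.path.expanduser("~"), ".local", "share")
    return os.path.join(base, app_name, "telemetry")


def telemetry_file(kind, day, sequence):
    """Build a relative path such as traces/traces_2025-12-24_0001.jsonl."""
    return os.path.join(kind, f"{kind}_{day.isoformat()}_{sequence:04d}.jsonl")


if __name__ == "__main__":
    root = telemetry_root()
    for kind in ("traces", "metrics", "logs"):
        print(os.path.join(root, telemetry_file(kind, date.today(), 1)))
```

The fallback locations used on platforms without XDG conventions are not covered here, since this README does not name them.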

Instrumented Spans

Span Name            Description
EmailArchiveSession  Root span for entire application run
DownloadEmails       IMAP connection and folder enumeration
ProcessFolder        Per-folder delta sync processing
ProcessEmail         Individual email download
SaveStream           Disk write and metadata extraction
RebuildIndex         Database recovery operation

Metrics

Metric                 Type       Description
storage.files.written  Counter    Total .eml files written
storage.bytes.written  Counter    Total bytes written to disk
storage.write.latency  Histogram  Write operation duration (ms)
emails.downloaded      Counter    Successfully downloaded emails
emails.skipped         Counter    Duplicates skipped
emails.errors          Counter    Download failures

Development

Build & Test

# Build
dotnet build

# Run tests
dotnet test

# Run with verbose output
dotnet run -- -s imap.example.com -u user -p pass -v

Project Structure

MyImapDownloader/
├── Directory.Build.props          # Shared build properties
├── Directory.Packages.props       # Centralized package versions
├── MyImapDownloader/              # Main application
│   ├── Program.cs                 # Entry point
│   ├── EmailDownloadService.cs    # IMAP sync logic
│   ├── EmailStorageService.cs     # SQLite + file storage
│   └── Telemetry/                 # OpenTelemetry exporters
└── MyImapDownloader.Tests/        # TUnit test suite

Key Dependencies

Package                Purpose
MailKit                IMAP client
Microsoft.Data.Sqlite  SQLite database
Polly                  Resilience patterns
OpenTelemetry          Observability
CommandLineParser      CLI argument parsing
TUnit                  Testing framework

License

This project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0).

This means:

  • You can use, modify, and distribute this software
  • If you modify and deploy it as a network service, you must release your source code
  • All derivative works must also be licensed under AGPL-3.0

See the LICENSE file for the complete license text.


Built with ❤️ using .NET 10
