MediaFlow

MediaFlow is a deterministic, filesystem-native media processing pipeline designed to bring long-term order, correctness, and cost control to photo and video libraries.

It treats photos and videos as digital records, not social content. MediaFlow does not try to understand or rank your memories. It ensures that what you already own remains chronologically correct, auditable, and storage-efficient over time.


🚀 Installation

Prerequisites

  • Node.js 18+
  • npm (comes with Node.js)

Option A: Install via npm (Recommended for Users)

Run the following command to install MediaFlow globally:

npm install -g @sivaraj-v/mediaflow

Verify installation:

mediaflow --version
# Should output: 1.0.0

Option B: Build from Source (For Developers)

If you want to contribute or modify the code:

# 1. Clone the repository
git clone https://github.com/sivaraj-v/mediaflow.git
cd mediaflow

# 2. Install dependencies
npm install

# 3. Build the project (bundles everything to dist/)
npm run build

# 4. Link for local development
npm link

Note: After linking, changes to dist/ are immediately reflected in the mediaflow command.


Why MediaFlow Exists

Modern media libraries grow automatically, not intentionally.

Photos and videos are continuously added by:

  • Cameras and burst modes
  • Messaging apps
  • Screenshots
  • Minor edits and exports
  • Cloud sync tools

Over time this creates silent entropy.

In real libraries, 30–60% of storage is typically consumed by:

  • Near-duplicate images
  • Low-value noise
  • Metadata corruption
  • Untracked derivatives

This leads to:

  • Rising cloud and backup costs
  • Slower backups and restores
  • Difficulty finding meaningful memories
  • Loss of trust in the archive itself

MediaFlow exists to stop this decay.


Design Principles

MediaFlow is built on a few strict principles:

  • Deterministic behavior (same input → same output)
  • Filesystem-native (no app lock-in)
  • No AI inference or probabilistic decisions
  • No irreversible or destructive operations
  • Full metadata preservation
  • Every decision must be explainable

If MediaFlow cannot decide safely, it does nothing.


Media Processing Pipeline

1. Organize by Date — Chronology Repair

The Problem

Dates are frequently corrupted when:

  • Media is shared via messaging apps
  • Files are copied between devices
  • Cloud services rewrite timestamps

Example:

IMG_20181225_092311.jpg
Filesystem date: 2023-06-14
EXIF DateTimeOriginal: 2018-12-25 09:23:11

Most gallery apps will incorrectly place this photo in 2023.

How MediaFlow Decides

MediaFlow applies a Zero Fallback policy so that files are organized by when they were actually captured on the device:

  1. Trust ONLY EXIF:DateTimeOriginal.
  2. If missing or invalid, the file is placed in unknown/.

Explicitly Ignored: CreateDate, FileCreateDate, FileModifyDate, birthtime, and mtime.

Result:

Files/
├── 2018-dec/
├── 2023-jun/
└── unknown/

This restores historical correctness.
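
The Zero Fallback rule can be sketched as a single pure function. This is a minimal illustration, not MediaFlow's actual code: the function name captureFolder is hypothetical, and it assumes the EXIF value arrives as a string in the standard "YYYY:MM:DD HH:MM:SS" form (the dash-separated form shown in the example above is accepted too). Anything missing or impossible maps to unknown.

```typescript
const MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
                "jul", "aug", "sep", "oct", "nov", "dec"];

// Hypothetical helper: maps an EXIF DateTimeOriginal string to a folder name
// such as "2018-dec". No filesystem date is ever consulted.
function captureFolder(dateTimeOriginal: string | undefined): string {
  if (dateTimeOriginal === undefined) return "unknown";

  // Require a full date and time; accept ":" or "-" as the date separator.
  const match = /^(\d{4})[:-](\d{2})[:-](\d{2})[ T]\d{2}:\d{2}:\d{2}/
    .exec(dateTimeOriginal.trim());
  if (match === null) return "unknown";

  const year = Number(match[1]);
  const month = Number(match[2]);
  const day = Number(match[3]);

  // Round-trip through Date to reject impossible values such as
  // 2023:02:30 or the all-zero placeholder 0000:00:00.
  const probe = new Date(Date.UTC(year, month - 1, day));
  const valid =
    probe.getUTCFullYear() === year &&
    probe.getUTCMonth() === month - 1 &&
    probe.getUTCDate() === day;

  const monthName = MONTHS[month - 1];
  if (!valid || monthName === undefined) return "unknown";
  return `${year}-${monthName}`;
}
```

With the example above, captureFolder("2018:12:25 09:23:11") yields 2018-dec regardless of the 2023 filesystem date, because the filesystem date is never an input.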

2. Group Similar Images — Duplicate & Burst Control

Real-World Input

IMG_4021.jpg
IMG_4022.jpg
IMG_4023.jpg
IMG_4021(1).jpg       # Messaging app duplicate
IMG_4021_EDIT.jpg     # Minor crop or filter

These files are visually similar but not identical.

How Grouping Works (Non-AI)

MediaFlow computes independent classical signals:

| Signal | Purpose |
|---|---|
| Perceptual hash | Near-duplicate detection |
| Color histogram | Visual similarity |
| Edge density | Structural similarity |
| Filename distance | Burst and app duplication |

Decision rule: A group is created only if 3 out of 4 signals agree.
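
A rough sketch of the 3-of-4 vote, assuming each signal has already been reduced to a yes/no verdict for a pair of images. The names SignalVotes, shouldGroup, and hammingDistance are hypothetical, and the README does not state the thresholds MediaFlow uses, so the Hamming-distance cutoff in the usage note is purely illustrative.

```typescript
// Hypothetical shape: one boolean per signal, true when that signal
// judged the two images to be similar.
interface SignalVotes {
  perceptualHash: boolean;
  colorHistogram: boolean;
  edgeDensity: boolean;
  filenameDistance: boolean;
}

// Number of differing bits between two equal-length hex-encoded hashes.
// This is a standard way to compare perceptual hashes.
function hammingDistance(hexA: string, hexB: string): number {
  if (hexA.length !== hexB.length) {
    throw new Error("hashes must have equal length");
  }
  let bits = 0;
  for (let i = 0; i < hexA.length; i++) {
    let diff = parseInt(hexA.charAt(i), 16) ^ parseInt(hexB.charAt(i), 16);
    while (diff > 0) {
      bits += diff & 1;
      diff >>= 1;
    }
  }
  return bits;
}

// Group only when at least `required` of the four signals agree.
function shouldGroup(votes: SignalVotes, required: number = 3): boolean {
  const verdicts = [
    votes.perceptualHash,
    votes.colorHistogram,
    votes.edgeDensity,
    votes.filenameDistance,
  ];
  return verdicts.filter((v) => v).length >= required;
}
```

A caller might set perceptualHash to hammingDistance(a, b) <= 10 (an assumed cutoff) and fill the other three flags the same way. Because the vote is a plain count, the same inputs always produce the same grouping.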

Output (Non-Destructive)

IMG_4021.group/
├── IMG_4021.jpg
├── IMG_4022.jpg
├── IMG_4023.jpg
├── IMG_4021(1).jpg
└── IMG_4021_EDIT.jpg

Nothing is deleted. No automatic “best photo” selection.

3. Optimize Images — Storage With Intent

Reality of Existing Tools

Many photo tools rewrite EXIF metadata, reset timestamps, or apply compression without visibility. This is risky for long-term archives.

MediaFlow Optimization (Rating-Aware)

MediaFlow respects existing ratings and applies explicit rules:

  1. Rating-Aware Processing: Star rating represents user intent, not internal codec logic.
  2. Lossless Guarantee: ★5 always remains lossless (direct copy).
  3. Explicit Metadata Tiers: Every processed image receives a searchable XMP tag (e.g., Quality/high).
  4. Deterministic Profiles: The per-format tables below are the single source of truth for every setting.

JPEG / JPG

| ★ Star | Intent | Quality | Chroma Subsampling | Metadata Tag |
|---|---|---|---|---|
| ★5 | Lossless | n/a | n/a | Quality/excellent |
| ★4 | High | 88 | 4:4:4 | Quality/high |
| ★3 | Balanced | 75 | 4:2:0 | Quality/standard |
| ★2 | Strong | 60 | 4:2:0 | Quality/low |
| ★1 | Aggressive | 45 | 4:2:0 | Quality/aggressive |

PNG

| ★ Star | Intent | Palette | Colours | Compression | Metadata Tag |
|---|---|---|---|---|---|
| ★4-5 | Lossless | No | n/a | Level 9 | Quality/high |
| ★3 | Balanced | Yes | 256 | Level 9 | Quality/standard |
| ★2 | Strong | Yes | 128 | Level 9 | Quality/low |
| ★1 | Aggressive | Yes | 64 | Level 9 | Quality/aggressive |

WebP (Modern, mixed content)

| ★ Star | Intent | Mode | Quality | Metadata Tag |
|---|---|---|---|---|
| ★5 | Lossless | Lossless | n/a | Quality/excellent |
| ★4 | High | Lossy | 85 | Quality/high |
| ★3 | Balanced | Lossy | 70 | Quality/standard |
| ★2 | Strong | Lossy | 55 | Quality/low |
| ★1 | Aggressive | Lossy | 40 | Quality/aggressive |

AVIF (Next-generation)

| ★ Star | Intent | Quality (CQ) | Speed | Metadata Tag |
|---|---|---|---|---|
| ★5 | Lossless | Lossless | Slow | Quality/excellent |
| ★4 | High | 60 | Medium | Quality/high |
| ★3 | Balanced | 45 | Medium | Quality/standard |
| ★2 | Strong | 35 | Slow | Quality/low |
| ★1 | Aggressive | 28 | Slow | Quality/aggressive |

Note: Unsupported formats or unrated files (0 stars) follow the "Balanced" (★3) profile unless manual mode is used.
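
To show how a rating could resolve to a profile, here is a small lookup sketch using the JPEG values listed above. The type and function names (JpegProfile, jpegProfileFor) are hypothetical and not taken from MediaFlow's source. A null quality stands for the ★5 direct copy, and unrated or out-of-range ratings fall back to Balanced, as the note above describes.

```typescript
// Hypothetical types; the numbers and tags are copied from the JPEG table.
interface JpegProfile {
  intent: string;
  quality: number | null;            // null: no re-encode (direct copy)
  chroma: "4:4:4" | "4:2:0" | null;  // null: not applicable
  tag: string;                       // XMP quality tag written to the file
}

const BALANCED: JpegProfile = {
  intent: "Balanced", quality: 75, chroma: "4:2:0", tag: "Quality/standard",
};

const JPEG_PROFILES: Record<number, JpegProfile> = {
  5: { intent: "Lossless", quality: null, chroma: null, tag: "Quality/excellent" },
  4: { intent: "High", quality: 88, chroma: "4:4:4", tag: "Quality/high" },
  3: BALANCED,
  2: { intent: "Strong", quality: 60, chroma: "4:2:0", tag: "Quality/low" },
  1: { intent: "Aggressive", quality: 45, chroma: "4:2:0", tag: "Quality/aggressive" },
};

// Unrated (undefined or 0) and unrecognized ratings use the Balanced profile.
function jpegProfileFor(rating: number | undefined): JpegProfile {
  if (rating === undefined) return BALANCED;
  return JPEG_PROFILES[rating] ?? BALANCED;
}
```

Keeping the profiles in one table and the fallback in one function means there is exactly one place where a rating turns into encoder settings, which makes the behavior easy to audit.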


4. Semantic Organize — Signal vs Noise

Output Structure

photos/
└── originals/
    └── IMG_1023.jpg

derived/
├── edits/
├── screenshots/
└── messaging/

videos/
└── VID_20200101.mp4

Nothing is deleted. Everything remains traceable.


📖 Usage & Commands

❓ Getting Help

View all commands and options:

mediaflow --help

View help for a specific command:

mediaflow organize --help

# View help for the new config management system
mediaflow config --help

1. Organize by Date

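Restores chronological order based on when the media was truly captured on device (using ONLY EXIF:DateTimeOriginal). No filesystem fallbacks.

The multi-line examples in this section use a trailing \ as the line continuation, which works in Bash-style shells. In PowerShell use a backtick instead, in cmd.exe use ^, or put the whole command on one line.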

mediaflow by-date \
  --source "C:\Photos\Camera Roll" \
  --destination "C:\Photos\Organized" \
  --format "year-month" \
  --action move

2. Group Similar Images

Groups bursts and duplicates without deleting them.

mediaflow group \
  --source "C:\Photos\Party" \
  --destination "C:\Photos\Party_Grouped"

3. Optimize Images

Compresses images based on star ratings (XMP) or a forced target size.

# Mode A: Rating-based (Default)
mediaflow optimize \
  --source "C:\Photos\Raw" \
  --destination "C:\Photos\Optimized" \
  --report

# Mode B: Size-based (Force all images to KB target)
mediaflow optimize \
  --source "C:\Photos\Raw" \
  --destination "C:\Photos\Optimized" \
  --target 200
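
This README does not describe how size-based mode reaches its target. One common approach, sketched here purely as an illustration and not as MediaFlow's method, is a binary search for the highest quality whose encoded size fits the budget. The function name and the encodedSize callback are hypothetical, and the sketch assumes encoded size never shrinks as quality increases.

```typescript
// Hypothetical callback: encodes the image at `quality` and returns the
// resulting size in bytes. A real implementation would call an encoder.
type SizeAtQuality = (quality: number) => number;

// Returns the highest quality in [minQ, maxQ] whose output fits in
// targetBytes, or null if even minQ is too large.
function highestQualityUnder(
  targetBytes: number,
  encodedSize: SizeAtQuality,
  minQ: number = 1,
  maxQ: number = 100,
): number | null {
  let lo = minQ;
  let hi = maxQ;
  let best: number | null = null;
  while (lo <= hi) {
    const mid = Math.floor((lo + hi) / 2);
    if (encodedSize(mid) <= targetBytes) {
      best = mid;     // fits: remember it and try a higher quality
      lo = mid + 1;
    } else {
      hi = mid - 1;   // too large: try a lower quality
    }
  }
  return best;
}
```

For a 200 KB target the caller would pass 200 * 1024 as targetBytes, assuming KB means 1024 bytes. The search needs at most seven trial encodes over the 1 to 100 range.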

4. Set Image Rating

Sets XMP star ratings on a single file or an entire folder recursively.

mediaflow rating \
  --source "C:\Photos\Holiday\IMG_001.jpg" \
  --star 5

# Or on an entire folder (recursive)
mediaflow rating \
  --source "C:\Photos\Holiday" \
  --star 4

5. General Organization

Sorts Screenshots, WhatsApp images, and Trash.

mediaflow organize \
  --source "C:\Downloads" \
  --destination "C:\Photos\Cleanup" \
  --mode whatsapp \
  --action move

How MediaFlow Differs From Existing Tools

Open-Source Tools

  • Excellent at single tasks
  • UI- or app-centric
  • No end-to-end archival logic

Image Optimizers

  • Stateless
  • Often destructive
  • No context or reversibility

Cloud & Gallery Platforms

  • AI-driven decisions
  • Irreversible behavior
  • Optimized for engagement, not ownership

MediaFlow adds the missing layer: deterministic, filesystem-level media hygiene.


⚙️ Configuration

MediaFlow uses a .mediaflowrc file to store your preferences. This file can be placed in your project directory (local) or your home directory (global).

🛠️ Configuration Management

1. Initialize Configuration

Quickly create a new .mediaflowrc with default settings.

# Create in current directory
mediaflow config init

# Create in home directory
mediaflow config init --global

# Overwrite existing config
mediaflow config init --force

2. Config Doctor (Diagnosis)

Validate your JSON syntax and check for deprecated or invalid keys.

mediaflow config doctor

📋 Configuration Schema

{
  "defaults": {
    "action": "copy",
    "format": "month"
  },
  "organize": {
    "mode": "whatsapp"
  },
  "group": {
    "cleanupAlt": true
  }
}

🔄 Configuration Status

| Item | Status |
|---|---|
| defaults.optimizeMode | ❌ Removed |
| lossy / lossless | ❌ Deprecated |
| Optimizer config | ❌ None (CLI Driven) |
| Hidden defaults | ❌ None |
| Future-proof | ✅ Yes |

Note: The optimizer is now strictly CLI-driven or deterministic via ratings to prevent accidental mass-compression via hidden config settings.
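
The kind of check that config doctor performs can be approximated as follows. This is a sketch under assumptions, not the real command: diagnoseConfig is a hypothetical name, the message wording is invented, and because this README does not say where the lossy and lossless keys used to live, the sketch flags them at any depth.

```typescript
// Keys taken from the status table above. Matching the deprecated keys at
// any nesting level is an assumption made for this sketch.
const REMOVED_PATHS = ["defaults.optimizeMode"];
const DEPRECATED_KEYS = ["lossy", "lossless"];

// Returns a list of human-readable problems; an empty list means "healthy".
function diagnoseConfig(text: string): string[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(text);
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    return [`Invalid JSON: ${reason}`];
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return ["Top level must be a JSON object"];
  }

  const problems: string[] = [];
  const walk = (node: Record<string, unknown>, prefix: string): void => {
    for (const key of Object.keys(node)) {
      const path = prefix === "" ? key : `${prefix}.${key}`;
      if (REMOVED_PATHS.indexOf(path) !== -1) problems.push(`Removed key: ${path}`);
      if (DEPRECATED_KEYS.indexOf(key) !== -1) problems.push(`Deprecated key: ${path}`);
      const value = node[key];
      if (typeof value === "object" && value !== null && !Array.isArray(value)) {
        walk(value as Record<string, unknown>, path);
      }
    }
  };
  walk(parsed as Record<string, unknown>, "");
  return problems;
}
```

Run against the schema shown above, this returns an empty list. A file that still contains defaults.optimizeMode produces one "Removed key" entry.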

Author

Sivaraj.v
Connect on LinkedIn


License

Apache License 2.0


Final Note

Media libraries are not feeds. They are long-lived records.

MediaFlow exists to keep those records:

  • Correct
  • Affordable
  • Understandable
  • Trustworthy

Measured in decades—not product cycles.

About

MediaFlow is a command-line tool that organizes media into structured folders, groups files by date or custom rules, and performs lossless optimization. It streamlines large-media workflows with predictable, configurable operations.
