Repomix silently skips files containing undecodable characters (�) or malformed UTF-8 input #752

@m0j0mada

Description

Repomix skips input files that contain undecodable or malformed characters—such as the Unicode replacement character (�, U+FFFD)—without logging an error or warning. This creates a false impression that all files were successfully processed when in fact some were silently excluded.

# Create a clean UTF-8 file
echo 'test' > file.txt
file -I file.txt
# => text/plain; charset=us-ascii

# Run Repomix (file is processed)
repomix
# => Total Files: 1 files

# Append a problematic character
echo '�' >> file.txt
file -I file.txt
# => text/plain; charset=utf-8

# Run Repomix again (file silently skipped)
repomix
📦 Repomix v1.2.1

✔ Packing completed successfully!

📈 Top 5 Files by Token Count:
──────────────────────────────────────────────────

🔎 Security Check:
──────────────────
✔ No suspicious files detected.

📊 Pack Summary:
────────────────
  Total Files: 0 files
 Total Tokens: 323 tokens
  Total Chars: 1,540 chars
       Output: repomix-output.md
     Security: ✔ No suspicious files detected

🎉 All Done!
Your repository has been successfully packed.

Expected Behavior:
Repomix should emit a clear error or warning when a file is skipped due to encoding or character issues, ideally with the filename and line number.

Actual Behavior:
The file is excluded without notice, and the file count drops unexpectedly. No error, warning, or debug message is shown.
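
Until such a warning exists, affected files can be located by hand. The following is a minimal sketch, assuming a POSIX shell with `grep` and `iconv` available. The function name `check_encoding` is hypothetical and not part of Repomix, and the checks are a guess at what triggers the skip (a literal U+FFFD, or bytes that are not valid UTF-8), not a description of Repomix's actual logic.

```shell
#!/bin/sh
# Hypothetical helper (not part of Repomix): report files that may be
# skipped because of encoding problems.
# Usage: check_encoding FILE...
check_encoding() {
  # U+FFFD encoded as UTF-8 is the byte sequence EF BF BD (octal 357 277 275).
  fffd=$(printf '\357\277\275')
  for f in "$@"; do
    # Print file:line:content for every line containing U+FFFD.
    # -a forces text mode; grep exits 1 on "no match", which is not an error here.
    LC_ALL=C grep -aHn -- "$fffd" "$f" || true
    # iconv exits non-zero when the input is not valid UTF-8.
    if ! iconv -f UTF-8 -t UTF-8 "$f" >/dev/null 2>&1; then
      echo "$f: malformed UTF-8"
    fi
  done
}
```

Running `check_encoding file.txt` after the append step in the reproduction above should print the file name and the number of the line holding the replacement character, which is the kind of detail the expected warning would carry.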

Usage Context

Repomix CLI

Repomix Version

v1.2.1

Node.js Version

No response

Labels

bug (Something isn't working), enhancement (New feature or request), released