FileFusion is a powerful command-line tool that concatenates and processes files into a format optimized for Large Language Models (LLMs).
Installation • Quick Start • Features • Documentation • Examples
- Features
- Quick Start
- Installation
- Basic Usage
- Configuration
- Advanced Features
- Examples
- Uninstallation
- License
FileFusion streamlines your file processing workflow with:
- Multiple Output Formats
  - Support for XML, JSON, and YAML
  - Preserved file metadata and structure
  - Configurable output formatting
- Smart Pattern Matching
  - Powerful glob patterns for precise file selection
  - Flexible include/exclude rules
  - Detailed pattern guide
- High Performance
  - Concurrent file processing
  - Efficient memory usage
  - Automatic file splitting for large outputs
- Advanced Size Control
  - Individual file size limits
  - Total output size management
  - Automatic output splitting
  - Detailed size reporting
- Intelligent Code Cleaning
  - Multi-language support
  - Comment preservation options
  - Code structure optimization
  - Whitespace management
- Reliability & Safety
  - Atomic write operations
  - Thorough error checking
  - Dry run support
  - Symlink handling
Get started with FileFusion in three simple steps:
- Install:
  curl -fsSL https://raw.githubusercontent.com/drgsn/filefusion/main/install.sh | bash
- Process current directory:
  filefusion
- Process specific files:
  filefusion --pattern "*.{js,py}" --clean -o output.xml /path/to/project
Using curl:
curl -fsSL https://raw.githubusercontent.com/drgsn/filefusion/main/install.sh | bash
Using wget:
wget -qO- https://raw.githubusercontent.com/drgsn/filefusion/main/install.sh | bash
# Download and inspect the script first
curl -fsSL https://raw.githubusercontent.com/drgsn/filefusion/main/install.sh > install.sh
chmod +x install.sh
./install.sh
Using Go:
go install github.com/drgsn/filefusion/cmd/filefusion@latest
Or download the latest binary for your platform from the releases page.
To uninstall FileFusion:
- Remove the installation directory:
rm -rf ~/.filefusion
- Remove FileFusion from your shell configuration file. Depending on your shell and OS, edit one of these files:
  - macOS Bash users: ~/.bash_profile
  - Linux Bash users: ~/.bashrc
  - Zsh users: ~/.zshrc
  - Fish users: ~/.config/fish/config.fish
  - Windows PowerShell users: $HOME/Documents/PowerShell/Microsoft.PowerShell_profile.ps1
Look for and remove these two lines:
# FileFusion
export PATH="$PATH:$HOME/.filefusion"
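If you would rather script this edit than do it by hand, the following is a minimal sketch of the idea. It assumes the two lines appear exactly as shown above and next to each other (Bash/Zsh syntax only). The helper name `strip_filefusion_lines` is hypothetical and is not part of FileFusion.

```python
def strip_filefusion_lines(config_text: str) -> str:
    """Return config_text without the '# FileFusion' marker and the PATH export after it."""
    marker = "# FileFusion"
    export_line = 'export PATH="$PATH:$HOME/.filefusion"'
    lines = config_text.splitlines()
    kept = []
    i = 0
    while i < len(lines):
        is_block = (
            lines[i].strip() == marker
            and i + 1 < len(lines)
            and lines[i + 1].strip() == export_line
        )
        if is_block:
            i += 2  # skip both lines of the assumed installer block
            continue
        kept.append(lines[i])
        i += 1
    # Preserve the trailing newline if the input had one.
    return "\n".join(kept) + ("\n" if config_text.endswith("\n") else "")


sample = 'alias ll="ls -l"\n# FileFusion\nexport PATH="$PATH:$HOME/.filefusion"\n'
print(strip_filefusion_lines(sample), end="")
```

The function only transforms text. Back up the configuration file before writing the result to it.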
If you installed using Go:
go clean -i github.com/drgsn/filefusion/cmd/filefusion
Setting | Default Value | Description |
---|---|---|
Pattern | *.go,*.json,*.yaml,*.yml | Default file patterns to process |
Max File Size | 10MB | Maximum size for individual input files |
Max Output Size | 50MB | Maximum total size for all processed content |
Max Output File | 30KB | Maximum size per output file (auto-splits) |
Output Format | XML | Default output format when not specified |
Exclude Pattern | none | No files excluded by default |
Clean Mode | disabled | Code cleaning and optimization |
Dry Run | disabled | Preview files to be processed |
# Process current directory with defaults
filefusion
# Process specific directory
filefusion /path/to/project
# Process multiple directories
filefusion /path/to/project1 /path/to/project2
# Generate specific output format
filefusion -o output.json /path/to/project
# Generate XML output
filefusion -o output.xml /path/to/project
# Generate JSON output
filefusion -o output.json /path/to/project
# Generate YAML output
filefusion -o output.yaml /path/to/project
For detailed pattern matching examples and rules, please refer to our Pattern Guide.
Here are some common patterns:
Pattern | Description |
---|---|
*.go | All Go files |
*.{go,proto} | All Go and Proto files |
src/**/*.js | All JavaScript files under src |
!vendor/** | Exclude vendor directory |
**/*_test.go | All Go test files |
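To make the table concrete, here is a rough Python sketch of how patterns like these could select paths. It is not FileFusion's matcher. The function names are hypothetical, and the rules are assumptions: a pattern without `/` is tested against the file name only, `*` stays within one path segment, `**` spans directories, and a leading `!` negates. The Pattern Guide remains the authority on the actual behavior.

```python
import re
from pathlib import PurePosixPath


def expand_braces(pattern: str) -> list[str]:
    """Expand {a,b} groups: '*.{go,proto}' becomes ['*.go', '*.proto']."""
    match = re.search(r"\{([^{}]*)\}", pattern)
    if match is None:
        return [pattern]
    head, tail = pattern[:match.start()], pattern[match.end():]
    expanded = []
    for option in match.group(1).split(","):
        expanded.extend(expand_braces(head + option + tail))
    return expanded


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a glob into a regex where '*' never crosses a '/'."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            parts.append(r"(?:.*/)?")  # zero or more directories
            i += 3
        elif pattern.startswith("**", i):
            parts.append(r".*")
            i += 2
        elif pattern[i] == "*":
            parts.append(r"[^/]*")
            i += 1
        elif pattern[i] == "?":
            parts.append(r"[^/]")
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts) + r"\Z")


def matches(path: str, pattern: str) -> bool:
    """Assumed rule: patterns without '/' are tested against the file name only."""
    target = path if "/" in pattern else PurePosixPath(path).name
    return any(glob_to_regex(p).match(target) for p in expand_braces(pattern))


def selected(path: str, patterns: list[str]) -> bool:
    """A path is kept if an include pattern matches and no '!' pattern does."""
    include = [p for p in patterns if not p.startswith("!")]
    exclude = [p[1:] for p in patterns if p.startswith("!")]
    return any(matches(path, p) for p in include) and not any(
        matches(path, p) for p in exclude
    )
```

For example, `selected("cmd/main.go", ["*.go", "!vendor/**"])` is true under these assumptions, while the same call with `"vendor/lib/x.go"` is false.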
# Process only Python and JavaScript files
filefusion --pattern "*.py,*.js" /path/to/project
# Process all source files
filefusion -p "*.go,*.rs,*.js,*.py,*.java" /path/to/project
# Include configuration files
filefusion -p "*.yaml,*.json,*.toml,*.ini" /path/to/project
# Exclude test files
filefusion --exclude "*_test.go,test/**" /path/to/project
# Exclude build and vendor directories
filefusion -e "build/**,vendor/**,node_modules/**" /path/to/project
# Complex exclusion
filefusion -e "**/*.test.js,**/*tests*/**,**/dist/**" /path/to/project
# Increase individual file size limit to 20MB
filefusion --max-file-size 20MB /path/to/project
# Increase total output size limit to 100MB
filefusion --max-output-size 100MB /path/to/project
# Set maximum size per output file to 50KB (splits into multiple files if exceeded)
filefusion --max-output-file-size 50KB /path/to/project
# Set all size limits and enable cleaning
filefusion --max-file-size 20MB --max-output-size 100MB --max-output-file-size 50KB --clean /path/to/project
Size limits accept the suffixes B, KB, MB, GB, and TB.
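As an illustration of what these suffixes mean in bytes, here is a small sketch. It assumes binary multiples (1 KB = 1024 bytes), which this README does not state, so treat the exact byte counts as an assumption. `parse_size` is a hypothetical helper, not a FileFusion function.

```python
import re

# Assumption: binary multiples. FileFusion may define the units differently.
UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}


def parse_size(text: str) -> int:
    """Convert a string such as '20MB' into a byte count."""
    match = re.fullmatch(r"\s*(\d+)\s*(B|KB|MB|GB|TB)\s*", text, flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    number, unit = match.groups()
    return int(number) * UNITS[unit.upper()]
```

Under that assumption, `parse_size("20MB")` gives 20971520 and `parse_size("30KB")` gives 30720.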
When the processed content exceeds max-output-file-size, FileFusion automatically splits the output into multiple files with sequential numbering (e.g., output.1.xml, output.2.xml, output.3.xml).
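If a script needs to find the split files afterwards, the naming scheme in the example can be reproduced as below. This is a sketch based only on the `output.1.xml` example. Both helper names are hypothetical, and the part count is just a lower bound, since the README does not say where FileFusion places the split points.

```python
import math
from pathlib import PurePath


def split_names(output: str, parts: int) -> list[str]:
    """Insert a 1-based sequence number before the extension: output.xml -> output.1.xml."""
    path = PurePath(output)
    return [
        str(path.with_name(f"{path.stem}.{n}{path.suffix}"))
        for n in range(1, parts + 1)
    ]


def estimate_parts(total_bytes: int, max_file_bytes: int) -> int:
    """Lower bound on the number of output files for a given per-file limit."""
    return max(1, math.ceil(total_bytes / max_file_bytes))


print(split_names("output.xml", estimate_parts(120 * 1024, 50 * 1024)))
```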
FileFusion includes a powerful code cleaning engine that optimizes files for LLM processing while preserving functionality. The cleaner supports multiple programming languages and offers various optimization options.
- Go, Java, Python, Swift, Kotlin
- JavaScript, TypeScript, HTML, CSS
- C++, C#, PHP, Ruby
- SQL, Bash
Option | Description | Default |
---|---|---|
--clean | Enable code cleaning | false |
--clean-remove-comments | Remove all comments | true |
--clean-preserve-doc-comments | Keep documentation comments | true |
--clean-remove-imports | Remove import statements | false |
--clean-remove-logging | Remove logging statements | true |
--clean-remove-getters-setters | Remove getter/setter methods | true |
--clean-optimize-whitespace | Optimize whitespace | true |
# Basic cleaning with default options
filefusion --clean input.go -o clean.xml
# Preserve all comments
filefusion --clean --clean-remove-comments=false input.py -o clean.xml
# Remove everything except essential code
filefusion --clean \
--clean-remove-comments \
--clean-preserve-doc-comments=false \
--clean-remove-logging \
--clean-remove-getters-setters \
input.java -o clean.xml
# Clean TypeScript while preserving docs
filefusion --clean \
--clean-preserve-doc-comments \
--clean-remove-logging \
--pattern "*.ts" \
src/ -o clean.xml
# Clean multiple languages in a project
filefusion --clean \
--pattern "*.{go,js,py}" \
--clean-preserve-doc-comments \
--clean-remove-logging \
project/ -o clean.xml
The cleaner automatically detects and handles language-specific patterns:
- Logging Statements: Recognizes common logging patterns
  - Go: log., logger.
  - JavaScript/TypeScript: console., logger.
  - Python: logging., logger., print
  - Java: Logger., System.out., System.err.
  - And more...
- Documentation: Preserves language-specific doc formats
  - Go: // and /* */ doc comments
  - Python: Docstrings
  - JavaScript/TypeScript: JSDoc
  - Java: Javadoc
- Code Structure: Maintains language idioms while removing noise
  - Preserves package/module structure
  - Keeps essential imports
  - Removes debug/test code
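The logging prefixes listed above can be read as a simple lookup table. The sketch below applies them line by line to show the idea. It is a deliberately naive approximation, not how FileFusion's cleaner works, and `remove_logging` and `LOGGING_PREFIXES` are hypothetical names.

```python
# Prefixes copied from the list above; the "javascript" entry also stands for TypeScript.
LOGGING_PREFIXES = {
    "go": ("log.", "logger."),
    "javascript": ("console.", "logger."),
    "python": ("logging.", "logger.", "print"),
    "java": ("Logger.", "System.out.", "System.err."),
}


def remove_logging(source: str, language: str) -> str:
    """Drop every line that starts (after indentation) with a known logging prefix.

    Limitations of this line-based check: it leaves behind the continuation
    lines of multi-line calls and it also drops lines such as 'printer = ...'.
    """
    prefixes = LOGGING_PREFIXES[language]
    kept = [
        line
        for line in source.splitlines()
        if not line.lstrip().startswith(prefixes)
    ]
    return "\n".join(kept)


go_source = 'func main() {\n\tlog.Println("start")\n\trun()\n}'
print(remove_logging(go_source, "go"))
```

A syntax-aware cleaner would avoid the limitations noted in the docstring, which is why this should be read as an illustration of the prefix list only.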
filefusion \
--pattern "*.go" \
--exclude "*_test.go,vendor/**" \
--output project.json \
--max-file-size 5MB \
/path/to/go/project
filefusion \
--pattern "*.js,*.ts,*.jsx,*.tsx,*.css,*.html" \
--exclude "node_modules/**,dist/**,build/**" \
--output web-project.xml \
/path/to/web/project
# Clean and optimize a Go project
filefusion \
--pattern "*.go" \
--exclude "*_test.go" \
--clean \
--clean-remove-comments \
--clean-remove-logging \
--output optimized.xml \
/path/to/go/project
# Clean TypeScript/JavaScript with preserved documentation
filefusion \
--pattern "*.ts,*.js" \
--clean \
--clean-preserve-doc-comments \
--clean-remove-logging \
--clean-optimize-whitespace \
--output web-optimized.xml \
/path/to/web/project
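When FileFusion is driven from a script rather than typed by hand, the same invocations can be assembled as an argument list. The sketch below uses only flags that appear in this README. `build_command` is a hypothetical helper, and it does not run anything by itself.

```python
import shlex


def build_command(paths, pattern=None, exclude=None, output=None,
                  clean=False, dry_run=False):
    """Assemble a filefusion argument list from flags documented in this README."""
    cmd = ["filefusion"]
    if pattern:
        cmd += ["--pattern", pattern]
    if exclude:
        cmd += ["--exclude", exclude]
    if output:
        cmd += ["--output", output]
    if clean:
        cmd.append("--clean")
    if dry_run:
        cmd.append("--dry-run")
    return cmd + list(paths)


cmd = build_command(
    ["/path/to/go/project"],
    pattern="*.go",
    exclude="*_test.go,vendor/**",
    output="project.json",
)
# Print a shell-quoted form for inspection before running it.
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` avoids shell quoting problems with patterns such as `*.go`, because no shell expands the arguments.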
<?xml version="1.0" encoding="UTF-8"?>
<documents>
  <document index="1">
    <source>main.go</source>
    <document_content>
      package main
      ...
    </document_content>
  </document>
</documents>
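An XML file with this shape can be read back with Python's standard library. The sketch below assumes only the element and attribute names visible in the example (`documents`, `document`, `index`, `source`, `document_content`). Whether real output escapes or wraps the file content differently is not stated here. `read_documents` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

sample = """<?xml version="1.0" encoding="UTF-8"?>
<documents>
<document index="1">
<source>main.go</source>
<document_content>
package main
</document_content>
</document>
</documents>"""


def read_documents(xml_bytes: bytes) -> list[tuple[int, str, str]]:
    """Return (index, source, content) for each <document> element."""
    root = ET.fromstring(xml_bytes)
    documents = []
    for doc in root.findall("document"):
        index = int(doc.get("index"))
        source = doc.findtext("source", default="")
        # Trim the newlines that surround the content inside its element.
        content = doc.findtext("document_content", default="").strip("\n")
        documents.append((index, source, content))
    return documents


print(read_documents(sample.encode("utf-8")))
```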
{
  "documents": [
    {
      "index": 1,
      "source": "main.go",
      "document_content": "package main\n..."
    }
  ]
}
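The JSON form is convenient for quick inspection, for instance to see which files contribute most to the output. The sketch below relies only on the keys shown in the example (`documents`, `source`, `document_content`). `sources_by_size` is a hypothetical helper.

```python
import json

sample = (
    '{"documents": [{"index": 1, "source": "main.go", '
    '"document_content": "package main\\n"}]}'
)


def sources_by_size(json_text: str) -> list[tuple[str, int]]:
    """List (source, content length in characters), largest first."""
    documents = json.loads(json_text)["documents"]
    sizes = [(d["source"], len(d["document_content"])) for d in documents]
    return sorted(sizes, key=lambda item: item[1], reverse=True)


print(sources_by_size(sample))
```

Character counts are only a rough proxy for the byte-based size limits described earlier.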
documents:
  - index: 1
    source: main.go
    document_content: |
      package main
      ...
- Start with Dry Run
  filefusion --dry-run /path/to/project
  This shows which files will be processed without making changes.
- Optimize for Large Projects
  filefusion --max-output-file-size 1MB --clean /path/to/project
  Use larger output file sizes and cleaning for better LLM processing.
- Handle Large Codebases
  filefusion --pattern "*.{go,js}" --exclude "test/**,vendor/**" /path/to/project
  Use specific patterns and exclusions to focus on relevant files.
- Check if patterns match your file extensions
- Verify files exist in the specified directory
- Make sure patterns don't conflict with exclusions
- Increase --max-output-size
- Use more specific patterns
- Split processing into multiple runs
- Check file permissions
- Verify file encodings (UTF-8 recommended)
- Ensure sufficient disk space
Mozilla Public License Version 2.0