
Feature Request: Automatic Chunking (a.k.a. “YOLO Divide”) for Oversized Codebases #424

@rockmandash

Description


It would be immensely helpful if Repomix could automatically split large codebases into multiple output files once a certain token threshold (e.g. 128k) is reached. This “YOLO divide” approach would allow users to simply specify a cutoff and let Repomix handle the heavy lifting. The resulting “chunked” files could then be fed sequentially into AI tools that have strict context size limits.


Use Case

I often want to provide my entire codebase to a large language model for analysis or troubleshooting, but the codebase exceeds the 128k token limit by a wide margin—sometimes up to ten times more. Manually chopping the output into smaller pieces is tedious. An automated chunking solution would streamline this process significantly.

Desired Behavior

  1. Config Option: A single setting, for example:

    {
      "output": {
        "yoloDivideIntoChunksIfExceedToken": 128000
      }
    }
  2. Automatic Chunk Generation:

    • If the total token count surpasses the specified threshold, Repomix splits the output into separate files (e.g., repomix-output-chunk-1.xml, repomix-output-chunk-2.xml, etc.).
    • The exact method of splitting doesn’t matter much to me. Any reasonable approach (by file boundaries, lines, or token count) would be sufficient, as long as each chunk stays under the limit.
  3. Outcome:

    • Users can quickly copy and paste each chunk into their AI tool of choice (e.g., O1 Pro, ChatGPT, Claude, etc.) without worrying about hitting token limits.
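To make the splitting idea concrete, here is a minimal sketch of greedy chunking along file boundaries. Everything in it is illustrative rather than Repomix API: the names `splitIntoChunks` and `estimateTokens` are hypothetical, and the token count uses the rough "~4 characters per token" heuristic rather than a real tokenizer.

```typescript
// Hypothetical sketch: pack whole files into chunks so each chunk stays
// under a token budget. Not actual Repomix code; names are illustrative.

interface PackedFile {
  path: string;
  content: string;
}

// Rough token estimate using the common ~4-characters-per-token heuristic.
// A real implementation would use a proper tokenizer (e.g. tiktoken).
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Greedy packing: start a new chunk whenever adding the next file would
// push the current chunk past the budget. A single file larger than the
// budget still becomes its own (oversized) chunk rather than being split.
function splitIntoChunks(files: PackedFile[], maxTokens: number): PackedFile[][] {
  const chunks: PackedFile[][] = [];
  let current: PackedFile[] = [];
  let currentTokens = 0;

  for (const file of files) {
    const tokens = estimateTokens(file.content);
    if (current.length > 0 && currentTokens + tokens > maxTokens) {
      chunks.push(current);
      current = [];
      currentTokens = 0;
    }
    current.push(file);
    currentTokens += tokens;
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}
```

Each resulting chunk would then be rendered to its own output file (`repomix-output-chunk-1.xml`, `repomix-output-chunk-2.xml`, ...). Splitting only at file boundaries keeps each file intact for the model, at the cost of some slack below the threshold in each chunk.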

Why This Matters

  • Efficiency: Eliminates manual slicing of the codebase output.
  • Ease of Use: Allows large repositories to be handled in one go, rather than requiring multiple runs or external scripts.
  • Flexibility: Users who just want “the whole codebase in the AI” can get a straightforward multi-file output to paste into their model in chunks.

Thank you for considering this request! An automatic chunking feature would be a game-changer for those of us dealing with large codebases and strict LLM context limits.

Metadata

Labels: enhancement (New feature or request)