Feature Request: Allow Specifying Custom File Formats for Processing #4

@pablotoledo

Description

As a user, I want to specify custom file formats for processing when using the Readium tool. Currently, the tool applies predefined filters and exclusions, which do not always meet my needs.

The examples below show how this feature could work in both command-line usage and library consumption:

Command-line Usage:

readium /path/to/directory --include-ext .docx,.xls

With this option, only files with the .docx and .xls extensions should be processed, regardless of any other exclusions or filters.
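To make the expected parsing concrete, here is a minimal sketch of how the comma-separated value could be turned into a normalized set of extensions. It uses `argparse` purely for illustration; the `parse_extensions` and `build_parser` helpers are hypothetical and are not taken from Readium's actual CLI code, and only the `--include-ext` flag name comes from this request.

```python
import argparse


def parse_extensions(raw: str) -> set[str]:
    """Turn a string like '.docx, XLS' into {'.docx', '.xls'}."""
    extensions = set()
    for item in raw.split(","):
        item = item.strip().lower()
        if not item:
            continue  # tolerate trailing commas and empty entries
        # Accept both 'docx' and '.docx' by adding the dot when missing
        extensions.add(item if item.startswith(".") else "." + item)
    return extensions


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser; the real CLI may be built differently
    parser = argparse.ArgumentParser(prog="readium")
    parser.add_argument("path", help="Directory to process")
    parser.add_argument(
        "--include-ext",
        type=parse_extensions,
        default=None,
        help="Comma-separated extensions to process, e.g. .docx,.xls",
    )
    return parser


args = build_parser().parse_args(["/path/to/directory", "--include-ext", ".docx,.xls"])
print(sorted(args.include_ext))
```

Normalizing case and the leading dot up front would let the rest of the tool compare extensions with a simple set lookup.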

Library Consumption:

from readium import ReadConfig, Readium

# Configure the reader with custom file formats
config = ReadConfig(
    max_file_size=5 * 1024 * 1024,  # 5MB limit
    include_extensions={".docx", ".xls"}  # Custom extensions to include
)

# Initialize reader
reader = Readium(config)

# Process directory with custom file formats
summary, tree, content = reader.read_docs('/path/to/directory')

# Access results
print("Summary:", summary)
print("\nFile Tree:", tree)
print("\nContent:", content)
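The filtering rule itself could look roughly like the sketch below: when `include_extensions` is set, it replaces the built-in extension filter; otherwise the defaults apply. The `should_process` function and the `DEFAULT_EXTENSIONS` set are hypothetical names used only for illustration and do not reflect Readium's internals. Other limits such as `max_file_size` are assumed to be checked separately.

```python
from pathlib import Path
from typing import Iterable, Optional

# Hypothetical stand-in for the tool's predefined extension filter
DEFAULT_EXTENSIONS = {".md", ".txt", ".py"}


def should_process(
    path: str,
    include_extensions: Optional[Iterable[str]] = None,
    default_extensions: Iterable[str] = DEFAULT_EXTENSIONS,
) -> bool:
    """Decide whether a file is processed, based on its extension only."""
    suffix = Path(path).suffix.lower()
    if include_extensions is not None:
        # A custom list overrides the predefined filters entirely
        return suffix in {ext.lower() for ext in include_extensions}
    return suffix in set(default_extensions)


custom = {".docx", ".xls"}
for name in ["report.DOCX", "budget.xls", "notes.md"]:
    print(name, should_process(name, custom))
```

Using `None` rather than an empty set as the "not configured" value keeps existing behavior unchanged for users who never set the option.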

Metadata

Labels

enhancement: New feature or request
