Thank you for your interest in contributing to DTAT OCR! This document provides guidelines and information for contributors.
- Clone the repository:
    git clone https://github.com/NotADevIAmaMeatPopsicle/DTAT-OCR.git
    cd DTAT-OCR
- Create a virtual environment:
    uv venv --python 3.12 --seed
    source .venv/bin/activate  # Linux/Mac
    .venv\Scripts\activate     # Windows
- Install dependencies:
    uv pip install -r requirements.txt
- Initialize the database:
    python worker.py init
- Run the development server:
    python -m uvicorn api:app --host 0.0.0.0 --port 8000 --reload
# Process a test document
python worker.py process samples/sample_paper.pdf --json
# Check system health
curl http://localhost:8000/health
- Check if the bug has already been reported in Issues
- If not, create a new issue with:
  - Clear description of the bug
  - Steps to reproduce
  - Expected vs actual behavior
  - System information (OS, Python version, GPU if applicable)
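The system information requested for a bug report can be gathered with a short script. This is a minimal sketch using only the Python standard library. The `collect_system_info` name is hypothetical and not part of DTAT OCR, and the GPU check assumes an NVIDIA card with `nvidia-smi` on the PATH; on any other setup it simply reports "not detected".

```python
import platform
import shutil
import subprocess


def collect_system_info() -> dict[str, str]:
    """Return OS, Python version, and GPU details for a bug report."""
    info = {
        "os": f"{platform.system()} {platform.release()}",
        "python": platform.python_version(),
        "gpu": "not detected",
    }
    # Assumption: NVIDIA GPUs are queried through nvidia-smi when installed.
    if shutil.which("nvidia-smi"):
        try:
            result = subprocess.run(
                ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
                capture_output=True,
                text=True,
                check=False,
            )
        except OSError:
            return info
        if result.returncode == 0 and result.stdout.strip():
            # One GPU name per output line; join them for a compact report.
            info["gpu"] = ", ".join(result.stdout.split("\n")).strip(", ")
    return info


if __name__ == "__main__":
    for key, value in collect_system_info().items():
        print(f"{key}: {value}")
```

Running the script prints one `key: value` line per field, which can be pasted directly into the issue.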
- Open an issue with the "enhancement" label
- Describe the feature and its use case
- Discuss implementation approach if you have ideas
- Fork the repository
- Create a feature branch:
    git checkout -b feature/my-feature
- Make your changes
- Test your changes thoroughly
- Commit with clear messages:
    git commit -m "Add feature X"
- Push to your fork:
    git push origin feature/my-feature
- Open a Pull Request
- Keep PRs focused on a single change
- Update documentation if needed
- Add tests for new functionality
- Ensure all existing tests pass
- Follow the existing code style
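As an illustration of adding a test alongside new functionality, here is a minimal sketch. The `normalize_whitespace` helper is hypothetical and stands in for whatever function a PR introduces. The plain `assert`-based `test_` functions assume a pytest-style runner, which this guide does not specify, so check how the existing tests are organized before copying the pattern.

```python
import re


def normalize_whitespace(text: str) -> str:
    """Collapse runs of whitespace into single spaces and trim the ends."""
    return re.sub(r"\s+", " ", text).strip()


def test_collapses_internal_whitespace() -> None:
    # Tabs and newlines between words become a single space.
    assert normalize_whitespace("Invoice \t No.\n 42") == "Invoice No. 42"


def test_whitespace_only_input_becomes_empty() -> None:
    assert normalize_whitespace("   ") == ""
```

Each test covers one behavior and has a name that states what it checks, which makes a failure easy to interpret during review.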
For significant architectural changes, please:
- Read existing ADRs in docs/adr/
- Create a new ADR using docs/adr/template.md
- Include the ADR in your PR for discussion
- Follow PEP 8 for Python code
- Use type hints where practical
- Keep functions focused and documented
- Prefer clarity over cleverness
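A short function in the spirit of these guidelines might look like the following. This is only a sketch: `clamp_confidence` is a hypothetical helper, not something taken from the DTAT OCR codebase.

```python
def clamp_confidence(
    score: float, low: float = 0.0, high: float = 1.0
) -> float:
    """Limit a confidence score to the range [low, high].

    Raises:
        ValueError: If low is greater than high.
    """
    if low > high:
        raise ValueError("low must not exceed high")
    # An explicit min/max pair reads more clearly than nested conditionals.
    return max(low, min(score, high))
```

The function does one thing, carries type hints and a docstring, and keeps lines within the PEP 8 length limit.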
By contributing, you agree that your contributions will be licensed under the MIT License.
Feel free to open an issue for any questions about contributing.