A tool for categorizing places and generating descriptions for a family activities application.
- AI-powered categorization of places
- Venue type classification (indoor/outdoor)
- Retail store detection
- Description generation for places
- Django web interface for managing data
- MongoDB integration for data storage
- Multi-endpoint support for AI model access
# Create a virtual environment
pyenv virtualenv dirmgr
pyenv activate dirmgr
# Install the package
pip install -e .
# Run Migrations
python src/dirsite/web/manage.py migrate

# Install with development dependencies
pip install -e ".[dev]"
# Set up pre-commit hooks
pre-commit install
# To run pre-commit on all files (may catch things that aren't real issues)
pre-commit run --all-files

# Start all services
docker-compose up -d

This project requires Python 3.8-3.12. It has been tested with Python 3.12.9.
The project is fully compatible with Python 3.12.9. When using Python 3.12:
- Create a virtual environment with Python 3.12:
  python3.12 -m venv venv
  source venv/bin/activate  # On Windows: venv\Scripts\activate
- Install the dependencies:
  pip install -e ".[dev]"  # Install with development dependencies
- Run the tests using the commands in the Development - Running Tests section. All tests are compatible with Python 3.12.9.
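The supported version range stated above can be checked before running anything else. The following is a minimal sketch, not part of the project: the names `MIN_VERSION`, `MAX_VERSION`, and `is_supported` are hypothetical and exist only for illustration.

```python
import sys

# Supported range taken from the requirement stated above (3.8 through 3.12).
MIN_VERSION = (3, 8)
MAX_VERSION = (3, 12)


def is_supported(version_info):
    """Return True if the (major, minor) pair falls inside the supported range."""
    major_minor = tuple(version_info[:2])
    return MIN_VERSION <= major_minor <= MAX_VERSION


if not is_supported(sys.version_info):
    found = ".".join(str(part) for part in sys.version_info[:3])
    print(f"Warning: Python {found} is outside the supported 3.8-3.12 range")
```

Comparing only the major and minor components means any 3.12.x patch release counts as supported.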
Create a .env file in the project root with the following variables:
MONGODB_CONNECTION_STRING=mongodb://username:password@hostname:port/
OLLAMA_ENDPOINTS=http://server1:11434/api/generate,http://server2:11434/api/generate
OLLAMA_MODEL=mistral:7b
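To illustrate how a comma-separated OLLAMA_ENDPOINTS value might be consumed, here is a rough sketch that splits the list, rotates through the endpoints in round-robin order, and builds (without sending) a request for Ollama's /api/generate endpoint. It is an assumption about usage, not the project's actual implementation: `parse_endpoints` and `build_request` are hypothetical helpers, the prompt is made up, and the sketch assumes the variables have already been exported into the process environment.

```python
import itertools
import json
import os
import urllib.request


def parse_endpoints(raw):
    """Split a comma-separated endpoint string, dropping blanks and stray spaces."""
    return [part.strip() for part in raw.split(",") if part.strip()]


def build_request(endpoint, model, prompt):
    """Build (but do not send) a non-streaming POST for Ollama's /api/generate."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# The fallback values mirror the example .env entries above; they are placeholders.
endpoints = parse_endpoints(
    os.environ.get(
        "OLLAMA_ENDPOINTS",
        "http://server1:11434/api/generate,http://server2:11434/api/generate",
    )
)
model = os.environ.get("OLLAMA_MODEL", "mistral:7b")

# Round-robin rotation: each next() call yields the following endpoint in turn.
rotation = itertools.cycle(endpoints)
prompt = "Is a trampoline park an indoor or outdoor venue?"  # hypothetical prompt
first = build_request(next(rotation), model, prompt)
second = build_request(next(rotation), model, prompt)
print(first.full_url)
print(second.full_url)
```

Passing either request to `urllib.request.urlopen` would send it. Round-robin is only one possible way to spread load across endpoints; the project may use a different strategy.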
python -m dirsite.cli categorize --test --sample-file data/examples/sample_place.json

python -m dirsite.cli generate-descriptions --process-all

python -m dirsite.web.manage runserver

# Process places
./scripts/process_places.sh
# Generate descriptions
./scripts/generate_descriptions.sh
# Run Django server
./scripts/run_django.sh

The project includes comprehensive unit and integration tests. You can run the tests in various ways:
# Run all tests with coverage reporting
pytest

# Run only unit tests
pytest tests/unit/
# Run only integration tests
pytest tests/integration/

# Run tests for categorizers
pytest tests/unit/categorizers/
# Run tests for database utilities
pytest tests/unit/db/
# Run tests for generators
pytest tests/unit/generators/

# Run a specific test file
pytest tests/unit/categorizers/test_venue_type.py
# Run with verbose output
pytest tests/unit/categorizers/test_venue_type.py -v

# Run a specific test function
pytest tests/unit/categorizers/test_venue_type.py::TestVenueTypeClassifier::test_classifier_name
# Drop into the debugger at the start of the test
pytest tests/unit/categorizers/test_venue_type.py::TestVenueTypeClassifier::test_classifier_name -v --trace

# Print each test name and result as it runs (useful for troubleshooting)
pytest -v
# Show local variables in tracebacks
pytest --showlocals
# Run tests that match a keyword expression
pytest -k "venue_type"
# Stop after the first failure, with verbose output and print statements shown
pytest -xvs
# Increase verbosity further
pytest -vv
# Generate HTML coverage report
pytest --cov=dirsite --cov-report=html

You can also manually test the categorization process using the provided CLI commands:
# Test with sample data
python -m dirsite.cli categorize --test --sample-file data/examples/sample_place.json --details-file data/examples/sample_place_detail.json
# Test description generation
python -m dirsite.cli generate-descriptions --test --sample-file data/examples/sample_place.json --details-file data/examples/sample_place_detail.json

The repo includes multiple example files for testing:
# Test with Best Buy example
python -m dirsite.cli categorize --test --sample-file data/examples/best_buy.json --details-file data/examples/best_buy_details.json
# Test with Marketplace example
python -m dirsite.cli categorize --test --sample-file data/examples/mkt_place.json --details-file data/examples/mkt_place_details.json

# Run Django server
python -m dirsite.web.manage runserver
# Run Django checks
python -m dirsite.web.manage check
# List Django URLs
python -m dirsite.web.manage show_urls

# Format code
black src/ tests/
# Lint code
flake8 src/ tests/
# Type check
mypy src/
# Build documentation
cd docs && make html

- src/dirsite/ - Main package
  - categorizers/ - AI classification modules
  - generators/ - Content generation modules
  - db/ - Database utilities
  - utils/ - Shared utilities
  - web/ - Django web interface
- data/ - Data directory
  - raw/ - Input data
  - processed/ - Output data
  - examples/ - Example data
- tests/ - Test directory
  - unit/ - Unit tests
  - integration/ - Integration tests
- docs/ - Documentation
- scripts/ - Utility scripts

MIT
Contributions are welcome! Please feel free to submit a Pull Request.