docs: add standardized check commands to README #686
Closed
- Replace direct URL downloads with ScienceBase API client (sciencebasepy)
- Remove niquests dependency in favor of native ScienceBase download handling
- Enhance logging for better debugging of file downloads and extractions
- Add explicit ZIP file cleanup after TIF extraction
- Update dependencies to include sciencebasepy and setuptools
Restructures the species habitat analysis script:
- Implement modular architecture with ScienceBaseClient, RasterSet, and HabitatDataProcessor classes for better maintainability
- Integrate sciencebasepy for ZIP-based habitat map downloads from USGS ScienceBase, with automatic cleanup of temporary files
- Add multi-format output support (CSV/Parquet/Arrow) with Arrow as default, using dictionary encoding for optimized storage and performance
- Enhance metadata by including species common and scientific names from ScienceBase API
- Add comprehensive CLI arguments for configuration and debug logging
- Improve robustness with better error handling and type annotations
* Switch to using zipfile.Path for more Pythonic ZIP file handling
* Enforce expectation of exactly one TIF file per ZIP
* Add error handling for unexpected file counts
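The single-TIF check described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from `species.py`: the helper names `find_single_tif` and `make_zip` are hypothetical, and the sketch only inspects top-level archive members, which is an assumption about the ZIP layout.

```python
import io
import zipfile


def find_single_tif(zip_source) -> str:
    """Return the name of the only top-level .tif member, or raise RuntimeError.

    Hypothetical helper: the real script's method is not shown in this PR page.
    """
    with zipfile.ZipFile(zip_source) as zf:
        root = zipfile.Path(zf)
        # Collect names while the archive is still open.
        tifs = [
            p.name
            for p in root.iterdir()
            if p.is_file() and p.name.lower().endswith(".tif")
        ]
    if len(tifs) != 1:
        raise RuntimeError(f"expected exactly one .tif file, found {len(tifs)}")
    return tifs[0]


def make_zip(names):
    """Build an in-memory ZIP with empty members (demonstration only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in names:
            zf.writestr(name, b"")
    buf.seek(0)
    return buf


if __name__ == "__main__":
    print(find_single_tif(make_zip(["habitat.tif", "readme.txt"])))
```

Failing loudly on zero or multiple TIFs keeps a malformed download from silently producing partial results.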
Refactors `species.py` to use TOML configuration (`_data/species.toml`) instead of `argparse`, improving flexibility and maintainability. Settings include `item_ids`, `vector_fp`, `output_dir`, `output_format`, and `debug`. Relative paths (e.g., `../data/us-10m.json`) are resolved relative to the TOML file, and basic validation is added. `RasterSet.extract_tifs_from_zips()` now uses `zipfile.Path` and enforces a single `.tif` file per ZIP, raising a `RuntimeError` otherwise. Type hinting fixes are also included.
- Rename species columns for consistency and clarity:
  - GAP_Species -> gap_species_code
  - percent_habitat -> habitat_yearround_pct
  - CommonName -> common_name
  - ScientificName -> scientific_name
- Expand module docstring with detailed information about:
  - Data sources and resolution
  - Projection details
  - Habitat value meanings
  - Output format options
- Improve code comments for future extensibility
- Reference habitat data source in species.toml
- List alternative output format options in toml

The changes prepare for potential future addition of seasonal habitat data (and summer/winter habitat data) while maintaining backward compatibility.
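The column renames listed in this commit amount to a simple old-to-new mapping. A minimal sketch, assuming rows are plain dictionaries (the actual script's data structures are not shown here, and `rename_columns` and the sample values are hypothetical):

```python
# Mapping taken from the commit message: old column name -> new column name.
COLUMN_RENAMES = {
    "GAP_Species": "gap_species_code",
    "percent_habitat": "habitat_yearround_pct",
    "CommonName": "common_name",
    "ScientificName": "scientific_name",
}


def rename_columns(row: dict) -> dict:
    """Return a copy of the row with renamed keys; unknown keys pass through."""
    return {COLUMN_RENAMES.get(key, key): value for key, value in row.items()}


if __name__ == "__main__":
    # Placeholder values for illustration only.
    sample = {"GAP_Species": "code-placeholder", "percent_habitat": 12.5, "county": "example"}
    print(rename_columns(sample))
```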
Enhance docstrings in the ScienceBaseClient class for clarity and completeness, specifically for the __init__, download_zip_files, and get_species_info methods. Add note about RuntimeWarning to analyze_habitat_rasters. Clarify error handling behavior in download_zip_files.
Document standardized check commands in README for local development. This provides a simple interim solution for running quality checks that match CI, without introducing additional dependencies. Closes #685
Closing this PR. The branch (feature/add-uv-test-task) includes commits that are part of another open PR. I'll be creating a new PR shortly with a clean history, containing only the README changes. This avoids rewriting shared history. I will update this comment with a link to the new PR once it's created.
Documents the standard quality check commands in the README to help contributors run the same checks locally that are performed in CI. This provides a simple, documented way to validate changes before submitting PRs, without introducing additional tooling dependencies.
Changes
Related Issues
Closes #685
Notes
`uv run` as a task runner: astral-sh/uv#5903