Conversation

@miso-belica
Owner

No description provided.

@miso-belica changed the title on Dec 26, 2025.

Old title: feat: Add Polish language support\n\nThis commit introduces Polish language support to the sumy library.\n\n- Adds 'polish.py' as a custom stemmer using 'pystempel'.\n- Includes 'polish.txt' for Polish stopwords.\n- Modifies '__init__.py' to register the new Polish stemmer and stopwords.
New title: Add Polish language support
Manamama at HP Old and others added 7 commits December 28, 2025 16:05
…nguage support to the sumy library.

- Adds 'polish.py' as a custom stemmer using 'pystempel'.
- Includes 'polish.txt' for Polish stopwords.
- Modifies '__init__.py' to register the new Polish stemmer and stopwords.
Added a section about recent changes to the README.
Add missing pieces to PR #224:
- Add test for Polish stemmer in test_stemmers.py
- Add Polish to setup.py extras_require with pystempel dependency
- Add Polish to setup.py classifiers
- Remove improper README change
- Clean up merge artifact files (.orig and .rej)

The Polish language support now includes:
- Polish stemmer using pystempel library
- Polish stopwords file
- Proper metadata and dependency configuration
- Test coverage
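
The packaging changes listed above could look roughly like the following. This is a minimal sketch, not the actual contents of setup.py: the variable name `POLISH_SETUP_KWARGS` is hypothetical, the extra is assumed to be named `polish`, and only the Polish entries are shown.

```python
# Hypothetical excerpt of the keyword arguments passed to setup();
# the entries for other languages in the real file are not reproduced here.
POLISH_SETUP_KWARGS = {
    "extras_require": {
        # Assumed extra name, which would allow: pip install sumy[polish]
        "polish": ["pystempel"],
    },
    "classifiers": [
        # Standard trove classifier for Polish-language support.
        "Natural Language :: Polish",
    ],
}
```

Declaring pystempel under an extra rather than as a core requirement keeps it out of the default install.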
CRITICAL FIX: The Polish stemmer imported pystempel at module level,
so importing sumy failed whenever pystempel was not installed,
even for users who only needed English or other languages.

Changes:
- Refactor polish.py to use lazy import pattern (like greek.py)
- Move pystempel import inside stem_word() function
- Add helpful error message if pystempel is not installed
- Cache the stemmer instance as a function attribute so it is not recreated on every call

Now pystempel is only required when actually using Polish language,
making it a true optional dependency like other special languages.

Tested:
- English works without pystempel installed ✓
- Polish gives clear error without pystempel ✓
- Polish works correctly with pystempel installed ✓
- Move try-except block inside hasattr check so import only happens once
- Update CHANGELOG.md with Polish language feature and fix

Performance improvement: pystempel import and initialization now only
happens on first use of Polish stemmer, not on every function call.
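
The lazy-import pattern these commits describe can be sketched as follows. This is an illustration under stated assumptions, not the merged polish.py: the `_backend` parameter is a hypothetical hook added here so the failure path can be exercised, and the calls `StempelStemmer.polimorf()` and `.stem()` are assumed from pystempel's documented usage.

```python
import importlib


def stem_word(word, _backend="stempel"):
    """Return the stem of a Polish word, loading pystempel on first use."""
    # The import and stemmer construction run only while the cache is empty,
    # so later calls skip straight to the cached instance.
    if not hasattr(stem_word, "_stemmer"):
        try:
            backend = importlib.import_module(_backend)
        except ImportError as error:
            raise ImportError(
                "Polish stemmer requires pystempel. "
                "Install it with 'pip install pystempel'."
            ) from error
        # Assumption: pystempel provides StempelStemmer.polimorf().
        stem_word._stemmer = backend.StempelStemmer.polimorf()
    # Assumption: stem() may return None for unknown words; fall back to input.
    return stem_word._stemmer.stem(word) or word
```

Because the `try`/`except` sits inside the `hasattr` check, a missing dependency only surfaces when the Polish stemmer is actually called, and a successful import is never repeated.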
@miso-belica miso-belica force-pushed the claude/complete-polish-language-eXKDA branch from 94e4f15 to f169d02 Compare December 28, 2025 16:07
The pystempel library has a bug with Python 3.13+ where it tries to
import Resource from importlib.resources, which was removed in Python 3.13.

- Add pytest.mark.skipif to skip the Polish test on Python 3.13+
- Add comment in pyproject.toml noting the compatibility issue
- Polish language support works fine on Python 3.8-3.12

Reference: https://github.com/dzieciou/pystempel/issues/44
3 participants