feat: update compatibility for transformers>=4.57.1 and vLLM 0.14 #61

Open
avinsongw wants to merge 1 commit into genlm:main from genwebcorp:feat/transformers-4.57-vllm-0.14-compat
Conversation

@avinsongw

Summary

  • Update dependency versions to support transformers>=4.57.1 and vLLM>=0.11.0
  • Fix deprecated DynamicCache.from_legacy_cache() API in HuggingFace backend
  • Major refactor of vLLM backend for V1 engine compatibility (vLLM 0.11+ removed V0 engine)

Changes

pyproject.toml

  • transformers>=4.57.1 (was >=4.36.0)
  • vllm>=0.11.0 (was >=0.6.6,<=0.10.0)
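
A sketch of the resulting dependency block (assuming a PEP 621-style `[project]` table; the actual table layout and neighboring entries in the repo may differ):

```toml
[project]
dependencies = [
    "transformers>=4.57.1",  # was >=4.36.0
    "vllm>=0.11.0",          # was >=0.6.6,<=0.10.0
]
```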

genlm/backend/llm/hf.py

  • Replace deprecated DynamicCache.from_legacy_cache(pasts) with DynamicCache(pasts)
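
The replacement follows the transformers deprecation path: newer releases accept the legacy tuple-of-tuples cache directly in the `DynamicCache` constructor. A minimal sketch of a version-tolerant helper (hypothetical, not part of this PR; `cache_cls` stands in for `DynamicCache` so the snippet runs without transformers installed):

```python
def make_dynamic_cache(cache_cls, pasts):
    """Build a cache object from the legacy tuple-of-tuples `pasts` format.

    Newer transformers (>=4.57) accept the legacy tuples directly in the
    constructor; older versions need the (now deprecated) classmethod.
    """
    try:
        # New-style path: DynamicCache(pasts)
        return cache_cls(pasts)
    except TypeError:
        # Older transformers: constructor takes no cache argument,
        # so fall back to the legacy classmethod.
        return cache_cls.from_legacy_cache(pasts)
```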

genlm/backend/llm/vllm.py

  • Remove PassThroughLogitsProcessor class (V1 doesn't support per-request logits processors)
  • Use max_logprobs parameter to get full vocabulary logprobs instead
  • Update Counter import path (vllm.utils.counter)
  • Remove deprecated disable_async_output_proc and disable_log_requests params
  • Update tokenizer access for V1 API (async_llm_engine.tokenizer)
  • Update cleanup method (async_engine.shutdown())
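
Since V1 dropped per-request logits processors, the full next-token distribution has to be recovered from the logprobs the engine returns. A minimal sketch of the densification and normalization check (hypothetical helpers illustrating the idea; in the PR this logic lives inside the backend, and the engine's `max_logprobs` must be raised to the vocabulary size for the full mapping to come back):

```python
import math

def dense_logprobs(token_logprobs, vocab_size, fill=float("-inf")):
    # token_logprobs: {token_id: logprob} as returned per position by vLLM
    # when the full vocabulary's logprobs are requested. Missing ids get
    # `fill` (probability zero).
    out = [fill] * vocab_size
    for tid, lp in token_logprobs.items():
        out[tid] = lp
    return out

def is_normalized(logps, tol=1e-6):
    # A full-vocab logprob vector should sum to ~1 in probability space.
    return abs(sum(math.exp(x) for x in logps) - 1.0) < tol
```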

tests/conftest.py

  • Add max_logprobs to ReferenceVirtualLM for V1 compatibility
  • Update tokenizer access
  • Simplify cleanup

Test plan

  • All HF backend tests pass
  • Cache/Trie/Vocabulary tests pass (47 passed, 3 skipped)
  • vLLM integration verified with standalone test (full vocab logprobs, properly normalized)

🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>