Conversation

@akaashrp (Contributor) commented on Nov 24, 2025

Changes

  • Migrate logit bias application, penalty application, and sampling to WebGPU kernels
  • Recompile models in binary-mlc-llm-libs after TVM FFI updates (v0_2_80)
  • Add enable_latency_breakdown field to extra_body to enable collection of statistics during sampleTokenFromLogits
  • Temporarily remove support for q0f32 models due to correctness issues
  • Upgrade devDependencies to resolve issues caused by outdated packages
  • Add .nvmrc and migrate CI to Node 24
  • Update ESLint and Jest configs
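The `enable_latency_breakdown` change above adds an opt-in field under `extra_body`. A minimal sketch of how a request might set it is below; only the field name comes from this PR, while the request shape and the helper function are assumptions for illustration.

```typescript
// Minimal sketch of opting into latency statistics via extra_body.
// Assumption: an OpenAI-style chat request shape; only the
// enable_latency_breakdown field itself is confirmed by this PR.

type ChatMessage = { role: "user" | "assistant" | "system"; content: string };

type ChatRequest = {
  messages: ChatMessage[];
  // Extension point for non-OpenAI fields, as used by this PR.
  extra_body?: { enable_latency_breakdown?: boolean };
};

// Hypothetical helper: returns a copy of the request with latency
// breakdown enabled, so statistics can be collected during
// sampleTokenFromLogits.
function withLatencyBreakdown(req: ChatRequest): ChatRequest {
  return {
    ...req,
    extra_body: { ...req.extra_body, enable_latency_breakdown: true },
  };
}

const req = withLatencyBreakdown({
  messages: [{ role: "user", content: "Hello!" }],
});
console.log(JSON.stringify(req.extra_body));
// -> {"enable_latency_breakdown":true}
```

The request built this way would then be passed to the engine's chat-completion call, with the extra statistics returned alongside the usual response.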

TVMjs / web-runtime

  • Upgrade to version 0.23.0-dev1 to maintain compatibility with TVM FFI updates

TVM and MLC commits

The WASMs that the config now pulls were compiled at the following TVM and MLC-LLM commits (mlc-ai/binary-mlc-llm-libs#158):
TVM: apache/tvm@c8515e1
MLC-LLM: mlc-ai/mlc-llm@4084e7f

@CharlieFRuan (Member) left a comment

This is great work. Thank you so much!

@akaashrp akaashrp merged commit f43cc5c into mlc-ai:main Nov 24, 2025
1 check passed
