Releases: xorbitsai/inference
v0.14.1
What's new in 0.14.1 (2024-08-09)
These are the changes in inference v0.14.1.
New features
- FEAT: support SenseVoice audio-to-text model by @qinxuye in #2008 (usage sketch after this list)
- FEAT: support flux.1-schnell & flux.1-dev by @qinxuye in #2007
- FEAT: support kolors image model by @qinxuye in #2028
- FEAT: Add support for llama-3.1-instruct 405B model by @frostyplanet in #2025
- FEAT: Support CogVideoX video model by @codingl2k1 in #2049
- FEAT: Support MiniCPM-v-2_6 by @Minamiyama in #2031
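A minimal sketch of exercising the new SenseVoice support above from the Python client; the endpoint URL, the registered model name `SenseVoiceSmall`, and the sample file are assumptions for illustration, not taken from this changelog:

```python
from xinference.client import Client

client = Client("http://localhost:9997")  # assumed local Xinference endpoint

# "SenseVoiceSmall" is an assumed registered model name; check your local registrations.
model_uid = client.launch_model(model_name="SenseVoiceSmall", model_type="audio")
model = client.get_model(model_uid)

with open("sample.wav", "rb") as f:       # any short speech clip
    result = model.transcriptions(f.read())

# The response is assumed to follow the OpenAI-style transcription format with a "text" field.
print(result["text"])
```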
Enhancements
- ENH: Improve internal server error by @codingl2k1 in #2009
- ENH: Add `stream` option in Benchmark by @Dawnfz-Lenfeng in #2038
- ENH: optimize availability of vLLM by @qinxuye in #2046
- ENH: [worker] Allow lazy init of supervisor_ref by @frostyplanet in #1958
- ENH: optimize performance of sglang by @qinxuye in #2050
- REF: Mark `prompt`, `system_prompt` and `chat_history` parameters as deprecated in the `chat` client interface by @ChengjieLi28 in #2043 (migration sketch after this list)
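For the deprecation noted above, a rough migration sketch; the `messages` keyword and the response shape follow the OpenAI-compatible convention assumed here, so verify the exact transitional signature against the client reference for your version:

```python
from xinference.client import Client

client = Client("http://localhost:9997")      # assumed local endpoint
model = client.get_model("my-llm")            # hypothetical model uid

# Deprecated style (still accepted but marked for removal):
#   model.chat(prompt="Hi", system_prompt="Be brief.", chat_history=[])
# Preferred style: pass an OpenAI-compatible message list instead.
response = model.chat(
    messages=[
        {"role": "system", "content": "Be brief."},
        {"role": "user", "content": "Summarize this release in one sentence."},
    ],
    generate_config={"max_tokens": 128},
)
print(response["choices"][0]["message"]["content"])
```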
Bug fixes
- BUG: fix flexible model register in worker by @frostyplanet in #2011
- BUG: [UI] Fix the 'model_path' bug by @yiboyasss in #2015
- BUG: fix custom embedding launch error by @amumu96 in #2016
Tests
- TST: Fix some dependency version issues by @ChengjieLi28 in #2042
Documentation
- DOC: Directly launch custom model by `model_path` by @ChengjieLi28 in #2047
- DOC: fix typo in README by @ArtificialZeng in #2048
Others
- CHORE: Increased frequency of issue processing by @ChengjieLi28 in #2024
New Contributors
- @ArtificialZeng made their first contribution in #2048
- @Dawnfz-Lenfeng made their first contribution in #2038
Full Changelog: v0.14.0...v0.14.1
v0.14.0.post1
What's new in 0.14.0.post1 (2024-08-05)
These are the changes in inference v0.14.0.post1.
Enhancements
- ENH: Improve internal server error by @codingl2k1 in #2009
Bug fixes
- BUG: fix flexible model register in worker by @frostyplanet in #2011
- BUG: [UI] Fix the 'model_path' bug by @yiboyasss in #2015
- BUG: fix custom embedding launch error by @amumu96 in #2016
Full Changelog: v0.14.0...v0.14.0.post1
v0.14.0
What's new in 0.14.0 (2024-08-02)
These are the changes in inference v0.14.0.
New features
- FEAT: Supports `model_path` input when launching models by @Valdanitooooo in #1918 (see the sketch after this list)
- FEAT: Support gte-Qwen2-7B-instruct and multi gpu deploy by @amumu96 in #1994
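To illustrate the `model_path` feature above, a minimal launch sketch; the checkpoint path, engine, format, and size are assumptions chosen for the example:

```python
from xinference.client import Client

client = Client("http://localhost:9997")  # assumed local endpoint

# Point `model_path` at a checkpoint already on disk instead of letting
# Xinference download it; the path below is a placeholder.
model_uid = client.launch_model(
    model_name="qwen2-instruct",
    model_type="LLM",
    model_engine="transformers",
    model_format="pytorch",
    model_size_in_billions=7,
    model_path="/data/models/Qwen2-7B-Instruct",
)
print(model_uid)
```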
Enhancements
- ENH: Add sglang support for llama 3 and qwen 2 by @luweizheng in #1947
- ENH: add cache_limit_gb option for MLX by @qinxuye in #1954
- ENH: [benchmark] Add api-key support by @frostyplanet in #1961
- ENH: Support for Gemma 2 and Llama 3.1 Models for vllm & sglang by @vikrantrathore in #1929
- ENH: [K8s] worker log dir name by @ChengjieLi28 in #1997
- ENH: support `image_to_image` by @qinxuye in #1986 (client sketch after this list)
- REF: enable sglang by default by @qinxuye in #1953
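A sketch of the new `image_to_image` client call mentioned above; the model uid, the input file, and any parameter beyond `image` and `prompt` are assumptions to be checked against the client reference:

```python
from xinference.client import Client

client = Client("http://localhost:9997")        # assumed local endpoint
model = client.get_model("my-image-model")      # hypothetical uid of a launched image model

with open("input.png", "rb") as f:
    result = model.image_to_image(
        image=f.read(),
        prompt="turn the sketch into a watercolor painting",
    )

# The response is assumed to follow the OpenAI images format
# (a "data" list whose entries carry a "url" or "b64_json").
print(result["data"][0])
```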
Bug fixes
- BUG: Fix GLM chat by @codingl2k1 in #1966
- BUG: fix match for transformers from model registered by @qinxuye in #1955
- BUG: Load llama.so failed in docker image by @ChengjieLi28 in #1974
- BUG: [UI] Modifying 'model format' again resulted in an error message by @yiboyasss in #1990
- BUG: fix loading multiple gguf parts by @qinxuye in #1987
Documentation
- DOC: ascend support by @qinxuye in #1978
- DOC: add CosyVoice doc by @qinxuye in #1980
- DOC: Documents for K8s by @ChengjieLi28 in #2004
New Contributors
- @vikrantrathore made their first contribution in #1929
- @Valdanitooooo made their first contribution in #1918
Full Changelog: v0.13.3...v0.14.0
v0.13.3
What's new in 0.13.3 (2024-07-26)
These are the changes in inference v0.13.3.
New features
- FEAT: GLM4 supports streaming tool calls by @codingl2k1 in #1876
- FEAT: support csg-wukong-chat-v0.1 by @qinxuye in #1916
- FEAT: [UI] Add configuration for image and audio models by @yiboyasss in #1922
- FEAT: support mistral-nemo-instruct by @qinxuye in #1936
- FEAT: CosyVoice speech by @codingl2k1 in #1881 (usage sketch after this list)
- FEAT: add llama-3.1, llama-3.1-instruct by @Weaxs in #1932
- FEAT: support mistral-large-instruct by @qinxuye in #1944
- FEAT: support for llama 3.1 for vllm by @Phoenix500526 in #1935
- FEAT: add rembg flexible model to remove background of image by @qinxuye in #1917
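A minimal sketch for the CosyVoice speech feature above; the registered model name, the voice label, and the output format are assumptions for illustration:

```python
from xinference.client import Client

client = Client("http://localhost:9997")  # assumed local endpoint

# "CosyVoice-300M-SFT" is assumed to be the registered name of an SFT variant.
model_uid = client.launch_model(model_name="CosyVoice-300M-SFT", model_type="audio")
model = client.get_model(model_uid)

# Built-in voice labels depend on the CosyVoice variant; "中文女" is illustrative only.
audio_bytes = model.speech("Hello from Xinference!", voice="中文女")
with open("output.mp3", "wb") as f:
    f.write(audio_bytes)
```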
Enhancements
- ENH: added MLX for llama-3-instruct, codestral, Yi-1.5-chat, internlm2.5-chat by @qinxuye in #1908
- ENH: add gptq for llama-3-instruct by @Phoenix500526 in #1934
New Contributors
- @Phoenix500526 made their first contribution in #1934
Full Changelog: v0.13.2...v0.13.3
v0.13.2
What's new in 0.13.2 (2024-07-19)
These are the changes in inference v0.13.2.
New features
- FEAT: support sd inpainting models by @qinxuye in #1879
- FEAT: Stream ChatTTS by @codingl2k1 in #1812
- FEAT: support codegeex4 by @qinxuye in #1888
- FEAT: support internlm2.5-chat & internlm2.5-chat-1m by @qinxuye in #1887
Bug fixes
- BUG: Fix stream unicode issue for chinese characters when using vllm backend by @ChengjieLi28 in #1865
- BUG: sglang stream error when stream_option is not set by @wxiwnd in #1901
- BUG: fix client import by @amumu96 in #1905
Full Changelog: v0.13.1...v0.13.2
v0.13.1
What's new in 0.13.1 (2024-07-12)
These are the changes in inference v0.13.1.
New features
- FEAT: support choosing download hub by @amumu96 in #1841 (launch sketch after this list)
- FEAT: [UI] Specify download hub by @yiboyasss in #1840
- FEAT: Add support for Flexible Model by @shellc in #1671
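To illustrate choosing a download hub as described above, a hedged launch sketch; the `download_hub` value and the other launch parameters are assumptions picked for the example:

```python
from xinference.client import Client

client = Client("http://localhost:9997")  # assumed local endpoint

# `download_hub` selects where model weights are pulled from; the accepted
# values (e.g. "huggingface", "modelscope") should be confirmed in the docs.
model_uid = client.launch_model(
    model_name="qwen2-instruct",
    model_type="LLM",
    model_engine="transformers",
    model_format="pytorch",
    model_size_in_billions=7,
    download_hub="modelscope",
)
print(model_uid)
```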
Enhancements
- ENH: Update ChatTTS by @codingl2k1 in #1776
- ENH: Added the parameter 'worker_ip' to the 'register' model by @hainaweiben in #1773
- REF: Remove `chatglm-cpp` and fix latest `llama-cpp-python` issue by @ChengjieLi28 in #1844
Others
- FIX: [UI] Historical parameter echo bugs by @yiboyasss in #1810
- FIX: [UI] Fix download_hub bugs by @yiboyasss in #1846
- CHORE: Close issue when it is stale by @ChengjieLi28 in #1827
- CHORE: Update issue template by @ChengjieLi28 in #1833
Full Changelog: v0.13.0...v0.13.1
v0.13.0
What's new in 0.13.0 (2024-07-05)
These are the changes in inference v0.13.0.
Enhancements
- ENH: added gguf files for qwen2 by @qinxuye in #1745
- ENH: Add more log modules by @ChengjieLi28 in #1771
- ENH: Continuous batching supports `vision` model ability by @ChengjieLi28 in #1724
- ENH: Add guard for model launching by @frostyplanet in #1680
- BLD: Supports Aliyun docker image by @ChengjieLi28 in #1753
- BLD: GPU docker uses `vllm` image as base by @ChengjieLi28 in #1759
- BLD: Pin `llama-cpp-python` to `v0.2.77` in Docker for stability by @ChengjieLi28 in #1767
Bug fixes
- BUG: Fix glm4 tool call by @codingl2k1 in #1747
- BUG: [UI] Fix authentication mode related bugs by @yiboyasss in #1772
- BUG: Fix python client returns documents for rerank task by default by @ChengjieLi28 in #1780
- BUG: Fix LLM-based reranker that may raise a TypeError by @codingl2k1 in #1794
- BUG: fix deepseek-vl-chat by @qinxuye in #1795
Tests
- TST: Fix `llama-cpp-python` issue in CI by @ChengjieLi28 in #1763
Documentation
- DOC: Update continuous batching and docker usage by @ChengjieLi28 in #1785
Full Changelog: v0.12.3...v0.13.0
v0.12.3
What's new in 0.12.3 (2024-06-28)
These are the changes in inference v0.12.3.
New features
- FEAT: [UI] Add favorite function by @yiboyasss in #1714
- FEAT: add SD3 support by @qinxuye in #1723
- FEAT: [UI] Add the function of automatically obtaining the last configuration information by @yiboyasss in #1730
- FEAT: support jina-rerank-v2 by @qinxuye in #1733
- FEAT: `tensorizer` integration by @Zihann73 in #1579
- FEAT: Delete cluster by @hainaweiben in #1719
Enhancements
- ENH: Set the CSG Hub endpoint as an environment variable by @hainaweiben in #1666
- BLD: pin `chatglm-cpp` version `v0.3.x` by @ChengjieLi28 in #1692
Bug fixes
- BUG: [UI] Fix deleting prompt_style when Model Family is other by @yiboyasss in #1707
- BUG: GGUF models cannot use GPU in docker by @ChengjieLi28 in #1710
- BUG: Fix tool call observation by @codingl2k1 in #1648
- BUG: [UI] Fix favorite bug by @yiboyasss in #1728
- BUG: curl with stream returns unicode chars rather than Chinese characters by @ChengjieLi28 in #1732
- BUG: Cluster info can be accessed without authorization in the auth mode by @ChengjieLi28 in #1731
Full Changelog: v0.12.2...v0.12.3
v0.12.2.post1
What's new in 0.12.2.post1 (2024-06-22)
These are the changes in inference v0.12.2.post1.
Enhancements
- BLD: pin `chatglm-cpp` version `v0.3.x` by @ChengjieLi28 in #1692
Full Changelog: v0.12.2...v0.12.2.post1
v0.12.2
What's new in 0.12.2 (2024-06-21)
These are the changes in inference v0.12.2.
New features
- FEAT: Add Tools Support for Qwen Series MOE Models by @zhanghx0905 in #1642
- FEAT: [UI] Modify the deletion function of a custom model by @yiboyasss in #1656
- FEAT: [UI] Custom model presents JSON data and modifies it by @yiboyasss in #1670
- FEAT: Add Rerank model token input/output usage by @wxiwnd in #1657
Enhancements
- ENH: Continuous batching supports all the models with `transformers` backend by @ChengjieLi28 in #1659
Bug fixes
- BUG: show error when user launches a quantized model without a supported device by @Minamiyama in #1645
- BUG: Fix default rerank type by @codingl2k1 in #1649
- BUG: chat_completion gives no response when errors occur more than 100 times by @liuzhenghua in #1663
Tests
- TST: Fix CI due to `tenacity` by @ChengjieLi28 in #1660
Others
- CHORE: [pre-commit] Add exclude thirdparty rules by @frostyplanet in #1678
Full Changelog: v0.12.1...v0.12.2