Releases · xorbitsai/inference
v0.7.0
What's new in 0.7.0 (2023-12-08)
These are the changes in inference v0.7.0.
Enhancements
- ENH: upgrade insecure requests when necessary by @waltcow in #712
- ENH: [UI] Using tab in running models by @ChengjieLi28 in #714
- ENH: [UI] supports launching rerank models by @ChengjieLi28 in #711
- ENH: [UI] Error can be shown on web UI directly via Snackbar by @ChengjieLi28 in #721
- ENH: [UI] Supports `n_gpu` config when launching LLM models on web UI by @ChengjieLi28 in #730 (see the sketch after this list)
- ENH: [UI] `n_gpu` default value `auto` by @ChengjieLi28 in #738
- ENH: [UI] Support unregistering custom model on web UI by @ChengjieLi28 in #735
- ENH: Auto recover model actor by @codingl2k1 in #694
- ENH: Allow rerank models to run with LLM models on the same device by @aresnow1 in #741
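The `n_gpu` option surfaced in the web UI (#730, #738) mirrors a launch parameter that can also be passed through the Python client. A minimal sketch, assuming a local endpoint at `http://localhost:9997` and that `launch_model` forwards `n_gpu` (`"auto"` or an explicit GPU count); parameter handling may differ by version:

```python
from xinference.client import Client

client = Client("http://localhost:9997")  # assumed local Xinference endpoint

# Assumption: n_gpu="auto" lets Xinference decide how many GPUs to allocate;
# an integer would pin an explicit count instead.
model_uid = client.launch_model(
    model_name="llama-2-chat",
    model_size_in_billions=7,
    model_format="pytorch",
    quantization="none",
    n_gpu="auto",
)
print(model_uid)
```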
Bug fixes
- BUG: Auto patch trust remote code for embedding model by @codingl2k1 in #710
- BUG: Fix vLLM backend by @codingl2k1 in #728
Others
- Update builtin model list by @onesuper in #709
- Revert "ENH: upgrade insecure requests when necessary" by @qinxuye in #716
- CHORE: Format js file and check js code style by @ChengjieLi28 in #727
Full Changelog: v0.6.5...v0.7.0
v0.6.5
What's new in 0.6.5 (2023-12-01)
These are the changes in inference v0.6.5.
New features
- FEAT: Support jina embedding models by @aresnow1 in #704
- FEAT: Support Yi-chat by @aresnow1 in #700
- FEAT: Support qwen 72b by @aresnow1 in #705
- FEAT: ChatGLM3 tool calls by @codingl2k1 in #701
Enhancements
- ENH: Specify actor pool port for distributed deployment by @ChengjieLi28 in #688
- ENH: Remove `xorbits` dependency by @ChengjieLi28 in #699
- ENH: User can just specify a string for prompt style when registering custom LLM models by @ChengjieLi28 in #682
- ENH: Add more models supported by vllm by @aresnow1 in #706
Bug fixes
- BUG: Fix xinference startup failure when an invalid custom model is found by @codingl2k1 in #690
Documentation
- Doc: Fix some incorrect links in documentation by @aresnow1 in #684
- Doc: Update readme by @aresnow1 in #687
- DOC: documentation for docker and k8s by @lynnleelhl in #661
New Contributors
- @lynnleelhl made their first contribution in #661
Full Changelog: v0.6.4...v0.6.5
v0.6.4
What's new in 0.6.4 (2023-11-24)
These are the changes in inference v0.6.4.
New features
- FEAT: Support registering custom embedding model by @ChengjieLi28 in #667
- FEAT: Supports `qwen.cpp` for `qwen-chat` with `ggml` format by @ChengjieLi28 in #675
- FEAT: Xverse by @fengsxy in #678
- FEAT: Support rerank models by @aresnow1 in #672
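With rerank support (#672), a rerank model can be launched and queried through the Python client as well. A rough sketch, assuming a running endpoint at `http://localhost:9997`, the built-in `bge-reranker-base` model, and a `rerank(documents, query)` method on the returned handle; the exact handle methods may vary by version:

```python
from xinference.client import Client

client = Client("http://localhost:9997")  # assumed local Xinference endpoint

# Launch a built-in rerank model (bge-reranker-base is assumed to be available).
model_uid = client.launch_model(model_name="bge-reranker-base", model_type="rerank")
model = client.get_model(model_uid)

# Assumption: the rerank handle scores each document against the query and
# returns them ordered by relevance.
result = model.rerank(
    documents=[
        "Rerank models reorder candidate passages by relevance.",
        "Paris is the capital of France.",
    ],
    query="What does a rerank model do?",
)
print(result)
```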
Enhancements
- ENH: Add `generate` interface for `chatglm` with `ggml` format by @ChengjieLi28 in #671
Bug fixes
- BUG: Fix custom model missing config json by @codingl2k1 in #674
- BUG: Fix HTTP error not being raised by @codingl2k1 in #657
- BUG: Fix `pip install xinference[all]` by @codingl2k1 in #679
Documentation
- DOC: update pot files by @UranusSeven in #638
- DOC: Add a more detailed beginner's guide covering first-time usage for new users by @onesuper in #651
- DOC: documentation for using xinference by @fengsxy in #677
- DOC: Register custom embedding model by @ChengjieLi28 in #683
Others
- Add a "Why Xinference" section to the README comparing pivotal features with other tools by @onesuper in #652
- Fix README.md by @aresnow1 in #669
Full Changelog: v0.6.3...v0.6.4
v0.6.3
What's new in 0.6.3 (2023-11-16)
These are the changes in inference v0.6.3.
New features
- FEAT: qwen-chat-14b by @UranusSeven in #494
- FEAT: Support gptq quantization by @codingl2k1 in #645
Bug fixes
- BUG: Fix slow RESTful API serialization by @codingl2k1 in #648
Tests
- TST: disable test_is_self_hosted by @UranusSeven in #641
Documentation
- DOC: About Logging in Xinference by @ChengjieLi28 in #631
- DOC: Init for Chinese doc by @ChengjieLi28 in #565
Full Changelog: v0.6.2...v0.6.3
v0.6.2
What's new in 0.6.2 (2023-11-09)
These are the changes in inference v0.6.2.
New features
- FEAT: Support Yi Model by @ChengjieLi28 in #629
Enhancements
- ENH: cache status by @UranusSeven in #616
- ENH: Supports request limits for the model by @ChengjieLi28 in #596
- ENH: running model location & accelerators by @UranusSeven in #626
- ENH: Create completion restful api compatibility by @codingl2k1 in #622
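The request-limit enhancement above (#596) caps how many requests a launched model serves concurrently. A sketch of how this might be passed at launch time, assuming a local endpoint at `http://localhost:9997` and that `launch_model` forwards a `request_limits` keyword; both are assumptions, not a confirmed API contract:

```python
from xinference.client import Client

client = Client("http://localhost:9997")  # assumed local Xinference endpoint

# Assumption: request_limits caps concurrent requests for this model replica;
# requests beyond the limit are rejected rather than queued indefinitely.
model_uid = client.launch_model(
    model_name="chatglm3",
    model_size_in_billions=6,
    model_format="pytorch",
    request_limits=10,
)
print(model_uid)
```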
Bug fixes
- BUG: Compatible with openai 1.1 by @codingl2k1 in #619
- BUG: fix spec decoding by @UranusSeven in #628
- BUG: `No slot available` error for embedding and LLM model on one card by @ChengjieLi28 in #611
- BUG: Rotating log does not create a new file when recreating the xinference cluster by @ChengjieLi28 in #618
Full Changelog: v0.6.1...v0.6.2
v0.6.1
What's new in 0.6.1 (2023-11-06)
These are the changes in inference v0.6.1.
Enhancements
- ENH: add command xinference-local by @UranusSeven in #610
- ENH: Don't check dead nodes by @aresnow1 in #614
Full Changelog: v0.6.0...v0.6.1
v0.6.0
What's new in 0.6.0 (2023-11-03)
These are the changes in inference v0.6.0.
New features
- FEAT: Zephyr by @UranusSeven in #597
- FEAT: stable diffusion with controlnet by @codingl2k1 in #575
Enhancements
- ENH: increase heartbeat interval by @UranusSeven in #604
- ENH: Support downloading more models from ModelScope by @aresnow1 in #595
- ENH: Supports rotating file log by @ChengjieLi28 in #590
- ENH: stateless supervisor and worker by @UranusSeven in #546
Bug fixes
- BUG: Fix chat system messages by @codingl2k1 in #594
- BUG: fix transformers compatibility by @UranusSeven in #600
Tests
- TST: Compatible with `llama-cpp-python` 0.2.12 by @ChengjieLi28 in #603
Documentation
- DOC: Download model from ModelScope by @ChengjieLi28 in #553
- DOC: Stable Diffusion with ControlNet example by @codingl2k1 in #605
Full Changelog: v0.5.6...v0.6.0
v0.5.6
What's new in 0.5.6 (2023-10-30)
These are the changes in inference v0.5.6.
New features
- FEAT: launch embedding models by @Minamiyama in #582
- FEAT: chatglm3 by @UranusSeven in #587
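Launching embedding models (#582) follows the same client flow as LLMs, just with `model_type="embedding"`. A minimal sketch, assuming a local endpoint, the built-in `bge-large-zh` model, and an OpenAI-style embedding payload in the response; all of these are assumptions about the client API:

```python
from xinference.client import Client

client = Client("http://localhost:9997")  # assumed local Xinference endpoint

# Launch a built-in embedding model and request a single embedding.
model_uid = client.launch_model(model_name="bge-large-zh", model_type="embedding")
model = client.get_model(model_uid)

response = model.create_embedding("Xinference makes model serving straightforward.")
print(len(response["data"][0]["embedding"]))  # vector dimensionality
```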
Documentation
- DOC: update hot topics and fix docs by @UranusSeven in #584
Others
- CHORE: install setuptools in release actions by @aresnow1 in #588
- CHORE: Use python3.10 to build and release by @aresnow1 in #589
Full Changelog: v0.5.5...v0.5.6
v0.5.5
What's new in 0.5.5 (2023-10-26)
These are the changes in inference v0.5.5.
Enhancements
- ENH: display language tags by @Minamiyama in #558
- ENH: filter models by type by @Minamiyama in #559
- ENH: disable create embeddings using LLMs by @UranusSeven in #570
- ENH: benchmark latency by @UranusSeven in #576
- ENH: configurable `XINFERENCE_HOME` env by @ChengjieLi28 in #566
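The `XINFERENCE_HOME` variable (#566) is assumed to redirect where Xinference stores downloaded models, cache, and logs (by default under the user's home directory), and it must be set in the environment of the Xinference process. A sketch of overriding it before starting a local instance; the `xinference-local` command comes from a later release (#610), and the flags shown are assumptions:

```python
import os
import subprocess

# Assumption: XINFERENCE_HOME must be present in the environment of the
# Xinference process itself, so it is set before the server starts.
env = dict(os.environ, XINFERENCE_HOME="/data/xinference")

# Hypothetical invocation: run a local Xinference instance in the foreground
# with the overridden home directory (command name and flags may vary by version).
subprocess.run(["xinference-local", "--host", "0.0.0.0", "--port", "9997"], env=env)
```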
Bug fixes
- BUG: Fix `bge-base-zh` and `bge-large-zh` from ModelScope by @ChengjieLi28 in #571
- BUG: When changing the model revision, xinference still uses the previous model by @ChengjieLi28 in #573
- BUG: incorrect vLLM config by @UranusSeven in #579
- BUG: fix llama-2 stop words by @UranusSeven in #580
Documentation
- DOC: Incompatibility Between NVIDIA Driver and PyTorch Version by @onesuper in #551
- DOC: Examples and resources page by @onesuper in #561
Full Changelog: v0.5.4...v0.5.5
v0.5.4
What's new in 0.5.4 (2023-10-20)
These are the changes in inference v0.5.4.
New features
- FEAT: wizardcoder python by @UranusSeven in #539
- FEAT: Support grammar-based sampling for ggml models by @aresnow1 in #525
- FEAT: speculative decoding by @UranusSeven in #509
Enhancements
- ENH: Download embedding models from ModelScope by @ChengjieLi28 in #532
- ENH: lock transformers version by @UranusSeven in #549
- ENH: Support downloading code-llama family models from ModelScope by @ChengjieLi28 in #557
- ENH: Add gguf format of codellama-instruct by @aresnow1 in #567
Bug fixes
- BUG: Fix streaming not compatible with OpenAI by @codingl2k1 in #524
- BUG: set trust_remote_code to true by default by @richzw in #555
- BUG: add quantization to valid file name by @richzw in #562
- BUG: remove "generate" ability from Baichuan-2-chat json config by @Minamiyama in #556
Documentation
- DOC: update pot files by @UranusSeven in #538
- DOC: Add Client API reference by @codingl2k1 in #543
- DOC: Add client doc to the user guide by @codingl2k1 in #547
New Contributors
- @richzw made their first contribution in #555
- @Minamiyama made their first contribution in #556
Full Changelog: v0.5.3...v0.5.4