llamafile v0.8.14
llamafile lets you distribute and run LLMs with a single file
llamafile is a local LLM inference tool introduced by Mozilla Ocho in November 2023. It offers superior performance and binary portability: a single file runs on the stock installs of six OSes without needing to be installed. It combines the best of llama.cpp and Cosmopolitan Libc while aiming to stay ahead of the curve by including the most cutting-edge performance and accuracy enhancements. What llamafile gives you is a fun web GUI chatbot, a turnkey OpenAI API compatible server, and a shell-scriptable CLI interface, which together put you in control of artificial intelligence.
v0.8.14 changes
This release introduces our new CLI chatbot interface. It supports
multi-line input using triple quotes and will syntax-highlight Python,
C, C++, Java, and JavaScript code.
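For example, a multi-line prompt can be wrapped in triple quotes. Here
is a sketch of a session, assuming the ollama-style `>>>` prompt (the
exact glyphs may differ, and the model file is a placeholder):

```
$ llamafile -m model.gguf
>>> """
... Explain this function:
... def fib(n):
...     return n if n < 2 else fib(n - 1) + fib(n - 2)
... """
This is a recursive implementation of the Fibonacci sequence ...
```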
This chatbot is now the default mode of operation. When you launch
llamafile without any special arguments, the chatbot is launched in the
foreground and the server is launched in the background. If you only
want one of them, you can pass the --chat or --server flag to select it
explicitly.
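For example (model.gguf is a placeholder for your weights file):

```sh
llamafile -m model.gguf            # default: chatbot in front, server in back
llamafile -m model.gguf --chat     # chatbot only
llamafile -m model.gguf --server   # server only
```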
- a384fd7 Create ollama inspired cli chatbot
- 63205ee Add syntax highlighting to chatbot
- 7b395be Introduce new --chat flag for chatbot
- 28e98b6 Show prompt loading progress in chatbot
- 4199dae Make chat+server hybrid the new default mode
The whisperfile server now lets you upload mp3/ogg/flac (see the
request sketch after this list).
- 74dfd21 Rewrite audio file loader code
- 7517a5f whisperfile server: convert files without ffmpeg (#568)
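To exercise the new upload support, a request like the following should
work, assuming whisperfile keeps whisper.cpp's server API (a multipart
POST to /inference on the default port 8080; recording.mp3 is a
placeholder):

```sh
curl http://127.0.0.1:8080/inference \
  -F file=@recording.mp3 \
  -F response_format=json
```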
Other improvements have been made.
- d617c0b Added vision support to api_like_OAI (#524) (example request after this list)
- 726f6e8 Enable gpu support in llamafile-bench (#581)
- c7c4d65 Speed up KV in llamafile-bench
- 2c940da Make replace_all() have linear complexity
- fa4c4e7 Use bf16 kv cache when it's faster
- 20fe696 Upgrade to Cosmopolitan 3.9.4
- c44664b Always favor fp16 arithmetic in tinyBLAS
- 98eff09 Quantize TriLM models using Q2_K_S (#552)
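As an illustration of the vision support added in d617c0b, a request in
the OpenAI chat completions style might look like the sketch below. The
port, model name, and image data are placeholders, and the exact schema
accepted by api_like_OAI may differ:

```sh
curl http://127.0.0.1:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llava",
    "messages": [{
      "role": "user",
      "content": [
        {"type": "text", "text": "Describe this image."},
        {"type": "image_url",
         "image_url": {"url": "data:image/jpeg;base64,<BASE64-DATA>"}}
      ]
    }]
  }'
```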