MyGit

0.8.10

Mozilla-Ocho/llamafile

Release date: 2024-07-24 01:53:04

Latest Mozilla-Ocho/llamafile release: 0.8.13 (2024-08-19 01:22:48)

[line drawing of llama animal head in front of slightly open manilla folder filled with files]

llamafile lets you distribute and run LLMs with a single file

llamafile is a local LLM inference tool introduced by Mozilla Ocho in Nov 2023. It offers superior performance and binary portability across the stock installs of six OSes, with no installation required. It combines the best of llama.cpp and cosmopolitan libc while aiming to stay ahead of the curve by including the most cutting-edge performance and accuracy enhancements. llamafile gives you a fun web GUI chatbot, a turnkey OpenAI API compatible server, and a shell-scriptable CLI, which together put you in control of artificial intelligence.

This release includes a build of the new llamafile server rewrite we've been promising, which we're calling llamafiler. It has matured enough that we recommend it for serving embeddings, and it is the fastest way to do so. If you use it with all-MiniLM-L6-v2.Q6_K.gguf, then on Threadripper it can serve JSON /embedding at 800 req/sec, whereas the old llama.cpp server could only do 100 req/sec. So you can fill up your RAG databases very quickly if you productionize this.
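As a rough illustration of what a client for the /embedding endpoint might look like, here is a minimal Python sketch. The release notes only name the endpoint, so several details are assumptions: the address `127.0.0.1:8080`, the `content` form field, and the `embedding` key in the JSON response are not confirmed here and should be checked against the LLaMAfiler documentation. The helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumptions (not stated in the release notes): the server listens on
# 127.0.0.1:8080, accepts the text in a "content" form field, and replies
# with JSON containing an "embedding" array.
BASE_URL = "http://127.0.0.1:8080"


def build_embedding_request(text, base_url=BASE_URL):
    """Build a POST request for the /embedding endpoint."""
    body = urllib.parse.urlencode({"content": text}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/embedding",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def parse_embedding(raw):
    """Extract the embedding vector from a JSON response body."""
    payload = json.loads(raw)
    return [float(x) for x in payload["embedding"]]


def embed(text, base_url=BASE_URL, timeout=10.0):
    """Send one text to a running server and return its embedding vector."""
    request = build_embedding_request(text, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_embedding(response.read())
```

With a llamafiler instance running, `embed("some passage to index")` would return a list of floats ready to store in a vector database. Request building and response parsing are kept separate so each can be checked without a live server.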

The old llama.cpp server came from a folder named "examples" and was never intended to be production-worthy. This server is designed to be sturdy and uncrashable. It also has /completion and /tokenize endpoints, serving 3.7 million requests per second on Threadripper thanks to Cosmo Libc improvements.

See the LLaMAfiler Documentation for further details.

This release restores support for non-AVX x86 microprocessors. We had to drop support at the beginning of the year. However, our CPUID dispatching has advanced considerably since then. We're now able to offer top speeds on modern hardware without leaving old hardware behind.
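The general idea behind CPU feature dispatching can be sketched in a few lines of Python. This is only a conceptual illustration, not llamafile's actual implementation: the kernel names are hypothetical, and reading flags from `/proc/cpuinfo` is a Linux-specific stand-in for querying the CPUID instruction directly.

```python
# Hypothetical kernel names, ordered from most to least capable.
# Flag spellings follow the "flags" line of Linux /proc/cpuinfo.
KERNELS = [
    ("avx512f", "matmul_avx512"),
    ("avx2", "matmul_avx2"),
    ("avx", "matmul_avx"),
]
BASELINE_KERNEL = "matmul_baseline"  # used when no AVX variant is available


def read_cpu_flags(path="/proc/cpuinfo"):
    """Return the CPU feature flags as a set, or an empty set if unavailable."""
    try:
        with open(path, encoding="utf-8") as f:
            for line in f:
                if line.startswith("flags"):
                    return set(line.split(":", 1)[1].split())
    except OSError:
        pass
    return set()


def pick_kernel(cpu_flags):
    """Choose the fastest kernel the CPU supports, else the baseline."""
    for flag, kernel in KERNELS:
        if flag in cpu_flags:
            return kernel
    return BASELINE_KERNEL
```

Calling `pick_kernel(read_cpu_flags())` once at startup selects a code path for the whole run, so a single binary can use wide vector instructions where they exist and still work on a CPU without AVX.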

Here are the remaining improvements included in this release:

Related links: original page, download (tar), download (zip)

1. llamafile-0.8.10 (28.3 MB)

2. llamafile-0.8.10.zip (60.09 MB)
