0.4.1
Release date: 2023-12-28 18:41:42
Latest release of Mozilla-Ocho/llamafile: 0.8.13 (2024-08-19 01:22:48)
llamafile lets you distribute and run LLMs with a single file
If you had trouble generating filenames while following the "bash one-liners" blog post with the latest release, then please try again.
- 0984ed8 Fix regression with --grammar flag
Crashes on older Intel / AMD systems should be fixed:
- 3490afa Fix SIGILL on older Intel/AMD CPUs w/o F16C
The OpenAI API-compatible endpoint has been improved.
- 9e4bf29 Fix OpenAI server sampling w.r.t. temp and seed
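As a minimal sketch of exercising this fix, the example below sends a chat completion request with an explicit temperature and seed to a locally running llamafile server. It assumes the server is listening on its default address (http://localhost:8080), that the `openai` Python package (v1+) is installed, and the model name is a placeholder.

```python
# Minimal sketch: query a locally running llamafile server through its
# OpenAI-compatible endpoint, passing both temperature and seed.
# Assumes the server was started with defaults (http://localhost:8080)
# and that the `openai` Python package (v1+) is installed.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",  # llamafile's OpenAI-compatible API (assumed default)
    api_key="sk-no-key-required",         # the local server does not check the key
)

completion = client.chat.completions.create(
    model="LLaMA_CPP",                    # placeholder model name
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
    temperature=0.7,                      # per-request sampling temperature
    seed=42,                              # fixed seed for reproducible sampling
)

print(completion.choices[0].message.content)
```

With the seed fixed and the temperature held constant, repeated requests should yield consistent samples, which is the behavior the sampling fix above addresses.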
This release improves the documentation.
- 5c7ff6e Improve llamafile manual
- 658b18a Add WSL CUDA to GPU section (#105)
- 586b408 Update README.md so links and curl commands work (#136)
- a56ffd4 Update README to clarify Darwin kernel versioning
- 47d8a8f Fix README changing SSE3 to SSSE3
- 4da8e2e Fix README examples for certain UNIX shells
- faa7430 Change README to list Mixtral Q5 (instead of Q3)
- 6b0b64f Fix CLI README examples
We're making strides toward automating our testing process.
- dadd5a7 Add CI (#126)
Some other improvements:
- 9e972b2 Improve README examples
- 9de5686 Support bos token in llava-cli
- 3d81e22 Set logger callback for Apple Metal
- 9579b73 Make it easier to override CPPFLAGS
Our .llamafiles on Hugging Face have been updated to incorporate these new release binaries. You can redownload here:
- https://huggingface.co/jartine/llava-v1.5-7B-GGUF/tree/main
- https://huggingface.co/jartine/Mistral-7B-Instruct-v0.2-llamafile/tree/main
- https://huggingface.co/jartine/wizardcoder-13b-python/tree/main
- https://huggingface.co/jartine/Mixtral-8x7B-Instruct-v0.1-llamafile
Known Issues
LLaVA image processing using the built-in tinyBLAS library may be slow on Windows. Here's the workaround for using the faster NVIDIA cuBLAS library instead.
- Delete the .llamafile directory in your home directory.
- Install CUDA
- Install MSVC
- Open the "x64 MSVC command prompt" from Start
- Run llamafile there for the first invocation.
There's a YouTube video tutorial on doing this here: https://youtu.be/d1Fnfvat6nM?si=W6Y0miZ9zVBHySFj
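As a minimal sketch of the first step only, the snippet below removes the cached .llamafile directory from your home directory so the GPU support module gets rebuilt on the next run; the remaining steps (installing CUDA and MSVC, then launching from the x64 MSVC command prompt) still have to be done by hand.

```python
# Minimal sketch of step one of the workaround above: delete the cached
# ~/.llamafile directory so llamafile can rebuild its GPU support module
# on the next invocation. The other steps (CUDA, MSVC, x64 MSVC prompt)
# must still be performed manually.
import shutil
from pathlib import Path

cache_dir = Path.home() / ".llamafile"
if cache_dir.is_dir():
    shutil.rmtree(cache_dir)
    print(f"Removed {cache_dir}")
else:
    print(f"Nothing to remove at {cache_dir}")
```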
Downloads
- llamafile-0.4.1 (15.96 MB)
- llamafile-0.4.1.zip (15.29 MB)
- llamafile-server-0.4.1 (16.58 MB)
- zipalign-0.4.1 (643.75 KB)