0.6.1
Release date: 2024-01-20 16:09:47
Latest release of Mozilla-Ocho/llamafile: 0.8.13 (2024-08-19 01:22:48)
llamafile lets you distribute and run LLMs with a single file
This release fixes a crash that can happen on Apple Metal GPUs.
- 9c85d9c Fix free() related crash in ggml-metal.m
Windows users will see better performance with tinyBLAS. Please note that we still recommend installing the CUDA SDK (NVIDIA) or the HIP/ROCm SDK (AMD) for maximum performance and accuracy, if your GPU is one they support.
- df0b3ff Use thread-local register file for matmul speedups (#205)
- 4892494 Change BM/BN/BK to template parameters (#203)
- ed05ba9 Reduce server memory use on Windows
This release also synchronizes with llama.cpp upstream (as of Jan 9th) along with other improvements.
- 133b05e Sync with llama.cpp upstream
- 67d97b5 Use hipcc on $PATH if it exists
- 15e2339 Do better job reporting AMD hipBLAS errors
- c617679 Don't crash when --image argument is invalid
- 3e8aa78 Clarify install/gpu docs/behavior per feedback
- eb4989a Fix typo in OpenAI API
Example llamafiles
Our llamafiles on Hugging Face are updated shortly after a release goes live.
Flagship models
Supreme models (highest-end consumer hardware)
- https://hf.co/jartine/Mixtral-8x7B-Instruct-v0.1-llamafile
- https://hf.co/jartine/WizardCoder-Python-34B-V1.0-llamafile
Tiny models (small enough to use on a Raspberry Pi)
- https://hf.co/jartine/phi-2-llamafile
- https://hf.co/jartine/rocket-3B-llamafile
- https://hf.co/jartine/TinyLlama-1.1B-Chat-v1.0-GGUF
Other models:
- https://hf.co/jartine/jartine/wizardcoder-13b-python
- https://hf.co/jartine/jartine/Nous-Hermes-Llama2-llamafile
- https://hf.co/jartine/jartine/dolphin-2.5-mixtral-8x7b-llamafile
If you have a slow Internet connection and want to update your llamafiles without needing to redownload, then see the instructions here: https://github.com/Mozilla-Ocho/llamafile/issues/24#issuecomment-1836362558
You can also download llamafile-0.6.1 and simply say ./llamafile-0.6.1 -m old.llamafile to run your old weights.
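As a minimal sketch of the command above on a Unix-like shell, assuming both files sit in the current directory: a freshly downloaded binary usually lacks the executable bit, so the sketch sets it before launching. The variable names RUNNER and WEIGHTS are hypothetical, and old.llamafile stands in for whichever existing llamafile you want to reuse.

```shell
# Hypothetical locations: adjust to wherever you saved the files.
RUNNER=./llamafile-0.6.1   # the new release binary, downloaded separately
WEIGHTS=old.llamafile      # an existing llamafile whose weights you want to reuse

if [ -f "$RUNNER" ] && [ -f "$WEIGHTS" ]; then
  chmod +x "$RUNNER"       # downloaded files are usually not executable yet
  "$RUNNER" -m "$WEIGHTS"  # run the old weights with the new binary
else
  echo "missing $RUNNER or $WEIGHTS" >&2
fi
```

If either file is missing, the sketch prints a message to standard error and runs nothing.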
1. llamafile-0.6.1 (30.18 MB)
2. llamafile-0.6.1.zip (14.15 MB)
3. zipalign-0.6.1 (725.21 KB)