0.7.1
Release date: 2024-04-13 12:21:31
This release fixes bugs in the 0.7.0 release.
- Fix 2 embeddings-related issues in server.cpp (#324)
- Detect search query to start webchat (#333)
- Use LLAMAFILE_GPU_ERROR value -2 instead of -1 (#291)
- Fix `--silent-prompt` flag regression (#328)
- Clamp out-of-range values in the K quantizer (ef0307ea62c8631a9b67acee02afeef8979d89eb)
- Update to the latest q5_k quantization code (a8b0b159a32392ab6a3e13711f54fd6c544a097b)
- Change the file format magic number for the bf16 file format recently introduced in 0.7.0. This is a breaking change, made necessary by a numbering conflict with the upstream project. We're still waiting on a permanent assignment for bfloat16, so this could potentially change again. Follow https://github.com/ggerganov/llama.cpp/pull/6412 for updates.
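The magic-number change above guards against a subtle failure mode: two projects assigning the same numeric type ID to different formats. The sketch below is a hedged illustration of that hazard, not llamafile's actual code; the enum names and the ID values (`30`) are hypothetical, chosen only to show how a file written under one numbering can be silently misdecoded under another.

```python
# Hypothetical sketch: why a numeric-ID conflict between two projects
# is a breaking change. None of these values come from llamafile or
# llama.cpp; they exist only to illustrate the collision.
from enum import IntEnum

class FileTypeA(IntEnum):
    """Hypothetical numbering used by one project."""
    F32 = 0
    F16 = 1
    BF16 = 30      # provisional ID (made up for this example)

class FileTypeB(IntEnum):
    """Hypothetical numbering used by another project."""
    F32 = 0
    F16 = 1
    Q8_K = 30      # same ID, different meaning -- the conflict

def describe(type_id: int, table) -> str:
    """Decode a stored type ID against a given numbering table."""
    try:
        return table(type_id).name
    except ValueError:
        return "unknown"

# A file written with table A but read with table B decodes the wrong
# type without any error -- hence the switch to a fresh magic number.
stored = int(FileTypeA.BF16)
print(describe(stored, FileTypeA))  # BF16
print(describe(stored, FileTypeB))  # Q8_K -- silently misinterpreted
```

Bumping the magic number turns this silent misinterpretation into a loud "unrecognized file" error, which is why 0.7.1 accepts the one-time breakage.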
Mixtral 8x22b and Grok support are not available in this release, but they are available if you build llamafile from source on the main branch at HEAD. We're currently dealing with an AMD Windows GPU support regression there. Once it's resolved, a 0.8 release will ship.
1. llamafile-0.7.1 (23.08 MB)
2. llamafile-0.7.1.zip (28.62 MB)
3. zipalign-0.7.1 (721.88 KB)