
0.8.5

Mozilla-Ocho/llamafile

Release date: 2024-05-25 17:06:24

Latest release of Mozilla-Ocho/llamafile: 0.8.13 (2024-08-19 01:22:48)

This release fixes bugs and introduces @kawrakow's latest quant performance enhancements (a feature exclusive to llamafile). As of #435, the K quants now run consistently 2x faster than llama.cpp upstream. On big CPUs like Threadripper, we've doubled the performance of tiny models for both prompt processing and token generation (see the benchmarks below). This release also introduces the llamafile-bench and llamafile-upgrade-engine commands.

Note: Please use llamafile v0.8.4 if you need prebuilt (driver-only) AMD GPU support on Windows, at least for the next few weeks, until https://github.com/ggerganov/llama.cpp/issues/7156 is resolved.

Binaries run on Linux, Windows, macOS, FreeBSD, OpenBSD, and NetBSD on AMD64 and ARM64. Supported GPU backends are CUDA, ROCm, and Metal. Prebuilt GPU binaries are provided for CUDA/ROCm on Linux, and CUDA on Windows. To install this release on systems with a POSIX-style shell:

sudo -s
cd /usr/local
wget https://github.com/Mozilla-Ocho/llamafile/releases/download/0.8.5/llamafile-0.8.5.zip
unzip llamafile-0.8.5.zip
exit
llamafile --help

To upgrade your old llamafiles without needing to redownload, run:

llamafile-upgrade-engine old.llamafile new.llamafile

Prebuilt llamafiles that have the LLM weights included are available at:

Here are some tutorials:

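Here are some performance benchmarks for various quantization formats on the world's flagship CPUs. See https://justine.lol/matmul/ to compare these numbers with where we were in March, two months ago. In the table, pp512 is prompt processing of a 512-token prompt, tg16 is generation of 16 tokens, and t/s is tokens per second.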

cpu_info model_filename size test t/s
AMD Ryzen Threadripper PRO 7995WX (znver4) mixtral-8x7b-instruct-v0.1.BF16 86.99 GiB pp512 447.01
AMD Ryzen Threadripper PRO 7995WX (znver4) mixtral-8x7b-instruct-v0.1.BF16 86.99 GiB tg16 11.35
AMD Ryzen Threadripper PRO 7995WX (znver4) mixtral-8x7b-instruct-v0.1.F16 86.99 GiB pp512 340.63
AMD Ryzen Threadripper PRO 7995WX (znver4) mixtral-8x7b-instruct-v0.1.F16 86.99 GiB tg16 11.01
AMD Ryzen Threadripper PRO 7995WX (znver4) mixtral-8x7b-instruct-v0.1.Q8_0 46.22 GiB pp512 288.16
AMD Ryzen Threadripper PRO 7995WX (znver4) mixtral-8x7b-instruct-v0.1.Q8_0 46.22 GiB tg16 15.82
AMD Ryzen Threadripper PRO 7995WX (znver4) mixtral-8x7b-instruct-v0.1.Q6_K 35.74 GiB pp512 431.51
AMD Ryzen Threadripper PRO 7995WX (znver4) mixtral-8x7b-instruct-v0.1.Q6_K 35.74 GiB tg16 22.73
AMD Ryzen Threadripper PRO 7995WX (znver4) mixtral-8x7b-instruct-v0.1.Q5_K_M 30.95 GiB pp512 427.71
AMD Ryzen Threadripper PRO 7995WX (znver4) mixtral-8x7b-instruct-v0.1.Q5_K_M 30.95 GiB tg16 24.90
AMD Ryzen Threadripper PRO 7995WX (znver4) mixtral-8x7b-instruct-v0.1.Q4_K_M 26.49 GiB pp512 440.03
AMD Ryzen Threadripper PRO 7995WX (znver4) mixtral-8x7b-instruct-v0.1.Q4_K_M 26.49 GiB tg16 27.31
AMD Ryzen Threadripper PRO 7995WX (znver4) mixtral-8x7b-instruct-v0.1.Q4_0 24.63 GiB pp512 287.51
AMD Ryzen Threadripper PRO 7995WX (znver4) mixtral-8x7b-instruct-v0.1.Q4_0 24.63 GiB tg16 18.92
AMD Ryzen Threadripper PRO 7995WX (znver4) mixtral-8x7b-instruct-v0.1.Q3_K_M 21.00 GiB pp512 433.89
AMD Ryzen Threadripper PRO 7995WX (znver4) mixtral-8x7b-instruct-v0.1.Q3_K_M 21.00 GiB tg16 30.30
AMD Ryzen Threadripper PRO 7995WX (znver4) mixtral-8x7b-instruct-v0.1.Q3_K_S 19.03 GiB pp512 432.36
AMD Ryzen Threadripper PRO 7995WX (znver4) mixtral-8x7b-instruct-v0.1.Q3_K_S 19.03 GiB tg16 31.34
AMD Ryzen Threadripper PRO 7995WX (znver4) mixtral-8x7b-instruct-v0.1.Q2_K 16.12 GiB pp512 449.64
AMD Ryzen Threadripper PRO 7995WX (znver4) mixtral-8x7b-instruct-v0.1.Q2_K 16.12 GiB tg16 33.71
AMD Ryzen Threadripper PRO 7995WX (znver4) TinyLlama-1.1B-Chat-v1.0.F32 4.10 GiB pp512 2103.25
AMD Ryzen Threadripper PRO 7995WX (znver4) TinyLlama-1.1B-Chat-v1.0.F32 4.10 GiB tg16 57.34
AMD Ryzen Threadripper PRO 7995WX (znver4) TinyLlama-1.1B-Chat-v1.0.BF16 2.05 GiB pp512 2603.84
AMD Ryzen Threadripper PRO 7995WX (znver4) TinyLlama-1.1B-Chat-v1.0.BF16 2.05 GiB tg16 77.18
AMD Ryzen Threadripper PRO 7995WX (znver4) TinyLlama-1.1B-Chat-v1.0.F16 2.05 GiB pp512 2038.64
AMD Ryzen Threadripper PRO 7995WX (znver4) TinyLlama-1.1B-Chat-v1.0.F16 2.05 GiB tg16 80.23
AMD Ryzen Threadripper PRO 7995WX (znver4) TinyLlama-1.1B-Chat-v1.0.Q8_0 1.09 GiB pp512 2203.77
AMD Ryzen Threadripper PRO 7995WX (znver4) TinyLlama-1.1B-Chat-v1.0.Q8_0 1.09 GiB tg16 100.78
AMD Ryzen Threadripper PRO 7995WX (znver4) TinyLlama-1.1B-Chat-v1.0.Q6_K 860.86 MiB pp512 2838.05
AMD Ryzen Threadripper PRO 7995WX (znver4) TinyLlama-1.1B-Chat-v1.0.Q6_K 860.86 MiB tg16 135.27
AMD Ryzen Threadripper PRO 7995WX (znver4) TinyLlama-1.1B-Chat-v1.0.Q5_1 791.50 MiB pp512 2328.06
AMD Ryzen Threadripper PRO 7995WX (znver4) TinyLlama-1.1B-Chat-v1.0.Q5_1 791.50 MiB tg16 138.15
AMD Ryzen Threadripper PRO 7995WX (znver4) TinyLlama-1.1B-Chat-v1.0.Q5_K_M 745.11 MiB pp512 2676.14
AMD Ryzen Threadripper PRO 7995WX (znver4) TinyLlama-1.1B-Chat-v1.0.Q5_K_M 745.11 MiB tg16 143.58
AMD Ryzen Threadripper PRO 7995WX (znver4) TinyLlama-1.1B-Chat-v1.0.Q5_0 729.84 MiB pp512 2281.44
AMD Ryzen Threadripper PRO 7995WX (znver4) TinyLlama-1.1B-Chat-v1.0.Q5_0 729.84 MiB tg16 145.02
AMD Ryzen Threadripper PRO 7995WX (znver4) TinyLlama-1.1B-Chat-v1.0.Q5_K_S 729.84 MiB pp512 2757.59
AMD Ryzen Threadripper PRO 7995WX (znver4) TinyLlama-1.1B-Chat-v1.0.Q5_K_S 729.84 MiB tg16 143.59
AMD Ryzen Threadripper PRO 7995WX (znver4) TinyLlama-1.1B-Chat-v1.0.Q4_1 668.18 MiB pp512 2444.11
AMD Ryzen Threadripper PRO 7995WX (znver4) TinyLlama-1.1B-Chat-v1.0.Q4_1 668.18 MiB tg16 148.50
AMD Ryzen Threadripper PRO 7995WX (znver4) TinyLlama-1.1B-Chat-v1.0.Q4_K_M 636.18 MiB pp512 2758.90
AMD Ryzen Threadripper PRO 7995WX (znver4) TinyLlama-1.1B-Chat-v1.0.Q4_K_M 636.18 MiB tg16 149.92
AMD Ryzen Threadripper PRO 7995WX (znver4) TinyLlama-1.1B-Chat-v1.0.Q4_K_S 609.53 MiB pp512 2847.95
AMD Ryzen Threadripper PRO 7995WX (znver4) TinyLlama-1.1B-Chat-v1.0.Q4_K_S 609.53 MiB tg16 150.84
AMD Ryzen Threadripper PRO 7995WX (znver4) TinyLlama-1.1B-Chat-v1.0.Q4_0 606.53 MiB pp512 2420.58
AMD Ryzen Threadripper PRO 7995WX (znver4) TinyLlama-1.1B-Chat-v1.0.Q4_0 606.53 MiB tg16 154.27
AMD Ryzen Threadripper PRO 7995WX (znver4) TinyLlama-1.1B-Chat-v1.0.Q3_K_L 563.42 MiB pp512 2743.74
AMD Ryzen Threadripper PRO 7995WX (znver4) TinyLlama-1.1B-Chat-v1.0.Q3_K_L 563.42 MiB tg16 155.29
AMD Ryzen Threadripper PRO 7995WX (znver4) TinyLlama-1.1B-Chat-v1.0.Q3_K_M 522.30 MiB pp512 2779.92
AMD Ryzen Threadripper PRO 7995WX (znver4) TinyLlama-1.1B-Chat-v1.0.Q3_K_M 522.30 MiB tg16 157.92
AMD Ryzen Threadripper PRO 7995WX (znver4) TinyLlama-1.1B-Chat-v1.0.Q3_K_S 475.51 MiB pp512 2758.16
AMD Ryzen Threadripper PRO 7995WX (znver4) TinyLlama-1.1B-Chat-v1.0.Q3_K_S 475.51 MiB tg16 162.65
AMD Ryzen Threadripper PRO 7995WX (znver4) TinyLlama-1.1B-Chat-v1.0.Q2_K 411.41 MiB pp512 2777.59
AMD Ryzen Threadripper PRO 7995WX (znver4) TinyLlama-1.1B-Chat-v1.0.Q2_K 411.41 MiB tg16 166.82
Apple M2 Ultra (+fp16+dotprod) TinyLlama-1.1B-Chat-v1.0.F32 4.10 GiB pp512 384.37
Apple M2 Ultra (+fp16+dotprod) TinyLlama-1.1B-Chat-v1.0.F32 4.10 GiB tg16 40.00
Apple M2 Ultra (+fp16+dotprod) TinyLlama-1.1B-Chat-v1.0.BF16 2.05 GiB pp512 386.59
Apple M2 Ultra (+fp16+dotprod) TinyLlama-1.1B-Chat-v1.0.BF16 2.05 GiB tg16 49.91
Apple M2 Ultra (+fp16+dotprod) TinyLlama-1.1B-Chat-v1.0.F16 2.05 GiB pp512 703.34
Apple M2 Ultra (+fp16+dotprod) TinyLlama-1.1B-Chat-v1.0.F16 2.05 GiB tg16 47.44
Apple M2 Ultra (+fp16+dotprod) TinyLlama-1.1B-Chat-v1.0.Q8_0 1.09 GiB pp512 700.94
Apple M2 Ultra (+fp16+dotprod) TinyLlama-1.1B-Chat-v1.0.Q8_0 1.09 GiB tg16 94.79
Apple M2 Ultra (+fp16+dotprod) TinyLlama-1.1B-Chat-v1.0.Q6_K 860.86 MiB pp512 225.57
Apple M2 Ultra (+fp16+dotprod) TinyLlama-1.1B-Chat-v1.0.Q6_K 860.86 MiB tg16 103.42
Apple M2 Ultra (+fp16+dotprod) TinyLlama-1.1B-Chat-v1.0.Q5_1 791.50 MiB pp512 224.11
Apple M2 Ultra (+fp16+dotprod) TinyLlama-1.1B-Chat-v1.0.Q5_1 791.50 MiB tg16 103.06
Apple M2 Ultra (+fp16+dotprod) TinyLlama-1.1B-Chat-v1.0.Q5_K_M 745.11 MiB pp512 248.61
Apple M2 Ultra (+fp16+dotprod) TinyLlama-1.1B-Chat-v1.0.Q5_K_M 745.11 MiB tg16 106.27
Apple M2 Ultra (+fp16+dotprod) TinyLlama-1.1B-Chat-v1.0.Q5_0 729.84 MiB pp512 250.70
Apple M2 Ultra (+fp16+dotprod) TinyLlama-1.1B-Chat-v1.0.Q5_0 729.84 MiB tg16 108.10
Apple M2 Ultra (+fp16+dotprod) TinyLlama-1.1B-Chat-v1.0.Q5_K_S 729.84 MiB pp512 237.00
Apple M2 Ultra (+fp16+dotprod) TinyLlama-1.1B-Chat-v1.0.Q5_K_S 729.84 MiB tg16 104.68
Apple M2 Ultra (+fp16+dotprod) TinyLlama-1.1B-Chat-v1.0.Q4_1 668.18 MiB pp512 281.29
Apple M2 Ultra (+fp16+dotprod) TinyLlama-1.1B-Chat-v1.0.Q4_1 668.18 MiB tg16 115.67
Apple M2 Ultra (+fp16+dotprod) TinyLlama-1.1B-Chat-v1.0.Q4_K_M 636.18 MiB pp512 316.26
Apple M2 Ultra (+fp16+dotprod) TinyLlama-1.1B-Chat-v1.0.Q4_K_M 636.18 MiB tg16 119.35
Apple M2 Ultra (+fp16+dotprod) TinyLlama-1.1B-Chat-v1.0.Q4_K_S 609.53 MiB pp512 306.96
Apple M2 Ultra (+fp16+dotprod) TinyLlama-1.1B-Chat-v1.0.Q4_K_S 609.53 MiB tg16 107.95
Apple M2 Ultra (+fp16+dotprod) TinyLlama-1.1B-Chat-v1.0.Q4_0 606.53 MiB pp512 659.77
Apple M2 Ultra (+fp16+dotprod) TinyLlama-1.1B-Chat-v1.0.Q4_0 606.53 MiB tg16 135.96
Apple M2 Ultra (+fp16+dotprod) TinyLlama-1.1B-Chat-v1.0.Q3_K_L 563.42 MiB pp512 207.70
Apple M2 Ultra (+fp16+dotprod) TinyLlama-1.1B-Chat-v1.0.Q3_K_L 563.42 MiB tg16 102.14
Apple M2 Ultra (+fp16+dotprod) TinyLlama-1.1B-Chat-v1.0.Q3_K_M 522.30 MiB pp512 230.59
Apple M2 Ultra (+fp16+dotprod) TinyLlama-1.1B-Chat-v1.0.Q3_K_M 522.30 MiB tg16 93.07
Apple M2 Ultra (+fp16+dotprod) TinyLlama-1.1B-Chat-v1.0.Q3_K_S 475.51 MiB pp512 205.75
Apple M2 Ultra (+fp16+dotprod) TinyLlama-1.1B-Chat-v1.0.Q3_K_S 475.51 MiB tg16 100.52
Apple M2 Ultra (+fp16+dotprod) TinyLlama-1.1B-Chat-v1.0.Q2_K 411.41 MiB pp512 247.06
Apple M2 Ultra (+fp16+dotprod) TinyLlama-1.1B-Chat-v1.0.Q2_K 411.41 MiB tg16 106.44
Raspberry Pi 5 TinyLlama-1.1B-Chat-v1.0.F32 4.10 GiB pp512 27.84
Raspberry Pi 5 TinyLlama-1.1B-Chat-v1.0.F32 4.10 GiB tg16 2.10
Raspberry Pi 5 TinyLlama-1.1B-Chat-v1.0.BF16 2.05 GiB pp512 28.09
Raspberry Pi 5 TinyLlama-1.1B-Chat-v1.0.BF16 2.05 GiB tg16 4.55
Raspberry Pi 5 TinyLlama-1.1B-Chat-v1.0.F16 2.05 GiB pp512 58.27
Raspberry Pi 5 TinyLlama-1.1B-Chat-v1.0.F16 2.05 GiB tg16 4.89
Raspberry Pi 5 TinyLlama-1.1B-Chat-v1.0.Q8_0 1.09 GiB pp512 44.60
Raspberry Pi 5 TinyLlama-1.1B-Chat-v1.0.Q8_0 1.09 GiB tg16 8.36
Raspberry Pi 5 TinyLlama-1.1B-Chat-v1.0.Q6_K 860.86 MiB pp512 18.21
Raspberry Pi 5 TinyLlama-1.1B-Chat-v1.0.Q6_K 860.86 MiB tg16 11.47
Raspberry Pi 5 TinyLlama-1.1B-Chat-v1.0.Q5_1 791.50 MiB pp512 16.89
Raspberry Pi 5 TinyLlama-1.1B-Chat-v1.0.Q5_1 791.50 MiB tg16 12.43
Raspberry Pi 5 TinyLlama-1.1B-Chat-v1.0.Q5_K_M 745.11 MiB pp512 19.38
Raspberry Pi 5 TinyLlama-1.1B-Chat-v1.0.Q5_K_M 745.11 MiB tg16 13.22
Raspberry Pi 5 TinyLlama-1.1B-Chat-v1.0.Q5_0 729.84 MiB pp512 18.35
Raspberry Pi 5 TinyLlama-1.1B-Chat-v1.0.Q5_0 729.84 MiB tg16 13.20
Raspberry Pi 5 TinyLlama-1.1B-Chat-v1.0.Q5_K_S 729.84 MiB pp512 19.51
Raspberry Pi 5 TinyLlama-1.1B-Chat-v1.0.Q5_K_S 729.84 MiB tg16 13.68
Raspberry Pi 5 TinyLlama-1.1B-Chat-v1.0.Q4_1 668.18 MiB pp512 20.12
Raspberry Pi 5 TinyLlama-1.1B-Chat-v1.0.Q4_1 668.18 MiB tg16 14.67
Raspberry Pi 5 TinyLlama-1.1B-Chat-v1.0.Q4_K_M 636.18 MiB pp512 24.52
Raspberry Pi 5 TinyLlama-1.1B-Chat-v1.0.Q4_K_M 636.18 MiB tg16 14.61
Raspberry Pi 5 TinyLlama-1.1B-Chat-v1.0.Q4_K_S 609.53 MiB pp512 25.78
Raspberry Pi 5 TinyLlama-1.1B-Chat-v1.0.Q4_K_S 609.53 MiB tg16 15.69
Raspberry Pi 5 TinyLlama-1.1B-Chat-v1.0.Q4_0 606.53 MiB pp512 42.03
Raspberry Pi 5 TinyLlama-1.1B-Chat-v1.0.Q4_0 606.53 MiB tg16 15.32
Raspberry Pi 5 TinyLlama-1.1B-Chat-v1.0.Q3_K_L 563.42 MiB pp512 17.40
Raspberry Pi 5 TinyLlama-1.1B-Chat-v1.0.Q3_K_L 563.42 MiB tg16 13.83
Raspberry Pi 5 TinyLlama-1.1B-Chat-v1.0.Q3_K_M 522.30 MiB pp512 18.82
Raspberry Pi 5 TinyLlama-1.1B-Chat-v1.0.Q3_K_M 522.30 MiB tg16 14.47
Raspberry Pi 5 TinyLlama-1.1B-Chat-v1.0.Q3_K_S 475.51 MiB pp512 16.29
Raspberry Pi 5 TinyLlama-1.1B-Chat-v1.0.Q3_K_S 475.51 MiB tg16 13.77
Raspberry Pi 5 TinyLlama-1.1B-Chat-v1.0.Q2_K 411.41 MiB pp512 19.77
Raspberry Pi 5 TinyLlama-1.1B-Chat-v1.0.Q2_K 411.41 MiB tg16 16.48
AMD Ryzen Threadripper PRO 7995WX (znver4) mistral-7b-instruct-v0.2.F32 26.98 GiB pp512 398.57
AMD Ryzen Threadripper PRO 7995WX (znver4) mistral-7b-instruct-v0.2.F32 26.98 GiB tg16 10.18
AMD Ryzen Threadripper PRO 7995WX (znver4) mistral-7b-instruct-v0.2.BF16 13.49 GiB pp512 759.25
AMD Ryzen Threadripper PRO 7995WX (znver4) mistral-7b-instruct-v0.2.BF16 13.49 GiB tg16 19.29
AMD Ryzen Threadripper PRO 7995WX (znver4) mistral-7b-instruct-v0.2.F16 13.49 GiB pp512 559.94
AMD Ryzen Threadripper PRO 7995WX (znver4) mistral-7b-instruct-v0.2.F16 13.49 GiB tg16 19.26
AMD Ryzen Threadripper PRO 7995WX (znver4) mistral-7b-instruct-v0.2.Q8_0 7.17 GiB pp512 518.76
AMD Ryzen Threadripper PRO 7995WX (znver4) mistral-7b-instruct-v0.2.Q8_0 7.17 GiB tg16 26.31
AMD Ryzen Threadripper PRO 7995WX (znver4) mistral-7b-instruct-v0.2.Q6_K 5.53 GiB pp512 726.13
AMD Ryzen Threadripper PRO 7995WX (znver4) mistral-7b-instruct-v0.2.Q6_K 5.53 GiB tg16 38.65
AMD Ryzen Threadripper PRO 7995WX (znver4) mistral-7b-instruct-v0.2.Q5_1 5.07 GiB pp512 534.04
AMD Ryzen Threadripper PRO 7995WX (znver4) mistral-7b-instruct-v0.2.Q5_1 5.07 GiB tg16 38.68
AMD Ryzen Threadripper PRO 7995WX (znver4) mistral-7b-instruct-v0.2.Q5_K_M 4.78 GiB pp512 723.25
AMD Ryzen Threadripper PRO 7995WX (znver4) mistral-7b-instruct-v0.2.Q5_K_M 4.78 GiB tg16 41.13
AMD Ryzen Threadripper PRO 7995WX (znver4) mistral-7b-instruct-v0.2.Q5_0 4.65 GiB pp512 536.67
AMD Ryzen Threadripper PRO 7995WX (znver4) mistral-7b-instruct-v0.2.Q5_0 4.65 GiB tg16 42.46
AMD Ryzen Threadripper PRO 7995WX (znver4) mistral-7b-instruct-v0.2.Q5_K_S 4.65 GiB pp512 651.05
AMD Ryzen Threadripper PRO 7995WX (znver4) mistral-7b-instruct-v0.2.Q5_K_S 4.65 GiB tg16 42.14
AMD Ryzen Threadripper PRO 7995WX (znver4) mistral-7b-instruct-v0.2.Q4_1 4.24 GiB pp512 572.67
AMD Ryzen Threadripper PRO 7995WX (znver4) mistral-7b-instruct-v0.2.Q4_1 4.24 GiB tg16 43.19
AMD Ryzen Threadripper PRO 7995WX (znver4) mistral-7b-instruct-v0.2.Q4_K_M 4.07 GiB pp512 728.48
AMD Ryzen Threadripper PRO 7995WX (znver4) mistral-7b-instruct-v0.2.Q4_K_M 4.07 GiB tg16 44.29
AMD Ryzen Threadripper PRO 7995WX (znver4) mistral-7b-instruct-v0.2.Q4_K_S 3.86 GiB pp512 666.82
AMD Ryzen Threadripper PRO 7995WX (znver4) mistral-7b-instruct-v0.2.Q4_K_S 3.86 GiB tg16 45.18
AMD Ryzen Threadripper PRO 7995WX (znver4) mistral-7b-instruct-v0.2.Q4_0 3.83 GiB pp512 562.96
AMD Ryzen Threadripper PRO 7995WX (znver4) mistral-7b-instruct-v0.2.Q4_0 3.83 GiB tg16 48.02
AMD Ryzen Threadripper PRO 7995WX (znver4) mistral-7b-instruct-v0.2.Q3_K_L 3.56 GiB pp512 706.64
AMD Ryzen Threadripper PRO 7995WX (znver4) mistral-7b-instruct-v0.2.Q3_K_L 3.56 GiB tg16 46.82
AMD Ryzen Threadripper PRO 7995WX (znver4) mistral-7b-instruct-v0.2.Q3_K_M 3.28 GiB pp512 715.62
AMD Ryzen Threadripper PRO 7995WX (znver4) mistral-7b-instruct-v0.2.Q3_K_M 3.28 GiB tg16 48.29
AMD Ryzen Threadripper PRO 7995WX (znver4) mistral-7b-instruct-v0.2.Q3_K_S 2.95 GiB pp512 722.11
AMD Ryzen Threadripper PRO 7995WX (znver4) mistral-7b-instruct-v0.2.Q3_K_S 2.95 GiB tg16 49.76
AMD Ryzen Threadripper PRO 7995WX (znver4) mistral-7b-instruct-v0.2.Q2_K 2.53 GiB pp512 739.28
AMD Ryzen Threadripper PRO 7995WX (znver4) mistral-7b-instruct-v0.2.Q2_K 2.53 GiB tg16 53.01
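The rows above are space-separated, and the cpu_info column itself contains spaces, so they are easiest to read by splitting from the right. The following is a minimal Python sketch, not part of llamafile, that parses four rows copied verbatim from the table and compares a K quant (Q4_K_M) against a legacy quant (Q4_0) for mistral-7b on the Threadripper. The helper names `parse_row` and `ratio` are hypothetical and introduced only for illustration.

```python
# Four rows copied verbatim from the benchmark table above.
ROWS = """\
AMD Ryzen Threadripper PRO 7995WX (znver4) mistral-7b-instruct-v0.2.Q4_K_M 4.07 GiB pp512 728.48
AMD Ryzen Threadripper PRO 7995WX (znver4) mistral-7b-instruct-v0.2.Q4_K_M 4.07 GiB tg16 44.29
AMD Ryzen Threadripper PRO 7995WX (znver4) mistral-7b-instruct-v0.2.Q4_0 3.83 GiB pp512 562.96
AMD Ryzen Threadripper PRO 7995WX (znver4) mistral-7b-instruct-v0.2.Q4_0 3.83 GiB tg16 48.02
"""


def parse_row(line):
    """Parse one flattened row (hypothetical helper).

    The last four fields (size value, size unit, test, t/s) never contain
    spaces, so they are peeled off from the right; the model filename is the
    last token of what remains, and everything before it is cpu_info.
    """
    cpu_and_model, size, unit, test, rate = line.rsplit(None, 4)
    cpu, model = cpu_and_model.rsplit(None, 1)
    return {
        "cpu": cpu,
        "model": model,
        "size": f"{size} {unit}",
        "test": test,
        "tps": float(rate),
    }


def ratio(rows, model_a, model_b, test):
    """Return t/s of model_a divided by t/s of model_b for one test."""
    by_key = {(r["model"], r["test"]): r["tps"] for r in rows}
    return by_key[(model_a, test)] / by_key[(model_b, test)]


rows = [parse_row(line) for line in ROWS.splitlines() if line.strip()]

K_QUANT = "mistral-7b-instruct-v0.2.Q4_K_M"
LEGACY = "mistral-7b-instruct-v0.2.Q4_0"

pp = ratio(rows, K_QUANT, LEGACY, "pp512")
tg = ratio(rows, K_QUANT, LEGACY, "tg16")
print(f"pp512 Q4_K_M / Q4_0: {pp:.2f}x")
print(f"tg16  Q4_K_M / Q4_0: {tg:.2f}x")
```

On these four rows, the K quant is ahead in prompt processing while the slightly smaller Q4_0 file is ahead in token generation. The same parser should work on the other rows, since every row ends with the same four space-free fields.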


1. llamafile-0.8.5 33.79MB

2. llamafile-0.8.5.zip 72.87MB

3. llamafile-bench-0.8.5 7.51MB

4. llamafile-quantize-0.8.5 7.24MB

5. zipalign-0.8.5 752.92KB
