v1.5.0
Release date: 2023-11-16 05:06:23
Overview
This major release includes the following changes:
- Full GPU processing of the Encoder and the Decoder with CUDA and Metal is now supported
- Efficient beam-search implementation via batched decoding and unified KV cache
- Full quantization support of all available `ggml` quantization types
- Support for grammar constrained sampling
- Support for Distil Whisper models
- Support for Whisper Large-v3
and more
Full GPU support
On Apple Silicon, GPU support has been available to a large extent since 15 Sep. However, part of the Encoder was still being executed on the CPU due to lack of MSL kernels for the convolution operations. These kernels are now available, resulting in additional speed-up of the Encoder in this release:
Encoder performance on Apple M1 Max - before and after (plot by @dreness)
For NVIDIA hardware, the entire computation can now be offloaded to the GPU, which results in a significant performance boost. For a detailed performance breakdown, check out the Benchmarks section below.
The GPU processing on Apple Silicon is enabled by default, while for NVIDIA you need to build with `WHISPER_CUBLAS=1`:

```bash
# Apple Silicon
make

# NVIDIA
WHISPER_CUBLAS=1 make
```
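If you build with GPU support but want to keep a particular context on the CPU, the new context parameters (see the API Changes section below) expose a `use_gpu` flag. A minimal sketch, assuming the `whisper_context_default_params()` helper from the header at the time of this release and an illustrative model path:

```cpp
#include "whisper.h"

int main() {
    // start from the defaults and opt out of GPU offloading for this context
    struct whisper_context_params cparams = whisper_context_default_params();
    cparams.use_gpu = false;

    struct whisper_context * ctx =
        whisper_init_from_file_with_params("models/ggml-base.en.bin", cparams);
    if (ctx == nullptr) {
        return 1;
    }

    // ... run whisper_full() as usual ...

    whisper_free(ctx);
    return 0;
}
```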
Implementation: https://github.com/ggerganov/whisper.cpp/pull/1472
Special credits to: @FSSRepo, @slaren
Batched decoding + efficient Beam Search
At last, whisper.cpp now supports efficient Beam Search decoding. The missing piece was the implementation of batched decoding, which now follows closely the unified KV cache idea from llama.cpp. On modern NVIDIA hardware, the performance with 5 beams is the same as with 1 beam thanks to the large amount of computing power available. With Metal, the speed with 5 beams is a bit slower compared to 1 beam, but it is significantly faster than the 5x single-batch time observed with the old naive implementation.
Beam Search is now enabled by default in whisper.cpp to match the OG implementation of OpenAI Whisper. For more performance details, check out the Benchmarks section below.
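From the C API, Beam Search can also be selected explicitly via the sampling strategy. A minimal sketch (the context `ctx` and the 16 kHz mono samples in `pcmf32` are assumed to be prepared elsewhere):

```cpp
#include "whisper.h"

#include <cstdio>
#include <vector>

// Decode with 5 beams using the batched Beam Search path and print the result.
void transcribe_with_beams(whisper_context * ctx, const std::vector<float> & pcmf32) {
    whisper_full_params wparams = whisper_full_default_params(WHISPER_SAMPLING_BEAM_SEARCH);
    wparams.beam_search.beam_size = 5; // corresponds to the "Bch5" benchmark column below

    if (whisper_full(ctx, wparams, pcmf32.data(), (int) pcmf32.size()) != 0) {
        fprintf(stderr, "whisper_full() failed\n");
        return;
    }

    for (int i = 0; i < whisper_full_n_segments(ctx); ++i) {
        printf("%s\n", whisper_full_get_segment_text(ctx, i));
    }
}
```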
Implementation: https://github.com/ggerganov/whisper.cpp/pull/1486
Quantization support
All `ggml` quantization types are now supported. Quantization mixtures for the Whisper model can be implemented. It's still unclear how the quality is affected by quantization - this is an interesting area which can be explored in the future.
Grammar sampling
The decoder output can now be constrained with a GBNF grammar. This can be a useful technique for further improving the transcription quality in situations where the set of possible phrases is limited.
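At the API level this is exposed through the new grammar fields of `whisper_full_params` (see the diff in the API Changes section below). A sketch, assuming the GBNF grammar has already been parsed into `whisper_grammar_element` rules (the grammar parser used by the main example, or an equivalent, is assumed for that step):

```cpp
#include "whisper.h"

#include <vector>

// Constrain decoding to a pre-parsed GBNF grammar.
// `rules` points at the parsed grammar; building it from a .gbnf file is not shown here.
void set_grammar(whisper_full_params & wparams,
                 std::vector<const whisper_grammar_element *> & rules,
                 size_t root_rule) {
    wparams.grammar_rules   = rules.data();
    wparams.n_grammar_rules = rules.size();
    wparams.i_start_rule    = root_rule; // index of the rule decoding starts from
    wparams.grammar_penalty = 100.0f;    // penalty for tokens that fall outside the grammar
}
```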
https://github.com/ggerganov/whisper.cpp/assets/377495/d24716e2-5e9c-441b-8c6b-395922dccbf4
Implementation: https://github.com/ggerganov/whisper.cpp/pull/1229
Special credits to @ejones
Distil Whisper
Recently, Distil Whisper models have been released: https://huggingface.co/distil-whisper
whisper.cpp offers support for these models, although it still lacks full implementation of the proposed chunking strategy. Performance details for the distilled models are included in the Benchmarks section below.
Implementation: https://github.com/ggerganov/whisper.cpp/pull/1424
Whisper Large-v3
Recently, OpenAI released version 3 of the Large model: https://github.com/openai/whisper/pull/1761
Implementation: https://github.com/ggerganov/whisper.cpp/pull/1444
Benchmarks
Below is a breakdown of the performance of whisper.cpp on Apple Silicon, NVIDIA and CPU. The tables show the Encoder and Decoder speed in ms/tok. The Dec. column corresponds to batch size 1. The Bch5 column corresponds to batch size 5. The PP column corresponds to batch size 128.

For optimal Beam Search performance, the Bch5 number should be 5 times smaller than Dec.
Hw | Config | Model | Th | Enc. | Dec. | Bch5 | PP | Commit |
---|---|---|---|---|---|---|---|---|
M2 Ultra | METAL | tiny | 1 | 11.14 | 1.40 | 0.49 | 0.01 | ccc85b4 |
M2 Ultra | METAL | tiny-q5_0 | 1 | 11.51 | 1.41 | 0.52 | 0.01 | ccc85b4 |
M2 Ultra | METAL | tiny-q5_1 | 1 | 12.21 | 1.41 | 0.52 | 0.01 | ccc85b4 |
M2 Ultra | METAL | base | 1 | 20.21 | 2.05 | 0.77 | 0.02 | ccc85b4 |
M2 Ultra | METAL | base-q5_0 | 1 | 19.89 | 1.96 | 0.81 | 0.02 | ccc85b4 |
M2 Ultra | METAL | base-q5_1 | 1 | 20.14 | 2.02 | 0.81 | 0.02 | ccc85b4 |
M2 Ultra | METAL | small | 1 | 51.01 | 3.97 | 1.74 | 0.05 | ccc85b4 |
M2 Ultra | METAL | small-q5_0 | 1 | 56.86 | 4.09 | 1.85 | 0.06 | ccc85b4 |
M2 Ultra | METAL | small-q5_1 | 1 | 56.81 | 4.14 | 1.85 | 0.06 | ccc85b4 |
M2 Ultra | METAL | medium | 1 | 141.21 | 8.47 | 3.98 | 0.13 | ccc85b4 |
M2 Ultra | METAL | medium-q5_0 | 1 | 160.56 | 8.27 | 4.18 | 0.14 | ccc85b4 |
M2 Ultra | METAL | medium-q5_1 | 1 | 160.52 | 8.40 | 4.15 | 0.14 | ccc85b4 |
M2 Ultra | METAL | medium-dis | 1 | 128.14 | 1.13 | 0.43 | 0.02 | ccc85b4 |
M2 Ultra | METAL | large-v2 | 1 | 248.73 | 11.96 | 6.08 | 0.22 | ccc85b4 |
M2 Ultra | METAL | large-v2-q5_0 | 1 | 286.31 | 11.99 | 6.60 | 0.26 | ccc85b4 |
M2 Ultra | METAL | large-v2-q5_1 | 1 | 284.56 | 12.42 | 6.47 | 0.26 | ccc85b4 |
M2 Ultra | METAL | large-v2-dis | 1 | 224.31 | 1.26 | 0.49 | 0.02 | ccc85b4 |
Hw | Config | Model | Th | Enc. | Dec. | Bch5 | PP | Commit |
---|---|---|---|---|---|---|---|---|
M2 Ultra | COREML METAL | tiny | 1 | 7.60 | 1.41 | 0.50 | 0.01 | ccc85b4 |
M2 Ultra | COREML METAL | base | 1 | 11.90 | 2.07 | 0.78 | 0.02 | ccc85b4 |
M2 Ultra | COREML METAL | small | 1 | 32.19 | 4.10 | 1.78 | 0.05 | ccc85b4 |
M2 Ultra | COREML METAL | medium | 1 | 94.43 | 8.40 | 3.89 | 0.12 | ccc85b4 |
M2 Ultra | COREML METAL | large-v2 | 1 | 179.78 | 12.12 | 6.07 | 0.22 | ccc85b4 |
Hw | Config | Model | Th | Enc. | Dec. | Bch5 | PP | Commit |
---|---|---|---|---|---|---|---|---|
NVIDIA V100 | BLAS CUDA | tiny | 1 | 8.84 | 1.62 | 0.33 | 0.02 | ccc85b4 |
NVIDIA V100 | BLAS CUDA | tiny-q5_0 | 1 | 8.43 | 1.19 | 0.31 | 0.02 | ccc85b4 |
NVIDIA V100 | BLAS CUDA | tiny-q5_1 | 1 | 8.41 | 1.19 | 0.29 | 0.02 | ccc85b4 |
NVIDIA V100 | BLAS CUDA | base | 1 | 14.79 | 2.31 | 0.46 | 0.03 | ccc85b4 |
NVIDIA V100 | BLAS CUDA | base-q5_0 | 1 | 15.05 | 1.66 | 0.44 | 0.03 | ccc85b4 |
NVIDIA V100 | BLAS CUDA | base-q5_1 | 1 | 15.01 | 1.68 | 0.46 | 0.03 | ccc85b4 |
NVIDIA V100 | BLAS CUDA | small | 1 | 40.30 | 4.37 | 0.88 | 0.05 | ccc85b4 |
NVIDIA V100 | BLAS CUDA | small-q5_0 | 1 | 41.17 | 3.11 | 0.94 | 0.05 | ccc85b4 |
NVIDIA V100 | BLAS CUDA | small-q5_1 | 1 | 41.12 | 3.11 | 0.82 | 0.05 | ccc85b4 |
NVIDIA V100 | BLAS CUDA | medium | 1 | 104.93 | 10.06 | 1.77 | 0.11 | ccc85b4 |
NVIDIA V100 | BLAS CUDA | medium-q5_0 | 1 | 107.11 | 6.13 | 2.07 | 0.12 | ccc85b4 |
NVIDIA V100 | BLAS CUDA | medium-q5_1 | 1 | 107.91 | 6.21 | 1.77 | 0.12 | ccc85b4 |
NVIDIA V100 | BLAS CUDA | medium-dis | 1 | 103.45 | 1.11 | 0.24 | 0.02 | ccc85b4 |
NVIDIA V100 | BLAS CUDA | large-v2 | 1 | 171.55 | 15.76 | 2.62 | 0.17 | ccc85b4 |
NVIDIA V100 | BLAS CUDA | large-v2-q5_0 | 1 | 176.27 | 8.61 | 3.17 | 0.19 | ccc85b4 |
NVIDIA V100 | BLAS CUDA | large-v2-q5_1 | 1 | 176.23 | 8.67 | 2.59 | 0.19 | ccc85b4 |
Hw | Config | Model | Th | Enc. | Dec. | Bch5 | PP | Commit |
---|---|---|---|---|---|---|---|---|
AMD Ryzen 9 5950X | AVX2 | tiny | 8 | 197.47 | 1.22 | 0.44 | 0.25 | ccc85b4 |
AMD Ryzen 9 5950X | AVX2 | tiny-q5_0 | 8 | 222.92 | 0.87 | 0.45 | 0.30 | ccc85b4 |
AMD Ryzen 9 5950X | AVX2 | tiny-q5_1 | 8 | 221.25 | 0.89 | 0.45 | 0.30 | ccc85b4 |
AMD Ryzen 9 5950X | AVX2 | base | 8 | 427.14 | 3.11 | 0.88 | 0.43 | ccc85b4 |
AMD Ryzen 9 5950X | AVX2 | base-q5_0 | 8 | 474.96 | 1.41 | 0.72 | 0.51 | ccc85b4 |
AMD Ryzen 9 5950X | AVX2 | base-q5_1 | 8 | 485.05 | 1.48 | 0.73 | 0.52 | ccc85b4 |
AMD Ryzen 9 5950X | AVX2 | small | 8 | 1470.51 | 11.70 | 2.89 | 1.21 | ccc85b4 |
AMD Ryzen 9 5950X | AVX2 | small-q5_0 | 8 | 1700.43 | 5.48 | 1.98 | 1.41 | ccc85b4 |
AMD Ryzen 9 5950X | AVX2 | small-q5_1 | 8 | 1719.03 | 5.79 | 2.02 | 1.42 | ccc85b4 |
AMD Ryzen 9 5950X | AVX2 | medium | 8 | 4417.70 | 35.13 | 8.14 | 3.24 | ccc85b4 |
AMD Ryzen 9 5950X | AVX2 | medium-q5_0 | 8 | 5335.77 | 17.44 | 5.35 | 3.92 | ccc85b4 |
AMD Ryzen 9 5950X | AVX2 | medium-q5_1 | 8 | 5372.26 | 18.36 | 5.42 | 3.88 | ccc85b4 |
AMD Ryzen 9 5950X | AVX2 | medium-dis | 8 | 4070.25 | 4.86 | 1.16 | 0.53 | ccc85b4 |
AMD Ryzen 9 5950X | AVX2 | large-v2 | 8 | 8179.09 | 66.89 | 15.45 | 5.88 | ccc85b4 |
AMD Ryzen 9 5950X | AVX2 | large-v2-dis | 8 | 7490.45 | 7.06 | 1.63 | 0.70 | ccc85b4 |
API Changes
- Add `struct whisper_context_params`
- Add `whisper_log_set`
- Deprecate:
  - `whisper_init_from_file`
  - `whisper_init_from_buffer`
  - `whisper_init`
  - `whisper_init_from_file_no_state`
  - `whisper_init_from_buffer_no_state`
  - `whisper_init_no_state`
- Add:
  - `whisper_init_from_file_with_params`
  - `whisper_init_from_buffer_with_params`
  - `whisper_init_with_params`
  - `whisper_init_from_file_with_params_no_state`
  - `whisper_init_from_buffer_with_params_no_state`
  - `whisper_init_with_params_no_state`
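The deprecated initializers keep working for now; the GPU example above already shows the new `_with_params` replacement. For the new logging hook, a short sketch (the callback signature is assumed to follow ggml's log callback of level, message text, and user data, based on the header at the time of this release):

```cpp
#include "whisper.h"

#include <cstdio>

// Forward whisper.cpp log messages to stderr (or drop them entirely).
static void log_to_stderr(enum ggml_log_level level, const char * text, void * user_data) {
    (void) level;
    (void) user_data;
    fputs(text, stderr);
}

int main() {
    whisper_log_set(log_to_stderr, nullptr);
    // ... initialize a context and transcribe as usual ...
    return 0;
}
```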
Diff of `struct whisper_full_params`:
```diff
struct whisper_full_params {
enum whisper_sampling_strategy strategy;
@@ -338,6 +435,7 @@ extern "C" {
bool translate;
bool no_context; // do not use past transcription (if any) as initial prompt for the decoder
+ bool no_timestamps; // do not generate timestamps
bool single_segment; // force single segment output (useful for streaming)
bool print_special; // print special tokens (e.g. <SOT>, <EOT>, <BEG>, etc.)
bool print_progress; // print progress information
@@ -355,8 +453,12 @@ extern "C" {
// [EXPERIMENTAL] speed-up techniques
// note: these can significantly reduce the quality of the output
bool speed_up; // speed-up the audio by 2x using Phase Vocoder
+ bool debug_mode; // enable debug_mode provides extra info (eg. Dump log_mel)
int audio_ctx; // overwrite the audio context size (0 = use default)
+ // [EXPERIMENTAL] [TDRZ] tinydiarize
+ bool tdrz_enable; // enable tinydiarize speaker turn detection
+
// tokens to provide to the whisper decoder as initial prompt
// these are prepended to any existing text context from a previous call
const char * initial_prompt;
@@ -365,6 +467,7 @@ extern "C" {
// for auto-detection, set to nullptr, "" or "auto"
const char * language;
+ bool detect_language;
// common decoding parameters:
bool suppress_blank; // ref: https://github.com/openai/whisper/blob/f82bc59f5ea234d4b97fb2860842ed38519f7e65/whisper/decoding.py#L89
@@ -403,11 +506,24 @@ extern "C" {
whisper_encoder_begin_callback encoder_begin_callback;
void * encoder_begin_callback_user_data;
+ // called each time before ggml computation starts
+ whisper_abort_callback abort_callback;
+ void * abort_callback_user_data;
+
// called by each decoder to filter obtained logits
whisper_logits_filter_callback logits_filter_callback;
void * logits_filter_callback_user_data;
+
+ const whisper_grammar_element ** grammar_rules;
+ size_t n_grammar_rules;
+ size_t i_start_rule;
+ float grammar_penalty;
};
```
There might be some instability around the API, especially with the existing language bindings. I wasn't able to test everything, so expect some issues and feel free to submit PRs with any kind of fixes that you find.
Highlights and what's next
A lot of the updates in this release are possible thanks to the many contributions in llama.cpp - huge shoutout to all the contributors and collaborators there!
Regarding future updates to whisper.cpp, I'm looking forward to the following things:
- Add a server example similar to the one in llama.cpp
- Try to improve Metal's batched decoding performance
- Look for some interesting applications of the grammar sampling functionality
Latest performance of the talk-llama example
https://github.com/ggerganov/whisper.cpp/assets/1991296/d97a3788-bf2a-4756-9a43-60c6b391649e
What's Changed
- Fix quantize bug by @ggerganov in https://github.com/ggerganov/whisper.cpp/pull/842
- whisper.wasm : fix typo in readme by @BaffinLee in https://github.com/ggerganov/whisper.cpp/pull/832
- Adding --session support in examples/talk-llama by @herrera-luis in https://github.com/ggerganov/whisper.cpp/pull/845
- --detect-language mode by @CRD716 in https://github.com/ggerganov/whisper.cpp/pull/853
- talk-llama: updating session prompts load by @herrera-luis in https://github.com/ggerganov/whisper.cpp/pull/854
- CMake/Makefile : CLBlast support as in llama.cpp by @trholding in https://github.com/ggerganov/whisper.cpp/pull/862
- Instruction: Partial OpenCL GPU support via CLBlast by @trholding in https://github.com/ggerganov/whisper.cpp/pull/863
- Add cuBLAS build workflow and fix error causing lines in CMakeLists by @RelatedTitle in https://github.com/ggerganov/whisper.cpp/pull/867
- cmake : fix options disabling AVX and AVX2 flags by @blazingzephyr in https://github.com/ggerganov/whisper.cpp/pull/885
- Added large-v2. Added instructions on converting to GGML. Added --no-… by @cjheath in https://github.com/ggerganov/whisper.cpp/pull/874
- talk-llama: only copy used KV cache in get / set state by @herrera-luis in https://github.com/ggerganov/whisper.cpp/pull/890
- Fix define used for COREML_ALLOW_FALLBACK by @jcsoo in https://github.com/ggerganov/whisper.cpp/pull/893
- coreml : fix memory leak by @ggerganov in https://github.com/ggerganov/whisper.cpp/pull/899
- whisper.objc : enable Core ML in example & fix segmentation fault by @jhen0409 in https://github.com/ggerganov/whisper.cpp/pull/910
- Align --no-timestamps in help to actual behavior by @Miserlou in https://github.com/ggerganov/whisper.cpp/pull/908
- readme : improve Core ML model conversion guidance by @jhen0409 in https://github.com/ggerganov/whisper.cpp/pull/915
- Added support of large-v1 model into CoreML by @abCods in https://github.com/ggerganov/whisper.cpp/pull/926
- Update of Hebrew Language Code: 'iw' to 'he' by @ttv20 in https://github.com/ggerganov/whisper.cpp/pull/935
- java bindings by @nalbion in https://github.com/ggerganov/whisper.cpp/pull/931
- ci: Build with any BLAS compatible library by @akharlamov in https://github.com/ggerganov/whisper.cpp/pull/927
- [DOCS] highlight openblas support in https://github.com/ggerganov/whisper.cpp/pull/956
- Update elevenlabs example to use official python API by @DGdev91 in https://github.com/ggerganov/whisper.cpp/pull/837
- Update README.md by @genevera in https://github.com/ggerganov/whisper.cpp/pull/964
- Feature/java bindings2 by @nalbion in https://github.com/ggerganov/whisper.cpp/pull/944
- Support decode wav file has 2 channels. by @geniusnut in https://github.com/ggerganov/whisper.cpp/pull/972
- README.md: Corrected syntax for markdown link by @LarryBattle in https://github.com/ggerganov/whisper.cpp/pull/995
- Make convert-pt-to-ggml.py backwards compatible with older vocab.json tokenizer files by @akashmjn in https://github.com/ggerganov/whisper.cpp/pull/1001
- Fixing Accidental 'exit(0)' and Ensuring Proper 'return 1' in `examples/main/main.cpp` `whisper_params_parse` by @faker2048 in https://github.com/ggerganov/whisper.cpp/pull/1002
- Fix for issue #876 by @burningion in https://github.com/ggerganov/whisper.cpp/pull/1012
- Make cuBLAS compilation compatible with x86 as well as aarch64 by @byte-6174 in https://github.com/ggerganov/whisper.cpp/pull/1015
- feat(golang): improve progress reporting and callback handling by @appleboy in https://github.com/ggerganov/whisper.cpp/pull/1024
- Add support for whisper_full_lang_id() to go bindings by @jaybinks in https://github.com/ggerganov/whisper.cpp/pull/1010
- Add alternative java binding to readme by @GiviMAD in https://github.com/ggerganov/whisper.cpp/pull/1029
- diarization: add diarization support for all current output types by @colinc in https://github.com/ggerganov/whisper.cpp/pull/1031
- Fix cd statements to allow spaces in model path by @roddurd in https://github.com/ggerganov/whisper.cpp/pull/1041
- adding ggml_to_pt script by @simonMoisselin in https://github.com/ggerganov/whisper.cpp/pull/1042
- whisper: Fix build with -Werror=undef by @philn in https://github.com/ggerganov/whisper.cpp/pull/1045
- Fix talk-llama build after ggml sync (commit 5feb0dffbae5). by @przemoc in https://github.com/ggerganov/whisper.cpp/pull/1049
- Do not use _GNU_SOURCE gratuitously. by @przemoc in https://github.com/ggerganov/whisper.cpp/pull/1027
- whisper : `split_on_word` no longer trims by @ggerganov in https://github.com/ggerganov/whisper.cpp/pull/1046
- Updated 'quantize-all.sh' to quantize all downloaded models by @thefinaldegree in https://github.com/ggerganov/whisper.cpp/pull/1054
- Fix talk-llama build on macOS. by @przemoc in https://github.com/ggerganov/whisper.cpp/pull/1062
- whisper : support speaker segmentation (local diarization) of mono audio via tinydiarize by @akashmjn in https://github.com/ggerganov/whisper.cpp/pull/1058
- Minor: updated readme by @mwarnaar in https://github.com/ggerganov/whisper.cpp/pull/1064
- OpenVINO support by @RyanMetcalfeInt8 in https://github.com/ggerganov/whisper.cpp/pull/1037
- go bindings: fix context.Process call in examples by @mvrilo in https://github.com/ggerganov/whisper.cpp/pull/1067
- go: Call SetDuration appropriately by @tmc in https://github.com/ggerganov/whisper.cpp/pull/1077
- Multi platforms CI by @alonfaraj in https://github.com/ggerganov/whisper.cpp/pull/1101
- Add Vim plugin by @AustinMroz in https://github.com/ggerganov/whisper.cpp/pull/1131
- chore: move progress calculation out of whisper.cpp by @geekodour in https://github.com/ggerganov/whisper.cpp/pull/1081
- expose api to let user control log output by @evmar in https://github.com/ggerganov/whisper.cpp/pull/1060
- Add a larger (30min) sample by @vadi2 in https://github.com/ggerganov/whisper.cpp/pull/1092
- Sync opencl compilation fix in ggml by @goncha in https://github.com/ggerganov/whisper.cpp/pull/1111
- README.md: Add OpenVINO support details by @RyanMetcalfeInt8 in https://github.com/ggerganov/whisper.cpp/pull/1112
- Fix MSVC compile error C3688 on non-unicode Windows by @goncha in https://github.com/ggerganov/whisper.cpp/pull/1110
- Now make tests can be called as make tests base.en by @Jerry-Master in https://github.com/ggerganov/whisper.cpp/pull/1113
- Go binding: Implement SetSplitOnWord by @xdrudis in https://github.com/ggerganov/whisper.cpp/pull/1114
- set NVCC -arch flag by cuda version by @alonfaraj in https://github.com/ggerganov/whisper.cpp/pull/1115
- Fix CLBlast build on MacOS by @iceychris in https://github.com/ggerganov/whisper.cpp/pull/1120
- Fixed the issue of OpenBLAS not being enabled on Windows. by @bobqianic in https://github.com/ggerganov/whisper.cpp/pull/1128
- whisper : fix visibility warning of struct whisper_full_params by declaring in advance by @IronBlood in https://github.com/ggerganov/whisper.cpp/pull/1124
- Fix MSVC compile error C3688 by @bobqianic in https://github.com/ggerganov/whisper.cpp/pull/1136
- Add tinydiarization support for streaming by @DMcConnell in https://github.com/ggerganov/whisper.cpp/pull/1137
- quantize : fix load vocab crash when len is 128 by @jhen0409 in https://github.com/ggerganov/whisper.cpp/pull/1160
- Fix AVX etc. under GCC/CMake by @marmistrz in https://github.com/ggerganov/whisper.cpp/pull/1174
- Fix PowerPC build failures introduced in #1174 by @marmistrz in https://github.com/ggerganov/whisper.cpp/pull/1196
- Simplify Makefile by @alonfaraj in https://github.com/ggerganov/whisper.cpp/pull/1147
- Add precalculated values of sin/cos for speeding up FFT by @AlexandrGraschenkov in https://github.com/ggerganov/whisper.cpp/pull/1142
- Make build work on Linux machines supporting AVX1 not AVX2 by @lachesis in https://github.com/ggerganov/whisper.cpp/pull/1162
- Fix OpenBLAS detection under Arch Linux by @marmistrz in https://github.com/ggerganov/whisper.cpp/pull/1173
- Minor fixes by @csukuangfj in https://github.com/ggerganov/whisper.cpp/pull/1154
- New command line option by @jbyunes in https://github.com/ggerganov/whisper.cpp/pull/1205
- whisper.android : migrate from ndk-build to CMake by @JunkFood02 in https://github.com/ggerganov/whisper.cpp/pull/1204
- Significantly improve whisper.cpp inference quality by @bobqianic in https://github.com/ggerganov/whisper.cpp/pull/1148
- whisper : allow whisper_full from mel spectrogram - no audio by @ggerganov in https://github.com/ggerganov/whisper.cpp/pull/1214
- ROCm Port by @ardfork in https://github.com/ggerganov/whisper.cpp/pull/1209
- Improvements to vim plugin and LSP server by @AustinMroz in https://github.com/ggerganov/whisper.cpp/pull/1144
- Detect SSSE3 by @przemoc in https://github.com/ggerganov/whisper.cpp/pull/1211
- ggml : fix compiling when SSE3 is available but not SSSE3 by @przemoc in https://github.com/ggerganov/whisper.cpp/pull/1210
- make : add support for building on DragonFlyBSD/NetBSD/OpenBSD by @przemoc in https://github.com/ggerganov/whisper.cpp/pull/1212
- make : use cpuinfo in MSYS2 to enable x86 ISA extensions on the host by @przemoc in https://github.com/ggerganov/whisper.cpp/pull/1216
- Fix CoreML memleak (fixes #1202) by @denersc in https://github.com/ggerganov/whisper.cpp/pull/1218
- whisper.android : fix cmake multiple libraries build by @jhen0409 in https://github.com/ggerganov/whisper.cpp/pull/1224
- Fix compilation errors incurred by -Werror by @shivamidow in https://github.com/ggerganov/whisper.cpp/pull/1227
- ci : enable java package publishing by @ggerganov in https://github.com/ggerganov/whisper.cpp/pull/1228
- fix cmake commands in README #1225 by @wizardforcel in https://github.com/ggerganov/whisper.cpp/pull/1231
- ggml : sync (ggml-alloc, GPU, eps, etc.) by @ggerganov in https://github.com/ggerganov/whisper.cpp/pull/1220
- make : improve cpuinfo handling on x86 hosts by @przemoc in https://github.com/ggerganov/whisper.cpp/pull/1238
- ggml : sync latest llama.cpp (view_src + alloc improvements) by @ggerganov in https://github.com/ggerganov/whisper.cpp/pull/1247
- Posixify pagesize. by @przemoc in https://github.com/ggerganov/whisper.cpp/pull/1251
- Fix detection of AVX2 on macOS by @didzis in https://github.com/ggerganov/whisper.cpp/pull/1250
- Address ARM's big.LITTLE arch by checking cpu info. by @Digipom in https://github.com/ggerganov/whisper.cpp/pull/1254
- Bump gradle plugin and dependencies + a lint pass by @Digipom in https://github.com/ggerganov/whisper.cpp/pull/1255
- Add quantized models to download-ggml-model.sh by @nchudleigh in https://github.com/ggerganov/whisper.cpp/pull/1235
- Do not use _GNU_SOURCE gratuitously. by @przemoc in https://github.com/ggerganov/whisper.cpp/pull/1129
- ci : upgrade gradle to 2.4.2 by @ggerganov in https://github.com/ggerganov/whisper.cpp/pull/1263
- sync : ggml (HBM + Metal + style) by @ggerganov in https://github.com/ggerganov/whisper.cpp/pull/1264
- ci : try to fix gradle action by @ggerganov in https://github.com/ggerganov/whisper.cpp/pull/1265
- Fixed signing of java artifact using gradle by @nalbion in https://github.com/ggerganov/whisper.cpp/pull/1267
- Faster `beam_search` sampling by @bobqianic in https://github.com/ggerganov/whisper.cpp/pull/1243
- whisper : fix bench regression by @ggerganov in https://github.com/ggerganov/whisper.cpp/pull/1275
- whisper : Metal and ggml-alloc support by @ggerganov in https://github.com/ggerganov/whisper.cpp/pull/1270
- bench: fix missing include by @nekr0z in https://github.com/ggerganov/whisper.cpp/pull/1303
- ruby : fix build by add missing ggml-alloc by @jhen0409 in https://github.com/ggerganov/whisper.cpp/pull/1305
- Update README.md. Adding missing options, remove `--speed-up`. by @Sogl in https://github.com/ggerganov/whisper.cpp/pull/1306
- Update README.md by @computerscienceiscool in https://github.com/ggerganov/whisper.cpp/pull/1290
- save the recorded audio to a file by @litongjava in https://github.com/ggerganov/whisper.cpp/pull/1310
- Python benchmark script by @nchudleigh in https://github.com/ggerganov/whisper.cpp/pull/1298
- Minor: fix example talk readme gpt-2 github url by @brunofaustino in https://github.com/ggerganov/whisper.cpp/pull/1334
- Missing speaker turn function in API by @didzis in https://github.com/ggerganov/whisper.cpp/pull/1330
- examples: Move wav_writer from stream.cpp to common.h by @bobqianic in https://github.com/ggerganov/whisper.cpp/pull/1317
- Better abort callback by @mkiol in https://github.com/ggerganov/whisper.cpp/pull/1335
- Add conversion scripts from HuggingFace models to CoreML by @AlienKevin in https://github.com/ggerganov/whisper.cpp/pull/1304
- Prefer pkg-config while looking for BLAS by @marmistrz in https://github.com/ggerganov/whisper.cpp/pull/1349
- Abort build if a feature was requested and could not be configured by @marmistrz in https://github.com/ggerganov/whisper.cpp/pull/1350
- Abort callback improvements by @mkiol in https://github.com/ggerganov/whisper.cpp/pull/1345
- Dockerfile for cublas by @joecryptotoo in https://github.com/ggerganov/whisper.cpp/pull/1286
- docs: fix typo by @jorismertz in https://github.com/ggerganov/whisper.cpp/pull/1362
- Expose the audio_ctx param through the Go binding by @JohanRaffin in https://github.com/ggerganov/whisper.cpp/pull/1368
- Clarify doc about where to compile from by @ai-at-home in https://github.com/ggerganov/whisper.cpp/pull/1400
- Faster download for models on windows using BitTransfer by @WhiteOlivierus in https://github.com/ggerganov/whisper.cpp/pull/1404
- JSON: allow outputting per-token data too by @akx in https://github.com/ggerganov/whisper.cpp/pull/1358
- Move up-to-date demo to top by @asadm in https://github.com/ggerganov/whisper.cpp/pull/1417
- Use absolute paths for the converted OpenVINO model by @bobqianic in https://github.com/ggerganov/whisper.cpp/pull/1356
- sync : ggml (backend v2, k-quants, CUDA opts, Metal opts, etc.) by @ggerganov in https://github.com/ggerganov/whisper.cpp/pull/1422
- whisper : add support for new distilled Whisper models by @ggerganov in https://github.com/ggerganov/whisper.cpp/pull/1424
- whisper : add context param for disable gpu by @jhen0409 in https://github.com/ggerganov/whisper.cpp/pull/1293
- talk-llama : fix n_gpu_layers usage by @jhen0409 in https://github.com/ggerganov/whisper.cpp/pull/1441
- talk-llama : fix n_gpu_layers usage again by @jhen0409 in https://github.com/ggerganov/whisper.cpp/pull/1442
- Fix variable names in GitHub actions config by @iamthad in https://github.com/ggerganov/whisper.cpp/pull/1440
- Reset ctx->t_start_us when calling whisper_reset_timings() by @bjnortier in https://github.com/ggerganov/whisper.cpp/pull/1434
- Decouple Android example into a library and app module by @tobrun in https://github.com/ggerganov/whisper.cpp/pull/1445
- whisper : add support for large v3 by @ggerganov in https://github.com/ggerganov/whisper.cpp/pull/1444
- Add support for Swift Package Manager by @sindresorhus in https://github.com/ggerganov/whisper.cpp/pull/1370
- Reset mel time when resetting timings by @bjnortier in https://github.com/ggerganov/whisper.cpp/pull/1452
- coreml: use the correct n_mel by @jxy in https://github.com/ggerganov/whisper.cpp/pull/1458
- models : Fix `n_mel` mismatch in convert-whisper-to-openvino.py by @bobqianic in https://github.com/ggerganov/whisper.cpp/pull/1459
- Add '-l auto' to talk-llama example by @kubaracek in https://github.com/ggerganov/whisper.cpp/pull/1467
- Return with error from whisper_encode_internal and whisper_decode_int… by @bjnortier in https://github.com/ggerganov/whisper.cpp/pull/1456
- whisper : add full CUDA and Metal offloading by @ggerganov in https://github.com/ggerganov/whisper.cpp/pull/1472
- examples : Enhanced compatibility with older Android versions using Java by @litongjava in https://github.com/ggerganov/whisper.cpp/pull/1382
- Add n_gpu_layers option to talk-llama example by @rlapray in https://github.com/ggerganov/whisper.cpp/pull/1475
- whisper : add grammar-based sampling by @ejones in https://github.com/ggerganov/whisper.cpp/pull/1229
- java : use tiny.en for tests by @ggerganov in https://github.com/ggerganov/whisper.cpp/pull/1484
- whisper : add batched decoding by @ggerganov in https://github.com/ggerganov/whisper.cpp/pull/1486
- java : fix test by @ggerganov in https://github.com/ggerganov/whisper.cpp/pull/1492
- whisper : make large version explicit + fix data size units by @ggerganov in https://github.com/ggerganov/whisper.cpp/pull/1493
New Contributors
- @BaffinLee made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/832
- @herrera-luis made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/845
- @CRD716 made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/853
- @trholding made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/862
- @RelatedTitle made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/867
- @blazingzephyr made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/885
- @cjheath made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/874
- @jcsoo made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/893
- @Miserlou made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/908
- @abCods made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/926
- @ttv20 made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/935
- @nalbion made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/931
- @akharlamov made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/927
- @geniusnut made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/972
- @LarryBattle made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/995
- @akashmjn made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1001
- @faker2048 made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1002
- @burningion made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1012
- @byte-6174 made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1015
- @appleboy made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1024
- @jaybinks made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1010
- @GiviMAD made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1029
- @colinc made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1031
- @roddurd made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1041
- @simonMoisselin made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1042
- @philn made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1045
- @przemoc made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1049
- @thefinaldegree made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1054
- @mwarnaar made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1064
- @RyanMetcalfeInt8 made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1037
- @mvrilo made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1067
- @tmc made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1077
- @alonfaraj made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1101
- @AustinMroz made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1131
- @geekodour made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1081
- @evmar made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1060
- @vadi2 made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1092
- @goncha made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1111
- @Jerry-Master made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1113
- @xdrudis made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1114
- @iceychris made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1120
- @bobqianic made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1128
- @IronBlood made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1124
- @DMcConnell made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1137
- @marmistrz made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1174
- @AlexandrGraschenkov made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1142
- @lachesis made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1162
- @csukuangfj made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1154
- @jbyunes made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1205
- @JunkFood02 made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1204
- @ardfork made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1209
- @denersc made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1218
- @shivamidow made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1227
- @wizardforcel made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1231
- @didzis made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1250
- @nchudleigh made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1235
- @nekr0z made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1303
- @Sogl made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1306
- @computerscienceiscool made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1290
- @litongjava made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1310
- @brunofaustino made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1334
- @mkiol made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1335
- @AlienKevin made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1304
- @joecryptotoo made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1286
- @jorismertz made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1362
- @JohanRaffin made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1368
- @ai-at-home made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1400
- @WhiteOlivierus made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1404
- @akx made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1358
- @asadm made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1417
- @iamthad made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1440
- @bjnortier made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1434
- @tobrun made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1445
- @sindresorhus made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1370
- @jxy made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1458
- @kubaracek made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1467
- @rlapray made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/1475
Full Changelog: https://github.com/ggerganov/whisper.cpp/compare/v1.4.0...v1.5.0
1. whisper-bin-Win32.zip (1.53 MB)
2. whisper-bin-x64.zip (1.79 MB)
3. whisper-blas-bin-Win32.zip (8.05 MB)
4. whisper-blas-bin-x64.zip (13.76 MB)
5. whisper-cublas-bin-x64.zip (1.35 MB)
6. win32-x86-64_whisper.dll.zip (299.49 KB)
7. win32-x86_whisper.dll.zip (237.8 KB)