v1.2.0
Released: 2023-02-04 16:55:40
Overview
In this release we significantly reduce the memory usage during inference by introducing "scratch" buffers to ggml.
The new memory requirements per model are as follows:
| Model  | Disk   | Mem (Old) | Mem (New) |
| ---    | ---    | ---       | ---       |
| tiny   | 75 MB  | ~390 MB   | ~125 MB   |
| base   | 142 MB | ~500 MB   | ~210 MB   |
| small  | 466 MB | ~1.0 GB   | ~600 MB   |
| medium | 1.5 GB | ~2.6 GB   | ~1.7 GB   |
| large  | 2.9 GB | ~4.7 GB   | ~3.3 GB   |
It's a simple idea: instead of creating a new memory buffer for each new tensor in the computation, we reuse the memory of old tensors that are no longer needed. The implementation is in PR #431. It's not very clean - I think there is a better way to do this, but it works for now.
Additionally, there might be some inference speed improvements on Apple Silicon in the Decoder part of the transformer. I haven't done proper benchmarks, but there seems to be about a ~30% performance boost. The results are identical to v1.1.1.
What's Changed
Core ggml / whisper
- whisper : PPC64 big-endian support by @fitzsim in https://github.com/ggerganov/whisper.cpp/pull/398
- whisper : condition sampled timestamp tokens to be monotonically increasing by @ggerganov in https://github.com/ggerganov/whisper.cpp/pull/425
- wasm : fix typo in helper.js by @bhbs in https://github.com/ggerganov/whisper.cpp/pull/459
- ggml/whisper : reduce memory usage during inference by @ggerganov in https://github.com/ggerganov/whisper.cpp/pull/431
Bindings
- ci : run workflows on pull requests + bindings depend on .h by @ggerganov in https://github.com/ggerganov/whisper.cpp/pull/446
- go : added wrappers to reset and print timings by @glaslos in https://github.com/ggerganov/whisper.cpp/pull/436
- go : add WhisperLangAutoDetect method to go binding by @RobinXL in https://github.com/ggerganov/whisper.cpp/pull/451
- go : add wrapper for system info by @glaslos in https://github.com/ggerganov/whisper.cpp/pull/456
- go : support "auto" as an option when setting the language by @polarmoon in https://github.com/ggerganov/whisper.cpp/pull/462
Examples
- whisper.wasm : add labels for easier radio selection by @kokes in https://github.com/ggerganov/whisper.cpp/pull/435
- livestream.sh : run main with model arg instead of default by @EricTendian in https://github.com/ggerganov/whisper.cpp/pull/453
- main : CSV format export trimmed spaces fix by @alex-bacart in https://github.com/ggerganov/whisper.cpp/pull/444
- addon.node : using whisper as a Node.js addon by @chenqianhe in https://github.com/ggerganov/whisper.cpp/pull/443
New Contributors
- @kokes made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/435
- @glaslos made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/436
- @EricTendian made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/453
- @RobinXL made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/451
- @alex-bacart made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/444
- @bhbs made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/459
- @polarmoon made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/462
- @chenqianhe made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/443
Full Changelog: https://github.com/ggerganov/whisper.cpp/compare/v1.1.1...v1.2.0
Highlights
I'll use these release notes to write some random thoughts about the project - sort of a short blog post.
I'm really happy with how whisper.cpp has turned out so far. There has been a very positive reception in the ML community - most people seem to be excited by the simplicity of the implementation and the fact that it is quite self-contained. I receive a lot of questions about the project and about various ideas that it can be applied to. I really enjoy it and I try to respond to everyone!
I also find it very satisfying that there are so many contributions already happening by so many people. To me this illustrates the power of open-source collaboration. The contributions not only improve the functionality and the quality of the code, but also help to generate various new ideas and approaches to explore.
Another interesting thing is that the project keeps on giving. Every time I start to think that now is a good time to put it in the background for a while and focus on other stuff, some new cool idea pops up and I can't help but start working on it. Having this custom implementation allows me to interact with the model on a lower level which opens some interesting ways to explore it.
So far the development has been focused on improving the performance, expanding the platform coverage and having robust decoding strategies with a variety of examples. During this time, several ideas have accumulated that I find interesting to explore (diarization, token-level timestamps, improved timestamp accuracy, etc). I think I'll try to focus more on these in the future and see if I can achieve something interesting.
- Windows port of whisper.cpp utilising vendor-agnostic GPGPU based on DirectCompute by @Const-me
- "The New Yorker" article featuring whisper.cpp
Assets
- whisper-bin-Win32.zip (1.05 MB)
- whisper-bin-x64.zip (1.2 MB)
- whisper-blas-bin-Win32.zip (7.47 MB)
- whisper-blas-bin-x64.zip (12.67 MB)