v0.9.0
Release date: 2024-10-09 10:10:17
Latest release of argmaxinc/WhisperKit: v0.9.4 (2024-11-07 09:51:16)
Highlights
Package Updates
With https://github.com/argmaxinc/WhisperKit/pull/216, the default check for whether a model is supported on the device uses the model repo's config.json as the source of truth. The need for this came about with the release of the new large-v3 turbo model, listed in the model repo as openai_whisper-large-v3-v20240930, which was being recommended for devices that would crash when attempting to load it. Situations like this can now be mitigated by updating config.json, without the need for a new release. The remote config can also be queried directly with the new static method recommendedRemoteModels:
let recommendedModel = await WhisperKit.recommendedRemoteModels().default
let pipe = try? await WhisperKit(model: recommendedModel)
The existing interface for WhisperKit.recommendedModels() remains the same, but now returns a ModelSupport object with a list of supported models for the current device.
public struct ModelSupport: Codable, Equatable {
    public let `default`: String
    public let supported: [String]
    public var disabled: [String] = []
}
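Because ModelSupport is Codable, it can be decoded from JSON. The following is a minimal sketch of that idea, not WhisperKit's own loading code: the JSON fragment, the model names other than openai_whisper-large-v3-v20240930, and the resolveModel helper are all hypothetical, and the real config.json in the model repo may be laid out differently.

```swift
import Foundation

// Copy of the struct shown in the release notes, so the sketch is self-contained.
public struct ModelSupport: Codable, Equatable {
    public let `default`: String
    public let supported: [String]
    public var disabled: [String] = []
}

// Hypothetical JSON fragment (assumption): the real config.json may differ.
let json = Data("""
{
  "default": "openai_whisper-base",
  "supported": ["openai_whisper-tiny", "openai_whisper-base"],
  "disabled": ["openai_whisper-large-v3-v20240930"]
}
""".utf8)

let support = try JSONDecoder().decode(ModelSupport.self, from: json)

// Hypothetical helper: use the requested model only if it is listed as
// supported and not disabled; otherwise fall back to the device default.
func resolveModel(_ requested: String, using support: ModelSupport) -> String {
    let isUsable = support.supported.contains(requested)
        && !support.disabled.contains(requested)
    return isUsable ? requested : support.default
}

print(resolveModel("openai_whisper-large-v3-v20240930", using: support))
```

Falling back to `default` rather than failing is one possible policy; an app could equally surface an error when a disabled model is requested.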
Also, in an ongoing effort to improve modularity, extensibility, and code structure, there is a new way to initialize WhisperKit: the new WhisperKitConfig class. The parameters are exactly the same and the previous init method is still in place, but the config object makes it possible to define WhisperKit settings and protocol objects ahead of time and initialize WhisperKit more cleanly:
Previous:
let pipe = try? await WhisperKit(model: "your-custom-model", modelRepo: "username/your-model-repo")
New:
let config = WhisperKitConfig(model: "your-custom-model", modelRepo: "username/your-model-repo") // Initialize config
config.model = "your-custom-model" // Alternatively set parameters directly
let pipe = try? await WhisperKit(config) // Pass into WhisperKit initializer
WhisperAX example app and CLI
Thanks to some memory and audio processing optimizations in #195, #216, and #217 (shout out to @keleftheriou for finding a big improvement there), we've updated the example implementations to use VAD by default with a concurrentWorkerCount of 4. This will significantly improve default inference speed on long files for devices that support async prediction, as well as real-time streaming for device/model combinations with a real-time factor greater than 1.
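To illustrate what a worker count of 4 means conceptually, here is a small sketch of bounded concurrency with a Swift task group. It is not WhisperKit's implementation: transcribeChunk and transcribeAll are hypothetical stand-ins, and the chunks are plain indices rather than VAD-derived audio segments.

```swift
import Foundation

// Hypothetical stand-in for transcribing one audio chunk; not a WhisperKit API.
func transcribeChunk(_ index: Int) async -> String {
    return "chunk-\(index)"
}

// Processes `chunkCount` chunks with at most `workerCount` tasks in flight,
// returning the results in chunk order.
func transcribeAll(chunkCount: Int, workerCount: Int) async -> [String] {
    let limit = max(1, workerCount)
    return await withTaskGroup(of: (Int, String).self, returning: [String].self) { group in
        var results = [String](repeating: "", count: chunkCount)
        var next = 0

        // Seed the group with up to `limit` tasks.
        while next < min(limit, chunkCount) {
            let i = next
            group.addTask { (i, await transcribeChunk(i)) }
            next += 1
        }

        // Whenever a task finishes, store its result and start the next chunk,
        // so the number of running tasks never exceeds `limit`.
        while let (i, text) = await group.next() {
            results[i] = text
            if next < chunkCount {
                let j = next
                group.addTask { (j, await transcribeChunk(j)) }
                next += 1
            }
        }
        return results
    }
}

let texts = await transcribeAll(chunkCount: 10, workerCount: 4)
print(texts.count)
```

Capping the number of in-flight tasks keeps peak memory bounded on long files while still letting independent chunks overlap.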
⚠️ Deprecations and changed interfaces
- The extension on Process.processor is now ProcessInfo.processor and includes a new property, ProcessInfo.hwModel, which returns a string similar to that of uname(&utsname) on non-Macs.
- public func modelSupport(for deviceName: String) -> (default: String, disabled: [String]) is now a disfavored overload, in favor of public func modelSupport(for deviceName: String, from config: ModelSupportConfig? = nil) -> ModelSupport
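For reference, the string that uname(&utsname) reports can be read as in the sketch below. This only shows the plain uname call; how ProcessInfo.hwModel obtains its value is not described in these notes, and unameMachine is a hypothetical helper name.

```swift
import Foundation

/// Returns the `machine` field reported by uname(2), or an empty string on failure.
func unameMachine() -> String {
    var info = utsname()
    guard uname(&info) == 0 else { return "" }
    // `machine` is imported into Swift as a fixed-size tuple of CChar.
    // Copy its raw bytes and keep everything before the NUL terminator.
    let bytes = withUnsafeBytes(of: info.machine) { Array($0) }
    return String(decoding: bytes.prefix(while: { $0 != 0 }), as: UTF8.self)
}

print(unameMachine())
```

The exact value depends on the platform, so it is best treated as an opaque identifier rather than parsed.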
What's Changed
- Make additional initializers, functions, members public for extensibility by @bpkeene in https://github.com/argmaxinc/WhisperKit/pull/192
- Fix start time logic for file loading by @ZachNagengast in https://github.com/argmaxinc/WhisperKit/pull/195
- Change static var stored properties to static let by @fumoboy007 in https://github.com/argmaxinc/WhisperKit/pull/190
- Add VoiceActivityDetector base class by @a2they in https://github.com/argmaxinc/WhisperKit/pull/199
- Set default concurrentWorkerCount by @atiorh in https://github.com/argmaxinc/WhisperKit/pull/205
- Improving modularity and code structure by @a2they in https://github.com/argmaxinc/WhisperKit/pull/212
- Add model support config fetching from model repo by @ZachNagengast in https://github.com/argmaxinc/WhisperKit/pull/216
- Example app VAD default + memory reduction by @ZachNagengast in https://github.com/argmaxinc/WhisperKit/pull/217
New Contributors
- @bpkeene made their first contribution in https://github.com/argmaxinc/WhisperKit/pull/192
- @fumoboy007 made their first contribution in https://github.com/argmaxinc/WhisperKit/pull/190
- @a2they made their first contribution in https://github.com/argmaxinc/WhisperKit/pull/199
- @atiorh made their first contribution in https://github.com/argmaxinc/WhisperKit/pull/205
- @1amageek made their first contribution in https://github.com/argmaxinc/WhisperKit/pull/216
- @keleftheriou made their first contribution in https://github.com/argmaxinc/WhisperKit/pull/217
Full Changelog: https://github.com/argmaxinc/WhisperKit/compare/v0.8.0...v0.9.0