v0.9.0

Release date: 2024-10-09 10:10:17

Latest release of argmaxinc/WhisperKit: v0.10.1 (2024-12-21 13:48:53)

Highlights

Package Updates

With https://github.com/argmaxinc/WhisperKit/pull/216, the default check for whether a model is supported on the device now uses the model repo's config.json as the source of truth. The need for this came about with the release of the new large-v3 turbo model, listed in the model repo as openai_whisper-large-v3-v20240930, which was being recommended for devices that would crash when attempting to load it. Situations like this can now be mitigated by updating config.json, without the need for a new release. The remote config can also be queried directly with the new static method recommendedRemoteModels:

    let recommendedModel = await WhisperKit.recommendedRemoteModels().default
    let pipe = try? await WhisperKit(model: recommendedModel)

The existing interface for WhisperKit.recommendedModels() remains the same, but now returns a ModelSupport object with a list of supported models for the current device.

public struct ModelSupport: Codable, Equatable {
    public let `default`: String
    public let supported: [String]
    public var disabled: [String] = []
}
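
As a minimal sketch of how a ModelSupport value might be consumed, the snippet below picks a model name with a fallback to the device default. The struct is a local copy of the one shown above so the example compiles without WhisperKit. The pickModel helper and the sample model lists are hypothetical and not part of the WhisperKit API.

```swift
import Foundation

// Local copy of the struct from the release notes, so this compiles standalone.
struct ModelSupport: Codable, Equatable {
    let `default`: String
    let supported: [String]
    var disabled: [String] = []
}

// Hypothetical helper (not a WhisperKit function): use the requested model
// only if it is listed as supported and not disabled; otherwise fall back
// to the default recommended for the device.
func pickModel(requested: String?, from support: ModelSupport) -> String {
    guard let requested = requested,
          support.supported.contains(requested),
          !support.disabled.contains(requested)
    else {
        return support.default
    }
    return requested
}

// Illustrative values only; real lists come from the model repo config.
let support = ModelSupport(
    default: "openai_whisper-base",
    supported: ["openai_whisper-tiny", "openai_whisper-base"],
    disabled: ["openai_whisper-large-v3-v20240930"]
)

print(pickModel(requested: "openai_whisper-tiny", from: support))
print(pickModel(requested: "openai_whisper-large-v3-v20240930", from: support))
```

In an app, the support value would come from WhisperKit.recommendedModels() or WhisperKit.recommendedRemoteModels() rather than being constructed by hand.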

Also, in an ongoing effort to improve modularity, extensibility, and code structure, there is a new way to initialize WhisperKit: the WhisperKitConfig class. The parameters are exactly the same and the previous init method is still in place, but the config class makes it easier to define WhisperKit settings and protocol objects ahead of time and to initialize WhisperKit more cleanly:

Previous:

let pipe = try? await WhisperKit(model: "your-custom-model", modelRepo: "username/your-model-repo")

New:

let config = WhisperKitConfig(model: "your-custom-model", modelRepo: "username/your-model-repo") // Initialize config
config.model = "your-custom-model" // Alternatively set parameters directly
let pipe = try? await WhisperKit(config) // Pass into WhisperKit initializer

WhisperAX example app and CLI

Thanks to some memory and audio processing optimizations in #195, #216, and #217 (shout out to @keleftheriou for finding a big improvement there), we've updated the example implementations to use VAD by default with a concurrentWorkerCount of 4. This will significantly improve default inference speed on long files for devices that support async prediction, as well as real-time streaming for device/model combinations that run faster than real time.

⚠️ Deprecations and changed interfaces

The extension on Process.processor is now ProcessInfo.processor and includes a new property, ProcessInfo.hwModel, which returns a string similar to that of uname(&utsname) for non-Macs.
public func modelSupport(for deviceName: String) -> (default: String, disabled: [String]) is now a disfavored overload; prefer public func modelSupport(for deviceName: String, from config: ModelSupportConfig? = nil) -> ModelSupport

What's Changed

Make additional initializers, functions, members public for extensibility by @bpkeene in https://github.com/argmaxinc/WhisperKit/pull/192
Fix start time logic for file loading by @ZachNagengast in https://github.com/argmaxinc/WhisperKit/pull/195
Change static var stored properties to static let by @fumoboy007 in https://github.com/argmaxinc/WhisperKit/pull/190
Add VoiceActivityDetector base class by @a2they in https://github.com/argmaxinc/WhisperKit/pull/199
Set default concurrentWorkerCount by @atiorh in https://github.com/argmaxinc/WhisperKit/pull/205
Improving modularity and code structure by @a2they in https://github.com/argmaxinc/WhisperKit/pull/212
Add model support config fetching from model repo by @ZachNagengast in https://github.com/argmaxinc/WhisperKit/pull/216
Example app VAD default + memory reduction by @ZachNagengast in https://github.com/argmaxinc/WhisperKit/pull/217

New Contributors

@bpkeene made their first contribution in https://github.com/argmaxinc/WhisperKit/pull/192
@fumoboy007 made their first contribution in https://github.com/argmaxinc/WhisperKit/pull/190
@a2they made their first contribution in https://github.com/argmaxinc/WhisperKit/pull/199
@atiorh made their first contribution in https://github.com/argmaxinc/WhisperKit/pull/205
@1amageek made their first contribution in https://github.com/argmaxinc/WhisperKit/pull/216
@keleftheriou made their first contribution in https://github.com/argmaxinc/WhisperKit/pull/217

Full Changelog: https://github.com/argmaxinc/WhisperKit/compare/v0.8.0...v0.9.0
