v5.0
Release date: 2024-06-28 04:07:34
Latest release of snakers4/silero-vad: v5.1 (2024-07-09 21:18:46)
Performance and Model Size
- 3x faster inference for TorchScript, 10% faster inference for ONNX;
- Now TorchScript is as fast as ONNX;
- Model size is 2x larger, 2MB vs. 1MB;
Quality
- The VAD supports more than 6,000 languages now;
- Significantly more robust on noisy data;
- Overall 5-7% quality increase on clean data;
- Quality difference for 8 kHz and 16 kHz is negligible now;
- Quality difference for different window sizes is negligible => window size was deprecated;
- Added benchmarks on 9 unique datasets (2 private) and one holistic multi-domain dataset;
Changes and deprecations
- ONNX opset 16;
- window_size_samples is deprecated: the VAD now only works with a fixed-size window;
- The VAD now works with 8 kHz and 16 kHz sample rates, only with fixed 256- and 512-sample windows respectively;
- Slightly changed internal logic: some context (part of the previous chunk) is now passed along with the current chunk;
- Sample rates that are a multiple of 16 kHz are still supported;
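
The fixed-window and context changes above can be pictured with a small sketch. It is not the library's code: normalize_rate, iter_windows, the context_size parameter and its default of 64, the zero-padding of the last chunk, and the naive "keep every k-th sample" handling of rates that are multiples of 16 kHz are all assumptions made for illustration. Only the window sizes (256 samples at 8 kHz, 512 at 16 kHz) come from the notes.

```python
from typing import Iterator, List, Sequence, Tuple

# Window sizes stated in the release notes.
WINDOW_SAMPLES = {8000: 256, 16000: 512}


def normalize_rate(samples: Sequence[float], sample_rate: int) -> Tuple[List[float], int]:
    """Map the input to 8 kHz or 16 kHz (hypothetical helper).

    Rates that are a multiple of 16 kHz are reduced by keeping every
    k-th sample. This is a naive decimation chosen for illustration.
    """
    if sample_rate in WINDOW_SAMPLES:
        return list(samples), sample_rate
    if sample_rate > 0 and sample_rate % 16000 == 0:
        step = sample_rate // 16000
        return list(samples[::step]), 16000
    raise ValueError(f"unsupported sample rate: {sample_rate}")


def iter_windows(
    samples: Sequence[float], sample_rate: int, context_size: int = 64
) -> Iterator[List[float]]:
    """Yield fixed-size windows, each prefixed with the tail of the previous one.

    context_size is an assumed value; the notes only say that "some context"
    from the previous chunk is passed along.
    """
    samples, sample_rate = normalize_rate(samples, sample_rate)
    window = WINDOW_SAMPLES[sample_rate]
    if not 0 <= context_size <= window:
        raise ValueError("context_size must be between 0 and the window size")

    context = [0.0] * context_size  # nothing precedes the first chunk
    for start in range(0, len(samples), window):
        chunk = list(samples[start:start + window])
        chunk += [0.0] * (window - len(chunk))  # zero-pad a short final chunk (assumption)
        yield context + chunk
        context = chunk[window - context_size:]  # tail of this chunk feeds the next one


if __name__ == "__main__":
    audio = [0.0] * 1100  # about 69 ms of silence at 16 kHz
    for i, frame in enumerate(iter_windows(audio, 16000)):
        print(i, len(frame))
```

Each yielded frame has context_size + window samples, so a model fed this way sees a little of the previous chunk without the caller choosing a window size.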