v0.2.0
Release date: 2023-02-20 10:22:02
Latest release of modelscope/FunASR: v0.3.0 (2023-03-16 16:15:02)
What's new:
2023.2.17, funasr-0.2.0, modelscope-1.3.0
- We support a new feature: exporting Paraformer models from ModelScope to ONNX and TorchScript. Locally fine-tuned models are also supported.
- We support a new feature, onnxruntime: you can deploy the runtime without ModelScope or FunASR. For the paraformer-large model, onnxruntime gives roughly a 3x speedup in rtf (0.110->0.038) on CPU (details).
- We support a new feature, grpc: you can build an ASR service with grpc by deploying either the ModelScope pipeline or onnxruntime.
- We release a new model, paraformer-large-contextual, which supports hotword customization based on incentive enhancement and improves the recall and precision of hotwords.
- We optimize the timestamp alignment of Paraformer-large-long. Timestamp prediction accuracy is much improved, reaching an accumulated average shift (AAS) of 74.7 ms (details).
- We release a new model, an 8k VAD model, which can predict the duration of non-silence speech. It can be freely combined with any ASR model in ModelScope.
- We release a new model, MFCCA, a multi-channel multi-speaker model which is independent of the number and geometry of microphones and supports Mandarin meeting transcription.
- We release several new UniASR models: Southern Fujian Dialect, French, German, Vietnamese, and Persian.
- We release a new model, paraformer-data2vec: a model pretrained without supervision on AISHELL-2, used to initialize a Paraformer model that is then fine-tuned on AISHELL-1.
- We release a new feature: the VAD, ASR, and PUNC models can be combined freely, using models from ModelScope or locally fine-tuned models (demo).
- We optimize the general punctuation model, improving recall and precision and fixing bad cases of missing punctuation marks.
- The ModelScope inference pipeline now supports several new audio input formats, including mp3, flac, ogg, and opus.
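The onnxruntime item above quotes a real-time factor (rtf) change of 0.110->0.038. As a minimal sketch, assuming the usual definition of rtf as processing time divided by audio duration, the arithmetic behind the "3x" figure looks like this. The function names and the 60-second clip are illustrative only and are not part of FunASR.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent decoding / duration of the audio that was decoded."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def speedup(rtf_before: float, rtf_after: float) -> float:
    """How many times faster decoding became, given two RTF values."""
    if rtf_after <= 0:
        raise ValueError("rtf_after must be positive")
    return rtf_before / rtf_after


# RTF figures quoted in the release notes.
RTF_BEFORE = 0.110
RTF_AFTER = 0.038

ratio = speedup(RTF_BEFORE, RTF_AFTER)
print(f"speedup: {ratio:.2f}x")  # speedup: 2.89x

# Hypothetical 60-second clip: decoding time implied by each RTF value.
clip_seconds = 60.0
print(f"before: {RTF_BEFORE * clip_seconds:.2f} s, after: {RTF_AFTER * clip_seconds:.2f} s")
```

An rtf below 1.0 means decoding runs faster than real time, so both figures describe a system that keeps up with live audio on the tested CPU.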
Latest updates:
- February 2023 (released February 17): funasr-0.2.0, modelscope-1.3.0
- Feature improvements:
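- Added model export: all Paraformer models in ModelScope, as well as locally fine-tuned models, can be exported in one step to ONNX and TorchScript formats for deployment.
- Added onnxruntime deployment for Paraformer models, which requires neither ModelScope nor FunASR to be installed. In CPU tests, onnxruntime inference was nearly 3x faster (rtf: 0.110->0.038).
- Added a grpc service: both the ModelScope inference pipeline and onnxruntime can be deployed as a service.
- Optimized timestamps for the Paraformer-large long-audio model: timestamp prediction accuracy on bad cases improved substantially, with an average start/end timestamp shift of 74.7 ms. See the paper for details.
- Added free combination of any VAD model, ASR model, and punctuation model: any models from ModelScope and locally fine-tuned models can be combined for inference (usage example).
- Optimized the general punctuation model: improved punctuation recall and precision and fixed missing-punctuation issues.
- Added sample-rate adaptation: audio at any input sample rate is automatically matched to the model's sample rate. Added support for more audio formats, such as mp3, flac, ogg, and opus.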
- New models:
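- Paraformer-large hotword model: supports hotword customization. Given a hotword list, it applies incentive enhancement to the hotwords and improves the model's hotword recall.
- MFCCA multi-channel multi-speaker recognition model: a multi-channel speech recognition model based on a multi-frame cross-channel attention mechanism, from a paper written in collaboration with the Audio, Speech and Language Processing Group at Northwestern Polytechnical University.
- 8k voice activity detection (VAD) model: detects the start and end times of valid speech within long audio segments. It supports streaming input, with a minimum input chunk of 10 ms of audio.
- UniASR unified streaming and offline models: 16k UniASR Southern Fujian Dialect (Minnan), 16k UniASR French, 16k UniASR German, 16k UniASR Vietnamese, 16k UniASR Persian.
- Paraformer model with unsupervised Data2vec pretraining: a model pretrained without supervision using the Data2vec architecture provides the initial weights, and the Paraformer model is then fine-tuned on AISHELL-1 data.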
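The sample-rate adaptation item above says that input audio at any sample rate is matched to the model's rate. The notes do not say how FunASR does this, so the following is only an illustrative sketch of the general idea using plain linear interpolation. The function `resample_linear` is hypothetical, and a production resampler would also apply a low-pass filter before downsampling to avoid aliasing.

```python
from typing import List, Sequence


def resample_linear(samples: Sequence[float], src_rate: int, dst_rate: int) -> List[float]:
    """Resample a mono signal by linear interpolation (no anti-aliasing filter)."""
    if src_rate <= 0 or dst_rate <= 0:
        raise ValueError("sample rates must be positive")
    if not samples or src_rate == dst_rate:
        return list(samples)

    # Number of output samples covering the same duration at the new rate.
    n_out = max(1, round(len(samples) * dst_rate / src_rate))
    step = src_rate / dst_rate  # source samples advanced per output sample
    last = len(samples) - 1

    out: List[float] = []
    for i in range(n_out):
        pos = i * step
        lo = min(int(pos), last)   # nearest source index at or before pos
        hi = min(lo + 1, last)     # next source index, clamped at the end
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out


# Hypothetical 8 kHz input matched to a 16 kHz model: the length doubles.
upsampled = resample_linear([0.0, 1.0, 2.0, 3.0], src_rate=8000, dst_rate=16000)
print(len(upsampled))  # 8
```

In this sketch the output length is derived from the duration of the input, so a clip keeps the same length in seconds regardless of which direction the rate changes.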
New Contributors
- @zjc6666 made their first contribution in https://github.com/alibaba-damo-academy/FunASR/pull/35
- @lyblsgo made their first contribution in https://github.com/alibaba-damo-academy/FunASR/pull/37
- @lingyunfly made their first contribution in https://github.com/alibaba-damo-academy/FunASR/pull/42
- @fangd123 made their first contribution in https://github.com/alibaba-damo-academy/FunASR/pull/44
- @dyyzhmm made their first contribution in https://github.com/alibaba-damo-academy/FunASR/pull/48
- @R1ckShi made their first contribution in https://github.com/alibaba-damo-academy/FunASR/pull/50
- @chenmengzheAAA made their first contribution in https://github.com/alibaba-damo-academy/FunASR/pull/57
- @ZhihaoDU made their first contribution in https://github.com/alibaba-damo-academy/FunASR/pull/95
- @SWHL made their first contribution in https://github.com/alibaba-damo-academy/FunASR/pull/97
- @yufan-aslp made their first contribution in https://github.com/alibaba-damo-academy/FunASR/pull/105
- @magicharry made their first contribution in https://github.com/alibaba-damo-academy/FunASR/pull/119
Full Changelog: https://github.com/alibaba-damo-academy/FunASR/compare/v0.1.6...v0.2.0