v.0.9.0
Release date: 2020-08-01 10:10:14
Latest espnet/espnet release: v.202409 (2024-10-01 14:28:01)
New Features
- [New Features][ASR] Non-autoregressive ASR with Mask CTC #2070 by @YosukeHiguchi
- [New Features][ASR] Support Conformer model. #2144 by @pengchengguo
- [New Features][ASR][ST] CTC posterior visualization during training #2221 by @hirofumi0810
- [New Features][ESPnet2] Implement espnet2.bin.zenodo_upload #2168 by @kamo-naoyuki
- [New Features][ESPnet2] Python API for inference #2092 by @kamo-naoyuki
- [New Features][ESPnet2] Support TTS-Transformer in ESPnet2 #2134 by @kan-bayashi
- [New Features][ESPnet2][ASR] Enable batch joint decoding with CTC in recog API v2 #2197 by @takaaki-hori
- [New Features][ESPnet2][SE] Speech Enhancement Frontend for ESPNet2 Phase 1 #2124 by @LiChenda
- [New Features][ESPnet2][TTS] Support FastSpeech for ESPnet2 TTS #2149 by @kan-bayashi
- [New Features][ESPnet2][TTS] Support FastSpeech2 (+FastPitch) #2218 by @kan-bayashi
- [New Features][ESPnet2][TTS] Support GST in ESPnet2 TTS #2139 by @kan-bayashi
- [New Features][README][ASR] CTC forced alignment in E2E ASR Transformer model #2095 by @simpleoier
- [New Features][VC] Voice Transformer Network #2064 by @unilight
Enhancement
- [Enhancement] Fix error when downloading large files using download_from_google_drive.sh #2074 by @unilight
- [Enhancement][ASR] added more beam search info #2130 by @sw005320
- [Enhancement][ESPnet2] Change packed file of espnet2 to zip format #2161 by @kamo-naoyuki
- [Enhancement][ESPnet2] Make read_text faster #2114 by @kamo-naoyuki
- [Enhancement][ESPnet2] RESULTS.md -> README.md #2077 by @kamo-naoyuki
- [Enhancement][ESPnet2] Remove long wave in template recipe #2075 by @kamo-naoyuki
- [Enhancement][ESPnet2] Update ESPnet2 JSUT TTS recipe and TTS template #2110 by @kan-bayashi
- [Enhancement][MT][ST] Fix ST/MT models for compatibility with ASR #2179 by @hirofumi0810
- [Enhancement][ST] Add source case information to json files in ST task #2208 by @hirofumi0810
- [Enhancement][ST] Refactor multi-task learning in ST #2202 by @hirofumi0810
Recipe
- [Recipe][ASR] Add aidatatang_200zh recipe #2122 by @nzhoward
- [Recipe][ASR] Add chime6 info #2250 by @sw005320
- [Recipe][ASR] CHiME-6 recipe #2171 by @GNroy
- [Recipe][ASR] Fix a bug in espnet wsj recipe. #2145 by @houwenxin
- [Recipe][ASR] New Recipe for Yoloxóchitl-Mixtec (SLR89) #2085 by @ftshijt
- [Recipe][ASR] Support averaging model for Conformer. #2244 by @pengchengguo
- [Recipe][ASR] Updated model after tuning aidatatang_200zh recipe #2204 by @nzhoward
- [Recipe][ASR] created a recipe to run asr on ljspeech #1996 by @ibkuroyagi
- [Recipe][ASR] update model link (add pre-trained bpe model and lm model) #2101 by @ftshijt
- [Recipe][ESPnet2][ASR] espnet2 librispeech recipe #2109 by @sw005320
- [Recipe][ESPnet2][ASR] espnet2 librispeech v2 #2189 by @sw005320
- [Recipe][ESPnet2][ASR] update espnet2 aishell results #2150 by @Cescfangs
- [Recipe][ESPnet2][ASR][TTS] fix dev_set/eval_sets issues #2142 by @sw005320
- [Recipe][ESPnet2][TTS] Add ESPnet2 CSMSC TTS recipe #2129 by @kan-bayashi
- [Recipe][ESPnet2][TTS] Add ESPnet2 LJSpeech recipe #2117 by @kan-bayashi
- [Recipe][ESPnet2][TTS] Add VCTK recipe for ESPnet2 TTS #2165 by @kan-bayashi
- [Recipe][ESPnet2][TTS] Create espnet2 jsut/tts recipe #2047 by @kamo-naoyuki
Refactoring
- [Refactoring][ESPnet2] Change stats_dir naming not to overwrite #2111 by @kan-bayashi
- [Refactoring][ESPnet2] Move modules #2086 by @kamo-naoyuki
- [Refactoring][ESPnet2] Remove $KALDI_ROOT/tools/env.sh from path.sh #2242 by @kamo-naoyuki
- [Refactoring][ESPnet2] Several updates for pretrained model #2212 by @kamo-naoyuki
- [Refactoring][ESPnet2] Update Makefile #2225 by @kamo-naoyuki
Documentation
- [README] Fix URL in README #2090 by @kan-bayashi
- [README] Update README about TTS #2079 by @kan-bayashi
- [README] Update README.md #2046 by @kamo-naoyuki
- [README] Update README.md #2067 by @kamo-naoyuki
- [README] Update README.md #2243 by @kamo-naoyuki
- [README] Update citation #2206 by @hirofumi0810
- [README] Update installation.md #2233 by @kamo-naoyuki
- [README][ESPnet2] Update egs2/TEMPLATE/README.md #2098 by @kamo-naoyuki
Bugfix
- [Bugfix] Add cupy.done in make python #2091 by @kan-bayashi
- [Bugfix] Append a missing space in cmd-line args in utils/dump_pcm.sh #2209 by @yistLin
- [Bugfix] Fix Makefile #2097 by @kamo-naoyuki
- [Bugfix] Fix minor bug of Makefile #2055 by @kamo-naoyuki
- [Bugfix] Fix old model compatibility #2048 #2060 #2063 by @kan-bayashi
- [Bugfix] Fix pretrained model #2053 #2069 by @kan-bayashi
- [Bugfix] Fix pyopenjtalk installation #2108 by @kan-bayashi
- [Bugfix] Fix typo in run.sh of TTS recipes #2216 by @hirofumi0810
- [Bugfix] Update Makefile to disable cupy for cuda=10.2 or later #2230 by @kamo-naoyuki
- [Bugfix] fix path of PESQ #2058 by @kamo-naoyuki
- [Bugfix] scorerinterface warning English correction #2076 by @qmpzzpmq
- [Bugfix][CI] Fix bug in attention plotting #2185 by @hirofumi0810
- [Bugfix][CI] Freeze the matplotlib version with 3.1.0 #2181 by @sw005320
- [Bugfix][CI] fix integration_test_ctc_align_wav.bats with a small model #2170 by @simpleoier
- [Bugfix][CI] temporarily disable subsample 6 and 8 tests #2205 by @sw005320
- [Bugfix][CI][MT][ST] Add integration test for ST/MT tasks #2210 by @hirofumi0810
- [Bugfix][ESPnet2] Add missing path.sh in egs2/vctk/tts1 #2167 by @kan-bayashi
- [Bugfix][ESPnet2] Fix TTS inference #2222 by @kan-bayashi
- [Bugfix][ESPnet2] Fix tts_inference when feats_extract is None #2176 by @kan-bayashi
- [Bugfix][ESPnet2] Fix bug for feats_type=extracted #2087 by @kamo-naoyuki
- [Bugfix][ESPnet2] Fix bug of iterable dataset when num_workers>=1 #2081 by @kamo-naoyuki
- [Bugfix][ESPnet2] Fix bug when espnet2/bin/tokenize_text.py --cutoff or --vocabulary_size is used #2158 by @kamo-naoyuki
- [Bugfix][ESPnet2] Fix log: benchmark -> deterministic #2080 by @kamo-naoyuki
- [Bugfix][ESPnet2] Implement configargparse in espnet2 #2157 by @kamo-naoyuki
- [Bugfix][ESPnet2] Select torchaudio version according to torch version #2214 by @kamo-naoyuki
- [Bugfix][ESPnet2] avoid UnboundLocalError when lm is not loaded #2227 by @kamo-naoyuki
- [Bugfix][ESPnet2] fix #2050 #2051 by @kamo-naoyuki
- [Bugfix][ESPnet2] fix #2198: PhonemeTokenizer can't perform with multiprocessing #2201 by @kamo-naoyuki
- [Bugfix][ESPnet2] fix best_model_criterion: wsj/asr1/conf/tuning/train_lm.yaml #2153 by @kamo-naoyuki
- [Bugfix][ESPnet2] fix bug of lm.py #2056 by @kamo-naoyuki
- [Bugfix][ESPnet2] fix the stage number: enh.sh #2220 by @kamo-naoyuki
- [Bugfix][ESPnet2] fix: decode_config -> inference_config #2239 by @kamo-naoyuki
- [Bugfix][ESPnet2][Recipe] Not removing short/long utterances for eval_sets #2112 by @kamo-naoyuki
- [Bugfix][ESPnet2][SE] Fix bugs in espnet2/enh and format related directory structures #2215 by @Emrys365
- [Bugfix][ESPnet2][TTS] Fix feature extractor of TTS for compatibility #2102 by @kamo-naoyuki
Acknowledgements
Special thanks to @Cescfangs, @Emrys365, @GNroy, @LiChenda, @YosukeHiguchi, @ftshijt, @hirofumi0810, @houwenxin, @ibkuroyagi, @kamo-naoyuki, @kan-bayashi, @nzhoward, @pengchengguo, @qmpzzpmq, @simpleoier, @sw005320, @takaaki-hori, @unilight, @yistLin.