v.0.7.0
Release date: 2020-05-24 15:01:16
Latest release of espnet/espnet: v.202409 (2024-10-01 14:28:01)
The ESPnet project is moving on to a new endeavor! We have launched espnet2, which aims to improve modularity (Chainer-free and Kaldi-free), provide a more customizable trainer, support distributed training, and scale better. The effort is led mainly by @kamo-naoyuki, whose hard work and leadership made it possible. espnet2 is one of the outcomes of the ESPnet hackathon held in Tokyo in 2019, which included extensive discussion of the design, new features, and community contributions. It currently supports the main ASR recipes (built on a well-designed recipe template) and a limited set of TTS recipes. We will maintain both espnet1 and espnet2 while gradually shifting development to espnet2. This further accelerates the ESPnet project!
ESPnet2
- [ESPnet2] keep the latest model #1769 by @kamo-naoyuki
- [ESPnet2] Remove "E2E" from all comments #1766 by @kamo-naoyuki
- [ESPnet2] Refactoring for ESPnetDataset #1758 by @kamo-naoyuki
- [ESPnet2] Implement SpecAug for ESPnet2 #1746 by @kamo-naoyuki
- [ESPnet2] Implement BatchBinSampler #1742 by @kamo-naoyuki
- [ESPnet2] Support torch_optimizer #1739 by @kamo-naoyuki
- [ESPnet2] Log rotation for launch.py #1737 by @kamo-naoyuki
- [ESPnet2] Change the type of --chunk_length to str_or_int #1733 by @kamo-naoyuki
- [ESPnet2] Change cudnn deterministic mode to default #1732 by @kamo-naoyuki
- [ESPnet2] Add wsj results for espnet2 #1724 by @kamo-naoyuki
- [ESPnet2] Show estimated time to finish #1717 by @kamo-naoyuki
- [ESPnet2] Add --name option for training job #1714 by @kamo-naoyuki
- [ESPnet2] Show the log file when training process is failed: espnet2.bin.launch.py #1713 by @kamo-naoyuki
- [ESPnet2] --max_length -> --fold_length #1712 by @kamo-naoyuki
- [ESPnet2] Double quotes for NCCL_SOCKET_IFNAME #1706 by @kamo-naoyuki
- [ESPnet2] Save apex state in checkpoint and support apex optimizer #1705 by @kamo-naoyuki
- [ESPnet2] Update asr.sh #1694 by @zh794390558
- [ESPnet2] Update ctc.py #1688 by @zh794390558
- [ESPnet2] Update launch.py #1681 by @zh794390558
- [ESPnet2] Update asr.sh #1678 by @zh794390558
- [ESPnet2] --keep_n_best_checkpoints -> --keep_nbest_models #1647 by @kamo-naoyuki
- [ESPnet2] Avoid deprecated warning: reduction="none" #1510 by @kamo-naoyuki
- [ESPnet2] Minor change for speed perturbation #1627 by @kamo-naoyuki
- [ESPnet2] Fix how2 recipe #1620 by @kamo-naoyuki
- [ESPnet2] Fix recipes #1617 by @kamo-naoyuki
- [ESPnet2] Renaming #1610 by @kamo-naoyuki
- [ESPnet2] Implement chunk iterator #1608 by @kamo-naoyuki
- [ESPnet2] Update voxforge RESULTS #1601 by @kamo-naoyuki
- [ESPnet2] vivos recipe: --audio_format wav #1592 by @kamo-naoyuki
- [ESPnet2] Lower python requirements to 3.6 #1565 by @kamo-naoyuki
- [ESPnet2] dirha_wsj recipe for espnet2 #1556 by @yuekaizhang
- [ESPnet2] Update AISHELL ASR Recipe #1549 by @Emrys365
- [ESPnet2] Remove short data #1531 by @kamo-naoyuki
- [ESPnet2] [WIP] Update JSUT ASR Recipe #1529 by @YosukeHiguchi
- [ESPnet2] Update HOW2 recipe #1522 by @b-flo
- [ESPnet2] [WIP] Update CSJ ASR Recipe #1520 by @YosukeHiguchi
- [ESPnet2] Change NoamLR to deprecated and implement WarmupLR #1519 by @kamo-naoyuki
- [ESPnet2] Implement --max_cache_size option #1509 by @kamo-naoyuki
- [ESPnet2] distributed training #1506 by @kamo-naoyuki
- [ESPnet2] ESPnet2 Recipe Update -- commonvoice, babel, ami #1504 by @ftshijt
- [ESPnet2] Refactoring #1494 by @kamo-naoyuki
- [ESPnet2] Fix ci of flake8 part #1491 by @kamo-naoyuki
- [ESPnet2] Tensorboard, --num_iters_per_epoch, etc. #1487 by @kamo-naoyuki
- [ESPnet2] Fix espnet2.bin.pack #1486 by @kamo-naoyuki
- [ESPnet2] show_result.sh #1478 by @kamo-naoyuki
- [ESPnet2] Pack and Unpack model #1477 by @kamo-naoyuki
- [ESPnet2] collect-stats mode, trainer class, etc. #1462 by @kamo-naoyuki
- [ESPnet2] add test codes for asr decoders #1445 by @kamo-naoyuki
- [ESPnet2] Integrate Griffin-Lim with tts_decode() #1442 by @kan-bayashi
- [ESPnet2] Update ASR recipe #1439 by @kan-bayashi
- [ESPnet2] Update TTS recipes #1430 by @kan-bayashi
- [ESPnet2] Disable wer/cer calculation when training #1547 by @kamo-naoyuki
- [ESPnet2] Change CTC default to builtin #1546 by @kamo-naoyuki
- [ESPnet2] Update chime4 asr1 Recipe #1570 by @yuekaizhang
- [ESPnet2] Create documentation for espnet2 #1710 by @kamo-naoyuki
- [ESPnet2] shellcheck for local/data.sh #1524 by @kamo-naoyuki
- [ESPnet2] commonvoice: RESULTS.md -> README.md #1797 by @kamo-naoyuki
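
Two entries in the list above rename command-line options (`--max_length` to `--fold_length` in #1712 and `--keep_n_best_checkpoints` to `--keep_nbest_models` in #1647). The following is a minimal sketch of how an existing argument list could be updated to the new names. `RENAMED_OPTIONS` and `migrate_args` are hypothetical helper names used only for illustration and are not part of ESPnet. The sketch only rewrites option names, on the assumption that the values keep their meaning, which these notes do not state.

```python
# Option renames listed in these release notes (old name -> new name).
# This mapping and the helper below are illustrative, not ESPnet code.
RENAMED_OPTIONS = {
    "--max_length": "--fold_length",
    "--keep_n_best_checkpoints": "--keep_nbest_models",
}


def migrate_args(args):
    """Return a copy of args with renamed options replaced.

    Handles both the "--opt value" and the "--opt=value" forms.
    Arguments that are not in RENAMED_OPTIONS pass through unchanged.
    """
    migrated = []
    for arg in args:
        # Split off an inline "=value" part, if there is one.
        name, sep, value = arg.partition("=")
        new_name = RENAMED_OPTIONS.get(name, name)
        migrated.append(new_name + sep + value)
    return migrated


if __name__ == "__main__":
    old = ["--max_length", "150", "--keep_n_best_checkpoints=10"]
    print(migrate_args(old))
```

Applying the same mapping to YAML configuration files or shell scripts would need separate handling; check the linked pull requests before relying on it.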
Bugfix
- [Bugfix] % -> percent: espnet2/tasks/abs_task.py #1767 by @kamo-naoyuki
- [Bugfix] Fix gpu mode for tts_inference.py #1755 by @kamo-naoyuki
- [Bugfix] Fix SubReporter #1748 by @kamo-naoyuki
- [Bugfix] Fix calculate_all_attentions for espnet2 #1747 by @kamo-naoyuki
- [Bugfix] Not to create the averaged model if --keep_nbest_models=1 #1744 by @kamo-naoyuki
- [Bugfix] Fix --best_model_criterions #1743 by @kamo-naoyuki
- [Bugfix] Fix the gpu device when resuming #1731 by @kamo-naoyuki
- [Bugfix] Fix error log for espnet2/bin/launch.py #1730 by @kamo-naoyuki
- [Bugfix] Disable CUDNN deterministic for CTC: espnet2/asr/ctc.py #1720 by @kamo-naoyuki
- [Bugfix] Update default.py #1698 by @zh794390558
- [Bugfix] Fix chunk iterator and refactoring for distributed training #1685 by @kamo-naoyuki
- [Bugfix] Update vgg_rnn_encoder.py #1676 by @zh794390558
- [Bugfix] [ESPnet2] chmod +x: run.sh for JSUT #1628 by @kamo-naoyuki
- [Bugfix] [ESPnet2] Remove nlsyms when word scoring #1614 by @kamo-naoyuki
- [Bugfix] [ESPnet2] Fix setup.sh #1596 by @kamo-naoyuki
- [Bugfix] [ESPnet2] Fix launch.py for slurm #1588 by @kamo-naoyuki
- [Bugfix] [ESPnet2] Fix ci for local/data.sh #1572 by @kamo-naoyuki
- [Bugfix] [ESPnet2] Fix nj of scripts/audio/format_wav_scp.sh #1550 by @kamo-naoyuki
- [Bugfix] [ESPnet2] Use load_scp_sequential in format_wav_scp.py #1541 by @kamo-naoyuki
- [Bugfix] [ESPnet2] Minor fix for CSJ recipe #1540 by @kamo-naoyuki
- [Bugfix] [ESPnet2] Fix transformer #1539 by @kamo-naoyuki
- [Bugfix] [ESPnet2] fix rnn_type when bidirectional is used #1533 by @kamo-naoyuki
- [Bugfix] [ESPnet2] Fix format_wav_scp.py #1532 by @kamo-naoyuki
- [Bugfix] [ESPnet2] Fix bug of using GPU even if CPU mode #1526 by @kamo-naoyuki
- [Bugfix] [ESPnet2] Fix --accum_grad #1525 by @kamo-naoyuki
- [Bugfix] [ESPnet2] Fix voxforge config #1511 by @kamo-naoyuki
- [Bugfix] [ESPnet2] Bug fix of splitting files for collect_stats mode #1505 by @kamo-naoyuki
- [Bugfix] fix to use queue.conf #1431 by @sw005320
- [Bugfix] [ESPnet2] Fix a bug in TTS #1428 by @kan-bayashi
- [Bugfix] [ESPnet2] Refactor Encoder and Decoder and bug fix #1427 by @kamo-naoyuki
- [Bugfix] [ESPnet2] Fix bug of text-chars converter #1426 by @kamo-naoyuki
- [Bugfix] Optionize trans_type in egs/ljspeech/tts2 #1789 by @kan-bayashi
- [Bugfix] bugfix in ljspeech/tts2 #1783 by @beckgom
- [Bugfix] missing argument for local/data_prep.sh added #1782 by @beckgom
- [Bugfix] avoid sentencepiece==0.1.90 #1923 by @kamo-naoyuki
- [Bugfix] FIX E523,E541,E741 #1918 by @kamo-naoyuki
- [Bugfix] fix reverse option for cmvn #1906 by @magictron
- [Bugfix] Error handling for Transformer with CTC-based VAD #1875 by @takenori-y
- [Bugfix] Revert deletion of init files #1842 by @Fhrozen
- [Bugfix] fix the missing link of tedlium3 #1841 by @sw005320
- [Bugfix] Add test for torch>1.1 #1840 by @kamo-naoyuki
- [Bugfix] Fix #1808: change the argument order of --batch_type for collect stat… #1810 by @kamo-naoyuki
- [Bugfix] Change to configargparse>=1.2.1 #1803 by @kamo-naoyuki
- [Bugfix] typo fixed for attention type #1793 by @beckgom
- [Bugfix] fix https://github.com/espnet/espnet/issues/1780 #1784 by @qmeeus
- [Bugfix] Fix bug of espnet2 asr_inference.py #1952 by @kamo-naoyuki
- [Bugfix] Minor fix of import place and comments #1959 by @kan-bayashi
New Features
- [New Features] Add utils/translate_wav.sh #1530 by @ShigekiKarita
- [New Features] Batch beam search V2 for Transformer (no CTC) #1402 by @ShigekiKarita
Enhancement
- [Enhancement] Support multiple sentences in synth_wav.sh #1788 by @kan-bayashi
- [Enhancement] fix+update transducer #1760 by @b-flo
Documentation
- [Documentation] Update notebook #1963 by @kan-bayashi
- [Documentation] Update installation manual #1960 by @kan-bayashi
- [Documentation] Update installation.md #1957 by @kamo-naoyuki
- [Documentation] Add note in synth_wav.sh #1785 by @kan-bayashi
- [Documentation] Update docs #1954 #1955 by @kamo-naoyuki
- [Documentation] Update docs #1938 by @kamo-naoyuki
- [Documentation] docs: added fbank link to the experiment readme #1910 by @kdubovikov
Recipe
- [Recipe] Added some TIMIT results #1819 by @sknadig
- [Recipe] add recipe for French Polyphone: ELRA-S0030_02 #1711 by @AdolfVonKleist
- [Recipe] Use espnet_tts_frontend #1794 by @kamo-naoyuki
CI
- [CI] Use cache in actions #1917 by @ShigekiKarita
- [CI] Apply black #1850 by @kamo-naoyuki
- [CI] Create .mergify.yml #1813 by @kamo-naoyuki
Acknowledgements
Special thanks to @AdolfVonKleist, @Emrys365, @Fhrozen, @ShigekiKarita, @YosukeHiguchi, @beckgom, @b-flo, @ftshijt, @kamo-naoyuki, @kan-bayashi, @kdubovikov, @magictron, @qmeeus, @sknadig, @sw005320, @takenori-y, @yuekaizhang, @zh794390558