v1.7.0
Release date: 2022-03-02 08:57:00
Latest NVIDIA/NeMo release: r2.0.0rc1 (2024-08-16 05:55:14)
Known Issues
- Megatron GPT training with O2 and FP16 is broken. FP16 with O1 still works.
- find_unused_parameters should be set to False when training GPT, see #3837
- FastPitch training may result in stalled GPUs. Users will have to manually kill their runs and continue training from the latest checkpoint.
- mT5 issue with whole word masking, see #3776
- T5 finetuning config issue, see #3776
Container
NOTE: From NeMo 1.7.0 onwards, NeMo containers follow the YY.MM convention for naming, where the YY.MM value is based on the base container. For additional information regarding NeMo containers, see: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo
docker pull nvcr.io/nvidia/nemo:22.01
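For scripts that need to compare container versions, the YY.MM tag can be split into a year and a month. The following is only a minimal sketch: `parse_nemo_tag` is a hypothetical helper, not part of NeMo or NGC tooling, and it assumes the tag is exactly two digits, a dot, and two digits, as in the `22.01` tag above.

```python
import re
from typing import Tuple

# Assumption: tags look exactly like "22.01" (two-digit year, two-digit month).
_TAG_RE = re.compile(r"^(?P<yy>\d{2})\.(?P<mm>\d{2})$")


def parse_nemo_tag(image: str) -> Tuple[int, int]:
    """Hypothetical helper: return (year, month) for a reference like
    'nvcr.io/nvidia/nemo:22.01'."""
    # Everything after the last ':' is treated as the tag.
    _, sep, tag = image.rpartition(":")
    if not sep:
        raise ValueError(f"no tag in image reference: {image!r}")
    match = _TAG_RE.match(tag)
    if match is None:
        raise ValueError(f"tag is not in YY.MM form: {tag!r}")
    month = int(match.group("mm"))
    if not 1 <= month <= 12:
        raise ValueError(f"month out of range in tag: {tag!r}")
    # Two-digit years are assumed to fall in the 2000s.
    return 2000 + int(match.group("yy")), month


year, month = parse_nemo_tag("nvcr.io/nvidia/nemo:22.01")
```

Because the result is a tuple of integers, two tags can be ordered with ordinary comparison operators.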
ASR
- Wav2vec by @tbartley94 :: PR: #3297
- Fix bug in multi-checkpoint loading by @sam1373 :: PR: #3536
- Add HuggingFace Datasets to NeMo ASR Dataset script by @titu1994 :: PR: #3513
- Add support for Gradient Clipping (clamp) in RNNT Numba loss by @titu1994 :: PR: #3550
- Enable Tarred Dataset Support for NVIDIA DALI by @titu1994 :: PR: #3485
- Add initial support for Buffered RNNT Scripts by @titu1994 :: PR: #3602
- Significantly speed up RNNT loss on CUDA by @titu1994 :: PR: #3653
- Fixing the bug in the stateful rnnt decoder. by @VahidooX :: PR: #3673
- Add Buffered RNNT with LCS Merge algorithm by @titu1994 :: PR: #3669
- Asr noise data scripts by @jbalam-nv :: PR: #3660
- ASR SSL update by @sam1373 :: PR: #3746
- Add randomized bucketing by @VahidooX :: PR: #3445
- Self-supervised tutorial & update by @sam1373 :: PR: #3344
- Updated conformer models. by @VahidooX :: PR: #3741
- Added speaker identification script with cosine and neural classifier… by @nithinraok :: PR: #3672
- Fix in clustering diarizer by @nithinraok :: PR: #3701
- Add a function that writes cluster label in diarization pipeline by @tango4j :: PR: #3643
TTS
- port UnivNet to NeMo TTS collection by @L0SG :: PR: #3186
- E2E TTS fixes by @redoctopus :: PR: #3508
- New structure for TTS datasets in scripts/dataset_processing, VocoderDataset, update TTSDataset by @Oktai15 :: PR: #3484
- Deprecate some TTS models and TTS datasets by @Oktai15 :: PR: #3576
- Fix bugs in HiFi-GAN (scheduler, optimizers) and add input_example() in Mixer-TTS/Mixer-TTS-X by @Oktai15 :: PR: #3564
- Update UnivNet, HiFi-GAN and WaveGlow, small fixes in Mixer-TTS, FastPitch and Exportable by @Oktai15 :: PR: #3585
- Fix typo in FastPitch config (pitch_avg -> pitch_mean) by @eyentei :: PR: #3593
- Fix incorrect usage of TTSDataset in some files and fix one-line bug in NVIDIA's CMUDict by @Oktai15 :: PR: #3594
- Convert entry from UTF-16 to UTF-8 by @redoctopus :: PR: #3597
- remove CheckInstall by @blisc :: PR: #3577
- Fix UnivNet LibriTTS pretrained location by @m-toman :: PR: #3615
- FastPitch training tutorial by @subhankar-ghosh :: PR: #3631
- Update Aligner, add new methods to AlignmentEncoder by @Oktai15 :: PR: #3641
- Add Mixed Representation Training by @blisc :: PR: #3473
- Add speakerID to libritts/get_data.py by @subhankar-ghosh :: PR: #3662
- Update TTS tutorials, Simplification of testing Mixer-TTS and FastPitch by @Oktai15 :: PR: #3680
- Clean FastPitch_Finetuning.ipynb notebook by @Oktai15 :: PR: #3698
- Add cache_size to BetaBinomialInterpolator, fix bugs in TTS tutorials and FastPitch by @Oktai15 :: PR: #3706
- Fix bugs in VocoderDataset and TTSDataset by @Oktai15 :: PR: #3713
- Fix bugs in E2E TTS, Mixer-TTS and FastPitch by @Oktai15 :: PR: #3740
NLP / NMT
- NLPDDPPlugin find_unused_parameters is configurable by @mlgill :: PR: #3478
- Megatron encoder-decoder refactor by @michalivne :: PR: #3542
- Finetuning NeMo Megatron T5 Models on GLUE by @MaximumEntropy :: PR: #3408
- Pipeline parallelism for GPT by @ericharper :: PR: #3388
- Generalized the P-tuning method to support various NLP tasks by @yidong72 :: PR: #3623
- Megatron_LM checkpoint to NeMo checkpoint support by @yidong72 :: PR: #3692
- Bugfix for GPT eval by @ericharper :: PR: #3744
- Yuya/megatron t5 glue eval by @yaoyu-33 :: PR: #3751
- Enforce legacy tokenizer for sentencepiece to add special tokens for T5 by @MaximumEntropy :: PR: #3457
- Added P-Tuning method by @yidong72 :: PR: #3488
- O2 style mixed precision training for T5 by @MaximumEntropy :: PR: #3664
- LM adapted T5 dataset by @MaximumEntropy :: PR: #3654
- Fix consumed samples calculation + PTune Model bugs by @yidong72 :: PR: #3738
- Add pipeline support to eval methods by @ericharper :: PR: #3684
- XNli benchmark by @yidong72 :: PR: #3693
- Refactor dialogue state tracking for modelling/dataset interoperability by @Zhilin123 :: PR: #3526
- Changes to support mean n-gram size masking for T5 by @MaximumEntropy :: PR: #3646
- Dialogue state tracking refactor by @Zhilin123 :: PR: #3667
- Parallel prompt tuning by @vadam5 :: PR: #3670
- GEGLU activation for T5 by @MaximumEntropy :: PR: #3694
Text Normalization / Inverse Text Normalization
- Text normalization takes too much time for a string which contains a lot of dates by @PeganovAnton :: PR: #3451
- ITN bug fixes, ip address, card num support, whitelist clean up by @ekmb :: PR: #3574
- Fix tn bugs by @yzhang123 :: PR: #3580
- add serial number to itn by @yzhang123 :: PR: #3584
- ITN: SH bug fixes for telephone by @ekmb :: PR: #3592
- Tn bug 1.7.0 by @yzhang123 :: PR: #3730
- TN docs update by @ekmb :: PR: #3735
Export
- Update UnivNet, HiFi-GAN and WaveGlow, small fixes in Mixer-TTS, FastPitch and Exportable by @Oktai15 :: PR: #3585
- Conformer onnx fix by @borisfom :: PR: #3524
- Add onnx support for speaker models by @nithinraok :: PR: #3650
- Jasper mask/export fix by @borisfom :: PR: #3691
Bugfixes
- Text normalization takes too much time for a string which contains a lot of dates by @PeganovAnton :: PR: #3451
- Dialogue state tracking refactor/ SGDGEN patch 2 by @Zhilin123 :: PR: #3674
- lower bound PTL to 1.5.10 and remove last ckpt patch fix by @nithinraok :: PR: #3690
Improvements
- Wfst tutorial by @tbartley94 :: PR: #3479
- Update CMUdict with ADLR version pronunciations by @redoctopus :: PR: #3446
- Fix docs by @yzhang123 :: PR: #3523
- Add docstring to UnivNetModel by @L0SG :: PR: #3529
- Increase lower bound due to security vulnerability by @ericharper :: PR: #3537
- Add Change Log builder to NeMo by @titu1994 :: PR: #3527
- Bugfix, need to freeze the model by @yidong72 :: PR: #3540
- Bucketing quick fix by @tbartley94 :: PR: #3543
- More fixes to SentencePiece for T5 by @MaximumEntropy :: PR: #3515
- Update CONTRIBUTING.md by @Oktai15 :: PR: #3569
- Update pr template and re-add Changelog builder by @titu1994 :: PR: #3575
- Apex quick fix by @ekmb :: PR: #3591
- Upgrade to 22.01 container by @ericharper :: PR: #3571
- Fix typo and update minimal version of scipy by @Oktai15 :: PR: #3604
- Add env variable to force transformers to run offline during CI by @ericharper :: PR: #3607
- Correctly install NeMo wheel by @titu1994 :: PR: #3599
- Fix wheel build by @titu1994 :: PR: #3610
- Fixed EH and error reporting in restore_from by @borisfom :: PR: #3583
- Clarifying documentation by @itzsimpl :: PR: #3616
- Improve docs for finetuning by @titu1994 :: PR: #3622
- Add NeMo version to all new .nemo files by @titu1994 :: PR: #3605
- Update numba if NVIDIA_PYTORCH_VERSION not correct by @itzsimpl :: PR: #3614
- Remove @experimental decorator in diarization related files. by @tango4j :: PR: #3625
- Remove compression from .nemo files by @okuchaiev :: PR: #3626
- Update adobe analytics by @ericharper :: PR: #3645
- Add ssl tutorial to tutorial docs page by @sam1373 :: PR: #3649
- Fix number of channels>1 issue by @ekmb :: PR: #3652
- Fixed the bug in bucketing. by @VahidooX :: PR: #3663
- Adding guard by @yzhang123 :: PR: #3655
- Add tutorial paths by @titu1994 :: PR: #3651
- Folder name update by @ekmb :: PR: #3671
- Test HF online for SGD-GEN only by @MaximumEntropy :: PR: #3681
- Update Librosa support to 0.9 by @titu1994 :: PR: #3682
- Comment out numba in 22.01 release by @titu1994 :: PR: #3685
- Fix failing tests inside of the 22.01 container in PR 3571 by @fayejf :: PR: #3609
- Fixed Apex guard when imported classes are used for default values by @michalivne :: PR: #3700
- Update citrinet_512.yaml by @Jorjeous :: PR: #3642
- update torchaudio in Dockerfile to match torch version by @GNroy :: PR: #3637
- Enforce import tests on the three domains by @titu1994 :: PR: #3702
- Audio based norm speed up by @ekmb :: PR: #3703
- Fix device on notebook by @titu1994 :: PR: #3732
- pynini pip by @yzhang123 :: PR: #3729
- Removed fp16 converting in complete method by @dimapihtar :: PR: #3709
- Mirror AN4 while CMU servers are down by @titu1994 :: PR: #3743
- Fix SSL configs for 1.7 by @sam1373 :: PR: #3748
- Punct process bug fix by @ekmb :: PR: #3747
- Specify gpus in SSL notebook by @sam1373 :: PR: #3753
- Duplex model inference fix, money encoder fix by @ekmb :: PR: #3754
- Update decoding strategy docs and override general value for tutorials by @titu1994 :: PR: #3755
- Fix directories in ssl notebook by @sam1373 :: PR: #3758
- Update Tacotron2_Training.ipynb by @blisc :: PR: #3769
- Fix dockerfile by @yzhang123 :: PR: #3778
- Prompt-Tuning-Documentation by @vadam5 :: PR: #3777
- Prompt tuning bug fix by @vadam5 :: PR: #3780