v1.7.0

Release date: 2022-03-02 08:57:00

Latest NVIDIA/NeMo release: r2.0.0rc1 (2024-08-16 05:55:14)

Known Issues

Megatron GPT training with O2 and FP16 is broken. FP16 with O1 still works.
find_unused_parameters should be False when training GPT: #3837
FastPitch training may result in stalled GPUs. Users will have to manually kill their runs and continue training from the latest checkpoint.
mT5 issue with whole word masking, see #3776
T5 finetuning config issue, see #3776
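
For the FastPitch stall listed above, resuming means pointing the restarted run at the newest checkpoint. A minimal sketch of locating it, assuming checkpoints are saved as `.ckpt` files somewhere under one directory tree. The `latest_checkpoint` helper and the directory layout are illustrative assumptions, not part of NeMo:

```python
from pathlib import Path
from typing import Optional


def latest_checkpoint(ckpt_dir: str, pattern: str = "*.ckpt") -> Optional[Path]:
    """Return the most recently modified checkpoint under ckpt_dir, or None.

    Hypothetical helper: assumes checkpoints match `pattern` and that file
    modification time reflects the order in which they were written.
    """
    # Search recursively so nested experiment folders are covered too.
    candidates = [p for p in Path(ckpt_dir).rglob(pattern) if p.is_file()]
    if not candidates:
        return None
    # Pick the file with the newest modification timestamp.
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

The returned path can then be passed to whatever resume option the training script exposes. Modification time is used rather than file names because checkpoint naming schemes vary between configs.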

Container

NOTE: From NeMo 1.7.0 onwards, NeMo containers follow the YY.MM naming convention, where the YY.MM value is based on the base container. For additional information regarding NeMo containers, please visit: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo

docker pull nvcr.io/nvidia/nemo:22.01
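
Scripts that pin or compare container versions may need to read the YY.MM tag programmatically. A small sketch, assuming the tag is the final `:YY.MM` component of the image reference as in the pull command above. `parse_nemo_tag` and `ContainerTag` are hypothetical names introduced here for illustration:

```python
import re
from typing import NamedTuple


class ContainerTag(NamedTuple):
    year: int   # full four-digit year, e.g. 2022
    month: int  # 1 to 12


# Matches a trailing ":YY.MM" tag such as ":22.01".
_TAG_RE = re.compile(r":(\d{2})\.(\d{2})$")


def parse_nemo_tag(image: str) -> ContainerTag:
    """Parse the YY.MM tag from an image reference (illustrative helper)."""
    match = _TAG_RE.search(image)
    if match is None:
        raise ValueError(f"no YY.MM tag found in {image!r}")
    year, month = int(match.group(1)), int(match.group(2))
    if not 1 <= month <= 12:
        raise ValueError(f"month out of range in {image!r}")
    # Assumption: two-digit years refer to the 2000s.
    return ContainerTag(2000 + year, month)


print(parse_nemo_tag("nvcr.io/nvidia/nemo:22.01"))
# prints ContainerTag(year=2022, month=1)
```

Tags that predate this convention, such as version-style tags with three components, do not match the pattern and raise `ValueError`.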

ASR

Wav2vec by @tbartley94 :: PR: #3297
Fix bug in multi-checkpoint loading by @sam1373 :: PR: #3536
Add HuggingFace Datasets to NeMo ASR Dataset script by @titu1994 :: PR: #3513
Add support for Gradient Clipping (clamp) in RNNT Numba loss by @titu1994 :: PR: #3550
Enable Tarred Dataset Support for NVIDIA DALI by @titu1994 :: PR: #3485
Add initial support for Buffered RNNT Scripts by @titu1994 :: PR: #3602
Significantly speed up RNNT loss on CUDA by @titu1994 :: PR: #3653
Fixing the bug in the stateful rnnt decoder. by @VahidooX :: PR: #3673
Add Buffered RNNT with LCS Merge algorithm by @titu1994 :: PR: #3669
Asr noise data scripts by @jbalam-nv :: PR: #3660
ASR SSL update by @sam1373 :: PR: #3746
Add randomized bucketing by @VahidooX :: PR: #3445
Self-supervised tutorial & update by @sam1373 :: PR: #3344
Updated conformer models. by @VahidooX :: PR: #3741
Added speaker identification script with cosine and neural classifier… by @nithinraok :: PR: #3672
Fix in clustering diarizer by @nithinraok :: PR: #3701
Add a function that writes cluster label in diarization pipeline by @tango4j :: PR: #3643

TTS

port UnivNet to NeMo TTS collection by @L0SG :: PR: #3186
E2E TTS fixes by @redoctopus :: PR: #3508
New structure for TTS datasets in scripts/dataset_processing, VocoderDataset, update TTSDataset by @Oktai15 :: PR: #3484
Deprecate some TTS models and TTS datasets by @Oktai15 :: PR: #3576
Fix bugs in HiFi-GAN (scheduler, optimizers) and add input_example() in Mixer-TTS/Mixer-TTS-X by @Oktai15 :: PR: #3564
Update UnivNet, HiFi-GAN and WaveGlow, small fixes in Mixer-TTS, FastPitch and Exportable by @Oktai15 :: PR: #3585
Fix typo in FastPitch config (pitch_avg -> pitch_mean) by @eyentei :: PR: #3593
Fix incorrect usage of TTSDataset in some files and fix one-line bug in NVIDIA's CMUDict by @Oktai15 :: PR: #3594
Convert entry from UTF-16 to UTF-8 by @redoctopus :: PR: #3597
remove CheckInstall by @blisc :: PR: #3577
Fix UnivNet LibriTTS pretrained location by @m-toman :: PR: #3615
FastPitch training tutorial by @subhankar-ghosh :: PR: #3631
Update Aligner, add new methods to AlignmentEncoder by @Oktai15 :: PR: #3641
Add Mixed Representation Training by @blisc :: PR: #3473
Add speakerID to libritts/get_data.py by @subhankar-ghosh :: PR: #3662
Update TTS tutorials, Simplification of testing Mixer-TTS and FastPitch by @Oktai15 :: PR: #3680
Clean FastPitch_Finetuning.ipynb notebook by @Oktai15 :: PR: #3698
Add cache_size to BetaBinomialInterpolator, fix bugs in TTS tutorials and FastPitch by @Oktai15 :: PR: #3706
Fix bugs in VocoderDataset and TTSDataset by @Oktai15 :: PR: #3713
Fix bugs in E2E TTS, Mixer-TTS and FastPitch by @Oktai15 :: PR: #3740

NLP / NMT

NLPDDPPlugin find_unused_parameters is configurable by @mlgill :: PR: #3478
Megatron encoder-decoder refactor by @michalivne :: PR: #3542
Finetuning NeMo Megatron T5 Models on GLUE by @MaximumEntropy :: PR: #3408
Pipeline parallelism for GPT by @ericharper :: PR: #3388
Generalized the P-tuning method to support various NLP tasks by @yidong72 :: PR: #3623
Megatron_LM checkpoint to NeMo checkpoint support by @yidong72 :: PR: #3692
Bugfix for GPT eval by @ericharper :: PR: #3744
Yuya/megatron t5 glue eval by @yaoyu-33 :: PR: #3751
Enforce legacy tokenizer for sentencepiece to add special tokens for T5 by @MaximumEntropy :: PR: #3457
Added P-Tuning method by @yidong72 :: PR: #3488
O2 style mixed precision training for T5 by @MaximumEntropy :: PR: #3664
LM adapted T5 dataset by @MaximumEntropy :: PR: #3654
Fix consumed samples calculation + PTune Model bugs by @yidong72 :: PR: #3738
Add pipeline support to eval methods by @ericharper :: PR: #3684
XNli benchmark by @yidong72 :: PR: #3693
Refactor dialogue state tracking for modelling/dataset interoperability by @Zhilin123 :: PR: #3526
Changes to support mean n-gram size masking for T5 by @MaximumEntropy :: PR: #3646
Dialogue state tracking refactor by @Zhilin123 :: PR: #3667
Parallel prompt tuning by @vadam5 :: PR: #3670
GEGLU activation for T5 by @MaximumEntropy :: PR: #3694

Text Normalization / Inverse Text Normalization

Text normalization takes too much time for a string which contains a lot of dates by @PeganovAnton :: PR: #3451
ITN bug fixes, ip address, card num support, whitelist clean up by @ekmb :: PR: #3574
Fix tn bugs by @yzhang123 :: PR: #3580
add serial number to itn by @yzhang123 :: PR: #3584
ITN: SH bug fixes for telephone by @ekmb :: PR: #3592
Tn bug 1.7.0 by @yzhang123 :: PR: #3730
TN docs update by @ekmb :: PR: #3735

Export

Update UnivNet, HiFi-GAN and WaveGlow, small fixes in Mixer-TTS, FastPitch and Exportable by @Oktai15 :: PR: #3585
Conformer onnx fix by @borisfom :: PR: #3524
Add onnx support for speaker models by @nithinraok :: PR: #3650
Jasper mask/export fix by @borisfom :: PR: #3691

Bugfixes

Text normalization takes too much time for a string which contains a lot of dates by @PeganovAnton :: PR: #3451
Dialogue state tracking refactor/ SGDGEN patch 2 by @Zhilin123 :: PR: #3674
lower bound PTL to 1.5.10 and remove last ckpt patch fix by @nithinraok :: PR: #3690

Improvements

Wfst tutorial by @tbartley94 :: PR: #3479
Update CMUdict with ADLR version pronunciations by @redoctopus :: PR: #3446
Fix docs by @yzhang123 :: PR: #3523
Add docstring to UnivNetModel by @L0SG :: PR: #3529
Increase lower bound due to security vulnerability by @ericharper :: PR: #3537
Add Change Log builder to NeMo by @titu1994 :: PR: #3527
Bugfix, need to freeze the model by @yidong72 :: PR: #3540
Bucketing quick fix by @tbartley94 :: PR: #3543
More fixes to SentencePiece for T5 by @MaximumEntropy :: PR: #3515
Update CONTRIBUTING.md by @Oktai15 :: PR: #3569
Update pr template and re-add Changelog builder by @titu1994 :: PR: #3575
Apex quick fix by @ekmb :: PR: #3591
Upgrade to 22.01 container by @ericharper :: PR: #3571
Fix typo and update minimal version of scipy by @Oktai15 :: PR: #3604
Add env variable to force transformers to run offline during CI by @ericharper :: PR: #3607
Correctly install NeMo wheel by @titu1994 :: PR: #3599
Fix wheel build by @titu1994 :: PR: #3610
Fixed EH and error reporting in restore_from by @borisfom :: PR: #3583
Clarifying documentation by @itzsimpl :: PR: #3616
Improve docs for finetuning by @titu1994 :: PR: #3622
Add NeMo version to all new .nemo files by @titu1994 :: PR: #3605
Update numba if NVIDIA_PYTORCH_VERSION not correct by @itzsimpl :: PR: #3614
Remove @experimental decorator in diarization related files. by @tango4j :: PR: #3625
Remove compression from .nemo files by @okuchaiev :: PR: #3626
Update adobe analytics by @ericharper :: PR: #3645
Add ssl tutorial to tutorial docs page by @sam1373 :: PR: #3649
Fix number of channels>1 issue by @ekmb :: PR: #3652
Fixed the bug in bucketing. by @VahidooX :: PR: #3663
Adding guard by @yzhang123 :: PR: #3655
Add tutorial paths by @titu1994 :: PR: #3651
Folder name update by @ekmb :: PR: #3671
Test HF online for SGD-GEN only by @MaximumEntropy :: PR: #3681
Update Librosa support to 0.9 by @titu1994 :: PR: #3682
Comment out numba in 22.01 release by @titu1994 :: PR: #3685
Fix failing tests inside of the 22.01 container in PR 3571 by @fayejf :: PR: #3609
Fixed Apex guard when imported classes are used for default values by @michalivne :: PR: #3700
Update citrinet_512.yaml by @Jorjeous :: PR: #3642
update torchaudio in Dockerfile to match torch version by @GNroy :: PR: #3637
Enforce import tests on the three domains by @titu1994 :: PR: #3702
Audio based norm speed up by @ekmb :: PR: #3703
Fix device on notebook by @titu1994 :: PR: #3732
pynini pip by @yzhang123 :: PR: #3729
Removed fp16 converting in complete method by @dimapihtar :: PR: #3709
Mirror AN4 while CMU servers are down by @titu1994 :: PR: #3743
Fix SSL configs for 1.7 by @sam1373 :: PR: #3748
Punct process bug fix by @ekmb :: PR: #3747
Specify gpus in SSL notebook by @sam1373 :: PR: #3753
Duplex model inference fix, money encoder fix by @ekmb :: PR: #3754
Update decoding strategy docs and override general value for tutorials by @titu1994 :: PR: #3755
Fix directories in ssl notebook by @sam1373 :: PR: #3758
Update Tacotron2_Training.ipynb by @blisc :: PR: #3769
Fix dockerfile by @yzhang123 :: PR: #3778
Prompt-Tuning-Documentation by @vadam5 :: PR: #3777
Prompt tuning bug fix by @vadam5 :: PR: #3780
