v1.14.0

NVIDIA/NeMo

Release date: 2022-12-24 10:49:19

Latest NVIDIA/NeMo release: r2.0.0rc1 (2024-08-16 05:55:14)

Highlights

NeMo ASR

NeMo Megatron

NeMo Core

NeMo Models

Detailed Changelogs

Container

For additional information regarding NeMo containers, please visit: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo

docker pull nvcr.io/nvidia/nemo:22.11

ASR

Changelog
  • [Tools][ASR] Tool for generating data using simulated RIRs by @anteju :: PR: #5158
  • Modernize RNNT ONNX export and add TS export by @titu1994 :: PR: #5248
  • Add Gradio App to ASR Docs by @titu1994 :: PR: #5270
  • Add support for Sampled Softmax for RNNT Joint by @titu1994 :: PR: #5216
  • Speed up HF data processing script for ASR by @titu1994 :: PR: #5330
  • bugfix in volume loss for CTC models by @bmwshop :: PR: #5348
  • Add cpWER for evaluation of ASR with diarization by @tango4j :: PR: #5279
  • Fix for getting tokenizer in character-based ASR models when using tarred dataset by @jonghwanhyeon :: PR: #5442
  • Refactor/unify ASR offline and buffered inference by @fayejf :: PR: #5440
  • Standalone diarization+ASR evaluation script by @tango4j :: PR: #5439
  • [ASR] Transcribe for multi-channel signals by @anteju :: PR: #5479
  • Add Silence Augmentation by @fayejf :: PR: #5476
  • add exportable mel spec by @1-800-BAD-CODE :: PR: #5512
  • add RNN-T loss implemented by PyTorch and test code by @hainan-xv :: PR: #5312
  • [ASR] AudioToAudio datasets and related test by @anteju :: PR: #5196
  • Add StreamingFeatureBufferer class for real-life streaming decoding by @tango4j :: PR: #5534
  • Pool stats with padding by @1-800-BAD-CODE :: PR: #5403
  • Adding Hybrid RNNT-CTC model by @VahidooX :: PR: #5364
  • Fix ASR Buffered inference scripts by @titu1994 :: PR: #5552
  • Add wer details - insertion, deletion, substitution rate by @fayejf :: PR: #5557
  • Add support for Time Stamp calculation using transcribe_speech.py by @titu1994 :: PR: #5568
  • [STT] Add Esperanto (Eo) ASR Conformer-CTC and Conformer-Transducer models by @andrusenkoau :: PR: #5639
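
The WER detail entry above (#5557) reports insertion, deletion, and substitution rates alongside the overall word error rate. As a rough illustration of what those numbers mean, here is a minimal standalone sketch. It is not NeMo's implementation: the function name `wer_details` and the returned dictionary keys are hypothetical, and it assumes whitespace tokenization and a plain word-level edit distance.

```python
def wer_details(reference: str, hypothesis: str) -> dict:
    """Illustrative WER breakdown (hypothetical helper, not the NeMo API)."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # dp[i][j] = (total_cost, insertions, deletions, substitutions)
    # for aligning ref[:i] with hyp[:j].
    dp = [[None] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    dp[0][0] = (0, 0, 0, 0)
    for i in range(1, len(ref) + 1):
        dp[i][0] = (i, 0, i, 0)  # delete every reference word
    for j in range(1, len(hyp) + 1):
        dp[0][j] = (j, j, 0, 0)  # insert every hypothesis word

    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            if ref[i - 1] == hyp[j - 1]:
                dp[i][j] = dp[i - 1][j - 1]  # match, no extra cost
                continue
            sub = dp[i - 1][j - 1]
            dele = dp[i - 1][j]
            ins = dp[i][j - 1]
            # Keep the cheapest alignment; ties resolve by tuple ordering.
            dp[i][j] = min(
                (sub[0] + 1, sub[1], sub[2], sub[3] + 1),
                (dele[0] + 1, dele[1], dele[2] + 1, dele[3]),
                (ins[0] + 1, ins[1] + 1, ins[2], ins[3]),
            )

    cost, n_ins, n_del, n_sub = dp[len(ref)][len(hyp)]
    n = len(ref)
    return {
        "wer": cost / n,
        "ins_rate": n_ins / n,
        "del_rate": n_del / n,
        "sub_rate": n_sub / n,
    }
```

When several alignments share the same minimum cost, the split between the three rates depends on the tie-breaking rule, so different tools can report slightly different breakdowns for the same overall WER.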

TTS

Changelog
  • [TTS] Fastpitch energy condition and refactoring by @subhankar-ghosh :: PR: #5218
  • [TTS] HiFi-TTS Download Script by @oleksiivolk :: PR: #5241
  • [TTS] Add Mandarin/English Bilingual Recipe for Training Fastpitch Models by @yuekaizhang :: PR: #5208
  • [TTS] fixed type of filepath and rename openslr. by @XuesongYang :: PR: #5276
  • [TTS] replace obsolete torch_tts unit test marker with run_only_on('CPU') by @XuesongYang :: PR: #5307
  • [TTS] bugfix IPAG2P and refactor to remove duplicate process. by @XuesongYang :: PR: #5304
  • Update path to get_data.py in TTS tutorial by @redoctopus :: PR: #5311
  • [TTS] Replace IPA lambda arguments with locale string by @rlangman :: PR: #5298
  • [TTS] expand to support flexible dictionary entry formats in IPAG2P. by @XuesongYang :: PR: #5318
  • [TTS] update organization of model checkpoints and their pointers. by @XuesongYang :: PR: #5327
  • [TTS] bugfix for the script of generating mels from fastpitch. by @XuesongYang :: PR: #5344
  • [TTS] Add Spanish model documentation by @rlangman :: PR: #5390
  • [TTS] Add Spanish FastPitch training configs by @rlangman :: PR: #5383
  • [TTS] replace pitch normalization params with ??? by @XuesongYang :: PR: #5392
  • [TTS] Create script for processing TTS training audio by @rlangman :: PR: #5262
  • [TTS] remove useless logic for set_tokenizer. by @XuesongYang :: PR: #5430
  • [TTS] Fixing RADTTS training - removing view buffer and fixing accuracy issue by @borisfom :: PR: #5358
  • JOC Optimization in FastPitch by @subhankar-ghosh :: PR: #5450
  • [TTS] Support speaker level pitch normalization by @rlangman :: PR: #5455
  • TTS tutorial update: use speaker 9017 instead of 6097 by @redoctopus :: PR: #5532
  • [TTS] Remove unused TTS eval function by @redoctopus :: PR: #5605
  • [TTS][ZH] add fastpitch and hifigan model NGC urls and update NeMo docs. by @XuesongYang :: PR: #5596
  • [TTS][DOC] add notes about automatic conversion to target sampling ra… by @XuesongYang :: PR: #5624
  • [TTS][ZH] bugfix for the tutorial and add NGC CLI installation guide. by @XuesongYang :: PR: #5643
  • [TTS][ZH] bugfix for ngc cli installation. by @XuesongYang :: PR: #5652
  • [TTS][ZH] fix broken link for the script. by @XuesongYang :: PR: #5666

NLP / NMT

Changelog
  • Option to pad the last validation input sequence if it's smaller than the encoder sequence length for MegatronGPT by @anmolgupt :: PR: #5243
  • Fixes bugs with loss averaging for Megatron GPT by @shanmugamr1992 :: PR: #5329
  • Fixing bug in Megatron BERT when loss mask is all zeros by @shanmugamr1992 :: PR: #5424
  • support to disable sequence length + 1 input tokens for each sample in MegatronGPT by @anmolgupt :: PR: #5363
  • [TN] raise NotImplementedError for unsupported languages and other minor fixes by @XuesongYang :: PR: #5414
  • Bug fix/gpt by @shanmugamr1992 :: PR: #5493
  • prompt tuning fix for unscale grad errors by @arendu :: PR: #5523
  • Bert sequence parallel support by @shanmugamr1992 :: PR: #5494
  • NLP docs fixes by @vsl9 :: PR: #5528
  • Switch order of args in optimizer_step override by @ericharper :: PR: #5549
  • Upgrade to 22.11 by @ericharper :: PR: #5550
  • Merge r1.13.0 main by @ericharper :: PR: #5570
  • some tokenizers do not have additional_special_tokens_ids attribute by @arendu :: PR: #5642
  • Remove cell output from tutorial by @ericharper :: PR: #5689

Text Normalization / Inverse Text Normalization

Changelog
  • [ITN] fix year date graph, cardinals extension for hundreds by @ekmb :: PR: #5435
  • [TN] raise NotImplementedError for unsupported languages and other minor fixes by @XuesongYang :: PR: #5414

Export

Changelog
  • Fixed the onnx bug in conformer for non-streaming models. by @VahidooX :: PR: #5242
  • Modernize RNNT ONNX export and add TS export by @titu1994 :: PR: #5248
  • Fixes for Conformer-xl export by @borisfom :: PR: #5309
  • Remove onnx graphsurgery from Dockerfile by @titu1994 :: PR: #5320
  • add exportable mel spec by @1-800-BAD-CODE :: PR: #5512

General Improvements

Changelog
  • bugfix in volume loss for CTC models by @bmwshop :: PR: #5348
  • Fix setting up of learning rate scheduler by @PeganovAnton :: PR: #5444
  • Better patch hydra by @titu1994 :: PR: #5591
  • [TTS][ZH] bugfix for the tutorial and add NGC CLI installation guide. by @XuesongYang :: PR: #5643
  • Add fully torch.jit.script-able speaker clustering module by @tango4j :: PR: #5191
  • Update perturb.py by @stevehuang52 :: PR: #5231
  • remove CV requirements. by @XuesongYang :: PR: #5233
  • checks for accepted adapter type at module level by @arendu :: PR: #5194
  • fix hypotheses return by @nithinraok :: PR: #5253
  • Support for inserting additional subsampling in conformer encoder by @shan18 :: PR: #5224
  • update tutorials to use meeting config as default and VAD by @nithinraok :: PR: #5237
  • Specifying audio signal dropout separately for the Conformer Encoder by @shan18 :: PR: #5263
  • created by @bmwshop :: PR: #5268
  • Fix failing speaker counting for short audio samples by @tango4j :: PR: #5267
  • O2bert + apex pipeline functions by @shanmugamr1992 :: PR: #5221
  • Upperbound PTL by @titu1994 :: PR: #5302
  • Update Interface(s) phonetic entry by @blisc :: PR: #5212
  • add label inference support to EncDecSpeakerLabel class by @nithinraok :: PR: #5278
  • Add italian model checkpoints by @Kipok :: PR: #5315
  • Text Memmap Parsing Improvements by @michalivne :: PR: #5265
  • Update librosa signature in HF processing script by @titu1994 :: PR: #5321
  • Force wav file format for audio_filepath by @titu1994 :: PR: #5323
  • Updates to T0 Dataset and Model by @MaximumEntropy :: PR: #5201
  • [DOC] add sphinx-copybutton requirement to copy button on code snippets. by @XuesongYang :: PR: #5326
  • Add support for Hydra multirun to NeMo by @titu1994 :: PR: #5159
  • typo fix by @arendu :: PR: #5328
  • add pre-commit hook to automatically sort entries in requirements. by @XuesongYang :: PR: #5333
  • Add speaker clustering arguments to forward function by @tango4j :: PR: #5306
  • Fixing de-autocast by @borisfom :: PR: #5319
  • [Bugfix] Added rm -f / wget -nc command to avoid bash error in multispeaker sim notebook by @tango4j :: PR: #5292
  • [DOC] added ipython dependency to support IPython.sphinxext extension by @XuesongYang :: PR: #5345
  • Bug fix (removing old compute consumed samples) by @shanmugamr1992 :: PR: #5355
  • removed uninstall nemo_cv and nemo_simple_gan and relax numba version… by @XuesongYang :: PR: #5332
  • Enable mlflow logger by @whrichd :: PR: #4893
  • Fix Python type hints according to Python Docs by @artbataev :: PR: #5370
  • Distributed optimizer support for BERT by @timmoon10 :: PR: #5305
  • SpeakerClustering: fix tensor dimensions in forward() by @virajkarandikar :: PR: #5387
  • add squad by @arendu :: PR: #5407
  • added python and c++ alignment code by @yzhang123 :: PR: #5346
  • Add MoE support for T5 model (w/o expert parallel) by @aklife97 :: PR: #5409
  • Fix for concat map dataset by @1-800-BAD-CODE :: PR: #5133
  • Support for finetuning and finetuning inference with .ckpt files & batch size refactoring by @MaximumEntropy :: PR: #5339
  • update doc in terms of get_label for lang id model by @fayejf :: PR: #5366
  • Debug support for interleaved pipeline parallelism with the distributed Adam optimizer by @timmoon10 :: PR: #5236
  • Create codeql.yml by @titu1994 :: PR: #5445
  • Update codeql.yml by @titu1994 :: PR: #5449
  • Fix support for legacy sentencepiece models by @Numeri :: PR: #5406
  • Update docs with Comparison tool info, and slightly change .sh for ea… by @Jorjeous :: PR: #5182
  • Add float32 type casting for get_samples function by @tango4j :: PR: #5399
  • Add missing import in transcribe_utils.py by @jonghwanhyeon :: PR: #5487
  • Add auto-labeler by @SeanNaren :: PR: #5498
  • Add more glob patterns for labeler by @SeanNaren :: PR: #5504
  • Fix issues with PL 1.8 by @SeanNaren :: PR: #5353
  • [BugFix] Removing tokens from decoding timestamp by @tango4j :: PR: #5481
  • Upperbound the torchmetrics version by @SeanNaren :: PR: #5537
  • Data parallel collect results by @michalivne :: PR: #5547
  • Fix log-rank-0-only logic by @mikolajblaz :: PR: #5555
  • Fixed Docker build by @borisfom :: PR: #5562
  • Patch hydra launch by @titu1994 :: PR: #5589
  • Fix race condition bug with hydra multirun by @titu1994 :: PR: #5594
  • Update Dockerfile to use numba==0.53.1 by @stevehuang52 :: PR: #5614
  • Fixed a missing import for gather_objects by @michalivne :: PR: #5622
