v1.20.0

NVIDIA/NeMo

Release date: 2023-08-05 03:50:15

Latest NVIDIA/NeMo release: r2.0.0rc1 (2024-08-16 05:55:14)

Highlights

Models

NeMo ASR

NeMo TTS

NeMo Framework

NeMo Tools

Container

For additional information regarding NeMo containers, please visit: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo

docker pull nvcr.io/nvidia/nemo:23.06
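
Once the image is pulled, a container can be started from it. The sketch below is an assumption about a typical setup, not a command taken from these release notes: it presumes the NVIDIA Container Toolkit is installed so that `--gpus all` works, and the `--shm-size=8g` value is an illustrative choice. It only prints the command so it can be reviewed before being run.

```shell
#!/bin/sh
# Image tag as given in the release notes.
NEMO_IMAGE="nvcr.io/nvidia/nemo:23.06"

# Assumed flags (not from the release notes):
#   --gpus all     expose all GPUs (requires the NVIDIA Container Toolkit)
#   -it --rm       interactive session, container removed on exit
#   --shm-size=8g  illustrative shared-memory size for data-loader workers
RUN_CMD="docker run --gpus all -it --rm --shm-size=8g $NEMO_IMAGE"

# Print the command for review; copy and run it to start the container.
echo "$RUN_CMD"
```

Adjust the flags to the host: for example, drop `--gpus all` on a machine without NVIDIA GPUs.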

Detailed Changelogs

ASR

Changelog
  • [ASR] Adding ssl config for fast-conformer by @krishnacpuvvada :: PR: #6672
  • Fix for interctc test random failure by @Kipok :: PR: #6644
  • sharded manifests docs by @bmwshop :: PR: #6751
  • [TTS] Implement new vocoder dataset by @rlangman :: PR: #6670
  • TDT model pull request by @hainan-xv :: PR: #6536
  • Spec aug fix by @tbartley94 :: PR: #6775
  • Support large inputs to Conformer and Fast Conformer by @bmwshop :: PR: #6556
  • sharded manifests updated docs by @bmwshop :: PR: #6833
  • added fc-xl, xxl and titanet-s models by @nithinraok :: PR: #6832
  • Multi-lookahead cache-aware streaming models by @VahidooX :: PR: #6711
  • Update transcribe_utils.py by @stevehuang52 :: PR: #6865
  • Fix k2 build topo helper by @artbataev :: PR: #6887
  • Fix transcribe_utils.py for hybrid models in partial transcribe mode by @stevehuang52 :: PR: #6899
  • Add hybrid model support to transcribe_speech_parallel.py by @stevehuang52 :: PR: #6906
  • Update Frame-VAD doc by @stevehuang52 :: PR: #6902
  • Make sure asr_model.change_attention_model is run if either cfg.model_path or cfg.pretrained_name is specified by @erastorgueva-nv :: PR: #6908
  • Update fvad doc by @stevehuang52 :: PR: #6920
  • Online Code Switching Dataset for ASR by @trias702 :: PR: #6579
  • Fix AN4 dataset links by @artbataev :: PR: #6926
  • Fix confidence ensembles RNNT logprobs selection logic for exclude_blank scenario by @KunalDhawan :: PR: #6937
  • Adding cache-aware streaming ASR checkpoints. by @VahidooX :: PR: #6940
  • Remove from metrics by @titu1994 :: PR: #6979
  • Hybrid conformer export by @borisfom :: PR: #6983
  • Cache handling without input tensors mutation by @borisfom :: PR: #6980
  • Fixing an issue with confidence ensembles by @Kipok :: PR: #6987
  • Add ASR with TTS Tutorial. Fix enhancer usage. by @artbataev :: PR: #6955
  • fix install_beamsearch_decoders.sh by @karpnv :: PR: #7019
  • Add support for Numba FP16 RNNT Loss (#6991) by @titu1994 :: PR: #7038
  • Fix typo and branch in tutorial by @artbataev :: PR: #7048
  • Refined export_config by @borisfom :: PR: #7053
  • Fix documentation for Numba by @titu1994 :: PR: #7065
  • Adding docs and models for multiple lookahead cache-aware ASR by @VahidooX :: PR: #7067
  • Add updated fc ctc and rnnt xxl models by @nithinraok :: PR: #7128
  • Update notebook branch by @ericharper :: PR: #7135
  • Fixed main and merging this to r1.20 by @tango4j :: PR: #7127
  • Fix default context size by @nithinraok :: PR: #7141
  • Fix incorrect embedding grads with distopt BF16 grad reductions by @timmoon10 :: PR: #6958

TTS

Changelog
  • [TTS] Add callback for saving audio during FastPitch training by @rlangman :: PR: #6665
  • [TTS] Add script for text preprocessing by @rlangman :: PR: #6541
  • [TTS] Fix adapter duration issue by @hsiehjackson :: PR: #6697
  • [TTS] Filter out silent audio files during preprocessing by @rlangman :: PR: #6716
  • [TTS] fix inconsistent type hints for IpaG2p by @XuesongYang :: PR: #6733
  • [TTS] relax hardcoded prefix for phonemes and tones and infer phoneme set through dict by @XuesongYang :: PR: #6735
  • [TTS] corrected misleading deprecation warnings. by @XuesongYang :: PR: #6702
  • Fix TTS adapter tutorial by @hsiehjackson :: PR: #6741
  • [TTS][zh] refine hardcoded lowercase for ASCII letters. by @XuesongYang :: PR: #6781
  • [TTS] Append pretrained FastPitch & SpectrogramEnhancer pair to available models by @racoiaws :: PR: #7012

NLP / NMT

Changelog
  • minor fix for missing chat attr by @arendu :: PR: #6671
  • eval fix by @arendu :: PR: #6685
  • VP Fixes for converter + Config management by @titu1994 :: PR: #6698
  • lora notebook by @arendu :: PR: #6765
  • peft eval directly from ckpt by @arendu :: PR: #6785
  • GPT inference long context by @ekmb :: PR: #6687
  • Fix validation with drop_last=False by @mikolajblaz :: PR: #6704
  • fix spellmapper tutorial, change branch to main by @bene-ges :: PR: #6803
  • text_generation_utils memory reduction if no logprob needed by @yzhang123 :: PR: #6773
  • Add optional index mapping dir in mmap text datasets by @gheinrich :: PR: #6683
  • Add inference kv cache support for transformer TE path by @yen-shi :: PR: #6627
  • add reference to our paper by @bene-ges :: PR: #6821
  • added changes to ramp up bs by @dimapihtar :: PR: #6799
  • t5 lora tuning by @arendu :: PR: #6612
  • Added rouge monitoring support for T5 by @jubick1337 :: PR: #6737
  • GPT extrapolatable position embedding (xpos/sandwich/alibi/kerple) and Flash Attention by @hsiehjackson :: PR: #6666
  • Import Enum for chatbot component by @ericharper :: PR: #6877
  • typo fix from #6666 by @arendu :: PR: #6882
  • removed unnecessary print by @dimapihtar :: PR: #6884
  • Fix destructor for delayed mmap dataset case by @mikolajblaz :: PR: #6703
  • Make Gradio library optional by @yidong72 :: PR: #6904
  • Fix fast-glu activation in change partitions by @hsiehjackson :: PR: #6909
  • Documentation for ONNX export of Megatron Models by @asfiyab-nvidia :: PR: #6914
  • Fix TextMemMapDataset index file creation in multi-node setup by @gheinrich :: PR: #6768
  • Fix flash-attention by @hsiehjackson :: PR: #6901
  • ptuning oom fix by @arendu :: PR: #6916
  • add rampup bs assertion by @dimapihtar :: PR: #6927
  • Enable methods in bert-like models by @sararb :: PR: #6898
  • support value attribution condition by @yidong72 :: PR: #6934
  • Add missing save restore connector to eval scripts by @titu1994 :: PR: #6935
  • Merge release r1.19.0 into main by @ericharper :: PR: #6948
  • Stop at the stop token by @yidong72 :: PR: #6957
  • fixes for spellmapper by @bene-ges :: PR: #6994
  • Fix tabular data text generation by @yidong72 :: PR: #7022
  • fix pos id - hf update by @ekmb :: PR: #7075
  • fix syntax error introduced in PR-7079 by @bene-ges :: PR: #7102

NeMo Tools

Changelog
  • SDE unt lvl comparison by @Jorjeous :: PR: #6669
  • hot fix SDE by @Jorjeous :: PR: #6897

Bugfixes

Changelog
  • small Bugfix by @fayejf :: PR: #7079
  • Fix caching bug in causal convolutions for cache-aware ASR models by @VahidooX :: PR: #7034
  • Fix masking bug for TTS Aligner by @redoctopus :: PR: #6677
  • [bugfix] avoid the random shuffle of phoneme and tone tokens. by @XuesongYang :: PR: #6855
  • fix ptuning residuals bug by @arendu :: PR: #6866
  • TE bug fix by @dimapihtar :: PR: #7027
  • Update distopt API for coalesced NCCL calls by @timmoon10 :: PR: #6886

General Improvements

Changelog
  • update batch size recommendation to min 32 for 43b by @Zhilin123 :: PR: #6675
  • Make Note usage consistent in adapter_mixins.py by @BrianMcBrayer :: PR: #6678
  • Update all invalid tree references to blobs for NeMo samples by @BrianMcBrayer :: PR: #6679
  • Update README.rst about container by @fayejf :: PR: #6686
  • karpnv/issues6690 by @karpnv :: PR: #6705
  • Limit codeql scope by @titu1994 :: PR: #6710
  • Not pinning Gradio version by @yidong72 :: PR: #6680
  • preprocess squad in sft format by @arendu :: PR: #6727
  • Fix Codeql config by @titu1994 :: PR: #6731
  • Fix fastpitch test nightly by @hsiehjackson :: PR: #6730
  • Lora/PEFT training script CI test by @arendu :: PR: #6664
  • fixed decor to show messages only when the wrapped object is called. by @XuesongYang :: PR: #6793
  • lora pp2 by @arendu :: PR: #6818
  • Upperbound Numpy to < 1.24 by @titu1994 :: PR: #6829
  • Fix typo in documentation by @Dounx :: PR: #6838
  • NFA updates by @erastorgueva-nv :: PR: #6695
  • Update container for import action by @ericharper :: PR: #6883
  • removed some tests by @arendu :: PR: #6900
  • Update container info in README.rst by @fayejf :: PR: #6913
  • Removed optional optimize_for_inference by @borisfom :: PR: #6933
  • Update core commit for CI by @aklife97 :: PR: #6939
  • lora inference ci by @arendu :: PR: #6931
  • Upgrade base pytorch container to 23.06 by @ericharper :: PR: #6938
  • Fix requirements for pydantic + inflect by @titu1994 :: PR: #6956
  • Remove pyyaml by @titu1994 :: PR: #7052
  • Fix links in Segmentation tutorial by @ekmb :: PR: #7117
  • Update evaluator.py by @stevehuang52 :: PR: #7151

Related links: original page, download (tar), download (zip)

1. asset-post-2023-08-forced-alignment-alignment_slots.png 43.82KB

2. asset-post-2023-08-forced-alignment-allowed_seq_ctc.png 56.06KB

3. asset-post-2023-08-forced-alignment-alowed_seq.png 17.7KB

4. asset-post-2023-08-forced-alignment-asr_model.png 90.02KB

5. asset-post-2023-08-forced-alignment-butter_betty_bought_words_aligned.mp4 1.68MB

6. asset-post-2023-08-forced-alignment-ctc_trellis.png 127.39KB

7. asset-post-2023-08-forced-alignment-ctc_viterbi_rule.png 225.73KB

8. asset-post-2023-08-forced-alignment-fold_viterbi.mp4 654.72KB

9. asset-post-2023-08-forced-alignment-naive_graph.mp4 712.72KB

10. asset-post-2023-08-forced-alignment-redundancy_explain.mp4 654.74KB

11. asset-post-2023-08-forced-alignment-redundancy_start_to_end.mp4 2.43MB

12. asset-post-2023-08-forced-alignment-viterbi_rule.png 123.95KB

13. asset-post-2023-08-forced-alignment-what_is_alignment.png 64.71KB

14. asset-post-2023-10-28-numba-fp16-memory_joint.png 559.48KB

15. asset-post-2023-10-28-numba-fp16-rnnt_joint.png 33.71KB

16. nfa_forced_alignment_pipeline.png 129.34KB

17. nfa_run.png 90.16KB

18. nfa_word_segment_alignments.png 133.03KB
