v1.18.0

NVIDIA/NeMo

Release date: 2023-05-13 01:49:27

Latest NVIDIA/NeMo release: r2.0.0rc1 (2024-08-16 05:55:14)

Highlights

Models

NeMo ASR

Hybrid Autoregressive Transducer (HAT) #6260
Apple MPS Support for ASR Inference #6289
InterCTC Support for Hybrid ASR Models #6215
RNNT N-Gram Fusion with mAES algo #6118
ASR + Apple M2 CPU/GPU MPS #6289

NeMo TTS

TTS directory structure refactor
User-set symbol vocabulary #6172

NeMo Megatron

Model parallelism from Megatron Core #6393
Continued training for P-tuning #6273
SFT for GPT-3 #6210
Tensor and pipeline model parallel conversion #6218
Megatron NMT Export to Riva

NeMo Core

Detailed Changelogs

ASR

Changelog

minor cleanup by @messiaen :: PR: #6311
docs on the use of heterogeneous test / val manifests by @bmwshop :: PR: #6352
[WIP] add buffered chunked streaming for nemo force aligner by @Slyne :: PR: #6185
Word boosting for Flashlight decoder by @trias702 :: PR: #6367
Add installation and ASR inference instructions for Mac by @artbataev :: PR: #6377
specaug speedup by @1-800-BAD-CODE :: PR: #6347
updated lr for FC configs by @bmwshop :: PR: #6379
Make possible to control tqdm progress bar in ASR models by @SN4KEBYTE :: PR: #6375
[ASR] Conformer global tokens in local attention by @sam1373 :: PR: #6253
fixed torch warning on using a list of numpy arrays by @MKNachesa :: PR: #6382
Fix FastConformer config: correct bucketing strategy by @artbataev :: PR: #6413
fixing the ability to use temp sampling with concat datasets by @bmwshop :: PR: #6423
add conformer configs for hat model by @andrusenkoau :: PR: #6372
[ASR] Add optimization util for linear sum assignment algorithm by @tango4j :: PR: #6349
Added/updated new Conformer configs by @VahidooX :: PR: #6426
Fix typos by @titu1994 :: PR: #6494
Fix typos (#6523) by @titu1994 :: PR: #6539
added back the fast emit section to the configs. by @VahidooX :: PR: #6540
Add FastConformer Hybrid ASR models for EN, ES, IT, DE, PL, HR, UA, BY by @KunalDhawan :: PR: #6549
Add scores for FastConformer models by @titu1994 :: PR: #6557
Patch transcribe and support offline transcribe for hybrid model by @fayejf :: PR: #6550
More streaming conformer export fixes by @messiaen :: PR: #6567
Documentation for ASR-TTS models by @artbataev :: PR: #6594
Patch transcribe_util for streaming mode and add wer calculation back to inference scripts by @fayejf :: PR: #6601
Add HAT image to docs by @andrusenkoau :: PR: #6619
Patch decoding for PC models by @titu1994 :: PR: #6630
Fix wer.py where 'errors' variable was not set by @stevehuang52 :: PR: #6633
Fix for old models in change_attention_model by @VahidooX :: PR: #6635

TTS

Changelog

VITS HiFiTTS doc by @treacker :: PR: #6288
fix broken links r1.18.0 by @ekmb :: PR: #6501
[TTS] fixed broken path. by @XuesongYang :: PR: #6514

NLP / NMT

Changelog

[Core] return_config=True now extracts just config, not full tarfile by @titu1994 :: PR: #6346
restore path for p-tuning by @arendu :: PR: #6273
taskname and early stopping for adapters by @arendu :: PR: #6366
Adapter tuning accepts expanded language model dir by @arendu :: PR: #6376
Update gpt_training.rst by @blisc :: PR: #6378
Megatron GPT model finetuning by @MaximumEntropy :: PR: #6210
[NeMo Megatron] Cleanup configs to infer the models TP PP config automatically by @titu1994 :: PR: #6368
Fix prompt template unescaping by @MaximumEntropy :: PR: #6399
Add support for Megatron GPT Untied Embd TP PP Change by @titu1994 :: PR: #6388
Move Parallelism usage from Apex -> Megatron Core by @aklife97 :: PR: #6393
Add ability to enable/disable act ckpt and seq parallelism in GPT by @markelsanz14 :: PR: #6327
Refactor PP conversion + add support for TP only conversion by @titu1994 :: PR: #6419
fix CPU overheads of GPT synthetic dataset by @xrennvidia :: PR: #6427
check if grad is none before calling all_reduce by @arendu :: PR: #6428
Fix replace_bos_with_pad not found by @aklife97 :: PR: #6443
Support Swiglu in TP PP Conversion by @titu1994 :: PR: #6437
BERT pre-training mp fork to spawn by @aklife97 :: PR: #6442
Megatron encoder decoder fix for empty validation outputs by @michalivne :: PR: #6459
Reduce workers on NMT CI by @aklife97 :: PR: #6472
Switch to NVIDIA Megatron repo by @aklife97 :: PR: #6465
Megatron KERPLE positional embeddings by @michalivne :: PR: #6478
Support in external sample mapping for Megatron datasets by @michalivne :: PR: #6462
Fix custom by @aklife97 :: PR: #6512
GPT fp16 inference fix by @MaximumEntropy :: PR: #6543
Fix for T5 FT model by @aklife97 :: PR: #6529
Pass instead of scaler object to core by @aklife97 :: PR: #6545
Change Megatron Enc Dec model to use persistent_workers by @aklife97 :: PR: #6548
Turn autocast off when precision is fp32 by @aklife97 :: PR: #6554
Fix batch size reconf for T5 FT for multi-validation by @aklife97 :: PR: #6582
Make tensor split contiguous for qkv and kv in attention by @aklife97 :: PR: #6580
Patches from main to r1.18.0 for Virtual Parallel by @titu1994 :: PR: #6592
Create dummy iters to satisfy iter type len checks in core + update core commit by @aklife97 :: PR: #6600
Restore GPT support for interleaved pipeline parallelism by @timmoon10 :: PR: #6528
Add megatron_core to requirements by @ericharper :: PR: #6639

Export

Changelog

Bugfixes

Changelog

Fix the GPT SFT datasets loss mask bug by @yidong72 :: PR: #6409
[BugFix] Fix multi-processing bug in data simulator by @tango4j :: PR: #6310
Fix cache aware hybrid bugs by @VahidooX :: PR: #6466
[BugFix] Force _get_batch_preds() to keep logits in decoder timestamp… by @tango4j :: PR: #6500
Fixing bug in unsort_tensor by @borisfom :: PR: #6320
Bugfix for BF16 grad reductions with distopt by @timmoon10 :: PR: #6340
Limit urllib3 version to patch issue with RTD by @aklife97 :: PR: #6568

General improvements

Changelog

Pin the version to hopefully fix rtd build by @SeanNaren :: PR: #6334
enabling diverse datasets in val / test by @bmwshop :: PR: #6306
extract inference weights by @arendu :: PR: #6353
Add opengraph support for NeMo docs by @titu1994 :: PR: #6380
Adding basic preemption code by @athitten :: PR: #6161
Add documentation for preemption support by @athitten :: PR: #6403
Update hyperparameter recommendation based on experiments by @Zhilin123 :: PR: #6405
exceptions with empty test / val ds config sections by @bmwshop :: PR: #6421
Upgrade pt 23.03 by @ericharper :: PR: #6430
Update README to add core installation by @aklife97 :: PR: #6488
Not doing CastToFloat by default by @borisfom :: PR: #6524
Update manifest.py for speedup by @stevehuang52 :: PR: #6565
Update SDP docs by @erastorgueva-nv :: PR: #6485
Update core commit hash in readme by @aklife97 :: PR: #6622
Remove from jenkins by @ericharper :: PR: #6641
Remove dup by @ericharper :: PR: #6643


1. asset-post-fast-conformer-diagram.png (426.16 KB)
2. asset-post-fast-conformer-local-attn.png (2.59 MB)
3. dual_output_example_model.png (142.64 KB)
4. encmaskdecoder_model.png (62.02 KB)
5. single_output_example_model.png (73.1 KB)
