v1.1.1
Release date: 2023-09-07 00:37:07
Latest release of intel/intel-extension-for-transformers: v1.4.2 (2024-05-24 20:23:38)
- Highlights
- Bug Fixing & Improvements
- Tests & Tutorials
Highlights
In this release, we improved NeuralChat, a customizable chatbot framework under Intel® Extension for Transformers. NeuralChat is now available for you to create your own chatbot within minutes on multiple architectures.
Bug Fixing & Improvements
- Fix the code structure and the plugin in NeuralChat (commit 486e9e)
- Fix bug in retrieval chat (commit d2cee0)
- NeuralChat inference now returns the correct input length, excluding padding, to the user (commit 18be4c)
- Fix MPT not supporting left padding (commit 24ae58)
- Fix dataset columns being removed twice during concatenation (commit 67ce6e)
- Fix DeepSpeed and use cache issue (commit 4675d4)
- Fix bugs in predict_stream (commit e1da7e)
- Fix docker CPU issues (commit 8fa0dc)
- Fix issue reading the HuggingFaceH4/oasst1_en dataset (commit 76ee68)
- Modify Dockerfile for finetuning (commit 797aa2)
- Fix LLaMA2 performance by using static_shape in Optimum Habana (commit 481f38)
- Remove redundant code and hard-coded values from NeuralChat (commit 0e1e4d, 037ce8, 10af3c)
- Refine NeuralChat finetuning config (commit e372cf)
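
Two of the fixes above concern padding: left padding for MPT, and reporting the input length without pad tokens. The following is a minimal sketch of the general idea in plain Python, not the NeuralChat implementation; the helper names `left_pad` and `unpadded_input_lens` are hypothetical and exist only for illustration.

```python
from typing import List, Tuple


def left_pad(batch: List[List[int]], pad_id: int) -> Tuple[List[List[int]], List[List[int]]]:
    """Left-pad token id sequences to a common length.

    Returns the padded ids and matching attention masks (1 = real token, 0 = pad).
    Hypothetical helper, for illustration only.
    """
    max_len = max(len(seq) for seq in batch)
    ids, masks = [], []
    for seq in batch:
        n_pad = max_len - len(seq)
        ids.append([pad_id] * n_pad + list(seq))
        masks.append([0] * n_pad + [1] * len(seq))
    return ids, masks


def unpadded_input_lens(masks: List[List[int]]) -> List[int]:
    """Count real tokens per sequence, so pad tokens are not reported as input."""
    return [sum(mask) for mask in masks]


ids, masks = left_pad([[5, 6, 7], [8]], pad_id=0)
lens = unpadded_input_lens(masks)
```

With left padding, the last position of every row holds a real token, which is what a decoder-only model needs in order to continue generation from the end of the prompt. Counting from the attention mask rather than the padded width keeps the reported input length independent of the batch's longest sequence.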
Tests & Tutorials
- Add inference test for LLaMA2 and MPT with HPU (commit 5c4f5e)
- Add inference test for LLaMA2 and MPT with Intel CPUs (commit ad4bec, 2f6188)
- Add finetuning test for MPT (commit 72d81e, 423242)
- Add GHA Unit Tests (commit 49336d)
- NeuralChat finetuning tutorial for LLaMA2 and MPT (commit d156e9)
- NeuralChat deployment tutorial for Intel CPU, Habana HPU, and Nvidia (commit b36711)
Validated Configurations
- CentOS 8.4 & Ubuntu 22.04
- Python 3.9
- PyTorch 2.0.0
- TensorFlow 2.12.0
Acknowledgements
Thanks to sywangyi, jiafuzha and itayariel for their contributions, and to everyone who participated in Intel Extension for Transformers.