v0.0.1
Release date: 2022-09-13 15:03:40
Latest release of hpcaitech/EnergonAI: v0.0.1 (2022-09-13 15:03:40)
Overview
EnergonAI is a service framework for large-scale model inference, powered by ColossalAI. It supports large-model inference with tensor parallelism and pipeline parallelism. The headline example in this release is serving OPT: with EnergonAI you can serve OPT-175B conveniently.
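For readers new to the term, the idea behind tensor parallelism can be sketched in a few lines. This is a conceptual illustration only, not EnergonAI or ColossalAI code: plain Python lists stand in for GPU tensors, and the helper names (`matmul`, `split_columns`, `tensor_parallel_linear`) are hypothetical. It assumes a column-wise split of a linear layer's weight, where each shard computes a slice of the output and the slices are concatenated.

```python
# Conceptual sketch only: plain Python lists stand in for GPU tensors.
# None of these names come from EnergonAI or ColossalAI.

def matmul(x, w):
    """Multiply a (rows x k) matrix by a (k x cols) matrix."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*w)] for row in x]

def split_columns(w, parts):
    """Split a weight matrix column-wise into `parts` equal shards."""
    cols = len(w[0])
    assert cols % parts == 0, "columns must divide evenly across shards"
    step = cols // parts
    return [[row[i * step:(i + 1) * step] for row in w] for i in range(parts)]

def tensor_parallel_linear(x, w, parts):
    """Each shard computes its slice of the output; slices are concatenated."""
    partial_outputs = [matmul(x, shard) for shard in split_columns(w, parts)]
    return [sum((p[r] for p in partial_outputs), []) for r in range(len(x))]

x = [[1, 2], [3, 4]]
w = [[1, 0, 2, 1], [0, 1, 1, 3]]
# The sharded computation matches the unsharded one.
assert tensor_parallel_linear(x, w, parts=2) == matmul(x, w)
```

In a real deployment each shard would live on a separate device and the concatenation would be a collective communication step; pipeline parallelism, by contrast, places whole layers on different devices and passes activations between them.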
What's Changed
- add InferenceEngine. by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/1
- test engine switch func, it does not support pipeline now by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/2
- Feature/pipeline by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/3
- fp16 support by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/4
- timer utils by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/5
- deal with no pipeline by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/6
- gen new ncclid and broadcast to devices in the same Tensor Parallelis… by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/8
- evaluation by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/9
- Feature/activation reuse by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/10
- make require_grad to false by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/11
- make the distributed program a single entrance by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/12
- make host and port arguments by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/13
- triton run by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/14
- make rpc shutdown correctly by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/15
- scale mask softmax kernel by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/16
- Feature/examples by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/17
- gpt model sync by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/18
- Feature/example by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/19
- example update by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/20
- Feature/example by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/21
- Update _operation.py by @MaruyamaAya in https://github.com/hpcaitech/EnergonAI/pull/22
- del useless by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/24
- checkpoint function by @MaruyamaAya in https://github.com/hpcaitech/EnergonAI/pull/25
- fixed bugs with checkpoint path check by @MaruyamaAya in https://github.com/hpcaitech/EnergonAI/pull/26
- Md edit by @MaruyamaAya in https://github.com/hpcaitech/EnergonAI/pull/28
- make server correct by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/29
- fixed bugs with bert checkpoint by @MaruyamaAya in https://github.com/hpcaitech/EnergonAI/pull/30
- add checkpoint function by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/31
- Lzm develop by @MaruyamaAya in https://github.com/hpcaitech/EnergonAI/pull/33
- Lzm develop by @MaruyamaAya in https://github.com/hpcaitech/EnergonAI/pull/34
- add more bert model, add rm_padding func by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/35
- update README by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/36
- bert redundant computation, new examples dir and we will delete examp… by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/38
- add bert example by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/39
- Feature/variable len by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/40
- block without timeout by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/41
- rpc_worker, return results in order by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/42
- return in order by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/43
- batch manager by ziming liu by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/44
- Add comments to Batch Manager by @MaruyamaAya in https://github.com/hpcaitech/EnergonAI/pull/46
- correctness for tp only by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/45
- move tokenizer out of manager and move select_top_k in to models by @MaruyamaAya in https://github.com/hpcaitech/EnergonAI/pull/47
- enable fp16 kernel by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/48
- Delete example directory by @MaruyamaAya in https://github.com/hpcaitech/EnergonAI/pull/49
- fixed batch manager bugs and reformat codes by @MaruyamaAya in https://github.com/hpcaitech/EnergonAI/pull/50
- Reformat the codes by @MaruyamaAya in https://github.com/hpcaitech/EnergonAI/pull/51
- fix parameter mistake by @MaruyamaAya in https://github.com/hpcaitech/EnergonAI/pull/52
- update rm_padding in batch manager by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/53
- update hf_gpt2 by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/54
- add seq len and test api by @MaruyamaAya in https://github.com/hpcaitech/EnergonAI/pull/55
- combine two pipeline wrapper by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/56
- update bert by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/57
- version compatibility by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/58
- modify batch manager by @MaruyamaAya in https://github.com/hpcaitech/EnergonAI/pull/59
- update requirement by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/60
- add warm up phase for profiler by @MaruyamaAya in https://github.com/hpcaitech/EnergonAI/pull/61
- refactor batch manager by @MaruyamaAya in https://github.com/hpcaitech/EnergonAI/pull/62
- Update README.md by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/63
- readme update by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/64
- readme update by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/65
- Update README.md by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/66
- some details by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/67
- some details by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/68
- Update README.md by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/70
- delete header error by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/69
- update README by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/71
- Update README.md by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/72
- vit example by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/73
- update vit example and update logging by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/74
- Update README.md by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/76
- gpt return correct data structure by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/75
- change project name by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/77
- make config globally available by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/78
- update metaconfig by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/79
- update metaconfig by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/80
- Feature/trt by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/81
- Link TensorRT as backend for single device execution by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/82
- update readme by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/83
- update readme by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/84
- refactor batch manager related files by @MaruyamaAya in https://github.com/hpcaitech/EnergonAI/pull/85
- add comments and delete unnecessary codes. by @MaruyamaAya in https://github.com/hpcaitech/EnergonAI/pull/86
- update Readme by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/87
- update Readme by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/88
- update batcher for pipeline by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/91
- fix batch wrapping bug by @MaruyamaAya in https://github.com/hpcaitech/EnergonAI/pull/92
- add offload manager by @MaruyamaAya in https://github.com/hpcaitech/EnergonAI/pull/93
- add basic model as component by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/94
- match checkpoint for opt by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/95
- fix bug in batch manager by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/96
- timer with ignore the first func by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/97
- update readme by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/98
- Temporarily stop kernels for correctness by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/101
- modify offload manager and add linear example by @MaruyamaAya in https://github.com/hpcaitech/EnergonAI/pull/103
- add linear func by @oahzxl in https://github.com/hpcaitech/EnergonAI/pull/105
- [docker] add dockerfile and change hardcode path by @feifeibear in https://github.com/hpcaitech/EnergonAI/pull/107
- change rpc timeout by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/108
- [docker] add test_query.sh and update docker file by @feifeibear in https://github.com/hpcaitech/EnergonAI/pull/109
- add opt example by @ver217 in https://github.com/hpcaitech/EnergonAI/pull/111
- [NFC] global var should be in uppercase by @feifeibear in https://github.com/hpcaitech/EnergonAI/pull/112
- refactor load_checkpoint by @ver217 in https://github.com/hpcaitech/EnergonAI/pull/113
- refactor tp load checkpoint by @ver217 in https://github.com/hpcaitech/EnergonAI/pull/114
- fix hf gpt2 example by @ver217 in https://github.com/hpcaitech/EnergonAI/pull/115
- refactor opt server api by @ver217 in https://github.com/hpcaitech/EnergonAI/pull/116
- add benchmark by @ver217 in https://github.com/hpcaitech/EnergonAI/pull/117
- [example] refactor opt by @ver217 in https://github.com/hpcaitech/EnergonAI/pull/118
- [docker] polish launch docker scripts by @feifeibear in https://github.com/hpcaitech/EnergonAI/pull/119
- [model] support topk, topp, temperature when generating by @ver217 in https://github.com/hpcaitech/EnergonAI/pull/120
- [opt] generate api receives top_k, top_p, and temperature by @ver217 in https://github.com/hpcaitech/EnergonAI/pull/121
- add serving queue by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/122
- make the generation task within the engine by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/123
- [opt] add async executor by @ver217 in https://github.com/hpcaitech/EnergonAI/pull/124
- [opt] add left padding by @ver217 in https://github.com/hpcaitech/EnergonAI/pull/126
- [opt] add 175b model by @ver217 in https://github.com/hpcaitech/EnergonAI/pull/127
- [opt] remove useless api by @ver217 in https://github.com/hpcaitech/EnergonAI/pull/128
- [hotfix] fix worker server shutdown by @ver217 in https://github.com/hpcaitech/EnergonAI/pull/129
- Generation model feature: add cache for removing the repeated computation in the loop. by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/130
- [opt] add data validator and cors middleware by @ver217 in https://github.com/hpcaitech/EnergonAI/pull/132
- add a flag for disable cache by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/131
- [opt] executor update making batch policy by @ver217 in https://github.com/hpcaitech/EnergonAI/pull/133
- improve the cache implementation by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/134
- [opt] add cache and modify api by @ver217 in https://github.com/hpcaitech/EnergonAI/pull/135
- fit the opt_66B by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/136
- 66B model load checkpoint by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/137
- [opt] allow disabling multi procs loading by @ver217 in https://github.com/hpcaitech/EnergonAI/pull/140
- [opt] fit opt-175b by @ver217 in https://github.com/hpcaitech/EnergonAI/pull/139
- [opt] refactor server by @ver217 in https://github.com/hpcaitech/EnergonAI/pull/142
- processing 66b ckpt by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/141
- [docker] rm source code after pip install by @feifeibear in https://github.com/hpcaitech/EnergonAI/pull/143
- [opt] add timeout option by @ver217 in https://github.com/hpcaitech/EnergonAI/pull/145
- [opt] add queue size option by @ver217 in https://github.com/hpcaitech/EnergonAI/pull/146
- update readme by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/144
- update readme by @dujiangsu in https://github.com/hpcaitech/EnergonAI/pull/148
- Add prometheus endpoint for opt_server.py by @ofey404 in https://github.com/hpcaitech/EnergonAI/pull/149
- prometheus by @feifeibear in https://github.com/hpcaitech/EnergonAI/pull/150
- [nn] replace energonai.nn with colossalai.nn by @ver217 in https://github.com/hpcaitech/EnergonAI/pull/147
- [logging] remove logging module by @ver217 in https://github.com/hpcaitech/EnergonAI/pull/153
- [utils] remove useless utils by @ver217 in https://github.com/hpcaitech/EnergonAI/pull/154
- [utils] fix checkpointing import by @ver217 in https://github.com/hpcaitech/EnergonAI/pull/155
- [setup] add version control by @ver217 in https://github.com/hpcaitech/EnergonAI/pull/156
New Contributors
- @dujiangsu made their first contribution in https://github.com/hpcaitech/EnergonAI/pull/1
- @MaruyamaAya made their first contribution in https://github.com/hpcaitech/EnergonAI/pull/22
- @oahzxl made their first contribution in https://github.com/hpcaitech/EnergonAI/pull/105
- @feifeibear made their first contribution in https://github.com/hpcaitech/EnergonAI/pull/107
- @ver217 made their first contribution in https://github.com/hpcaitech/EnergonAI/pull/111
- @ofey404 made their first contribution in https://github.com/hpcaitech/EnergonAI/pull/149
Full Changelog: https://github.com/hpcaitech/EnergonAI/commits/v0.0.1