v0.8.0
Released: 2024-01-10 02:16:07
Latest tinygrad/tinygrad release: v0.9.2 (2024-08-14 07:19:48)
At 4981 lines, the codebase is close to the new limit of 5000 lines.
Release Highlights
- Real dtype support within kernels!
- New `.schedule()` API to separate concerns of scheduling and running
- New lazy.py implementation doesn't reorder at build time. `GRAPH=1` is usable to debug issues
- 95 TFLOP FP16->FP32 matmuls on 7900XTX
- GPT2 runs (jitted) in 2 ms on NVIDIA 3090
- Powerful and fast kernel beam search with `BEAM=2`
- GPU/CUDA/HIP backends switched to `gpuctypes`
- New (alpha) multigpu sharding API with `.shard`
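
To put the 95 TFLOP matmul figure in context, the sketch below converts between matmul shape, runtime, and throughput using the standard count of 2·M·N·K floating-point operations for an (M×K) @ (K×N) product. It is plain Python and does not use tinygrad; the helper names (`matmul_flops`, `achieved_tflops`, `seconds_at`) and the 4096×4096 problem size are assumptions chosen for illustration, not part of the release.

```python
def matmul_flops(m: int, n: int, k: int) -> int:
    """FLOPs for an (m x k) @ (k x n) matmul.

    Each of the m*n outputs needs k multiplies and k adds,
    so the conventional count is 2*m*n*k.
    """
    return 2 * m * n * k


def achieved_tflops(m: int, n: int, k: int, seconds: float) -> float:
    """Throughput in TFLOPS given a measured wall-clock time."""
    return matmul_flops(m, n, k) / seconds / 1e12


def seconds_at(m: int, n: int, k: int, tflops: float) -> float:
    """Time one matmul would take at a given sustained throughput."""
    return matmul_flops(m, n, k) / (tflops * 1e12)


if __name__ == "__main__":
    n = 4096  # hypothetical square problem size, not taken from the release
    t = seconds_at(n, n, n, 95.0)
    print(f"{n}x{n}x{n} matmul at 95 TFLOPS: {t * 1e3:.3f} ms")
    # Round trip: feeding that time back in recovers the throughput.
    print(f"throughput: {achieved_tflops(n, n, n, t):.1f} TFLOPS")
```

To check a real benchmark against the quoted number, time the matmul on the device and pass the measured seconds to `achieved_tflops`.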