v1.6.0
Release date: 2023-05-12 11:19:58
Latest release of taichi-dev/taichi: v1.7.2 (2024-08-22 16:59:46)
Deprecation Notice
- We removed some APIs that were deprecated a long time ago. See the table below:
Removed API | Replace with |
---|---|
Using atomic operations like a.atomic_add(b) | ti.atomic_add(a, b) or a += b |
Using is and is not inside Taichi kernel and Taichi function | Not supported |
Ndrange for loop with the number of the loop variables not equal to the dimension of the ndrange | Not supported |
ti.ui.make_camera() | ti.ui.Camera() |
ti.ui.Window.write_image() | ti.ui.Window.save_image() |
ti.SOA | ti.Layout.SOA |
ti.AOS | ti.Layout.AOS |
ti.print_profile_info | ti.profiler.print_scoped_profiler_info |
ti.clear_profile_info | ti.profiler.clear_scoped_profiler_info |
ti.print_memory_profile_info | ti.profiler.print_memory_profiler_info |
ti.CuptiMetric | ti.profiler.CuptiMetric |
ti.get_predefined_cupti_metrics | ti.profiler.get_predefined_cupti_metrics |
ti.print_kernel_profile_info | ti.profiler.print_kernel_profiler_info |
ti.query_kernel_profile_info | ti.profiler.query_kernel_profiler_info |
ti.clear_kernel_profile_info | ti.profiler.clear_kernel_profiler_info |
ti.kernel_profiler_total_time | ti.profiler.get_kernel_profiler_total_time |
ti.set_kernel_profiler_toolkit | ti.profiler.set_kernel_profiler_toolkit |
ti.set_kernel_profile_metrics | ti.profiler.set_kernel_profiler_metrics |
ti.collect_kernel_profile_metrics | ti.profiler.collect_kernel_profiler_metrics |
ti.VideoManager | ti.tools.VideoManager |
ti.PLYWriter | ti.tools.PLYWriter |
ti.imread | ti.tools.imread |
ti.imresize | ti.tools.imresize |
ti.imshow | ti.tools.imshow |
ti.imwrite | ti.tools.imwrite |
ti.ext_arr | ti.types.ndarray |
ti.any_arr | ti.types.ndarray |
ti.Tape | ti.ad.Tape |
ti.clear_all_gradients | ti.ad.clear_all_gradients |
ti.linalg.sparse_matrix_builder | ti.types.sparse_matrix_builder |
- The builtin min/max functions are no longer deprecated inside Taichi kernels.
- We deprecate some arguments used when declaring compute graph arguments. They will be removed in v1.7.0:
  - the element_shape argument for scalar and ndarray
  - the shape, channel_format and num_channels arguments for texture
- The cc backend will be removed in the next release (v1.7.0).
New features
Struct arguments
You can now use struct arguments in all backends. Structs can be nested, and they can contain matrices and vectors. Here's an example:
import taichi as ti

ti.init()

transform_type = ti.types.struct(R=ti.math.mat3, T=ti.math.vec3)
pos_type = ti.types.struct(x=ti.math.vec3, trans=transform_type)

@ti.kernel
def kernel_with_nested_struct_arg(p: pos_type) -> ti.math.vec3:
    return p.trans.R @ p.x + p.trans.T

trans = transform_type(ti.math.mat3(1), [1, 1, 1])
p = pos_type(x=[1, 1, 1], trans=trans)
print(kernel_with_nested_struct_arg(p))  # [4., 4., 4.]
Ndarray
- Support 0 dim ndarray read & write in python scope
- Fixed a bug when writing into ndarray from Python scope
Improvements
- Support rsqrt operator in autodiff
- Added assembly printer for CPU backend by Zhanlue Yang
- Support CUDA shared array allocation over 48 KiB
Performance
- Improved vectorization support on CPU backend, with significant performance gains for specific applications
New Examples
- 2D Euler fluid simulation example by Lee-abcde
Misc
- Python 3.11 support
- ti.frexp is supported on CUDA, Vulkan, Metal, OpenGL backends.
- ti.math.popcnt intrinsic by Garry Ling
- Fixed a memory leak issue during SNodeTree destruction by Zhanlue Yang
- Added validation and improved error report for ti.Field finalization by Zhanlue Yang
- Fixed a memory leak issue with CUDA backend in C-API by Zhanlue Yang
- Added support for formatted printing with str.format() and f-strings by Tianyi Liu
- Changed Python code formatter from yapf to black
Developer Experience
- build.py script for preparing build & testing environment
Full changelog
Highlights:
- Bug fixes
  - Fix wrong datatype size when writing to ndarray from Python scope (by Ailing Zhang)
- CUDA backend
  - Warn driver version if it doesn't support memory pool. (#7912) (by Haidong Lan)
  - Better handling shared array shape check (#7818) (by Haidong Lan)
  - Support large shared memory for CUDA backend (#7452) (by Haidong Lan)
- Documentation
  - Add doc about struct arguments (#7959) (by Lin Jiang)
  - Fix docstring of mix function (#7922) (by Zhao Liang)
  - Update faq and ggui, and add them to CI (#7861) (by Zhao Liang)
  - Update doc for dynamic snode (#7804) (by Zhao Liang)
  - Update field.md (#7819) (by zhoooou)
  - Update readme (#7808) (by yanqingzhang)
  - Update write_test.md (#7745) (by Qian Bao)
  - Update performance.md (#7720) (by Zhao Liang)
  - Update readme (#7673) (by Zhao Liang)
  - Update tutorial.md (#7512) (by Chenzhan Shang)
  - Update gui_system.md (#7628) (by Qian Bao)
  - Remove deprecated api docstrings (#7596) (by pengyu)
  - Fix the cexp docstring (#7588) (by Zhao Liang)
  - Add doc about returning struct (#7556) (by Lin Jiang)
- Error messages
  - Update deprecation warning of the graph arguments (#7965) (by Lin Jiang)
- Language and syntax
  - Remove deprecated funcs in init.py (#7941) (by Lin Jiang)
  - Remove deprecated sparse_matrix_builder function (#7942) (by Lin Jiang)
  - Remove deprecated funcs in ti.ui (#7940) (by Lin Jiang)
  - Remove the support for 'is' (#7930) (by Lin Jiang)
  - Raise error when the dimension of the ndrange does not equal to the number of the loop variable (#7933) (by Lin Jiang)
  - Remove a.atomic(b) (#7925) (by Lin Jiang)
  - Cancel deprecating native min/max (#7928) (by Lin Jiang)
  - Let nested data classes have methods (#7909) (by Lin Jiang)
  - Let kernel argument support matrix nested in a struct (by lin-hitonami)
  - Support the functions of dataclass as kernel argument and return value (#7865) (by Lin Jiang)
  - Fix a bug on PosixPath (#7860) (by Zhao Liang)
  - Seprate out the scalarization for MatrixOfMatrixPtrStmt and MatrixOfGlobalPtrStmt (#7803) (by Zhanlue Yang)
  - Fix pylance warning (#7805) (by Zhao Liang)
  - Support taking structs as kernel arguments (by lin-hitonami)
  - Fix math module circular import bugs (#7762) (by Zhao Liang)
  - Support formatted printing in str.format() and f-strings (#7686) (by 魔法少女赵志辉)
  - Replace internal representation of Python-scope ti.Matrix with numpy arrays (#7559) (by Yi Xu)
  - Stop letting ti.Struct inherit from TaichiOperations (#7474) (by Yi Xu)
  - Support writing sparse matrix as matrix market file (#7529) (by pengyu)
- Vulkan backend
  - Fix repeated generation of array ranges in spirv codegen. (#7625) (by Haidong Lan)
Full changelog:
- [CUDA] Warn driver version if it doesn't support memory pool. (#7912) (by Haidong Lan)
- [Doc] Add doc about struct arguments (#7959) (by Lin Jiang)
- [Error] Update deprecation warning of the graph arguments (#7965) (by Lin Jiang)
- [windows] Workaround C++ mangling special chars (#7964) (by Ailing)
- [Lang] Remove deprecated funcs in init.py (#7941) (by Lin Jiang)
- [build] Remove redundant C-API shared object in wheel (#7950) (by Proton)
- [test] Do not test cc backend (by Proton)
- [Lang] Remove deprecated sparse_matrix_builder function (#7942) (by Lin Jiang)
- [Lang] Remove deprecated funcs in ti.ui (#7940) (by Lin Jiang)
- [Lang] Remove the support for 'is' (#7930) (by Lin Jiang)
- [Lang] Raise error when the dimension of the ndrange does not equal to the number of the loop variable (#7933) (by Lin Jiang)
- [Lang] Remove a.atomic(b) (#7925) (by Lin Jiang)
- [Lang] Cancel deprecating native min/max (#7928) (by Lin Jiang)
- [Doc] Fix docstring of mix function (#7922) (by Zhao Liang)
- [example] Fix ti example bugs (#7903) (by Zhao Liang)
- [ci] Build.py: Source generated env in new spawned shell (by Proton)
- [misc] Fix changelog commit extract code (by Proton)
- [ci] More robust build.py bootstrapping (#7920) (by Proton)
- [Lang] [bug] Let nested data classes have methods (#7909) (by Lin Jiang)
- [cuda] Only set CU_LIMIT_STACK_SIZE when necessary (#7906) (by Ailing)
- [Lang] Let kernel argument support matrix nested in a struct (by lin-hitonami)
- [Bug] Fix wrong datatype size when writing to ndarray from Python scope (by Ailing Zhang)
- [lang] Support 0 dim ndarray read & write in python scope (by Ailing Zhang)
- [Lang] Support the functions of dataclass as kernel argument and return value (#7865) (by Lin Jiang)
- [spirv] Support struct as kernel argument (by Lin Jiang)
- [spirv] Fix the ret type of frexp (by lin-hitonami)
- [ci] Build.py: Do not try to bootstrap pip (too many issues) (#7897) (by Proton)
- [ci] Build.py quirks fix (#7894) (by Proton)
- [Doc] Update faq and ggui, and add them to CI (#7861) (by Zhao Liang)
- [build] Remove unused apt pkg 'libmirclient-dev' to make 'build.py' run properly on ubuntu 22.04 (#7871) (by Yu Zhang)
- [Lang] Fix a bug on PosixPath (#7860) (by Zhao Liang)
- [ci] Polishing build.py, wave 4 (#7857) (by Proton)
- [build] Use LLVM without zstd dependency on M1 Macs (#7856) (by Proton)
- [doc] Update dev_install.md to reflect build.py usage (#7848) (by Proton)
- [ci] Polishing build.py, wave 3 (#7845) (by Proton)
- [lang] Add popcnt to llvm intrinsic support (#7772) (by Garry Ling)
- [Doc] Update doc for dynamic snode (#7804) (by Zhao Liang)
- [ci] Fix release build failure (#7834) (by Proton)
- [ci] More robust build.py bootstrapping (#7833) (by Proton)
- [Doc] Update field.md (#7819) (by zhoooou)
- [autodiff] Remove redundant autodiff mode in kernel name (#7829) (by Ailing)
- [lang] Migrate Caching Allocation logics from CudaDevice/AmdgpuDevice to DeviceMemoryPool (#7793) (by Zhanlue Yang)
- [misc] Resolve code formatter frictions (#7828) (by Proton)
- [Lang] Seprate out the scalarization for MatrixOfMatrixPtrStmt and MatrixOfGlobalPtrStmt (#7803) (by Zhanlue Yang)
- [bug] Fix imgui_context in destroying multiple GGUI windows (#7812) (by Ailing)
- [misc] Update git-blame-ignore-revs (#7825) (by Proton)
- [ci] Complete doc test list, remove redundant default prelude (#7823) (by Proton)
- [misc] Relax Black formatter line length limit to 120 (#7824) (by Proton)
- [Doc] Update readme (#7808) (by yanqingzhang)
- [misc] Switch code formatter from yapf to black (#7785) (by Proton)
- [CUDA] Better handling shared array shape check (#7818) (by Haidong Lan)
- [misc] Improve ::liong::json::deserialize() (by PGZXB)
- [bug] Fix gen_offline_cache_key (#7810) (by PGZXB)
- [ci] Fix build.py ensurepip (#7811) (by Proton)
- [Lang] Fix pylance warning (#7805) (by Zhao Liang)
- [lang] Support frexp on spirv-based backends (#7770) (by Ailing)
- [lang] Split MemoryPool into DeviceMemoryPool and HostMemoryPool (#7786) (by Zhanlue Yang)
- [misc] Optimize import overhead: pytorch and get_clangpp (#7797) (by Haidong Lan)
- [ci] [doc] Tighten up document testing (#7801) (by Proton)
- [ci] Polishing build.py, wave 2 (#7800) (by Proton)
- [aot] Remove unused AotDataConverter (#7799) (by Lin Jiang)
- [perf] Fix Taichi CPU backend compile parameter to pair performance with Numba. (#7731) (by zhengxianli)
- [ci] Polishing build.py (#7794) (by Proton)
- [bug] Returning nan for ti.sym_eig on identity matrix (#7443) (by Yimin Tang)
- [Lang] Support taking structs as kernel arguments (by lin-hitonami)
- [ir] Add 'create_load' to ArgLoadStmt (by lin-hitonami)
- [ir] Let the src of GetElementStmt be a pointer (by lin-hitonami)
- [lang] Clean up runtime allocation functions (#7773) (by Zhanlue Yang)
- [lang] Migrate CUDA preallocation logic to CudaMemoryPool (#7746) (by Zhanlue Yang)
- [gfx] Fix runtime buffer/image copy barrier semantics (#7781) (by Bob Cao)
- [misc] Remove unnecessary TaskCodeGenLLVM::task_counter (#7777) (by PGZXB)
- [ci] Temporarily force Windows release builds to run on sm70 nodes (#7767) (by Proton)
- [refactor] Remove Kernel::lowered_ (#7765) (by PGZXB)
- [gui] Fluid visualization utilities (#7682) (by Qian Bao)
- [Lang] Fix math module circular import bugs (#7762) (by Zhao Liang)
- [misc] Make pre-commit happy (#7768) (by Proton)
- [ci] Build iOS AOT static library (by Proton)
- [misc] Wrap path with std::filesystem::path (#7754) (by Bob Cao)
- [lang] Support vector and matrix dtypes in ti.field (#7761) (by Ailing)
- [ir] Remove unnecessary field_dims_ in ArgLoadStmt (#7755) (by Ailing)
- [refactor] Remove Kernel::task_counter_ (#7751) (by PGZXB)
- [ci] Build.py: Introduce TAICHI_CMAKE_ARGS manager for better log readability (by Proton)
- [ci] Reorganize build.py code (by Proton)
- [refactor] Let KernelCompilationManager manage kernel compilation in gfx::AotModuleBuilderImpl (#7715) (by PGZXB)
- [misc] Remove unused FullSimplifyPass::Args::program (#7750) (by PGZXB)
- [refactor] Re-impl LlvmAotModule using LLVM::KernelLauncher (#7744) (by PGZXB)
- [lang] Implement experimental CG(Conjugate Gradient) solver in Taichi-lang (#7690) (by Qian Bao)
- [lang] Transform bit_shr to bit_sar for uint (#7757) (by Ailing)
- [ir] Postpone scalarize and lower_matrix_ptr to after bit loop vectorization (#7726) (by 魔法少女赵志辉)
- [ci] Isolate post sm70 tests (#7740) (by Proton)
- [cuda] Suppport using SparseMatrix on more CUDA versions (#7724) (by Yu Zhang)
- [cuda] Update the data layout of CUDA (#7748) (by Lin Jiang)
- [ci] Ignore dup benchmark data points (#7749) (by Proton)
- [bug] Fix reduction of atomic max (#7747) (by Lin Jiang)
- [Doc] Update write_test.md (#7745) (by Qian Bao)
- [refactor] Remove 'args' from 'RuntimeContext' (by lin-hitonami)
- [gfx] Let gfx backends use LaunchContextBuilder to build arguments in struct type (by lin-hitonami)
- [gfx] [refactor] Convert f16 in LaunchContextBuilder (by lin-hitonami)
- [gfx] Record the struct type of arguments and results in KernelContextAttributes (by lin-hitonami)
- [gfx] Compile struct type of result and arguments in gfx backends (by lin-hitonami)
- [refactor] Implement CompiledKernelData::check() (#7743) (by PGZXB)
- [doc] [test] Update docs for printing with f-strings and formatted strings (#7733) (by 魔法少女赵志辉)
- [lang] Improve error message for mismatched index for ndarrays in python scope (#7737) (by Ailing)
- [bug] Avoid redundant cache loading (#7741) (by PGZXB)
- [refactor] Let KernelCompilationManager manage kernel compilation in LlvmAotModuleBuilder (#7714) (by PGZXB)
- [ci] Skip large shared memory test for Turing GPUs. (#7739) (by Haidong Lan)
- [cuda] Remove deprecated cusparse functions (#7725) (by Yu Zhang)
- [misc] Update pull_request_template.md (#7738) (by Ailing)
- [misc] Remove TI_WARN for cuda in memory_pool.cpp (#7734) (by Ailing)
- [CUDA] Support large shared memory for CUDA backend (#7452) (by Haidong Lan)
- [vulkan] Update SPIR-V codegen to emit FP16 consts (#7676) (by Bob Cao)
- [lang] Support frexp on cuda backend (#7721) (by Ailing)
- [refactor] Unify implementation of ProgramImpl::compile() (by PGZXB)
- [refactor] Introduce LLVM::KernelLauncher (by PGZXB)
- [refactor] Introduce gfx::KernelLauncher (by PGZXB)
- [test] Enable test offline cache on amdgpu and dx11 (#7703) (by PGZXB)
- [lang] Refactor ownership and inheritance of allocators (#7685) (by Zhanlue Yang)
- [ci] Fix git cache quirks (#7722) (by Proton)
- [lang] Improve error msg in create ndarray (#7709) (by Garry Ling)
- [Doc] Update performance.md (#7720) (by Zhao Liang)
- [bug] Switch the gallery image used by README. (#7716) (by Chengchen(Rex) Wang)
- [lang] Merge AMDGPUCachingAllocator to the generic CachingAllocator (#7717) (by Zhanlue Yang)
- [bug] Invalid Field cache, RWAccessors cache, and Kernel cache upon SNodeTree destruction (#7704) (by Zhanlue Yang)
- [ci] [test] Enable cc test on CI (by lin-hitonami)
- [test] [cc] Skip tests that cc backend doesn't support (by lin-hitonami)
- [test] Exclude the cc backend from tests that involve dynamic indexing (#7705) (by 魔法少女赵志辉)
- [bug] Fix camera controls (#7681) (by liblaf)
- [bug] [cc] Fix comparison op in cc backend (by Lin Jiang)
- [bug] [cc] Set external ptr for cc backend (by lin-hitonami)
- [lang] Merged VirtualMemoryAllocator into MemoryPool for LLVM-CPU backend (#7671) (by Zhanlue Yang)
- [misc] Remove useless JITEvaluatorId (#7700) (by PGZXB)
- [bug] Fixed building with clang on Windows failed (#7699) (by PGZXB)
- [Lang] Support formatted printing in str.format() and f-strings (#7686) (by 魔法少女赵志辉)
- [ci] Git caching proxy in CI (#7692) (by Proton)
- [build] Let msvc generate pdb for cpp & c_api tests (by lin-hitonami)
- [refactor] Stop storing pointers to array devallocs in kernel args (by lin-hitonami)
- [aot] Implement bin2c in AOT cppgen (#7687) (by PENGUINLIONG)
- [cpu] Remove atomics demotion for single-thread CPU targets. (#7631) (by Haidong Lan)
- [aot] Export templated kernels (#7683) (by PENGUINLIONG)
- [ci] Revive /benchmark (#7680) (by Proton)
- [Doc] Update readme (#7673) (by Zhao Liang)
- [misc] Device API public headers and CMake rework part 1 (#7624) (by Bob Cao)
- [misc] Move optimize cpu module to KernelCodeGen (#7667) (by PGZXB)
- [lang] [ir] Extract and save the format specifiers in str.format() (#7660) (by 魔法少女赵志辉)
- [example] Add 2D euler fluid simulation example (#7568) (by Lee-abcde)
- [wasm] Remove WASM backend (by lin-hitonami)
- [build] Fix ssize_t type undefined errors when building with TI_WITH_LLVM=OFF on windows (#7665) (by Yu Zhang)
- [misc] Remove unused Kernel::is_evaluator (#7669) (by PGZXB)
- [misc] Remove unused Program::jit_evaluator_cache and Program::jit_evaluator_cache_mut (#7668) (by PGZXB)
- [misc] Simplify test_offline_cache.py (#7663) (by PGZXB)
- [lang] Improve error reporting for FieldsBuilder finalization (#7640) (by Zhanlue Yang)
- [misc] Rename taichi::lang::llvm to taichi::lang::LLVM (#7659) (by PGZXB)
- [refactor] Remove MemoryPool daemon in LLVM runtime (#7648) (by Zhanlue Yang)
- [opt] Cleanup unncessary options in constant fold pass (#7661) (by Ailing)
- [ci] Use build.py to prepare testing environment on Windows (#7658) (by Proton)
- [opt] Move binary jit evaluator to host (by Ailing Zhang)
- [test] Update C++ constant fold tests to test operator one by one (by Ailing Zhang)
- [aot] Avoid shared library file being packaged into wheel data (#7652) (by Chenzhan Shang)
- [ci] Fix scipy install (#7649) (by Proton)
- [misc] Remove an unnecessary parameter of KernelCompilationManager::make_filename (by PGZXB)
- [refactor] Remove some unnecessary functions of KernelCodeGen (by PGZXB)
- [refactor] Re-impl JIT and Offline Cache on LLVM backends (by PGZXB)
- [refactor] Implement llvm::KernelCompiler (by PGZXB)
- [refactor] Gen code for KernelCodeGen::ir instead of KernelCodeGen::kernel->ir (by PGZXB)
- [Doc] Update tutorial.md (#7512) (by Chenzhan Shang)
- [ci] Test manylinux2014 build on PR (#7647) (by Proton)
- [bug] Fix logical comparison returns -1 (#7641) (by Ailing)
- [doc] Fix gui_system.md tests (#7646) (by Proton)
- [Doc] Update gui_system.md (#7628) (by Qian Bao)
- [aot] Hand-written CMake target script (#7644) (by PENGUINLIONG)
- [ci] Do not use Android toolchain for perf testing (#7642) (by Proton)
- [ci] Support Python 3.11 (#7627) (by Proton)
- [build] Setup Android SDK environment for performance bot (#7635) (by Zhanlue Yang)
- [ci] Update perf mon image (#7639) (by Proton)
- [ci] Fix perf mon break (#7638) (by Proton)
- [doc] Add documentation on using ghstack (#7632) (by Proton)
- [build] Static linking libstdc++ on Linux (by Proton)
- [ci] Rewrite Dockerfiles (by Proton)
- [ci] Resolve "Needed single revision" workaround failure when the repo directory is empty (#7633) (by Proton)
- [Vulkan] Fix repeated generation of array ranges in spirv codegen. (#7625) (by Haidong Lan)
- [build] Switch to use docker with Android-SDK for performance bot (#7630) (by Zhanlue Yang)
- [opengl] glfw finalize crash fix (by Proton)
- [ci] build.py: Android support, entering shell, export env (by Proton)
- [ci] Do not run tests with mixed backends (by Proton)
- [refactor] Use f16 function from external lib (by lin-hitonami)
- [refactor] Migrate members from RuntimeContext to LaunchContextBuilder (by lin-hitonami)
- [bug] Fix setting arguments exceeding the max arg num (by lin-hitonami)
- [cpu] Explicitly make cpu multithreading loop for range-fors. (#7593) (by Haidong Lan)
- [aot] Fixed generator for compute graph (#7626) (by PENGUINLIONG)
- [ir] Postpone scalarize and lower_matrix_ptr to after typecheck (#7589) (by 魔法少女赵志辉)
- [aot] Header generator completed (#7609) (by PENGUINLIONG)
- [amdgpu] Initialize AMDGPUContext with defaults (by Proton)
- [build] Remove libSPIRV-Tools-shared.(so|dll) in wheel (by Proton)
- [lang] Removed cpu_device(), cuda_device(), and amdgpu_device() from LlvmRuntimeExecutor (#7544) (by Zhanlue Yang)
- [refactor] Remove the get/set functions in RuntimeContext (by lin-hitonami)
- [aot] Pass LaunchContextBuilder to CompiledGraph::init_runtime_context (by lin-hitonami)
- [gfx] Let GfxRuntime use LaunchContextBuilder (by lin-hitonami)
- Let LaunchContextBuilder be the argument of the kernel launch function (by lin-hitonami)
- [llvm] [refactor] Set the llvm runtime when executing (by lin-hitonami)
- [refactor] Migrate {set, get}_{arg, ret} functions from RuntimeContext (by lin-hitonami)
- [bug] Fix compilation error (#7606) (by PGZXB)
- [aot] Hide map memory failure (#7604) (by PENGUINLIONG)
- [refactor] Fix KernelCodeGen::kernel from Kernel * to const Kernel * (by PGZXB)
- [refactor] Remove legacy implementation of llvm offline cache (by PGZXB)
- [refactor] Impl llvm::CompiledKernelData (by PGZXB)
- [bug] Type check for logical not op with real type inputs (#7600) (by Ailing)
- [bug] Improve ndarray creation to fix segmentation fault (#7577) (by pengyu)
- [lang] Add assembly printer for CPU backend (#7590) (by Zhanlue Yang)
- [misc] Update docker filer (#7598) (by Zeyu Li)
- [aot] Fix absolute path in generated TaichiTargets.cmake (#7597) (by Chenzhan Shang)
- [Doc] Remove deprecated api docstrings (#7596) (by pengyu)
- [llvm] Compile the kernel arguments to a StructType (by Lin Jiang)
- [lang] Fix issue with llvm opaque pointer (#7557) (by Zhanlue Yang)
- [opt] Constant folding for unary ops on host (#7573) (by Ailing)
- [bug] Type check for bit_not op with real type inputs (#7592) (by Ailing)
- [Doc] Fix the cexp docstring (#7588) (by Zhao Liang)
- [Lang] Replace internal representation of Python-scope ti.Matrix with numpy arrays (#7559) (by Yi Xu)
- [bug] Avoid cuda compilation via clang and ship pre-compiled .bc file instead (#7570) (by Zhanlue Yang)
- [aot] Taichi kernel AOT command (#7565) (by PENGUINLIONG)
- [bug] Fix struct members registered to StructField class (#7574) (by Ailing)
- [aot] Mobile platform AOT build scripts (#7567) (by PENGUINLIONG)
- [misc] Revert "Security upgrade ipython from 7.34.0 to 8.10.0 (#7341)" (#7571) (by Proton)
- [test] Add cpp tests for constant folding pass (#7566) (by Ailing)
- [misc] Security upgrade ipython from 7.34.0 to 8.10.0 (#7341) (by Chengchen(Rex) Wang)
- [lang] Refactor CudaCachingAllocator into a more generic caching allocator (#7531) (by Zhanlue Yang)
- [aot] Load GfxRuntime140 module from TCM (#7539) (by PENGUINLIONG)
- [lang] Fixed useless serial shader to blit ExternalTensorShapeAlongAxisStmt on Metal (#7562) (by PENGUINLIONG)
- [aot] Enable Vulkan 8bit storage (#7564) (by PENGUINLIONG)
- [bug] Fix crashing on printing FrontendFuncCallStmt with no return value (by lin-hitonami)
- [refactor] Remove LaunchContextBuilder::set_arg_raw (by lin-hitonami)
- [llvm] Generalize TaskCodeGenLLVM::create_return to set_struct_to_buffer (by lin-hitonami)
- [bug] Fix Cuda memory leak during TiRuntime destruction (#7345) (by Zhanlue Yang)
- [ir] Let void struct type represent void type (by lin-hitonami)
- [aot] Let C-API use LaunchContextBuilder to manage RuntimeContext (by lin-hitonami)
- [ir] Let the reference type declare a pointer argument (by lin-hitonami)
- [Doc] Add doc about returning struct (#7556) (by Lin Jiang)
- [bug] Fix returning struct containing vec3 (#7552) (by Lin Jiang)
- [lang] [ir] Extract and save the format specifiers in the f-string (#7514) (by 魔法少女赵志辉)
- [Lang] Stop letting ti.Struct inherit from TaichiOperations (#7474) (by Yi Xu)
- [aot] Recover AOT CI branch names (#7543) (by PENGUINLIONG)
- [aot] Put TiRT in Python wheel and CMake script to find it in wheel (#7537) (by PENGUINLIONG)
- [refactor] Remove the difficult-to-implement CompiledKernelData::size() (#7540) (by PGZXB)
- [bug] Implement the missing clone function for FrontendFuncCallStmt (#7538) (by PGZXB)
- [misc] Bump version to v1.6.0 (#7536) (by Haidong Lan)
- [doc] Handle 2 digit minor versions correctly (#7535) (by Ritoban Roy-Chowdhury)
- [aot] GfxRuntime140 convention docs (#7527) (by PENGUINLIONG)
- [rhi] Refactor allocate_memory API to use RhiResult (#7463) (by Bob Cao)
- [metal] Choose the proper msl version according to the device capability (#7506) (by Yu Zhang)
- [Lang] Support writing sparse matrix as matrix market file (#7529) (by pengyu)