
v1.7.0

taichi-dev/taichi

Release date: 2023-11-27 14:46:58

Latest release of taichi-dev/taichi: v1.7.2 (2024-08-22 16:59:46)

1. New features

1.1 Real Function

We are excited to announce the stabilization of the Real Function feature in Taichi Lang v1.7.0. Initially introduced as an experimental feature in v1.0.0, it has now matured with enhanced capabilities and usability.

Key Updates

Limitations

Important Note on Usage: Ensure all arguments and return values in Real Functions are explicitly type-hinted.

Usage Example

The following example demonstrates the recursive capability of Real Functions. The sum_func Real Function is used to calculate the sum of numbers from 1 to n, showcasing its ability to handle multiple return statements and variable recursion depths.

import taichi as ti
ti.init()

@ti.real_func
def sum_func(n: ti.i32) -> ti.i32:
    if n == 0:
        return 0
    return sum_func(n - 1) + n

@ti.kernel
def sum(n: ti.i32) -> ti.i32:
    return sum_func(n)

print(sum(100))  # 5050

You can find more examples of the real function in the repository.

1.2 Enhancements in Kernel Arguments and Return Values

Support for Multiple Return Values in Taichi Kernel:

Taichi kernels can now return multiple values by specifying a tuple as the return type. You can use (ti.f32, s0) directly as the type hint, or write it in standard Python style as typing.Tuple[ti.f32, s0] or, on Python 3.9 and above, tuple[ti.f32, s0]. The following example illustrates this feature:

import taichi as ti
ti.init()

s0 = ti.types.struct(a=ti.math.vec3, b=ti.i16)

@ti.real_func
def foo() -> (ti.f32, s0):
    return 1, s0(a=ti.math.vec3([100, 0.5, 3]), b=1)

@ti.kernel
def bar() -> (ti.f32, s0):
    return foo()
    
ret1, ret2 = bar()
print(ret1)  # 1.0
print(ret2)  # {'a': [100.0, 0.5, 3.0], 'b': 1}

Removal of Size Limit on Kernel Arguments and Return Values:

We have removed the size limit on kernel arguments and return values. Keeping them small is still advisable, because large arguments or return values can lead to substantially longer compile times. Arguments and return values larger than 4KB have not been thoroughly tested, so we cannot guarantee that they work flawlessly.

1.3 Argument Pack

Taichi now supports Argument Packs, which cache parameters that stay unchanged across multiple kernel calls. This makes kernel launches more convenient and can also improve performance.

Key Advantages

Usage Example

import taichi as ti
ti.init()

# Defining a custom argument type using "ti.types.argpack"
view_params_tmpl = ti.types.argpack(view_mtx=ti.math.mat4, proj_mtx=ti.math.mat4, far=ti.f32)

# Declaration of a Taichi kernel leveraging Argument Packs
@ti.kernel
def p(view_params: view_params_tmpl) -> ti.f32:
    return view_params.far

# Instantiation of the argument pack
view_params = view_params_tmpl(
    view_mtx=ti.math.mat4(
        [[1, 0, 0, 0],
         [0, 1, 0, 0],
         [0, 0, 1, 0],
         [0, 0, 0, 1]]),
    proj_mtx=ti.math.mat4(
        [[1, 0, 0, 0],
         [0, 1, 0, 0],
         [0, 0, 1, 0],
         [0, 0, 0, 1]]),
    far=1)

# Executing the kernel with the Argument Pack
print(p(view_params))  # Outputs: 1.0

Supported Data Types

Argument Packs are currently compatible with a variety of data types, including scalar, matrix, vector, Ndarray, and Struct.

Limitations

Please note that Argument Packs currently do not support the following features and data types:

2. Improvements

2.1 CUDA Memory Allocation Improvements

Dynamic VRAM Allocation:

Changes in device_memory_GB and device_memory_fraction Usage:

Impact on VRAM Consumption:

2.2 CUDA SIMT APIs

Added the following ti.simt.block APIs:

2.3 Sparse grid APIs

Added a helper function for creating a 2D or 3D sparse grid, for example:

    import taichi as ti
    ti.init()

    # create a 2D sparse grid
    grid = ti.sparse.grid(
        {
            "pos": ti.math.vec2,
            "mass": ti.f32,
            "grid2particles": ti.types.vector(20, ti.i32),
        },
        shape=(10, 10),
    )

    # access
    grid[0, 0].pos = ti.math.vec2(1, 2)
    grid[0, 0].mass = 1.0
    grid[0, 0].grid2particles[2] = 123

2.4 GGUI

2.5 AOT

2.6 Error reporting

2.7 Autodiff

3. Bug Fixes

3.1 Autodiff Bugfixes

3.2 AOT Bugfixes

3.3 API Bugfixes

3.4 Build & Environment Bugfixes

3.5 GGUI Bugfixes

4. Deprecation Notice

5. Full changelog

Highlights:
   - **Bug fixes**
      - Fix macro error with ti_import_cpu_memory (#8401) (by **Zhanlue Yang**)
      - Fix argpack nesting issues (by **listerily**)
      - Convert matrices to structs in argpack type members, fixing layout error (by **listerily**)
      - Fix error when returning a struct field member when the return … (#8271) (by **秋云未云**)
      - Fix erroneous handling of ndarray in real function in CFG (#8245) (by **Lin Jiang**)
      - Fix issue with passing python-scope Matrix as ti.func argument (#8197) (by **Zhanlue Yang**)
      - Fix incorrect CFG Graph structure due to missing Block with OffloadedStmts on LLVM backend (#8113) (by **Zhanlue Yang**)
      - Fix type inference error with LowerMatrixPtr pass (#8105) (by **Zhanlue Yang**)
      - Set initial value for Cuda device allocation (#8063) (by **Zhanlue Yang**)
      - Fix the insertion position of the access chain (#7957) (by **Lin Jiang**)
      - Fix wrong datatype size when writing to ndarray from Python scope (by **Ailing Zhang**)
   - **CUDA backend**
      - Warn driver version if it doesn't support memory pool. (#7912) (by **Haidong Lan**)
   - **Documentation**
      - Fixing typo in impl.py on ti.grouped function documentation (#8407) (by **Quentin Warnant**)
      - Update doc about kernels and functions (#8400) (by **Lin Jiang**)
      - Update documentation (#8089) (by **Zhao Liang**)
      - Update docstring for inverse func (#8170) (by **Zhao Liang**)
      - Update type.md, add descriptions of the vector (#8048) (by **Chenzhan Shang**)
      - Fix a bug in faq.md (#7992) (by **Zhao Liang**)
      - Fix problems in type_system.md (#7949) (by **秋云未云**)
      - Add doc about struct arguments (#7959) (by **Lin Jiang**)
      - Fix docstring of mix function (#7922) (by **Zhao Liang**)
      - Update faq and ggui, and add them to CI (#7861) (by **Zhao Liang**)
      - Add kernel sync doc (#7831) (by **Zhao Liang**)
   - **Error messages**
      - Warn before calling the external function (#8177) (by **Lin Jiang**)
      - Add option to print full traceback in Python (#8160) (by **Lin Jiang**)
      - Let to_primitive_type throw an error if the type is a pointer (by **lin-hitonami**)
      - Update deprecation warning of the graph arguments (#7965) (by **Lin Jiang**)
   - **Language and syntax**
      - Add clz instruction (#8276) (by **Jett Chen**)
      - Move real function out of the experimental module (#8399) (by **Lin Jiang**)
      - Fix error with loop unique analysis for MatrixPtrStmt (#8307) (by **Zhanlue Yang**)
      - Pass DebugInfo from Python to C++ for ndarray and field (#8286) (by **魔法少女赵志辉**)
      - Support TensorType for SharedArray (#8258) (by **Zhanlue Yang**)
      - Use ErrorEmitter in type check passes (#8285) (by **魔法少女赵志辉**)
      - Implement struct DebugInfo and ErrorEmitter (#8284) (by **魔法少女赵志辉**)
      - Add TensorType support for Constant Folding (#8250) (by **Zhanlue Yang**)
      - Support TensorType for irpass::alg_simp() (#8225) (by **Zhanlue Yang**)
      - Support vector/matrix ndarray arguments in real function (by **Lin Jiang**)
      - Fix error on ndarray type check (by **Lin Jiang**)
      - Support real function in data-oriented classes (by **lin-hitonami**)
      - Let kernel support return type annotated with 'typing.Tuple' (by **lin-hitonami**)
      - Support tuple return value for kernel and real function (by **lin-hitonami**)
      - Let static assert be in static scope (#8217) (by **Lin Jiang**)
      - Avoid scalarization for AOS GlobalPtrStmt (#8187) (by **Zhanlue Yang**)
      - Support matrix return value for real function (by **lin-hitonami**)
      - Support ndarray argument for real function (by **lin-hitonami**)
      - Cast the scalar arguments and return values of ti.func if the type hints exist (#8193) (by **Lin Jiang**)
      - Handle MatrixPtrStmt for uniquely_accessed_pointers() (#8165) (by **Zhanlue Yang**)
      - Support struct arguments for real function (by **lin-hitonami**)
      - Merge irpass::half2_vectorize() with irpass::scalarize() (#8102) (by **Zhanlue Yang**)
      - Migrate irpass::scalarize() after optimize_bit_struct_stores & determine_ad_stack_size (#8097) (by **Zhanlue Yang**)
      - Migrate irpass::scalarize() after irpass::demote_operations() (#8096) (by **Zhanlue Yang**)
      - Migrate irpass::scalarize() after irpass::lower_access() (#8091) (by **Zhanlue Yang**)
      - Migrate irpass::scalarize() after irpass::make_block_local() (#8090) (by **Zhanlue Yang**)
      - Support TensorType for Dead-Store-Elimination (#8065) (by **Zhanlue Yang**)
      - Optimize alias checking conditions for store-to-load forwarding (#8079) (by **Zhanlue Yang**)
      - Support TensorType for Load-Store-Forwarding (#8058) (by **Zhanlue Yang**)
      - Fix TensorTyped error with irpass::make_thread_local() (#8051) (by **Zhanlue Yang**)
      - Fix numerical issue with auto_diff() (#8025) (by **Zhanlue Yang**)
      - Migrate irpass::scalarize() after irpass::make_mesh_block_local() (#8030) (by **Zhanlue Yang**)
      - Migrate irpass::scalarize() after irpass::make_thread_local() (#8028) (by **Zhanlue Yang**)
      - Support allocate with cuda memory pool and reduce preallocation size accordingly (#7929) (by **Zhanlue Yang**)
      - Migrate irpass::scalarize() after irpass::demote_no_access_mesh_fors() (#7956) (by **Zhanlue Yang**)
      - Fix error with irpass::check_out_of_bound() for TensorTyped ExternalPtrStmt (#7997) (by **Zhanlue Yang**)
      - Migrate irpass::scalarize() after irpass::demote_atomics() (#7943) (by **Zhanlue Yang**)
      - Separate out preallocation logics for runtime objects (#7938) (by **Zhanlue Yang**)
      - Remove deprecated funcs in __init__.py (#7941) (by **Lin Jiang**)
      - Remove deprecated sparse_matrix_builder function (#7942) (by **Lin Jiang**)
      - Remove deprecated compile option ndarray_use_cached_allocator (#7937) (by **Zhanlue Yang**)
      - Migrate irpass::scalarize() after irpass::detect_read_only() (#7939) (by **Zhanlue Yang**)
      - Remove deprecated funcs in ti.ui (#7940) (by **Lin Jiang**)
      - Remove the support for 'is' (#7930) (by **Lin Jiang**)
      - Migrate irpass::scalarize() after irpass::offload() (#7919) (by **Zhanlue Yang**)
      - Raise error when the dimension of the ndrange does not equal the number of loop variables (#7933) (by **Lin Jiang**)
      - Remove a.atomic(b) (#7925) (by **Lin Jiang**)
      - Cancel deprecating native min/max (#7928) (by **Lin Jiang**)
      - Fix the api doc search problem (#7918) (by **Zhao Liang**)
      - Move irpass::scalarize() after irpass::auto_diff() (#7902) (by **Zhanlue Yang**)
      - Fix Ndarray fill with Matrix/Vector typed values (#7901) (by **Zhanlue Yang**)
      - Add cast to field.fill() interface (#7899) (by **Zhanlue Yang**)
      - Let nested data classes have methods (#7909) (by **Lin Jiang**)
      - Let kernel argument support matrix nested in a struct (by **lin-hitonami**)
      - Support the functions of dataclass as kernel argument and return value (#7865) (by **Lin Jiang**)
      - Fix a bug on PosixPath (#7860) (by **Zhao Liang**)
      - Postpone MatrixType scalarization to irpass::differentiation_validation_check() (#7839) (by **Zhanlue Yang**)
      - Postpone MatrixType scalarization to irpass::gather_meshfor_relation_types() (#7838) (by **Zhanlue Yang**)
   - **Miscellaneous**
      - Make clang-tidy happy on 'explicit' (#7999) (by **秋云未云**)
   - **OpenGL backend**
      - Fix: runtime caught error cannot be displayed in opengl (#7998) (by **秋云未云**)
   - **IR optimization passes**
      - Make merging casts int(int(x)) less aggressive (#7944) (by **Ailing**)
      - Fix redundant clone of stmts across offloaded tasks (#7927) (by **Ailing**)
   - **Refactor**
      - Refactor the argument passing logic of rwtexture and remove extra_args (#7914) (by **Lin Jiang**)
