DSL Functional Debugging

Overview

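If an operator fails at run time, developers can use the method described in this section to debug the DSL operator's functionality on the CPU and verify that its algorithm logic is correct.

Users who develop TBE operators with the DSL only need to focus on the algorithm logic; calling the auto_schedule interface provided by the TBE DSL completes the operator's automatic scheduling. Therefore, if the operator still fails after the algorithm description has been confirmed correct, the failure can be attributed to an internal TBE error, and you can report it by filing an issue in the Ascend open-source repository.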

Usage

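TBE DSL provides a debugging framework that verifies operator functionality on the CPU, so that developers can quickly check that an operator behaves correctly. The procedure is as follows:

Figure 1 DSL functional debugging flow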

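1. Enter debug mode and obtain the operator's run context.
   a. Call the tbe.common.testing.debug interface with a Python with statement to enter debug mode and select the CPU as the operator's run platform.
   b. Call the tbe.common.testing.get_ctx interface to obtain the operator's run context.
2. Define the golden data used later to verify intermediate tensors and the operator output.
3. Implement the compute logic of the DSL operator and verify intermediate data as needed.
   a. Call the DSL interfaces provided by TBE to describe the compute logic, specifying the operator's computation method and steps.
   b. (Optional) Verify intermediate tensor data.
      Call the tbe.common.testing.print_tensor interface to save an intermediate tensor's data to a file or print it to the screen, and call the tbe.common.testing.assert_allclose interface to check the intermediate tensor's data.
4. Call the TVM create_schedule interface to create a schedule instance for the operator.
5. Compile and run the operator.
   a. Call the tbe.common.testing.build interface to compile a DSL operator that runs on the CPU.
   b. Call the tbe.common.testing.run interface to run the operator.
6. Verify that the output data is correct.

This completes the debugging code for the DSL operator. Developers can then write an entry function that calls the operator interface to debug the operator's functionality.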
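
The data checks in the steps above compare computed values against golden values within a tolerance. The following NumPy-only sketch illustrates one common form of that comparison, the one used by numpy.testing.assert_allclose. It is an assumption that the tol argument of tbe.common.testing.assert_allclose is a [rtol, atol] pair applied in the same way; the helper name within_tolerance is hypothetical and is not part of TBE.

```python
import numpy as np


def within_tolerance(actual, desired, rtol=1e-7, atol=1e-7):
    """Return True when every element satisfies
    |actual - desired| <= atol + rtol * |desired|.

    Hypothetical helper for illustration only; it mirrors the rule
    used by numpy.testing.assert_allclose, not a confirmed TBE rule.
    """
    actual = np.asarray(actual, dtype=np.float64)
    desired = np.asarray(desired, dtype=np.float64)
    return bool(np.all(np.abs(actual - desired) <= atol + rtol * np.abs(desired)))


golden = np.array([1.0, 2.0, 3.0])
close_result = within_tolerance(golden + 1e-9, golden)  # tiny deviation
far_result = within_tolerance(golden + 1e-3, golden)    # large deviation
print(close_result, far_result)
```

With rtol and atol both set to 1e-7, a deviation of 1e-9 is accepted and a deviation of 1e-3 is rejected.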

Debugging Example

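To show how an intermediate tensor is verified, this example takes two input tensors a and b and computes d = a + b + b. It first computes c = a + b and then d = c + b, where c is the intermediate tensor.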
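
Before involving TBE at all, the expected (golden) values for c and d can be worked out with plain NumPy. The sketch below is illustrative only: the fixed seed and the names golden_c and golden_d are assumptions introduced here, while the shape and dtype match the entry function used later in this example.

```python
import numpy as np

# Fixed seed so the sketch is reproducible (an assumption; the TBE
# example below uses unseeded np.random.uniform).
rng = np.random.default_rng(seed=0)
shape = (2, 3, 4)

a = rng.uniform(size=shape).astype("float32")
b = rng.uniform(size=shape).astype("float32")

# Golden value of the intermediate tensor c = a + b
golden_c = a + b
# Golden value of the final output d = c + b
golden_d = golden_c + b

# The two-step result must agree with the one-line formula d = a + b + b
np.testing.assert_allclose(golden_d, a + b + b, rtol=1e-6)
print(golden_c.shape, golden_d.dtype)
```

The TBE debugging code for the same computation follows.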

from tbe import tvm
from tbe import dsl
from tbe.common.utils import para_check
from tbe.common.utils import shape_util
# Import the interfaces of the testing module
from tbe.common.testing.testing import *
import numpy as np

@para_check.check_input_type(dict, dict, dict, str)
def addtest(input_a, input_b, output_d, kernel_name="addtest"):
    # Enter DSL debug mode and select the CPU as the run platform
    with debug():
        # Get the operator's run context
        ctx = get_ctx()

        # Get the shape and dtype of the input data
        shape_a = shape_util.scalar2tensor_one(input_a.get("shape"))
        shape_b = shape_util.scalar2tensor_one(input_b.get("shape"))
        data_type = input_a.get("dtype").lower()

        # Define random input data with NumPy (the golden values are computed from it)
        a = tvm.nd.array(np.random.uniform(size=shape_a).astype(data_type), ctx)
        b = tvm.nd.array(np.random.uniform(size=shape_b).astype(data_type), ctx)
        # Initialize the output d to all zeros with NumPy
        d = tvm.nd.array(np.zeros(shape_a, dtype=data_type), ctx)

        # Call the TVM placeholder interface to create placeholders for the input tensors; each call returns a tensor object
        data_a = tvm.placeholder(shape_a, name="data_1", dtype=data_type)
        data_b = tvm.placeholder(shape_b, name="data_2", dtype=data_type)
        # Call the DSL compute interface to implement data_a + data_b
        data_c = dsl.vadd(data_a, data_b)

        # Verify the intermediate tensor
        sample = open('samplefile.txt', 'w')
        # Save the intermediate tensor data_c to the file samplefile.txt
        print_tensor(data_c, ofile=sample)
        # Check that the values of the intermediate tensor data_c are correct
        assert_allclose(data_c, desired=a.asnumpy() + b.asnumpy(), tol=[1e-7, 1e-7])
        print("The value of data_c is the same as the expected value.")

        # Continue the DSL description: call the DSL interface to implement data_d = data_c + data_b
        data_d = dsl.vadd(data_c, data_b)
        # Call the TVM create_schedule interface to create a schedule instance for the operator; the argument is the op list of the output tensor.
        s = tvm.create_schedule(data_d.op)

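        # Compile the operator. data_a, data_b and data_d form the placeholder input/output list; AddTest is the name of the custom operator
        build(s, [data_a, data_b, data_d], name="AddTest")

        # Run the operator, passing a, b and d in order to the compiled DSL operator AddTest
        run(a, b, d)  # AddTest(a, b, d)
        # Close the dump file now that the operator has run
        sample.close()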

        # Print the output data d and compare it with the expected result
        print("d:", d)
        tvm.testing.assert_allclose(d.asnumpy(), a.asnumpy() + b.asnumpy() + b.asnumpy())
        print("The actual output is the same as the expected output.")

# Entry function: call addtest
if __name__ == "__main__":
    input_output_dict = {"shape": (2, 3, 4), "format": "ND", "ori_shape": (2, 3, 4), "ori_format": "ND", "dtype": "float32"}
    addtest(input_output_dict, input_output_dict, input_output_dict, kernel_name="addtest")

The output on the screen is:

======================== debug enter =======================
The value of data_c is the same as the expected value.
Tensor add_0 is saved to file samplefile.txt.
d: [[[1.1099341  2.7536283  0.6441797  1.9604567 ]
  [0.25995332 1.997332   1.4089121  1.7263429 ]
  [0.59042984 1.079533   2.1142702  1.3301892 ]]

 [[1.6043677  1.2658448  1.7424912  0.57584894]
  [1.6016111  1.2254462  0.34414443 1.43648   ]
  [2.1096592  1.4920356  2.0675716  2.458529  ]]]
The actual output is the same as the expected output.
======================== debug exit ========================

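The message "The value of data_c is the same as the expected value." shows that the computed result of the intermediate tensor data_c matches the expected result.

The message "The actual output is the same as the expected output." shows that the final output matches the expected result.

The data of the intermediate tensor data_c is saved to the file samplefile.txt.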
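
To repeat the check with other shapes, the descriptor dictionary passed to addtest can be generated by a small helper. The sketch below is a suggestion, not part of TBE: make_tensor_desc is a hypothetical name, it only reproduces the keys used by input_output_dict in the example, and the shapes listed are arbitrary. The call to addtest is left as a comment because it needs a TBE environment.

```python
def make_tensor_desc(shape, dtype="float32", fmt="ND"):
    """Build a descriptor dict with the same keys as input_output_dict.

    Hypothetical helper for illustration; not a TBE interface.
    """
    shape = tuple(shape)
    return {
        "shape": shape,
        "format": fmt,
        "ori_shape": shape,
        "ori_format": fmt,
        "dtype": dtype,
    }


# Arbitrary example shapes (assumptions, chosen only for illustration)
descs = [make_tensor_desc(s) for s in [(16,), (2, 3, 4), (8, 128)]]
for desc in descs:
    print(desc["shape"], desc["dtype"])
    # In a TBE environment, each descriptor could be passed to the example:
    # addtest(desc, desc, desc, kernel_name="addtest")
```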
