Verifying Kernel Function Precision on the NPU Board

NPU debugging supports features such as one-click PIPE_ALL. For more details, see "NPU Debugging Features".

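This scenario uses the AddCustom operator as an example and assumes that the input data and the golden (reference) data are bin files supplied by the user. The on-board NPU debugging flow for the kernel function is shown below. Adapt the sample code to your own environment as needed.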
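The flow assumes that x.bin, y.bin and the golden z.bin already exist. As a minimal sketch of one way to produce them with NumPy, the snippet below writes raw int32 files of shape [32], matching the dtype and shape used in this example. The helper name `write_add_case`, the value range, and the assumption that the golden result is simply `x + y` are illustrative and not part of ascendebug; a temporary directory stands in for `/data_path`.

```python
import tempfile
from pathlib import Path

import numpy as np


def write_add_case(out_dir, n=32, seed=0):
    """Write x.bin, y.bin and a golden z.bin (assumed z = x + y) as raw int32."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    rng = np.random.default_rng(seed)
    # Small values so the int32 sum cannot overflow.
    x = rng.integers(-1000, 1000, size=n, dtype=np.int32)
    y = rng.integers(-1000, 1000, size=n, dtype=np.int32)
    z = x + y  # assumed golden result for an element-wise add
    for name, arr in (("x", x), ("y", y), ("z", z)):
        arr.tofile(out_dir / f"{name}.bin")  # raw bytes, no header
    return x, y, z


data_dir = tempfile.mkdtemp()  # stand-in for /data_path
x, y, z = write_add_case(data_dir)
print(sorted(p.name for p in Path(data_dir).iterdir()))
```

With files like these in place, the paths passed to `custom_input` and `custom_output` would point at them. The debugging script itself follows.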

import torch
import numpy as np
import ascendebug

# Set the log file and clear any existing contents
ascendebug.set_log_file('test.log', clean=True)

# 1. Load the input/golden data and build the operator information
debug_op = ascendebug.create_debug_op('AddCustom', 'VectorCore', 'Ascend310P') \
	.custom_input('x', 'int32', [32], '/data_path/x.bin') \
	.custom_input('y', 'int32', [32], '/data_path/y.bin') \
	.custom_output('z', 'int32', [32], '/data_path/z.bin') \
	.attr('mask', 'list_int', [0,0]) \
	.attr('repeatTimes', 'int', 1) \
	.attr('dstBlkStride', 'int', 1) \
	.attr('src0BlkStride', 'int', 1) \
	.attr('src1BlkStride', 'int', 1) \
	.attr('dstRepStride', 'int', 8) \
	.attr('src0RepStride', 'int', 8) \
	.attr('src1RepStride', 'int', 8) \
	.attr('calCount', 'int', 3) \
	.attr('memory', 'int', 0)

# 2. Create the debug object and initialize the workspace
install_pkg = "/usr/local/Ascend/ascend-toolkit/"
customize_path = "/usr/local/Ascend/ascend-toolkit/latest/opp/vendors/add_custom"
op_executor = ascendebug.create_op_executor(debug_op=debug_op, install_path=install_pkg)

# 3. Call the tiling debug interface (optional; skip this step if a tiling bin file already exists)
tiling_info = op_executor.run_custom_tiling(customize_path)

# 4. Call the NPU compile interface
compile_npu_options = ascendebug.CompileNpuOptions()
name, kernel_file, extern = op_executor.compile_custom_npu(customize_path, tiling_info.tiling_key, compile_npu_options)

# 5. Call the NPU run interface to complete the on-board precision comparison
run_npu_options = ascendebug.RunNpuOptions()
compile_info = ascendebug.NpuCompileInfo(syncall=extern['cross_core_sync'], task_ration=extern['task_ration'])
op_executor.run_npu(kernel_file, run_npu_options, npu_compile_info=compile_info, tiling_info=tiling_info)

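For a sample of the precision comparison results produced when the operator runs on the NPU board, see "NPU Debugging Features > Debugging Artifacts".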
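The tool performs the comparison itself. To spot-check an output file by hand, a sketch along the following lines may help. It assumes the output and the golden data are both raw int32 bin files of the same length; `compare_bin` is a hypothetical helper, not an ascendebug API, and the demo files are created in a temporary directory purely for illustration.

```python
import tempfile
from pathlib import Path

import numpy as np


def compare_bin(actual_path, golden_path, dtype=np.int32):
    """Return (all_equal, mismatch_indices) for two raw binary files."""
    actual = np.fromfile(actual_path, dtype=dtype)
    golden = np.fromfile(golden_path, dtype=dtype)
    if actual.shape != golden.shape:
        raise ValueError(f"size mismatch: {actual.shape} vs {golden.shape}")
    # Integer data is compared exactly; float data would need a tolerance.
    mismatch = np.flatnonzero(actual != golden)
    return mismatch.size == 0, mismatch


# Demo with synthetic files: one element is corrupted on purpose.
tmp = Path(tempfile.mkdtemp())
golden = np.arange(32, dtype=np.int32)
actual = golden.copy()
actual[5] += 1
golden.tofile(tmp / "golden.bin")
actual.tofile(tmp / "actual.bin")

ok, bad = compare_bin(tmp / "actual.bin", tmp / "golden.bin")
print(ok, bad.tolist())
```

For floating-point outputs, `np.isclose` with tolerances suited to the operator would be a more appropriate check than exact equality.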
