ACL single-operator model matching fails during ST testing with the error "[Match][OpModel]MatchOpModel"

2023/05/08


Problem Information

Source	Product Category	Keywords
Official	Operator development	ST testing, ACL single operator

Symptom

During ST testing, ACL single-operator model matching fails with the error "[Match][OpModel]MatchOpModel".


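Cause Analysis

The operator information recorded when the model is loaded does not match the operator information supplied when the model is executed.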

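Solution

When running the ST test, enable debug logging by setting the environment variable "export ASCEND_GLOBAL_LOG_LEVEL=0". Then, under ~/ascend/log, find the key model-loading log "Register model. OpModelDef =" and the key model-execution log "OpExecutor::ExecuteAsync aclOp =", and check whether the operator information in the model at load time matches the operator information at execution time.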
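If the ST test is launched from a script rather than an interactive shell, the same setting can be applied to the child process only. The following is a minimal sketch: only `ASCEND_GLOBAL_LOG_LEVEL=0` comes from this article, while the helper name `run_with_debug_log` and the demo command are hypothetical placeholders for the real ST test invocation.

```python
import os
import subprocess
import sys


def run_with_debug_log(cmd):
    """Run cmd with ASCEND_GLOBAL_LOG_LEVEL=0 set only for the child process."""
    env = dict(os.environ, ASCEND_GLOBAL_LOG_LEVEL="0")
    return subprocess.run(cmd, env=env, capture_output=True, text=True)


if __name__ == "__main__":
    # Placeholder command: replace it with the real ST test invocation.
    demo = [sys.executable, "-c",
            "import os; print(os.environ['ASCEND_GLOBAL_LOG_LEVEL'])"]
    print(run_with_debug_log(demo).stdout.strip())
```

Setting the variable per process keeps debug logging from leaking into unrelated runs in the same shell.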

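Model matching failures are fairly common in single-operator model inference. The following troubleshooting method is provided for reference.

Check the key model-loading logs.
This log indicates that a static single-operator model was loaded (debug level):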
```
AclOpMap::Insert IN, aclOp =
```
This log indicates that a dynamic single-operator model was loaded (info level):
```
AclShapeRangeMap::Insert IN, aclOp =
```
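If debug-level logging is not enabled, search for the INFO-level keyword (Register model. OpModelDef = ). This log is printed whenever a model is loaded, whether static or dynamic.
Check the key model-execution logs.
This log shows the operator shape that the user actually passed in at op execution (info level):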
```
OpExecutor::ExecuteAsync aclOp =
```
Confirm that the loaded dynamic model matches the shape passed in at execution.
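Searching for these keywords by hand gets tedious when the log directory holds many files. The sketch below walks a log directory and reports every line containing one of the four keywords quoted above. The function name `scan_logs` and the short labels are illustrative assumptions, and the code simply reads every file it finds as text.

```python
import os

# Keywords quoted from this article, mapped to what each one indicates.
KEYWORDS = {
    "AclOpMap::Insert IN, aclOp =": "static model loaded",
    "AclShapeRangeMap::Insert IN, aclOp =": "dynamic model loaded",
    "Register model. OpModelDef =": "model registered",
    "OpExecutor::ExecuteAsync aclOp =": "operator executed",
}


def scan_logs(log_dir):
    """Return (path, line_number, meaning, line) for every keyword hit."""
    hits = []
    for root, _dirs, files in os.walk(os.path.expanduser(log_dir)):
        for name in sorted(files):
            path = os.path.join(root, name)
            with open(path, errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    for keyword, meaning in KEYWORDS.items():
                        if keyword in line:
                            hits.append((path, number, meaning, line.rstrip()))
    return hits
```

A typical call would be `scan_logs("~/ascend/log")`, which returns the load and execute records to compare.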

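Example analysis: static single-operator model matching.

Check the key model-loading log (Register model. OpModelDef =). This log prints the full description of the loaded model, including shape, attr, and so on.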

[INFO] ASCENDCL(30164,execute_op):2020-11-18-22:49:52.009.052 [../../../../../acl/single_op/op_model_manager.cpp:308]30164 RegisterModel: "Register model. OpModelDef = [OpModelDef] Path: op_models/0_StnPre_1_2_4_2_6_8_1_2_4_2_6_8_0_2_4_2_6_8_1_2_4_2_6_8_3_2_4_2_6_8.om, 
OpType: StnPre, 
InputDesc[0]: [TensorDesc] DataType = 1, Format = 2, StorageFormat = -1, Shape = [4, 2, 6, 8], StorageShape = [], shapeRange = []
InputDesc[1]: [TensorDesc] DataType = 1, Format = 2, StorageFormat = -1, Shape = [4, 2, 6, 8], StorageShape = [], shapeRange = [] 
InputDesc[2]: [TensorDesc] DataType = 0, Format = 2, StorageFormat = -1, Shape = [4, 2, 6, 8], StorageShape = [], shapeRange = [] 
OutputDesc[0]: [TensorDesc] DataType = 1, Format = 2, StorageFormat = -1, Shape = [4, 2, 6, 8], StorageShape = [], shapeRange = [] 
OutputDesc[1]: [TensorDesc] DataType = 3, Format = 2, StorageFormat = -1, Shape = [4, 2, 6, 8], StorageShape = [], shapeRange = [] , 
Attr: {align_corners = True, default_theta = [1.3, 1.2, 1.3, 1.2], size = [1, 1, 1, 1], use_default_theta = [False, False, True, False, False, False]}"

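Check the key model-execution log (OpExecutor::ExecuteAsync aclOp =). This log prints the input information the user passed when calling the execution interface, including the shape and attr used at execution.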

[INFO] ASCENDCL(30164,execute_op):2020-11-18-22:49:52.277.245 [../../../../../acl/single_op/op_executor.cpp:177]30166 ExecuteAsync: "OpExecutor::ExecuteAsync aclOp = OpType: StnPre, 
InputDesc[0]: [TensorDesc] DataType = 1, Format = 2, StorageFormat = -1, Shape = [4, 2, 6, 8], StorageShape = [], shapeRange = [] 
InputDesc[1]: [TensorDesc] DataType = 1, Format = 2, StorageFormat = -1, Shape = [4, 2, 6, 8], StorageShape = [], shapeRange = [] 
InputDesc[2]: [TensorDesc] DataType = 0, Format = 2, StorageFormat = -1, Shape = [4, 2, 6, 8], StorageShape = [], shapeRange = []
OutputDesc[0]: [TensorDesc] DataType = 1, Format = 2, StorageFormat = -1, Shape = [4, 2, 6, 8], StorageShape = [], shapeRange = [] 
OutputDesc[1]: [TensorDesc] DataType = 3, Format = 2, StorageFormat = -1, Shape = [4, 2, 6, 8], StorageShape = [], shapeRange = [] 
Attr: {align_corners = True, default_theta = [1.3, 1.2, 1.3, 1.2], size = [1, 1, 1, 1], use_default_theta = [False, False, True, False, False, False]}"

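Static single-operator model matching succeeds only if the shape information supplied by the user is exactly the same as the loaded model information. Every InputDesc and OutputDesc in the two logs above must be identical; any difference causes matching to fail.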
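Comparing long descriptor lines by eye is error-prone, so the exact-match check can be scripted. The sketch below assumes the log layout shown in this article (one `InputDesc[i]` or `OutputDesc[i]` per line) and reports which descriptors differ between a load record and an execute record. The helper names are hypothetical, and trailing commas, whitespace, and any `Attr:` suffix on the same line are ignored.

```python
import re

DESC_RE = re.compile(r"((?:Input|Output)Desc\[\d+\]):\s*\[TensorDesc\]\s*(.*)")


def extract_descs(record):
    """Map 'InputDesc[0]' etc. to its whitespace-normalized description."""
    descs = {}
    for line in record.splitlines():
        match = DESC_RE.search(line)
        if match:
            label, body = match.groups()
            # Ignore an Attr block on the same line and trailing punctuation.
            body = body.split("Attr:")[0]
            descs[label] = " ".join(body.split()).rstrip(" ,")
    return descs


def mismatched_descs(load_record, exec_record):
    """Return labels whose descriptions differ between load and execute logs."""
    loaded, executed = extract_descs(load_record), extract_descs(exec_record)
    labels = sorted(set(loaded) | set(executed))
    return [label for label in labels if loaded.get(label) != executed.get(label)]
```

An empty result means every descriptor line is textually identical after normalization, which is the condition a static model needs.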

Example analysis: dynamic single-operator model matching.

Check the key model-loading log.

[INFO] ASCENDCL(4775,execute_mul_op):2020-11-22-00:35:36.262.762 [../../../../../acl/single_op/op_model_manager.cpp:308]4775 RegisterModel: "Register model. OpModelDef = [OpModelDef] Path: op_models/0_Where_3_2_256_3_2_-1_1.om, 
OpType: Where, 
InputDesc[0]: [TensorDesc] DataType = 3, Format = 2, StorageFormat = -1, Shape = [256], StorageShape = [], shapeRange = 256, 256 
OutputDesc[0]: [TensorDesc] DataType = 3, Format = 2, StorageFormat = -1, Shape = [-1, 1], StorageShape = [], shapeRange = [[0, 256], [1, 1]] , Attr: {}"

Check the key model-execution log.

[INFO] ASCENDCL(4775,execute_mul_op):2020-11-22-00:35:36.516.023 [../../../../../acl/single_op/op_executor.cpp:177]4775 ExecuteAsync: "OpExecutor::ExecuteAsync aclOp = OpType: Where, 
InputDesc[0]: [TensorDesc] DataType = 3, Format = 2, StorageFormat = -1, Shape = [256], StorageShape = [], shapeRange = [] 
OutputDesc[0]: [TensorDesc] DataType = 3, Format = 2, StorageFormat = -1, Shape = [256, 1], StorageShape = [], shapeRange = [], Attr: {}"

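For dynamic single-operator model matching, all information at execution other than shape and shapeRange must be the same as in the loaded model, and the execution shape must fall within the model's shapeRange.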
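The range rule can be illustrated with a short check. This sketch is not the AscendCL implementation. It assumes that a model dimension of -1 is dynamic and is bounded by the `[low, high]` pair at the same position in shapeRange, and that every other dimension must be equal. The function name `shape_in_range` is hypothetical.

```python
def shape_in_range(exec_shape, model_shape, shape_range):
    """Check an execution shape against a dynamic model's shape and shapeRange.

    Assumption: a model dimension of -1 is dynamic and is bounded by the
    matching [low, high] pair in shape_range; any other dimension is fixed.
    """
    if len(exec_shape) != len(model_shape):
        return False
    for index, (actual, declared) in enumerate(zip(exec_shape, model_shape)):
        if declared == -1:
            low, high = shape_range[index]
            if not low <= actual <= high:
                return False
        elif actual != declared:
            return False
    return True
```

With the Where output descriptor from the logs above, `shape_in_range([256, 1], [-1, 1], [[0, 256], [1, 1]])` passes, whereas a first dimension of 300 would fall outside the range.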

End-to-end flow of single-operator aclopExecute.

Single-operator matching first tries the static map and then the dynamic map. If neither matches, matching fails.
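The lookup order can be pictured with a toy model. The sketch below is purely illustrative and does not reflect the internal data structures of AscendCL: the static table is assumed to be a dictionary keyed by an exact description, the dynamic table a list of patterns, and `fits` a caller-supplied predicate. All names are hypothetical.

```python
def match_op_model(request, static_models, dynamic_models, fits):
    """Try the static table first, then the dynamic table.

    request: hashable description of the operator being executed.
    static_models: dict keyed by exact description.
    dynamic_models: list of (pattern, model) pairs.
    fits: callable(request, pattern) -> bool deciding a dynamic match.
    """
    if request in static_models:
        return "static", static_models[request]
    for pattern, model in dynamic_models:
        if fits(request, pattern):
            return "dynamic", model
    raise LookupError("MatchOpModel fail from static map or dynamic map")
```

The static lookup is an exact key hit, while the dynamic lookup is a predicate over candidates, which mirrors why a static model needs identical descriptors and a dynamic model only needs the shape to fit.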

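The following logs show that the dynamic table is tried after the static table misses, and that the static table missed because opType did not match.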

[WARNING] ASCENDCL(51297,python):2021-03-24-22:37:38.638.462 [op_model_manager.cpp:235]51297 Get: Match op type failed. opType = Equal
[INFO] ASCENDCL(51297,python):2021-03-24-22:37:38.638.466 [op_model_manager.cpp:728]51297 MatchOpModel: Match static opModels fail, begin to match model from dynamic opModels. opType = Equal

The following logs show that the dynamic table did not match. This can have several causes.

No model for this opType exists in the dynamic table:

[INFO] ASCENDCL(51297,python):2021-03-24-22:37:38.638.472 [op_model_manager.cpp:253]51297 GetTensorShapeStatus: GetTensorShapeStatus opType is Equal, size of shapeStatus is 0
[WARNING] ASCENDCL(51297,python):2021-03-24-22:37:38.638.475 [op_model_manager.cpp:783]51297 MatchOpModel: MatchOpModel fail from static map or dynamic map

The inputDesc did not match:

[ERROR] ASCENDCL(51297,python):2021-03-24-22:37:38.840.261 [op_model_manager.cpp:293]51297 Get: Match op inputs failed. opType = Equal, inputDesc = 2~9_2_2_-2_false_0|9_2_false_1|
[ERROR] ASCENDCL(51297,python):2021-03-24-22:37:38.840.266 [op_model_manager.cpp:783]51297 MatchOpModel: MatchOpModel fail from static map or dynamic map

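In this case, compare against the inputDesc of the dynamic model that was loaded earlier. To find it, use the method in step 1 (checking the key model-loading logs).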
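The failure messages quoted above can be turned into a quick first-pass triage. The sketch below maps each message fragment to the cause this article gives for it. The fragments are copied from the logs shown here, while the function name `triage` and the hint wording are assumptions for illustration.

```python
# Message fragments quoted from the logs above, with the cause the article gives.
HINTS = [
    ("Match op type failed", "static table: opType not matched"),
    ("size of shapeStatus is 0", "dynamic table: no model for this opType"),
    ("Match op inputs failed", "inputDesc not matched"),
]


def triage(log_lines):
    """Return the hints whose message fragment appears in the log lines, in order."""
    found = []
    for line in log_lines:
        for fragment, hint in HINTS:
            if fragment in line and hint not in found:
                found.append(hint)
    return found
```

The result only narrows down where to look; the descriptors still need to be compared to find the mismatching field.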

Check the information of the Equal model that was loaded.

Insert: AclShapeRangeMap::Insert IN, aclOp = OpType: Equal, 
InputDesc[0]: [TensorDesc] DataType = 9, Format = 2, StorageFormat = 2, Shape = [-2], StorageShape = [-2], shapeRange = [], memtype = 0, isConst = 0 
InputDesc[1]: [TensorDesc] DataType = 9, Format = 2, StorageFormat = -1, Shape = [], StorageShape = [], shapeRange = [], memtype = 1, isConst = 1 , isConst = true, Const Len = 2 ,Const data = 2,0, 
OutputDesc[0]: [TensorDesc] DataType = 12, Format = 2, StorageFormat = 2, Shape = [-2], StorageShape = [-2], shapeRange = [], memtype = 0, isConst = 0 Attr: {}

Check the Equal operator information passed in by the user.

aclopCompileAndExecute: ExecuteAsync::aclOp = OpType: Equal, 
InputDesc[0]: [TensorDesc] DataType = 9, Format = 2, StorageFormat = 2, Shape = [1], StorageShape = [1], shapeRange = [], memtype = 0, isConst = 0 
InputDesc[1]: [TensorDesc] DataType = 9, Format = 2, StorageFormat = -1, Shape = [], StorageShape = [], shapeRange = [], memtype = 1, isConst = 0 
OutputDesc[0]: [TensorDesc] DataType = 12, Format = 2, StorageFormat = 2, Shape = [1], StorageShape = [1], shapeRange = [], memtype = 0, isConst = 0 Attr: {}

Comparing the two shows that for the second input tensor, isConst is 1 in the model but 0 in the user input, so the model does not match.
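A field-level diff makes this kind of mismatch easier to spot. The following sketch parses one TensorDesc line into the fields that appear in these logs and lists the ones that differ. The helper names are hypothetical, the parser takes the first occurrence of each field, and it relies on the `name = value` layout shown above. For dynamic models, differences in Shape and StorageShape are expected and should be judged against the range rule rather than treated as errors.

```python
import re

FIELDS = ("DataType", "Format", "StorageFormat", "Shape",
          "StorageShape", "shapeRange", "memtype", "isConst")


def parse_tensor_desc(line):
    """Extract the first 'name = value' occurrence of each known field."""
    desc = {}
    for field in FIELDS:
        match = re.search(rf"\b{field}\s*=\s*(\[[^=]*\]|[^,\s]+)", line)
        if match:
            desc[field] = match.group(1)
    return desc


def diff_tensor_desc(model_line, exec_line):
    """Return {field: (model_value, exec_value)} for fields that differ."""
    model, executed = parse_tensor_desc(model_line), parse_tensor_desc(exec_line)
    return {field: (model.get(field), executed.get(field))
            for field in FIELDS if model.get(field) != executed.get(field)}
```

Applied to the two InputDesc[1] lines above, the only differing field is isConst, with the model value 1 and the user value 0.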
