Calibration Reports "IfmrQuantCalibration with offset scale is illegal" or "IfmrQuantCalibration without offset scale is illegal"
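Problem Description
When the intermediate calibration model is run for inference in the Caffe framework, input data with an illegal range causes the quantization algorithm to compute an unusable scale. Calibration fails and the Caffe calibration flow is terminated.
Possible Causes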
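- Data range [-inf, +inf]:
Because the AMCT quantization algorithm forces the quantized range to pass through zero, the computed scale is inf/255 = inf. The quantization factors cannot carry such a value in later steps, so the algorithm reports an error and rejects this data range.
- Scenario 1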
I0513 16:12:37.508546 80719 ifmr_layer.cu:58] Doing layer: "Convolution1" calibration, already store 1/1 data.
I0513 16:12:37.508558 80719 ifmr_layer.cu:65] Start to do ifmr quant.
[ERROR][ProcessScale][48] Not support scale is +inf.
[ERROR][IfmrQuantCalibration][338] IfmrQuantCalibration with offset scale is illegal.
[ERROR][IfmrQuantInternel][462] IfmrQuantInternel calibration failed.
F0513 16:12:37.535028 80719 ifmr_layer.cu:70] Check failed: ret == 0 (-65519 vs. 0) Do IFMR calibration failed
*** Check failure stack trace: ***
Aborted
- Scenario 2
I0513 16:14:48.173506 81053 ifmr_layer.cu:58] Doing layer: "Convolution1" calibration, already store 1/1 data.
I0513 16:14:48.173519 81053 ifmr_layer.cu:65] Start to do ifmr quant.
[ERROR][ProcessScale][48] Not support scale is +inf.
[ERROR][IfmrQuantCalibration][359] IfmrQuantCalibration without offset scale is illegal.
[ERROR][IfmrQuantInternel][462] IfmrQuantInternel calibration failed.
F0513 16:14:48.199769 81053 ifmr_layer.cu:70] Check failed: ret == 0 (-65519 vs. 0) Do IFMR calibration failed
*** Check failure stack trace: ***
Aborted
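- Data range so large that the computed scale exceeds 1/FLT_EPSILON (EPSILON can be DBL_EPSILON for the double type or FLT_EPSILON for the float type; FLT_EPSILON is currently used):
AMCT quantization supports a computed scale of at most 1/FLT_EPSILON, because quantization on the Ascend AI Processor is performed as a multiplication by 1/scale. If scale is greater than 1/FLT_EPSILON, 1/scale is smaller than FLT_EPSILON and the quantized result is no longer reliable. The AMCT quantization algorithm therefore quantizes only original data whose range keeps scale within this limit. For any other range it reports an error and rejects the data.
- Scenario 1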
2021-05-11 09:25:42,222 - INFO - [AMCT]:[WeightsCalibrationPass]: Do layer 'Convolution1[arq_quantize]' weights calibration success!
2021-05-11 09:25:42,222 - INFO - [AMCT]:[Optimizer]: Do <class 'amct_caffe.optimizer.insert_ifmr_layer.InsertIFMRLayerPass'>
2021-05-11 09:25:42,223 - INFO - [AMCT]:[Optimizer]: Do <class 'amct_caffe.optimizer.lstm_calibration_replace.LSTMCalibrationReplacePass'>
2021-05-11 09:25:42,223 - INFO - [AMCT]:[Optimizer]: Do <class 'amct_caffe.optimizer.insert_searchn_layer.InsertSearchNLayerPass'>
2021-05-11 09:25:42,223 - INFO - [AMCT]:[Graph]: Doing whole model dump...
2021-05-11 09:25:42,225 - INFO - [AMCT]:[Utils]: The weights_file is saved in xx/data/AMCT15_CAFFE_GPU_GPU_FAQ_LEVEL1_TC_001/AMCT15_CAFFE_GPU_GPU_FAQ_LEVEL1_TC_001_tmp.caffemodel
2021-05-11 09:25:42,226 - INFO - [AMCT]:[Utils]: The model_file is saved in xx/data/AMCT15_CAFFE_GPU_GPU_FAQ_LEVEL1_TC_001/AMCT15_CAFFE_GPU_GPU_FAQ_LEVEL1_TC_001_tmp.prototxt
[ERROR][ProcessScale][52] Not support scale greater than 1 / FLT_EPSILON.
[ERROR][IfmrQuantCalibration][338] IfmrQuantCalibration with offset scale is illegal.
[ERROR][IfmrQuantInternel][462] IfmrQuantInternel calibration failed.
WARNING: Logging before InitGoogleLogging() is written to STDERR
F0511 09:25:42.292863 80432 ifmr_layer.cu:70] Check failed: ret == 0 (-65519 vs. 0) Do IFMR calibration failed
*** Check failure stack trace: ***
Aborted
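The two rejections above can be reproduced with a small pre-check on a layer's observed data range. This is only a sketch under stated assumptions, not AMCT's actual ProcessScale code: the helper name check_scale is hypothetical, FLT_EPSILON is taken to be the float32 machine epsilon (2^-23), and the scale is assumed to be the zero-including range divided by 255, following the inf/255 example above.

```python
import math

# Assumption: C's FLT_EPSILON for 32-bit floats, i.e. 2**-23.
FLT_EPSILON = 2.0 ** -23


def check_scale(data_min, data_max, levels=255):
    """Classify the scale implied by a data range.

    Hypothetical helper that mirrors the two error conditions
    described above; it is not part of the AMCT API.
    """
    # Extend the range so that it always contains zero.
    lo = min(data_min, 0.0)
    hi = max(data_max, 0.0)
    scale = (hi - lo) / levels

    # Case 1: an infinite (or NaN) range gives an unusable scale.
    if math.isinf(scale) or math.isnan(scale):
        return "illegal: scale is inf or nan"
    # Case 2: 1/scale would fall below FLT_EPSILON.
    if scale > 1.0 / FLT_EPSILON:
        return "illegal: scale greater than 1 / FLT_EPSILON"
    return "ok"


if __name__ == "__main__":
    print(check_scale(-1.0, 1.0))
    print(check_scale(0.0, float("inf")))
    print(check_scale(0.0, 3e9))
```

Running such a check on the activations fed to calibration can show which inputs produce an illegal range before the Caffe process aborts.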
Solution
Based on the error message, skip quantization of the layer named in the log, for example the Convolution1 layer in the logs above.
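When many layers are calibrated, the failing layer name can be pulled out of the log automatically. The sketch below assumes the glog line format shown in the logs above ("Doing layer: "<name>" calibration" followed by "Do IFMR calibration failed"); the function name failed_layers is hypothetical. The returned names could then be passed to whatever skip-layer option your AMCT version provides, for example a skip_layers argument when creating the quantization configuration (an assumption to verify against the AMCT API reference).

```python
import re

# Matches the line printed before a layer's IFMR calibration, e.g.
#   ifmr_layer.cu:58] Doing layer: "Convolution1" calibration, ...
LAYER_RE = re.compile(r'Doing layer: "([^"]+)" calibration')


def failed_layers(log_text):
    """Return the names of layers whose IFMR calibration failed.

    Hypothetical helper: relies only on the log format shown above.
    """
    failed = []
    current = None
    for line in log_text.splitlines():
        # Remember the most recent layer that started calibration.
        match = LAYER_RE.search(line)
        if match:
            current = match.group(1)
        # A failure is attributed to that most recent layer.
        if "Do IFMR calibration failed" in line and current is not None:
            if current not in failed:
                failed.append(current)
    return failed


if __name__ == "__main__":
    sample = (
        'I0513 16:12:37.508546 80719 ifmr_layer.cu:58] Doing layer: '
        '"Convolution1" calibration, already store 1/1 data.\n'
        '[ERROR][ProcessScale][48] Not support scale is +inf.\n'
        'F0513 16:12:37.535028 80719 ifmr_layer.cu:70] Check failed: '
        'ret == 0 (-65519 vs. 0) Do IFMR calibration failed\n'
    )
    print(failed_layers(sample))
```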