vec_mul
Function Description
Multiplies the two source operands element by element: dst[i] = src0[i] * src1[i]
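As a rough illustration of these semantics, the sketch below reproduces the element-wise product on the host in plain Python. The helper names `to_float16` and `vec_mul_reference` are hypothetical and not part of TIK. Rounding each product to the nearest float16 value is an assumption about how the result is stored, not a statement about the hardware.

```python
import struct


def to_float16(value):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", value))[0]


def vec_mul_reference(src0, src1):
    """Host-side model of vec_mul for float16: dst[i] = src0[i] * src1[i].

    Hypothetical helper, not a TIK API. Inputs are rounded to float16 first,
    and each product is assumed to be rounded to float16 as well.
    """
    if len(src0) != len(src1):
        raise ValueError("src0 and src1 must have the same length")
    return [to_float16(to_float16(a) * to_float16(b)) for a, b in zip(src0, src1)]


print(vec_mul_reference([2.0, -3.0, 0.5], [4.0, 0.5, 0.25]))
# prints [8.0, -1.5, 0.125]
```

A model like this can be handy for spot-checking kernel output on small inputs before comparing full tensors.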
Function Prototype
vec_mul(mask, dst, src0, src1, repeat_times, dst_rep_stride, src0_rep_stride, src1_rep_stride)
Parameters
See Parameter Description.
dst, src0, and src1 must have the same data type.
Atlas 200/300/500 Inference Product: dst/src0/src1 support Tensor(float16/float32/int32)
Atlas Training Series Product: dst/src0/src1 support Tensor(float16/float32/int32)
Atlas Inference Series Product AI Core: dst/src0/src1 support Tensor(float16/int16/float32/int32)
Atlas Inference Series Product Vector Core: dst/src0/src1 support Tensor(float16/int16/float32/int32)
Atlas A2 Training Series Product/Atlas 800I A2 Inference Product: dst/src0/src1 support Tensor(float16/int16/float32/int32)
Atlas 200/500 A2 Inference Product: dst/src0/src1 support Tensor(float16/int16/float32/int32)
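The data type constraints above can be captured in a small lookup, which may help when validating arguments before building a kernel. This is only a sketch: `SUPPORTED_DTYPES` and `check_vec_mul_dtypes` are hypothetical names, the product keys are English renderings of the product names listed here, and TIK performs its own checks independently of this code.

```python
# Hypothetical lookup table built from the data type list in this section.
_BASE = frozenset({"float16", "float32", "int32"})
_WITH_INT16 = _BASE | {"int16"}

SUPPORTED_DTYPES = {
    "Atlas 200/300/500 Inference Product": _BASE,
    "Atlas Training Series Product": _BASE,
    "Atlas Inference Series Product AI Core": _WITH_INT16,
    "Atlas Inference Series Product Vector Core": _WITH_INT16,
    "Atlas A2 Training Series Product/Atlas 800I A2 Inference Product": _WITH_INT16,
    "Atlas 200/500 A2 Inference Product": _WITH_INT16,
}


def check_vec_mul_dtypes(product, dst_dtype, src0_dtype, src1_dtype):
    """Return True if the three dtypes match and are listed for the product."""
    if not (dst_dtype == src0_dtype == src1_dtype):
        return False  # dst/src0/src1 must share one data type
    return dst_dtype in SUPPORTED_DTYPES[product]


print(check_vec_mul_dtypes("Atlas Training Series Product",
                           "int16", "int16", "int16"))
# prints False
```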
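Returns
None
Supported Products
Atlas 200/300/500 Inference Product
Atlas Training Series Product
Atlas Inference Series Product AI Core
Atlas Inference Series Product Vector Core
Atlas A2 Training Series Product/Atlas 800I A2 Inference Product
Atlas 200/500 A2 Inference Product
Precautions
See Precautions.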
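Example
This sample covers the case where the data volume is small enough to be moved in a single transfer. It is meant to illustrate what the interface does. For more complex samples with larger data volumes, see Calling Examples.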
from tbe import tik

tik_instance = tik.Tik()

# Input and output tensors in global memory (GM)
src0_gm = tik_instance.Tensor("float16", (128,), name="src0_gm", scope=tik.scope_gm)
src1_gm = tik_instance.Tensor("float16", (128,), name="src1_gm", scope=tik.scope_gm)
dst_gm = tik_instance.Tensor("float16", (128,), name="dst_gm", scope=tik.scope_gm)

# Working tensors in the Unified Buffer (UB)
src0_ub = tik_instance.Tensor("float16", (128,), name="src0_ub", scope=tik.scope_ubuf)
src1_ub = tik_instance.Tensor("float16", (128,), name="src1_ub", scope=tik.scope_ubuf)
dst_ub = tik_instance.Tensor("float16", (128,), name="dst_ub", scope=tik.scope_ubuf)

# Move the user input data from GM to UB
tik_instance.data_move(src0_ub, src0_gm, 0, 1, 8, 0, 0)
tik_instance.data_move(src1_ub, src1_gm, 0, 1, 8, 0, 0)

# Element-wise product: mask=128, repeat_times=1, all repeat strides=8
tik_instance.vec_mul(128, dst_ub, src0_ub, src1_ub, 1, 8, 8, 8)

# Move the result from UB to the destination GM
tik_instance.data_move(dst_gm, dst_ub, 0, 1, 8, 0, 0)

tik_instance.BuildCCE(kernel_name="vec_mul", inputs=[src0_gm, src1_gm], outputs=[dst_gm])
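The numeric arguments in the sample (mask 128, one repeat, strides of 8, burst length 8) can be derived from the tensor size. The sketch below shows one way to do that for contiguous data. It assumes the general TIK conventions that one repeat covers eight 32-byte blocks and that repeat strides and data_move burst lengths are counted in 32-byte blocks; these conventions come from the general parameter description rather than from this page, and `derive_vec_mul_args` is a hypothetical helper, not a TIK API.

```python
BLOCK_BYTES = 32          # assumed size of one block
BLOCKS_PER_REPEAT = 8     # assumed number of blocks processed per repeat
DTYPE_BYTES = {"float16": 2, "int16": 2, "float32": 4, "int32": 4}


def derive_vec_mul_args(num_elements, dtype):
    """Derive mask, repeat_times, rep_stride and burst for contiguous data.

    Hypothetical helper: only handles element counts that fill whole repeats.
    """
    elem_bytes = DTYPE_BYTES[dtype]
    elems_per_repeat = BLOCK_BYTES * BLOCKS_PER_REPEAT // elem_bytes
    if num_elements % elems_per_repeat != 0:
        raise ValueError("element count must be a multiple of one repeat")
    return {
        "mask": elems_per_repeat,                         # elements per repeat
        "repeat_times": num_elements // elems_per_repeat,
        "rep_stride": BLOCKS_PER_REPEAT,                  # contiguous repeats
        "burst": num_elements * elem_bytes // BLOCK_BYTES,
    }


print(derive_vec_mul_args(128, "float16"))
# prints {'mask': 128, 'repeat_times': 1, 'rep_stride': 8, 'burst': 8}
```

Under the same assumptions, 128 float32 elements would need a mask of 64 and two repeats, since each repeat then covers only 64 elements.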
Example result:
Input data (src0_gm):
[ 3.025 8.914 5.406 -6.38 8.15 -6.984 -6.617 8.53 -9.63 9.836 7.92 6.707 4.523 0.10803 -4.812 -2.068 6.055 -9.24 -0.6514 0.1265 3.14 5.613 -5.06 -3.182 -3.988 -8.02 -5.363 -5.66 -3.832 -1.793 -4.223 1.123 2.904 7.465 6.945 2.027 1.179 0.5415 -3.656 6.52 8.945 8.695 -9.44 6.918 4.664 -0.4995 -8.49 -0.944 -9.07 1.041 4.812 0.5127 4.242 1.67 9.89 -5.324 3.762 2.36 0.724 -9.375 -7.973 -7.582 -4.746 3.926 6.965 7.375 0.677 5.83 -9.54 7.996 9.19 2.85 1.707 4.438 4.387 3.312 3.082 0.9263 -4.375 7.51 -4.516 -4.547 5.457 -6.19 -1.926 4.723 4.1 0.7617 -4.984 -0.3074 -3.021 -1.843 -5.176 -6.855 -9.805 -0.3064 -6.02 0.829 0.4595 -5.996 0.093 -6.45 -3.408 -1.678 5.215 -8.914 -7.37 -1.069 3.467 -9.72 1.433 5.98 3.916 2.514 -9.55 -1.637 7.285 -4.043 -2.543 -6.43 8.19 -6.902 -5.992 1.99 9.46 -3.037 -3.752 1.33 ]
Input data (src1_gm):
[ 7.363 -7.08 6.867 -9.414 5.273 3.092 -4.59 -4.53 -2.203 -8.33 -1.665 -4.66 2.297 0.794 -2.82 -3.373 -7.867 -9.97 -0.3037 7.92 0.04294 4.41 9.38 -9.734 -8.1 -8.414 -5.008 -5.79 -5.64 -7.41 1.147 3.643 7.66 0.196 5.457 -5.543 -6.973 4.695 -0.7764 -3.834 2.105 -6.758 -4.4 -7.66 -1.375 4.953 -2.65 -5.54 -2.234 5.574 -9.85 -9.45 1.5 -4.48 -3.225 -2.734 -4.22 -8.33 7.09 7.086 -8.984 -9.766 -5.773 8.875 -8.71 3.205 -2.963 1.526 9.7 7.055 -6.13 5.414 7.656 4.516 -4.945 5.414 9.87 0.4692 4.15 2.834 3.29 -2.156 -4.363 -6.06 -5.754 8.97 4.277 8.87 -7.605 7.184 -0.333 -9.734 6.223 2.164 -1.784 9.945 3.555 -1.429 -0.7563 1.945 -6.496 -3.227 7.332 6.96 5.125 -9.77 -9.87 6.01 3.9 -9.7 -0.2052 1.81 -7.242 -7.215 -3.746 -2.734 1.074 8.64 -6.18 -1.421 -1.693 6.965 -4.555 -2.959 9.65 6.645 4.52 9.2 ]
Output data (dst_gm):
[ 2.2281e+01 -6.3094e+01 3.7125e+01 6.0062e+01 4.2969e+01 -2.1594e+01 3.0375e+01 -3.8656e+01 2.1219e+01 -8.1938e+01 -1.3188e+01 -3.1250e+01 1.0391e+01 8.5754e-02 1.3570e+01 6.9766e+00 -4.7625e+01 9.2125e+01 1.9788e-01 1.0020e+00 1.3489e-01 2.4750e+01 -4.7469e+01 3.0969e+01 3.2312e+01 6.7500e+01 2.6859e+01 3.2781e+01 2.1609e+01 1.3289e+01 -4.8438e+00 4.0898e+00 2.2250e+01 1.4639e+00 3.7906e+01 -1.1234e+01 -8.2188e+00 2.5430e+00 2.8379e+00 -2.5000e+01 1.8828e+01 -5.8750e+01 4.1500e+01 -5.3000e+01 -6.4141e+00 -2.4746e+00 2.2500e+01 5.2266e+00 2.0266e+01 5.8047e+00 -4.7406e+01 -4.8477e+00 6.3633e+00 -7.4805e+00 -3.1891e+01 1.4555e+01 -1.5867e+01 -1.9656e+01 5.1328e+00 -6.6438e+01 7.1625e+01 7.4062e+01 2.7406e+01 3.4844e+01 -6.0656e+01 2.3641e+01 -2.0059e+00 8.8984e+00 -9.2562e+01 5.6406e+01 -5.6312e+01 1.5430e+01 1.3070e+01 2.0031e+01 -2.1688e+01 1.7938e+01 3.0406e+01 4.3457e-01 -1.8156e+01 2.1281e+01 -1.4852e+01 9.8047e+00 -2.3812e+01 3.7500e+01 1.1078e+01 4.2344e+01 1.7547e+01 6.7539e+00 3.7906e+01 -2.2090e+00 1.0059e+00 1.7938e+01 -3.2219e+01 -1.4836e+01 1.7500e+01 -3.0469e+00 -2.1391e+01 -1.1846e+00 -3.4741e-01 -1.1664e+01 -6.0449e-01 2.0812e+01 -2.4984e+01 -1.1680e+01 2.6719e+01 8.7125e+01 7.2750e+01 -6.4297e+00 1.3523e+01 9.4312e+01 -2.9395e-01 1.0820e+01 -2.8359e+01 -1.8141e+01 3.5750e+01 4.4766e+00 7.8242e+00 -3.4938e+01 1.5711e+01 9.1328e+00 -1.3867e+01 -4.8062e+01 2.7297e+01 -5.8906e+00 9.1312e+01 -2.0188e+01 -1.6953e+01 1.2242e+01]