
vec_mul

Description

Computes the element-wise product: dst[i] = src0[i] * src1[i]
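As a rough host-side illustration of the element-wise product described above, the sketch below models the float16 case in plain Python. It is not TIK code: `to_float16` and `vec_mul_reference` are hypothetical helper names, and the rounding behavior (round the exact product once to half precision) is an assumption about how the result could be modeled, not a statement about the hardware.

```python
import struct


def to_float16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def vec_mul_reference(src0, src1):
    """Hypothetical host-side model of an element-wise float16 product."""
    if len(src0) != len(src1):
        raise ValueError("src0 and src1 must have the same length")
    # Round the inputs to float16 first, then round each product once.
    return [to_float16(to_float16(a) * to_float16(b)) for a, b in zip(src0, src1)]


result = vec_mul_reference([1.5, -2.0, 0.25], [2.0, 4.0, 8.0])
print(result)  # [3.0, -8.0, 2.0]
```

A model like this can be handy for generating expected values when testing a kernel on small inputs.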

Prototype

vec_mul(mask, dst, src0, src1, repeat_times, dst_rep_stride, src0_rep_stride, src1_rep_stride)

Parameters

See the parameter description.

dst, src0, and src1 must have the same data type.

Atlas 200/300/500 Inference Product: dst/src0/src1 support Tensor(float16/float32/int32)

Atlas Training Series Product: dst/src0/src1 support Tensor(float16/float32/int32)

Atlas Inference Series Product AI Core: dst/src0/src1 support Tensor(float16/int16/float32/int32)

Atlas Inference Series Product Vector Core: dst/src0/src1 support Tensor(float16/int16/float32/int32)

Atlas A2 Training Series Product/Atlas 800I A2 Inference Product: dst/src0/src1 support Tensor(float16/int16/float32/int32)

Atlas 200/500 A2 Inference Product: dst/src0/src1 support Tensor(float16/int16/float32/int32)
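The data type rules in the list above can be captured in a small lookup for checking a configuration before building a kernel. This is only a sketch: `SUPPORTED_DTYPES` and `check_vec_mul_dtypes` are hypothetical names that are not part of TIK, and the product strings are English renderings of the product names listed above.

```python
# Supported dtypes per product, transcribed from the list above.
_WITH_INT16 = {"float16", "int16", "float32", "int32"}
_WITHOUT_INT16 = {"float16", "float32", "int32"}

SUPPORTED_DTYPES = {
    "Atlas 200/300/500 Inference Product": _WITHOUT_INT16,
    "Atlas Training Series Product": _WITHOUT_INT16,
    "Atlas Inference Series Product AI Core": _WITH_INT16,
    "Atlas Inference Series Product Vector Core": _WITH_INT16,
    "Atlas A2 Training Series Product/Atlas 800I A2 Inference Product": _WITH_INT16,
    "Atlas 200/500 A2 Inference Product": _WITH_INT16,
}


def check_vec_mul_dtypes(product, dst_dtype, src0_dtype, src1_dtype):
    """Hypothetical helper: raise ValueError if the dtypes break the rules above."""
    if not (dst_dtype == src0_dtype == src1_dtype):
        raise ValueError("dst, src0, and src1 must share one data type")
    if dst_dtype not in SUPPORTED_DTYPES[product]:
        raise ValueError(f"{dst_dtype} is not supported on {product}")
    return dst_dtype


print(check_vec_mul_dtypes("Atlas Training Series Product",
                           "float16", "float16", "float16"))  # float16
```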

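Returns

Supported Models

Atlas 200/300/500 Inference Product

Atlas Training Series Product

Atlas Inference Series Product AI Core

Atlas Inference Series Product Vector Core

Atlas A2 Training Series Product/Atlas 800I A2 Inference Product

Atlas 200/500 A2 Inference Product

Precautions

See the precautions.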

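Example

This example covers a small data volume that can be moved in a single transfer, and is intended only to show what the interface does. For more complex examples with larger data volumes, see the call examples.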

from tbe import tik
tik_instance = tik.Tik()
src0_gm = tik_instance.Tensor("float16", (128,), name="src0_gm", scope=tik.scope_gm)
src1_gm = tik_instance.Tensor("float16", (128,), name="src1_gm", scope=tik.scope_gm)
dst_gm = tik_instance.Tensor("float16", (128,), name="dst_gm", scope=tik.scope_gm)
src0_ub = tik_instance.Tensor("float16", (128,), name="src0_ub", scope=tik.scope_ubuf)
src1_ub = tik_instance.Tensor("float16", (128,), name="src1_ub", scope=tik.scope_ubuf)
dst_ub = tik_instance.Tensor("float16", (128,), name="dst_ub", scope=tik.scope_ubuf)
# Move the user input data from GM to UB
tik_instance.data_move(src0_ub, src0_gm, 0, 1, 8, 0, 0)
tik_instance.data_move(src1_ub, src1_gm, 0, 1, 8, 0, 0)
tik_instance.vec_mul(128, dst_ub, src0_ub, src1_ub, 1, 8, 8, 8)
# Move the computation result from UB to the destination GM
tik_instance.data_move(dst_gm, dst_ub, 0, 1, 8, 0, 0)

tik_instance.BuildCCE(kernel_name="vec_mul", inputs=[src0_gm, src1_gm], outputs=[dst_gm])
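The numeric arguments in the sample (`8` as the `data_move` burst length, `128` as the mask, `1` as `repeat_times`) can be derived with a little arithmetic. The sketch below assumes the usual TIK conventions, which this page does not state: a data block is 32 bytes, and one vector repeat processes 8 blocks (256 bytes). `plan_single_pass` is a hypothetical helper used only for illustration.

```python
# Assumed constants (general TIK conventions, not stated on this page).
BLOCK_BYTES = 32        # size of one data block, also the data_move burst unit
BLOCKS_PER_REPEAT = 8   # blocks handled by one vector repeat (256 bytes)

DTYPE_BYTES = {"float16": 2, "int16": 2, "float32": 4, "int32": 4}


def plan_single_pass(num_elements, dtype):
    """Hypothetical helper: derive burst length, full mask, and repeat count."""
    elem_bytes = DTYPE_BYTES[dtype]
    total_bytes = num_elements * elem_bytes
    burst = -(-total_bytes // BLOCK_BYTES)  # ceiling division
    elems_per_repeat = BLOCK_BYTES * BLOCKS_PER_REPEAT // elem_bytes
    repeat_times = -(-num_elements // elems_per_repeat)
    return {"burst": burst, "mask": elems_per_repeat, "repeat_times": repeat_times}


print(plan_single_pass(128, "float16"))
# {'burst': 8, 'mask': 128, 'repeat_times': 1}
```

Under these assumptions, 128 float16 elements occupy 256 bytes, which is 8 blocks and exactly one repeat, matching the values passed in the sample. The repeat strides of 8 in the sample would then correspond to 8 blocks between consecutive repeats, although with `repeat_times` equal to 1 they have no visible effect.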

Example result:

Input data (src0_gm):
[ 3.025    8.914    5.406   -6.38     8.15    -6.984   -6.617    8.53
 -9.63     9.836    7.92     6.707    4.523    0.10803 -4.812   -2.068
  6.055   -9.24    -0.6514   0.1265   3.14     5.613   -5.06    -3.182
 -3.988   -8.02    -5.363   -5.66    -3.832   -1.793   -4.223    1.123
  2.904    7.465    6.945    2.027    1.179    0.5415  -3.656    6.52
  8.945    8.695   -9.44     6.918    4.664   -0.4995  -8.49    -0.944
 -9.07     1.041    4.812    0.5127   4.242    1.67     9.89    -5.324
  3.762    2.36     0.724   -9.375   -7.973   -7.582   -4.746    3.926
  6.965    7.375    0.677    5.83    -9.54     7.996    9.19     2.85
  1.707    4.438    4.387    3.312    3.082    0.9263  -4.375    7.51
 -4.516   -4.547    5.457   -6.19    -1.926    4.723    4.1      0.7617
 -4.984   -0.3074  -3.021   -1.843   -5.176   -6.855   -9.805   -0.3064
 -6.02     0.829    0.4595  -5.996    0.093   -6.45    -3.408   -1.678
  5.215   -8.914   -7.37    -1.069    3.467   -9.72     1.433    5.98
  3.916    2.514   -9.55    -1.637    7.285   -4.043   -2.543   -6.43
  8.19    -6.902   -5.992    1.99     9.46    -3.037   -3.752    1.33   ]

Input data (src1_gm):
[ 7.363   -7.08     6.867   -9.414    5.273    3.092   -4.59    -4.53
 -2.203   -8.33    -1.665   -4.66     2.297    0.794   -2.82    -3.373
 -7.867   -9.97    -0.3037   7.92     0.04294  4.41     9.38    -9.734
 -8.1     -8.414   -5.008   -5.79    -5.64    -7.41     1.147    3.643
  7.66     0.196    5.457   -5.543   -6.973    4.695   -0.7764  -3.834
  2.105   -6.758   -4.4     -7.66    -1.375    4.953   -2.65    -5.54
 -2.234    5.574   -9.85    -9.45     1.5     -4.48    -3.225   -2.734
 -4.22    -8.33     7.09     7.086   -8.984   -9.766   -5.773    8.875
 -8.71     3.205   -2.963    1.526    9.7      7.055   -6.13     5.414
  7.656    4.516   -4.945    5.414    9.87     0.4692   4.15     2.834
  3.29    -2.156   -4.363   -6.06    -5.754    8.97     4.277    8.87
 -7.605    7.184   -0.333   -9.734    6.223    2.164   -1.784    9.945
  3.555   -1.429   -0.7563   1.945   -6.496   -3.227    7.332    6.96
  5.125   -9.77    -9.87     6.01     3.9     -9.7     -0.2052   1.81
 -7.242   -7.215   -3.746   -2.734    1.074    8.64    -6.18    -1.421
 -1.693    6.965   -4.555   -2.959    9.65     6.645    4.52     9.2    ]

Output data (dst_gm):
[ 2.2281e+01 -6.3094e+01  3.7125e+01  6.0062e+01  4.2969e+01 -2.1594e+01
  3.0375e+01 -3.8656e+01  2.1219e+01 -8.1938e+01 -1.3188e+01 -3.1250e+01
  1.0391e+01  8.5754e-02  1.3570e+01  6.9766e+00 -4.7625e+01  9.2125e+01
  1.9788e-01  1.0020e+00  1.3489e-01  2.4750e+01 -4.7469e+01  3.0969e+01
  3.2312e+01  6.7500e+01  2.6859e+01  3.2781e+01  2.1609e+01  1.3289e+01
 -4.8438e+00  4.0898e+00  2.2250e+01  1.4639e+00  3.7906e+01 -1.1234e+01
 -8.2188e+00  2.5430e+00  2.8379e+00 -2.5000e+01  1.8828e+01 -5.8750e+01
  4.1500e+01 -5.3000e+01 -6.4141e+00 -2.4746e+00  2.2500e+01  5.2266e+00
  2.0266e+01  5.8047e+00 -4.7406e+01 -4.8477e+00  6.3633e+00 -7.4805e+00
 -3.1891e+01  1.4555e+01 -1.5867e+01 -1.9656e+01  5.1328e+00 -6.6438e+01
  7.1625e+01  7.4062e+01  2.7406e+01  3.4844e+01 -6.0656e+01  2.3641e+01
 -2.0059e+00  8.8984e+00 -9.2562e+01  5.6406e+01 -5.6312e+01  1.5430e+01
  1.3070e+01  2.0031e+01 -2.1688e+01  1.7938e+01  3.0406e+01  4.3457e-01
 -1.8156e+01  2.1281e+01 -1.4852e+01  9.8047e+00 -2.3812e+01  3.7500e+01
  1.1078e+01  4.2344e+01  1.7547e+01  6.7539e+00  3.7906e+01 -2.2090e+00
  1.0059e+00  1.7938e+01 -3.2219e+01 -1.4836e+01  1.7500e+01 -3.0469e+00
 -2.1391e+01 -1.1846e+00 -3.4741e-01 -1.1664e+01 -6.0449e-01  2.0812e+01
 -2.4984e+01 -1.1680e+01  2.6719e+01  8.7125e+01  7.2750e+01 -6.4297e+00
  1.3523e+01  9.4312e+01 -2.9395e-01  1.0820e+01 -2.8359e+01 -1.8141e+01
  3.5750e+01  4.4766e+00  7.8242e+00 -3.4938e+01  1.5711e+01  9.1328e+00
 -1.3867e+01 -4.8062e+01  2.7297e+01 -5.8906e+00  9.1312e+01 -2.0188e+01
 -1.6953e+01  1.2242e+01]
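As a quick sanity check, the first few listed outputs can be compared with the plain product of the listed inputs. The sketch below copies the first four values from each listing above. The tolerance `REL_TOL` is an assumption chosen to absorb float16 rounding and the rounding of the printed values, not a documented accuracy bound.

```python
import math

# First four values copied from the src0_gm, src1_gm, and dst_gm listings.
src0 = [3.025, 8.914, 5.406, -6.38]
src1 = [7.363, -7.08, 6.867, -9.414]
dst = [2.2281e+01, -6.3094e+01, 3.7125e+01, 6.0062e+01]

# Assumed tolerance covering float16 precision and printed-digit rounding.
REL_TOL = 5e-3

matches = [math.isclose(a * b, d, rel_tol=REL_TOL)
           for a, b, d in zip(src0, src1, dst)]
print(matches)  # [True, True, True, True]
```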