aclnnGather (NN operator API, operator acceleration library, CANN Commercial Edition 8.0.RC3 developer documentation, Ascend Community)

Supported products:
- Atlas inference series products.
- Atlas training series products.
- Atlas A2 training series products / Atlas 800I A2 inference products.


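Each operator is exposed as a two-stage interface: first call aclnnGatherGetWorkspaceSize to obtain the workspace size required for the computation and an executor that encapsulates the operator's computation flow, then call aclnnGather to perform the computation.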


Function: gathers data from the input tensor along the specified dimension dim.
Formula: given a tensor $self$, a dimension $d$, and an index tensor $index$, let $n$ be the number of dimensions of $self$, let $i_d$ be the index along dimension $d$, and let $index_{i_0,\cdots,i_{n-1}}$ be the value of the index tensor $index$ at position $(i_0,\cdots,i_{n-1})$. Gathering along dimension $d$ is then expressed as:
$gather(self,index,d)_{i_0,\cdots,i_{d-1},i_d,i_{d+1},\cdots,i_{n-1}} = self_{i_0,\cdots,i_{d-1},index_{i_0,\cdots,i_{n-1}},i_{d+1},\cdots,i_{n-1}}$
Examples:
- Example 1: given the input tensor $self=\begin{bmatrix}1 & 2 & 3\\ 4 & 5 & 6\\ 7 & 8 & 9\end{bmatrix}$, the index tensor $index=\begin{bmatrix}0 & 2\\ 1 & 0\end{bmatrix}$, and $dim = 0$, the output tensor is $out=\begin{bmatrix}1 & 8\\ 4 & 2\end{bmatrix}$, computed as follows:
  $\begin{aligned} out_{0,0}&=self_{index_{0,0}, 0}=self_{0,0}=1 \\ out_{0,1}&=self_{index_{0,1}, 1}=self_{2,1}=8 \\ out_{1,0}&=self_{index_{1,0}, 0}=self_{1,0}=4 \\ out_{1,1}&=self_{index_{1,1}, 1}=self_{0,1}=2 \end{aligned}$
- Example 2: given the input tensor $self=\begin{bmatrix}1 & 2 & 3\\ 4 & 5 & 6\\ 7 & 8 & 9\end{bmatrix}$, the index tensor $index=\begin{bmatrix}0 & 2\\ 1 & 0\end{bmatrix}$, and $dim = 1$, the output tensor is $out=\begin{bmatrix}1 & 3\\ 5 & 4\end{bmatrix}$, computed as follows:
  $\begin{aligned} out_{0,0}&=self_{0, index_{0,0}}=self_{0,0}=1 \\ out_{0,1}&=self_{0, index_{0,1}}=self_{0,2}=3 \\ out_{1,0}&=self_{1, index_{1,0}}=self_{1,1}=5 \\ out_{1,1}&=self_{1, index_{1,1}}=self_{1,0}=4 \end{aligned}$
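
The gather semantics can be checked on the host with a small reference sketch. The following is a plain-Python illustration on nested lists, not the aclnn implementation; the helper names (`shape_of`, `get_at`, `build`, `gather`) are hypothetical and the sketch assumes rectangular inputs with at least one dimension.

```python
def shape_of(t):
    """Return the shape of a rectangular nested list."""
    shape = []
    while isinstance(t, list):
        shape.append(len(t))
        t = t[0] if t else None
    return tuple(shape)


def get_at(t, pos):
    """Read the element of nested list t at multi-dimensional position pos."""
    for p in pos:
        t = t[p]
    return t


def build(shape, fn, prefix=()):
    """Build a nested list of the given shape, calling fn(position) per element."""
    if len(prefix) == len(shape):
        return fn(prefix)
    return [build(shape, fn, prefix + (i,)) for i in range(shape[len(prefix)])]


def gather(self_t, dim, index):
    """Reference gather: out[pos] = self[pos with pos[dim] replaced by index[pos]]."""
    n = len(shape_of(self_t))
    d = dim + n if dim < 0 else dim  # map a negative dim into [0, n - 1]

    def pick(pos):
        src = list(pos)
        src[d] = get_at(index, pos)  # substitute the coordinate along dim
        return get_at(self_t, src)

    # The output has the same shape as index.
    return build(shape_of(index), pick)


SELF = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
INDEX = [[0, 2], [1, 0]]

print(gather(SELF, 0, INDEX))  # [[1, 8], [4, 2]], matches example 1
print(gather(SELF, 1, INDEX))  # [[1, 3], [5, 4]], matches example 2
```

Because the output is built by iterating over the positions of `index`, its shape always equals the shape of `index`, which is the same relationship the operator requires between index and out.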

aclnnGatherGetWorkspaceSize

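Parameters:
- self (aclTensor*, computation input): the $self$ in the formula. An aclTensor on the Device side. Its data type must match that of out, its shape supports 0 to 8 dimensions, and its number of dimensions must match that of index. Data format: ND.
  - Atlas inference series products, Atlas training series products: supported data types are DOUBLE, FLOAT16, FLOAT32, INT32, UINT32, INT64, UINT64, INT16, UINT16, INT8, UINT8, BOOL.
  - Atlas A2 training series products / Atlas 800I A2 inference products: supported data types are DOUBLE, FLOAT16, BFLOAT16, FLOAT32, INT32, UINT32, INT64, UINT64, INT16, UINT16, INT8, UINT8, BOOL.
- dim (int64_t, computation input): the $d$ in the formula. An integer on the Host side, with valid range [-self.dim(), self.dim() - 1].
- index (aclTensor*, computation input): the $index$ in the formula. An aclTensor on the Device side. Supported data types are INT32 and INT64, and its shape supports 0 to 8 dimensions. Its number of dimensions must match that of self and its shape must match that of out. In every dimension other than the one specified by dim, its size must be less than or equal to the size of the corresponding dimension of self. Data format: ND.
- out (aclTensor*, computation output): an aclTensor on the Device side. Its data type must match that of self, its shape supports 0 to 8 dimensions, and its shape must match that of index. Data format: ND.
  - Atlas inference series products, Atlas training series products: supported data types are DOUBLE, FLOAT16, FLOAT32, INT32, UINT32, INT64, UINT64, INT16, UINT16, INT8, UINT8, BOOL.
  - Atlas A2 training series products / Atlas 800I A2 inference products: supported data types are DOUBLE, FLOAT16, BFLOAT16, FLOAT32, INT32, UINT32, INT64, UINT64, INT16, UINT16, INT8, UINT8, BOOL.
- workspaceSize (uint64_t*, output parameter): returns the size of the workspace that must be allocated on the Device side.
- executor (aclOpExecutor**, output parameter): returns the op executor, which encapsulates the operator's computation flow.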
Return value:

aclnnStatus: the status code returned by the call.
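
The shape and dim constraints listed in the parameter descriptions can be pre-checked on the host before calling the operator. The sketch below is an illustrative plain-Python check, not part of the aclnn API; the function name `check_gather_args` and its error messages are hypothetical, data type checks are omitted, and the 0-dimensional case is not handled.

```python
def check_gather_args(self_shape, dim, index_shape, out_shape):
    """Return a list of violated constraints (empty if the arguments look valid)."""
    errors = []
    rank = len(self_shape)

    if rank > 8:
        errors.append("self has more than 8 dimensions")
    if len(index_shape) != rank:
        errors.append("index and self must have the same number of dimensions")
    if tuple(out_shape) != tuple(index_shape):
        errors.append("out and index must have the same shape")

    # dim must lie in [-self.dim(), self.dim() - 1].
    if not -rank <= dim <= rank - 1:
        errors.append("dim is outside [-self.dim(), self.dim() - 1]")
        return errors

    d = dim + rank if dim < 0 else dim  # normalize a negative dim

    # Outside the gather dimension, index may not be larger than self.
    if len(index_shape) == rank:
        for axis, (i_size, s_size) in enumerate(zip(index_shape, self_shape)):
            if axis != d and i_size > s_size:
                errors.append(
                    f"index size {i_size} exceeds self size {s_size} on axis {axis}"
                )
    return errors


# A 3x3 self with a 2x2 index and a 2x2 out satisfies every check.
print(check_gather_args((3, 3), 0, (2, 2), (2, 2)))  # []
```

Note that the size limit does not apply along dim itself: with a 3x3 self and dim = 1, an index of shape (2, 4) passes this check, while the same index with dim = 0 does not.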

aclnnGather

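Parameters:
- workspace (void*, input parameter): address of the workspace memory allocated on the Device side.
- workspaceSize (uint64_t, input parameter): size of the workspace allocated on the Device side, as obtained from the first-stage interface aclnnGatherGetWorkspaceSize.
- executor (aclOpExecutor*, input parameter): the op executor, which encapsulates the operator's computation flow.
- stream (aclrtStream, input parameter): the AscendCL stream on which the task is executed.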
Return value:

aclnnStatus: the status code returned by the call.

Constraints and limitations: none.
