Variable Needed for Gradient Computation Modified by an In-Place Operation
Symptom
The console log contains the keyword "one of the variables needed for gradient computation has been modified by an inplace operation", with output similar to the following:
```
ERROR: test_autograd_backward (__main__.TestMode)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/root/miniconda3/envs/pt2.1/lib/python3.8/site-packages/torch/testing/_internal/common_utils.py", line 2388, in wrapper
    method(*args, **kwargs)
  File "npu/test_fault_mode.py", line 159, in test_autograd_backward
    torch.autograd.grad(d2.sum(), a)
  File "/root/miniconda3/envs/pt2.1/lib/python3.8/site-packages/torch/autograd/__init__.py", line 394, in grad
    result = Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [5]], which is output 0 of AddBackward0, is at version 1; expected version 0 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
```
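The following minimal script is a hypothetical reproduction of this error (the variable names are illustrative, not taken from the failing test): a tensor that autograd saved for the backward pass is modified in place before the gradient is computed.

```python
import torch

a = torch.randn(5, requires_grad=True)
b = a + 1            # b is output 0 of AddBackward0
c = b * b            # mul saves b; it is needed to compute dc/db
b += 1               # in-place write bumps b's version counter to 1

# Fails: the saved b is now at version 1, but version 0 was expected,
# matching the RuntimeError shown in the log above.
torch.autograd.grad(c.sum(), a)
```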
Root Cause
Key process: the failure occurs while the autograd engine runs the backward pass (via torch.autograd.grad or torch.autograd.backward).
Root cause analysis: an in-place operation modifies a tensor directly instead of creating a new copy. If autograd has saved that tensor for the backward pass, the in-place write invalidates the saved value (the tensor's internal version counter no longer matches the version recorded at save time), so the gradient can no longer be computed correctly and the error above is raised.
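Autograd detects this through a per-tensor version counter that every in-place operation increments. A short sketch of the mechanism; the `_version` attribute read below is a private implementation detail, used here only for illustration:

```python
import torch

t = torch.ones(3)
print(t._version)    # 0: no in-place modification yet
t.add_(1)            # trailing underscore marks an in-place op
print(t._version)    # 1: the counter was bumped
```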
Solution
Locate the offending line of code from the log, then replace the in-place operation with an out-of-place equivalent. For example, change x += 2 to y = x + 2.
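A sketch of this fix applied to the hypothetical reproduction above: rebinding the name with an out-of-place add creates a new tensor and leaves the value saved by autograd untouched. The error's own hint, torch.autograd.set_detect_anomaly(True), can be enabled first to pinpoint the offending operation.

```python
import torch

torch.autograd.set_detect_anomaly(True)   # optional: locate the failing op

a = torch.randn(5, requires_grad=True)
b = a + 1
c = b * b            # saves this version of b for the backward pass
b = b + 1            # out-of-place: a new tensor; the saved b stays at version 0

grad, = torch.autograd.grad(c.sum(), a)   # now succeeds
print(grad)
```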
| Error Code | None |
|---|---|
| Fault event name | Variable needed for gradient computation modified by an in-place operation |
| Fault explanation / possible cause | Issue in the training script |
| Fault impact | Backward propagation cannot be computed correctly |
| Fault self-handling action | Replace the in-place operator with its out-of-place equivalent |
| System handling suggestion | No action required |