torch_npu.contrib.npu_fused_attention_with_layernorm

torch_npu.contrib.npu_fused_attention_with_layernorm(hidden_states, attention_mask, query_kernel, key_kernel, value_kernel, query_bias, key_bias, value_bias, gamma, beta, scale=1, keep_prob=0)

A fused NPU implementation of BERT self-attention combined with the preceding layer normalization (pre-layernorm).
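
A minimal usage sketch follows. The tensor shapes, dtypes, and parameter values are illustrative assumptions (a BERT-large-style layer: seq_len=512, hidden=1024, 16 heads of dim 64), not values taken from the official documentation; check the operator constraints of your CANN/torch_npu version for the shapes and dtypes actually supported.

```python
import torch
import torch_npu

batch, seq_len, hidden = 8, 512, 1024

# Input hidden states and additive attention mask on the NPU, fp16.
# The flattened (batch * seq_len, hidden) layout is an assumption.
hidden_states = torch.rand(batch * seq_len, hidden).half().npu()
attention_mask = torch.zeros(batch, 1, seq_len, seq_len).half().npu()

# Q/K/V projection weights and biases, plus the layernorm affine parameters.
query_kernel = torch.rand(hidden, hidden).half().npu()
key_kernel = torch.rand(hidden, hidden).half().npu()
value_kernel = torch.rand(hidden, hidden).half().npu()
query_bias = torch.rand(hidden).half().npu()
key_bias = torch.rand(hidden).half().npu()
value_bias = torch.rand(hidden).half().npu()
gamma = torch.ones(hidden).half().npu()   # layernorm weight
beta = torch.zeros(hidden).half().npu()   # layernorm bias

# scale: softmax scaling factor, commonly 1/sqrt(head_dim) = 1/sqrt(64);
# keep_prob: dropout keep probability (1.0 disables dropout).
# The exact return structure may vary by version; here it is bound as-is.
out = torch_npu.contrib.npu_fused_attention_with_layernorm(
    hidden_states, attention_mask,
    query_kernel, key_kernel, value_kernel,
    query_bias, key_bias, value_bias,
    gamma, beta,
    scale=0.125, keep_prob=1.0)
```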