get_compress_heads

Function Description

After running RARopeCompressor, call get_compress_heads() to generate a .pt file at the specified path.

Function Prototype

RARopeCompressor.get_compress_heads(save_path)

Parameters

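Parameter | Input/Return | Description | Constraints
save_path | Input | Path where the head-compression parameter file is saved during long-sequence compression. | Required. Data type: String.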
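Because save_path is required and must be a string, it can help to check the value before starting a long compression run. The following is a minimal sketch, not part of msmodelslim: the helper name prepare_save_path is hypothetical, and both the ".pt" suffix check and the creation of the parent directory are assumptions of this sketch rather than documented behavior of get_compress_heads().

```python
from pathlib import Path


def prepare_save_path(save_path):
    """Validate a save_path value and make sure its parent directory exists.

    Hypothetical helper for illustration only; it is not provided by msmodelslim.
    """
    # The parameter table states that save_path is a required String.
    if not isinstance(save_path, str):
        raise TypeError(f"save_path must be a str, got {type(save_path).__name__}")
    if not save_path.strip():
        raise ValueError("save_path must not be empty")

    path = Path(save_path).expanduser()

    # Assumption: the output is a .pt file, so this sketch insists on that suffix.
    if path.suffix != ".pt":
        raise ValueError(f"save_path should end with '.pt', got {save_path!r}")

    # Assumption: the library may not create missing directories, so create them here.
    path.parent.mkdir(parents=True, exist_ok=True)
    return str(path)
```

The returned string can then be passed to get_compress_heads() in place of the raw value.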

Example

import torch
import torch_npu  # adds the NPU backend (torch.npu) to PyTorch
from transformers import AutoModelForCausalLM, AutoTokenizer

from msmodelslim.pytorch.ra_compression import RARopeCompressConfig, RARopeCompressor

# Turn off JIT compilation of NPU operators.
torch.npu.set_compile_mode(jit_compile=False)

config = RARopeCompressConfig(induction_head_ratio=0.14, echo_head_ratio=0.01)

save_path = "./win.pt"  # output file for the head-compression parameters
model_path = "/home/wgw/Meta-Llama-3.1-70B-Instruct/"  # local model directory

model = AutoModelForCausalLM.from_pretrained(
    pretrained_model_name_or_path=model_path,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
).eval()

tokenizer = AutoTokenizer.from_pretrained(
    pretrained_model_name_or_path=model_path,
    pad_token='<|extra_0|>',
    eos_token='<|endoftext|>',
    padding_side='left',
    trust_remote_code=True,
)

# Build the compressor, then write the compression heads to save_path.
ra = RARopeCompressor(model, tokenizer, config)
ra.get_compress_heads(save_path)
