InternVL2-8B

language:
- en
pipeline_tag: text-generation
tags:
- pretrained
license: other
hardwares:
- NPU
frameworks:
- PyTorch
library_name: openmind

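InternVL2-8B is a multimodal large model with strong image and text processing capabilities. It uses open-source components to narrow the gap with commercial multimodal models and is positioned as an open-source alternative to GPT-4V. In a chatbot, InternVL can parse the user's text input together with image information to generate more vivid and accurate replies. It can also return relevant text for a user's image input, enabling more intelligent interaction.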

Prepare the model

The MindIE image currently provided ships with the InternVL2-8B inference scripts preinstalled, so you do not need the code under atb_models in this repository.

Load the image

Go to the Ascend community Developer Resources page and download the image package adapted for this model: 1.0.0-800I-A2-py311-openeuler24.03-lts

When the download completes, run docker images to confirm the exact image name and tag.

Hardware requirements

Deploying the InternVL2-8B model requires at least one Atlas 800I A2 32G server.

Create the container

Adjust the port and other parameters to suit your environment. Example launch command:

docker run -dit -u root \
--name ${container_name} \
-e ASCEND_RUNTIME_OPTIONS=NODRV \
--privileged=true \
-v /home/path:/home/path \
-v /data:/data \
-v /usr/local/Ascend/driver/:/usr/local/Ascend/driver/ \
-v /usr/local/Ascend/firmware/:/usr/local/Ascend/firmware/ \
-v /usr/local/sbin/:/usr/local/sbin \
-v /etc/ascend_install.info:/etc/ascend_install.info \
-v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
--shm-size=100g \
-p ${host_port}:22 \
--cap-add=SYS_PTRACE \
--security-opt seccomp=unconfined \
${MindIE 1.0.0 image} \
/bin/bash

Enter the container

docker exec -it ${container_name} bash

Install Python dependencies

cd /usr/local/Ascend/atb-models
pip install -r requirements/models/requirements_internvl.txt

Pure model inference

Edit the script /usr/local/Ascend/atb-models/examples/models/internvl/run_pa.sh:

# Set the cards to use: Atlas-800I-A2-32G requires all eight cards; Atlas-800I-A2-64G works with four or eight
export ASCEND_RT_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
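
Because the number of visible cards matters here (eight on the 32G variant), it can help to check the variable before launching. The following is a minimal sketch, not part of run_pa.sh or MindIE: the helper names count_devices and check_device_count are hypothetical, and the sketch assumes the variable holds a plain comma-separated list of device IDs.

```shell
# Hypothetical helpers (not part of run_pa.sh or MindIE).

# Print how many comma-separated device IDs a list such as "0,1,2,3" contains.
count_devices() {
    local list="$1"
    if [ -z "$list" ]; then
        echo 0
        return 0
    fi
    # Put one ID per line, count the lines, and strip padding added by wc.
    printf '%s\n' "$list" | tr ',' '\n' | wc -l | tr -d ' '
}

# Succeed only if ASCEND_RT_VISIBLE_DEVICES lists exactly the expected number of cards.
check_device_count() {
    local expected="$1"
    local actual
    actual="$(count_devices "${ASCEND_RT_VISIBLE_DEVICES:-}")"
    if [ "$actual" -ne "$expected" ]; then
        echo "expected ${expected} visible devices, found ${actual}" >&2
        return 1
    fi
    echo "ok: ${actual} devices visible"
}
```

For example, after the export above, check_device_count 8 would be expected to succeed on an eight-card setup.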

For details on running the script, see the README.md in the same directory as run_pa.sh.

bash /usr/local/Ascend/atb-models/examples/models/internvl/run_pa.sh --run --trust_remote_code ${weights_path} ${image_or_video_dir}

Service-based inference

Open the configuration file

vim /usr/local/Ascend/mindie/latest/mindie-service/conf/config.json

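Modify the configuration file

The # annotations in the excerpt below are explanatory only. Standard JSON has no comment syntax, so leave them out of the actual file.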

{
...
"ServerConfig" :
{
...
"port" : 1040, # user-defined
"managementPort" : 1041, # user-defined
"metricsPort" : 1042, # user-defined
...
"httpsEnabled" : false,
...
},

"BackendConfig": {
...
"npuDeviceIds" : [[0,1,2,3,4,5,6,7]],
...
"ModelDeployConfig":
{
"maxSeqLen" : 50000,
"maxInputTokenLen" : 50000,
"truncation" : false,
"ModelConfig" : [
{
"modelInstanceType": "Standard",
"modelName" : "internvl", # using internvl as the model name is recommended, which makes benchmark testing easier
"modelWeightPath" : "/data/datasets/InternVL2-8B",
"worldSize" : 8,
...
"npuMemSize" : 8, # KV cache allocation in GB; adjustable, but never set it to -1, because device memory must be reserved for the ViT
...
"trustRemoteCode" : false # defaults to false; if set to true, local code is trusted and the user bears the risk
}
]
},
"ScheduleConfig" :
{
...
"maxPrefillTokens" : 50000,
"maxIterTimes": 4096,
...
}
}
}

Set the environment variables for multi-card operation

export MASTER_ADDR=localhost 
export MASTER_PORT=7896

Start the service

cd /usr/local/Ascend/mindie/latest/mindie-service/bin
./mindieservice_daemon
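
Before sending test requests, it can be useful to wait until the service port accepts connections. The following is a rough sketch under stated assumptions: wait_for_service is a hypothetical helper, the daemon is assumed to be running in another terminal or in the background, and probing the root path is only a connectivity check, not a documented MindIE health endpoint.

```shell
# Hypothetical helper: poll until something answers HTTP on host:port, or give up.
# Arguments: host, port, timeout in seconds (defaults to 120).
wait_for_service() {
    local host="$1"
    local port="$2"
    local timeout="${3:-120}"
    local waited=0
    # curl exits 0 for any HTTP response (even an error status),
    # so success only means the port is accepting connections.
    until curl -s -o /dev/null --max-time 2 "http://${host}:${port}/" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "service on ${host}:${port} not reachable after ${timeout}s" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    echo "service on ${host}:${port} is reachable"
}
```

With the port configured above, a call such as wait_for_service 127.0.0.1 1040 would block until the server responds or the timeout expires.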

Test the vLLM interface on the new port from inside the container

curl 127.0.0.1:1040/generate -d '{
"prompt": [
{
"type": "image_url",
"image_url": ${image_path}
},
{"type": "text", "text": "Explain the details in the image."}
],
"max_tokens": 512,
"stream": false,
"do_sample":true,
"repetition_penalty": 1.00,
"temperature": 0.01,
"top_p": 0.001,
"top_k": 1,
"model": "internvl"
}'
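
In the request above, the image path placeholder sits inside a single-quoted string, so the shell will not expand a variable there and the value must be substituted by hand as a quoted JSON string. One way to script this is sketched below. build_generate_payload is a hypothetical helper, the example path in the usage comment is made up, and the sketch assumes the path and question contain no double quotes or backslashes, since it performs no JSON escaping.

```shell
# Hypothetical helper: print a /generate request body for an image path and a question.
# The values are inserted verbatim, so they must not contain double quotes or backslashes.
build_generate_payload() {
    local image_path="$1"
    local question="$2"
    cat <<EOF
{
  "prompt": [
    {"type": "image_url", "image_url": "${image_path}"},
    {"type": "text", "text": "${question}"}
  ],
  "max_tokens": 512,
  "stream": false,
  "do_sample": true,
  "repetition_penalty": 1.00,
  "temperature": 0.01,
  "top_p": 0.001,
  "top_k": 1,
  "model": "internvl"
}
EOF
}

# Usage (inside the container, with the service listening on port 1040;
# /data/images/demo.jpg is a made-up example path):
#   build_generate_payload /data/images/demo.jpg "Explain the details in the image." \
#     | curl 127.0.0.1:1040/generate -d @-
```

Piping the body to curl with -d @- makes curl read the request data from standard input, which avoids quoting the JSON on the command line.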

Test the OpenAI interface on the new port from inside the container

curl 127.0.0.1:1040/v1/chat/completions -d ' {
"model": "internvl",
"messages": [{
"role": "user",
"content": [
{"type": "image_url", "image_url": ${image_path}},
{"type": "text", "text": "Explain the details in the image."}
]
}],
"max_tokens": 512,
"do_sample": true,
"repetition_penalty": 1.00,
"temperature": 0.01,
"top_p": 0.001,
"top_k": 1
}'

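Before using the model resources and services, please read carefully and make sure you fully understand the Ascend Deep Learning Model License Agreement 3.0.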
