Configuring Hardware IOMMU

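After installing the operating system, perform this operation on the following products: Atlas 800 inference server (model 3010), Atlas 800 training server (model 9010), Atlas 200T A2 Box16 heterogeneous subrack, and Atlas 900 A3 SuperPoD cluster basic unit.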

Procedure

  1. Log in to the OS of the physical machine.
  2. Modify the configuration file.

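    Open the configuration file for your OS, add the kernel parameters shown at the end of the kernel command line in the matching example below (for example, iommu=pt intel_iommu=on), and save the file.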

    • Ubuntu 20.04/veLinux 1.1/veLinux 1.1 (5.10.200) configuration file: /boot/grub/grub.cfg
      linux   /boot/vmlinuz-5.4.0-26-generic root=UUID=2774bbf9-2b1d-46fd-b23b-a071e5bf6779 ro iommu=pt intel_iommu=on
      initrd  /boot/initrd.img-5.4.0-26-generic
    • openEuler 22.03 LTS/openEuler 20.03 LTS configuration file: /boot/efi/EFI/openEuler/grub.cfg
      echo    'Loading Linux 5.10.0-60.18.0.50.oe2203.x86_64 ...'
      linux   /vmlinuz-5.10.0-60.18.0.50.oe2203.x86_64 root=UUID=12d3375b-ec4d-49f3-8b08-12b2d8c114cc ro default_hugepagesz=512M hugepagesz=512M hugepages=128 iommu.passthrough=1 resume=UUID=7340c11b-f4f2-4e1a-9abe-65e3e585e79b rhgb quiet crashkernel=512M default_hugepagesz=512M hugepagesz=512M hugepages=128 iommu.passthrough=1 intel_iommu=on iommu=pt
    • Tlinux 3.1/BC Linux 8.2 configuration file: /etc/default/grub
      Add iommu=pt intel_iommu=on to GRUB_CMDLINE_LINUX as shown below, save the file, and then run grub2-mkconfig -o /boot/grub2/grub.cfg
      GRUB_TIMEOUT=5
      GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
      GRUB_DEFAULT=saved
      GRUB_DISABLE_SUBMENU=true
      GRUB_TERMINAL_OUTPUT="console"
      GRUB_CMDLINE_LINUX="crashkernel=1G-4G:168M,4G-64G:256M,64G-:512M resume=/dev/mapper/ts-swap rd.lvm.lv=ts/root rd.lvm.lv=ts/swap iommu=pt intel_iommu=on"
      GRUB_DISABLE_RECOVERY="true"
      GRUB_ENABLE_BLSCFG=true
    • Kylin V10 SP3 configuration file: /boot/efi/EFI/kylin/grub.cfg
      linux   /vmlinuz-4.19.90-52.22.v2207.ky10.x86_64 root=/dev/mapper/klas-root ro resume=/dev/mapper/klas-swap rd.lvm.lv=klas/root rd.lvm.lv=klas/swap rhgb quiet crashkernel=1024M,high audit=0 intel_iommu=on iommu=pt
    • Debian 10 (veLinux 1.3) configuration file: /etc/default/grub
      Update the file as shown below, save it, and then run update-grub
      GRUB_DEFAULT=0
      GRUB_TIMEOUT=5
      GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
      GRUB_CMDLINE_LINUX_DEFAULT="quiet"
      GRUB_CMDLINE_LINUX="iommu.passthrough=0 default_hugepagesz=16G hugepagesz=16G hugepages=32"
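
    For the systems configured through /etc/default/grub, the manual edit above can be scripted. The following is a minimal sketch, not part of the official procedure: add_kernel_params is a hypothetical helper name, and it assumes the file contains exactly one line of the form GRUB_CMDLINE_LINUX="..." with no trailing whitespace. It appends only the parameters that are missing and keeps a .bak copy of the original file.

```shell
#!/bin/sh
# Hypothetical helper (illustration only): append kernel parameters to
# GRUB_CMDLINE_LINUX in a grub defaults file, skipping those already present.
# Usage: add_kernel_params FILE PARAM...
add_kernel_params() {
    file=$1
    shift

    # Assumption: a single GRUB_CMDLINE_LINUX="..." line exists in the file.
    grep -q '^GRUB_CMDLINE_LINUX="' "$file" || return 1

    # Extract the current value between the double quotes.
    current=$(sed -n 's/^GRUB_CMDLINE_LINUX="\(.*\)"$/\1/p' "$file")
    new=$current

    for param in "$@"; do
        case " $new " in
            *" $param "*) ;;                   # already present as a whole token
            *) new="${new:+$new }$param" ;;    # append, separated by one space
        esac
    done

    # Nothing to add: leave the file untouched.
    [ "$new" = "$current" ] && return 0

    # Keep a backup, then rewrite only the GRUB_CMDLINE_LINUX line.
    cp "$file" "$file.bak"
    awk -v new="$new" '
        /^GRUB_CMDLINE_LINUX="/ { print "GRUB_CMDLINE_LINUX=\"" new "\""; next }
        { print }
    ' "$file.bak" > "$file"
}

# Example (run as root on a real system, then regenerate the grub config):
#   add_kernel_params /etc/default/grub iommu=pt intel_iommu=on
```

    After the file has been changed, the grub configuration still has to be regenerated with the command given for the OS (grub2-mkconfig -o /boot/grub2/grub.cfg or update-grub). The helper matches whole parameters only, so an existing intel_iommu=on is not mistaken for iommu=on.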

  3. Restart the OS of the physical machine.

    reboot

  4. Check whether the configuration has taken effect.

    Run the following command to check whether IOMMU is enabled.

    dmesg | grep -Ei "DMAR|IOMMU"

    If the output contains "IOMMU enabled" or "Adding to iommu group", IOMMU is enabled.
    [    0.023162] ACPI: DMAR 0x000000007377D000 0002E2 (v01 INSYDE PRLY-RF  00000001 ACPI 00000001)
    [    0.464468] DMAR: IOMMU enabled
    [    0.794628] DMAR-IR: IOAPIC id 12 under DRHD base  0xc5ffc000 IOMMU 6
    [    0.794630] DMAR-IR: IOAPIC id 11 under DRHD base  0xb87fc000 IOMMU 5
    [    0.794633] DMAR-IR: IOAPIC id 10 under DRHD base  0xaaffc000 IOMMU 4
    [    0.794635] DMAR-IR: IOAPIC id 18 under DRHD base  0xfbffc000 IOMMU 3
    [    0.794637] DMAR-IR: IOAPIC id 17 under DRHD base  0xee7fc000 IOMMU 2
    [    0.794639] DMAR-IR: IOAPIC id 16 under DRHD base  0xe0ffc000 IOMMU 1
    [    0.794642] DMAR-IR: IOAPIC id 15 under DRHD base  0xd37fc000 IOMMU 0
    [    0.794644] DMAR-IR: IOAPIC id 8 under DRHD base  0x9d7fc000 IOMMU 7
    [    0.794646] DMAR-IR: IOAPIC id 9 under DRHD base  0x9d7fc000 IOMMU 7
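
    The check in step 4 can also be automated, for example in a post-install script. The sketch below is illustrative only: iommu_enabled and cmdline_has are hypothetical helper names. The first looks for the two markers named above in kernel log text read from standard input. The second confirms that the parameters added in step 2 reached the running kernel, assuming the command line is supplied on standard input (on Linux it is available in /proc/cmdline).

```shell
#!/bin/sh
# Hypothetical helper: succeed if the kernel log on stdin contains either
# marker that indicates IOMMU is enabled.
iommu_enabled() {
    grep -Eiq 'IOMMU enabled|Adding to iommu group'
}

# Hypothetical helper: succeed if every given parameter appears as a whole
# token in the kernel command line read from stdin.
cmdline_has() {
    cmdline=$(cat)
    for param in "$@"; do
        case " $cmdline " in
            *" $param "*) ;;      # found, check the next parameter
            *) return 1 ;;        # missing parameter
        esac
    done
    return 0
}

# Example usage on the target machine (dmesg may require root):
#   dmesg | iommu_enabled && echo "IOMMU is enabled"
#   cmdline_has intel_iommu=on iommu=pt < /proc/cmdline || echo "parameters missing"
```

    Both helpers read from standard input rather than calling dmesg or opening /proc/cmdline themselves, so they can be tried against saved log text before being used on a live system.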