Configure OpenSM with ICE ClusterWare#

Note

Using OpenSM with ICE ClusterWare™ is supported only if you have purchased multi-tenant support. Contact Penguin Computing to learn more.

Use the steps in this section to get the OpenSM subnet manager software up and running with the ClusterWare software and an InfiniBand network. Start by creating a new image that includes the OFED drivers, then configure OpenSM, and finally confirm that the ClusterWare compute nodes can communicate over the InfiniBand network.

These steps are required to enable multi-tenant cluster support and network isolation between tenancies. See Tenancy Network Isolation for details.

Prerequisites#

  1. Connect your bare metal nodes with InfiniBand cards to an InfiniBand switch.

  2. Add all nodes to ClusterWare as compute nodes. One of the bare metal compute nodes will become the OpenSM node.
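
     For example, you can list the registered nodes from the head node to confirm they were added. This is a minimal sketch that assumes cw-nodectl supports an ls subcommand analogous to the cw-bootctl ls command used later in this section:

    cw-nodectl ls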

Configure Compute Nodes#

Create and deploy an image with the appropriate OFED drivers installed.

  1. Download the MLNX_OFED drivers for Rocky 9 from the NVIDIA driver page: https://network.nvidia.com/products/infiniband-drivers/linux/mlnx_ofed/

  2. Set a shell variable for the new image name and create a clone of DefaultImage named IBImage:

    IBImage=IBImage
    cw-imgctl -i DefaultImage clone name=${IBImage}
    
  3. Set a shell variable for the new boot configuration name and create a clone of DefaultBoot named IBBoot:

    IBBoot=IBBoot
    cw-bootctl -i DefaultBoot clone name=${IBBoot}
    
  4. Add the IBImage image to the IBBoot boot configuration:

    cw-bootctl -i ${IBBoot} update image=${IBImage}
    
  5. Install required packages to the image:

    cw-modimg -i ${IBImage} --install "perl tk lsof gcc-gfortran libusbx tcl \
    libtool patch kernel-rpm-macros rpm-build autoconf automake" --no-discard \
    --overwrite --upload
    
  6. Find your boot configuration's kernel image version:

    KVER=$(cw-bootctl ls -l --json | jq -r .${IBBoot}.release)
    
  7. Install the kernel-devel package that matches the kernel version used by your image.
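
     For example, assuming the matching package is available from the image's repositories under the name kernel-devel-${KVER} (a reasonable assumption for Rocky 9, but verify the exact package name), it can be added with cw-modimg using the kernel version found in the previous step:

    cw-modimg -i ${IBImage} --install "kernel-devel-${KVER}" --no-discard \
    --overwrite --upload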

  8. Enter the chroot for the new image, add the OFED driver ISO you downloaded previously, and specify the kernel version after --chroot:

    cw-modimg -i ${IBImage} --copyin <OFED driver ISO> /root/ --chroot ${KVER}
    
    1. Within the chroot, mount the OFED driver ISO:

      mount -o ro,loop /root/<OFED driver ISO> /mnt
      
    2. Within the chroot, install the OFED drivers:

      /mnt/mlnxofedinstall --add-kernel-support --skip-repo
      
    3. Exit the chroot. When prompted, keep the changes and say yes to replacing the local image, uploading the image, and replacing the remote image. For example:

      bash-5.1# exit
      exit
      (K)eep changes or (d)iscard? [kd] k
        step completed in 0:36:54.1
      Replace local image? [yn] y
      Repacking IBImage
        fixing SELinux file labels...
          done.
        100.0% complete, elapsed: 0:02:35.6 remaining: 0:00:01.6
      Checksumming...
        elapsed: 0:00:05.8
      Cleaning up.
      Upload image? [yn] y
      Checksumming image IBImage
        elapsed: 0:00:04.8
      Replace remote image? [yn] y
      Replacing remote image.
        100.0% complete, elapsed: 0:00:06.4 remaining: 0:00:00.0
        done.
      
  9. Update initramfs in the IBBoot boot configuration:

    cw-mkramfs --update ${IBBoot}
    
  10. Assign the new image to the ClusterWare compute nodes connected to the InfiniBand network:

    cw-nodectl -in[<nodes>] set _boot_config=${IBBoot}
    
  11. Reboot the nodes:

    cw-nodectl -in[<nodes>] reboot
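
     After the nodes come back up, you can optionally confirm that they booted the kernel from the new boot configuration by comparing uname -r on a node against the ${KVER} value found earlier (assumes ssh access to the node):

    ssh <node> uname -r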
    

Set Up OpenSM Node#

One of the bare metal compute nodes with the OFED drivers will become the OpenSM node. On that node:

  1. Create an OpenSM configuration file /etc/opensm/opensm.conf and set the priority level and port GUID:

    ## Sets opensm priority to the highest level: 15
    sm_priority 15
    
    ## The local port GUID with which OpenSM should bind.
    ## If you don't add this line, opensm will default to
    ## the first active port on the first adapter.
    guid <local port guid>
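
     If you are unsure of the local port GUID, you can read it from the adapter on the OpenSM node, for example with ibstat (mlx4_1 is only an example device name; substitute your adapter):

    ibstat mlx4_1 | grep "Port GUID"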
    
  2. Enable OpenSM on boot:

    systemctl enable opensmd
    
  3. Start the opensm service:

    systemctl start opensmd
    
  4. Verify that OpenSM has entered MASTER state:

    [admin@n55 ~]# systemctl status opensmd
    opensmd.service - OpenSM
      Loaded: loaded (/usr/lib/systemd/system/opensmd.service; disabled; preset: disabled)
      Active: active (running) since 2s ago
    Main PID: 4151212 (opensm)
       Tasks: 178 (limit: 1228633)
      Memory: 44.8M
         CPU: 209ms
      CGroup: /system.slice/opensmd.service
              ├─4151212 /usr/sbin/opensm
              └─4151215 osm_crashd
    n55.cluster.local OpenSM[4151212]:  Loading Cached Option:guid_2 = 0x1c34da030042ff21
    n55.cluster.local OpenSM[4151212]: /var/log/opensm.log log file opened
    n55.cluster.local OpenSM[4151212]: OpenSM 5.21.12.MLNX20250617.f74e01b8
    n55.cluster.local opensm[4151212]: OpenSM 5.21.12.MLNX20250617.f74e01b8
    n55.cluster.local OpenSM[4151212]: Entering DISCOVERING state
    n55.cluster.local opensm[4151212]: Entering DISCOVERING state
    n55.cluster.local OpenSM[4151212]: Entering MASTER state
    

    If OpenSM does not enter MASTER state, ensure that:

    • The OpenSM instance is given the highest priority (level 15).

    • There are no other OpenSM services bound to the same network that are set to MASTER.
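
     You can also check which subnet manager is currently MASTER on the fabric with the sminfo utility from the InfiniBand diagnostics installed alongside OFED; its output reports the active subnet manager's LID, GUID, priority, and state:

      sminfo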

Using OpenSM with Compute Nodes#

After configuring OpenSM and setting up the compute node boot configuration and images, you can review InfiniBand device information and test connections.

View InfiniBand Device Information#

You can validate the InfiniBand device information on any compute node connected to the InfiniBand network.

  1. On a compute node, list the available InfiniBand devices using ibv_devices. The output varies based on how many InfiniBand devices are installed in the local machine. For example:

    # ibv_devices
    device                 node GUID
    ------              ----------------
    mlx4_0          0002c903003178f0
    mlx4_1          f4521403007bcba0
    
  2. Use ibv_devinfo to display information about one of the devices. For example, for mlx4_1:

    # ibv_devinfo -d mlx4_1
    hca_id: mlx4_1
      transport: InfiniBand (0)
      fw_ver: 2.30.8000
      node_guid: f452:1403:007b:cba0
      sys_image_guid: f452:1403:007b:cba3
      vendor_id: 0x02c9
      vendor_part_id: 4099
      hw_ver: 0x0
      board_id: MT_1090120019
      phys_port_cnt: 2
      port: 1
        state: PORT_ACTIVE (4)
        max_mtu: 4096 (5)
        active_mtu: 2048 (4)
        sm_lid: 2
        port_lid: 2
        port_lmc: 0x01
        link_layer: InfiniBand
      port: 2
        state: PORT_ACTIVE (4)
        max_mtu: 4096 (5)
        active_mtu: 4096 (5)
        sm_lid: 0
        port_lid: 0
        port_lmc: 0x00
        link_layer: Ethernet
    
  3. Use ibstat to display the status of the device. For example, for mlx4_1:

    # ibstat mlx4_1
    CA 'mlx4_1'
      CA type: MT4099
      Number of ports: 2
      Firmware version: 2.30.8000
      Hardware version: 0
      Node GUID: 0xf4521403007bcba0
      System image GUID: 0xf4521403007bcba3
      Port 1:
        State: Active
        Physical state: LinkUp
        Rate: 56
        Base lid: 2
        LMC: 1
        SM lid: 2
        Capability mask: 0x0251486a
        Port GUID: 0xf4521403007bcba1
        Link layer: InfiniBand
      Port 2:
        State: Active
        Physical state: LinkUp
        Rate: 40
        Base lid: 0
        LMC: 0
        SM lid: 0
        Capability mask: 0x04010000
        Port GUID: 0xf65214fffe7bcba2
        Link layer: Ethernet
    

Test InfiniBand Connections#

Use the ibping utility to test connectivity to an InfiniBand address; ibping runs as a client/server pair.

  1. On the receiving node, start ibping in server mode (-S), specifying the InfiniBand channel adapter (CA) name with -C and the port number with -P:

    # ibping -S -C mlx4_1 -P 1
    
  2. On the sending node, run ibping in client mode, sending a number of packets (-c) to the server's Local Identifier (LID, -L), and specifying the local CA name with -C and the port number with -P:

    # ibping -c 10 -C mlx4_0 -P 1 -L 2
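
     The value passed to -L is the LID of the server's InfiniBand port. You can read it from the receiving node's ibstat output, where it appears as the Base lid (mlx4_1 is only an example device name):

    ibstat mlx4_1 | grep "Base lid"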