Create Login Nodes
In many cluster environments, especially in traditional HPC environments, end users require access to a login node to interface with the system. The end user can log into that node to submit jobs, check on job status, and examine the results of completed jobs. After a complex computation, the results can be quite large and users may prefer to visualize the results on the cluster rather than downloading the content to visualize it on their local machine. Some clusters provide individual login nodes to specific end users whereas others allocate larger shared systems accessible to all end users, or use a mixture of both approaches.
Login nodes also provide end users with access to their home directory. The home directory is usually provided on a shared file system so that the same files and paths are available on the compute nodes during jobs. On a small cluster, that shared file system may be provided by a simple NFS mount; in more complex environments, a more performant and resilient parallel file system (e.g., WekaIO or GPFS) is preferable. Configuring the shared file system is beyond the scope of this document. For NFS sharing, consult the relevant documentation for your chosen operating system.
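As a minimal illustration of NFS-shared home directories, assuming the head node (or another file server) exports /home to the cluster's private network, the export and the corresponding client mount might look like the following; the subnet, server name, and mount options are placeholders:

# On the file server: export /home to the cluster's private network (subnet is a placeholder).
echo '/home 10.110.10.0/24(rw,sync,no_root_squash)' >> /etc/exports
exportfs -ra

# In the login and compute node images: mount the shared /home (server name is a placeholder).
echo 'nfs-server:/home  /home  nfs  defaults,_netdev  0 0' >> /etc/fstab
mount /home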
Access control to login nodes is highly cluster-specific and also not covered in this document. The usual approach is to install an authentication client into the login node image and/or configure PAM within the image to use a site-wide identity provider. In addition to providing a single point of control over who can submit jobs on the cluster, that identity provider also helps keep UIDs and GIDs consistent across the cluster.
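The exact configuration depends on the site's identity provider. As one hedged sketch, joining a login node image to an Active Directory or FreeIPA domain with SSSD might look like the following, where example.com and admin are placeholders and the package names assume an RHEL-compatible image:

# Install the SSSD/realmd stack inside the login node image.
dnf -y install sssd realmd oddjob-mkhomedir
# Join the site domain and enable automatic home directory creation on first login.
realm join --user=admin example.com
authselect select sssd with-mkhomedir --force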
The following files referenced in this example can be found in /opt/scyld/clusterware-tools/examples.
deploy/virsh.sh
Files in the deploy/ subdirectory are example scripts to install packages into images or locally installed nodes. These scripts assume that the head node has access to the internet. You may need to make modifications to the scripts to install from non-standard sources.
partitions.butane
The partitions.butane file provides a Butane (https://coreos.github.io/butane) configuration to minimally partition a local /dev/sda drive for deployment. The configuration will likely require modifications to match the hardware used for deployment.
wipe-disks.sh
The wipe-disks.sh script can be used to entirely remove all partitions from a node. Use it with extreme caution; to prevent accidental execution, the script must be modified before it is run.
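Because each of these files generally needs site-specific edits before use, it is often convenient to copy them into a working directory first and modify the copies, for example:

# Copy the examples to an arbitrary working directory and edit the copies there.
mkdir -p ~/cw-examples
cp -r /opt/scyld/clusterware-tools/examples/* ~/cw-examples/
cd ~/cw-examples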
The ICE ClusterWare™ platform implements “provider” plugins to help cluster administrators allocate and manage login nodes. The following example uses the simplest provider plugin, virsh, to mark a ClusterWare node as a hypervisor, install the appropriate libvirt packages, allocate a virtual login node on that hypervisor, and deploy the newly created login image to the virtual machine’s local disk.
Create an image that includes the necessary libvirt packages.
Clone the DefaultImage to a new name:
scyld-imgctl -iDefaultImage clone name=HypervisorImage
Tip
Although not necessary, defining the hypervisor node with a memorable name will make organizing the cluster easier and also provide a place to set several variables necessary for persistent image deployment.
Create a boot configuration that uses the new image:
scyld-bootctl -iDefaultBoot clone name=HypervisorBoot image=HypervisorImage
Deploy virsh packages into the new image:
scyld-modimg -iHypervisorImage --deploy deploy/virsh.sh \
  --discard-on-error --upload --overwrite
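The exact contents of deploy/virsh.sh vary by release; conceptually it is a short shell script that runs inside the image to install and enable the virtualization stack. A hypothetical minimal sketch, assuming an RHEL-compatible image, might look like:

#!/bin/bash
# Install the libvirt/KVM packages inside the image (package names are assumptions for an RHEL-compatible image).
dnf -y install libvirt qemu-kvm virt-install
# Enable the libvirt daemon so the node can host virtual machines after it boots.
systemctl enable libvirtd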
Configure the attributes and naming pool for the login node.
Create an attribute group and include attributes used for deploying the "HypervisorBoot" as a persistent installation on the local disk:
scyld-attribctl create name=hypers
scyld-attribctl -ihypers set _boot_config=HypervisorBoot \
  _ignition=partitions.butane _bootloader=grub \
  _boot_style=disked _disk_root=LABEL=root
Create the “hypers” naming pool and define a node using it:
scyld-clusterctl pools create name=hypers pattern=hyper{} group=hypers
Create the hyper0 hypervisor node:
scyld-nodectl create mac=00:28:50:34:0f:ce \
  power_uri=ipmi:///root:password@10.110.10.35 naming_pool=hypers
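Before rebooting, it can be useful to confirm that the node was created with the expected attributes; for example, if your ClusterWare release provides the ls subcommand:
scyld-nodectl -ihyper0 ls -l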
Deploy the image to hyper0 by rebooting it and forcing a PXE boot to start the deployment:
scyld-nodectl -ihyper0 reboot then power setnext pxe then waitfor up
This command may take several minutes depending on how long it takes hyper0 to boot. During this time, the local disk is partitioned, the image is unpacked onto those partitions, grub is installed, and then hyper0 reboots to the local disk.
Once hyper0 boots, define a provider instance associated with that node and report the resources on that hypervisor.
Create a provider pointing at hyper0:
scyld-clusterctl providers create name=hyper0 type=virsh \
  spec='{"server": "hyper0"}'
Report the resources to confirm the connection works:
scyld-clusterctl providers -ihyper0 resources
Create a login node image.
Clone the DefaultImage:
scyld-imgctl -iDefaultImage clone name=LoginImage
Create a boot configuration that uses the new image:
scyld-bootctl -iDefaultBoot clone name=LoginBoot image=LoginImage
Tip
Although this example only clones the DefaultImage, additional libraries and applications may make sense depending on the cluster hardware and expected workloads. This is also the time to configure the image to work with a site-wide identity provider.
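For example, site-specific packages or authentication setup can be layered into the cloned image with the same deploy mechanism used for the hypervisor image, where deploy/site-auth.sh is a hypothetical script you would write:
scyld-modimg -iLoginImage --deploy deploy/site-auth.sh \
  --discard-on-error --upload --overwrite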
Define attributes and naming for the login node(s).
Create an attribute group and naming pool for login nodes:
scyld-attribctl create name=logins
scyld-attribctl -ilogins set _boot_config=LoginBoot
Create the logins naming pool:
scyld-clusterctl pools create name=logins pattern=login{:02d} group=logins
Create a virtual machine using the previously defined provider instance.
Allocate a virtual machine (VM) and attach it to the logins naming pool:
scyld-clusterctl providers -ihyper0 alloc --attach logins \
  --cpus 4 --memory 8G --disk 20G
Wait for the new login node to boot:
scyld-nodectl -ilogin00 waitfor up
The created virtual machine can be accessed just like any other ClusterWare node.
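For example, assuming the head node can resolve login00 and SSH keys or site authentication are in place, an administrator can log in directly, or run a quick check through ClusterWare if your release provides the exec subcommand:
ssh login00
scyld-nodectl -ilogin00 exec uptime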