Changelog

See Release Notes for summary information about the latest ICE ClusterWare™ release. This section contains a more detailed changelog covering all recent releases.

12.4.0-g0000 - February 3, 2025

  • The product is rebranded from Scyld ClusterWare to ICE ClusterWare. Initial changes are reflected in the product GUI and in the documentation. Future releases will introduce additional branding updates, including updates to the command line tools.

  • Implement the first providers plugin, specifically supporting hypervisors running libvirt using virsh and virt-install commands.

  • Include a couple of example deploy scripts in the /opt/scyld/clusterware-tools/examples/deploy directory.

  • Reduce repetitive logging.

  • Implement a new _altmacs reserved attribute that passes alternative MAC addresses for a node to the DHCP server. This attribute may be replaced by a more robust solution in future releases.

  • Significant simplifications and improvements to the scyld-kube tool used for deploying Kubernetes.

  • Mark nodes as "busy" if virsh list shows running virtual machines on the node.

  • A new scyld-modimg --deploy argument allows administrators to execute an Ansible playbook against an image or combine the copy and execute steps for running a shell script inside the image.

  • The scyld-modimg command now accepts a --progress argument that can either suppress the remaining-time estimate or print dots instead of detailed progress.

  • Propagate errors from the debootstrap tool out to the user to simplify Ubuntu image creation debugging.

  • Prevent users with the NoAccess role from even logging in and prevent tmpadmins from minting tokens.

  • Improve parsing of scyld-modimg --run scripts and document the functionality.

  • Add --discard-on-error option to scyld-modimg to facilitate scripting and automation.

  • The scyld-clusterctl nets tool allows admins to define additional networks where ClusterWare nodes may be connected.

  • Improved ClusterWare graphical user interface (GUI) information architecture to help new users navigate the product.

  • Each primitive now presents a set of labeled fields and components within the ClusterWare GUI that are customized to that primitive.

  • Updated ClusterWare GUI colors and logos to match the new product branding.

  • Make ipmitool and rasdaemon weak dependencies of clusterware-node.

  • Implement a new _aim_status reserved attribute and add support in scyld-nodectl status to show status based on that attribute. Contact Penguin Computing to learn more.

  • Rearrange the build system to better isolate Pyramid code.

  • Move image exports from the image to the head that does the export.

  • Replace libvirt power plugin with a version that calls virsh.

  • Remove the deprecated socket-based waitfor code.

  • Add stricter versioned dependencies between some packages.

  • Ensure scyld-clusterctl hosts entries are pushed to scyld-nss.

  • Remove more references to el7 and remove development packages required by el7 builds.

  • Keep the dnsmasq service up during clusterware service restarts.

  • Allow the mosquitto service to start even with missing certificates.

  • Add image locking during modification to prevent administrators from accidentally overwriting each other's changes.

  • Improve scyld-modimg to make conflicts between different instances less likely.

  • Implement shared-key encryption for communications between head nodes using stunnel.

  • Improve our parsing of ip output.

  • Add documentation about communication encryption.

  • Switch Telegraf from UDP to HTTP(S) with a new relay service, significantly reducing telemetry gaps.

  • Improved method for deploying client packages to switches.

  • Document how to change the etcd password and create a script to recover if the etcd password is lost. Contact Penguin Computing for assistance with the script.

  • Improve the Slurm and Kubernetes installation scripts.

  • Include the API Reference as a part of our standard documentation.

  • Add a missing dependency required to build newer Ubuntu images.

  • Update the supported distros table to include el8.10 and el9.5.

  • Update documentation information architecture and HTML site design to improve user experience.

  • Assorted other bug fixes and documentation updates.
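
The "busy" marking listed above depends on whether virsh reports running virtual machines on a node. The following is a minimal sketch of that kind of check, not the ClusterWare implementation. It assumes the output of `virsh list --state-running --name` (one domain name per line) has already been captured as text; the function names and the sample output are hypothetical.

```python
def running_domains(virsh_output: str) -> list[str]:
    """Return domain names from `virsh list --state-running --name` output."""
    # virsh prints one name per line and ends with a blank line; drop empties.
    return [line.strip() for line in virsh_output.splitlines() if line.strip()]


def is_busy(virsh_output: str) -> bool:
    """Treat a hypervisor node as busy when at least one domain is running."""
    return bool(running_domains(virsh_output))


# Hypothetical captured output from a node hosting two running VMs.
sample = "vm-compute-01\nvm-compute-02\n\n"
print(running_domains(sample))
print(is_busy(sample), is_busy("\n"))
```

Keeping the parsing separate from the command invocation makes the check easy to exercise without a hypervisor present.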

12.3.0-g0000 - October 4, 2024

  • Reduce polling in scyld-nodectl status --refresh and instead leverage the waitfor framework and MQTT.

  • Switch to a Unix socket to communicate between the ClusterWare backend and etcd to enable updating gRPC.

  • Add a new _bootnet attribute for customizing the name of the bootnet interface.

  • Support --selector to select nodes in slurm-scyld.setup.

  • Introduce an improved clusterware-node deployment mechanism for SONiC switches.

  • Make compute node scripts less likely to produce a bad parent-head-node line in /etc/hosts.

  • Support creating tmpfs subdirectories in ignition for diskless STIG'd systems.

  • Cleaner handling of the client.sslverify setting.

  • Reduce the head node minimum memory check after removal of Couchbase.

  • Restrict access to the GUI to only accept secure remote connections.

  • Bump the version numbers for most Python dependencies.

  • Correct "frozen" image handling during import and refuse to delete frozen images.

  • Remove deprecated code, including code specific to el7 head nodes.

  • Add functionality for Telegraf to collect ClusterWare node attributes.

  • Change technique for converting node lists into ranges when reporting status.

  • Tighten some directory permissions.

  • Correct the _ipxe_sanboot creation during bootload installation.

  • Fix a scyld-bootctl export failure that previously required a patch.

  • Provide a mechanism for setting a realtime IO priority on etcd.

  • Make it more difficult to modify a cached version of an image unintentionally.

  • Improve gitrepo backend handling to avoid common failures.

  • Stop creating .old.XX files when modifying objects in multi-head clusters.

  • Avoid the MOTD interfering with scyld-nodectl scp.

  • Small fixes to boot chaining failure handling.

  • Wider use of the cluster certificate authority to secure communications.

  • Fixes for netplan configurations in Ubuntu images.

  • Restart Telegraf when moving between head nodes.

  • Keycloak integration improvements.

  • Assorted other bug fixes and documentation updates.
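
The entry above about converting node lists into ranges does not describe the technique used. As an illustration only, one common approach groups consecutive numeric suffixes into runs. The sketch below assumes node names consist of a prefix plus an integer index (such as n0, n1) and uses a bracketed output format chosen for the example; the function name is hypothetical, and zero-padded indices are not preserved.

```python
import re
from itertools import groupby


def collapse_nodes(names):
    """Collapse names like n0, n1, n2, n5 into the compact form 'n[0-2,5]'."""
    indexed = {}  # prefix -> set of numeric suffixes
    plain = []    # names without a numeric suffix pass through unchanged
    for name in names:
        match = re.fullmatch(r"(.*?)(\d+)", name)
        if match:
            indexed.setdefault(match.group(1), set()).add(int(match.group(2)))
        else:
            plain.append(name)

    parts = []
    for prefix, numbers in sorted(indexed.items()):
        ordered = sorted(numbers)
        ranges = []
        # Consecutive integers share the same (value - position) difference.
        for _, run in groupby(enumerate(ordered), key=lambda pair: pair[1] - pair[0]):
            values = [value for _, value in run]
            ranges.append(str(values[0]) if len(values) == 1
                          else f"{values[0]}-{values[-1]}")
        parts.append(f"{prefix}[{','.join(ranges)}]")
    return ",".join(parts + sorted(plain))


print(collapse_nodes(["n0", "n1", "n2", "n5", "n7", "n8"]))  # n[0-2,5,7-8]
```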

12.2.0-g0000 - July 26, 2024

  • Improve Grafana column scaling.

  • Quiet a warning about TripleDES by removing it as an option from paramiko.

  • Support _boot_style=iscsi on el8 and el9 systems.

  • Update CentOS 7 and CentOS Stream 8 URLs to use vault.centos.org since el7 is now also EOL.

  • Improve DNS resolution of head nodes with multiple IPs by using localise-queries in the dnsmasq.conf.template, and include a leases.register_heads boolean to disable the entire feature.

  • Write NetworkManager connection files on el9 systems and improve netplan configuration file writing on Ubuntu.

  • Initial Redfish support, including an aggregation daemon, with more changes and documentation coming later.

  • Provide a mechanism to create a bootable ISO from one or more boot configs.

  • Improve handling of slurm uid and gid syncing when installing packages.

  • Add arguments to scyld-nodectl kexec to allow for one-time-booting using a specific image or boot configuration.

  • Improve the scyld-modimg --capture error handling.

  • Downgrade ansible-core to 2.15.10 to match Python 3.9.

  • Small improvements and cleanups across the GUI.

  • Introduce a new RBAC system for administrators, currently scoped cluster-wide. All existing admins will now have the FullAdmin role.

  • Support substitution within the power_uri field.

  • Initial support for deploying Harvester nodes from an ISO.

  • Unhide the existing scyld-clusterctl nets functionality.

  • Include the mosquitto MQTT server to publish system events.

  • Confirm keys added through scyld-adminctl can be loaded with paramiko.

  • Improved Ubuntu image handling in scyld-modimg.

  • Expose the limited but existing scyld-nodectl scp functionality.

  • Improve ZTP handling, although only Cumulus is supported so far.

  • Improve the unknown nodes tab for unrecognized DHCP clients.

  • Include a mechanism to mask attribute values in normal output. Default to masking _remote_pass, _tpm_owner_pass, and _bmc_pass.

  • Make more of an effort to mask the SOL password in output.

  • Prevent the creation of unrecognized reserved attributes and update reserved attributes documentation.

  • Include a sched_watcher agent for collecting node status from Slurm.

  • Rework compute node client certificate handling.

  • Clean up dhcp6 error messages.

  • Fix kernel version sorting in scyld-mkramfs.

  • Update numerous Python and npm dependencies.

  • Assorted other bug fixes and documentation updates.
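
The attribute masking introduced in this release can be pictured with a small sketch. The default attribute names below come from the entry above; the function name, the placeholder string, and the sample attribute values are assumptions made for illustration and are not taken from ClusterWare itself.

```python
# Attribute names masked by default, as listed in the changelog entry.
MASKED_BY_DEFAULT = frozenset({"_remote_pass", "_tpm_owner_pass", "_bmc_pass"})


def mask_attributes(attributes, masked=MASKED_BY_DEFAULT, placeholder="********"):
    """Return a copy of attributes with sensitive values replaced."""
    return {
        name: placeholder if name in masked else value
        for name, value in attributes.items()
    }


# Hypothetical node attributes; the original dictionary is left untouched.
node_attrs = {"_boot_config": "DefaultBoot", "_bmc_pass": "s3cret"}
print(mask_attributes(node_attrs))
```

Returning a masked copy rather than editing in place keeps the stored values intact for code paths that legitimately need them.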

12.1.1-g0000 - January 23, 2024

  • Assorted fixes for initramfs ignition use when booting el9 nodes.

  • Rework how scyld-nodectl ssh gets node keys, allowing ssh into el9 nodes with FIPS enabled.

  • Print names in place of some UIDs returned by scyld-*ctl tools.

  • Note and handle that ram_total / ram_free are stored in KiB.

  • Check all uses of urlparse().netloc and replace several with urlparse().hostname.

  • Assorted test script and other bug fixes.
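
The difference between urlparse().netloc and urlparse().hostname mentioned above is easy to overlook: netloc keeps any credentials, port, and IPv6 brackets, while hostname returns only the host. The URLs in this short illustration are made up for the example.

```python
from urllib.parse import urlparse

# Hypothetical URL carrying credentials and an explicit port.
url = urlparse("https://admin:secret@head1.example.com:8443/api/v1")
print(url.netloc)    # admin:secret@head1.example.com:8443
print(url.hostname)  # head1.example.com
print(url.port)      # 8443

# IPv6 literals keep their brackets in netloc but not in hostname.
v6 = urlparse("http://[fe80::1]:80/")
print(v6.netloc)     # [fe80::1]:80
print(v6.hostname)   # fe80::1
```

Code that needs a name to resolve or connect to generally wants hostname, whereas netloc is appropriate only when the full authority component must be reproduced.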

12.1.0-g0000 - December 28, 2023

  • Head node hosted gitrepos can mirror upstream repositories.

  • Several bug fixes around the scyld-nodectl waitfor functionality.

  • Hide the exports section in scyld-imgctl output unless -L is used.

  • Fix a long-standing bug during file upload where "Finishing up..." would still be displayed after the upload was complete.

  • Fix a long-standing bug during file upload that caused an additional file checksum computation.

  • Deprecate the nodes.boot_timeout global in favor of a per-node _boot_timeout attribute.

  • Fix head node eject / leave functionality to make it less likely a removed head node will automatically rejoin or try to provide services to compute nodes.

  • Fix PREFER_KMOD handling in /opt/scyld/clusterware-tools/conf/mkramfs.conf

  • Technology preview of a scheduler-watcher that can be used to feed scheduler status into the ClusterWare database. Attribute names and other details may change.

  • Enable the slider to show and hide scheduler status within the GUI if any node has status information.

  • Avoid address-in-use socket errors with multiple backend daemon threads.

  • Fix typos that broke sync-uids and take-snapshot in ClusterWare 12.

  • Make systems for node status, hardware, health, and monitoring use plugins for easier management.

  • Authenticate with a user's SSH agent if they have already uploaded their public keys into the system.

  • New support for partitioning during boot using ignition. See the documentation for the _ignition reserved attribute for details.

  • Support for installing the GRUB 2 bootloader during boot. See the documentation for the _bootloader reserved attribute for details.

  • Improved image capture capabilities with better error handling and optional use of credentials and sudo.

  • Implement a local signing authority for node client certificates stored in node TPMs.

  • Support searching for a node by hostname even when it differs from the ClusterWare node name.

  • Allow matching of naming pools in node selection using the same syntax that already matched dynamic groups.

  • Add support for attaching an attribute group to a naming pool.

  • Add _domain to specify the domain without using _hostname.

  • Confirmed ClusterWare works on Rocky 9.3 and similar distros.

  • Add a mechanism (chroot.env_paths) to define specific environment variables during image creation.

  • Fix several bugs around node renaming that could have permitted multiple nodes with the same MAC or similar issues.

  • Assorted GUI improvements, bug fixes, and performance improvements.

12.0.1-g0000 - July 24, 2023

  • Reimplement and expose the scyld-nodectl scp functionality.

  • Push scyld-pack-node to systems when running scyld-modimg capture. This also allows us to remove the clusterware-common package.

  • Improve proxy handling during the installation process.

  • Improve the handling of the _hosts attribute.

  • Initial support for scripting scyld-modimg through --run.

  • Provide a mechanism for changing the default hash from sha1 to sha256 or sha512.

  • Deprecate scyld-install --clear in favor of --clear-all.

  • Fix output labelling in scyld-nodectl exec results.

  • Mark node status and the current head node in managedb --heads output.

  • Expand image capture to use _remote_user / _remote_pass.

  • Improved Debian / Ubuntu image creation.

  • Use the latest squashfs tools for packing and unpacking images.

  • Assorted bug fixes and performance improvements.
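
The entry above about changing the default hash from sha1 to sha256 or sha512 does not name the setting involved. The sketch below only illustrates how a selectable digest algorithm might be applied to streamed data using Python's hashlib; the function name, the allowed list, and the default shown here are assumptions for the example, not ClusterWare configuration.

```python
import hashlib
import io

# Algorithms named in the changelog entry.
ALLOWED_ALGORITHMS = ("sha1", "sha256", "sha512")


def digest_stream(stream, algorithm="sha256", chunk_size=1 << 20):
    """Hash a binary stream in chunks with the selected algorithm."""
    if algorithm not in ALLOWED_ALGORITHMS:
        raise ValueError(f"unsupported hash algorithm: {algorithm}")
    digest = hashlib.new(algorithm)
    # Read fixed-size chunks so large files never need to fit in memory.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


# Hypothetical payload standing in for an uploaded file.
payload = b"example image contents"
for name in ALLOWED_ALGORITHMS:
    print(name, digest_stream(io.BytesIO(payload), name))
```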

12.0.0-g0000 - April 21, 2023

  • The first release of ClusterWare version 12. Please see Updating ClusterWare 11 to ClusterWare 12 for more details.

  • Support RHEL / Rocky 9 as a head node and compute node platform.

  • Upgrade to use Python 3.9 on all head node platforms.

  • Entirely rewritten GUI with much more functionality.

  • Switch to Telegraf, InfluxDB version 2, and Grafana instead of TICK. See Grafana Telemetry Dashboard for details about Grafana.

  • Initial support for GRUB 2 as an alternative to iPXE.

  • Configure chrony at install time for time sync within the cluster.

  • Update managedb save to default to saving ONLY the database.

  • Fix selection language matching for attributes[_boot_config].

  • Include a newer (4.6) version of squashfs tools for more recent SELinux-related features.

  • Allow command line clients to authenticate by signing messages with their SSH keys.

  • Remove banner.txt support and use SSH LogLevel to control banner display when executing remote commands.

  • Avoid a crash when two attributes only differ in capitalization.

  • Fix "accept unknown nodes" behavior.

  • Fix behavior of scyld-nodectl exec --label.

  • Implement a new JWT-based authentication system with refresh tokens.

  • New in-memory caching and indexing mechanism to improve document store lookup times.

  • Provide a mechanism to record additional DNS mappings in the ClusterWare database.

  • Default to installing config-less Slurm.

  • Provide a tool to create a scyld-kube.iso for installation on clusters without internet access.

  • Support booting nodes using UEFI in HTTP mode.

  • Implement a restricted status-updater for "busy" nodes in C code, and provide attribute _status_cpuset to restrict cw-status-updater service subprocesses to a specific set of CPU cores.

  • Remove all references to Couchbase and some remaining NFS references.

  • Enable scyld-nss by default on head nodes for name resolution.

  • Use the dracut version native to the image instead of a custom ClusterWare version.

  • Multi-head clusters now automatically rebalance nodes between heads.

  • Many other bug fixes and optimizations.