Changelog
See Release Notes for summary information about the latest ICE ClusterWare™ release. This section contains a more detailed ChangeLog history of all recent releases.
12.4.0-g0000 - February 3, 2025
The product is rebranded from Scyld ClusterWare to ICE ClusterWare. Initial changes are reflected in the product GUI and in the documentation. Future releases will introduce additional branding updates, including updates to the command line tools.
Implement the first providers plugin, specifically supporting hypervisors running libvirt using virsh and virt-install commands.
Include a couple of example deploy scripts in the /opt/scyld/clusterware-tools/examples/deploy directory.
Reduce repetitive logging.
Implement a new _altmacs reserved attribute that passes alternative MAC addresses for a node to the DHCP server. This attribute may be replaced by a more robust solution in future releases.
Significant simplifications and improvements to the scyld-kube tool used for deploying Kubernetes.
Mark nodes as "busy" if virsh list shows running virtual machines on the node.
A new scyld-modimg --deploy argument allows administrators to execute an Ansible playbook against an image or combine the copy and execute steps for running a shell script inside the image.
The scyld-modimg command now accepts a --progress argument to either not print remaining time or to print dots instead of detailed progress.
Propagate errors from the debootstrap tool out to the user to simplify Ubuntu image creation debugging.
Prevent users with the NoAccess role from even logging in and prevent tmpadmins from minting tokens.
Improve parsing of scyld-modimg --run scripts and document the functionality.
Add --discard-on-error option to scyld-modimg to facilitate scripting and automation.
The scyld-clusterctl nets tool allows admins to define additional networks where ClusterWare nodes may be connected.
Improved ClusterWare graphical user interface (GUI) information architecture to help new users navigate the product.
Each primitive now presents a set of labeled fields and components within the ClusterWare GUI that are customized to that primitive.
Updated ClusterWare GUI colors and logos to match the new product branding.
Make ipmitool and rasdaemon weak dependencies of clusterware-node.
Implement a new _aim_status reserved attribute and add support in scyld-nodectl status to show status based on that attribute. Contact Penguin Computing to learn more.
Rearrange the build system to better isolate Pyramid code.
Move image exports from the image to the head that does the export.
Replace libvirt power plugin with a version that calls virsh.
Remove the deprecated socket-based waitfor code.
Add stricter versioned dependencies between some packages.
Ensure scyld-clusterctl hosts entries are pushed to scyld-nss.
Remove more references to el7 and remove development packages required by el7 builds.
Keep the dnsmasq service up during clusterware service restarts.
Allow the mosquitto service to start even with missing certificates.
Add image locking during modification to prevent administrators from accidentally overwriting each other's changes.
Improve scyld-modimg to make conflicts between different instances less likely.
Implement shared-key encryption for communications between head nodes using stunnel.
Improve our parsing of ip output.
Add documentation about communication encryption.
Switch telegraf from UDP to HTTP(S) with a new relay service, significantly reducing telemetry gaps.
Improved method for deploying client packages to switches.
Document how to change the etcd password and create a script to recover if the etcd password is lost. Contact Penguin Computing for assistance with the script.
Improve the slurm and kubernetes installation scripts.
Include the API Reference as a part of our standard documentation.
Add a missing dependency required to build newer Ubuntu images.
Update the supported distros table to include el8.10 and el9.5.
Update documentation information architecture and HTML site design to improve user experience.
Assorted other bug fixes and documentation updates.
12.3.0-g0000 - October 4, 2024
Reduce polling in scyld-nodectl status --refresh, but leverage the waitfor framework and MQTT.
Switch to a Unix socket to communicate between the ClusterWare backend and etcd to enable updating gRPC.
Add a new _bootnet attribute for customizing the name of the bootnet interface.
Support --selector to select nodes in slurm-scyld.setup.
Introduce an improved clusterware-node deployment mechanism for SONiC switches.
Make compute node code scripting less likely to produce a bad parent-head-node line in /etc/hosts.
Support creating tmpfs subdirectories in ignition for diskless STIG'd systems.
Cleaner handling of the client.sslverify setting.
Reduce the head node minimum memory check after removal of Couchbase.
Restrict access to the GUI to only accept secure remote connections.
Bump the version numbers for most Python dependencies.
Correct "frozen" image handling during import and refuse to delete frozen images.
Remove deprecated code, including code specific to el7 head nodes.
Add functionality for Telegraf to collect ClusterWare node attributes.
Change technique for converting node lists into ranges when reporting status.
Tighten some directory permissions.
Correct the _ipxe_sanboot creation during bootload installation.
Fix a scyld-bootctl export failure that previously required a patch.
Provide a mechanism for setting a realtime IO priority on etcd.
Make it more difficult to modify a cached version of an image unintentionally.
Improve gitrepo backend handling to avoid common failures.
Stop creating .old.XX files when modifying objects in multi-head clusters.
Avoid the MOTD interfering with scyld-nodectl scp.
Small fixes to boot chaining failure handling.
Wider use of the cluster certificate authority to secure communications.
Fixes for netplan configurations in Ubuntu images.
Restart Telegraf when moving between head nodes.
KeyCloak integration improvements.
Assorted other bug fixes and documentation updates.
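The 12.3.0 entry about converting node lists into ranges does not describe the technique used. As a rough illustration of the general idea only, the sketch below collapses names that end in a number into bracketed ranges. The function name collapse_nodes and the n[0-2,5] output style are assumptions made for this example and are not taken from ClusterWare itself.

```python
import re
from itertools import groupby


def collapse_nodes(names):
    """Collapse names such as n0, n1, n2, n5 into 'n[0-2,5]'.

    Hypothetical helper for illustration; not ClusterWare code.
    Zero padding (n01) is not preserved by this simple version.
    """
    by_prefix = {}
    others = []
    for name in names:
        match = re.fullmatch(r"(.*?)(\d+)", name)
        if match:
            by_prefix.setdefault(match.group(1), set()).add(int(match.group(2)))
        else:
            # Names without a trailing number are passed through unchanged.
            others.append(name)

    parts = []
    for prefix in sorted(by_prefix):
        numbers = sorted(by_prefix[prefix])
        spans = []
        # Consecutive integers share the same (value - position) key.
        for _, group in groupby(enumerate(numbers), key=lambda p: p[1] - p[0]):
            run = [value for _, value in group]
            spans.append(str(run[0]) if len(run) == 1 else f"{run[0]}-{run[-1]}")
        parts.append(f"{prefix}[{','.join(spans)}]")
    return ",".join(parts + sorted(others))


print(collapse_nodes(["n0", "n1", "n2", "n5"]))
```

Grouping by prefix first keeps differently named nodes apart, so a mixed list yields one bracketed range per prefix followed by any names that could not be collapsed.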
12.2.0-g0000 - July 26, 2024
Improve Grafana column scaling.
Quiet a warning about TripleDES by removing it as an option from paramiko.
Support _boot_style=iscsi on el8 and el9 systems.
Update CentOS 7 and CentOS Stream 8 URLs to use vault.centos.org since el7 is now also EOL.
Improve DNS resolution of head nodes with multiple IPs using localise-queries in the dnsmasq.conf.template, and include a leases.register_heads boolean to disable the entire feature.
Write NetworkManager connection files on el9 systems and improve netplan configuration file writing on Ubuntu.
Initial Redfish support including an aggregation daemon with more changes and documentation coming later.
Provide a mechanism to create a bootable ISO from one or more boot configs.
Improve handling of slurm uid and gid syncing when installing packages.
Add arguments to scyld-nodectl kexec to allow for one-time-booting using a specific image or boot configuration.
Improve the scyld-modimg --capture error handling.
Downgrade ansible-core to 2.15.10 to match Python 3.9.
Small improvements and cleanups across the GUI.
Introduce a new RBAC system for administrators, currently scoped cluster-wide. All existing admins will now have the FullAdmin role.
Support substitution within the power_uri field.
Initial support for deploying Harvester nodes from an ISO.
Unhide the existing scyld-clusterctl nets functionality.
Include the mosquitto MQTT server to publish system events.
Confirm keys added through scyld-adminctl can be loaded with paramiko.
Improved Ubuntu image handling in scyld-modimg.
Expose the limited but existing scyld-nodectl scp functionality.
Improve ZTP handling but still only supporting Cumulus.
Improve the unknown nodes tab for unrecognized DHCP clients.
Include a mechanism to mask attribute values in normal output. Default to masking _remote_pass, _tpm_owner_pass, and _bmc_pass.
Make more of an effort to mask the SOL password in output.
Prevent the creation of unrecognized reserved attributes and update reserved attributes documentation.
Include a sched_watcher agent for collecting node status from slurm.
Rework compute node client certificate handling.
Clean up dhcp6 error messages.
Fix kernel version sorting in scyld-mkramfs.
Update numerous python and npm dependencies.
Assorted other bug fixes and documentation updates.
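The attribute masking described in the 12.2.0 entries can be pictured with a small sketch. Only the three default attribute names come from the changelog. The function name mask_attributes, the placeholder string, and the sample attribute values are assumptions for illustration and do not reflect how ClusterWare implements or displays masking.

```python
# Attribute names the changelog lists as masked by default.
MASKED_BY_DEFAULT = frozenset({"_remote_pass", "_tpm_owner_pass", "_bmc_pass"})


def mask_attributes(attrs, masked=MASKED_BY_DEFAULT, placeholder="********"):
    """Return a copy of attrs with sensitive values replaced for display.

    Hypothetical helper; the placeholder text is an assumption.
    """
    return {
        key: (placeholder if key in masked else value)
        for key, value in attrs.items()
    }


# Made-up node attributes for demonstration.
node_attrs = {"_boot_config": "DefaultBoot", "_bmc_pass": "hunter2"}
print(mask_attributes(node_attrs))
```

Returning a new dictionary leaves the stored values untouched, so masking affects only what is printed.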
12.1.1-g0000 - January 23, 2024
Assorted fixes for initramfs ignition use when booting el9 nodes.
Rework how scyld-nodectl ssh gets node keys allowing for ssh into el9 nodes with FIPS enabled.
Print names in place of some UIDs returned by scyld-*ctl tools.
Note and handle that ram_total / ram_free are stored in KiB.
Check all uses of urlparse().netloc and replace several with urlparse().hostname.
Assorted test script and other bug fixes.
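The urlparse() entry in 12.1.1 concerns a common pitfall: netloc keeps any credentials and port that appear in a URL, while hostname returns only the host. The short example below uses only the Python standard library. The URLs are invented for illustration and are not ClusterWare endpoints.

```python
from urllib.parse import urlparse

# Hypothetical URL containing credentials and a port.
url = urlparse("https://admin:secret@Head1.example.com:8443/api/v1")

print(url.netloc)    # credentials, host, and port together
print(url.hostname)  # host only, lowercased
print(url.port)      # port as an integer

# IPv6 literals keep their brackets in netloc but not in hostname.
ipv6 = urlparse("http://[fe80::1]:8080/")
print(ipv6.netloc)
print(ipv6.hostname)
```

Code that needs a name to resolve or connect to generally wants hostname, since passing netloc along can leak credentials or fail on the port suffix.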
12.1.0-g0000 - December 28, 2023
Head node hosted gitrepos can mirror upstream repositories.
Several bug fixes around the scyld-nodectl waitfor functionality.
Hide the exports section in scyld-imgctl output unless -L is used.
Fix a long standing bug during file upload where "Finishing up..." would still be displayed after upload was complete.
Fix a long standing bug during file upload that caused an additional file checksum computation.
Deprecate the nodes.boot_timeout global in favor of a per-node _boot_timeout attribute.
Fix head node eject / leave functionality to make it less likely a removed head node will automatically rejoin or try to provide services to compute nodes.
Fix PREFER_KMOD handling in /opt/scyld/clusterware-tools/conf/mkramfs.conf.
Technology preview of a scheduler-watcher that can be used to feed scheduler status into the ClusterWare database. Attribute names and other details may change.
Enable the slider to show and hide scheduler status within the GUI if any node has status information.
Avoid address-in-use socket errors with multiple backend daemon threads.
Fix typos that broke sync-uids and take-snapshot in ClusterWare 12.
Make systems for node status, hardware, health, and monitoring use plugins for easier management.
Authenticate with a user's SSH agent if they have already uploaded their public keys into the system.
New support for partitioning during boot using ignition. See the documentation for the _ignition reserved attribute for details.
Support for installing the GRUB 2 bootloader during boot. See the documentation for the _bootloader reserved attribute for details.
Improved image capture capabilities with better error handling and using optional credentials and sudo.
Implement a local signing authority for node client certificates stored in node TPMs.
Support searching for a node by hostname even when it differs from the ClusterWare node name.
Allow matching of naming pools in node selection using the same syntax that already matched dynamic groups.
Add support for attaching an attribute group to a naming pool.
Add _domain to specify the domain without using _hostname.
Confirmed ClusterWare works on Rocky 9.3 and similar distros.
Add a mechanism (chroot.env_paths) to define specific environment variables during image creation.
Fix several bugs around node renaming that could have permitted multiple nodes with the same MAC or similar issues.
Assorted GUI improvements, bug fixes, and performance improvements.
12.0.1-g0000 - July 24, 2023
Reimplement and expose the scyld-nodectl scp functionality.
Push scyld-pack-node to systems when running scyld-modimg capture. This also allows us to remove the clusterware-common package.
Improve proxy handling during the installation process.
Improve the handling of the _hosts attribute.
Initial support for scripting scyld-modimg through --run.
Provide a mechanism for changing the default hash from sha1 to sha256 or sha512.
Deprecate scyld-install --clear in favor of --clear-all.
Fix output labelling in scyld-nodectl exec results.
Mark node status and the current head node in managed --heads output.
Expand image capture to use _remote_user / _remote_pass.
Improved Debian / Ubuntu image creation.
Use the latest squashfs tools for packing and unpacking images.
Assorted bug fixes and performance improvements.
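The 12.0.1 entry on changing the default hash from sha1 to sha256 or sha512 does not show how checksums are computed or configured. As a general illustration of a checksum routine with a selectable algorithm, the sketch below uses Python's hashlib. The function name file_checksum and the "algorithm:digest" output format are assumptions for this example, not ClusterWare's actual mechanism or setting.

```python
import hashlib


def file_checksum(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash a file in chunks so large images need not fit in memory.

    Hypothetical helper; the "algorithm:hexdigest" format is an assumption.
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return f"{algorithm}:{digest.hexdigest()}"
```

Recording the algorithm name alongside the digest lets checksums produced under an older default remain verifiable after the default changes.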
12.0.0-g0000 - April 21, 2023
The first release of ClusterWare version 12. Please see Updating ClusterWare 11 to ClusterWare 12 for more details.
Support RHEL / Rocky 9 as a head node and compute node platform.
Upgrade to use Python 3.9 on all head node platforms.
Entirely rewritten GUI with much more functionality.
Switch to Telegraf, InfluxDB version 2, and Grafana instead of TICK. See Grafana Telemetry Dashboard for details about Grafana.
Initial support for GRUB 2 as an alternative for iPXE.
Configure chrony at install time for time sync within the cluster.
Update managedb save to default to saving ONLY the database.
Fix selection language matching for attributes[_boot_config].
Include a newer (4.6) version of squashfs tools for more recent SELinux-related features.
Allow command line clients to authenticate by signing messages with their SSH keys.
Remove banner.txt support and use SSH LogLevel to control banner display when executing remote commands.
Avoid a crash when two attributes only differ in capitalization.
Fix "accept unknown nodes" behavior.
Fix behavior of scyld-nodectl exec --label.
Implement a new JWT-based authentication system with refresh tokens.
New in-memory caching and indexing mechanism to improve document store lookup times.
Provide a mechanism to record additional DNS mappings in the ClusterWare database.
Default to installing config-less Slurm.
Provide a tool to create a scyld-kube.iso for installation on clusters without internet access.
Support booting nodes using UEFI in HTTP mode.
Implement a restricted status-updater for "busy" nodes in C code, and provide attribute _status_cpuset to restrict cw-status-updater service subprocesses to a specific set of CPU cores.
Remove all references to Couchbase and some remaining NFS references.
Enable scyld-nss by default on head nodes for name resolution.
Use the dracut version native to the image instead of a custom ClusterWare version.
Multi-head clusters now automatically rebalance nodes between heads.
Many other bug fixes and optimizations.