
[Resolved] The system is available after hardware installations and software upgrades

Monday 13 November 18:30 CET (19:30 EET)

As it is not yet clear whether a complete solution restoring full functionality of the cray-mpich 8.1.18, 8.1.23 and 8.1.25 modules is possible, or when it would be ready, a workaround has been rolled out on the system. The workaround always loads the 8.1.27 module, even if the 8.1.18, 8.1.23 or 8.1.25 modules are requested, and does so for all programming environments. At the moment it is not possible to develop a more selective workaround that only affects users of the Cray compilers.

Note that on a Cray system, at runtime you use the default versions of many libraries rather than the versions corresponding to the loaded modules, unless you explicitly request those versions by prepending LD_LIBRARY_PATH with the value of CRAY_LD_LIBRARY_PATH or by using the lumi-CrayPath module for that purpose. The default version is currently the 8.1.27 module.
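
As an illustration, the commands below show the two approaches mentioned above for making the runtime loader pick up the library versions belonging to the currently loaded modules. This is a minimal sketch and assumes a standard Cray environment in which CRAY_LD_LIBRARY_PATH is set by the loaded modules:

    # Option 1: prepend the Cray-provided library path manually, so the loader
    # prefers the library versions matching the loaded modules
    export LD_LIBRARY_PATH=$CRAY_LD_LIBRARY_PATH:$LD_LIBRARY_PATH

    # Option 2: let the lumi-CrayPath module take care of the same prepending
    module load lumi-CrayPath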

Depending on what other modules you have loaded, the cray-mpich 8.1.18, 8.1.23 or 8.1.25 modules may still appear in the output of module avail, but when you try to load them you will get 8.1.27 instead.
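
To verify which cray-mpich version is actually active after requesting one of the affected modules, you can inspect the loaded modules. The commands below are only an illustrative sketch (the grep filter is not required):

    module load cray-mpich/8.1.23          # requesting an affected version
    module list 2>&1 | grep cray-mpich     # reports cray-mpich/8.1.27 due to the workaround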

Please let us know of any further problems you may experience with these modules by creating a ticket here: https://lumi-supercomputer.eu/user-support/need-help/

This update is also published here: https://lumi-supercomputer.github.io/update-202311

 

Tuesday 7 November 8:30 CET (9:30 EET)

We are pleased to let you know that LUMI is back to regular service.

The system has been expanded with 512 additional LUMI-C nodes (65,536 cores) and improved interconnect capability, and has received Cray Programming Environment 23.09 together with upgrades of the Cray Operating System and the system firmware to improve stability and performance.

Please note that the system currently fails to load the cray-mpich 8.1.18, 8.1.23 and 8.1.25 modules when using the Cray compilers (cce modules). This affects all software compiled via EasyBuild with the cpeCray 22.08, 22.12 and 23.03 toolchains, and also affects other job scripts that load these versions in combination with the cce compilers. We will inform you as soon as a solution or workaround is found.

If you need any assistance or observe anomalies, please do not hesitate to contact the LUMI User Support Team: https://lumi-supercomputer.eu/user-support/need-help/

 

Monday 6 November 17:30 CET (18:30 EET)

Unfortunately, we will not be able to bring LUMI back into service today due to a hardware issue discovered late in the maintenance work. The whole system is currently closed while the system administrators carry out corrective actions.
We expect LUMI to be back in production by Tuesday 7 November. We will keep you informed of further developments.

We apologize for the inconvenience.

 

Thursday 12 October 12:30 CEST (13:30 EEST)

LUMI will undergo the final integration of the third-phase hardware as well as a software stack upgrade. In addition, node repairs and extensive performance benchmarking will be carried out. This work requires a major maintenance break.

The upgrade will bring 512 new LUMI-C nodes (65,536 cores), improved interconnect capability, and Cray Programming Environment 23.09, together with upgrades of the Cray Operating System and the system firmware to improve system stability and performance.

The process will start with LUMI-G being removed from service on Friday, the 20th of October, at 21:00 EEST (20:00 CEST). System login and LUMI-C will remain available until Sunday, the 22nd of October, at 11:00 EEST (10:00 CEST), after which the downtime of the whole system, including storage and login nodes, will start. LUMI will therefore remain unavailable for the rest of the integration process, which is planned to be completed by Monday, the 6th of November.

We will inform you of any developments during the maintenance break, for example, if the system comes back online earlier.

We will place maintenance reservations on the different partitions to ensure that no jobs are running when the break starts, as described above. Shorter jobs may still run before the break if they can finish before the start of the downtime.

We apologize for this inconvenience.

Please don’t hesitate to contact the LUMI User Support Team if you need any assistance. Thank you.