System Changes and News

This page is a log of major changes to our systems and will be updated frequently.

Future/Current

Grid

On 3rd Feb 2023 we will be replacing GridUI2. – On Hold, no ETA

From mid-2023 onwards there will be a slightly reduced capacity for DPM-based Grid Storage.

In mid-2024, we intend to migrate all grid compute nodes and GridUI systems to EL9, in line with the WLCG; please be aware that you may have to recompile your binaries and update your Python code to Python 3. – This is now under way (April 2024).
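
If you want to check whether a node you are logged into has been migrated yet, a short Python script along these lines can confirm the OS release and interpreter version (a minimal illustrative sketch, not an official tool; /etc/os-release is the standard location on EL systems):

    # el9_check.py - illustrative sanity check, not an official tool
    import sys
    import platform

    # Python 2 is not provided on EL9; fail early if this interpreter predates 3.
    assert sys.version_info.major >= 3, "Python 2 detected - update your environment"

    # /etc/os-release records the distribution name and version on EL systems.
    with open("/etc/os-release") as f:
        for line in f:
            if line.startswith(("NAME=", "VERSION_ID=")):
                print(line.strip())

    print("Python", platform.python_version())

Note that binaries built on EL7 generally link against an older glibc and will need recompiling on an EL9 node.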

There is currently (June 2023) a reduced core count on the grid due to a fault in the Data Centre. Although the fault has been resolved, the nodes will remain offline until we can allocate significant time to reinitialise that part of the cluster.

From Monday 6th May 2024:

  • GridUI1 will be Rocky Linux 9 (EL9) – Delayed

IPPP & Fielding & CfAI

We intend to migrate Seafile by the end of January 2024.

Power work on the 17th May and 31st May will put all services at risk, and all WS systems will be taken offline.

IPPP Only

There will be downtime in 2024 to allow for a replacement of the IPPP home storage system. This will bring speed improvements (in both storage and networking) and a slight capacity bump, as well as ensuring continued reliability.

From April to June 2024 we will be upgrading various WS systems to Rocky Linux 9 (EL9). We will give prior notice and a roadmap closer to the time.

On 14th May, WS6 and WS7 will be taken offline for upgrade to Rocky Linux 9.

Fielding Only

Nothing yet.

CfAI Only

There will be downtime in 2024 to allow us to implement the new permanent CfAI home storage. This will bring speed improvements (in both storage and networking) and a slight capacity bump, as well as improved reliability.

Past

2024

May

  • 2nd – Fielding ‘Long’ Queue is Rocky Linux 9 (EL9) nodes only.
  • 2nd – CE3 is Rocky Linux 9 (EL9) nodes only.
  • 8th – CE2 (Fielding Default) is Rocky Linux 9 (EL9) nodes only.

April

  • Downtime for power infrastructure work within Physics
  • Downtime for some nodes to enable upgrades to Rocky Linux 9 (EL9)
  • 24th – CE4 Queue is Rocky Linux 9 (EL9) nodes only.
  • 25th – GridUI3 was upgraded to Rocky Linux 9 (EL9).
  • 29th – GridUI2 is now Rocky Linux 9 (EL9).
  • 29th – CE1 Queue is Rocky Linux 9 (EL9) nodes only.

March

  • Downtime for power infrastructure work within Physics
  • ip3-ws5 was upgraded to Rocky 9 as a test deployment.

February

  • Downtime of Grid test queue to enable upgrades to Rocky Linux 9
  • Downtime of vn0 to upgrade to Rocky Linux 9

January

  • The IPPP GPU 2,3 and 4 systems have had their memory upgraded to double the previous capacity.
  • Brief downtime of CE-TEST, CE1, CE2, CE3 and CE4 to allow for software upgrades.

2023

December

  • The new XGW storage system is currently undergoing testing after reports of missing data.

November

  • 25th–27th – Complete downtime for the Grid and some IPPP systems.
  • SE01 (gsiftp, xrootd, webdavs) was retired; please use the new XGW system – see Grid Storage.

October

  • 4 new grid nodes were brought online, giving an additional 700+ cores.

August

  • Complete downtime for Grid and various IPPP resources.

June

  • GitLab was offline for 48 hours during a system migration.
  • IPPP monitor deployment was 98% complete.

April

  • 15 new grid nodes were brought online, replacing 2,300 existing cores and adding a further 500.

March

  • WS 3,4,6,12 were replaced.
  • CfAI WS Systems were brought online.

February

  • CPU 1,2,3,4,5 were replaced with newer systems (approx. 500 cores up to 1,200, with better power efficiency per core – 3.5W down to 1.9W).
  • CPU 6+7 were retired.
  • WS 1,2,5 were replaced.
  • WS 7,13,14,15,16 were brought online.
  • Rollout started for the replacement of all IPPP monitors.
  • Grid Storage was reduced to enable migration to a new system ready for DPM retirement.

January

  • GridUI1 failures were investigated and resolved.
  • GridUI1 was replaced with an upgraded system.
  • Scratch SSD was transferred from GridUI2 to GridUI1.

2022

  • Upgraded the IPPP networking from 10Gb/s to a redundant 80Gb/s
  • Upgraded the Grid Networking from 10Gb/s to a redundant 40Gb/s