Table of Contents

System Changes and News

This page is a log of major changes to our systems; it will be updated frequently.

Future/Current

Grid

On 3rd Feb 2023 we will be replacing GridUI2. – On Hold, no ETA

From mid-2023 onwards there will be a slightly reduced capacity for DPM-based Grid Storage.

In mid-2024, we intend to migrate all grid compute nodes and GridUI systems to EL9, in line with the WLCG; please be aware that you may have to recompile your binaries and update your Python code to Python 3. – This is now under way (April 2024).
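
If you want a quick way to confirm which environment you are on before recompiling, the small Python sketch below (our own illustration, not an official tool) reads the standard /etc/os-release file and reports the interpreter version; run it on a GridUI or compute node to see whether you are already on an EL9 / Python 3 host.

    # Minimal sketch: report OS release and Python version so you can tell
    # whether you are on an EL9 / Python 3 host before recompiling.
    import sys

    def read_os_release(path="/etc/os-release"):
        """Parse /etc/os-release into a dict (standard file on EL7/EL9 hosts)."""
        info = {}
        try:
            with open(path) as fh:
                for line in fh:
                    line = line.strip()
                    if "=" in line and not line.startswith("#"):
                        key, _, value = line.partition("=")
                        info[key] = value.strip('"')
        except OSError:
            pass  # file missing on non-EL systems
        return info

    if __name__ == "__main__":
        os_info = read_os_release()
        print("OS:", os_info.get("PRETTY_NAME", "unknown"))
        print("Python:", sys.version.split()[0])
        if sys.version_info[0] < 3:
            print("Warning: this interpreter is Python 2; update your code to Python 3.")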

There is currently (June 2023) a reduced core count on the grid due to a fault in the Data Centre. Although the fault has been resolved, the nodes will remain offline until we can allocate significant time to reinitialise that part of the cluster.

From Monday 29th April 2024:

  • GridUI2 will be Rocky9 (EL9)
  • CE1 Queue will be Rocky9 (EL9)

From Friday 3rd May 2024:

  • Fielding ‘Long’ Queue will be Rocky9 (EL9) nodes only.
  • CE3 will be Rocky9 (EL9) nodes only.

From Monday 6th May 2024:

  • GridUI1 will be Rocky9 (EL9)
  • CE2 Queue (Fielding default) will have only a severely limited number of cores left on CentOS 7 (EL7)

From Monday 13th May 2024:

  • CE2 Queue (Fielding default) will be Rocky9 (EL9) nodes only.

IPPP & Fielding & CfAI

We intend to migrate Seafile by the end of Jan 2024.

IPPP

There will be downtime in 2024 to allow for a replacement of the IPPP home storage system. This will bring speed improvements (in both storage and networking) and a slight capacity bump, as well as ensuring continued reliability.

From April to June 2024 we will be upgrading various WS systems to Rocky9. We will give prior notice and a roadmap closer to the time.

Fielding

Nothing yet.

CfAI

There will be downtime in 2024 to allow us to implement the new permanent CfAI home storage. This will bring speed improvements (in both storage and networking) and a slight capacity bump, as well as improved reliability.

Past

2024

April

  • Downtime for power infrastructure work within Physics
  • Downtime for some nodes to enable upgrades to Rocky Linux 9
  • 25th – GridUI3 was upgraded to Rocky Linux 9
  • 24th – CE4 Queue became Rocky Linux 9 nodes only.

March

  • Downtime for power infrastructure work within Physics
  • ip3-ws5 was upgraded to Rocky 9 as a test deployment.

February

  • Downtime of Grid test queue to enable upgrades to Rocky Linux 9
  • Downtime of vn0 to upgrade to Rocky Linux 9

January

  • The IPPP GPU 2,3 and 4 systems have had their memory upgraded to double the previous capacity.
  • Brief downtime of CE-TEST, CE1, CE2, CE3 and CE4 to allow for software upgrades.

2023

December

  • The new XGW storage system is currently undergoing testing after reports of missing data.

November

  • 25th – 27th – Complete downtime for the Grid and some IPPP systems.
  • SE01 (gsiftp, xrootd, webdavs) was retired; please use the new XGW system – see Grid Storage.

October

  • 4 new grid nodes were brought online, giving an additional 700+ cores.

August

  • Complete downtime for Grid and various IPPP resources.

June

  • GitLab was offline for 48 hours during a system migration.
  • IPPP monitor deployment was 98% completed.

May

April

  • 15 new grid nodes were brought online, replacing 2300 cores and providing an additional 500 cores.

March

  • WS 3,4,6,12 were replaced.
  • CfAI WS Systems were brought online.

Feb

  • CPU 1,2,3,4,5 were replaced with newer systems (approx. 500 cores up to 1200, with better power efficiency per core – down from 3.5W to 1.9W).
  • CPU 6+7 were retired.
  • WS 1,2,5 were replaced.
  • WS 7,13,14,15,16 were brought online.
  • Rollout started for the replacement of all IPPP monitors.
  • Grid Storage was reduced to enable migration to a new system ready for DPM retirement.

Jan

  • GridUI1 failures were investigated and resolved.
  • GridUI1 was replaced with an upgraded system.
  • Scratch SSD was transferred from GridUI2 to the new GridUI1.

2022

  • Upgraded the IPPP networking from 10Gb/s to a redundant 80Gb/s
  • Upgraded the Grid Networking from 10Gb/s to a redundant 40Gb/s