Monitoring – Internal

Last Updated On 8 November 2023

Nagios (Grid Only)

We run a Nagios-based monitoring setup that has been modified and optimised for fast polling. It can poll around a thousand endpoints as often as every 30 seconds.
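To put those figures in perspective, the arithmetic can be sketched as follows. The endpoint count and interval come from the text above; the 2-second average check duration is purely an assumed value for illustration, not a measurement from our setup, and the function names are hypothetical.

```python
import math


def required_check_rate(endpoints: int, interval_s: float) -> float:
    """Checks per second needed to poll every endpoint once per interval."""
    return endpoints / interval_s


def required_concurrency(endpoints: int, interval_s: float, avg_check_s: float) -> int:
    """Minimum number of checks in flight at once (rate multiplied by duration)."""
    return math.ceil(required_check_rate(endpoints, interval_s) * avg_check_s)


rate = required_check_rate(1000, 30)
# avg_check_s=2.0 is an assumption chosen only to make the example concrete.
workers = required_concurrency(1000, 30, avg_check_s=2.0)
print(f"{rate:.1f} checks/s, at least {workers} concurrent checks")
# prints "33.3 checks/s, at least 67 concurrent checks"
```

The point of the sketch is that a short polling interval is mainly a concurrency problem: the slower an individual check, the more checks must run in parallel to keep up.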

Our Nagios instance checks our nodes, servers and infrastructure, along with a selected set of software stack tests. It is integrated with our build system, so devices can be added and removed easily as the grid changes.
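As a rough illustration of what generating Nagios configuration from an inventory can look like, here is a minimal sketch. It is not our actual build-system integration: the `Device` class, the example host names and addresses, and the use of a `generic-host` template are all assumptions made for the example. Only the `define host { ... }` object syntax is standard Nagios.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    """Hypothetical inventory record; field names are illustrative."""
    name: str
    address: str
    template: str = "generic-host"  # assumed host template name


def render_host(device: Device) -> str:
    """Render one Nagios host object definition."""
    return (
        "define host {\n"
        f"    use        {device.template}\n"
        f"    host_name  {device.name}\n"
        f"    address    {device.address}\n"
        "}\n"
    )


def render_inventory(devices) -> str:
    """Render all devices, sorted by name so the output is stable between runs."""
    return "\n".join(render_host(d) for d in sorted(devices, key=lambda d: d.name))


# Example addresses are taken from the documentation range 192.0.2.0/24.
inventory = [Device("node002", "192.0.2.12"), Device("node001", "192.0.2.11")]
print(render_inventory(inventory))
```

Regenerating the whole file from the inventory, rather than editing it by hand, means a device removed from the inventory disappears from monitoring on the next build.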

LibreNMS

We have a LibreNMS setup that includes a range of additional tooling to provide a wide range of performance-based checks.

Ganglia

We run Ganglia on both the IPPP and Grid systems. However, this monitoring method is due to be phased out in favour of more cloud-native, scalable solutions.

Smokeping

We use Smokeping to monitor average latencies to various hosts within the grid.

Kibana and Elasticsearch

We operate an internal Elasticsearch cluster with a Kibana front end. We primarily use these tools for long-term log storage and for tracking network monitoring trends.
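For readers unfamiliar with how trend data is typically pulled from Elasticsearch, the sketch below builds the request body for a date histogram aggregation, which counts events per time bucket. It assumes a 7.x or later cluster and documents carrying an `@timestamp` field; neither is confirmed here, and the function name is hypothetical. No client library is used, so the code only constructs and prints the JSON body.

```python
import json


def log_trend_query(start: str, end: str, interval: str = "1h") -> dict:
    """Build a search body that counts documents per time bucket.

    Assumes documents have an '@timestamp' date field (an assumption,
    not something confirmed for our indices).
    """
    return {
        "size": 0,  # return only aggregation results, no individual hits
        "query": {"range": {"@timestamp": {"gte": start, "lt": end}}},
        "aggs": {
            "events_over_time": {
                "date_histogram": {
                    "field": "@timestamp",
                    "fixed_interval": interval,
                }
            }
        },
    }


body = log_trend_query("now-30d", "now")
print(json.dumps(body, indent=2))
```

A body like this would be sent to the `_search` endpoint of whichever index holds the logs; Kibana visualisations issue similar aggregation queries behind the scenes.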

If you would like to know more, or to request access to some of the data or interfaces, please get in touch.

Created On 24 June 2021