Monitoring – Internal
Nagios (Grid Only)
We run a Nagios-based monitoring setup that has been modified and optimised for fast polling: it can poll around a thousand endpoints as often as every 30 seconds.
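As a rough illustration of what that polling rate implies, the sketch below works out the check rate and the number of checks that would need to be in flight at once. The mean check duration is an assumed figure for illustration, not a measurement from our setup, and the function names are hypothetical.

```python
import math


def checks_per_second(endpoints: int, interval_s: float) -> float:
    """Average number of checks that must start each second."""
    return endpoints / interval_s


def required_concurrency(endpoints: int, interval_s: float, mean_check_s: float) -> int:
    """Checks in flight at once (rate x duration), rounded up.

    mean_check_s is an assumed average check duration, not a measured value.
    """
    return math.ceil(endpoints * mean_check_s / interval_s)


if __name__ == "__main__":
    # 1000 endpoints every 30 seconds, as described above.
    print(f"{checks_per_second(1000, 30):.1f} checks per second")
    # Assuming each check takes 1.5 s on average (hypothetical figure).
    print(required_concurrency(1000, 30, 1.5), "checks in flight")
```

The point of the sketch is only that the poller needs enough parallel workers to cover the rate multiplied by the typical check duration; slower checks raise that number proportionally.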
Our Nagios instance checks our nodes, servers and infrastructure, along with a selected set of software stack tests. It is integrated with our build system, so devices can be added and removed easily as the grid continually changes.
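One way such an integration could work is for the build system to render Nagios host definitions from its device inventory. The following is a minimal sketch under that assumption, not a description of our actual tooling: the inventory format, the function names and the "generic-host" template name are all hypothetical.

```python
def render_host(host_name: str, address: str, template: str = "generic-host") -> str:
    """Render one Nagios host object definition.

    The "generic-host" template name is an assumption; use whichever
    host template the local Nagios configuration defines.
    """
    return (
        "define host {\n"
        f"    use        {template}\n"
        f"    host_name  {host_name}\n"
        f"    address    {address}\n"
        "}\n"
    )


def render_inventory(inventory: dict[str, str]) -> str:
    """Render every host in a name -> address mapping, sorted by name."""
    return "\n".join(
        render_host(name, address) for name, address in sorted(inventory.items())
    )


if __name__ == "__main__":
    # Hypothetical inventory using documentation-only addresses.
    print(render_inventory({"node02": "192.0.2.12", "node01": "192.0.2.11"}))
```

Regenerating the file from the inventory on each build means a device that leaves the grid simply drops out of the next configuration, with no manual cleanup.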
We also run a LibreNMS setup, together with a range of other tooling, to provide a wide range of performance-based checks.
We run Ganglia on both the IPPP and Grid systems; however, this monitoring method is due to be phased out in favour of more cloud-native, scalable solutions.
Kibana and Elasticsearch
We operate an internal Elasticsearch cluster with a Kibana front end. We primarily use these tools for long-term log storage and for tracking network monitoring trends.
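To give a flavour of the kind of trend query this supports, the sketch below builds an Elasticsearch query body that counts log entries per day over a recent window. It is an illustrative assumption rather than one of our actual queries: the "@timestamp" field name is only the common convention, the function name is hypothetical, and "calendar_interval" assumes a reasonably recent Elasticsearch version.

```python
import json


def daily_log_counts_query(days: int, timestamp_field: str = "@timestamp") -> dict:
    """Build a query body that buckets documents per day over the last `days` days.

    The timestamp field name is an assumption; adjust it to the index mapping.
    """
    return {
        "size": 0,  # return only the aggregation, not the matching documents
        "query": {
            "range": {timestamp_field: {"gte": f"now-{days}d/d", "lte": "now"}}
        },
        "aggs": {
            "per_day": {
                "date_histogram": {
                    "field": timestamp_field,
                    "calendar_interval": "1d",
                }
            }
        },
    }


if __name__ == "__main__":
    # Print the body so it can be pasted into a search request.
    print(json.dumps(daily_log_counts_query(90), indent=2))
```

The same body can be sent to the search endpoint of whichever index holds the logs, or used as the basis for a Kibana visualisation.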
If you would like to know more, or to request access to some of the data or interfaces, please get in touch.