What is server monitoring, and what are you looking for?

Mar 29, 2017 | Archived Articles

IT systems are critical to most of your business processes – if you have a company server, keeping it running smoothly is vital. Server outages are hugely disruptive and expensive, particularly if you don’t know why the machine crashed in the first place.

Server monitoring can be thought of as an early warning system, intended to identify potentially serious problems so that they can be resolved before the server crashes. But what are you looking for?

1. Hardware failures

Because servers typically run 24/7, 365 days a year, the hardware components are subject to significant wear and tear over time. Moving parts, like the hard drives that actually store your data, are prone to failure, particularly if they have been running for years.

Most hardware can now report problems before completely failing, and server monitoring will record these notifications so you can act. Hard drives will report bad sectors (where sections of the physical disk have become damaged), or read and write errors (suggesting further damage or an intermittent fault), so that you can replace the disk before it fails and potentially loses data in the process.
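As a rough illustration of how a monitoring tool might turn those disk reports into alerts, here is a minimal Python sketch. The `DiskHealth` record, the `disk_warnings` function and both threshold values are hypothetical, chosen only for this example; a real monitoring product reads these counters from the drive itself and applies its own limits.

```python
from dataclasses import dataclass

# Hypothetical thresholds for illustration only; real tools use their own limits.
MAX_BAD_SECTORS = 0   # any bad sector is worth a warning
MAX_IO_ERRORS = 10    # combined read and write errors tolerated before warning


@dataclass
class DiskHealth:
    """Counters a disk has reported (hypothetical record layout)."""
    name: str
    bad_sectors: int
    read_errors: int
    write_errors: int


def disk_warnings(disk: DiskHealth) -> list[str]:
    """Return a human-readable warning for each counter above its threshold."""
    warnings = []
    if disk.bad_sectors > MAX_BAD_SECTORS:
        warnings.append(f"{disk.name}: {disk.bad_sectors} bad sectors reported")
    io_errors = disk.read_errors + disk.write_errors
    if io_errors > MAX_IO_ERRORS:
        warnings.append(f"{disk.name}: {io_errors} read/write errors reported")
    return warnings


if __name__ == "__main__":
    for disk in (DiskHealth("disk0", 0, 0, 0), DiskHealth("disk1", 8, 3, 2)):
        for warning in disk_warnings(disk):
            print(warning)
```

In this sketch a healthy disk produces no warnings at all, while a disk that has started reporting bad sectors is flagged straight away, which is the point at which you would plan a replacement.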

2. Operating system failures

The operating system – like Microsoft Windows Server – is the software that actually makes the server hardware ‘work’. Just like your desktop PCs, it is sometimes prone to corruption or damage that causes the system to stop functioning normally.

Again, server monitoring can detect these errors early, giving you an opportunity to act before the corruption can take the system offline. If your machine does crash regularly, server monitoring should be able to tell you why. Your IT support partner can then arrange for the necessary OS reinstallation or recovery from backup to correct matters.

3. Software errors

Often server crashes or hangs are caused by the interactions between the applications you use and the operating system on the server. A bad update or security fix can cause low-level problems that you cannot ‘see’, but which are recorded in the server event logs.

Monitoring will uncover these problems and give you an opportunity to resolve them before they cause a larger problem.
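To give a feel for what "uncovering" problems in the event logs can look like, the following Python sketch counts serious log entries per application. The entry layout (`level`, `source`, `message`), the sample entries and the `summarize_errors` function are assumptions made for this example; they are not the format of any particular server's event log.

```python
from collections import Counter

# Severity levels treated as serious in this sketch (an assumption, not a standard).
SERIOUS_LEVELS = {"ERROR", "CRITICAL"}


def summarize_errors(entries):
    """Count serious log entries per source, most frequent first."""
    counts = Counter()
    for entry in entries:
        if entry["level"].upper() in SERIOUS_LEVELS:
            counts[entry["source"]] += 1
    return counts.most_common()


# Hypothetical log entries, invented purely for illustration.
sample_log = [
    {"level": "Information", "source": "UpdateService", "message": "Update installed"},
    {"level": "Error", "source": "AccountsApp", "message": "Module failed to load"},
    {"level": "Error", "source": "AccountsApp", "message": "Module failed to load"},
    {"level": "Critical", "source": "Kernel", "message": "Unexpected shutdown"},
]

if __name__ == "__main__":
    for source, count in summarize_errors(sample_log):
        print(f"{source}: {count} serious event(s)")
```

Grouping by source is a deliberate choice here: a single application that keeps logging errors after an update is usually a clearer lead than a raw list of every event.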

