Shield Debuts Nagios Monitoring Solution for IBM i
December 1, 2021 Alex Woodie
Shield Advanced Solutions last week rolled out a new system monitoring solution based on open-source Nagios technology that gives customers quick insight into the state of their IBM i server. Dubbed AAG, for At A Glance, the product currently monitors 65 IBM i parameters, and can be used with existing Nagios deployments running on Linux or deployed in a standalone manner with an included Linux runtime.
Nagios is a free open-source software project that provides monitoring and alerting for servers, networking devices, applications, and other IT gear. The software was originally developed in 2002 to monitor Linux systems, and over the years, its usage has grown considerably. Today it’s being used to keep an eye on just about anything in the data center via thousands of available plug-ins, including ones that monitor IBM i servers and the applications that run on them.
IBM rolled out a Java-based Nagios plug-in back in 2018 that collects and feeds IBM i server metrics into Nagios Core or Nagios XI, two open source Nagios packages that run on Linux. However, when Shield Advanced Solutions president Chris Hird started working with the IBM offerings (as well as a couple of other open source Nagios products developed by the community), he was not particularly impressed.
“We found them to be lacking in many areas, and support from IBM for theirs was not great,” Hird tells IT Jungle via email, adding that most of the technical support came from China and was not timely. “We did build a NEMS for Linux appliance (with little help from the developer) that included the IBM-provided plug-in. That ended after a number of problems and gotchas made it impossible to guarantee a level of support we would be happy with for our customers.”
Faced with dead-end third-party Nagios solutions, Hird took the time-honored approach and decided to build his own. The result of this effort is AAG, which Shield unveiled in late November and is now selling via subscription.
The AAG offering consists of several components, including one that runs on IBM i and three that run on Linux.
On the IBM i server, AAG consists of NG4i, which was developed by Shield and is in charge of collecting data from IBM i and feeding it to the Nagios plug-in (which runs on Linux). According to Hird’s November 17 blog post, this component includes a TCP/IP server.
The three Linux components consist of the AAG plug-in for Nagios, which receives data from NG4i; a copy of the open source Nagios Core server; and a pre-built Linux image. (Customers that already have a Nagios server running on Linux won’t need the last two Linux items in that list; they will just need the AAG Nagios plug-in developed by Shield.)
NG4i, which is provided as a licensed program product (LPP) that uses traditional installation techniques on IBM i, functions as a “responder” application for AAG, according to Shield. The customer installs NG4i on all of the IBM i servers that he wants to monitor.
During the configuration process, the AAG customer will define what aspects of IBM i operations he wants to monitor. Customers are able to specify the exact thresholds that will trigger the Nagios solution to generate an alert, although the software will suggest parameters too. Shield provides out-of-the-box support for 65 different metrics, including some generic IBM i metrics as well as a handful for its own high availability and enterprise messaging products, HA4i and EM4i, respectively.
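For readers unfamiliar with how Nagios turns a threshold into an alert, the general plug-in convention can be sketched as follows. This is a minimal illustration of the standard Nagios exit codes (0 for OK, 1 for WARNING, 2 for CRITICAL, 3 for UNKNOWN) and the performance-data output format, not Shield's implementation; the metric name `ASP_USED`, the function `evaluate`, and the threshold values are hypothetical and chosen only for the example.

```python
from typing import Tuple

# Standard Nagios plug-in return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(metric: str, value: float, warn: float, crit: float) -> Tuple[int, str]:
    """Compare a percentage metric to its thresholds.

    Returns the Nagios status code and the one-line plug-in output.
    The metric name and thresholds are illustrative assumptions.
    """
    # A critical threshold below the warning threshold is a configuration error.
    if crit < warn:
        return UNKNOWN, f"{metric} UNKNOWN - critical threshold below warning threshold"

    if value >= crit:
        status = CRITICAL
    elif value >= warn:
        status = WARNING
    else:
        status = OK

    # Text before the pipe is shown to the operator; text after it is
    # performance data in the form label=value[UOM];warn;crit.
    line = (
        f"{metric} {LABELS[status]} - {value:g}% | "
        f"{metric.lower()}={value:g}%;{warn:g};{crit:g}"
    )
    return status, line
```

A real plug-in would print the returned line and exit with the returned code, which is how the Nagios server decides whether to raise an alert. For example, `evaluate("ASP_USED", 87.5, 80, 90)` yields a WARNING status because the value sits between the two thresholds.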
AAG customers have several choices when it comes to how to receive the alerts. One option recommended by Shield is to use Pushover notifications, which will allow customers to view alerts through a Web browser interface or through iOS or Android applications. Another option is to use NagiosTV, which provides a Web browser interface and which Shield includes in its AAG Linux distribution.
“We are particularly fond of the NagiosTV interface which allows us to monitor everything through a single screen/page,” Hird says, adding that he uses it to monitor not just Shield’s IBM i LPARs, but other servers, switches, NAS, and IoT devices too. “It’s nice because it has verbal output that can be used to alert verbally any problems it sees. Just let it run in the corner of the office.”
While Nagios can monitor just about anything under the sun at this point, monitoring the high availability environments of Shield’s HA4i customers clearly was a priority. Given the industry’s general lack of appetite for testing role swaps, having another set of eyes on replication just makes sense.
“Ensuring the system is always available requires notification of any system events that could impede application processing or lead to a system loss,” Hird states in a press release. “This is not only pertinent for the production system, but also for the recovery system. We have seen several occasions where an unmonitored recovery system has been offline for a significant period of time due to a lack of early warning monitoring.”
AAG will monitor several aspects of an HA4i installation, including the transfer rate between source and target systems; the apply status of the target systems; object and spool file replication statuses; the number of HA4i responder jobs running; the number of spool files waiting for replication; the number of inactive journals configured for replication; and the server status for each critical server in the HA4i subsystem.
AAG also brings monitoring for Shield’s “general use” commands, including metrics like: the number of active jobs in a given subsystem or job queue; the number of messages in a “message wait” state or awaiting a reply; the number and size of receivers in a given library; the cache battery state; the number of days until an LPP license key expires; and additional metrics around CPU usage and QTEMP size.
Finally, AAG also monitors basic IBM i statistics, such as: available disk capacity; total disk capacity; percent of processor used; number of jobs running on a system; the percentage of permanent and temporary addresses used; size of system ASP; size of unprotected storage; the number of partitions on a system; the number of active jobs and active threads on a system; the number of interactive transactions per job; the number of internal machine lock waits per job; the amount of auxiliary I/O requests per job; and many more.
AAG requires IBM i 7.1 or higher. It also requires 5770SS1 option 34, the Digital Certificate Manager, which is needed to manage TLS security certificates. Pricing starts at $50 per month. For more information, see Shield’s website at www.shieldadvanced.com/.