Understanding IBM i Options For High Availability
June 5, 2017 Alex Woodie
There are many options when it comes to high availability (HA) for IBM i. Should you use logical replication software like MIMIX, or a hardware-based solution like PowerHA? Should you deploy to the cloud, or stay on premises? Is remote journaling the way to go or should you roll your own? IT decision-makers must do their homework if they’re going to find the right solution for their shop.
If you’re in the market for a HA solution – and recent surveys indicate that many IBM i shops are shopping for HA in 2017 – then you should probably start at the very beginning. And that means defining high availability.
High availability refers to the capability to keep applications running in the event of an outage, either planned or unplanned. A HA solution helps avoid downtime by building redundancy, failure detection, and failover capabilities into the low-level system architecture.
Most HA solutions in the IBM i market achieve this goal by using real-time replication methods that copy each individual transaction that hits the production database to a secondary database. In the event of an outage, the HA solution provides a “role swap” facility to redirect users to the new secondary copy.
HA is related to disaster recovery (DR), in that a properly implemented and executed HA setup could save you from needing to implement your DR strategy in the event of an unplanned outage. But there are important differences between the two, the biggest one being that having an HA setup is optional, whereas having a DR strategy is an IT requirement. A HA system will also not protect you from corrupted data from human error, viruses, or ransomware, as the damage will quickly be propagated to the secondary machine.
In HA and DR, the effectiveness of a solution is measured by its recovery time objective (RTO, i.e. how long it takes to recover) and its recovery point objective (RPO, i.e. how much data you are willing to lose). If your business simply cannot stand to lose access to core IBM i applications for a given length of time, whether it’s minutes or hours, then it has a low RTO, and HA could be for you.
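To make the two measures concrete, here is a minimal Python sketch that checks one outage against a pair of objectives. The timeline, the one-minute RPO, the 30-minute RTO, and the function names are all hypothetical values chosen for illustration; they do not come from any vendor's product.

```python
from datetime import datetime, timedelta


def measure_outage(last_replicated, failure, service_restored):
    """Return (data_loss_window, downtime) for a single outage."""
    # Changes made after the last replicated transaction are lost: compare to RPO.
    data_loss_window = failure - last_replicated
    # Time users spent without the application: compare to RTO.
    downtime = service_restored - failure
    return data_loss_window, downtime


def meets_objectives(data_loss_window, downtime, rpo, rto):
    """True only if both the recovery point and recovery time objectives were met."""
    return data_loss_window <= rpo and downtime <= rto


# Hypothetical outage timeline (assumed values, for illustration only).
last_replicated = datetime(2017, 6, 5, 9, 59, 30)
failure = datetime(2017, 6, 5, 10, 0, 0)
service_restored = datetime(2017, 6, 5, 10, 20, 0)

loss, downtime = measure_outage(last_replicated, failure, service_restored)
ok = meets_objectives(loss, downtime,
                      rpo=timedelta(minutes=1),
                      rto=timedelta(minutes=30))
print(loss, downtime, ok)
```

In this made-up timeline, 30 seconds of changes are at risk and users are offline for 20 minutes, so both objectives are met. Tightening either objective below those figures would make the same outage a miss.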
Types of HA Solutions
Broadly speaking, there are two main classes of HA software on IBM i: logical replication software and hardware-based solutions.
The vast majority of HA solutions use logical replication software to duplicate the transactions from one IBM i environment to another IBM i environment, which is almost always located on a separate physical IBM i server.
Logical replication-based HA setups traditionally involve at least two IBM i servers: one to serve as the primary box and another to serve as the backup. Bigger shops may have three or more IBM i servers in their HA setup, configured in various manners, while smaller shops may rent an LPAR on a cloud provider’s IBM i system to serve as the backup.
Hardware-based HA is a relative newcomer to the IBM i scene. The method used in hardware-based HA is different from logical replication, but is widely accepted as the mainstream method in the Windows and Linux worlds.
Inside Logical Replication
As we mentioned before, logical replication is the older and more widespread form of HA in the IBM i world. The technology and techniques used for logical replication have been honed over decades of real-world use across tens of thousands of installations, which has resulted in a rich ecosystem of logical replication software and service providers ready to address the HA needs of IBM i shops around the world.
Nearly all of the logical replication solutions today use IBM’s remote journaling technology as the core data replication method under the covers, but there is one exception, which we’ll discuss later.
Logical replication is used to replicate changes made to data and objects stored in the IBM i server. When a database field is created, updated, changed, or deleted on the primary system, the change is written to the primary server’s local journal receiver. (This is one of the principal uses of journaling on the IBM i server; the other is for auditing.)
When a change happens in the local journal, it’s automatically replicated over the network link to the remote journal of the secondary server. Because IBM’s remote journaling technology runs underneath the operating system layer, it simplifies the development and maintenance of the logical replication solution that sits on top of it. It “just works,” and is considered to be bullet-proof.
Once the changes are present on the remote journal, it’s up to the third-party HA software to apply them to the secondary server. Most of the processing overhead in logical replication solutions is incurred on the secondary system, which many tout as a benefit.
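The flow described above (a change lands in the local journal, is shipped to the remote journal, then is applied on the secondary) can be modeled in a few lines of Python. This is a toy sketch, not how IBM's remote journaling or any vendor's apply process is implemented: the `System` and `JournalEntry` classes, the `ship` and `apply_remote` functions, and the PUT/DELETE operations are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class JournalEntry:
    seq: int                      # position of the change in the journal
    op: str                       # "PUT" or "DELETE" (simplified operations)
    key: str
    value: Optional[str] = None


@dataclass
class System:
    name: str
    database: dict = field(default_factory=dict)
    journal: list = field(default_factory=list)


def apply_entry(database, entry):
    """Apply a single journaled change to a database."""
    if entry.op == "PUT":
        database[entry.key] = entry.value
    elif entry.op == "DELETE":
        database.pop(entry.key, None)
    else:
        raise ValueError(f"unknown operation: {entry.op}")


def write(primary, op, key, value=None):
    """Make a change on the primary and record it in the local journal."""
    entry = JournalEntry(len(primary.journal) + 1, op, key, value)
    apply_entry(primary.database, entry)
    primary.journal.append(entry)
    return entry


def ship(primary, secondary):
    """Stand-in for remote journaling: copy entries the secondary lacks."""
    new_entries = primary.journal[len(secondary.journal):]
    secondary.journal.extend(new_entries)
    return len(new_entries)


def apply_remote(secondary, applied_through=0):
    """Stand-in for the HA software's apply step on the secondary."""
    for entry in secondary.journal:
        if entry.seq > applied_through:
            apply_entry(secondary.database, entry)
            applied_through = entry.seq
    return applied_through


primary, backup = System("PROD"), System("BACKUP")
write(primary, "PUT", "cust:1", "Acme")
write(primary, "PUT", "cust:2", "Globex")
write(primary, "DELETE", "cust:1")

shipped = ship(primary, backup)
applied = apply_remote(backup)
print(shipped, applied, backup.database == primary.database)
```

Note that in this model the shipping step does almost no work and the apply step does all of the database updates, which mirrors the point that most of the processing overhead lands on the secondary system.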
Apart from the remote journaling support, logical replication solutions work on top of the operating system, and are designed to ensure that applications are available, as opposed to protecting the entire system (as hardware-based HA does). The initial implementation of logical replication is relatively straightforward, which is a bonus compared to hardware-based HA.
Downsides of this approach primarily center around the need for users to continually check to make sure that everything they need to recover is being journaled, and that the journal receivers are being applied in the right order.
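The two routine checks mentioned above could be automated along these lines. This Python sketch assumes you can already obtain a list of the objects you need for recovery, a list of the objects that are actually journaled, and the sequence numbers in the order they were applied; how those lists are gathered on a real system is outside the sketch, and the function names and sample object names are hypothetical.

```python
def unjournaled(required_objects, journaled_objects):
    """Return required objects that are missing from the journaled set."""
    return sorted(set(required_objects) - set(journaled_objects))


def find_sequence_problems(sequence_numbers):
    """Scan applied sequence numbers and report gaps and ordering errors.

    Each problem is a tuple: (kind, previous_number, current_number).
    """
    problems = []
    previous = None
    for seq in sequence_numbers:
        if previous is not None:
            if seq <= previous:
                # An entry was applied after a later one (or applied twice).
                problems.append(("out_of_order", previous, seq))
            elif seq != previous + 1:
                # One or more entries were skipped.
                problems.append(("gap", previous, seq))
        previous = seq
    return problems


# Hypothetical inputs, for illustration only.
missing = unjournaled(["ORDERS", "CUSTOMERS", "INVOICES"],
                      ["ORDERS", "INVOICES"])
problems = find_sequence_problems([1, 2, 4, 3, 5])
print(missing)
print(problems)
```

A check like this only flags a problem; deciding whether to start journaling an object or to resynchronize the secondary is still up to the administrator or the HA product.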
Logical Replication Market
There are six primary providers selling seven logical replication-based HA solutions to the global IBM i community. All of them are third-party software vendors. The list includes:
- HelpSystems: recently bought Bug Busters’ RSF/HA and renamed it Robot HA
- iSam Blue: sells a modified version of Bug Busters’ RSF/HA product
- Maxava: develops and sells Maxava HA
- Rocket Software: develops and sells Rocket iCluster (former DataMirror product)
- Shield Advanced Solutions: develops and sells HA4i
- Trader’s: develops and sells Quick-EDD
- Vision Solutions: develops and sells iTera HA and MIMIX (includes Vision’s legacy OMS/ODS product)
Among these products, only Trader’s Quick-EDD product does not use IBM’s remote journaling, opting instead for its own proprietary journal scrape method. The other primary areas where these solutions differ include: the coverage of IBM i objects; the monitoring capabilities offered; the role-swap test capabilities offered; the type of user interface (some offer only 5250 green screens); and the management capabilities offered.
The vendors also differ in terms of their installed bases, the number of technical support professionals backing the product, and the overall size of the company. Vision Solutions is by far the biggest vendor here, thanks to its acquisitions of iTera and Lakeview Technologies, but HelpSystems’ entry into the field with last year’s acquisition of Bug Busters, as well as steady growth from the likes of Shield and Maxava, promise to shake things up. We’ll drill further into these offerings in a future article in The Four Hundred.
Inside Hardware HA
As we mentioned before, hardware-based HA, or disk-clustering HA, is a different beast entirely compared to logical replication. It’s also quite new to the IBM i installed base, and at this point has been adopted primarily by larger IBM i shops that have greater familiarity with SANs and the advanced virtualization technologies that come with them, although that is starting to change.
There are two options when it comes to hardware-based HA for IBM i: IBM’s PowerHA SystemMirror suite, and Dell EMC’s Symmetrix Remote Data Facility (SRDF). These hardware-based HA solutions require the customer to store their data in a SAN like IBM’s DS8000 or Storwize V7000, or Dell EMC’s Symmetrix or newer VMAX.
IBM’s PowerHA, which is by far the most popular solution here, provides high availability to IBM i shops in a fundamentally different way than logical replication. Here’s how IBM describes it:
“A PowerHA cluster is created by taking the database out of SYSBAS and placing it into an Independent Storage Pool (IASP) and adding SYSBAS objects into the administrative domain. The data in the IASP is shared between the systems in the cluster. When configured into an IBM storage server, the IASP can be switched (LUN level switching) between partitions (nodes) in the cluster, and it can also be replicated to systems dispersed between remote locations. Replication via storage server is accomplished with Metro Mirror or Global Mirror; if one is using an internal disk the technology is called geographic mirroring.”
PowerHA works by replicating the entire page of memory for the DB2 for i database, as opposed to replicating individual transactions. While the PowerHA implementation is more difficult than a logical replication implementation, it can result in a solution that requires less maintenance going forward.
We’ll dig more into hardware-based HA, and into all the different options that IBM offers in its PowerHA suite, in a future issue of The Four Hundred. Stay tuned!
Editor’s note: This story was corrected. PowerHA is not dependent on SAN technology, and does implement data replication between hosts. IT Jungle regrets the errors.
I’d like to see more coverage on solutions involving a scalable, on-demand LPAR as the backup target. The IBM i HA space seems to cater more to big companies whose operations warrant or can justify a second off-site, physical machine or a dedicated 24×7 LPAR. If there was a de-facto standard of doing near-realtime mirroring to a cold-spare LPAR, ISVs would have less of an excuse to try and demand a second license instance for installing their software on the backup system.