DR Testing As A Service: One More Thing That You Don’t Have To Do
September 25, 2023 David Fahrenkrug
Not having a disaster recovery plan for your IBM i system is scary to think about, and clearly a lot of IBM i shops are not thinking about it beyond restoring from tape or virtual tape library, because if they were, they would put together a disaster recovery plan. Somewhere between the fifth and tenth most scary things in the datacenter today, maybe, is actually making sure that the disaster recovery software works and that the recovery plan holds up when something goes haywire.
High availability clustering and disaster recovery software is, by necessity, a bit complex and it is definitely outside of the wheelhouse of most IBM i shops, who have rightfully concentrated on implementing business functions and providing batch and transactional systems that manage and drive the business and analytical systems that help steer it. Setting up a DR cluster is not really in the job description, and that is one of the reasons why somewhere around 70 percent of the IBM i shops in the world – and perhaps more – do not have proper disaster recovery software installed on their production machines. This is true even though the cost of Power Systems machines has come down over three and a half decades thanks to Moore’s Law improvements in semiconductors, and even though the cost of HA/DR software has come down almost as fast thanks to competition.
But starting now, the math on DR is changing thanks to managed service providers, who are taking on installing and managing DR solutions on behalf of customers and often supplying the target machines for recovery and HA replication, so customers don’t have to think about that, either. We would go so far as to say that IBM i customers who are shopping for an MSP to help with HA/DR should go with a vendor who can supply the hardware for replication and recovery – or who has a very tight partnership with a public cloud that supports the IBM i platform – as well as the experts who can babysit the HA/DR software on your behalf no matter where it runs.
At Focal Point Solutions Group, we have expertise in the high availability MIMIX and iTera logical replication software from Precisely and the HA and DR software from Maxava and we can also do SAN to SAN replication with IBM’s PowerHA or even Symmetrix Remote Data Facility (SRDF) if customers want to use Dell/EMC storage arrays. (Some do, most don’t.) We can host your failover box in our datacenter or manage a logical partition instance on a cloud as the target system if that is the way you want to go.
The thing to remember is that IBM i shops are dealing with planned as well as unplanned outages.
The planned outages are significant, and growing. IBM i shops have to do maintenance. They have to do operating system upgrades, and they have to apply PTFs as appropriate as they come out each week from IBM. But they can’t afford the number of hours it takes to do these things. We have customers who were taking 18 hours to 22 hours to back the system up before they could do an OS upgrade. You can’t take a machine offline for that long just to do a backup so you have something to roll back to, not when it takes four or five or six hours to do the OS upgrade and then probably a few hours of testing to ensure that the upgrade didn’t mess up the system or the applications.
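To put rough numbers on that window, here is a minimal sketch in Python that adds up the ranges quoted above. The function name is hypothetical, and the two-to-three-hour testing figure is our assumption standing in for "a few hours"; plug in your own numbers.

```python
def outage_window(backup_hours, upgrade_hours, test_hours):
    """Return (low, high) total hours offline for a single-system upgrade.

    Each argument is a (low, high) tuple of hours for that step.
    """
    low = backup_hours[0] + upgrade_hours[0] + test_hours[0]
    high = backup_hours[1] + upgrade_hours[1] + test_hours[1]
    return low, high


# Ranges quoted in the text: backup 18-22 hours, OS upgrade 4-6 hours.
backup = (18, 22)
upgrade = (4, 6)
# Assumption: "a few hours" of post-upgrade testing taken as 2-3 hours.
testing = (2, 3)

low, high = outage_window(backup, upgrade, testing)
print(f"Estimated offline window: {low} to {high} hours")
```

Under those assumptions the machine is unavailable for roughly a day or more, which is the window that replicating to a target box is meant to take off the table.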
If you let us manage the HA/DR software using either logical replication or SAN replication, we replicate to a target box, and we run daily audits to verify that the production and target machines are in sync. That means you can upgrade one machine while the other is running the applications.
Part and parcel of any disaster recovery setup is testing, which as we pointed out above causes a certain amount of trepidation for customers – especially those who are new to HA/DR, and that means most of you reading this over the next few years as you buy HA/DR services.
When we bring people on board, we try to reduce their downtime, but they must test that migration to the box to make sure all of the little details are in place. For instance, do the third party application license keys work? Does the application work, and is the data in sync with what it was before the virtual role swap? You can do some comparisons to see if your data and programs are in sync between the boxes. We can do a test using transactions on a backup box that never touches the production box, and the users don’t even know that you are doing a role swap. We have people who perform virtual role swaps every day until they’re satisfied that the replication is working as it should.
With every agreement we have with our clients, we give them one full test switch every year and we give them the audit reports that prove the role swap worked right. This is a full-on switch of users between production and target machines, and at some point, when it is convenient, we switch them back to their production machine.
That is a bare minimum. Others do more, and it depends on the client and their particular industry and situation. Some companies run on the switched system for a couple of weeks. Some companies switch every 24 hours, and some switch every month, bouncing back and forth between machine one and machine two. Others can’t do this because they cannot afford to put the same horsepower on the target box as they have on their production machine, and they think of the failover as a bare minimum to keep the software working until the primary production machine can be brought back online after a role swap. Our recommendation is to do a role swap once a quarter if you can, and never less than once a year.
David Fahrenkrug is chief disaster recovery officer at Focal Point Solutions Group.
This content was sponsored by Focal Point Solutions Group.