Lost Data – The Silent Killer
May 8, 2024 Richard Dolewski
DATA is the BACKBONE of today’s organization. INFORMATION is the most valuable asset of the CORPORATION. When data is lost, held RANSOM, or simply unavailable, it negatively affects (and potentially halts) all desired business outcomes.
Today’s dynamic business environments are based on a seamless flow of information. Organizations have invested in IT because they rely on technology to conduct business to sustain profitability.
Systems of record that previously functioned in the background now include self-service and real-time connections, with application endpoints spanning numerous clouds. Transactions now originate and complete across numerous cloud interactions rather than in a single hosted database. In our “always-connected” world, the flow of information and commerce means around-the-clock access to data because our business never sleeps. Consumer appetite for self-service and business systems availability means IT must deliver game-changing, commercial-ready ERP applications, quick and secure transactional response times, compliance, and a variety of secured mobile and social capabilities.
IT must consider resilience, operational continuity, and data protection to support today’s digital transformation. Data protection and disaster recovery are your lines of defense against severe business impact.
What’s Your Data Posture?
Traditional tape media, onsite VTL, and cloud-based point-in-time backups have long been viewed as an essential part of IT infrastructure architecture, and that will never change. Data protection is not just about natural disasters or a data center failure. The risk lies in the data lost between the time the data was last protected (the point-in-time backup) and the time the system loss occurs. With restore times growing as data grows, and with off-site data retention required for continuity compliance, businesses cannot afford permanent, unrecoverable data loss. Point-in-time solutions typically deliver an RPO of 24 hours, and human errors and missed backups make that inherent RPO weakness worse. The cost of permanently lost data is high and includes the cost of lost revenue.
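To make the exposure concrete, here is a minimal sketch of the data loss window described above. The backup schedule, the failure time, and the function name `data_loss_window` are hypothetical values chosen for illustration, not figures from any real environment.

```python
from datetime import datetime, timedelta


def data_loss_window(last_good_backup: datetime, failure_time: datetime) -> timedelta:
    """Span of transactions that a point-in-time restore cannot bring back."""
    if failure_time < last_good_backup:
        raise ValueError("failure_time precedes last_good_backup")
    return failure_time - last_good_backup


# Hypothetical schedule: nightly backup at 01:00, system lost at 16:30 the same day.
last_backup = datetime(2024, 5, 7, 1, 0)
failure = datetime(2024, 5, 7, 16, 30)
window = data_loss_window(last_backup, failure)
print(window)  # 15:30:00

# If the previous night's backup was missed, the last good copy is a day older.
window_missed = data_loss_window(last_backup - timedelta(days=1), failure)
print(window_missed)  # 1 day, 15:30:00
```

A single missed backup turns roughly 15 hours of lost transactions into almost 40, which is why a nominal 24-hour RPO is best treated as a floor rather than a guarantee.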
The Bureau of Labor reports that 93 percent of businesses that suffer a significant loss of data go out of business within five years. Point-in-time backups should never be your line of defense for real-time transaction-based operations driven by our systems of record. Yes, they are essential for historical restoration, but not for system loss. Given the rapid growth and volume of corporate data, it is increasingly critical for stakeholders to protect this data with no loss of transactions. Downtime means lost daily transactions and threatens the very integrity of your databases and the ERP applications that use them.
ITIC’s Hourly Cost of Downtime survey indicates that a single hour of server downtime costs $300,000 or more for 91 percent of mid-sized enterprises (SMEs) and large enterprises. Among that 91 percent majority, nearly half (44 percent) said hourly outage costs run from more than $1 million to over $4 million. This is very costly to the business bottom line and stock valuation.
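As a rough worked example, the hourly figure can be scaled to an outage of any length. The sketch below uses the $300,000 per hour floor quoted from the ITIC survey. The outage durations and the function name `outage_cost` are hypothetical, and a simple linear model is an assumption, since real costs rarely scale so neatly.

```python
def outage_cost(hours_down: float, hourly_cost: float) -> float:
    """Linear estimate of downtime cost: hours of outage times cost per hour."""
    return hours_down * hourly_cost


ITIC_HOURLY_FLOOR = 300_000  # USD per hour, the survey figure quoted in the article

for hours in (1, 4, 8):  # hypothetical outage durations
    cost = outage_cost(hours, ITIC_HOURLY_FLOOR)
    print(f"{hours} h outage: at least ${cost:,.0f}")
```

Under these assumptions, a single eight-hour business day of downtime comes to at least $2.4 million.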
Protecting The Business
IT resilience goes far beyond the boundaries traditionally covered by data protection. It includes both the traditional data protection solution, which is reactive, and eliminating planned outages, which is proactive. In a world where security breaches and ransomware are the new threats in our digital environments, the scope of IT resilience is much broader than just equipment failures or FEMA-related events.
FEMA-related disasters account for less than 10 percent of downtime in IT. The majority of system outages result from planned downtime, software issues, power loss, hardware failures, and human errors. So, consideration must also be given to the 90 percent of issues that may result in downtime. Bottom line: DR planning remains a challenging undertaking that companies must take on if they want to survive interruptions. FEMA-related events may represent only 10 percent of outages, but every outage, whatever its cause, carries a financial impact. You shouldn’t deceive yourself about probabilities or frequencies of disaster events. Prepare for the 10 percent, and you will also be able to manage the 90 percent. Manage the loss of uptime and data today to protect against the major financial consequences of tomorrow.
Organizations today continue to deal with backup technology and its limitations:
- Unacceptable lost productivity due to restoration time (RTO)
  - Recovery time is long, complicated, and NOT repeatable.
- Significant revenue loss due to unrecoverable data (RPO)
  - Unrecoverable transactions from the current business day
  - Transactions span numerous servers and clouds
- Backup policies must be reviewed
  - Full nightly backups are never feasible.
  - SNAP technology
  - Immutable backups
Why RPO And RTO Still Matter
Restoration of IT services will depend on your organization’s decision on the level of service required as measured by the following recovery metrics:
Maximum Allowable Downtime (MAD) = RTO + UAT
- Recovery Time Objective (RTO) is the maximum length of time that a computer system, network, and application(s) can be down after a failure or disaster occurs.
- User Acceptance Testing (UAT) is the amount of time between when the computer systems are made available and when the end-to-end business flow has been validated, confirming that ALL applications are in the same usable state as before the outage.
- Recovery Point Objective (RPO) is the point in time to which data is restored, which reflects the maximum amount of data loss the business can tolerate. It points to the starting place where information must be available and ensures that the database and application are in good standing.
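The metrics above can be checked against the results of a DR test. The following is a minimal sketch, assuming all three objectives are expressed in hours. The class `RecoveryObjectives`, the function `meets_objectives`, and every number in it are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RecoveryObjectives:
    rto_hours: float  # maximum time systems may be down
    uat_hours: float  # time allowed to validate the end-to-end business flow
    rpo_hours: float  # maximum tolerable data loss, expressed as time

    @property
    def mad_hours(self) -> float:
        # Maximum Allowable Downtime (MAD) = RTO + UAT
        return self.rto_hours + self.uat_hours


def meets_objectives(obj: RecoveryObjectives,
                     recovery_hours: float,
                     validation_hours: float,
                     data_loss_hours: float) -> bool:
    """Compare measured DR test results against the stated objectives."""
    downtime_ok = recovery_hours + validation_hours <= obj.mad_hours
    data_ok = data_loss_hours <= obj.rpo_hours
    return downtime_ok and data_ok


# Hypothetical objectives: 4 h RTO, 2 h UAT, 15 min RPO.
objectives = RecoveryObjectives(rto_hours=4, uat_hours=2, rpo_hours=0.25)
print(objectives.mad_hours)  # 6

# Hypothetical DR test: 5 h to recover, 1.5 h to validate, 6 min of data lost.
print(meets_objectives(objectives, 5, 1.5, 0.1))  # False
```

In this example the data loss is within the RPO, but the 6.5 hours of total downtime exceeds the 6-hour MAD, so the test fails. That is the kind of gap the business needs to hear about in simple terms.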
You must ask: Does the business understand how long it will take IT to recover the business in a disaster-related event? Is IT still aligned with the business? Communicate, in simple terms, the capabilities and success criteria of your DR solution, and emphasize what it delivers to your business.
Dear Mother Nature
The tornado season has already arrived, with devastating impacts across the Midwest. Red flag fire warnings are up, and tropical storms are on the horizon.
Every time an unplanned impact occurs, we get a gut-wrenching reminder of the importance of Business Continuity and repeatable disaster recovery solutions. Specifically, a solution that guarantees your applications will function the same way they do “today” in production, with current data, in a separate geographic region.
Ask yourself:
- Does our current Disaster Recovery (DR) solution demonstrate confidence to the business that we can failover our IT Systems within stated recovery objectives?
- Does our DR solution minimize Data Loss?
- Does the solution deliver Resiliency, Security, and Availability?
It’s imperative to replicate data outside of the primary business site’s FEMA region so that it is available for systems failover, protected to the last transaction, and able to deliver business resilience. Companies should also ensure they have committed like-for-like infrastructure they can recover to in an alternate FEMA geographic region.
Readiness Assessment: Prevention
The objective of a readiness assessment is to formally review your current infrastructure reference architecture and availability state for the entire datacenter. This includes assessing every system per LPAR and per Host that delivers the current business application environments. Re-discover and re-architect your server platforms, network, and storage in the datacenter and cloud foundation for resiliency.
Performing an assessment is also disaster prevention. In other words, money well spent. With so many interdependencies, IT should examine all production workloads, potential single points of failure, and associated risk factors. Companies with mission-critical applications with little or NO tolerance for downtime must maintain precise system availability and disaster prevention measures. This means that COMPLETE data protection, resiliency, and uptime strategies, when fully implemented and tested, will provide you with the availability your business demands.
Conduct a workshop with your IT staff oriented toward gaining an understanding of the current operating environment to deliver availability and resiliency.
The workshop must review underlying technologies and network dependencies, profile the flow of information between applications, and cover monitoring and runbooks.
- Review on-premises and cloud compute requirements.
- Gap Analysis and remediation: Technology Refresh
- Critical server definition: Mapping of servers, dependencies & interfaces
- Data protection, logical/hardware replication, and recovery models
- System, application, and data: COMPLIANCE VALUE
- Design for IT resiliency versus disaster recovery: BUSINESS VALUE
- Monitoring and centric message management: UPTIME VALUE
- Examine hybrid, internal, and external cloud options: CLOUD-ENABLED
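The server and dependency mapping in the list above lends itself to a simple graph. As a minimal sketch, the inventory below is entirely hypothetical, and counting direct dependents is only one rough way to flag candidate single points of failure. It uses Python's standard `graphlib` module to derive an order in which servers could be recovered.

```python
from graphlib import TopologicalSorter

# Hypothetical inventory: each server maps to the servers it depends on.
dependencies = {
    "erp-app": {"erp-db", "auth"},
    "erp-db": {"san"},
    "auth": {"san"},
    "web-portal": {"erp-app"},
    "san": set(),
}

# A server can only be recovered after everything it depends on is back.
recovery_order = list(TopologicalSorter(dependencies).static_order())
print(recovery_order)


def direct_dependents(deps: dict) -> dict:
    """Count how many servers depend directly on each server."""
    counts = {name: 0 for name in deps}
    for needed in deps.values():
        for name in needed:
            counts[name] = counts.get(name, 0) + 1
    return counts


# Servers with the most dependents deserve a closer look in the gap analysis.
print(direct_dependents(dependencies))
```

`TopologicalSorter` raises a `CycleError` if two servers depend on each other, which is itself a useful finding for the workshop.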
You must rationalize your organization’s ability to utilize its Cloud architecture to achieve new business milestones. Findings from your assessment will provide a basis for an acceptable level of risk, technology or configuration improvements, and remediation activities, which equates to a positive impact on the business.
Modernizing IT Resilience
IT resilience today requires more than just nightly backups and DR. There is a significant business need for a flexible and cost-effective technology foundation built on Continuous Data Protection (CDP), SAN tooling with Global Mirror and PowerHA, and Db2 Mirror, bundled with Managed Disaster Recovery Services to deliver the best resiliency available. Instead of restoring from last night’s backup, you can recover to the journal checkpoint minutes before a bad DB update or fail over to your active DB on another IBM Power host. CDP utilizes journaling technology to keep track of all transactions, while Db2 Mirror uses remote direct memory access (RDMA) across two paired nodes to create a synchronous environment. With Db2 Mirror, automated clones, and Global Mirror with IASP, systems availability means keeping production environments online during planned maintenance, with near-zero data loss because any DB and system updates are automatically mirrored to another server.
The use of these technologies provides multiple benefits over traditional data protection technologies. These are all powerful tools for developing IT resilience when combined with orchestration capabilities from Managed Services, which allows numerous recovery points.
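The idea of recovering to a journal checkpoint just before a bad update can be illustrated with a toy model. This is a sketch of the general technique only. It is not how IBM i journaling or any CDP product is implemented, and the names `JournalEntry` and `replay` and the account data are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JournalEntry:
    seq: int    # journal sequence number
    key: str    # record that was changed
    value: int  # value written by the transaction


def replay(baseline: dict, journal: list, up_to_seq: int) -> dict:
    """Rebuild state by applying entries with seq <= up_to_seq to a copy of the baseline."""
    state = dict(baseline)
    for entry in sorted(journal, key=lambda e: e.seq):
        if entry.seq > up_to_seq:
            break
        state[entry.key] = entry.value
    return state


baseline = {"acct-1": 100, "acct-2": 50}  # state captured by last night's backup
journal = [
    JournalEntry(1, "acct-1", 120),
    JournalEntry(2, "acct-2", 75),
    JournalEntry(3, "acct-1", 0),  # the bad update
]

# Recover to the checkpoint just before the bad update.
recovered = replay(baseline, journal, up_to_seq=2)
print(recovered)  # {'acct-1': 120, 'acct-2': 75}
```

Restoring the backup alone would return the baseline and lose the two good transactions from the current business day. Replaying the journal up to sequence 2 keeps them and drops only the bad update.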
Lastly, by implementing Virtual Protection Groups (VPGs), you can group applications to be protected and recovered together. This provides complete application consistency, regardless of the physical location of the servers and storage. Protecting workloads with VPGs allows you to use an application-level granularity as opposed to a LUN-level granularity. Availability drives everything you do – from the business processes that drive revenue to the applications that drive productivity and the data that drives business decisions.
Benefits in the digital world:
- Eliminates most planned and unplanned outages
- Granular recovery of files/tables
- Complete systems and database failover capabilities
- Resiliency Strategy includes application tiers, on-premises and hybrid cloud
- Orchestrated testing for non-disruptive capabilities
- Infrastructure performance validation
- Security and ransomware capabilities
Because today’s business environments are much more reliant on IT systems to deliver value than ever before, companies are very risk-averse regarding change because of the potential loss of data or outages caused by failed updates and upgrades, which can lead to financial and reputational damage.
“The datacenter is down.” These dreaded words are something no one in the company ever wants to hear. Losing the engine room of your business operations, the IBM Power System, is a critical failure in service delivery. How much data loss and downtime can you afford without significant revenue loss? The answers to these vital business questions will drive your Availability and Data Protection requirements.
One wise Master of Disaster once said: Changes are driven by the business, requiring a fundamental change in our thinking about how IT delivers value.
Cloud Resiliency delivers application uptime and data protection to the last business transaction! Ensure this message clearly aligns with Business outcomes – not IT outcomes.
Richard Dolewski is vice president of hybrid cloud solutions at Connectria.
This content is sponsored by Connectria.