Disaster recovery

Disaster Recovery involves a set of policies, tools and procedures to enable the recovery or continuation of vital technology infrastructure and systems following a natural or human-induced disaster. Disaster recovery focuses on the IT or technology systems supporting critical business functions, as opposed to business continuity, which involves keeping all essential aspects of a business functioning despite significant disruptive events. Disaster recovery can therefore be considered a subset of business continuity.

IT Service Continuity

IT Service Continuity is a subset of business continuity planning and encompasses IT disaster recovery planning and wider IT resilience planning. It also incorporates those elements of IT infrastructure and services which relate to communications such as telephony and data communications.
The IT Service Continuity (ITSC) plan reflects the Recovery Point Objective (RPO) and the Recovery Time Objective (RTO).

Principles of Backup sites

Planning includes arranging for backup sites, be they hot, warm, cold, or standby sites, with hardware as needed for continuity.
In 2008 the British Standards Institution launched BS 25777, a standard connected to and supporting the business continuity standard BS 25999, specifically to align computer continuity with business continuity. It was withdrawn following the publication in March 2011 of ISO/IEC 27031 (Security techniques: Guidelines for information and communication technology readiness for business continuity).
ITIL has defined some of these terms.

Recovery Time Objective

The Recovery Time Objective (RTO) is the targeted duration of time and a service level within which a business process must be restored after a disaster in order to avoid unacceptable consequences associated with a break in business continuity.
In accepted business continuity planning methodology, the RTO is established during the Business Impact Analysis by the owner of a process, which includes identifying time frame options for alternate or manual workarounds.
In a good deal of the literature on this subject, RTO is spoken of as a complement of Recovery Point Objective, with the two metrics describing the limits of acceptable or "tolerable" ITSC performance in terms of time lost from normal business process functioning, and in terms of data lost or not backed up during that period of time respectively.

A Forbes overview noted that it is Recovery Time Actual which is "the critical metric for business continuity and disaster recovery."
RTA is established during exercises or actual events. The business continuity group times rehearsals and makes needed refinements.
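As a minimal sketch of how timed rehearsals might be compared with the objective, the Python snippet below treats the slowest rehearsal as the RTA. The function name `rta_report` and the sample durations are illustrative assumptions, not part of any standard or of the source.

```python
from datetime import timedelta


def rta_report(rto: timedelta, rehearsal_durations: list[timedelta]) -> dict:
    """Compare timed recovery rehearsals against the RTO.

    Assumption: the slowest rehearsal is treated as the Recovery Time Actual.
    """
    if not rehearsal_durations:
        raise ValueError("at least one timed rehearsal is required")
    worst = max(rehearsal_durations)
    return {
        "rta": worst,
        "meets_rto": worst <= rto,
        "margin": rto - worst,  # negative when the RTO was missed
    }


# Hypothetical example: a 4-hour RTO and three timed rehearsals.
report = rta_report(
    timedelta(hours=4),
    [
        timedelta(hours=3, minutes=10),
        timedelta(hours=4, minutes=30),
        timedelta(hours=2, minutes=45),
    ],
)
```

A negative margin, as in this hypothetical case, would be one signal that the plan needs the kind of refinement described above.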

Recovery Point Objective

A Recovery Point Objective (RPO) is defined by business continuity planning. It is the maximum targeted period in which data might be lost from an IT service due to a major incident.
If RPO is measured in minutes, then in practice, off-site mirrored backups must be continuously maintained; a daily off-site backup on tape will not suffice.

Relationship to Recovery Time Objective

Recovery that is not instantaneous will restore data/transactions over a period of time and do so without incurring significant risks or significant losses.
RPO measures the maximum time period in which recent data might have been permanently lost in the event of a major incident; it is not a direct measure of the quantity of such loss. For instance, if the BC plan is "restore up to last available backup", then the RPO is the maximum interval between such backups that have been safely vaulted off-site.
Business impact analysis is used to determine the RPO for each service; the RPO is not determined by the existing backup regime. When any level of preparation of off-site data is required, the period during which data might be lost often starts near the time that work to prepare the backups begins, not the time the backups are taken off-site.
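The reasoning above can be illustrated with a short Python sketch that estimates the worst-case data-loss window from a backup schedule. The `Backup` record, the function name and the sample times are assumptions made for illustration. The sketch also assumes that each backup reflects the data as it stood when preparation began.

```python
from datetime import datetime, timedelta
from typing import NamedTuple


class Backup(NamedTuple):
    prepared_at: datetime  # when work to prepare the backup began
    vaulted_at: datetime   # when the copy was safely off-site


def worst_case_loss_window(backups: list[Backup]) -> timedelta:
    """Longest span of recent data that could be lost between vaulted backups."""
    ordered = sorted(backups, key=lambda b: b.vaulted_at)
    worst = timedelta(0)
    for previous, following in zip(ordered, ordered[1:]):
        # Just before `following` is vaulted, only `previous` is available,
        # and it reflects data as of the start of its preparation.
        worst = max(worst, following.vaulted_at - previous.prepared_at)
    return worst


# Hypothetical daily tape cycle: preparation starts at 22:00, tapes vaulted at 06:00.
daily_tapes = [
    Backup(datetime(2024, 1, 1, 22), datetime(2024, 1, 2, 6)),
    Backup(datetime(2024, 1, 2, 22), datetime(2024, 1, 3, 6)),
    Backup(datetime(2024, 1, 3, 22), datetime(2024, 1, 4, 6)),
]
window = worst_case_loss_window(daily_tapes)
meets_rpo = window <= timedelta(minutes=15)  # hypothetical 15-minute RPO
```

With these sample figures the window is 32 hours rather than 24, because the exposure runs from the start of preparation of one backup until the next one is vaulted. A daily tape cycle of this kind cannot satisfy an RPO measured in minutes.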

Data synchronization points

Although a data synchronization point is a point in time, the timing for performing the physical backup must be included. One approach used is to halt processing of an update queue, while a disk-to-disk copy is made. The backup reflects the earlier time of that copy operation, not when the data is copied to tape or transmitted elsewhere.
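The approach described above can be sketched in Python as follows. This is a toy in-memory model, not a real storage system: the class `UpdateStore` and its methods are hypothetical, and a lock stands in for halting the update queue while the copy is made.

```python
import copy
import threading
from datetime import datetime, timezone


class UpdateStore:
    """Toy in-memory store; the lock stands in for halting the update queue."""

    def __init__(self) -> None:
        self._data: dict[str, int] = {}
        self._lock = threading.Lock()

    def apply_update(self, key: str, value: int) -> None:
        with self._lock:  # updates wait while a copy is in progress
            self._data[key] = value

    def copy_at_sync_point(self) -> tuple[datetime, dict[str, int]]:
        with self._lock:  # halt updates for the duration of the copy
            sync_point = datetime.now(timezone.utc)
            frozen = copy.deepcopy(self._data)
        # Shipping `frozen` to tape or off-site can happen later; the backup
        # still reflects `sync_point`.
        return sync_point, frozen


store = UpdateStore()
store.apply_update("balance", 100)
sync_point, disk_copy = store.copy_at_sync_point()
store.apply_update("balance", 250)  # arrives after the synchronization point
```

The copy keeps the value recorded at the synchronization point, however much later it is written to tape or transmitted.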

How RTO and RPO values affect computer system design

RTO and the RPO must be balanced, taking business risk into account, along with all the other major system design criteria.
RPO is tied to the times at which backups are sent offsite. Sending synchronous copies to an offsite mirror covers most unforeseen difficulties. Physically transporting tapes comfortably covers some backup needs at a relatively low cost. Recovery can be enacted at a predetermined site, and shared offsite space and hardware complete the package needed.
For high volumes of high value transaction data, the hardware can be split across two or more sites; splitting across geographic areas adds resiliency.

History

Planning for disaster recovery and information technology developed in the mid- to late 1970s as computer center managers began to recognize the dependence of their organizations on their computer systems.
At that time, most systems were batch-oriented mainframes. Another offsite mainframe could be loaded from backup tapes pending recovery of the primary site; downtime was relatively less critical.
The disaster recovery industry developed to provide backup computer centers. One of the earliest such centers was located in Sri Lanka.
During the 1980s and 1990s, as internal corporate timesharing, online data entry and real-time processing grew, greater availability of IT systems was needed.
Regulatory agencies became involved even before the rapid growth of the Internet during the 2000s; objectives of 2, 3, 4 or 5 nines were often mandated, and high-availability solutions for hot-site facilities were sought.
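To make the "nines" objectives concrete, the following Python sketch converts an availability target into the downtime it permits per year. The function name and the 365-day year are assumptions chosen for illustration.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: non-leap year


def allowed_downtime_hours(nines: int) -> float:
    """Yearly downtime permitted by an availability target of `nines` nines."""
    if nines < 1:
        raise ValueError("nines must be a positive integer")
    unavailability = 10.0 ** -nines  # e.g. 3 nines -> 99.9% -> 0.001
    return unavailability * HOURS_PER_YEAR


for n in (2, 3, 4, 5):
    print(f"{n} nines: {allowed_downtime_hours(n):.4f} hours/year")
```

Under these assumptions, two nines allow roughly 87.6 hours of downtime a year, while five nines allow only about 5.3 minutes.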
IT Service Continuity is essential for many organizations in implementing and operating information security management and business continuity management, as specified in ISO/IEC 27001 and ISO 22301 respectively.
The rise of cloud computing since 2010 continues that trend: nowadays, it matters even less where computing services are physically served, just so long as the network itself is sufficiently reliable. 'Recovery as a Service' is one of the security features or benefits of cloud computing being promoted by the Cloud Security Alliance.

Classification of disasters

Disasters can be the result of three broad categories of threats and hazards. The first category is natural hazards that include acts of nature such as floods, hurricanes, tornadoes, earthquakes, and epidemics. The second category is technological hazards that include accidents or the failures of systems and structures such as pipeline explosions, transportation accidents, utility disruptions, dam failures, and accidental hazardous material releases. The third category is human-caused threats that include intentional acts such as active assailant attacks, chemical or biological attacks, cyber attacks against data or infrastructure, and sabotage. Preparedness measures for all categories and types of disasters fall into the five mission areas of prevention, protection, mitigation, response, and recovery.

Importance of disaster recovery planning

Recent research supports the idea that implementing a more holistic pre-disaster planning approach is more cost-effective in the long run. Every $1 spent on hazard mitigation saves society $4 in response and recovery costs.
2015 disaster recovery statistics suggest that downtime lasting for one hour can cost

small companies as much as $8,000,
mid-size organizations $74,000, and
large enterprises $700,000.

As IT systems have become increasingly critical to the smooth operation of a company, and arguably the economy as a whole, the importance of ensuring the continued operation of those systems, and their rapid recovery, has increased. For example, of companies that had a major loss of business data, 43% never reopen and 29% close within two years. As a result, preparation for continuation or recovery of systems needs to be taken very seriously. This involves a significant investment of time and money with the aim of ensuring minimal losses if a disruptive event occurs.

Control measures

Control measures are steps or mechanisms that can reduce or eliminate various threats for organizations. Different types of measures can be included in a disaster recovery plan.
Disaster recovery planning is a subset of a larger process known as business continuity planning and includes planning for resumption of applications, data, hardware, electronic communications, and other IT infrastructure. A business continuity plan includes planning for non-IT related aspects such as key personnel, facilities, crisis communication, and reputation protection and should refer to the disaster recovery plan for IT-related infrastructure recovery/continuity.
IT disaster recovery control measures can be classified into the following three types:

Preventive measures – Controls aimed at preventing an event from occurring.
Detective measures – Controls aimed at detecting or discovering unwanted events.
Corrective measures – Controls aimed at correcting or restoring the system after a disaster or an event.

Good disaster recovery practice dictates that these three types of controls be documented and exercised regularly using so-called "DR tests".

Strategies

Prior to selecting a disaster recovery strategy, a disaster recovery planner first refers to their organization's business continuity plan, which should indicate the key metrics of Recovery Point Objective and Recovery Time Objective. Metrics for business processes are then mapped to their systems and infrastructure.
Failure to properly plan can extend the disaster's impact. Once metrics have been mapped, the organization reviews the IT budget; RTO and RPO metrics must fit with the available budget. A cost-benefit analysis often dictates which disaster recovery measures are implemented.
Adding cloud-based backup to the benefits of local and offsite tape archiving, the New York Times wrote, "adds a layer of data protection."
Common strategies for data protection include:

backups made to tape and sent off-site at regular intervals
backups made to disk on-site and automatically copied to off-site disk, or made directly to off-site disk
replication of data to an off-site location, which overcomes the need to restore the data, often making use of storage area network technology
Private Cloud solutions that replicate the management data into the storage domains that are part of the private cloud setup. These management data are configured as an XML representation called OVF (Open Virtualization Format), and can be restored once a disaster occurs.
Hybrid Cloud solutions that replicate both on-site and to off-site data centers. These solutions provide the ability to fail over instantly to local on-site hardware, but in the event of a physical disaster, servers can be brought up in the cloud data centers as well.
the use of high availability systems which keep both the data and system replicated off-site, enabling continuous access to systems and data, even after a disaster
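The selection process described in this section (map the RTO and RPO, then fit them to the budget) can be sketched in Python as below. All strategy figures are hypothetical placeholders rather than measured or quoted values. Picking the cheapest feasible option is one simple stand-in for a fuller cost-benefit analysis.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional


@dataclass(frozen=True)
class Strategy:
    name: str
    rto: timedelta      # recovery time the strategy is assumed to achieve
    rpo: timedelta      # data-loss window the strategy is assumed to achieve
    annual_cost: float  # hypothetical yearly cost


def choose_strategy(
    candidates: list[Strategy],
    target_rto: timedelta,
    target_rpo: timedelta,
    budget: float,
) -> Optional[Strategy]:
    """Return the cheapest strategy meeting both objectives within budget."""
    feasible = [
        s
        for s in candidates
        if s.rto <= target_rto and s.rpo <= target_rpo and s.annual_cost <= budget
    ]
    return min(feasible, key=lambda s: s.annual_cost, default=None)


# Hypothetical figures for three of the strategies listed above.
CANDIDATES = [
    Strategy("tape sent off-site", timedelta(hours=48), timedelta(hours=24), 10_000),
    Strategy("disk copied off-site", timedelta(hours=8), timedelta(hours=4), 40_000),
    Strategy("off-site replication", timedelta(hours=1), timedelta(minutes=5), 150_000),
]

choice = choose_strategy(
    CANDIDATES, timedelta(hours=12), timedelta(hours=6), budget=100_000
)
```

When no candidate satisfies the objectives within the budget, the function returns `None`. In practice that would prompt a revision of either the objectives or the budget.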

In many cases, an organization may elect to use an outsourced disaster recovery provider to provide a stand-by site and systems rather than using its own remote facilities, increasingly via cloud computing.
In addition to preparing for the need to recover systems, organizations also implement precautionary measures with the objective of preventing a disaster in the first place. These may include:

local mirrors of systems and/or data and use of disk protection technology such as RAID
surge protectors — to minimize the effect of power surges on delicate electronic equipment
use of an uninterruptible power supply and/or backup generator to keep systems going in the event of a power failure
fire prevention/mitigation systems such as alarms and fire extinguishers
anti-virus software and other security measures

Disaster Recovery as a Service (DRaaS)

Disaster Recovery as a Service (DRaaS) is an arrangement with a third party, a vendor. It is commonly offered by service providers as part of their service portfolio.
Although vendor lists have been published, disaster recovery is not a product but a service, even though several large hardware vendors have developed mobile/modular offerings that can be installed and made operational in a very short time.

Axcient
BASELAYER has a patent on software defined modular data center.
Cisco Systems
Databarracks
Google has developed systems that could be used for this purpose.
Bull
HP
Huawei
IBM
Schneider-Electric
Sun Microsystems
SunGard Availability Services
ZTE Corporation
