Web Services Continuity Plan

Note: Many of the tasks described in this document have been successfully completed. We provide excerpts from the first ICPSR Web Continuity Plan as an example for others.

Summary

Many of our clients experience ICPSR through interaction with the website. Given the importance of ICPSR's Web presence, the site must be robust and reliable. ICPSR's website demonstrates a high level of reliability, typically exceeding 99% availability each month. However, like other sites, it is exposed to two types of failure: hardware failure in a key component and environmental failure in building services. Creating a plan that addresses both types of failure is critical. Our goal is to reduce our maximum time to recovery to less than 60 minutes.
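To put the availability figure and the recovery goal on the same scale, the short calculation below converts between them. It is an illustrative sketch only: the 30-day month and the function names are assumptions made for the example, not part of the plan.

```python
MINUTES_PER_DAY = 24 * 60


def downtime_budget_minutes(availability: float, days_in_month: int = 30) -> float:
    """Minutes of downtime a month can absorb at the given availability."""
    total_minutes = days_in_month * MINUTES_PER_DAY
    return total_minutes * (1.0 - availability)


def availability_with_outage(outage_minutes: float, days_in_month: int = 30) -> float:
    """Monthly availability if the only downtime is a single outage."""
    total_minutes = days_in_month * MINUTES_PER_DAY
    return 1.0 - outage_minutes / total_minutes


# Assumption: a 30-day month (43,200 minutes).
budget = downtime_budget_minutes(0.99)
single_outage = availability_with_outage(60)

print(f"Downtime allowed at 99%: {budget:.0f} minutes ({budget / 60:.1f} hours)")
print(f"Availability with one 60-minute outage: {single_outage:.3%}")
```

Under these assumptions, 99% monthly availability still permits roughly 7.2 hours of downtime, so a 60-minute ceiling on any single recovery is a much tighter constraint than the availability figure alone.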

ICPSR is addressing these issues by implementing a three-phase plan. Phase I deploys a replica of our Web server in an off-site location along with controls to redirect clients to the replica during a failure of ICPSR's primary Web services. Phase II redeploys ICPSR's primary Web services from the Perry Building to another location. Phase III explores alternative methods for delivering ICPSR Web services, such as a distributed architecture that could plug into content delivery networks.

Phase I: Deploy a Replica of ICPSR's Web Services

Overview

In Phase I of the solution, we deploy a replica of ICPSR's Web services in an off-site location, and we create the necessary infrastructure so that Web browsers can be redirected to this replica when the primary systems lose service.

Phase I Tasks

Tasks fall into two main groups: building and testing the replica, and deploying infrastructure to effect the redirection to the replica. Below is a brief list of the tasks involved.

  • Purchase replica hardware
  • Configure replica hardware
  • Design synchronization system
  • Deploy synchronization system
  • Alpha-test replica
  • Deploy replica at partner location
  • Create Memo of Understanding (MOU) between ICPSR and partner to document the arrangement
  • Extensively test replica
  • Design simple mechanism that quickly redirects clients to the replica
  • Deploy redirection system
  • Implement "Control and Information Center" (CIC) on outside infrastructure
  • Implement 24 x 7 monitoring of our primary content delivery system
  • Implement real-time notification system for system failure
  • Implement fail-safe notification system
  • Implement on-call rotation to respond to failure notices
  • Redeploy search engine to a highly available system
  • Add visual cue to replica to distinguish warm backup site from the production site
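The monitoring, notification, and redirection tasks above could fit together roughly as follows. This is a minimal sketch under stated assumptions, not a description of the system ICPSR deployed: the `FailoverMonitor` class, the three-failure threshold, and the `probe` and `notify` callables are all hypothetical. In practice `probe` would perform a health check against the primary site and `notify` would page the on-call staff; here both are stand-ins so the logic can run without a network.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class FailoverMonitor:
    """Track consecutive probe failures and switch to the replica (hypothetical design)."""

    probe: Callable[[], bool]       # returns True when the primary site responds
    notify: Callable[[str], None]   # delivers a message to on-call staff
    threshold: int = 3              # assumed number of consecutive failures before redirecting
    failures: int = 0
    active_site: str = "primary"

    def check_once(self) -> str:
        """Run one probe and return the site clients should be sent to."""
        if self.probe():
            self.failures = 0
        else:
            self.failures += 1
            if self.failures >= self.threshold and self.active_site == "primary":
                self.active_site = "replica"
                self.notify(f"Primary failed {self.failures} checks; redirecting to replica")
        return self.active_site


# Scripted probe results stand in for real health checks.
outcomes = iter([True, False, False, False, True])
sent: List[str] = []
monitor = FailoverMonitor(probe=lambda: next(outcomes), notify=sent.append)
history = [monitor.check_once() for _ in range(5)]
print(history)
print(sent)
```

Requiring several consecutive failures avoids redirecting clients on a single dropped probe. The sketch deliberately does not switch back to the primary on its own once the primary recovers; returning to production is left as a manual decision for the on-call responder.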

Phase II: Move ICPSR Systems to Another Location

Overview

In Phase II of the plan we relocate all public-facing ICPSR Web delivery systems from the Perry Building to another location that offers standard data center amenities such as raised floors, generator power, cable trays, high-speed networking, card-key and retina-scan access controls, and more.

Phase II Tasks

  • Apply for space with a new vendor
  • Decide on the specific computing equipment to relocate
  • Decide on the specific computing equipment to purchase
  • Complete and return necessary forms that enumerate equipment and power requirements
  • Research and purchase auxiliary pieces of equipment, such as remote console access
  • Schedule move window and announce any downtime to community
  • Arrange movers
  • Execute move
  • Test equipment that has moved
  • Restore production service

Phase III: Explore Alternative Services to Deliver Content

Overview

In this phase we change the architecture of the ICPSR content delivery system so that content can be delivered via a network of arbitrary size. In the short term, this is a "blue sky" exercise during which we can consider many different alternatives. One option is to move much of our content delivery into cloud systems or professional content delivery networks.

Phase III Tasks

The main task in this phase is conducting strategic planning discussions among the organization's executive team about ICPSR's content delivery goals.