What is Configuration Drift?
by Brian Schwarzentruber
Configuration drift is a term used in data center environments. At a high level, configuration drift happens when production (primary) hardware and software infrastructure configurations “drift”, or become different in some way, from a recovery (secondary) configuration, or vice versa. Primary and secondary configurations are designed to be identical in certain respects so that the business can resume operations should there be a disaster or major failure in production. When these infrastructure configurations drift from one another, they leave a gap between them, which is commonly called a configuration gap.
Configuration drift is a natural condition in every data center environment because of the sheer number of ongoing hardware and software changes. Configuration drift accounts for 99% of the reasons why disaster recovery and high availability systems fail. Unidentified configuration drift exposes an organization to a high risk of data loss and extended outages. To eliminate these risks, configuration drift needs to be identified and corrected when it happens.
There are basically two approaches to identifying configuration drift when it occurs. One method involves manually reviewing each production configuration and comparing it to the recovery or secondary configuration. This is often done prior to a disaster recovery test and is very time-consuming and expensive. During the test planning process, spreadsheets and other data that list and describe the individual hardware and software devices making up the configurations are brought together from the different functional infrastructure groups for comparison and reconciliation. These groups include, but are not limited to, Data Storage, Server/Platform/Mid-range, and Database. There are often large discrepancies between these lists, which compounds the difficulty of the effort and can cause configuration gaps to be missed entirely. This explains why 3 out of 4 disaster recovery tests fail, leaving large amounts of data exposed to unnecessary risk.
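The comparison step of this manual approach can be illustrated with a minimal Python sketch. It assumes each configuration has already been collected into a simple dictionary of settings; the function name `find_drift` and the setting names (`os_version`, `multipath_enabled`, `lun_count`) are hypothetical and chosen only for illustration, not taken from any particular tool.

```python
def find_drift(primary, secondary, keys_to_match):
    """Return {setting: (primary_value, secondary_value)} for each setting that differs.

    Only the settings listed in keys_to_match are compared, since primary and
    secondary configurations are meant to be identical only in certain respects.
    """
    drift = {}
    for key in keys_to_match:
        primary_value = primary.get(key)      # None if the setting is missing
        secondary_value = secondary.get(key)
        if primary_value != secondary_value:
            drift[key] = (primary_value, secondary_value)
    return drift


# Hypothetical inventory data for one production host and its recovery twin.
production = {"os_version": "7.9", "multipath_enabled": True, "lun_count": 12}
recovery = {"os_version": "7.9", "multipath_enabled": False, "lun_count": 10}

gaps = find_drift(
    production,
    recovery,
    ["os_version", "multipath_enabled", "lun_count"],
)

for setting, (prod_value, rec_value) in gaps.items():
    print(f"{setting}: production={prod_value!r}, recovery={rec_value!r}")
```

In practice the hard part is the one described above: getting complete and consistent inventories from each infrastructure group. A comparison like this is only as good as the lists fed into it.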
The other method involves directly identifying the configuration gaps in the environment. Some organizations have recognized this and have developed and maintain scripts that run periodically to search for the gap “signatures” left by configuration drift. This works well; however, it is often limited to a few gaps, and each script typically looks for only one gap. The set of scripts grows only as more configuration drifts are discovered through failed disaster recovery tests or, worse, failed production recovery efforts.
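As a rough sketch of the signature idea, the Python example below treats each gap signature as a named check that is run against a pair of environment descriptions. Everything here is an assumption made for illustration: the `GapSignature` type, the `scan` function, the two sample signatures, and the dictionary fields (`volumes`, `replicated_volumes`, `memory_gb`) are hypothetical and do not describe how any specific organization or product implements its checks.

```python
from typing import Callable, List, NamedTuple


class GapSignature(NamedTuple):
    """A named check that returns True when its gap is present."""
    name: str
    matches: Callable[[dict, dict], bool]


# Two hypothetical signatures; a real library would hold many more.
SIGNATURES = [
    GapSignature(
        "production volume is not replicated",
        lambda prod, rec: bool(
            set(prod.get("volumes", [])) - set(rec.get("replicated_volumes", []))
        ),
    ),
    GapSignature(
        "recovery host has less memory than production",
        lambda prod, rec: rec.get("memory_gb", 0) < prod.get("memory_gb", 0),
    ),
]


def scan(production: dict, recovery: dict, signatures: List[GapSignature]) -> List[str]:
    """Run every signature and return the names of the gaps that were found."""
    return [sig.name for sig in signatures if sig.matches(production, recovery)]


# Hypothetical environment data: logs01 was added in production but never replicated.
production = {"volumes": ["data01", "data02", "logs01"], "memory_gb": 64}
recovery = {"replicated_volumes": ["data01", "data02"], "memory_gb": 64}

found = scan(production, recovery, SIGNATURES)
for gap in found:
    print("Gap found:", gap)
```

Keeping the signatures in one list, rather than in separate single-purpose scripts, means that each newly discovered gap becomes one more entry that is checked on every run.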
How do I know this?
I know this because Continuity Software maintains an ever-growing database of gap signatures. When this was first written there were about 2,000 gap signatures; today there are over 5,000 that can be used to analyze your environments on a daily basis. Once a gap is identified, the configuration drift can be corrected and your valuable data is once again protected.