Archiving is hardly a new idea. People have been storing records with the thought that they might have historical value in the future for as long as writing has been around and maybe longer. Within the context of computing, archiving has existed for a long time as well. Traditionally, archiving has essentially represented the end of the line for data. Information was offloaded onto tape, then carted away to a safe location, presumably never to see the light of day again except under dire circumstances.
But that was then. As with so many other areas of the storage infrastructure, the nature of archiving is changing rapidly. Archiving can no longer be treated as an afterthought. Although it's still the last stop before data is expunged, archiving now plays a more critical role in the overall information infrastructure and consequently must be more closely integrated within the entire system.
When viewed in the context of all stored data, computer storage is actually more about archived data than the storage of production data. According to a recent study by market researcher Robert Abraham of Freeman Reports, the total installed base of all archived computer data is 15.28 exabytes, which represents 80.3 percent of all stored data. (An exabyte is 1024 petabytes, and a petabyte is 1024TB, so 15.28 exabytes is a lot of data!) The installed base of all stored computer capacity is projected to grow at a compound annual rate of 49 percent through 2007. At that rate, the installed base capacity of disk and tape will top the 90-exabyte mark in 2007. Take roughly 80 percent of these totals, assuming the archived share holds steady, and you get the projected numbers for archived data. Finally, 99.9 percent of all archived data is stored on tape media. As measured by the amount of data stored on its products, StorageTek is the market leader in archived digital data with 36.2 percent, followed by IBM with 16 percent, ADIC with 15.3 percent, and Overland Storage with 10 percent.
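The arithmetic behind those figures can be checked with a short sketch. The archived volume, the archived share, and the growth rate come from the study as quoted above. The study's base year is not given here, so the number of compounding years is an assumption, and the function names are illustrative only.

```python
# Figures quoted from the Freeman Reports study.
ARCHIVED_EB = 15.28      # archived data, in exabytes
ARCHIVED_SHARE = 0.803   # archived data as a fraction of all stored data
CAGR = 0.49              # compound annual growth rate of total capacity


def total_installed_base(archived_eb, archived_share):
    """Total stored data implied by the archived volume and its share."""
    return archived_eb / archived_share


def project(base_eb, rate, years):
    """Compound growth: base * (1 + rate) ** years."""
    return base_eb * (1 + rate) ** years


total_eb = total_installed_base(ARCHIVED_EB, ARCHIVED_SHARE)

# ASSUMPTION: the base year is unknown, so a range of horizons is printed.
for years in range(5):
    projected = project(total_eb, CAGR, years)
    print(f"after {years} year(s): total {projected:6.2f} EB, "
          f"archived (at 80.3%) {projected * ARCHIVED_SHARE:6.2f} EB")
```

Under these assumptions, the implied total today is about 19 exabytes, and four years of compounding is the first whole-year horizon at which the total passes 90 exabytes.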
Two factors are driving renewed attention and interest in archiving: regulatory compliance and the sheer volume of data growth. Of course, compliance issues have been well aired for several years now. It seems as if every month, some high-profile lawsuit revolves around email evidence exhumed from an archive. A raft of regulations has put IT managers on notice that information they once thought could be tucked away for safekeeping might have to be retrieved within a relatively short timeframe. Not only must they keep track of data, they must also ensure that specific records are accessible.
Daunting as the compliance-driven requirements for managing archived data may seem, they're only part of the challenge, and a small part at that. In fact, argues Russ Kennedy, director of software product marketing and management at StorageTek, "Compliance generates 95 percent of the discussion, but it's only 5 percent of the problem."
The real issue, according to Kennedy, is the sheer volume of data growth. With data being generated at such a rapid clip, he notes, it's impossible to archive everything. "You can't accumulate data forever," he says.
The solution, according to Kennedy, is what he calls intelligent archiving. Intelligent archiving has many facets. Perhaps the most important is the ability to pull information from disparate systems into a common repository while maintaining the metadata needed to preserve the context within which the data was created and used. Ironically, that process has been the essence of archiving from the beginning, as anybody who's rummaged through the personal papers of historical figures knows.
Second, in addition to aggregating data from multiple sources, intelligent archiving requires the ability to eliminate duplicate copies of data. By some estimates, as much as 80 percent of all stored data consists of redundant copies and copies of copies.
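One common way to eliminate duplicate copies is to key each item by a hash of its content, so that identical data is stored only once no matter how many names point to it. The sketch below is a minimal illustration of that idea, not a description of any vendor's product. The function name and the sample records are hypothetical.

```python
import hashlib


def deduplicate(records):
    """Store each distinct content once, keyed by its SHA-256 digest.

    records: iterable of (name, content_bytes) pairs.
    Returns (store, index): store maps digest -> content,
    index maps each original name -> digest.
    """
    store = {}
    index = {}
    for name, content in records:
        digest = hashlib.sha256(content).hexdigest()
        store.setdefault(digest, content)  # keep only the first copy
        index[name] = digest               # every name stays resolvable
    return store, index


# Hypothetical sample data: two of the three items are byte-identical.
sample = [
    ("mail/alice/report.doc", b"Q3 results"),
    ("mail/bob/report.doc", b"Q3 results"),
    ("shares/finance/budget.xls", b"FY2005 budget"),
]
store, index = deduplicate(sample)
print(f"{len(index)} names, {len(store)} stored copies")
```

Keeping the name-to-digest index separate from the content store means every original location can still be resolved after the redundant copies are dropped.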
The third aspect of intelligent archiving is implementing policies that can effectively eliminate data at the appropriate time. Although many regulations mandate that data be stored for specific periods of time, those periods are finite. Certain HIPAA regulations, for example, require data about children to be stored for as long as 25 years. But when 25 years have passed, that data should be expunged, probably without human intervention.
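A retention policy of this kind can be reduced to a table of retention periods and a date comparison. The following is a rough sketch under stated assumptions: the 25-year period is the figure cited above, while the record-class names, the 7-year email period, and the function names are hypothetical placeholders.

```python
from datetime import date

# Retention periods in years, by record class.
# ASSUMPTION: class names and the email period are illustrative only.
RETENTION_YEARS = {
    "pediatric_record": 25,  # figure cited in the article
    "email": 7,              # hypothetical placeholder
}


def expiry_date(created, years):
    """Date on which a record created on `created` may be expunged."""
    try:
        return created.replace(year=created.year + years)
    except ValueError:
        # Feb 29 in a leap year maps to Feb 28 in a non-leap target year.
        return created.replace(year=created.year + years, day=28)


def is_expired(created, record_class, today):
    """True once the retention period for this record class has elapsed."""
    return today >= expiry_date(created, RETENTION_YEARS[record_class])


def purge(records, today):
    """Return only the records still inside their retention period."""
    return [r for r in records
            if not is_expired(r["created"], r["class"], today)]
```

A scheduled job could call `purge` with the current date, which is one way to expunge data without human intervention.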
Interestingly, intelligent archiving might be just the first step in the reorganization of archiving. Increasingly, archived data is stored on devices still attached to a network and not in an offsite location. As such, it could represent an important corporate asset if it could be analyzed and understood. In that scenario, the value of archived information and the ability to apply it to meet business needs could increase exponentially.