Information Life Cycle

Assessing the cost of retaining enterprise data over time
Building an information life cycle program requires an organization to understand the actual cost of retaining the information it accumulates in the course of business operations. The retention period is dictated by regulatory constraints and by business needs. From a business operations point of view, data about operations conducted five years ago may not be as relevant as business data generated last week. At the same time, regulatory constraints may mandate a retention period of, for example, seven years. Given these requirements, information retention policies define ‘information latency’ as the time needed for information in storage to become available. This concept helps reveal the real cost of storing enterprise information according to its value to the business, its regulatory requirements, its ease of access from storage, and the cost of each storage technology the information passes through during its life cycle.
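To make the idea concrete, the life-cycle cost can be sketched as a sum, over storage tiers, of data volume times monthly cost times the months spent in that tier. The following Python is a minimal sketch only: the tier names, prices per terabyte, latencies, and the seven-year schedule are illustrative assumptions, not figures from any real deployment.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Tier:
    """A storage tier; all values used below are hypothetical."""
    name: str
    cost_per_tb_month: float  # assumed price, in arbitrary currency units
    latency: str              # 'information latency': time until data is available


def retention_cost(size_tb: float, schedule: List[Tuple[Tier, int]]) -> float:
    """Total cost of keeping size_tb of data through a list of (tier, months) stages."""
    return sum(size_tb * tier.cost_per_tb_month * months for tier, months in schedule)


# Assumed tiers, from fastest and most expensive to slowest and cheapest.
FAST = Tier("fast", 100.0, "milliseconds")
BULK = Tier("bulk", 20.0, "seconds")
ARCHIVE = Tier("archive", 5.0, "hours")

# Assumed seven-year (84-month) retention: 1 year fast, 2 years bulk, 4 years archive.
TIERED_SCHEDULE = [(FAST, 12), (BULK, 24), (ARCHIVE, 48)]
SINGLE_TIER_SCHEDULE = [(FAST, 84)]

if __name__ == "__main__":
    size_tb = 10
    print("tiered:     ", retention_cost(size_tb, TIERED_SCHEDULE))
    print("single tier:", retention_cost(size_tb, SINGLE_TIER_SCHEDULE))
```

With these assumed prices, moving data to cheaper tiers as it ages costs a fraction of keeping it on the fastest tier for the whole retention period. The trade-off is higher information latency for older data.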

Building a scalable low cost storage technology
The maturity of fault-tolerant, highly distributed file systems, and their large-scale adoption by large internet companies, make it possible to use low-cost server nodes equipped with off-the-shelf disks for in-line storage and archiving. In this scenario the scale-out model accommodates data growth by simply adding nodes. Because advanced distributed file systems are software-based, they ease information classification through embedded metadata repositories while also providing fault tolerance. This use of open source, open standards, and low-cost disks and server nodes can offer better operational value than proprietary models.
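As a rough illustration of the scale-out model, the number of nodes can be estimated from the data volume, the replication factor used for fault tolerance, and the usable capacity of each node. The sketch below assumes three-way replication (the HDFS default), twelve 4 TB disks per node, and a 75% fill target. All of these values are hypothetical and should be replaced with figures from the actual environment.

```python
import math


def nodes_needed(data_tb: float,
                 replication: int = 3,       # assumed: HDFS default replication factor
                 disks_per_node: int = 12,   # assumed node configuration
                 disk_tb: float = 4.0,       # assumed disk size
                 fill_ratio: float = 0.75) -> int:
    """Estimate how many storage nodes are needed to hold data_tb of data."""
    raw_tb = data_tb * replication                         # every block is stored 'replication' times
    usable_per_node = disks_per_node * disk_tb * fill_ratio  # leave headroom on each node
    return math.ceil(raw_tb / usable_per_node)


if __name__ == "__main__":
    current = nodes_needed(500)
    future = nodes_needed(800)
    print("nodes for 500 TB:", current)
    print("nodes for 800 TB:", future)
    print("nodes to add:    ", future - current)
```

Growth is then handled by adding the difference in nodes rather than replacing the storage system.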

A robust set of solutions
Our Information Life Cycle Management framework incorporates automated processes and policies for managing information over its entire life cycle, with the following considerations:
-automated provisioning of a multi-tier storage architecture
-availability of data in proportion to its criticality to value creation in the enterprise (SAN storage for in-line rapid access, HDFS for longer-term storage and data warehousing)
-identifying, classifying, and storing data according to its application (mail vs. accounting)
-making data easily available for innovation and re-use in business value creation (data warehousing to assess the residual business value of historical data)
-incorporating archiving and long-term storage solutions that meet regulatory demands
-incorporating storage solutions that reduce the time needed to recover after a disruption
-developing policies and processes for information disposal and media sanitization, to meet security requirements and allow re-use of storage assets
-the ability to match various open-standard, low-cost storage technologies to any combination of requirements
-taking advantage of advanced storage technology in Linux to define and implement the most cost-effective storage solutions, keeping in mind that freedom from vendor lock-in is a viable way to decrease enterprise storage cost
Our storage solutions range from SAN with data multipath capabilities and logical volume implementations over directly attached and/or network-attached block-based disk arrays, to NAS and node-based distributed storage on COTS hardware running a high-performance distributed file system.
Logical Volume Management (LVM2) offers great flexibility: data files are placed on logical volumes that can be assigned to a wide variety of storage devices. This allows hierarchically organized, policy-based storage in which higher-performance devices back high-value data volumes, while lower-value data is stored in volumes backed by lower-performance devices.
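A policy of this kind could be expressed as a simple mapping from data value to volume group, as in the Python sketch below. The volume group names (vg_fast, vg_bulk, vg_archive), the value classes, and the volume sizes are hypothetical. The sketch only builds the lvcreate command line and does not run it.

```python
from typing import Dict, List

# Hypothetical policy: data value class -> LVM volume group backed by one device tier.
TIER_POLICY: Dict[str, str] = {
    "high": "vg_fast",       # assumed: volume group on high-performance devices
    "medium": "vg_bulk",     # assumed: volume group on mid-range devices
    "low": "vg_archive",     # assumed: volume group on low-cost, slower devices
}


def lvcreate_command(lv_name: str, size_gb: int, data_value: str) -> List[str]:
    """Build (but do not execute) an lvcreate command for the tier matching data_value."""
    if data_value not in TIER_POLICY:
        raise ValueError(f"unknown data value class: {data_value!r}")
    volume_group = TIER_POLICY[data_value]
    # lvcreate -L <size> -n <logical volume name> <volume group>
    return ["lvcreate", "-L", f"{size_gb}G", "-n", lv_name, volume_group]


if __name__ == "__main__":
    requests = [("accounting", 200, "high"), ("mail_archive", 2000, "low")]
    for name, size_gb, value in requests:
        print(" ".join(lvcreate_command(name, size_gb, value)))
```

Keeping the policy in one table means that re-classifying a data set changes only its value class, not the provisioning logic.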