Veritas Technologies recently released its inaugural Data Genomics Index, which provides a view of the composition of enterprise data.
The Index revealed that over 40 percent of corporate data files have remained untouched for three years, creating an opportunity for businesses to cut storage costs and improve their bottom line.
The Data Genomics Index, according to Veritas, is the first report to provide insights into today’s enterprise data environment and act as a comparison standard. These insights can potentially jump-start an organisation’s initiatives to act intelligently and devote remediation efforts where they will get the best return. Key findings from the report include:
Developers dominate and presentations have had their day
The Index reveals that images, developer files and compressed files take up almost one third of the total environment. By file count, developer files make up a massive 20 percent of the total. Looking at trends over the past 10 years, relative to other file types, presentations have declined 500 percent. Finally, we are trending away from death by PowerPoint.
We’re busiest in the fall
Fall dominates when it comes to file creation. The most drastic increases are 91 percent more text files, 48 percent more spreadsheets, and 89 percent more geographic information system files. We apparently do most of our videography in summer and fall, and then save it to company disk: videos jump 68 percent in the fall.
41% of the data environment goes untouched
Leaving aside data held for regulatory or compliance reasons, three years is a general standard for when data goes from potentially relevant to stale. Incredibly, 41 percent of the average environment is stale, meaning it has not been modified in the past three years.
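As a rough illustration of the three-year rule, the sketch below walks a directory tree and reports what share of its files, by count and by size, has not been modified in three years. It is not the method Veritas used for the Index: the function name `stale_share`, the use of the file system's last-modified timestamp, and the 365-day year are all assumptions made for this example.

```python
import os
import time

# Assumption: "three years" is approximated as 3 x 365 days, ignoring leap days.
STALE_AFTER_SECONDS = 3 * 365 * 24 * 60 * 60


def stale_share(root, now=None):
    """Return (stale share of file count, stale share of bytes) under root.

    A file counts as stale when its last-modified time is more than
    three years before `now`. Hypothetical helper, for illustration only.
    """
    now = time.time() if now is None else now
    total_files = stale_files = 0
    total_bytes = stale_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # skip files that vanish or cannot be read
            total_files += 1
            total_bytes += info.st_size
            if now - info.st_mtime > STALE_AFTER_SECONDS:
                stale_files += 1
                stale_bytes += info.st_size
    file_share = stale_files / total_files if total_files else 0.0
    byte_share = stale_bytes / total_bytes if total_bytes else 0.0
    return file_share, byte_share
```

Calling `stale_share("/some/share")` on a file share (the path is a placeholder) returns two fractions that an organisation could set against the 41 percent average. Reporting both count and size matters because a handful of large stale videos can dominate disk usage while barely registering in the file count.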
Orphaned data is overly burdensome
Data without an attributed owner, either through role changes or employee departures, is orphaned. This data is often out of sight and out of mind for organisations, and it is costing them. Based on the insights from the Index, orphaned data tends to be content-rich file types like videos, images and presentations – risky stuff to leave unattended. It is also taking up more than its fair share of disk space relative to its share of the file count – over 200 percent more.
Small changes can have a big impact on storage costs
With similar insights into their own data, organisations can prioritise areas to achieve significant returns. Traditional ‘office’ formats like presentations, spreadsheets and documents take up more stale space than they should, costing organisations unnecessarily. Visual formats like videos and images are also especially burdensome. These are where archiving, deletion or migration efforts are best spent. Considering the average 10 petabyte environment, an archive project focused on just stale presentations, documents, spreadsheets and text files could return as much as $2 million a year in storage savings.
“One thing we hear all the time from our customers is they’re struggling with two competing forces of nature – the exponential data growth curve, and the restriction of resources and budget to fight it with new servers and applications,” said Steve Vranyes, CTO, Veritas. “By aggregating Veritas’ unique understanding of key metadata characteristics we can surface an accurate representation of the average data environment. These insights will change the crippling growth dynamic enterprises are faced with today.”
The Data Genomics Index is the first research that benchmarks accurate details of real environments, from file type composition to average age distribution to the size proportions of individual files. To provide a community and forum for this research and discussion, Veritas is also announcing the Data Genomics Project, a research initiative to help organisations understand the true nature of the unstructured data that they create, store, and manage on a daily basis. The inaugural Data Genomics Index is the first contribution to this cause. The Project will be a community of data scientists, industry experts and thought leaders that further builds the data genome for information management and shares its research and discussions with organisations worldwide that are struggling to solve tremendous data growth challenges. While Veritas is a founding member and contributor, the Project will remain commercially separate from the business.