Recently, a major technology vendor sent out questionnaires to senior business managers about data and decision-making. A number of them came back with additional comments, most of them variations on a theme: “Data is buried in a sea of noise.” “Swamped in information.” “I’m drowning.” Despite—or perhaps partly because of—a sizable drop in the cost of storing and retrieving information, many corporations are in danger of being inundated with information. Software applications from ERP to CRM to SCM may generate great efficiencies, but they also generate great floods of data. So great, in fact, that nowadays CIOs speak of petabytes (quadrillions of bytes) of storage rather than mere terabytes (trillions), a trend that must surely worry the branding heads at Dayton-based Teradata, a subsidiary of NCR Corp. But not the sales heads: in a survey released by the technology company in September, more than half of 158 corporate executives said their businesses have two to three times as much information available to them as they had a year ago.
What’s more, a lot of that data is useless, or worse. Experts estimate that anywhere from 10 percent to 30 percent of the data flowing through corporate systems is bad—inaccurate, inconsistent, formatted incorrectly, entered in the wrong field, out of a value range, and so on. In its most recent study of corporate data integrity, the Seattle-based Data Warehousing Institute found that nearly half the surveyed companies had suffered “losses, problems, or costs” due to poor data. The estimated cost of the mistakes? More than $600 billion.
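The error categories cited by the experts (wrong format, wrong field, out-of-range values) translate directly into simple record-level validation rules. A minimal sketch in Python, using hypothetical field names, formats, and ranges that are illustrative only and not drawn from any system described here:

```python
# Minimal sketch of record-level data-quality checks.
# Field names, formats, and ranges are hypothetical examples.
import re

RULES = {
    "customer_id": lambda v: isinstance(v, str) and bool(re.fullmatch(r"C\d{6}", v)),  # format check
    "order_total": lambda v: isinstance(v, (int, float)) and 0 <= v <= 1_000_000,      # value-range check
    "country":     lambda v: v in {"US", "CA", "MX"},                                  # consistency check
}

def audit(record):
    """Return the list of fields that fail their validation rule."""
    bad = []
    for field, rule in RULES.items():
        value = record.get(field)
        if value is None or not rule(value):
            bad.append(field)
    return bad

clean = {"customer_id": "C123456", "order_total": 99.5, "country": "US"}
dirty = {"customer_id": "123456",  "order_total": -10,  "country": "USA"}

print(audit(clean))  # []
print(audit(dirty))  # ['customer_id', 'order_total', 'country']
```

Checks like these catch only the mechanical errors; reconciling inconsistencies across systems (the same customer entered two ways in two applications) is the harder and costlier part of the problem.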
Now, the potential cost of poor data management is about to rise. Under Section 404 of the Sarbanes-Oxley Act of 2002, which goes into effect in June 2004, publicly traded companies will be responsible for providing “full, fair, accurate, timely, and understandable disclosure” in their periodic reports.
Obviously, you can’t have accurate financials without accurate financial data. But identifying all Sarbox-relevant financial data and funneling it into a single report is no small feat. “It’s a big dumping of data,” says Mark Nagelvoort, an internal-controls manager who is heading up the Sarbox-compliance effort at Mahwah, New Jersey-based Hudson United Bank.
And that’s nothing compared with what companies may be forced to do with their unstructured data—the E-mails, contracts, and PowerPoint files that account for 80 percent of corporate information. Right now it appears that courts will treat such information as discoverable evidence in Sarbox-related prosecutions—an ugly prospect. Hence, many companies are now scrambling to archive as many E-mails, letters, and memos as possible. Warns James Watson, CEO at Chicago-based consultancy Doculabs Inc.: “Some companies are going from saving nothing to saving everything. It’s phenomenally dangerous.”
Dirty Rotten Data
Finance chiefs have been down this path before. In the mid-1990s, senior executives began routing data from far-flung financial, supply-chain, and customer-information systems into data warehouses and data marts.
On the drawing board, the projects made sense. By analyzing slices of company data, managers could spot trends and make better decisions. But in reality, data warehousing was, and is, fraught with difficulties; for every successful project, there is a failed one. And even with the successful ones, getting the right number can take forever. True, search and query speeds have improved dramatically in the past few years. Likewise, the cost of the software used to store the data has dropped. “The dot-com bust has brought down the cost of mature technologies like data warehousing,” says Danny Siegel, New York-based senior manager (global business technology) for Pfizer Inc.’s pharmaceuticals group. Siegel says data-mining tools cost one-fourth what they did a few years back (see “White Goods for Data?” at the end of this article).