Randy Collica is a modern-day treasure hunter. As a senior business analyst in Palo Alto–based Hewlett-Packard Co.’s customer data and knowledge services department, his job is to mine data in search of insight that can help marketers better understand various customer segments. He stumbled upon a veritable gold mine a few years ago as he riffled through notes taken by HP’s call-center representatives. “I just knew there had to be nuggets of valuable information in there, given the volume of data we had,” says Collica. “But I also knew that finding them would be impossible if we didn’t have a tool to automate the analysis.”
Although standard data-mining systems can detect patterns hidden within structured tables of information, such as the transactional data of an ERP system, they are essentially useless with unstructured data — and notes taken during a phone call are about as unstructured as data gets. So Collica turned to text mining, a type of data-mining technology that combs through text and gives it structure so it can be analyzed.
Collica’s hunch turned out to be right: text mining revealed, as one example, that customers in lower-value segments ask a lot more questions about business processes, such as HP’s contract-negotiation procedures, than do the company’s best customers. “That insight has been invaluable in helping marketers come up with solutions and campaigns targeted at different customer groups,” says Collica.
The latest generation of technology, developed by vendors flush with post-9/11 government investment (see “Parsing the Text Market” at the end of this article), is still far from perfect. But it is allowing corporations with large data sets to perform important feats they couldn’t before. “It really is the next frontier of understanding in business intelligence,” says Martin Schneider, an analyst at The 451 Group in New York.
Key to the improvements have been advances in natural language processing, a method of extracting meaning from printed words that now allows the software to “understand” complex phrases about 80 percent of the time. Text-mining systems can also be programmed to assign value to expressions. Suppose a telesales representative has entered the following note: “Nov. 15 – Cstmr not happy w/cell phne. Wants to switch to Yellow Inc.” The software can recognize that November 15 is a date; that “cstmr” is a customer; that he has a cell phone and is unhappy, which is bad; and that he wants to switch to a competitor, which is worse.
Once that kind of information is extracted, it can be structured in a format similar to a database and further analyzed, often more quickly than a human analyst can locate his reading glasses.
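To make the idea concrete, here is a minimal Python sketch of how the telesales note above might be turned into a database-style record. It is a toy, rule-based illustration only, not how any vendor's product works: the lookup tables, the field names, the scoring values, and the function name `extract` are all assumptions invented for this example, and real text-mining systems rely on far richer linguistic models.

```python
import re

# Hypothetical lookup tables, invented for this illustration.
ABBREVIATIONS = {"cstmr": "customer", "phne": "phone", "w/": "with "}
COMPETITORS = {"yellow inc"}
MONTHS = {"jan": 1, "feb": 2, "mar": 3, "apr": 4, "may": 5, "jun": 6,
          "jul": 7, "aug": 8, "sep": 9, "oct": 10, "nov": 11, "dec": 12}


def extract(note):
    """Turn one free-text call note into a flat, table-like record."""
    # Normalize case and expand the shorthand the representative typed.
    text = note.lower()
    for short, full in ABBREVIATIONS.items():
        text = text.replace(short, full)

    record = {"month": None, "day": None, "product": None,
              "sentiment": 0, "competitor": None, "churn_risk": False}

    # A leading "Nov. 15"-style token is treated as the date of the call.
    date = re.match(r"([a-z]{3})[a-z]*\.?\s+(\d{1,2})", text)
    if date and date.group(1) in MONTHS:
        record["month"] = MONTHS[date.group(1)]
        record["day"] = int(date.group(2))

    if "cell phone" in text:
        record["product"] = "cell phone"

    # An unhappy customer is bad: subtract one point.
    if re.search(r"\b(not happy|unhappy)\b", text):
        record["sentiment"] -= 1

    # Wanting to switch to a known competitor is worse: subtract another.
    switch = re.search(r"switch to ([a-z ]+)", text)
    if switch and switch.group(1).strip() in COMPETITORS:
        record["competitor"] = switch.group(1).strip()
        record["churn_risk"] = True
        record["sentiment"] -= 1

    return record


NOTE = "Nov. 15 – Cstmr not happy w/cell phne. Wants to switch to Yellow Inc."
record = extract(NOTE)
print(record)
```

Each note becomes one row with the same columns, which is what lets standard data-mining tools count, filter, and compare thousands of them.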
And the possibilities aren’t limited to customer service. San Francisco–based LoanPerformance, a provider of credit-risk-decision support tools for residential mortgage operators, uses text mining to offer its clients improved predictive analytics. Traditional risk-scoring solutions for loss mitigation and delinquency management incorporate only structured data such as a borrower’s interest rate, outstanding balance, and monthly payments. That ignores rich information that could help a mortgage servicer better determine how likely a delinquent borrower is to miss more payments or, ultimately, default. “If someone says they missed a payment because they lost their job, that’s different from ‘I forgot to send my check,’” explains Damien Weldon, director of mixed-data analytics at LoanPerformance. When the company included data mined from call-center conversations in its scoring calculations, accuracy rose by 15 to 20 percent.