Data: It's growing, stretching our ability to store it. A new wave of "mining" tools just might lead us to data's real value - as information
by Lisa Lewinson, Database Programming and Design, February 1994.
For over three decades, companies have been collecting massive amounts of data on a myriad of topics, from customers to inventory to invoices. This glut of data often lands in the nether reaches of the data world - on tape, in vaults of memory on mainframes, and in specially commissioned storage shops. The data typically sits there, unused, until an auditor or government regulation deems it safe to throw out.
This scenario is changing radically as corporations around the world take a second look at the power hidden in their databases. Instead of files figuratively gathering dust, historical data is now being viewed as an invaluable, proprietary resource that can uncover patterns and hidden meanings that can actually predict the future.
Businesses place a high value on forecasting. These organizations want to use their data to make accurate predictions about several critical issues: inventory amounts, response to mailings or bonus offers, fraudulent credit card usage, the cost of insurance claims, product demand, which loans will go bad, and which customers will "churn" (that is, leave one vendor for another). Armed with accurate forecasts, businesses can save millions of dollars.
Analyzing historical data to find patterns that shed light on the present is loosely described as "data mining". Data mining not only answers predictive business questions; it can also reveal the most important attributes influencing the predicted answer. In many cases, being able to reach this level of understanding proves at least as important as the prediction itself. This is why, says Casey Klimasauskas, president of NeuralWare Inc., "Data mining is a powerful levee to help hold back the Mississippi flood of information."
The increasing power of PCs has already begun to usher in new ways of presenting and accessing data, text, multimedia, and other types of information. Powerful PCs have also made data mining commercially viable. Calculations that would take days on mainframes (assuming time was available to run the calculations) or would be impossible on less-powerful PCs or workstations can often be done within several hours on a typical 80486 PC. While some companies have already employed statisticians to build "models" that project trends in the business, traditionally, only two to four models are completed in a year. This figure hardly keeps up with the demand to understand business problems. Data mining technology can roll out predictive models in a day or a week.
One of the best-known ways to mine data is with a neural network. Neural network software is a complex mathematical computer model of the way a collection of brain cells, called neurons, operates - that is, learns from experience, develops rules, and recognizes patterns. Neural "nets" are designed for pattern recognition among complex data elements. Using various types of algorithms, neural nets are typically applied to between 300 and 6,000 rows of data (although considerably more or less data is possible) and use selected attributes (typically from two to 200) as "inputs." The neural network user must experiment to find the best set of inputs to "train" the net.
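The idea of "training" on rows of inputs can be sketched with the smallest possible case: a single artificial neuron adjusted by trial and error. This is an illustration only, not the algorithm used by any product named in this article. The four training rows, the two yes/no input attributes, the learning rate, and the function names are all hypothetical.

```python
import math

def sigmoid(x):
    """Squash any number into the range 0..1."""
    return 1.0 / (1.0 + math.exp(-x))

def predict(weights, bias, inputs):
    """Weighted sum of the inputs, passed through the sigmoid."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)

def train_neuron(rows, targets, epochs=2000, rate=0.5):
    """Repeatedly show each row to the neuron and nudge its weights
    in proportion to the prediction error and the input value."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in zip(rows, targets):
            error = target - predict(weights, bias, inputs)
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
            bias += rate * error
    return weights, bias

# Hypothetical training data: two yes/no attributes per row, and the
# outcome to be predicted (1 if either attribute is present).
rows = [[0, 0], [0, 1], [1, 0], [1, 1]]
targets = [0, 1, 1, 1]

weights, bias = train_neuron(rows, targets)
for inputs in rows:
    print(inputs, round(predict(weights, bias, inputs), 2))
```

A commercial neural network connects many such units in layers and works on hundreds or thousands of rows, but the cycle of predicting, measuring the error, and adjusting is the same in spirit.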
To operate effectively, a neural network requires clean data and considerable data preparation. Neural nets also work only on numeric data. If, for example, symbolic data such as "type" columns are included in the input attributes, each type must be converted into numeric format. Both new and experienced users find that building an effective neural network model usually requires a number of tries, with the attributes used as inputs being continually honed and refined. Neural networks are not learned overnight, but those who have worked through the process report that the effort is well worth it.
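Converting a symbolic "type" column into numeric inputs can be sketched as follows. The example assumes one-of-N coding (one 0/1 column per distinct value), which is one common choice; the article does not say which coding scheme any vendor uses. The column name `account_type` and its values are hypothetical.

```python
def one_hot_encode(values):
    """Turn a column of symbolic values into rows of 0/1 indicators."""
    categories = sorted(set(values))  # fixed, repeatable column order
    position = {category: i for i, category in enumerate(categories)}
    encoded = []
    for value in values:
        row = [0.0] * len(categories)
        row[position[value]] = 1.0    # mark the column for this value
        encoded.append(row)
    return categories, encoded

# Hypothetical "type" column from a customer table.
account_type = ["checking", "savings", "loan", "savings"]
categories, encoded = one_hot_encode(account_type)

print(categories)
for value, row in zip(account_type, encoded):
    print(value, row)
```

Simply numbering the types 1, 2, 3 would also yield numbers, but it would suggest an ordering among the types that the data does not contain, which is why a separate indicator column per type is often preferred.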
A leading vendor of neural net software is NeuralWare of Pittsburgh, Pennsylvania. Founded in 1987 by the husband and wife team of Jane and Casey Klimasauskas, the company has designed its business to make knowledge about neural nets as accessible as possible. "In many companies," points out Casey Klimasauskas, "neural networks are a natural for data analysts, business users, and statisticians who work with the data to understand how it relates to the company's business. One issue that arises among those who have read about neural networks and still don't understand them is this: If I give you a book on COBOL programming and then ask you, 'how do you balance a checkbook with it,' the answer is not obvious. Similarly, MIS people may read a book on neural networks, look at the problem they're trying to solve, and not see a connection." Klimasauskas related that in developing NeuralWare's training course, they knew they would have to bring potential users through a systematic, start-to-finish methodology, showing building blocks for reaching a solution.
Klimasauskas points out that one of the keys to success in using neural networks is having access to corporate data. "The data organization can facilitate this process," he states, "or they can massively hinder it. The MIS department should realize that with neural networks, they are using a mathematical technique for clustering things together. That means that a lot of fields in databases that are begrudgingly kept in customer databases, for example, really become important. MIS has a big role in validating and enhancing the quality of data. Data mining technology puts more emphasis on the importance of clean data." In other words, to use sophisticated tools of prognostication, end-user analysts are dependent on MIS making all of a corporation's data accessible, usable, and as clean as possible.
NeuralWare starts its training with a general, four-day course titled "Applying Neural Computing in Business, Industry, and Government." Most of the time is spent understanding which types of problems lend themselves to prediction and building a methodology for using a neural net. The trainees are then told to spend a month using the software, then to return for training in their specific application.
"A neural network is math, not magic," adds Jane Klimasauskas. Yet, she claims that a degree in statistics and modeling is not required to master neural network software. However, the user should have a good knowledge of the domain, a willingness to learn the new technology, and the time to spend experimenting.
The financial industry has been a fertile ground for neural nets. Traders and asset managers have been using neural nets for trend analysis and pattern recognition. Susan Garavaglia, a director in the analytical services department at Dun & Bradstreet Information Services, N.A., has been using NeuralWare products for more than a year for credit evaluation and marketing. "Our customers asked us about neural nets," says Garavaglia. "We use the software to deliver a model to them. It has given us the resources and capability to do case studies and work more closely with our customers."
Another leading vendor of neural network technology is HNC of San Diego, California. HNC produces the DataBase Mining Workstation (DMW). HNC gears its efforts toward providing customized applications for customers, and primarily works through value added resellers (VARs) who tailor the DMW to specific applications.
Randy Richardson, president of Customer Insight Co. Inc. of Englewood, Colorado, is a VAR for HNC. Richardson was already in the business of selling sophisticated, stand-alone customer databases for large corporations when he encountered HNC in 1992. Richardson dedicated his best programmer for six months to writing an interface between his proprietary database and the DMW in the belief that the DMW would provide a valuable service to his customers.
Richardson is actively selling the combined package at prices in the six- to seven-figure range. He explains to chief financial officers that while a neural network might not produce better results than an in-house statistician, a high "opportunity cost" is accrued to the business from not having necessary models built. At one large cellular phone company, Richardson convinced the CFO that the DMW could save $450 million in one year simply by accurately predicting customers who would "churn."
Richardson sells the customized package, the DMW, and 20 days of consulting for $80,000. "If I only sell the workstation," Richardson says, "I haven't solved their problem." This is because, adds Allen Jost, vice president of HNC's Decision Systems division, "In many cases, customers spend far more time organizing their data than modeling it. Getting the data organized is where [Richardson's] Customer Insight Co. really makes a difference."
Carol Klenke, micro marketing manager for First Commerce Corp., a large banking concern in Louisiana, is another DMW customer. Introduced to the DMW through Richardson's Customer Insight Co., she attended a three- and a five-day training session with HNC; afterward, she felt empowered to predict business problems. One of Klenke's first challenges was to determine the best customers for a marketing campaign involving auto loans. The attempt was so successful that the news traveled throughout the bank's 90 branches. Now, an associate of Klenke's works full-time building predictive models. "We are exploring how we can predict customer retention - who will stay with the bank and who will not. We're using the DMW for tracking direct mail. We experiment with different offers we create. When the results come in, we analyze them on the DMW to see if we had targeted the right group and if we should use that group again." Klenke says that she uses data samples of 1,000 records to come up with the results.
"At First Commerce, we believe in investing in technology, and we definitely plan to illustrate the payback we've achieved with the DMW," Klenke says. "We're happy about the buy-in throughout the bank. The bank card area, mortgage area, and the branches are all enthused. All the branches would like the retention analysis."
Another technology that performs predictive data modeling is fractal geometry. Fractal geometry is based on work originally applied to compression of terrain images for cruise missile projects. It is a mathematical means of compressing data. The compression occurs with no data loss, so an entire set of records, rather than a sample, can be analyzed. Since this technology can work on many gigabytes of data at once, it offers intriguing possibilities for companies that, for instance, may want to locate the three customers out of 30 million who responded to a particular promotion in a specified manner. This type of query could take days to process, even on hardware designed for gigabytes of data.
One vendor of fractal technology is Cross/Z International Inc. of Great Neck, New York. Cross/Z takes a selected portion of a client's entire database and, using its own IBM MVS-based mainframe, transforms the database into a fractal database that can be used as a PC-based file view of the data. Clients can then access the file via a DOS-based front-end tool called Private Eye, which designs and examines views of the fractalized data. For example, using Private Eye, you can determine that the largest response to a massive mailing came from a certain ZIP code and age bracket. A market analyst could then determine how customer response related to occupation for this subgroup. Analysts would have the security of knowing that they are dealing with full counts, rather than samples.
Do neural nets and fractal geometry products work on the same genre of problems? According to William Gillett, Cross/Z's vice president of business development, "Neural networks typically work on a subset of the data. We work on mission-critical problems with millions of rows of data. A client who wants to optimize a mailing may send us five million records, including who they mailed to and who responded. We build a fractal database on the entire file, splitting out a validation group, if required. A fractal model is then built, which is integrated with a front-end tool to display the results: a ranked file on the most likely persons to respond to the next mailing."
Jane Blume, senior marketing manager at American Express, confirms that Cross/Z was used to help identify customers who would upgrade their Executive Corporate card from green to gold. American Express's original mailing had received a four-percent response. This group, plus two million additional customer records, was turned over to Cross/Z to determine who should receive the next mailing. According to American Express's Blume, "Cross/Z broke them out into 10 categories, with a number one as the most likely to respond. We sent out another mailing using the Cross/Z model, and it beat our plan. We had hoped for a 4.84 percent response; [the Cross/Z engineered plan] came in at 5.3 percent, exceeding our plan by 11 percent. Now the model will be updated again, looking at the 5.3 percent who did respond and those who didn't. Our goal is to upgrade constantly."
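Breaking a scored file into 10 ranked categories, with category one the most likely to respond, can be sketched as below. This shows only the ranking step and assumes that some model has already assigned each record a response score. It does not represent Cross/Z's fractal modeling, and the record IDs and scores are made up for illustration.

```python
def rank_into_categories(scores, groups=10):
    """Return {record_id: category}, where category 1 holds the
    highest-scoring records and category `groups` the lowest."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    total = len(ordered)
    return {
        record_id: (position * groups) // total + 1
        for position, record_id in enumerate(ordered)
    }

# Hypothetical scores: 20 records, a higher score means a more likely responder.
scores = {f"cust{i:02d}": i / 20 for i in range(20)}
categories = rank_into_categories(scores)

# The next mailing would go to the best categories first.
best = sorted(rid for rid, cat in categories.items() if cat == 1)
print(best)
```

Comparing actual response rates across the categories after the mailing is what lets the model be checked and then rebuilt.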
Stuart Spencer, marketing manager at American Express, adds, "The people at Cross/Z recognize that what they put together is complex. They bring it down to user-friendly levels. Without them, we would still be waffling in the vagaries of direct mail."
Cross/Z's Gillett says the company charges $13,500 to build a single model and offers a volume-based discount to build a series of models. Gillett says that the company builds more than 200 models per year. In addition to American Express, current customers include Allstate and Federal Express. In 1994, Cross/Z plans to introduce a software product that will enable companies to build their own fractal models on-site.
Cognitive Systems of Boston, Massachusetts takes yet another approach to data mining. The firm produces ReMind, a case-based reasoning tool. Case-based reasoning uses past experiences (as reflected in textual data) to solve current problems. In case-based reasoning, past cases are represented, indexed, and stored in a computer so they can be retrieved in the best possible manner. Case-based reasoning software builds up its own set of historic examples; each case added to its list of examples helps the computer learn. Using past examples, a case-based reasoning system is able to justify and explain how it arrived at a result and lets a user look at real instances of past occurrences.
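The store-and-retrieve cycle of case-based reasoning can be sketched with a toy case library. The matching rule here, simple word overlap between a new problem and the stored ones, is an assumption chosen for brevity and is not a description of how ReMind indexes or retrieves cases. The class name and the help-desk cases are hypothetical.

```python
def words(text):
    """Lower-case a description and split it into a set of words."""
    return set(text.lower().split())

def similarity(a, b):
    """Share of words two descriptions have in common (0.0 to 1.0)."""
    wa, wb = words(a), words(b)
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)

class CaseLibrary:
    def __init__(self):
        self.cases = []  # each case is a (problem, solution) pair

    def add(self, problem, solution):
        """Every stored case becomes a precedent for later queries."""
        self.cases.append((problem, solution))

    def retrieve(self, query):
        """Return (score, problem, solution) for the closest past case,
        or None if the library is empty."""
        if not self.cases:
            return None
        return max(
            ((similarity(query, p), p, s) for p, s in self.cases),
            key=lambda match: match[0],
        )

library = CaseLibrary()
library.add("printer will not print", "check the cable and restart the print queue")
library.add("cannot log in to mail", "reset the mail password")
library.add("screen is blank after boot", "reseat the video card")

score, precedent, solution = library.retrieve("mail log in fails")
print(f"Closest past case: {precedent!r} (score {score:.2f})")
print(f"Suggested fix: {solution}")
```

Because the answer comes with the past case it was drawn from, the user can inspect the precedent and judge whether it really applies, which is the explanatory strength the paragraph above describes.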
Steve Mott, president of Cognitive Systems of Stamford, Connecticut, points out that companies have spent hundreds of billions of dollars over the past 10 years to create relational databases in the hopes of capturing a range of data. "Only 10 percent," he asserts, "are really delivering benefits of the original investment. Tools for current relational databases do not lend themselves to mining data. SQL is not accessible to the average person unless you know the structure and format of the data, and SQL packages don't do complex queries. When people want to know trends and patterns, you can't get that from SQL."
In addition, says Mott, "Data as it is collected today has a strong textual component. But neural nets convert text into numbers. Textual codes are given numeric equivalents. This translation can cause problems because the text's subtle distinctions and context are often important. Categories might have inherited relationships. A neural net has no prayer, for example, of trying to process input text in a help field, including all the problem reports and calls that go into the database."
In order to resolve these problems, Cognitive Systems gave its ReMind tool a natural-language component. A number of ReMind's customers use the software for powerful help desk applications. "One customer has a 50,000 query case library," says Mott, "meaning that new queries can be matched against 50,000 samples in the library. It has a 95- to 96-percent level of accuracy." Cognitive Systems is now building case-based reasoning templates for the banking industry in the areas of investment selection, bankruptcy prediction, and credit risk.
At a major food manufacturing company, Peter Ducksbury Smith, a principal senior scientist, is using ReMind in a process control application, where a new, high-tech system installed in a manufacturing plant is logging 100 analogs every 30 seconds for seven machines. Before Smith began work on the project, the company was throwing the data away every three days because it was impossible to interpret or store it.
No more. Before the data is scrapped, Smith uses neural nets and statistical induction to come up with a model of what is actually taking place in the massive data-gathering mechanisms. He then uses ReMind to interpret the findings of the neural nets, thus overcoming the "black box" (limited explanation capabilities) of neural networks. "ReMind makes the data more useful and finds things out that people didn't know about," says Smith. "The results are very promising. I'm not seen as a crazy scientist sitting off in the corner. I'm now seen as making the data useful to the manufacturing plant managers."
Neural nets, fractal geometry, and case-based reasoning are by no means the only technologies available for data mining. Three other examples of software vendors now donning a data miner's helmet include: Abtech Corp. of Charlottesville, Virginia, which is using an abductive network modeling approach; TeraNet of Nanaimo, B.C., Canada, which is marketing ModelWare, the Universal Process Modeling Algorithm; and Reduct Systems Inc. of Regina, Saskatchewan, Canada, which has released Datalogic/R, based on rough sets.
Many companies are finding that a suite of data mining tools is most helpful in generating predictive models for business problems. The vendors, although excited by the wide application of data mining technologies, realize that some problems are better solved by other technologies, and are trying to advise their clients wisely.
Whatever approach is taken to data mining, however, more business users and CFOs are waking to the real, bottom-line gold in "them thar" databases. If the rush to test and use data mining techniques increases, DBAs will find the spotlight on them. The years of effort already spent in building a richer set of attributes in relational databases, enforcing ranges and type codes, and analyzing and codifying data's meaning and content will pay off more handsomely than ever before.
Lisa Lewinson is president of Northstar Consulting Inc., a Chicago-based firm that develops data mining applications and knowledge-based systems. She can be reached at (708) 786-3922.