When Wawa, the $6 billion convenience-store chain, came up with a new flatbread breakfast product in 2008, the company’s marketing department was flat-out excited. The product had performed exceptionally well in spot testing and seemed more than ready for a systemwide rollout.
At that very time, though, Wawa was trying out a software-based approach to designing and evaluating customer-behavior tests. After it redid the flatbread test using the technology, the breakfast item was killed. “We found it was cannibalizing other, more-profitable products,” says Wawa CFO Chris Gheysens.
What made the difference was a more-scientific approach to selecting stores for test and control groups, as well as regression analyses to weed out irrelevant “noise” from test results. The software that provided the heightened sophistication, called Test & Learn, is from Applied Predictive Technologies (APT), which has established a strong following among large retailers.
Before signing on with APT, Wawa routinely performed what Gheysens calls a “good old manual” testing process, with financial analysts using spreadsheets to select stores and evaluate flux and trending data.
That process was problematic, says Gheysens. “We were never really confident as a finance group giving approval for new products or other initiatives, and we didn’t have a strong-enough voice to kill them either, because with that kind of manual analysis and all the noise in the data, you really rely on influence and emotion more than facts,” he says.
The software isn’t cheap. The average annual cost for the typical three-year license is between $700,000 and $1 million, notes Scott Setrakian, APT’s managing director. That price point defines its market: Fortune 1000 companies.
It takes anywhere from two weeks to a few months to establish a daily data feed from a customer to APT. Companies provide information by store, market, or merchandise class, and some also provide transaction-level data.
Test & Learn is designed to provide visibility into the impact of any kind of program, investment, or activity that may influence customer behavior. It provides three levels of understanding, Setrakian says: how an action will affect overall sales or profitability; how to tailor the action for maximum effectiveness, such as whether a 5%, 10%, or 15% discount is best; and how the impact varies by market, by store, or even by customer.
For designing tests, the software calculates the optimal number of locations to test and the test duration. For example, if daily sales fluctuate widely under normal conditions, the test must include more stores and run longer than it would if sales were typically stable.
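The link between sales volatility and test size can be illustrated with a textbook power calculation. The sketch below is not APT's method, which the article does not describe; it is a minimal example assuming a simple two-group comparison of mean daily sales per store, a normal approximation, and independent days. The function name `stores_needed` and all of its parameters are hypothetical.

```python
import math
from statistics import NormalDist


def stores_needed(daily_sales_sd, min_detectable_lift, days,
                  alpha=0.05, power=0.8):
    """Rough number of stores per group (test and control) needed to
    detect a given absolute lift in mean daily sales per store.

    Assumes a two-sided, two-sample z-test and treats each store's
    days as independent. Both are simplifications for illustration.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    # Averaging over more days shrinks the noise in each store's mean.
    sd_of_store_mean = daily_sales_sd / math.sqrt(days)
    n = 2 * ((z_alpha + z_beta) * sd_of_store_mean / min_detectable_lift) ** 2
    return math.ceil(n)


# Volatile sales (SD of $400 a day) versus stable sales (SD of $100 a day),
# both looking for a $50-a-day lift over a four-week test.
volatile = stores_needed(daily_sales_sd=400, min_detectable_lift=50, days=28)
stable = stores_needed(daily_sales_sd=100, min_detectable_lift=50, days=28)
print(volatile, stable)
```

Under these assumptions, the volatile chain needs many more stores than the stable one, and lengthening the test reduces the number of stores required, which is the trade-off the paragraph above describes.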
The software then picks test stores that are representative of the full-rollout population in any number of selected attributes, such as store size, store age, or sales volume. And for each test store, the software finds other, closely similar stores to form a custom control group in which the program is not run.
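One common way to build such a control group is nearest-neighbor matching on standardized attributes. The sketch below illustrates that general technique only; the article does not say how Test & Learn does its matching. The function `match_controls`, the attribute names, and the sample stores are all hypothetical.

```python
import math
from statistics import mean, pstdev


def match_controls(stores, test_ids, attrs, k=2):
    """For each test store, return the ids of the k most similar
    non-test stores, measured by Euclidean distance on z-scored
    attributes so that no single attribute dominates."""
    cols = {a: [s[a] for s in stores] for a in attrs}
    # Fall back to a scale of 1.0 if an attribute never varies.
    scale = {a: (mean(cols[a]), pstdev(cols[a]) or 1.0) for a in attrs}

    def z(store):
        return [(store[a] - scale[a][0]) / scale[a][1] for a in attrs]

    candidates = [s for s in stores if s["id"] not in test_ids]
    controls = {}
    for s in stores:
        if s["id"] in test_ids:
            ranked = sorted(candidates, key=lambda c: math.dist(z(s), z(c)))
            controls[s["id"]] = [c["id"] for c in ranked[:k]]
    return controls


# Made-up stores: two large, newer locations and two small, older ones.
stores = [
    {"id": "A", "size_sqft": 4000, "age_years": 5, "weekly_sales": 90000},
    {"id": "B", "size_sqft": 4100, "age_years": 6, "weekly_sales": 92000},
    {"id": "C", "size_sqft": 2000, "age_years": 20, "weekly_sales": 40000},
    {"id": "D", "size_sqft": 2100, "age_years": 19, "weekly_sales": 42000},
]
attrs = ["size_sqft", "age_years", "weekly_sales"]
print(match_controls(stores, {"A", "C"}, attrs, k=1))
```

In this toy data, each test store is paired with the one remaining store that resembles it most in size, age, and sales volume. A production system would likely also match on pre-test sales trends, which this sketch leaves out.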