CFO
Technology

From Factoids to Facts

At last, a way of getting answers from the web.

Economist Staff
August 31, 2004 | The Economist

What is the next stage in the evolution of internet search engines? AltaVista demonstrated that indexing the entire world wide web was feasible. Google’s success stems from its uncanny ability to sort useful web pages from dross. But the real prize will surely go to whoever can use the web to deliver a straight answer to a straight question. And Eric Brill, a researcher at Microsoft, intends that his firm will be the first to do that.

Dr Brill’s initial crack at the problem is a system called “Ask MSR” (MSR stands for Microsoft Research). This program uses information on web pages to respond to questions to which the answer is a single word or phrase — such as “When was Marilyn Monroe born?” Ask MSR starts by manipulating the question in various ways: by identifying the verb, for example, and then changing its tense or moving it into different positions in the sentence (“Marilyn was Monroe born”, “Marilyn Monroe was born” and so on). The resulting phrases are then fed into a search engine, and documents containing matching strings of words are retrieved. It sounds a promiscuous strategy, but gibberish phrases produce few matches, so, as Dr Brill puts it, “being wrong is very cheap.”
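
The rewriting step can be illustrated with a minimal Python sketch. It is not Ask MSR's actual code: the word lists and the function name `rewrite_question` are assumptions made for illustration, and the real system's rules (such as changing the verb's tense) are richer than this. The sketch drops the question word, finds the auxiliary verb, and moves it into every possible position.

```python
# Hypothetical word lists; the real system's rules are not given in the article.
AUXILIARIES = {"is", "are", "was", "were", "did", "does", "do"}
WH_WORDS = {"who", "what", "when", "where", "which", "how", "why"}


def rewrite_question(question):
    """Return candidate phrases made by moving the verb to every position."""
    words = question.rstrip("?").split()

    # Drop a leading question word such as "When".
    if words and words[0].lower() in WH_WORDS:
        words = words[1:]

    # Find the first auxiliary verb; without one there is nothing to move.
    verb_index = next(
        (i for i, word in enumerate(words) if word.lower() in AUXILIARIES),
        None,
    )
    if verb_index is None:
        return [" ".join(words)]

    verb = words[verb_index]
    rest = words[:verb_index] + words[verb_index + 1:]

    # Re-insert the verb at each position, gibberish included.
    rewrites = []
    for position in range(len(rest) + 1):
        candidate = rest[:position] + [verb] + rest[position:]
        rewrites.append(" ".join(candidate))
    return rewrites


for phrase in rewrite_question("When was Marilyn Monroe born?"):
    print(phrase)
```

Each phrase would then be sent to a search engine as an exact-match query. Nonsense such as "Marilyn was Monroe born" costs one wasted query and returns almost nothing, which is why being wrong is cheap.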

Once accumulated, the pile of documents is scanned for possible answers, and these are ranked by frequency. In practice, the correct answer appears in one of the first three places around 75% of the time. That might not sound very good, but human intelligence provides a second filter, since wrong answers are often obvious. If you ask how many times Bjorn Borg won Wimbledon, for example, “1980” is not a plausible answer, but “5” is. If in doubt, clicking on an answer produces a list of links to pages which provide support for that answer.
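
Ranking by frequency is simple enough to sketch in a few lines of Python. The snippets below are invented for illustration, and treating any four-digit year as a candidate answer is an assumption; the article does not say how Ask MSR picks out candidates.

```python
import re
from collections import Counter

# Assumed candidate pattern for a "when" question: any four-digit year.
YEAR_PATTERN = r"\b(?:1[0-9]|20)\d{2}\b"


def rank_answers(snippets, pattern=YEAR_PATTERN, top_n=3):
    """Return the top_n candidate answers, most frequent first."""
    counts = Counter()
    for snippet in snippets:
        # Count each candidate once per snippet so that a single
        # repetitive page cannot dominate the ranking.
        counts.update(set(re.findall(pattern, snippet)))
    return [answer for answer, _ in counts.most_common(top_n)]


# Made-up snippets standing in for retrieved documents.
snippets = [
    "Marilyn Monroe was born in 1926 in Los Angeles.",
    "Marilyn Monroe was born on June 1, 1926.",
    "Marilyn Monroe, born 1926, died 1962.",
]
print(rank_answers(snippets))
```

The wrong candidate still makes the list, but it ranks below the right one, and a human reader can discard it at a glance.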

Ask MSR is still a prototype, although Microsoft is trying to improve it and it may be launched commercially under the name AnswerBot. Dr Brill, meanwhile, has moved to a more difficult task. One of his most recent papers, written jointly with Radu Soricut of the University of Southern California, is entitled “Beyond the Factoid”. It describes his efforts to build a system capable of providing 50-word answers to questions such as “What are the rules for qualifying for the Academy Awards?” This is harder than finding a single-word answer, but Dr Brill thinks it should be possible using something called a “noisy channel” model.

Such models are already employed in spell-checking and speech-recognition systems. They work by modelling the transformation between what a user means (in spell-checking, the word he intended to type) and what he does (the garbled word actually typed). Just as a telephone line distorts the voice of the person at the other end of the line, this process can be thought of as being a noisy channel that transforms the user’s intention into something rather different.

By analysing many pairs of correct and mis-spelled words using statistical techniques, it is possible to predict how such transformations work in general cases. A system can then be designed to work the process backwards. Given a mis-spelled word, it can guess what that word is most likely to be a mis-spelling of.
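
Working the process backwards amounts to choosing the intended word that maximises P(word) × P(typed | word). The Python sketch below is a toy under loudly stated assumptions: the word counts are invented, and the channel model simply charges a fixed penalty per character edit, whereas a real system would estimate both from large collections of correct and mis-spelled pairs.

```python
# Toy language model: invented counts of how often each word is intended.
WORD_COUNTS = {"the": 500, "then": 80, "they": 120, "there": 90}

# Assumed channel model: each single-character edit has this probability.
ERROR_RATE = 0.05


def edit_distance(a, b):
    """Levenshtein distance between two strings."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete a character
                current[j - 1] + 1,      # insert a character
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def correct(typed):
    """Guess the intended word: maximise prior times channel probability."""
    total = sum(WORD_COUNTS.values())

    def score(word):
        prior = WORD_COUNTS[word] / total                     # P(word)
        channel = ERROR_RATE ** edit_distance(word, typed)    # P(typed | word)
        return prior * channel

    return max(WORD_COUNTS, key=score)


print(correct("teh"))
```

The same arithmetic applies when the "intended" thing is an answer and the "garbled" thing is a question, which is the analogy Dr Brill's system relies on.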

Dr Brill’s question-answering system does something similar. Many question-and-answer pairs exist on the web, in the form of “frequently asked questions” (FAQ) pages. Dr Brill trained his system using a million such pairs, to create a model that, given a question, can work out various structures that the answer could take. These structures are then used to generate search queries, and the matching documents found on the web are scanned for things that look like answers.

The current prototype provides appropriate answers about 40% of the time. Not brilliant, but not bad. And it should improve as the web grows. Rather than relying on a traditional “artificial intelligence” approach of parsing sentences and trying to work out what a question actually means, this quick-and-dirty method draws instead on the collective, ever-growing intelligence of the web itself.
