IBM’s Web Fountain – analysis software

From the same IEEE Online Spectrum Issue is a description of “Web Fountain”, IBM’s new analysis engine: A Fountain of Knowledge – 2004 will be the year of the analysis engine [By Stephen Cass]. This technology makes the Grokker program I pointed to the other day look like a child’s toy.

Before I go into that, I want to put in a plug for that whole issue… the January issue presents a special report on technologies, going out on a limb and presenting “Winners, Losers and Holy Grails” for each of six categories of technology: communications, electric power, computers, bioengineering, semiconductors, and transportation. They also survey their tech leaders for the 2004 trends. Good issue!

Back to Web Fountain, one of their winners… software that provides a means of making sense of the overwhelming amount of data available online in disparate sites and data formats…

What such a researcher needs is not another search engine, but something beyond that–an analysis engine that can sniff out its own clues about a document’s meaning and then provide insight into what the search results mean in aggregate. And that’s just what IBM is about to deliver. In a few months, in partnership with Factiva, a New York City online news company, it will launch the first commercial test of WebFountain with a service that will allow companies to keep track of their online reputation–what journalists are reporting about them, what people are writing about them in blogs, what people are saying about them in chat rooms–without having to employ an army of full-time Web surfers.

Kind of wild to think that what I am writing now will be found and categorized by a system like Web Fountain… although I should be used to that idea by now.

Before you think…. I’ll have that, thank you… As the article indicates, the technology is not for the casual web surfer. It uses

“…half a football field’s worth of rack-mounted processors, routers, and disk drives running a huge menagerie of programs”

… and has a budget of over $100 million US, with 120 personnel.

The dollar figure for the pilot Factiva service, which focuses on tracking a company’s online reputation, is between $150K and $300K a year. Be interesting to see how an ROI is calculated for that!

IBM is looking to partner with other industry sectors to expand out from this pilot service, including processing data that is publicly available online, as well as using internal company documents.

This article fascinates me and makes me queasy. To be able to perform the kinds of analysis described is a technological, and human, marvel. To know that listservs, blogs, etc. are being monitored is disturbing.

I know, Michelle, grow up…. that’s happening already… but currently, at least, I think that most of those eavesdroppers are human… maybe…

Kind of a cool read, tho!
