May 11, 2008 9:25 PM PDT

Powerset brings the Semantic Web to Wikipedia

Amid speculation that Microsoft is looking to make an acquisition, Powerset launched a public beta of its Wikipedia search engine. Using natural language query processing, it adds a rich semantic dimension to Wikipedia that greatly improves the search and reading experience.

The company calls it a first step in changing the way users search and consume Web content. "It's a complete shift. You see this and you want to experience all content in this way," Barney Pell, co-founder and CTO of Powerset, told me. "And, as an introduction, it will drive huge investment in semantic and linguistic technology, just as investments were made in information retrieval and scalable databases in the past. People working in this space will be very marketable."

Users can enter keywords, phrases, or simple questions in Powerset's search box. Like many Web startups, Powerset is currently free of advertising.

Powerset's natural language search technology is based on patents licensed exclusively from PARC and its own proprietary indexing. Powerset's engine has read 2.5 million Wikipedia pages and extracted "meaning" from the sentences, creating a navigation and semantic layer on top of the popular Web encyclopedia. Following is a pictorial tour of Powerset features:

Powerset has also indexed Freebase, Metaweb's evolving, open database of structured information. The search result page presents Factz, a summary of key information extracted from Wikipedia pages.

Factz can be expanded to display more of the extracted verbs and their associated words and concepts.

Powerset creates a summary of information, or Dossier, on the right side of the page, drawing on Freebase and Wikipedia to give users a quick outline view of a topic. Clicking on an item takes the user to the location in the article and highlights the reference.

Powerset generates a summary of the key Factz to create a kind of Cliff's Notes version of a Wikipedia article. Clicking on a summary item takes the user to the reference location in the article and highlights the key words. Powerset also includes a page for disambiguation of queries.

Powerset also shows a tag cloud of things and actions found by its linguistic analysis engine on the page. Clicking on a word shows related Factz in the outline.

Powerset can provide direct answers to queries from its Wikipedia and Freebase index, and highlight the most relevant search results based on the meaning of the query. Hakia, another semantic search engine, and Google can also surface the date Picasso was born at the top of their results pages.

Powerset's Wikipedia search engine isn't going to slow down Google in the near term, but it will raise the bar on the search experience for all players. "There are implications beyond Wikipedia," Pell said. "Search is not done. You can see the emerging Semantic Web with our integration of Wikipedia and Freebase. We will add other components with structured data and ways to answer questions."

Powerset has said that its longer-term plan is to read, linguistically analyze, and index 20 billion documents on the Web, a costly and ambitious undertaking. (Getting acquired by Microsoft would be helpful for that project. Powerset received $12.5 million in Series A funding from Foundation Capital, Founders Fund, and angel investors in 2006.)

4 comments
by BlackSand May 12, 2008 1:04 AM PDT
Search for "Quantum Mechanic":

Factz from Wikipedia: we found that Quantum Mechanic destroyed Earth

Quantum Mechanic Quantum Mechanic destroyed Earth.
Results for Quantum Mechanic destroyed Earth

Captain Universe It then possessed Evan Swann to stop the Quantum Mechanic from destroying the Earth.
by Codonology May 12, 2008 5:35 AM PDT
I see this as one more step toward a complete set of learning and reasoning tools in the future. However, we still have to ask this simple question: What is a concept? Until such a fundamental question is truly answered, any tools that are built will be flawed at their core.
Have a good future!
Hua Fang
(a codonologist; see my website: www.Codonology.com)
by michael_o May 12, 2008 10:09 AM PDT
I must be missing something: entering the same search queries into the Wikipedia search directly seems to lead to better results: more intelligently grouped, equally germane, and with longer descriptions.
by ghostofitpast May 12, 2008 10:32 AM PDT
I do not think you are missing anything. I discovered that I had to be really inventive at coming up with a question that could not be answered directly through Wikipedia. I wrote up my results at:

http://therehearsalstudio.blogspot.com/2008/05/playing-with-powerset.html
About Outside the Lines

Dan Farber is the editor in chief of CNET News. He has covered technology for more than two decades, and he previously served as editor in chief of ZDNet, PC Week and MacWeek. Outside the Lines explores the intersection of business and technology.
