
August 18, 2006 4:00 AM PDT

Spying an intelligent search engine

While most would agree that Google has set the current standard for Web search, some technologists say even better tools are on the horizon thanks to advances in artificial intelligence.

Search is like oxygen for many people now, and considering Google's breakthroughs in Web document analysis, supercomputing and Internet advertising, it can be easy to think this is as good as it gets. But some entrepreneurs in artificial intelligence (AI) say that Google is not the end of history. Rather, its techniques are a baseline of where we're headed next.

Proponents of AI techniques say that one day people will be able to search for the plot of a novel, or list all the politicians who said something negative about the environment in the last five years, or find out where to buy an umbrella just spotted on the street. AI techniques such as natural-language processing, object recognition and statistical machine learning will begin to stoke the imagination of Web searchers once again.

"This is the beginning for the Web being at work for you in a smart way, and taking on the tedious tasks for you," said Alain Rappaport, CEO and founder of Medstory, a search engine for medical information that went into public beta in July.

"The Web and the amount of information is growing at such a pace that it's an imperative to build an intelligent system that leverages knowledge and exploits it efficiently for people," he added.

Medstory is not alone. Other young companies such as stealth start-up Powerset and image specialist Riya are also looking to turn arcane computing techniques into business success stories.

Unlearning "keywordese"
In the eyes of a search engine, the Web is essentially a body of words on billions of pages, along with the hyperlinks that connect the words. One of Google's big breakthroughs was to link those words efficiently, measuring relevance by the appearance of words on a page, and the number of hyperlinks pointing to that page, or its popularity.

As a rule, search engines don't understand the words--they're merely programmed to match keywords that are more significant on a page, closer together or linked more often from other pages. So in essence, when someone types in "woolly mammoth," he or she sends the search engine on a wild goose chase for those words, not the animal.
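
To make the idea concrete, here is a minimal sketch of keyword-and-link scoring in Python. It is an illustration only, not Google's actual ranking method; the function name keyword_score, the sample pages and the weighting formula are all assumptions invented for this example.

```python
from collections import Counter


def keyword_score(query, page_text, inbound_links):
    """Toy relevance score: keyword occurrences weighted by link popularity.

    Hypothetical formula for illustration; real engines are far more elaborate.
    """
    query_terms = query.lower().split()
    word_counts = Counter(page_text.lower().split())
    # Count how often each query keyword appears on the page.
    matches = sum(word_counts[term] for term in query_terms)
    # A page with none of the keywords scores zero, whatever it is about.
    if matches == 0:
        return 0
    # Assumed weighting: every inbound link boosts the keyword count.
    return matches * (1 + inbound_links)


# Made-up pages: (text on the page, number of inbound links).
pages = {
    "zoo": ("the woolly mammoth was an ice age animal", 3),
    "band": ("woolly mammoth tour dates woolly mammoth tickets", 10),
    "elephants": ("extinct hairy relatives of the elephant", 50),
}

ranked = sorted(
    pages,
    key=lambda name: keyword_score("woolly mammoth", *pages[name]),
    reverse=True,
)
print(ranked)
```

In this sketch the "elephants" page, which is about the animal but never uses the words, scores zero, while the page that repeats the phrase most often and has more links than the other match ranks first.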

As a result, search engines miss the nuance of the human language. For example, Google might render a simple search for "books by children" by scouting for the pages that include the words "books" and "children," but it would eliminate the so-called stop word--in this case, "by"--because stop words are those that occur on almost every page. Yet those stop words occur so often because they are important to the meaning of a phrase. "Books by children" is different from "books about children," and different still from "children's books."
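
The effect of dropping stop words can be sketched in a few lines of Python. The stop-word list and the to_keywords helper below are assumptions made for illustration, not any search engine's real list or code.

```python
# Hypothetical stop-word list; real engines use their own, larger lists.
STOP_WORDS = {"a", "an", "and", "about", "by", "for", "of", "the"}


def to_keywords(query, stop_words=STOP_WORDS):
    """Lowercase the query and drop stop words, keeping the word order."""
    return [word for word in query.lower().split() if word not in stop_words]


by_children = to_keywords("books by children")
about_children = to_keywords("books about children")

# Both queries reduce to the same keywords, so the difference in meaning is lost.
print(by_children, about_children, by_children == about_children)
```

Once "by" and "about" are filtered out, the two queries become indistinguishable to a keyword matcher, which is the nuance the article describes.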


Barney Pell, founder of a yet-to-be-launched AI search engine, calls the restrictive language of search engines "keywordese."

"Search engines try to train us to become good keyword searchers. We dumb down our intelligence so it will be natural for the computer," said Pell, whose company, Powerset, is based in Palo Alto, Calif.

"The big shift that will happen in society is that instead of moving human expressions and interactions into what's easy for the computer, we'll move computers' abilities to handle expressions that are natural for the human," he said.

Powerset, which hasn't divulged its launch date yet, is using AI to train computers not just to read words on the page, but to make connections between those words and draw inferences from the language. That way, a search engine could reason about a query and redefine relevance beyond the most popular page or the site with the most occurrences of the keywords entered in a search box.


8 comments
AI
by Drewky August 18, 2006 6:55 AM PDT
Artificial Intelligence is a meaningless buzzword. A rule-based system would be accurate but nowhere near the emotional context of artificial intelligence. Inference Engine would be just as buzzy but much more accurate. I am sure others could come up with other terms. Maybe you should hold a contest to see who comes up with the best descriptive term. I submit Inference Engine. I hear AI and I instantly think Vapor.
Face recognition has great potential
by WesFlash August 18, 2006 8:52 AM PDT
Forget finding a girl who looks like my last girlfriend and will remind me of that pain. Law enforcement searches will benefit from facial recognition, from matching a missing kid to a kiddie porn image to finding a felon who changed his/her identity. For law enforcement it will be a new and invaluable tool, trolling the mug shots and the Internet in search of possible matches to all kinds of case files.
AI is just code word for vapor/hoax ware.
by Sea of Cortez August 18, 2006 9:19 AM PDT
I agree with the other poster: AI is just a code word for vapor/hoax ware.

If I had gotten a dollar for each time I heard of AI and then nothing of the sort was ever delivered, I would be a rich man.
There is no such thing as AI (Artificial Intelligence) software and there won't be any, because we have no idea how to emulate complex intelligent thinking in software. There is only good software engineering.

So it is not AI that is going to make a better search engine than Google & Yahoo; it is innovative new search engine ideas, implemented with good software engineering, that are going to do it. And I will tell you about a search engine that is better than Google or Yahoo,
it is called Anoox, and it is better based on these points:
1- It is powered by the Knowledge of the people
2- It is operated in an Open fashion, so NO one company owns/controls it but hundreds of different companies from around the world will
It is here in case:
www.anoox.com
Ah, another reason it is better, it is also not-for-profit.
Google has pushed the limit of HTML search
by darmik August 18, 2006 12:17 PM PDT
Google has pushed the limit of HTML search. As long as pages and sites continue to be created in a way and format that does not allow for descriptive classification, any real AI will be impossible.
True Natural Relevance not AI
by Rob at Collarity August 18, 2006 2:23 PM PDT
I think AI as a search descriptor is a little oxymoronic as well. The comments here demonstrate how each person has their own contextual relationship to the keyword "AI".

I do however agree that search as a science is in its infancy and that current CPU/storage efficiencies now make it possible to deliver an events-based (usage) search architecture connected to individual users, instead of the current link-topology-connected-to-no-one system we've all learned to love and hate.

We're also not ready to throw the "keywordese" out the door. We believe there is a lot of natural intelligence in keyword associations (i.e., John Battelle's "database of intentions") that can provide an order-of-magnitude better relevance if they can be properly distilled--whether it's applied to search, feeds, or media. We think the key is making it implicit (no explicit tagging or rating) and to make sure you have 100% participation--every user is both an information consumer and an information provider for every other user in the system.
It's time to roll on the next generation Search Engine
by guyfrom2006 August 19, 2006 1:15 AM PDT
The popularity of the internet is that it offers a variety of content not available in any other medium. A search engine assists users to locate the information.

The Internet is supposed to be for fun not serious business.

And that is why most people use research tools and offline content to achieve their results. Internet search is the last place a true researcher would go to find top rated content. Only after all other options are closed.

What if the internet offered this as the very first option and people could be sure that all the information that the internet and internet search throws up is Top rated quality stuff and it is available free/subscribed/paid. I am myself a researcher and would love to have such a tool in my hands instead of having to spend a huge amount on buying proprietary research content.

Once I finished using the content, I need not pay for it. There are many websites such as Janes.com which provide this sort of quality info.

Present Internet Search is neither intelligent nor smart. Cluster search engines such as Vivisimo and clusty.com help to some extent but still trash out the same stuff.

In this respect, I admire the efforts of NetAlter which is bringing a radical new search engine that would offer quality content and meta information. According to NetAlter, their search engine would offer a variety of pre-search and post search tools that enable sorting, comparing and analyzing of search results and also offer a single click ecommerce connection.

Check out the NetAlter search whitepaper and presentation.

http://www.netalter.com/Solutions/search.pdf

http://www.netalter.com/Solutions/Search.pps
new search required
by Himanshu_Joshi August 21, 2006 1:36 AM PDT
The 3D search idea by Microsoft is the one I find quite useful, and it involves user interaction as well... Currently many search engines give anonymous links to increase the number of pages, but only some are useful.
thanx
Himanshu joshi
http://montoojoshi.googlepages.com/onlinecompiler
AI is already being added to search engines
by readyforthefuture August 31, 2006 9:13 AM PDT
People need to start following the AI bouncing ball at Google and other search innovators. The integration of the CYC taxonomy, for example, appears to be well under way with available frameworks that allow natural language and human logic to prevail over "keywordism" in the very near future.

http://www.cyc.com/cyc/cycrandd/areasofrandd_dir/distributedai