Web search five years later

I was interested to reread this old blog post of mine (“You Still Working On That?”), written five years ago when I was working at Excite Search. The main startlement on my part was how little has changed, and how much of it I still would agree with.

So what *has* changed? Well, the most obvious thing is that I wrote this cheery post in May of 2001, and half a year or so later Excite@Home was a huge smoking crater in the landscape of publicly-traded tech companies. For years afterward, its see-through headquarters right off of 101 in Redwood City was one of the most obvious monuments to Bubble 1.0. Only now do you begin to see a few cars parked near the southern end – the first colonization by Stanford Hospitals, its current owner.

(By the way, Excite@Home alumni can be pretty reliably partitioned into two groups: Excite people who blame the company’s failure on the merger with @Home, and @Home people who blame it on the merger with Excite. Me, I think that both companies were doomed – Excite by the fact that it had already lost the portal wars, and @Home by the fact that it was owned, dominated, and having its prices set by its customers (the cable companies). But I digress.)

Some other very obvious things have changed since my post – for example, I estimated the size of the web at 4 billion, which is now wrong by (only) an order of magnitude or two. But among things that remained true:

o Web search is not “done”, in any of a number of senses of being done: that it’s perfect, or that no one can think of ways to improve it, or that no one is willing to invest in making it better. It shouldn’t be surprising that it is not finished in the sense of being perfect – a perfect web search would answer all the informational needs of any querier with any possible query intent, across all languages of the world. We’re still working on that, still.

o The competitive frontiers of websearch quality are still dual: both about scaling and about relevance issues that have nothing to do with making a literal-minded interpretation of the websearch task bigger and faster. As I said in the last post:

[A]t a minimum you can wonder: what kinds of documents will users like in general? what kinds of documents can you screen out from the very beginning? what can you infer about what the user _meant_ to ask for, but didn’t? what kinds of mistakes by the user could you correct? what kinds of documents might make the user happy, even if they didn’t include the search terms? what kinds of documents will seem frustratingly “the same” to users even if they are not literal copies? And so on.

Two things are true about these questions: 1) Today’s web search engines have better (and better-staffed) approaches to address every one of them, and 2) Not a single one of those issues is perfectly nailed.

Reading that five-year-old post now, though, what strikes me most is the competitive frontier it never mentions: monetization. Hey, I was just a callow engineer then! (Not young, even then, but still callow.) I had some dim sense that advertising paid for all this work, and I’d heard something or other about this GoTo model where people paid (paid!) directly per keyword for the privilege of ranking (and it sounded kinda sleazy to me at the time for some reason). But other than that, I really mentally left biz to the biz people.

The GoTo model (which later became Overture, and which Google successfully copied and improved), of course, is what has made all the last five years of investment in search possible, and why I now can sit in engineering meetings that are completely focused on one sub-aspect of web search and marvel at the fact that there are more people in the room than ever worked at all of Excite Web Search engineering.

Five years ago, many many people (including, importantly, the executive management of both Excite@Home and Inktomi Corporation) thought that web search was both Done and a Commodity. The subsequent revolutions in both quality and monetization proved them wrong. I know the risk here is of “fighting the last war”, but whenever I hear anyone say that websearch _today_ is both Done and a Commodity, I get a small headache. (I’ll return to this in a follow-up post.)

4 responses to “Web search five years later”

Smith

2006.07.17

It's a really nice piece of info that Google copied the GoTo model of Overture, but what really puzzles me is what the Overture paid search model resembles. Is that the reason Google is losing its relevance in search results?
—
How google profits from irrelevance –
http://www.organicspam.com/how_google_profits_from_irrelevance.asp

Greg

2006.07.19

I look forward to your next post. I can’t imagine there being a “last war” in this field, given the money and number of folks looking to game the engines.

Toby Sodor

2006.07.26

Although webmasters and SEO may complain about the quality of the SERPs, I genuinely believe that users are being served well by the search engines. I would say that for 99% of searches, relevant results are presented. And if not, users are savvy enough to refine the search themselves. It’s the few irrelevant or spammy results that do stand out though. The problem is, for a computer, the spammy and relevant results can look very similar. Now that’s a challenge.

Search Engines WEB

2006.08.17

In reference to EXCITE – please do a topic on *CONCEPT* Search Technology.

Excite was innovative in its time as attempting this….and the SERPs were unique enough to make this one of the several search choices used when looking for as many relevant sites as possible (the way power searchers searched back then)

Also, Excite was innovative in assigning *RELEVANCY* numbers for each SERP listing.

It would be interesting to search historians, if a topic would elaborate on the ideas for these innovations.