Transcript
Google Bamboozle
March 30, 2002
BOB GARFIELD: We're back with On the Media. I'm Bob Garfield.
BROOKE GLADSTONE: And I'm Brooke Gladstone. The Internet search engine Google wound up in the middle of a religious spat last week. After receiving a letter from groups representing the Church of Scientology, Google briefly pulled the main page of Xenu.net, an anti-Scientology website, from their data base until it could check out the Church's claim that the site was in violation of copyright law. The claim turned out to be unfounded, and Xenu.net is once again popping up at number four in search results for the term "scientology." But both Xenu.net and the Church of Scientology have been battling for supremacy atop Google for a long time by using key knowledge of how Google works to manipulate its results. It's not the first time and certainly won't be the last that people have tried to dupe search engines. Joining us now is Daniel Sullivan, the editor of searchenginewatch.com. (Isn't it wonderful such a thing exists?) Dan, welcome to OTM.
DANIEL SULLIVAN: Hi. Thank you very much.
BROOKE GLADSTONE: So first, a primer on Google. What makes one web site pop up ahead of another site in a search?
DANIEL SULLIVAN: Well there's no easy answer, because Google uses a, a whole variety of factors. It looks at the words that are on the page, the location of the words that are on the page, how often they may appear. But it does give some extremely heavy weight to links across the web, and so a web site that has a lot of links pointing at it may be deemed as more important than a web site that has only a few. Or a web site that has only a few links pointing at it may still be seen as very important if those links are from very important web sites. So if the only links you got were from, let's say, Yahoo, CNN and perhaps from NPR, even though you had only 3 links, those 3 web sites are very important and so that might be much better than having 50,000 links from people who have little tiny web logs.
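[Illustration: the link weighting Sullivan describes can be sketched as a simplified PageRank-style iteration. This is not Google's actual formula, which the interview does not spell out. The function name, the damping value of 0.85, the page names, and the link counts below are all assumptions chosen for the example.]

```python
def rank_pages(links, damping=0.85, iterations=50):
    """Simplified PageRank-style scoring (an illustrative sketch only).

    links maps each page name to the list of pages it links to.
    Returns a dict of page name -> score; scores sum to roughly 1.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start with every page equally important.
    scores = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a small baseline score regardless of links.
        new_scores = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its importance among the pages it links to.
                share = damping * scores[page] / len(targets)
                for target in targets:
                    new_scores[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * scores[page] / n
                for other in pages:
                    new_scores[other] += share
        scores = new_scores
    return scores


# Hypothetical web: 200 readers link to three big sites, the big sites
# link only to site_a, and 50 small web logs link only to site_b.
big_sites = ["yahoo", "cnn", "npr"]
links = {site: ["site_a"] for site in big_sites}
for i in range(200):
    links[f"reader_{i}"] = list(big_sites)
for i in range(50):
    links[f"weblog_{i}"] = ["site_b"]

scores = rank_pages(links)
```

In this made-up web, site_a has only 3 inbound links and site_b has 50, yet site_a ends up with the higher score because its links come from pages that are themselves heavily linked.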
BROOKE GLADSTONE: So how are Xenu.net and the Church of Scientology battling it out to top each other when you search for the word "scientology" on Google?
DANIEL SULLIVAN: There was a suggestion that in the case of Xenu.net that a lot of people started linking to the site to try to get it up there in the rankings, and in fact I think that some of this linking that started to happen was one reason why it did com-- pop up there.
BROOKE GLADSTONE: Well, and if it's not being done for strictly informational purposes, that's called "Google-bombing," isn't it?
DANIEL SULLIVAN: Google itself would tend to refer to that activity as "link spam," which is the idea of creating artificial link structures to try to manipulate the results. And it can be a fine line, because if all these people are linking and they honestly believe this is a great site, then what they're doing may not actually be a bad thing.
BROOKE GLADSTONE: What other kinds of maneuvers have web sites pulled off in the past to get them higher in the search results?
DANIEL SULLIVAN: Well there's very basic things such as repeating a word over and over again. Some people would then hide those words by making them match the same color as the background page. A more advanced type of thing that would happen would be the creation of what's known as, say, "doorway pages," and this is where you create a page that may not make any sense if you looked at it as a human being -- it might look like a bunch of gibberish. But you've tried to concoct a page that seems to please a search engine's algorithm in a way that it will rank well for a particular term. The search engine's "spider," the thing that comes to you automatically and reads your web pages, it saw the gibberish and when it ranked your web page, it ranked your web page based on that content -- but when a human being clicked through, they never saw what the search engine spider saw at all. They saw a page that might be very attractive -- might lead them to a sales form or any sort of other kind of material.
BROOKE GLADSTONE: Holy cow, Daniel. It seems only a matter of time before search engines become sentient.
DANIEL SULLIVAN: Well in terms of trying to hunt down and eliminate this sort of stuff, they've actually had to be that way for some time. These battles against search engine spammers have gone on almost since as soon as we had our first crawler-based search engine. Google's popularity is in large part because when it started making use of links, it made it harder for people to try to manipulate Google.
BROOKE GLADSTONE: And we're talking about a search engine that has 3 billion web documents to sort through. It's an enormous data base used by a huge pool of people so where you stand on those rankings is no trivial matter!
DANIEL SULLIVAN: No, not at all. Indeed, if you are ranking well for a particular term and you're in the first page, that can make and break some businesses, and if you suddenly fall off onto the second page, relatively few people will go through.
BROOKE GLADSTONE: Something else that Google seems to do, I notice when I use it, is when you put in a few key words, sometimes some pages will pop up that actually don't have those key words in them! One that seems to happen a lot, and I guess Ira Glass has to deal with this, is This American Life is actually distributed by Public Radio International -- P R I -- but everybody assumes it's distributed by NPR. So if you hit NPR and This American Life when you're searching, This American Life's web site will come up, even though there's no mention of NPR anywhere in it.
DANIEL SULLIVAN: Sure. A lot of people, when they're linking to the site, may be misdefining it, by saying this is a great NPR show, and that underscores a problem that we have with the links! They have been helpful in some regards, but at the same time they can be misleading.
BROOKE GLADSTONE: It's been great talking to you. How good is business for a search engine guru these days?
DANIEL SULLIVAN: [LAUGHS] Well it keeps me busy. There's a lot to write about.
BROOKE GLADSTONE: Daniel Sullivan is the editor of searchenginewatch.com. Where do you rank on Google when people are searching search engines?
DANIEL SULLIVAN: I usually do make it up into the first page there. [MUSIC]