Confessions to a Beloved Search Engine

(Written in 2006)

Launching my browser, I’m greeted by a single word: Google. Sometimes it is decorated with spiders for Halloween, holly at Christmas and clovers on St. Patrick’s Day. Whatever fripperies adorn it, Google has ooched its way into being my preferred portal to the world wide web.

It’s difficult to imagine a time without the infamous search engine. Google is now so ubiquitous that it has become a word synonymous with search. I don’t look for things on the web anymore, and I certainly don’t surf. No, surfing is so passé. Instead, I Google! I Google for homeopathic remedies for what ails me, I Google for quick and easy recipes, and I Google for the weather forecast before packing for my vacation. Six multi-colored letters flanked by a tiny registered trademark have become my lexicon to the data universe.

But how has this constant companion woven itself so seamlessly into my life, and what are the consequences of our seemingly natural symbiosis? This collection of anecdotes, references and observations is an attempt to get my head around these basic but essential questions.

The Surf was up but I wasn’t on the waves.

Although it’s hard to remember, there was a time without Google. After all, it was only a little over ten years ago that this household name was nothing but a university experiment conducted by two Stanford doctoral students, Larry Page and Sergey Brin. And while they were cooking away with the university’s equipment, I’d just bought my first Mac, one of the Performa series, which I believed was an amazingly souped-up typewriter with graphic capabilities. When I purchased my computer, I had no idea what the world wide web had to offer and only bought a modem along with the other kit, because a persuasive salesman said he could cut me a deal.

The 56K modem was stealth for its time but rinky-dink when compared to today’s technology. Sometimes, I still feel nostalgic for the sound of a dial-up connection. With all its bleeps, screeches and stutters, it was the kind of noise that makes data feel substantial. It was as though information had to physically squeeze itself into a tiny telephone cable to then reconfigure itself intelligibly into your computer. Also, the sound represented a clear divide between being offline and online.[ref]Sound file from the Freesound Org, which is dedicated to sound belonging to the creative commons – all attributions can be found here: http://www.freesound.org/people/Jlew/sounds/16475/[/ref]

Here, just listen to its tune:

Once this dissonant melody played, my glorified typewriter was instantly transformed into something else. I wasn’t quite sure what that “else” was, but I knew it was my channel to surfing the web. But surf to where, and for what? Those were the questions. As a novice to the web, I had no idea what might be out there. And like most people back then, I gravitated towards AltaVista for guidance. However, search in those days was not very refined. If I typed in “art”, a few pearls were rendered, but also a lot of schlock and porn.

Why was searching the web such a bad experience? Well, to answer this question, you have to look at how a search engine works in general. I’m not mentally wired to grasp the intricacies of algorithms, but luckily John Battelle, in his book The Search, describes the basic mechanics of query in digestible layman’s terms:

A search engine consists of three major pieces – the crawl, the index, and the runtime system or query processor, which is the interface and related software that connects a user’s queries to the index. The runtime system also manages the all-important questions of relevance and ranking. [ref]Battelle, 2005, p.20[/ref]

In a nutshell, these elements, coupled with the refinement of an individual query, make or break the efficacy of any search. How these parts are tuned in relation to each other dictates whether I land where I want to go, or whether I’m banished to the outer regions of pornographic Siberia.
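For the technically curious, the three pieces Battelle names can be sketched in a few lines of Python. This is only a toy under my own assumptions: the page names and their text are invented, the “crawl” is a hard-coded dictionary rather than a real fetch, and the query step does no relevance ranking at all, which is exactly the part the early engines struggled with.

```python
from collections import defaultdict

# Stand-in for the crawl: hypothetical pages that have already been fetched.
crawled_pages = {
    "gallery.example": "modern art gallery and painting exhibitions",
    "trees.example": "evergreen tree with spiky branches found in chile",
    "museum.example": "art museum with sculpture and painting",
}

def build_index(pages):
    """The index: map every word to the set of pages that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """The query processor: return pages containing every query word.

    Results are sorted alphabetically; there is no ranking in this sketch.
    """
    words = query.lower().split()
    if not words:
        return []
    matches = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(matches)

index = build_index(crawled_pages)
art_results = search(index, "art painting")
tree_results = search(index, "evergreen spiky chile")
```

Here `art_results` holds both the gallery and the museum page with nothing to say which is the better answer, while the more specific tree query narrows to a single page.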

Although a showpiece of its time, AltaVista frequently left me in the chill. Later I dabbled with Yahoo, but it also lacked precision. The miss rate, coupled with the high cost of connecting via my precious 56K modem, meant surfing remained a minor part of my internet experience until late 1999.

He said: Hey, hi, how’s it going?
She said: Okay, I guess…

Chatting with a friend on the phone, I was talking about a kind of tree. Describing the foliage in detail to him, I just couldn’t remember the name. Then I said: “Wait a minute, let me go to Yahoo.” And he said: “Oh my God, you still use Yahoo! Go to WWW.GOOGLE.COM.” Once there, I quickly typed a few descriptive qualities into the prompt: evergreen tree, spiky branches, found in Chile. Then, it rendered what I was looking for: The Monkey Puzzle Tree.

No more shooting in the dark, and no more getting caught in endless porn pop-up loops (at least not without my direct solicitation). Page and Brin had polished their algorithms and perfected their crawlers, index and runtime system. It had become a deus ex machina. Despite the banality of my query, my first encounter with Google was profound and represented a complete paradigm shift in how I now navigate the web.

But what makes Google as a search experience so unique? The incontestable quality elevating it in the industry is PageRank, a system of hierarchical value showing ‘this’ is more relevant to a given query than ‘that’.

I’ll rub your back, if you rub mine :-)

Before Google, Larry Page created Backrub. In that prototype are the roots of today’s PageRank. In recounting the history of search, Battelle writes:

Page was naturally aware of the concept of ranking in academic publishing, and he theorized that the structure of the Web’s graph would reveal not just who was linking to whom, but more critically, the importance of who linked to whom, based on various attributes of the site that was doing the linking.

[ref]Battelle, 2005, p.74[/ref]

Where previous engines crawled a URL tracking links to other online sources, Backrub traced who was linking back, meaning who was referring to a particular text, article or web object. And the accumulation of those linked citations, coupled with their textual embedding or metadata, marked a URL’s relevance to a given query.
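The idea of tracing who links back can be illustrated with a small Python sketch. It is a deliberately crude one under my own assumptions: the link graph and page names are made up, and it only counts incoming links, whereas the approach Battelle describes also weighs the importance of the page doing the linking.

```python
from collections import Counter

# Hypothetical link graph: each page maps to the pages it links out to.
links = {
    "a.example": ["c.example", "b.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example"],
    "d.example": ["c.example"],
}

def count_backlinks(graph):
    """Count, for every page, how many other pages link to it."""
    counts = Counter({page: 0 for page in graph})
    for source, targets in graph.items():
        for target in set(targets):   # ignore repeated links from one page
            if target != source:      # ignore self-links
                counts[target] += 1
    return counts

backlinks = count_backlinks(links)

# Most-cited page first; ties are broken alphabetically.
ranked = sorted(backlinks, key=lambda page: (-backlinks[page], page))
```

In this invented graph, `c.example` is cited by three other pages and so lands at the top of `ranked`, while `d.example`, which nobody links to, lands at the bottom.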

Where Lycos, Yahoo and AltaVista were not particularly efficient at separating the wheat from the chaff, Google, even in its nascent form of Backrub, was right on the mark. [ref]Battelle, 2005, p.74[/ref] By 1997, Backrub was officially christened Google, the search engine we know today.

I joined the cult of Googlemania after my first query, my 56K modem was swapped for ISDN, and offline was a phenomenon of the past. No more ring a ding ding screech, just a constant flow of data exchange. With speed, bandwidth and Google by my side, ‘surf’ was elevated to ‘search’, something precise and targeted. Ask Jeeves, Lycos, Yahoo, AltaVista and HotBot were forgotten portals, and I never looked back.

Now, before you think I’m about to drift into the sing-along of the California Ideology, here’s the point where the Ink Spots chime in, crooning:

Someone’s rocking my dreamboat,
someone’s invading my dream.
We were sailing along,
so peaceful and calm,
suddenly something went wrong.

Okay, saying “something went wrong” might be a bit dramatic. Maybe I should say, something wasn’t quite right. *Warning* This is not about to become one of those stories about a wonderfully angelic nanny who looked like she could provide everything, but instead turned out to be a psycho sex-crazed killer. Google is far from a whacked-out nanny. That said, like the beginning of those nanny horror flicks, Google slowly cozied itself into a prominent, yet unquestioned position in my world. Its presence in my life felt (for lack of a better word) ‘natural’.

Oh yes, that’s exactly what I meant….

And I was not the only one. By 2005, all other engines had been allocated smaller slices of the global search market. Google had “51%” of the pie, and I was an infinitesimally small speck calculated amongst its mass of users.[ref]Battelle, 2005, p.30[/ref]

But let me clarify the word ‘user’ in this context. In the industry of search, I am not just a user, but also an unintentional co-producer, tweaking the apparatus with every query I make. As Matthew Fuller notes in Behind the Blip: “Just as the engine helpfully provides a field in which you can type your search string, the engine is also opening up an aperture through which the user can be interrogated.” [ref]Fuller, 2003, p.70[/ref] In other words, in that subtle interrogation, also known as an exchange of cookies, I tell Google not only what I am looking for, but I also give them my IP address. Identifying myself as a unique user, I allow them to track my queries in order to perfect their search engine and better target their pay-per-click advertising.

Most of that gathered information is known only to the company, but some of the most popular queries can be found on Zeitgeist, an ongoing Google survey. If you go back to the earliest records, the summary of 2001, the queries are quite revealing. Amongst the top 20 were: Nostradamus, The World Trade Center, anthrax, Osama Bin Laden, the Taliban, Afghanistan and the American Flag. The words represent a kind of query portrait of what were, at that time, predominantly English-speaking web users. Zeitgeist is now arranged very differently. Aside from an international edition, it is divided into a survey of Google.com, Google News and Froogle. In 2005, the last year to be consolidated into a complete annual archive, Bin Laden was nowhere to be found (both physically and virtually on Zeitgeist), and Janet Jackson topped both Hurricane Katrina and the great Indian Ocean tsunami which killed so many.

Zeitgeist, and the statistics that Google does not reveal for proprietary and civil liberty reasons, are certainly a virtual ethnographer’s dream. As John Battelle says, we are in the process of building a “Database of Intentions”; with every query, we state an intent, a desire to know. [ref]Battelle, 2005, p.2[/ref] And all of it is being meticulously documented.

Forgive me father for I have sinned both in my thoughts and in my queries…

But when I look at my own queries, they are not only intentions. As Fuller notes, our “confessions of fear and desire” are emptied into spaces like Google. [ref]Fuller, 2003, p.74[/ref] Yes, I’ve been telling Google my secrets. Whispering things through keywords that I would never reveal in public. Alone on my laptop, I show my shadow-side curiosities; things I want to know, but don’t want other people to know that I WANT to know.

Although I’m in a happy relationship, I Google for past lovers. I am not a terrorist, but have researched Islamic militants. Forensic science shows give me nightmares, but I’ve viewed serial killer profiles. I wouldn’t want to be considered a bimbo, but I’ve Googled for Britney Spears’s ‘money shot’. And while I wouldn’t buy the video, I’ve Googled for Paris Hilton’s sex tape. With each click, I pour confessions, some innocent and others more incriminating, and Google is the dutiful priest, silently taking notes of everything I say.

Actually, Google and I have entered into an implicit pact. I get the results I want, in exchange for my trackability to tweak the search/click advertising apparatus. Why shouldn’t I enter into such an agreement? After all, this is the company that declares: “Our informal corporate motto is ‘Don’t be evil.’ We Googlers generally relate those words to the way we serve our users.” But let’s say the company remains on the up and up, and keeps to its motto; there could still be third parties that see otherwise. Data in its raw form is always neutral. It is human interpretation which turns data into information, meaning something useful, incriminating, or proof to be acted upon.

In 2005, Sergey Brin lectured at Berkeley as a part of a course on Search Engines: Technology, Society and Business. In the Q&A follow-up session, someone asked about Google’s China policy and censorship. Brin answered that Google didn’t censor; China’s firewall did. To me, that answer revealed how the company might passively bend to third-party interests, while not actively moving against their motto. After all, they aren’t being evil; China’s firewall is actually doing the dirty work.

While I don’t want this essay to collapse into complete paranoia and conspiracy theory, it is safe to say that most Western countries are experiencing changes in anti-terror laws which compromise the protection of private information. In the American context, John Battelle speaks specifically about the Patriot Act, and the fact that if Google is summoned to give its information over, legally, it would not have to notify its users.[ref]Battelle, 2005, p.200[/ref] And if I consider for a moment how often I go to Google to make my confessions, how often I enter a query, and then multiply that by all the other users on the web and add in other services like Gmail, that’s an unprecedented amount of collected data. As Battelle acknowledges:

…the implications of such broad government authority are chilling given the world in which we now live — a world where our every digital track, once lost in the blowing dust of a presearch world, can now be tagged, recorded, and held in the amber of a perpetual index. [ref]Battelle, 2005, p.200[/ref]

Given this threat, the Electronic Frontier Foundation has been diligently encouraging Google to look at its policy of indefinitely holding data, and in March 2007 Google announced plans to retain identifiable search logs for only 18 to 24 months. Although pleased with Google’s move, the Electronic Frontier Foundation responded with the following assertions for greater policy change:

  • Google should shorten the retention period for identifiable logs to six months at the outside, and ideally to only thirty days (which is AOL’s retention limit for similar logs). Barring this, it should at least justify why it needs such records for up to two years, beyond offering one-sentence platitudes about how such records are used to improve Google’s service.
  • Google should also shorten the retention of the “anonymized” logs, which Google apparently still intends to keep forever. As Google itself admits, the new policy changes still don’t guarantee users’ anonymity, and holding onto those records indefinitely still poses a serious privacy threat. Therefore, Google should consider more robust anonymization techniques, up to and including scrubbing entire IP addresses rather than just the last quarter or “octet” of such addresses.
  • Finally, Google should expand its new anonymization policy to include the search records of users with Google Account log-ins, and to records generated by their myriad other services, rather than limiting the policy change to regular search logs.

[ref]Source: the Electronic Frontier Foundation’s March archive[/ref]
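To make the “octet” point concrete, here is a minimal Python sketch of the two anonymization approaches mentioned in the second demand. It is an illustration under my own assumptions, not a description of how Google actually processes its logs: the function names are hypothetical, the sample addresses come from a range reserved for documentation, and only IPv4 is handled.

```python
import ipaddress

def mask_last_octet(ip: str) -> str:
    """Zero the final octet (last quarter) of an IPv4 address."""
    network = ipaddress.ip_network(f"{ip}/24", strict=False)
    return str(network.network_address)

def scrub_entirely(ip: str) -> str:
    """Discard the whole address, keeping nothing that identifies the user."""
    ipaddress.IPv4Address(ip)  # raises ValueError if the input is not valid IPv4
    return "0.0.0.0"

# How many distinct addresses collapse into one masked value.
candidates = ipaddress.ip_network("192.0.2.0/24").num_addresses

masked = mask_last_octet("192.0.2.57")
scrubbed = scrub_entirely("192.0.2.57")
```

Masking only the last octet leaves 256 possible addresses behind each logged value, which narrows a user down to a small neighbourhood rather than hiding them, whereas scrubbing the whole address leaves nothing to narrow down.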

The stakes are high on both sides. I want to believe my data is mine, but for Google it’s linked to the productivity and optimization of the engine. And if you look at their stock, which has risen by leaps and bounds since their 2004 initial public offering, that tweaking through profiling has proved more than a profitable formula.

So far, Google has publicly defended the information in its possession. In 2006, when solicited by the US government to hand over its search data, Google refused while other search engines obliged. The question is, for how long can it hold out? For better or worse, data has the nasty habit of lingering and, moreover, of falling into the wrong hands. Knowing where we are, but not quite understanding the full consequences, I continue to ooze confessions with the rest of Google’s users. Why? Because Google is basically the only search engine left in town. And with every query, our profiles flesh out in greater dimension. In other words, we construct images of ourselves, whether true or false. But it’s important to remember that to be curious or to query, is not necessarily to act. Whether our eventual data interpreters will grasp that fact remains to be seen.