June 29, 2009

Lessons from Googlenomics: Data Abundance, Insight Scarcity

“"What's ubiquitous and cheap?" [Google’s Hal] Varian asks. "Data." And what is scarce? The analytic ability to utilize that data.”

The June issue of Wired has an excellent article by Steven Levy, entitled “Secret of Googlenomics: Data-Fueled Recipe Brews Profitability.” The article delves into the history and algorithms behind Google’s auction-based ad system, highlighting the significance of engineering, mathematics, economics, and data mining in Google’s success.

On the economics front, the article explains Hal Varian’s role as Chief Economist at Google, including why Google needs a chief economist:

“The simplest reason is that the company is an economy unto itself. The ad auction, marinated in that special sauce, is a seething laboratory of fiduciary forensics, with customers ranging from giant multinationals to dorm-room entrepreneurs, all billed by the world's largest micropayment system.

Google depends on economic principles to hone what has become the search engine of choice for more than 60 percent of all Internet surfers, and the company uses auction theory to grease the skids of its own operations. All these calculations require an army of math geeks, algorithms of Ramanujanian complexity, and a sales force more comfortable with whiteboard markers than fairway irons.”

After I read the article, Varian’s economic view of data ubiquity and analytic scarcity really stuck with me. The quote that opens this post isn’t about software availability or processing power. It refers to the scarcity of people qualified to turn abundant data into economic value.

What follows are some excerpts “about harnessing supply and demand”.  The sub-headers and emphasis are mine.

Enter Econometricians

"The people working for me are generally econometricians—sort of a cross between statisticians and economists," says Varian, who moved to Google full-time in 2007 (he's on leave from Berkeley) and leads two teams, one of them focused on analysis.

"Google needs mathematical types that have a rich tool set for looking for signals in noise," says statistician Daryl Pregibon, who joined Google in 2003 after 23 years as a top scientist at Bell Labs and AT&T Labs. "The rough rule of thumb is one statistician for every 100 computer scientists."

Ubiquitous Data

“As the amount of data at the company's disposal grows, the opportunities to exploit it multiply, which ends up further extending the range and scope of the Google economy…

Keywords and click rates are their bread and butter. "We are trying to understand the mechanisms behind the metrics," says Qing Wu, one of Varian's minions. His specialty is forecasting, so now he predicts patterns of queries based on the season, the climate, international holidays, even the time of day. "We have temperature data, weather data, and queries data, so we can do correlation and statistical modeling," Wu says. The results all feed into Google's backend system, helping advertisers devise more-efficient campaigns.”
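To make Wu's "correlation and statistical modeling" remark a little more concrete, here is a minimal Python sketch of the simplest version of the idea: a Pearson correlation between a weather series and a query series. The numbers are invented purely for illustration (they are not Google data), and the function and variable names are my own. Nothing here is meant to describe how Google's forecasting actually works.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical, made-up series: daily high temperature (Celsius) and the
# number of "sunscreen" queries on the same days. NOT real data.
daily_high_temp_c = [2, 5, 9, 14, 19, 24, 28]
sunscreen_queries = [110, 130, 180, 260, 390, 520, 640]

r = pearson(daily_high_temp_c, sunscreen_queries)
print(f"temperature vs. sunscreen queries: r = {r:.2f}")
```

A high correlation like this only says the two series move together. Turning it into a forecast, as the excerpt describes, would take a proper model that accounts for season, holidays, and time of day.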

Continuous Analysis

“To track and test their predictions, Wu and his colleagues use dozens of onscreen dashboards that continuously stream information, a sort of Bloomberg terminal for the Googlesphere. Wu checks obsessively to see whether reality is matching the forecasts: "With a dashboard, you can monitor the queries, the amount of money you make, how many advertisers you have, how many keywords they're bidding on, what the rate of return is for each advertiser."”

Behavior-Based Insights

“Wu calls Google "the barometer of the world." Indeed, studying the clicks is like looking through a window with a panoramic view of everything. You can see the change of seasons—clicks gravitating toward skiing and heavy clothes in winter, bikinis and sunscreen in summer—and you can track who's up and down in pop culture. Most of us remember news events from television or newspapers; Googlers recall them as spikes in their graphs. "One of the big things a few years ago was the SARS epidemic," Tang says. Wu didn't even have to read the papers to know about the financial meltdown—he saw the jump in people Googling for gold. And since prediction and analysis are so crucial to AdWords, every bit of data, no matter how seemingly trivial, has potential value.”

Rise of the Datarati

“Varian believes that a new era is dawning for what you might call the datarati—and it's all about harnessing supply and demand. "What's ubiquitous and cheap?" Varian asks. "Data." And what is scarce? The analytic ability to utilize that data. As a result, he believes that the kind of technical person who once would have wound up working for a hedge fund on Wall Street will now work at a firm whose business hinges on making smart, daring choices—decisions based on surprising results gleaned from algorithmic spelunking and executed with the confidence that comes from really doing the math.”

Now, a few questions I think folks should consider:

  1. Who does that math in your organization? 
  2. Does your analytics / active information strategy suffer from information-processing richness and insight scarcity?
  3. Who are, or should be, your datarati? 

April 30, 2009

Cloud Watching Pick: Vint Cerf on Cloud Computing and the Internet

If you haven’t seen it yet, Vint Cerf published a thoughtful piece on Cloud Computing and the Internet on the Google Research blog. In the post, Cerf compares the current stage of cloud computing (“Each cloud is a system unto itself”) to the state of networking in the 1960s, which led to his and Robert Kahn’s work to interconnect proprietary networks and form the Internet.

While the entire post is excellent, and well worth the read, I wanted to call out the questions Cerf raises with respect to connecting the clouds, or as some refer to it, the inter-cloud. Because, as every architect knows, the right answers only arise from asking the right questions.

First, the problem as Cerf describes it:

“Cloud computing is at the same stage. Each cloud is a system unto itself. There is no way to express the idea of exchanging information between distinct computing clouds because there is no way to express the idea of “another cloud.” Nor is there any way to describe the information that is to be exchanged. Moreover, if the information contained in one computing cloud is protected from access by any but authorized users, there is no way to express how that protection is provided and how information about it should be propagated to another cloud when the data is transferred.”

Now, Cerf’s questions:

“There are many unanswered questions that can be posed about this new problem. How should one reference another cloud system? What functions can one ask another cloud system to perform? How can one move data from one cloud to another? Can one request that two or more cloud systems carry out a series of transactions? If a laptop is interacting with multiple clouds, does the laptop become a sort of “cloudlet”? Could the laptop become an unintended channel of information exchange between two clouds? If we implement an inter-cloud system of computing, what abuses may arise? How will information be protected within a cloud and when transferred between clouds. How will we refer to the identity of authorized users of cloud systems? What strong authentication methods will be adequate to implement data access controls?”

Instead of answers, Cerf closes by encouraging exploration and creation:

“Because the Internet is primarily a software artifact, there seems to be no end to its possibilities. It is an endless frontier, open to exploration by virtually anyone. I cannot guess what will be discovered in these explorations but I am sure that we will continue to be surprised by the richness of the Internet’s undiscovered territory in the decades ahead.”

Check out his full post.

March 11, 2009

Economist Tech Quarterly: Fueling your morning and commute with coffee

As you probably gathered from the title, although this post is technology related, it is definitely off-topic. However, I found the article interesting, and thought others might as well. Plus, my brother (an engineering geek) stops by occasionally, so if nothing else, this one is a reward for slogging through posts littered with the "tech acronym du jour".

And yes, I did the math. One gallon of biodiesel derived from coffee grounds requires the consumption of about 50 lbs of coffee, or approximately 2,250 cups. So, if you see a major uptick in my writing output, accompanied by jittery speech, you know I'm doing my part in the "beaning of America".

Without (even) further ado, what follows are excerpts from Fuelled by coffee, in the most recent Economist Technology Quarterly.

The basics:

"In the case of coffee, the biodiesel is made from the leftover grounds, which would otherwise be thrown away or used as compost. Narasimharao Kondamudi, Susanta Mohapatra and Manoranjan Misra of the University of Nevada at Reno have found that coffee grounds can yield 10-15% of biodiesel by weight relatively easily. And when burned in an engine the fuel does not have an offensive smell—just a whiff of coffee. (Some biodiesels made from used cooking-oil produce exhaust that smells like a fast-food joint.) And after the diesel has been extracted, the coffee grounds can still be used for compost."

The accidental discovery:

"The researchers’ work began two years ago when Dr Misra, a heavy coffee drinker, left a cup unfinished and noticed the next day that the coffee was covered by a film of oil. Since he was investigating biofuels, he enlisted his colleagues to look at coffee’s potential."

Advantages beyond aroma:

"The researchers found that coffee biodiesel is comparable to the best biodiesels on the market. But unlike biodiesels based on soya or other plants, it does not divert crops or land from food production into fuel production.

A further advantage is that unmodified oils from plants, like the peanut oil used by Diesel in the 19th century, have high viscosity and require engine alterations. Diesel derived from coffee is less thick and can usually be burned in an engine with little or no tinkering."

The math (why we won't be doing this at home):

"Although some people make their own diesel at home from leftovers and recycled cooking oil, coffee-based biodiesel seems better suited to larger-scale processes. Dr Misra says that a litre of biodiesel requires 5-7kg of coffee grounds, depending on the oil content of the coffee in question. In their laboratory his team has set up a one-gallon-a-day production facility, which uses between 19kg and 26kg of coffee grounds. The biofuel should cost about $1 per gallon to make in a medium-sized installation, the researchers estimate.

Commercial production could be carried out by a company that collected coffee grounds from big coffee-chains and cafeterias. There is plenty available: according to a report by the United States Department of Agriculture, more than 7m tonnes of coffee are consumed every year, which the researchers estimate could produce some 340m gallons of biodiesel."
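For anyone who wants to check the arithmetic, here is a minimal Python sketch that cross-checks the figures quoted above against each other and against the 50 lbs / 2,250 cups estimate earlier in this post. It rests on three assumptions that do not come from the article: that "gallon" means a US gallon, that a cup uses roughly 10 grams of dry grounds, and that the 7m tonnes of coffee consumed are all available as grounds. The function names are just for illustration.

```python
# Unit conversions (exact by definition).
LITRES_PER_US_GALLON = 3.785411784  # assumes the article means US gallons
KG_PER_LB = 0.45359237

# Figures quoted from the Economist article.
KG_GROUNDS_PER_LITRE = (5.0, 7.0)
KG_GROUNDS_PER_GALLON = (19.0, 26.0)
WORLD_COFFEE_KG = 7_000_000 * 1000  # "more than 7m tonnes"
QUOTED_GALLONS = 340_000_000

# Assumption, not from the article: dry grounds used per cup of coffee.
GRAMS_PER_CUP = 10.0


def per_gallon_from_per_litre(kg_per_litre):
    """Convert kg of grounds per litre of fuel to kg per US gallon."""
    return kg_per_litre * LITRES_PER_US_GALLON


def cups_for(lbs_of_coffee, grams_per_cup=GRAMS_PER_CUP):
    """Number of cups brewed from a given weight of coffee in pounds."""
    return lbs_of_coffee * KG_PER_LB * 1000 / grams_per_cup


def gallons_from(kg_grounds, kg_per_gallon):
    """Gallons of biodiesel obtainable from a given mass of grounds."""
    return kg_grounds / kg_per_gallon


low, high = (per_gallon_from_per_litre(k) for k in KG_GROUNDS_PER_LITRE)
print(f"5-7 kg per litre is {low:.1f}-{high:.1f} kg per US gallon")

lbs_low, lbs_high = (kg / KG_PER_LB for kg in KG_GROUNDS_PER_GALLON)
print(f"19-26 kg per gallon is {lbs_low:.0f}-{lbs_high:.0f} lbs per gallon")

print(f"50 lbs at {GRAMS_PER_CUP:.0f} g per cup is about {cups_for(50):.0f} cups")

worst = gallons_from(WORLD_COFFEE_KG, KG_GROUNDS_PER_GALLON[1])
best = gallons_from(WORLD_COFFEE_KG, KG_GROUNDS_PER_GALLON[0])
print(f"7m tonnes yields {worst / 1e6:.0f}m-{best / 1e6:.0f}m gallons "
      f"(article quotes {QUOTED_GALLONS / 1e6:.0f}m)")
```

Under those assumptions the quoted numbers hang together: the per-litre and per-gallon ranges agree, 50 lbs sits inside the per-gallon range, and the 340m gallon estimate falls between the worst and best cases.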

If you found this interesting, check out the full article and the reader comments.

October 16, 2008

Don't fence me in...when the "C" in CEP is for Cow

One of the many interesting articles in the most recent Economist Technology Quarterly was a piece on virtual fencing for livestock.  If pet containment systems at scale come to mind, you are on the right track.

The business problem:

"BUILDING and maintaining the fences needed to control livestock is an expensive and time-consuming business. The materials alone can cost more than $20,000 a kilometre. On top of that, there is the cost of repairing damage caused by wild animals and falling trees. And then there is the need to move some fences around, a bit at a time, so that grazing land can be used efficiently. Strange as it may seem, many ranchers would therefore like to get rid of fences—if they could."

The science problem:

"According to Dean Anderson, an animal scientist at America’s Department of Agriculture, and Daniela Rus, a computer scientist at the Massachusetts Institute of Technology, the answer is to move from real fencing to the virtual sort. The idea of virtual fencing is not entirely new. Pet “containment” systems, such as collars that give dogs a small electric shock if they roam outside a particular area, have been around since the early 1970s. But attempts to come up with a system for controlling free-ranging animals have failed."

The breakthrough idea (emphasis is mine):

"Dr Anderson and Dr Rus started from the observation that the job of a fence is merely to regulate an animal’s behaviour and asked if there was another way of achieving the same end. The result is a device dubbed the Ear-a-round, which acts both as a sensor of what an animal is up to and as a discipline on animals that are not behaving as their owner wishes.

The Ear-a-round consists of a small, light box that sits on top of a cow’s head, and a pair of earpieces made of fabric and plastic. The box contains a small computer, a GPS satellite-tracking device and a transceiver that enables it to be programmed remotely. The earpieces both keep the box upright and deliver commands—either sounds or electric shocks—to the animal wearing the device. The whole thing is powered by lithium-ion batteries topped up by solar cells."

The event processing part -- event detection, evaluation & response:

"The range that an animal is allowed to occupy is encoded using GPS co-ordinates. The GPS system determines the animal’s location, and an accelerometer and a compass housed inside the box track its rate and direction of travel. If an animal roams beyond the range specified, the device responds in a way determined by its wearer’s recent behaviour. The algorithms devised by Dr Rus are able to work out, based on past experience, how strong the message to turn back needs to be.

Minor transgressions lead to quiet sonic alerts or mild tingles; major ones to shouts or shocks. In both cases the cue is delivered to the ear opposite the direction that the animal is being nudged towards. Four years of research at a ranch in New Mexico have shown that cattle quickly cotton on to what they need to do."
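The detect, evaluate, respond loop described in the excerpt can be sketched in a few lines of Python. To be clear, this is not Dr Rus's algorithm, which the article does not spell out. It is a toy version under loud assumptions: the allowed range is a simple rectangle in local x/y metres (x east, y north) rather than real GPS co-ordinates, "recent behaviour" is reduced to a count of consecutive out-of-range fixes, and the two thresholds are invented values.

```python
import math
from dataclasses import dataclass
from typing import Optional

# Hypothetical thresholds, chosen only for illustration.
MAJOR_OVERSHOOT_M = 20.0  # metres outside the range that count as "major"
REPEAT_LIMIT = 3          # consecutive violations before escalating


@dataclass(frozen=True)
class Cue:
    kind: str  # "quiet sound", "mild tingle", "shout" or "shock"
    ear: str   # "left" or "right"


@dataclass
class VirtualFence:
    min_x: float
    max_x: float
    min_y: float
    max_y: float
    consecutive_violations: int = 0

    def overshoot(self, x, y):
        """Distance in metres outside the rectangle (0.0 when inside)."""
        dx = max(self.min_x - x, 0.0, x - self.max_x)
        dy = max(self.min_y - y, 0.0, y - self.max_y)
        return math.hypot(dx, dy)

    def ear_for(self, x, y, heading_deg):
        """Pick the ear opposite the side the animal should turn towards."""
        cx = (self.min_x + self.max_x) / 2
        cy = (self.min_y + self.max_y) / 2
        # Compass bearing from the animal to the centre of the range.
        bearing = math.degrees(math.atan2(cx - x, cy - y))
        # Signed angle from heading to bearing, in [-180, 180).
        relative = (bearing - heading_deg + 180.0) % 360.0 - 180.0
        # Target on the right means nudge right, so cue the left ear.
        return "left" if relative > 0 else "right"

    def on_fix(self, x, y, heading_deg) -> Optional[Cue]:
        """Process one position fix; return a cue, or None if in range."""
        out_by = self.overshoot(x, y)
        if out_by == 0.0:
            self.consecutive_violations = 0
            return None
        self.consecutive_violations += 1
        repeated = self.consecutive_violations >= REPEAT_LIMIT
        if out_by < MAJOR_OVERSHOOT_M:
            kind = "mild tingle" if repeated else "quiet sound"
        else:
            kind = "shock" if repeated else "shout"
        return Cue(kind, self.ear_for(x, y, heading_deg))


fence = VirtualFence(0, 100, 0, 100)
print(fence.on_fix(50, 50, heading_deg=0))   # inside the range: no cue
print(fence.on_fix(105, 50, heading_deg=0))  # just east of the range, facing north
```

In event processing terms, each GPS fix is the event, `overshoot` is the detection, the threshold and repeat logic is the evaluation, and the returned cue is the response.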

The cow as downstream actor -- old fashioned stimuli:

"Not all the stimuli used to guide the animals are unpleasant—at least, not intentionally so. In April Dr Anderson set out to test whether a recording of him singing the “gathering songs” used during traditional round-ups would be as effective at herding cattle as irritating sounds such as barking dogs, or electric shocks. Four Brangus cows listened to recordings of him chirping “Come on, girls” at 30-second intervals. Almost immediately, he says, the herd began moving towards the corral. It is not clear whether they were encouraged by his singing, or were trying to get away from it."

The ROI -- cow leadership:

"Is all this really cheaper than fencing? At $600 a cow, not yet. But Dr Rus is working to get the price of the hardware down to the $100 that farmers will pay. Meanwhile Dr Anderson is about to start investigating how many cows actually need to be fitted with Ear-a-rounds to control an entire herd. He hopes that, by identifying a herd’s leaders and equipping them alone, this number can be reduced to a handful."
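A rough back-of-the-envelope version of that comparison, as a Python sketch. It uses only the prices quoted in the excerpts ($20,000 per kilometre for fence materials, $600 per collar today, $100 as the target) and ignores everything else, such as fence repairs, collar maintenance, and equipment lifetimes. The herd size and leader count in the usage lines are hypothetical.

```python
# Prices quoted in the article excerpts.
FENCE_COST_PER_KM = 20_000   # fence materials, "more than $20,000 a kilometre"
COLLAR_COST_NOW = 600        # current price per Ear-a-round
COLLAR_COST_TARGET = 100     # price farmers are said to be willing to pay


def break_even_collars(fence_km, collar_cost):
    """Most collars you can buy for no more than the fence materials cost."""
    return int(fence_km * FENCE_COST_PER_KM // collar_cost)


def collar_bill(cows_fitted, collar_cost):
    """Total hardware cost of fitting the given number of cows."""
    return cows_fitted * collar_cost


print(break_even_collars(1, COLLAR_COST_NOW))     # collars per km of fence today
print(break_even_collars(1, COLLAR_COST_TARGET))  # collars per km at the target price

# Hypothetical herd: 100 cows, of which 5 are "leaders".
print(collar_bill(100, COLLAR_COST_NOW))  # every cow fitted
print(collar_bill(5, COLLAR_COST_NOW))    # leaders only
```

The last two lines show why the herd-leader research matters: fitting a handful of animals instead of the whole herd shrinks the bill far more than the hoped-for drop in hardware price does on its own.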

If this piques your interest in event processing, go here. On the other hand, if you've got "Don't Fence Me In" on the brain, here's Bing Crosby and The Andrews Sisters.

Oh, give me land, lots of land under starry skies above,
Don't fence me in.
Let me ride through the wide open country that I love,
Don't fence me in.
Let me be by myself in the evenin' breeze,
And listen to the murmur of the cottonwood trees,
Send me off forever but I ask you please,
Don't fence me in...

October 15, 2008

A small action on Blog Action Day 2008 - Poverty

As I considered what to write for today’s Blog Action Day on poverty, I couldn’t help but think of how fortunate I am to be contemplating ‘poverty as an issue’ from behind a keyboard rather than as a life circumstance. Going down that path for a minute, I’m grateful my grandparents had the foresight and fortitude to reach Ellis Island and that my parents instilled in me the importance of education, working hard and living within my means (savings good, debt bad). Essentially, my family provided me the circumstances and tools to reach towards success.

At this point, it became clear to me what my Blog Action Day ‘action’ should be: give others an opportunity to create their own success. This led me straight to Kiva.org, a micro-lending site that brokers loans between individuals and (working poor) entrepreneurs in the developing world.

I just made my first loans, one for bigger fishing nets and another to transport a child to school. Check it out, and do what makes sense for you.


Accountability

  • The ideas and opinions expressed in this blog are my own.
