Saturday, June 14, 2014

Amazing Google Trends and Correlate

Google Trends (www.google.com/trends) is a great tool for exploring trends in user interest in virtually anything under the sun. It mines Google's collection of all search queries from 2004 onwards, which is a mind-boggling number: assuming 5 billion queries a day (the current rate), the total over the last ten years would be around 5 billion * 3,660 days = 18.3 trillion, i.e. about 18,300,000,000,000 search queries! From this, Google Trends identifies user interest across time (2004 onwards) and space (by region, city, etc.).
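For anyone who wants to check the back-of-the-envelope number, here is a tiny Python sketch of the same calculation. The constant names are my own, and the flat 5-billion-a-day rate is only the assumption made above; search volume was much lower in the early years, so the real total is likely smaller.

```python
# Back-of-the-envelope estimate of total searches over roughly ten years.
QUERIES_PER_DAY = 5_000_000_000  # assumed flat rate (the current rate quoted in the post)
DAYS = 3660                      # roughly ten years, the figure used in the post

total_queries = QUERIES_PER_DAY * DAYS
print(f"{total_queries:,} queries (~{total_queries / 1e12:.1f} trillion)")
```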

Until recently, in spite of having this mountain of data at its disposal, Google Trends was still quite dumb at understanding the context of search terms. In simple terms, when I type in "Amazon", I could be interested in any number of things: the Amazon river, the company amazon.com, or the Amazon rainforest. To Google Trends it made no difference. To it, "Amazon" was just a term with no meaning, so it aggregated every query that contained the term, irrespective of relevance. It was left to the user to communicate the desired context by excluding terms unrelated to it. For example, to look at interest in the company amazon.com, I would have typed something like "amazon.com -rainforest -river", i.e. all search queries that contain the term "amazon.com" but not "rainforest" or "river". This was a workaround, not a foolproof method.
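To see why the exclusion trick is not foolproof, here is a toy Python sketch of that kind of include/exclude matching. It is only an illustration of the idea, not how Google Trends works internally; the function name and the sample queries are made up.

```python
def matches(query, include, exclude):
    """Return True if the query contains `include` and none of the `exclude` terms."""
    q = query.lower()
    return include in q and not any(term in q for term in exclude)

# Hypothetical sample queries, invented for illustration only.
queries = [
    "amazon.com books",
    "amazon.com rainforest documentary dvd",  # about the company, but mentions "rainforest"
    "amazon river length",
    "amazon kindle price",                    # about the company, but lacks ".com"
]

# Same idea as typing: amazon.com -rainforest -river
kept = [q for q in queries if matches(q, "amazon.com", ["rainforest", "river"])]
print(kept)
```

Only the first query survives. The DVD shopper is thrown out because of the word "rainforest", and the Kindle shopper is never counted because the query does not contain "amazon.com", so plain term matching both over-excludes and under-includes.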

That was until Google launched the new version of Trends. In the new version, Google has been able to "teach" Trends what search terms mean. Using its treasure trove of search queries, Google has categorised its search query data into people, places, and things. This is no small feat and is a shining example of what can be achieved with "big data" analysis.

So now when I type "Amazon" into Trends, it understands that it could be the name of a retail company, a rainforest, a river, or a fictional character, or just a plain search term. It offers all of those options for the user to choose from.

Each identified context for a search term is organised as a topic by Trends. All search queries that relate to a particular context are associated with a topic of the same name. Now when I want to see the trends for Amazon the company, I can simply select the associated topic and be confident that the results reflect interest in amazon.com alone, unadulterated by any other context. Google Trends claims to have more than 700,000 topics in its catalogue, with more being added every day.

This stuff is really cool. It is amazing how Google can categorise information about the world by analysing its search query database. What is stopping it from going further and developing a full-fledged classification of all the people, places, and things that exist or have ever existed? If Google has raised the bar 10x by giving Trends some intelligence, it has raised it 100x with Google Correlate, which in tandem with Trends puts awesome power in your hands. Using Google Correlate, you can look for correlations (and causations, if you are lucky) related to the trends that you see in time and space.
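To make "correlation" a bit more concrete, here is a minimal Python sketch that computes a Pearson correlation coefficient between two series of search interest. I am assuming Pearson correlation as the measure purely for illustration; the function and the numbers are made up and are not taken from Trends or Correlate.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of the same length (at least 2 points)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (spread_x * spread_y)

# Hypothetical weekly interest values for three search terms (invented numbers).
term_a = [10, 20, 30, 40, 50]
term_b = [12, 22, 29, 41, 52]   # rises along with term_a
term_c = [50, 40, 30, 20, 10]   # falls as term_a rises

print(round(pearson(term_a, term_b), 3))
print(round(pearson(term_a, term_c), 3))
```

A value near +1 means the two series rise and fall together, and a value near -1 means one rises as the other falls. Either way it says nothing by itself about which one, if any, is causing the other.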

I really enjoy playing with Trends by just typing in terms that interest me and seeing the trends across time and space. For example, I was interested in how stress levels have changed over time (2004 onwards) and how they compare between countries. So I typed in two terms, "stress" and "anxiety", which I thought were good proxies for general stress as we understand it, and then added two locations, the United States and the UK, to bring up the following graph.
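
[Graph: Google Trends interest over time for "stress" and "anxiety", United States vs UK, 2004 onwards]
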
It is interesting to note the downward trend until around 2008 and the upward trend from then on. Hmm, maybe the impact of the 2008 financial crisis has something to do with it. The UK seems to be on a higher trajectory than the United States.

Breakdown by the regions where the relative popularity of the search queries is highest

The possibilities are endless. You can estimate the relative market share of, say, cell phone makers by using search queries as a proxy, and drill down by region, city, and time (last 30, 60, 90, ... days). Or you can look at patterns of diseases in space and time. So take it out for a spin, and you might find a gold nugget in there that sends you off in a new direction in your work, or life.


