time required: 15 minutes
Ever wonder how ideas travel from country to country? Do they move to neighboring countries, or "hop" between distant countries sharing the same language? Search Trends are a pretty good indicator of what people living in a certain area are interested in. By analysing the top-ranked search terms for a given country, we can probably find answers to these questions and more!
Google provides a public BigQuery dataset of popular search terms which can be registered as an external repository and queried with Splitgraph.
Read on to find out how! All you'll need is a Splitgraph account (sign up for free if you don't have one yet) and a GCP account (the generous free tier will be more than enough).
There are many ways to add data to Splitgraph:
CREATE TABLE
statement.In this guide, we'll use the last option.
Let's start by enabling access to the search trends dataset. Running the example query in BigQuery is an easy way to verify that the GCP project you are using has access.
To authenticate with BigQuery, Splitgraph requires a Service Account with the
BigQuery Admin
role. You can create one on the
IAM and admin dashboard. Once
the Service Account is ready, you need to generate a JSON-format private key on
the
Service Accounts
page.
Splitgraph does not yet support reading datasets which belong to a different GCP
project than the Service Account used for authentication. The public datasets
belong to the bigquery-public-data
project, but you can work around this
limitation by creating a BigQuery dataset within your own project called
search_trends
and adding a latest_international_top_terms
view defined by
the following query:
SELECT
country_name,
week,
term,
MIN(rank) as rank
FROM
`bigquery-public-data.google_trends.international_top_terms`
WHERE
refresh_date =
(SELECT
MAX(refresh_date)
FROM
`bigquery-public-data.google_trends.international_top_terms`)
GROUP BY
country_name, week, term
The next step is to connect the external repository in Splitgraph.
After clicking on the link above, you should be greeted by the following form:
Fill out the form with the following:
field | value |
---|---|
Credential name | bq |
GCP credentials | [the contents of the downloaded JSON service account key] |
GCP project name | [the name of your GCP project] |
Big Query dataset | search_trends |
Once the form is filled out, click Continue and wait a few seconds for
Splitgraph to read the BigQuery view. A few preview rows are displayed. I named
the repository google-search-trends
, but you can choose any name, just
remember to update the qualified table names in the queries throughout the
remainder of this guide. Click on Create repository and load data
immediately.
Within a few seconds, the tables should be ready for querying in Splitgraph!
Note that Splitgraph also supports live querying BigQuery instead of importing data during repository creation. To connect the repository this way, click on the chevron to the right of the Create repository and load data immediately button, and from the resulting dropdown menu, select Create repository without loading data.
If you receive a 404 "not found" error, simply reloading the page will solve the problem.
The introduction contains the question,
Do [ideas] move to neighboring countries, or "hop" between distant countries sharing the same language?
If we consider search terms as a proxies for ideas, then an idea "moving to
another country" means it's rank diminishes in the source and grows in the
destination country. With ranking, the lower the number the more popular the
search term, so the rank
field would actually increase in value in the source
and decrease the destination during the "move".
Splitgraph co-founder Artjoms Iškovs came up with the following query:
The migration_strength
fields is the sum of ranking change in the source and
destination country. A high value indicates that interest in the search term
significantly plummeted at the source the same week it skyrocketed in the target
country. The query above returns pairs of countries where this occurred most
frequently.
The top 10 results:
source | destination | neighboring | same language |
---|---|---|---|
Egypt | Saudi Arabia | yes | yes |
Switzerland | Austria | yes | yes |
South Africa | Australia | no | yes1 |
Malaysia | Philippines | yes | yes1 |
United Kingdom | Nigeria | no | yes |
Austria | Germany | yes | yes |
France | Belgium | yes | yes |
Mexico | Colombia | no | yes |
Denmark | Norway | yes | no |
Canada | Denmark | no | no |
So it seems that common language is more important than being neighbors.
How can we be sure that the query above actually represents the migration of ideas (as recorded by search terms)? We can look at some of these, for example to see which search terms migrated from the UK to Nigeria we could run:
The result is a single term: Joe Aribo, a football player who relocated from the UK to Nigeria.
Unfortunately, not all of the search terms that migrate between countries were obvious. Why was Howard Carpendale becoming less popular in Austria and more so in Germany this May? I have absolutely no idea!
Are there terms going in the opposite direction, from Germany to Austria? Yes, the same 4 terms, but in a different order. The search term Galatasaray (the name of a Turkish football club) was in decline in Germany and increasing in Austria 94 times in the last 5 years. It only moved in the opposite direction once.
These search terms are a gold mine of unexpected insights into what captures nations' attention at specific times. Querying this data in Splitgraph was technically easy, but the underlying reasons for these trends are challenging to understand!
Have you figured out something surprising looking at the search terms repository? Let us know!