In a blog post this week, Google VP of Search Pandu Nayak outlined a new approach to spelling correction that combines advances in deep learning with natural language models that understand context.
It is a big step forward for Google, and builds on the natural language processing advances introduced in the BERT update.
How Google used to approach spelling mistakes
Google says spelling remains “an ongoing challenge of language understanding,” with one in 10 search queries misspelled every day and new words (and new misspellings of them) constantly being introduced.
This makes it an important issue for Google, as it has to know what the user is looking for (with correct spelling) before starting to look for relevant results.
Google tackled this challenge in a number of ways.
Conceptual mistakes, where there are different accepted spellings, or the user is unsure of how to spell the word and makes a best guess, bring up the “search instead for” solution.
Mistypings prompt the “did you mean” result.
Google says mistakes are common. For example, there are over 10,000 variations of queries like “YouTube,” such as “ytoube,” “7outub,” “yoitubd” and “tourube.”
Google approached the issue of previously unseen misspellings by learning from keyboard adjacency. So if you hit the wrong key, Google would look to nearby keys to see which was the most likely one you were aiming for. This general concept was applied to all new misspellings, going through nearby letter replacements until a popular replacement term was found. Google says beyond slip-of-finger mistakes, it also effectively corrected all kinds of spelling errors, including conceptual mistakes.
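The keyboard-adjacency idea can be illustrated with a short sketch. This is not Google's implementation: the simplified grid layout, the `POPULARITY` table and its counts, and the function names are all assumptions made up for illustration. It swaps each letter for its keyboard neighbours until a known popular term turns up.

```python
# Simplified QWERTY layout treated as a plain grid (an assumption; real
# keyboards stagger their rows, so true neighbours differ slightly).
ROWS = ["1234567890", "qwertyuiop", "asdfghjkl", "zxcvbnm"]


def build_adjacency(rows):
    """Map each key to the set of keys next to it on the grid."""
    adjacent = {}
    for r, row in enumerate(rows):
        for c, key in enumerate(row):
            near = set()
            for dr in (-1, 0, 1):
                rr = r + dr
                if not 0 <= rr < len(rows):
                    continue
                for dc in (-1, 0, 1):
                    cc = c + dc
                    if (dr, dc) != (0, 0) and 0 <= cc < len(rows[rr]):
                        near.add(rows[rr][cc])
            adjacent[key] = near
    return adjacent


ADJACENT = build_adjacency(ROWS)

# Hypothetical popularity counts for known queries (invented numbers).
POPULARITY = {"youtube": 1_000_000, "google": 900_000}


def correct(query, popularity, max_substitutions=2):
    """Replace letters with nearby keys until a popular term is found."""
    if query in popularity:
        return query
    frontier = {query}
    for _ in range(max_substitutions):
        next_frontier = set()
        for word in frontier:
            for i, ch in enumerate(word):
                for repl in ADJACENT.get(ch, ()):
                    next_frontier.add(word[:i] + repl + word[i + 1:])
        known = [w for w in next_frontier if w in popularity]
        if known:
            # Prefer the most popular known term at this distance.
            return max(known, key=popularity.get)
        frontier = next_frontier
    return query  # No popular replacement found; leave the query alone.


print(correct("yoitubd", POPULARITY))
print(correct("tourube", POPULARITY))
```

The sketch only handles wrong-key substitutions. A misspelling such as “ytoube,” where letters are swapped, would need extra edit types (transpositions, insertions, deletions) that are left out here to keep the example short.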
Using deep learning
Advances in deep learning have given Google a better way to understand spelling. It announced a new spelling algorithm last year using a deep neural net that better models and learns from less-common and unique spelling mistakes.
“This advancement enables us to run a model with more than 680 million parameters in under two milliseconds — a very large model that works faster than the flap of a hummingbird’s wings — so people can search uninterrupted by their own spelling errors.”
Context is the key to understanding what a user is looking for, whatever kind of spelling mistake they make, even if the misspelling has never been seen before.
“Our natural language understanding models look at a search in context, like the relationship that words and letters within the query have to each other. Our systems start by deciphering or trying to understand your entire search query first. From there, we generate the best replacements for the misspelled words in the query based on our overall understanding of what you’re looking for.”
Google gives the example of the query “average home coast,” where it can tell from the other words in the search term that the searcher is probably looking for “average home cost.”
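To see why context matters here, note that “coast” is a correctly spelled word, so a checker that looks at words in isolation has nothing to fix. The toy sketch below stands in for the idea with simple word-pair counts. It is not Google's neural model, and the vocabulary, the `BIGRAMS` counts and the function names are all invented for illustration.

```python
# Hypothetical vocabulary and word-pair counts (invented numbers).
VOCAB = {"average", "home", "cost", "coast", "gold"}
BIGRAMS = {
    ("average", "home"): 700,
    ("home", "cost"): 900,
    ("home", "coast"): 5,
    ("gold", "coast"): 800,
    ("gold", "cost"): 40,
}


def deletions(word):
    """All strings formed by removing exactly one letter from word."""
    return {word[:i] + word[i + 1:] for i in range(len(word))}


def candidates(word):
    """The word itself plus vocabulary words one inserted or deleted letter away."""
    near = {v for v in VOCAB if v in deletions(word) or word in deletions(v)}
    return {word} | near


def score(words):
    """Sum the counts of each adjacent word pair; unseen pairs count as 0."""
    return sum(BIGRAMS.get(pair, 0) for pair in zip(words, words[1:]))


def correct_in_context(query):
    """Keep the replacement that makes the whole query most plausible."""
    best = query.split()
    for i, word in enumerate(query.split()):
        for cand in candidates(word):
            trial = best[:i] + [cand] + best[i + 1:]
            if score(trial) > score(best):
                best = trial
    return " ".join(best)


print(correct_in_context("average home coast"))
print(correct_in_context("gold coast"))
```

With these made-up counts, “coast” is replaced after “home” but left alone after “gold,” because the decision depends on the surrounding words rather than on the word itself.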
Google's BERT was considered a massive leap forward in search because, by understanding context, the search engine is better able to interpret search intent. The application to misspellings is another way in which Google is refining its ability to capture search intent, and we can expect more applications down the line.