Article: Psychic Search: A quick primer on search suggestions

Author: John Ferrara

The article explains the background of search suggest functions and gives a good insight into the challenges involved in designing such a feature.

Predicting what a user wants to find is actually pretty easy, because probability is on your side. If you take a list of the most commonly submitted searches and chart them by their popularity, you get a shape that looks a lot like this:

[Figure: the most commonly submitted searches, charted by popularity]

The really important lesson here is that without knowing anything about a random user, it’s possible to know something about what they’re likely to search for. If they provide even just a little bit of additional information, such as a few characters in the search box, the odds narrow so dramatically that it’s overwhelmingly likely that the search engine can accurately guess what they’re trying to find.

To get this effect to really work, the function needs to return suggestions matching the character string the user has entered, sorted by popularity. This is almost always better than sorting the suggestions in some other way (e.g., alphabetically) because it stacks the deck in your favor.
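
As a rough illustration of prefix matching sorted by popularity, here is a minimal Python sketch. The function name `suggest`, the `query_counts` data, and the numbers in it are hypothetical and do not come from the article; a real system would take its counts from its own search logs.

```python
from collections import Counter


def suggest(prefix, query_counts, limit=5):
    """Return up to `limit` logged queries that start with `prefix`, most popular first."""
    prefix = prefix.strip().lower()
    if not prefix:
        return []
    matches = [(query, count) for query, count in query_counts.items()
               if query.startswith(prefix)]
    # Sort by descending count; break ties alphabetically so the order is stable.
    matches.sort(key=lambda item: (-item[1], item[0]))
    return [query for query, _ in matches[:limit]]


# Hypothetical search-log counts for a college website (keys are lowercase).
query_counts = Counter({
    "campus map": 950,
    "calendar": 720,
    "careers": 310,
    "cafeteria menu": 120,
    "library hours": 640,
})

print(suggest("ca", query_counts, limit=3))
# -> ['campus map', 'calendar', 'careers']
```

An alphabetical sort would have put "cafeteria menu" first for the same two characters, even though it is the least requested of the four matches.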

 

There are three principal ways suggest functions can work. Which one should be used depends upon the nature of the information that users are searching.
Exploratory
An exploratory function works best when many of the things users are trying to find have no official name. In these cases, people enter keywords that approximate the idea they have in their heads. For example, users of a college website who want to find a map of the buildings might search for:

  • campus map
  • building locations
  • directions to buildings
  • places on campus
  • finding your way around

Given the enormous number of other things people could be searching for on a college website, there is an infinity of possible phrases.

It’s impossible to work with a list of potential searches that’s infinitely long, so it has to be cut off somewhere. Fortunately, the magic of probability makes it possible to cut the list fairly short and still provide the vast majority of users with good suggestions.
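
One possible way to choose the cutoff, sketched here under assumptions of my own, is to keep the most popular queries until they account for a chosen share of all logged searches. The helper `truncate_by_coverage`, the 90% threshold, and the sample counts are illustrative and are not taken from the article.

```python
def truncate_by_coverage(query_counts, coverage=0.9):
    """Keep the most popular queries until they cover `coverage` of all searches."""
    total = sum(query_counts.values())
    if total == 0:
        return []
    kept, running = [], 0
    ranked = sorted(query_counts.items(), key=lambda item: (-item[1], item[0]))
    for query, count in ranked:
        # Stop as soon as the queries kept so far reach the target share.
        if running >= coverage * total:
            break
        kept.append(query)
        running += count
    return kept


# Hypothetical counts: a few popular queries and a tail of rare ones.
counts = {
    "campus map": 500,
    "library hours": 300,
    "tuition": 120,
    "parking permit": 50,
    "places on campus": 20,
    "finding your way around": 10,
}

print(truncate_by_coverage(counts, coverage=0.9))
# -> ['campus map', 'library hours', 'tuition']
```

In this made-up data, half of the distinct queries are enough to cover more than 90% of the searches.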

The suggestion list needs to be scrubbed to remove multiple word forms (e.g., singular or plural), misspellings, closely related phrasings, and other common problems.
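
The scrubbing step could look roughly like the following sketch. It folds case, known misspellings, and simple plurals into one entry, adds up their counts, and keeps the most popular original phrasing for display. The `MISSPELLINGS` table, the naive plural rule, and the sample data are all assumptions made for illustration; closely related phrasings (such as "campus map" versus "building locations") would still need manual or more sophisticated handling.

```python
import re
from collections import Counter

# Hypothetical hand-maintained map of known misspellings to corrections.
MISSPELLINGS = {"campas": "campus", "libary": "library"}


def normalize(query):
    """Build a grouping key: lowercase, fix known misspellings, fold simple plurals."""
    cleaned = []
    for word in re.findall(r"[a-z0-9]+", query.lower()):
        word = MISSPELLINGS.get(word, word)
        # Naive plural folding ("maps" -> "map"); skips short words and "ss"/"us" endings.
        if len(word) > 3 and word.endswith("s") and not word.endswith(("ss", "us")):
            word = word[:-1]
        cleaned.append(word)
    return " ".join(cleaned)


def scrub(query_counts):
    """Merge variant queries, summing counts and keeping the most popular form."""
    totals = Counter()
    best_form = {}
    for query, count in query_counts.items():
        key = normalize(query)
        totals[key] += count
        if key not in best_form or count > query_counts[best_form[key]]:
            best_form[key] = query
    return {best_form[key]: total for key, total in totals.items()}


raw = {
    "campus map": 400,
    "campus maps": 150,
    "Campas map": 30,
    "library hours": 200,
    "libary hours": 25,
}

print(scrub(raw))
# -> {'campus map': 580, 'library hours': 225}
```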

 
Known item
For other searches, everything the user might try to find has a specific name. This is the case, for example, with websites that are principally product catalogs, such as Apple or Amazon.

For such known-item searches, truncated lists don’t work because the absence of an item implies that it’s not available. Since the list needs to be comprehensive, it can be very long indeed (just think of every product that Amazon sells), so it often can’t all be stored on the client side. Instead, it needs to be retrieved from the server in real time.
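
A server-side lookup over a comprehensive list might be sketched as follows. The `CatalogIndex` class, its method names, and the popularity numbers are hypothetical; the idea is only that a sorted list of names lets the server find every item matching a prefix with a binary search instead of scanning the whole catalog, and how the client calls the server is left out.

```python
from bisect import bisect_left


class CatalogIndex:
    """Sorted index of every item name, so no available item is ever left out."""

    def __init__(self, popularity_by_name):
        # Sort by lowercased name so all names sharing a prefix sit next to each other.
        self._entries = sorted((name.lower(), name) for name in popularity_by_name)
        self._keys = [key for key, _ in self._entries]
        self._popularity = dict(popularity_by_name)

    def lookup(self, prefix, limit=10):
        """Return up to `limit` item names starting with `prefix`, most popular first."""
        prefix = prefix.strip().lower()
        if not prefix:
            return []
        # Binary search for the first key that is >= the prefix.
        index = bisect_left(self._keys, prefix)
        matches = []
        while index < len(self._keys) and self._keys[index].startswith(prefix):
            matches.append(self._entries[index][1])
            index += 1
        matches.sort(key=lambda name: (-self._popularity[name], name))
        return matches[:limit]


# Hypothetical catalog with made-up popularity scores.
catalog = CatalogIndex({
    "iPad": 900,
    "iPad mini": 400,
    "iPhone": 1500,
    "iMac": 300,
    "MacBook Air": 800,
})

print(catalog.lookup("ip"))
# -> ['iPhone', 'iPad', 'iPad mini']
```

Because the index holds the full catalog, an empty result for a prefix really does mean that no item with that name exists.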

 
Historical
These are searches that the user has submitted in the past. People are likely to search for something that they’ve looked for in the past, like a particular destination in a mapping application. A system can make itself much more personally relevant when it retains a memory of the things that a user has done before, and then makes it easier for the user to do them again.
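
A historical suggest function could be combined with popularity-based suggestions along these lines. This is a sketch under my own assumptions: the function `suggest_with_history`, the choice to list the user's own matching searches first (most recent first), and the sample data are illustrative and are not described in the article.

```python
def suggest_with_history(prefix, user_history, global_counts, limit=5):
    """Suggest the user's own past searches first, then globally popular ones.

    `user_history` is ordered oldest to newest; `global_counts` has lowercase keys.
    """
    prefix = prefix.strip().lower()
    if not prefix:
        return []
    seen = set()
    results = []
    # Walk the history backwards so the most recent matching search comes first.
    for query in reversed(user_history):
        query = query.lower()
        if query.startswith(prefix) and query not in seen:
            seen.add(query)
            results.append(query)
    # Fill the remaining slots with popular queries the user has not already seen.
    ranked = sorted(global_counts.items(), key=lambda item: (-item[1], item[0]))
    for query, _ in ranked:
        if query.startswith(prefix) and query not in seen:
            seen.add(query)
            results.append(query)
    return results[:limit]


# Hypothetical data for a mapping application.
history = [
    "1600 pennsylvania ave",
    "springfield library",
    "1600 pennsylvania ave",
    "spring street cafe",
]
popular = {"springfield": 900, "spring valley": 400, "springfield library": 150}

print(suggest_with_history("spr", history, popular))
# -> ['spring street cafe', 'springfield library', 'springfield', 'spring valley']
```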