BUAD 307





04--Collaborative filtering

PowerPoint Narration

Collaborative filtering is a process for identifying products that an individual may be interested in buying, based on how that individual's purchases overlap with those of other customers.  Products “discovered” by some customers can thus be recommended to another customer who is statistically likely to find them of interest.  For example, two psychologists, Drs. Jonathan Kellerman and Stephen White, both write murder mystery novels in which the protagonist is a psychologist who helps police find the killers.  Very likely, individuals who have read several novels by one of the authors would find those by the other of interest.  In a conventional bookstore, novels are typically arranged in alphabetical order, making this similarity difficult to detect.  Online vendors such as Amazon.com, however, can rely on “brute force” computations to identify overlaps in customer purchases.  If even 10% of customers who have bought novels by Jonathan Kellerman have also bought books by Stephen White, this is likely to show up as a strong link, triggering a recommendation to individuals who have bought several books by one of the authors.  Note that a similar situation exists in the area of music:  What exactly makes two artists “similar” to the extent that they may have similar potential fans?  The overlap could be driven by “sound” (although it may be difficult to describe the “sound” of different artists concretely), by lyrics, or by other factors that are difficult to catalog.  Overlap in purchases, however, will reveal such similarities without anyone having to describe them.
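The overlap calculation described above can be sketched in a few lines of Python.  This is only an illustration under stated assumptions:  the customer IDs and purchase counts are made up, the function names are hypothetical, and the 10% threshold and the "several books" cutoff (taken here as two or more) are simply the figures from the example, not values any vendor is known to use.

```python
# Made-up data: customer -> number of books bought from each author.
purchases = {
    "c1": {"Kellerman": 3, "White": 2},
    "c2": {"Kellerman": 4},
    "c3": {"Kellerman": 1, "White": 1},
    "c4": {"White": 5},
    "c5": {"Kellerman": 2},
}


def overlap_share(purchases, author_a, author_b):
    """Share of author_a's buyers who have also bought author_b."""
    buyers_a = [c for c, books in purchases.items() if author_a in books]
    if not buyers_a:
        return 0.0
    both = sum(1 for c in buyers_a if author_b in purchases[c])
    return both / len(buyers_a)


def recommend_to(purchases, author_a, author_b, min_share=0.10, min_books=2):
    """Customers with at least min_books by author_a and none by author_b,
    returned only if the overlap between the two authors is strong enough."""
    if overlap_share(purchases, author_a, author_b) < min_share:
        return []
    return [
        c for c, books in purchases.items()
        if books.get(author_a, 0) >= min_books and author_b not in books
    ]


print(overlap_share(purchases, "Kellerman", "White"))  # 0.5
print(recommend_to(purchases, "Kellerman", "White"))   # ['c2', 'c5']
```

Note that nothing in the sketch describes what the two authors have in common.  The link comes entirely from counting who bought what.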

        Collaborative filtering relies on a considerable amount of available information to make high-quality recommendations.  Thus, one would expect the quality of recommendations to improve as an individual accumulates a longer purchase record and as more customers are added to the database.  The process works very well for Netflix because a large number of individuals have rented and rated a large number of DVDs over time.  For example, although the movie Hotel Rwanda never attracted much interest at the box office, it has become one of the top ten most frequently rented DVDs at Netflix.  Note that the collaborative filtering system is improved considerably at Netflix because customers actually rate the movies after viewing them.  At Amazon, the system is based mostly on the customer's decision to buy a book or other item rather than on a post-experience evaluation.  It is, however, possible at Amazon to respond to recommendations, either by saying that one already owns the item or that it is not actually of interest.
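To see why ratings add information beyond the purchase itself, consider the following rough sketch.  It predicts how a customer would rate an unseen movie by averaging other customers' ratings, weighting each by how closely that customer's past ratings agree with the target customer's.  The ratings, the customer names, and the particular similarity measure are all assumptions chosen for illustration.  This is not a description of the method Netflix actually uses.

```python
# Made-up ratings on a 1-5 scale: customer -> {movie: rating}.
ratings = {
    "ann": {"Film A": 5, "Film B": 4, "Hotel Rwanda": 5},
    "bob": {"Film A": 1, "Film B": 2, "Hotel Rwanda": 2},
    "cat": {"Film A": 5, "Film B": 4},
}


def similarity(ratings, u, v):
    """1 / (1 + mean absolute rating gap) over movies both have rated.
    Identical tastes give 1.0; no shared movies gives 0.0."""
    shared = ratings[u].keys() & ratings[v].keys()
    if not shared:
        return 0.0
    gap = sum(abs(ratings[u][m] - ratings[v][m]) for m in shared) / len(shared)
    return 1.0 / (1.0 + gap)


def predict(ratings, user, movie):
    """Similarity-weighted average of other customers' ratings of the movie.
    Returns None if no comparable customer has rated it."""
    num = den = 0.0
    for other, theirs in ratings.items():
        if other == user or movie not in theirs:
            continue
        w = similarity(ratings, user, other)
        num += w * theirs[movie]
        den += w
    return num / den if den > 0 else None


# cat agrees with ann far more than with bob, so ann's rating dominates.
print(predict(ratings, "cat", "Hotel Rwanda"))
```

With purchase data alone, ann and bob would look identical, since both "bought" all three films.  The ratings are what reveal that only ann shares cat's tastes.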

        Amazon.com also uses a similar method to identify other products whose sales significantly overlap with those of the item currently displayed.  Several products will be displayed with a message something like “People who bought ____ also bought.”  The actual algorithm used to identify these overlaps may be more complex than this, but here the focus is basically on the correlations of this one item with others rather than on the entire purchase history of an individual.  Thus, the recommendations are likely to be somewhat less tailored to the individual customer.  This, however, may be an advantage when the customer starts exploring a new interest.  For example, if an individual suddenly takes up gardening, his or her past purchases may give no clue to this, so books whose sales overlap with the one being browsed may be of greater interest.
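A bare-bones version of the “also bought” idea might look like the sketch below.  It looks only at the item being displayed and ignores the browsing customer's own history, which is the point made above.  The baskets and titles are invented and the function name is hypothetical.  As noted, the algorithm Amazon actually uses may be considerably more complex.

```python
from collections import Counter

# Made-up baskets: customer -> set of items bought.
baskets = {
    "c1": {"Gardening Basics", "Rose Care", "Mystery Novel"},
    "c2": {"Gardening Basics", "Rose Care"},
    "c3": {"Gardening Basics", "Composting"},
    "c4": {"Mystery Novel", "Thriller"},
}


def also_bought(baskets, item, top_n=3):
    """Items most often bought by customers who bought `item`,
    as (item, number of shared buyers) pairs, most frequent first."""
    counts = Counter()
    for items in baskets.values():
        if item in items:
            counts.update(items - {item})
    # Sort by count (descending), then by name so ties come out in a fixed order.
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return ranked[:top_n]


print(also_bought(baskets, "Gardening Basics", top_n=2))
# [('Rose Care', 2), ('Composting', 1)]
```

A customer whose history consists entirely of mystery novels would still see the gardening titles here, because the lookup starts from the book on the screen rather than from the customer.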

