Recommendation Systems: An Introduction

Recommendation systems are on everyone's resumes. We have all seen countless YouTube tutorials on clichéd movie or book recommendation systems. But how do you move past them? How are recommendation systems built from scratch, and how can you build one for something less conventional, like food or content?

This blog series is meant to help Python and machine learning students understand the core logic and components of recommendation engines.

Recommendation engines are a type of filtering system that suggests relevant items to users.

Recommendation is one of the most popular use cases of data science. Powerful recommender systems are critical to today's consumers: they make effective personalization possible, improving the customer experience when searching for products on e-commerce websites or streaming content on various social media apps.

Classification

There are several types of recommendation systems, classified by the source information available for building the algorithm.

Collaborative Filtering

Collaborative filtering takes into account past user ratings, preferences, and behaviors to create a subset consisting of similar users.

This concept of similarity is used to recommend products, and the approach is further classified as user-based or item-based. The idea that "similar users buy similar items" is known as user-based filtering, while the idea that "two products that receive similar ratings from the same users are similar" is item-based filtering.

Collaborative Filtering is community-driven, meaning that you need large amounts of data about users. It does not take into account the qualities of the product that the system is recommending. The intuition about the product is built from the community.
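To make the user-based and item-based ideas concrete, here is a minimal sketch that measures both kinds of similarity on the same data. The ratings, user names, and dishes are hypothetical, and cosine similarity is just one common choice of similarity measure, assumed here for illustration. Notice that the code only ever looks at ratings, never at what the items actually are.

```python
from math import sqrt

# Hypothetical toy ratings: user -> {item: rating}.
ratings = {
    "alice": {"pasta": 5, "sushi": 4, "tacos": 1},
    "bob":   {"pasta": 4, "sushi": 5, "tacos": 2},
    "carol": {"pasta": 1, "sushi": 2, "tacos": 5},
}

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts.

    Keys missing from a dict are treated as zeros.
    """
    shared = set(a) & set(b)
    dot = sum(a[k] * b[k] for k in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def item_vectors(ratings):
    """Transpose user -> item ratings into item -> user ratings."""
    items = {}
    for user, user_ratings in ratings.items():
        for item, score in user_ratings.items():
            items.setdefault(item, {})[user] = score
    return items

# User-based view: compare users by the ratings they gave.
user_sim_ab = cosine(ratings["alice"], ratings["bob"])
user_sim_ac = cosine(ratings["alice"], ratings["carol"])

# Item-based view: compare items by the ratings they received.
by_item = item_vectors(ratings)
item_sim_pasta_sushi = cosine(by_item["pasta"], by_item["sushi"])
item_sim_pasta_tacos = cosine(by_item["pasta"], by_item["tacos"])

print(f"alice~bob: {user_sim_ab:.2f}, alice~carol: {user_sim_ac:.2f}")
print(f"pasta~sushi: {item_sim_pasta_sushi:.2f}, pasta~tacos: {item_sim_pasta_tacos:.2f}")
```

In this toy data, alice comes out more similar to bob than to carol, and pasta comes out more similar to sushi than to tacos. The only difference between the two views is whether the rows being compared are users or items.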

This system is nuanced and complex, and many applications cannot use it at the outset because of the cold start problem. The cold start problem refers to the lack of historical data about user preferences, as is the case on a new platform. To work around this, you can use public datasets to begin creating your algorithm or use the content-based approach.

The "frequently bought together" section on Amazon is an example of item-based collaborative filtering, since it is generated from the historical purchasing behavior of other customers. When many users buy two or more products in the same transaction, the system infers a relationship and presents those items as frequently bought together.
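The co-purchase idea can be sketched by counting how often pairs of items show up in the same transaction. This is only an illustration of the principle, not a description of how Amazon actually implements the feature; the transactions and the function names `co_purchase_counts` and `frequently_bought_with` are made up for this example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions: each one is the set of items in a single order.
transactions = [
    {"phone", "case", "charger"},
    {"phone", "case"},
    {"phone", "screen protector"},
    {"case", "charger"},
    {"laptop", "mouse"},
]

def co_purchase_counts(transactions):
    """Count how many transactions contain each unordered pair of items."""
    counts = Counter()
    for basket in transactions:
        # Sorting gives every pair one canonical (a, b) ordering.
        for a, b in combinations(sorted(basket), 2):
            counts[(a, b)] += 1
    return counts

def frequently_bought_with(item, counts, top_n=2):
    """Return the items most often bought in the same transaction as `item`."""
    partners = Counter()
    for (a, b), n in counts.items():
        if a == item:
            partners[b] += n
        elif b == item:
            partners[a] += n
    # Sort by count (descending), then by name so ties are deterministic.
    ranked = sorted(partners.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:top_n]]

pair_counts = co_purchase_counts(transactions)
suggestions = frequently_bought_with("phone", pair_counts)
print(suggestions)
```

A real system would likely also normalize for item popularity, so that bestsellers do not get paired with everything, but raw counts are enough to show the idea.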

Content-Based Filtering

Also known as cognitive filtering, this engine considers product metadata and the preferences that users give explicitly or implicitly, and it constructs user profiles from this information. The algorithm generates recommendations by matching item attributes, such as genres, to user preferences. It is less susceptible to the cold start problem, because a new item can be recommended from its metadata alone, and it is easy to build, but it may not yield a product that outperforms competitors in personalization.
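As a rough sketch of this approach, the example below builds a user profile from the tags of dishes the user liked and scores unseen dishes against that profile. The dishes, the tags, and the helper names `build_profile`, `score`, and `recommend` are all hypothetical, and cosine similarity over tag counts is an assumed scoring choice.

```python
from collections import Counter
from math import sqrt

# Hypothetical item metadata: item -> set of descriptive tags.
item_tags = {
    "margherita pizza": {"italian", "vegetarian"},
    "penne arrabbiata": {"italian", "spicy"},
    "mushroom risotto": {"italian", "creamy"},
    "green curry": {"thai", "spicy"},
    "sushi platter": {"japanese", "seafood"},
}

def build_profile(liked_items):
    """Build a user profile by counting the tags of the items the user liked."""
    profile = Counter()
    for item in liked_items:
        profile.update(item_tags[item])
    return profile

def score(profile, tags):
    """Cosine similarity between the profile and an item's binary tag vector."""
    dot = sum(profile[tag] for tag in tags)
    norm_profile = sqrt(sum(v * v for v in profile.values()))
    norm_item = sqrt(len(tags))
    if norm_profile == 0 or norm_item == 0:
        return 0.0
    return dot / (norm_profile * norm_item)

def recommend(liked_items, top_n=2):
    """Rank the items the user has not liked yet by similarity to the profile."""
    profile = build_profile(liked_items)
    candidates = [item for item in item_tags if item not in liked_items]
    ranked = sorted(candidates, key=lambda i: (-score(profile, item_tags[i]), i))
    return ranked[:top_n]

recs = recommend(["margherita pizza", "penne arrabbiata"])
print(recs)
```

No other users appear anywhere in this code. That is why a brand-new dish can be recommended as soon as it has tags, and also why the recommendations tend to stay close to what the user already likes.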

Knowledge-Based Recommender

This engine relies on explicit user input for preferences and item characteristics. Its application is pertinent to products lacking historical data. Domain expertise is essential to define and structure explicit knowledge about the item database, involving the creation of rules, taxonomies, or ontologies for relationship capture. Examples include recommending courses aligned with users' career goals, suggesting medical treatments based on patient history, or proposing travel destinations.
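The course example can be sketched as a small rule-based filter. Everything here is hypothetical: the catalog, the goal-to-skills mapping (which stands in for the domain expertise), and the `recommend_courses` function. The point is that recommendations come from explicit user input and curated knowledge, with no rating history involved.

```python
# Hypothetical course catalog with expert-curated attributes.
courses = [
    {"title": "Intro to SQL", "skills": {"sql"}, "level": "beginner", "hours": 10},
    {"title": "Statistics for Data Science", "skills": {"statistics"}, "level": "beginner", "hours": 25},
    {"title": "Deep Learning Specialization", "skills": {"deep learning", "python"}, "level": "advanced", "hours": 80},
    {"title": "Python for Analysts", "skills": {"python"}, "level": "beginner", "hours": 20},
]

# Hypothetical domain knowledge: which skills each career goal requires.
goal_skills = {
    "data analyst": {"sql", "python", "statistics"},
    "ml engineer": {"python", "deep learning"},
}

def recommend_courses(goal, level, max_hours):
    """Filter courses with explicit rules, then order the matches."""
    wanted = goal_skills.get(goal, set())
    matches = [
        course for course in courses
        if course["skills"] & wanted       # teaches at least one wanted skill
        and course["level"] == level       # matches the user's stated level
        and course["hours"] <= max_hours   # fits the user's time budget
    ]
    # Prefer courses covering more wanted skills, then shorter courses.
    matches.sort(key=lambda c: (-len(c["skills"] & wanted), c["hours"]))
    return [course["title"] for course in matches]

picks = recommend_courses("data analyst", "beginner", max_hours=20)
print(picks)
```

In practice the rules and taxonomies are the hard part. The filtering itself stays simple once that knowledge has been written down.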

The approach

Now that we understand the possible data requirements and the kinds of recommendation systems we can build, we need to comprehend the approaches that solve the problem. When recommending products or items to users, you can either predict what kind of rating a user will give to an item they have never used before, or you can sort all the products in such a way that preferred products rank higher and are then suggested to the user.

The prediction method

Definition: The prediction method involves predicting how a user would rate an item they haven't interacted with yet.

Objective: The primary goal is to estimate the user's preference or interest in an item that they haven't explicitly rated or interacted with.

Approach: This problem is often addressed through techniques like collaborative filtering or content-based methods.

Example: Predicting the rating a user might give to a movie they haven't watched based on their historical movie ratings and those of similar users.
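A minimal sketch of the prediction method, assuming a user-based collaborative approach: the predicted rating is a similarity-weighted average of the ratings that other users gave the item. The ratings are made up, and `cosine_on_shared` and `predict_rating` are hypothetical helper names.

```python
from math import sqrt

# Hypothetical ratings: alice has not rated "Movie D" yet.
ratings = {
    "alice": {"Movie A": 5, "Movie B": 1, "Movie C": 4},
    "bob":   {"Movie A": 5, "Movie B": 1, "Movie C": 5, "Movie D": 4},
    "carol": {"Movie A": 1, "Movie B": 5, "Movie C": 2, "Movie D": 1},
}

def cosine_on_shared(a, b):
    """Cosine similarity computed only over the items both users rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[k] * b[k] for k in shared)
    norm_a = sqrt(sum(a[k] ** 2 for k in shared))
    norm_b = sqrt(sum(b[k] ** 2 for k in shared))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def predict_rating(user, item, ratings):
    """Predict a rating as the similarity-weighted mean of other users' ratings.

    Returns None when no other user with positive similarity rated the item.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = cosine_on_shared(ratings[user], other_ratings)
        if sim <= 0:
            continue
        weighted_sum += sim * other_ratings[item]
        weight_total += sim
    return weighted_sum / weight_total if weight_total else None

predicted = predict_rating("alice", "Movie D", ratings)
print(predicted)
```

Because alice's past ratings look more like bob's than carol's, the prediction lands closer to bob's rating of 4 than to carol's rating of 1. Practical systems often subtract each user's mean rating before comparing users, since some people rate everything high or low; that step is left out here to keep the sketch short.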

The sorting method

Definition: The sorting method, also called ranking, involves selecting and presenting the most relevant items to a user from a vast set of possibilities.

Objective: The primary goal is to order or rank items in a way that maximizes the likelihood of the user interacting with or liking the top-ranked items.

Approach: This problem often involves determining the optimal sequence or order in which to display items to users, usually considering predicted ratings or relevance scores assigned to each item.

Example: Deciding the order of movies to display on a streaming platform's homepage, aiming to maximize the chances that users will find something they like among the top suggestions.
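The sorting step can be sketched as follows, assuming relevance scores have already been produced by some model. The scores, the titles, and the `top_k` function are hypothetical.

```python
# Hypothetical relevance scores for one user, e.g. predicted ratings.
predicted_scores = {
    "Movie A": 4.6,
    "Movie B": 3.1,
    "Movie C": 4.9,
    "Movie D": 2.4,
    "Movie E": 4.2,
}

# Items the user has already watched should not be suggested again.
already_watched = {"Movie C"}

def top_k(scores, seen, k=3):
    """Return the k highest-scoring items the user has not seen yet."""
    candidates = [title for title in scores if title not in seen]
    # Sort by score (descending), then by title so ties are deterministic.
    ranked = sorted(candidates, key=lambda title: (-scores[title], title))
    return ranked[:k]

homepage = top_k(predicted_scores, already_watched)
print(homepage)
```

Only the relative order matters here. The exact score values could be off, and the homepage would still look the same as long as the best items end up on top, which is why ranking is often treated as a separate problem from rating prediction.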

References

  1. http://infolab.stanford.edu/~ullman/mmds/ch9.pdf

  2. https://www.google.co.in/books/edition/Hands_On_Recommendation_Systems_with_Pyt/BglnDwAAQBAJ?hl=en&gbpv=1&dq=recommendation+systems&printsec=frontcover


Up Next...

I hope you found this article insightful! Your engagement matters, so feel free to interact with this blog. Come back for the next post, where we will look at data mining techniques for building a recommendation system.