Recommender systems are applications that predict how users will respond to items and use those predictions to recommend items that fit each user’s needs and preferences. Businesses rely on recommender systems in many ways (e.g. targeted marketing) to improve user experience and increase revenue. This project focuses on building a system that recommends products related to a target item, trained on a dataset crawled from the Amazon website. In real life, this use case appears on item pages in phrases such as “Recommended for you since you viewed this item” or “Other customers also like these items.”
The system built for this project uses two models whose results are combined to give recommendations that are both relevant and diverse, with the goal of maximizing click and purchase rates. The two models are an item-based collaborative filtering model trained on user ratings and a content-based model trained on keywords from product descriptions.
For this project, we use a publicly available dataset of Amazon product reviews (Ni et al., 2019). The researchers crawled data from the Amazon website and created a large repository of datasets, one for each product category. To simplify the problem, we focus on the “Movies & TV” category, which contains over 3,000,000 reviews from 300,000 users across 60,000 product items. The reviews span May 1996 to October 2018.
After collecting the data, we process it into matrices to use as inputs for our models. From the product ratings, we build a sparse matrix in which the rows and columns represent the items and the users (reviewers), respectively. In this way, each item is represented as a vector of reviewer ratings.
+----------------+------------+--------+
| user_id | product_id | rating |
+----------------+------------+--------+
| A2M1CU2IRZG0K9 | 0005089549 | 5.0 |
| AFTUJYISOFHY6 | 0005089549 | 5.0 |
| A3JVF9Y53BEOGC | 000503860X | 5.0 |
| A12VPEOEZS1KTC | 000503860X | 5.0 |
| ATLZNVLYKP9AZ | 000503860X | 5.0 |
+----------------+------------+--------+
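The ratings-to-matrix step can be sketched as follows. This is a minimal illustration rather than the project's actual code: it assumes SciPy is available, uses the five sample rows from the table above, and the helper name build_item_user_matrix is hypothetical.

```python
from scipy.sparse import csr_matrix

# Sample (user_id, product_id, rating) rows copied from the table above.
ratings = [
    ("A2M1CU2IRZG0K9", "0005089549", 5.0),
    ("AFTUJYISOFHY6", "0005089549", 5.0),
    ("A3JVF9Y53BEOGC", "000503860X", 5.0),
    ("A12VPEOEZS1KTC", "000503860X", 5.0),
    ("ATLZNVLYKP9AZ", "000503860X", 5.0),
]

def build_item_user_matrix(rows):
    """Hypothetical helper: items as rows, users as columns, ratings as values."""
    item_ids = sorted({product for _, product, _ in rows})
    user_ids = sorted({user for user, _, _ in rows})
    item_index = {product: i for i, product in enumerate(item_ids)}
    user_index = {user: j for j, user in enumerate(user_ids)}

    values = [rating for _, _, rating in rows]
    row_idx = [item_index[product] for _, product, _ in rows]
    col_idx = [user_index[user] for user, _, _ in rows]

    # Note: csr_matrix sums duplicate (item, user) entries, so repeated
    # reviews by the same user should be deduplicated beforehand.
    matrix = csr_matrix(
        (values, (row_idx, col_idx)),
        shape=(len(item_ids), len(user_ids)),
    )
    return matrix, item_ids, user_ids

item_user, item_ids, user_ids = build_item_user_matrix(ratings)
print(item_user.shape)  # (number of items, number of users)
```

Only the observed ratings are stored, which matters at the scale of the full dataset, where most item-user pairs have no rating at all.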
For the content-based method, we combine the item's text attributes (such as title, description, and brand) into a single tags feature that contains all the keywords (in lowercase) for the item. We also apply stemming to group words into their word families, which helps the program recognize similar words (e.g. eat, eats, and eating should carry the same significance in the content). These texts are then turned into vectors of TF-IDF scores over 5000 keywords, so that each vector represents the corresponding item. We can accomplish this with TfidfVectorizer from the scikit-learn library. Other methods of vectorizing the text could be more effective (e.g. Word2Vec or other neural network language models), but TF-IDF is easy to understand and inexpensive to compute.
+------------+---------------------------------------------------+---------------------------------------------------+--------------+
| movie_id | title | description | brand |
+------------+---------------------------------------------------+---------------------------------------------------+--------------+
| 0000695009 | Understanding Seizures and Epilepsy | [] | [] |
| 0000791156 | Spirit Led—Moving By Grace In The Holy S... | [] | [] |
| 0000143529 | My Fair Pastry (Good Eats Vol. 9) | [Disc, 1:, Flour, Power, (Scones;, Shortcakes;... | [AltonBrown] |
| 0000143588 | Barefoot Contessa (with Ina Garten), Entertain... | [Barefoot, Contessa, Volume, 2:, On, these, th... | [InaGarten] |
| 0000143502 | Rise and Swine (Good Eats Vol. 7) | [Rise, and, Swine, (Good, Eats, Vol., 7), incl... | [AltonBrown] |
+------------+---------------------------------------------------+---------------------------------------------------+--------------+
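A rough sketch of the tags-to-TF-IDF step is shown below. It assumes scikit-learn is installed and uses only the fields visible in the table above (truncated descriptions are cut at the last visible token). The crude_stem function is a toy suffix stripper standing in for a real stemmer such as Porter's, and make_tags is a hypothetical helper, so the actual project code may differ.

```python
import re
from sklearn.feature_extraction.text import TfidfVectorizer

def crude_stem(word):
    """Toy stand-in for a real stemmer: strips a few common suffixes."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def make_tags(item):
    """Hypothetical helper: merge text attributes into one lowercase, stemmed string."""
    parts = [item["title"], *item["description"], *item["brand"]]
    tokens = re.findall(r"[a-z0-9]+", " ".join(parts).lower())
    return " ".join(crude_stem(token) for token in tokens)

# Items taken from the metadata sample; only tokens visible in the table are used.
items = [
    {"title": "My Fair Pastry (Good Eats Vol. 9)",
     "description": ["Disc", "1:", "Flour", "Power"],
     "brand": ["AltonBrown"]},
    {"title": "Rise and Swine (Good Eats Vol. 7)",
     "description": ["Rise", "and", "Swine"],
     "brand": ["AltonBrown"]},
    {"title": "Understanding Seizures and Epilepsy",
     "description": [],
     "brand": []},
]

tags = [make_tags(item) for item in items]

# max_features=5000 keeps at most the 5000 most frequent terms in the corpus.
vectorizer = TfidfVectorizer(max_features=5000)
tfidf = vectorizer.fit_transform(tags)  # sparse matrix: one row per item

print(tags[0])
print(tfidf.shape)
```

Each row of tfidf is the keyword vector for one item, which can then be compared with the other rows using cosine similarity.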
Finally, for each model we calculate the cosine similarity between the target item's vector and the vector of every other item, then return the highest-scoring items as recommendations.
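The similarity step can be illustrated with a small NumPy sketch. The vectors below are made-up toy values, not rows from the real matrices, and top_k_similar is a hypothetical helper name. The same function applies to either model, since both represent each item as a vector.

```python
import numpy as np

def top_k_similar(item_vectors, target_index, k=10):
    """Return indices of the k items most cosine-similar to the target item."""
    vectors = np.asarray(item_vectors, dtype=float)
    norms = np.linalg.norm(vectors, axis=1)
    norms[norms == 0] = 1.0  # all-zero items keep a similarity of 0
    unit = vectors / norms[:, None]

    # The dot product of unit vectors is the cosine similarity.
    scores = unit @ unit[target_index]

    # Sort from most to least similar and drop the target itself.
    order = np.argsort(-scores)
    return [int(i) for i in order if i != target_index][:k]

# Toy item vectors (rows = items); values are illustrative only.
toy_vectors = [
    [5, 4, 0, 0],  # item 0: the target
    [5, 5, 0, 0],  # item 1: rated by the same users as item 0
    [0, 0, 3, 4],  # item 2: no overlap with item 0
    [4, 0, 0, 1],  # item 3: partial overlap with item 0
]

print(top_k_similar(toy_vectors, target_index=0, k=2))
```

For the full dataset, the dense conversion above would be too memory-hungry. A sparse-aware routine, such as cosine_similarity from sklearn.metrics.pairwise, is the more practical choice.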
Here is an example of outputs from each model when the target item is “Harry Potter and the Prisoner of Azkaban”:
Collaborative Filtering Model:
Harry Potter and the Goblet of Fire
Harry Potter and the Order of the Phoenix
Harry Potter and the Half-Blood Prince
Harry Potter and the Deathly Hallows, Part 1
HP7: Deathly Hallows, P2 (DVD)
Spider-Man 2
Shrek 2
The Lord of the Rings: The Return of the King
Batman Begins
Star Wars: Episode III - Revenge of the Sith
Content-Based Model:
Harry Potter - Years 1–4: (Harry Potter and the Sorcerer's Stone / Chamber of Secrets / Prisoner of Azkaban / Goblet of Fire)
Harry Potter Years 1–3
Harry Potter Collection - Years 1–7 Part 1 Region Free
Harry Potter and the Philosopher's Stone [Widescreen, 2 Disc Edition] DVD
Harry Potter 1–7 Collection - 8 DVDs (Mandarin Chinese Edition)
harry potter - 4 grandi film #01 (4 dvd) box set dvd Italian Import
Harry Potter Years 1–6 Gift set
Harry Potter Collezione Completa (8 Blu-Ray)
Seekers Guide To Harry Potter
The Parables of the Potter
As we can see, the collaborative filtering model not only returns relevant recommendations in a sensible order (the later Harry Potter films come first) but also introduces customers to items on different subjects (in this case, it branches out from Harry Potter to Spider-Man and Shrek). The content-based model lacks this diversity because it focuses solely on items whose content closely matches the target item; here, the recommendations are all tied to Harry Potter. That behavior is useful when an item is new or unpopular, since the program can still “reach” it as long as it has the right keywords. For popular items such as Harry Potter, however, it is less useful to stick to other Harry Potter themed items than to branch out, so that customers are exposed to more products, which is more beneficial to the business from a marketing perspective.
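The text does not spell out how the two result lists are combined, so the following is only one possible approach, offered as an assumption: alternate between the two ranked lists and skip duplicates. The function name blend_recommendations is hypothetical, and the sample titles come from the outputs listed earlier.

```python
from itertools import zip_longest

def blend_recommendations(collaborative, content_based, k=10):
    """Assumed strategy: interleave two ranked lists, dropping duplicates."""
    blended, seen = [], set()
    for cf_item, cb_item in zip_longest(collaborative, content_based):
        for item in (cf_item, cb_item):
            # zip_longest pads the shorter list with None.
            if item is not None and item not in seen:
                seen.add(item)
                blended.append(item)
    return blended[:k]

cf_results = [
    "Harry Potter and the Goblet of Fire",
    "Harry Potter and the Order of the Phoenix",
    "Spider-Man 2",
]
cb_results = [
    "Harry Potter Years 1–3",
    "Seekers Guide To Harry Potter",
]

for title in blend_recommendations(cf_results, cb_results, k=4):
    print(title)
```

Interleaving keeps the top picks from both models near the front of the list. A weighted score combination would be an alternative if the two similarity scales were made comparable.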
Results from the recommender system can be evaluated with A/B testing in a live environment, which is standard practice across the industry and the most reliable evaluation method, or offline with a metric such as hit rate or accuracy. For example, we can measure the percentage of recommended items that are actually relevant to customers and use that to compare models.
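As a rough illustration of the offline metrics, the sketch below computes the share of recommended items that are relevant (often called precision at k) and a simple hit rate. The function names, the definition of "relevant", and the toy inputs are assumptions made for this example.

```python
def precision_at_k(recommended, relevant, k=10):
    """Share of the top-k recommended items that appear in the relevant set."""
    top = recommended[:k]
    if not top:
        return 0.0
    return sum(1 for item in top if item in relevant) / len(top)

def hit_rate(recommendations_per_user, held_out_per_user, k=10):
    """Share of users whose held-out item shows up in their top-k list."""
    users = list(held_out_per_user)
    if not users:
        return 0.0
    hits = sum(
        1
        for user in users
        if held_out_per_user[user] in recommendations_per_user.get(user, [])[:k]
    )
    return hits / len(users)

# Toy data: item ids are placeholders, not real products.
recommended = ["a", "b", "c", "d"]
relevant = {"a", "d", "z"}
print(precision_at_k(recommended, relevant, k=4))

recs = {"u1": ["a", "b"], "u2": ["c", "d"]}
held_out = {"u1": "b", "u2": "x"}
print(hit_rate(recs, held_out, k=2))
```

Running both models through the same metric on the same held-out data gives a like-for-like comparison before committing to a live A/B test.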
Jianmo Ni, Jiacheng Li, and Julian McAuley. 2019. Justifying Recommendations using Distantly-Labeled Reviews and Fine-Grained Aspects. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 188–197.