#Hybrid Recommendation System


tropic shale
#

I am building a recommendation system on the MovieLens dataset. I want to use the movie descriptions to build content-based filtering (using TF-IDF, word2vec embeddings, etc.) to suggest movies to new users and to address the cold start problem. Then I want to use Neural Collaborative Filtering to suggest movies based on ratings. I am using PyTorch. How do I combine these models? It would be helpful if you could point me to resources.
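One possible way to combine the two, offered as a rough sketch rather than a definitive design: give the NCF network a content vector for each movie alongside the learned user and item embeddings, and train everything on ratings. The class name `HybridNCF`, the layer sizes, and the random toy tensors below are assumptions for illustration only; real inputs would be MovieLens ratings and TF-IDF or word2vec features built from the descriptions.

```python
import torch
import torch.nn as nn


class HybridNCF(nn.Module):
    """NCF-style scorer that also consumes a fixed content vector per movie."""

    def __init__(self, n_users, n_items, content_dim, emb_dim=32, hidden=64):
        super().__init__()
        self.user_emb = nn.Embedding(n_users, emb_dim)
        self.item_emb = nn.Embedding(n_items, emb_dim)
        # Projects the content vector (e.g. TF-IDF or word2vec) to emb_dim.
        self.content_proj = nn.Linear(content_dim, emb_dim)
        self.mlp = nn.Sequential(
            nn.Linear(3 * emb_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, user_ids, item_ids, item_content):
        u = self.user_emb(user_ids)
        i = self.item_emb(item_ids)
        c = torch.relu(self.content_proj(item_content))
        # Concatenate the three views and map them to one predicted rating.
        return self.mlp(torch.cat([u, i, c], dim=-1)).squeeze(-1)


# Toy data standing in for MovieLens ratings and description features.
torch.manual_seed(0)
n_users, n_items, content_dim = 50, 40, 16
content = torch.randn(n_items, content_dim)   # one row per movie
users = torch.randint(0, n_users, (512,))
items = torch.randint(0, n_items, (512,))
ratings = torch.randint(1, 6, (512,)).float()

model = HybridNCF(n_users, n_items, content_dim)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

losses = []
for _ in range(50):
    optimizer.zero_grad()
    pred = model(users, items, content[items])
    loss = loss_fn(pred, ratings)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

# Score every movie for one user and keep the 10 highest predictions.
with torch.no_grad():
    user = torch.full((n_items,), 3, dtype=torch.long)
    scores = model(user, torch.arange(n_items), content)
    top10 = torch.topk(scores, k=10).indices.tolist()
```

Because the content branch does not depend on rating history, a newly added movie still gets a meaningful input here. A brand-new user still has an untrained user embedding, so this sketch alone does not cover new users.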

tropic shale
royal swan
#

To be honest, there doesn't seem to be anything that would logically solve the cold start problem for recommendation systems.

tropic shale
royal swan
#

That doesn’t solve the cold start problem, because you technically used some users’ data, just not data from the users you are launching your service for.

royal swan
#

That's true, but the OP wanted to try to "solve" the cold start problem, which has no real solution.

#

Practically speaking, using pre-existing data takes care of it, but it's not a solution.

royal swan
#

Look man, there's a difference between the theoretical and the practical. In theory, there is no solution to the cold start problem. In practice, you will likely use some existing/open data to seed the system.

tropic shale
#

- Write a Python function that uses NLP techniques (such as TF-IDF or BERT embeddings) to process movie descriptions and recommend similar movies based on a given user's viewing history.
- How would you integrate content-based filtering with collaborative filtering in your code? Provide a code example that demonstrates a hybrid model.
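A minimal sketch of how both task items might be approached, assuming scikit-learn is available: build a TF-IDF profile from the movies a user has watched, score every movie by cosine similarity to that profile, then blend those scores with collaborative scores. The function names, the blending constant `k`, the toy descriptions, and the collaborative scores in `cf` are all hypothetical; in practice the collaborative scores would come from a trained model, and both score vectors would need to be on comparable scales.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def content_scores(descriptions, watched_idx):
    """Cosine similarity of each movie to the mean TF-IDF vector of watched movies."""
    tfidf = TfidfVectorizer(stop_words="english").fit_transform(descriptions)
    # Average the watched rows into a single user profile of shape (1, vocab).
    profile = np.asarray(tfidf[watched_idx].mean(axis=0)).reshape(1, -1)
    return cosine_similarity(profile, tfidf).ravel()


def hybrid_scores(content, collaborative, n_ratings, k=20):
    """Blend both score vectors; users with few ratings lean on content."""
    # alpha is 0 for a brand-new user and approaches 1 as history grows.
    alpha = n_ratings / (n_ratings + k)
    return alpha * collaborative + (1 - alpha) * content


def recommend(scores, watched_idx, top_n=3):
    """Indices of the highest-scoring movies the user has not watched yet."""
    scores = scores.copy()
    scores[watched_idx] = -np.inf
    return np.argsort(-scores)[:top_n].tolist()


# Toy catalogue; real descriptions would be looked up by movie ID.
descriptions = [
    "A space crew fights an alien on a distant ship",
    "Astronauts explore a distant planet in space",
    "A detective investigates a murder in the city",
    "Two friends fall in love during a summer in Paris",
    "An alien invasion threatens a space station crew",
]
watched = [0]
c = content_scores(descriptions, watched)
cf = np.array([0.0, 0.2, 0.9, 0.1, 0.3])  # hypothetical collaborative scores

new_user = recommend(hybrid_scores(c, cf, n_ratings=0), watched, top_n=2)
heavy_user = recommend(hybrid_scores(c, cf, n_ratings=200), watched, top_n=1)
```

With no ratings the ranking follows the descriptions alone, so the other space films come first; with many ratings the collaborative scores dominate. A weighted blend is only one of several integration strategies, chosen here because it keeps the two models independent.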

tropic shale
tropic shale
#

This is one of my tasks: building a content-based filtering model.

My idea was to scrape movie descriptions based on the movie IDs, build TF-IDF features on the whole dataset, and train a linear model on each user's previously rated movies (this is for viewers with a considerable number of ratings). Then I would predict ratings for all movies, sort them by predicted rating, and return the top 10 as recommendations. Will this work? I am using the MovieLens 100K dataset with IMDb IDs.
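The idea described above could be sketched roughly as follows, assuming scikit-learn. Ridge regression stands in for the unspecified "linear model", and the function name `top_n_for_user`, the regularization strength, and the six toy descriptions are assumptions for illustration, not part of the original plan.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge


def top_n_for_user(descriptions, rated_idx, ratings, top_n=10, alpha=1.0):
    """Fit one ridge regressor on a user's rated movies and rank the rest."""
    # TF-IDF is fitted on the whole catalogue, as in the plan above.
    X = TfidfVectorizer(stop_words="english").fit_transform(descriptions)
    model = Ridge(alpha=alpha).fit(X[rated_idx], ratings)
    predicted = model.predict(X)
    predicted[rated_idx] = -np.inf   # never re-recommend rated movies
    order = np.argsort(-predicted)
    n_unrated = X.shape[0] - len(rated_idx)
    return order[: min(top_n, n_unrated)].tolist()


# Toy catalogue: the user liked the space films and disliked the romances.
descriptions = [
    "A space crew fights an alien on a distant ship",
    "Astronauts explore a distant planet in space",
    "Two friends fall in love during a summer in Paris",
    "A wedding planner finds love in a small town",
    "An alien invasion threatens a space station crew",
    "A summer romance blossoms into love in Rome",
]
ranking = top_n_for_user(descriptions, rated_idx=[0, 1, 2, 3], ratings=[5, 4, 1, 2])
```

One caveat with this approach: each user's model sees only that user's ratings, so it likely needs regularization and a reasonable number of rated movies per user to avoid overfitting sparse TF-IDF features.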