I am building a recommendation system on the Movie Lens dataset. I want to use the movie descriptions to build content based filtering to suggest movies to new users and to solve the Cold start problem (using TF-IDF, word2vec embeddings etc). Then use actual Neural Collaborative Filtering to suggest movie based on ratings. I am using Pytorch. How do I combine these models?. It would be helpful if you guys could provide resources.
#Hybrid Recommendation System
12 messages · Page 1 of 1 (latest)
you can use lightfm
No, This is a task for a internship. I have to build it on my own.
To be honest, there doesn't seem to be anything that would logically solve the Cold Start problem for reccommendation systems
Cant I use movie descriptions to create word embeddings and then suggest movies to users who have reviewed less movies by cosine similarity?
Doesn’t solve the cold start problem because you technically used some users’ data, just not the users you are starting your service for.
That's true but the OP wanted to try and "solve" the cold start problem which has no real solution
Practically speaking, using pre existing data takes care of it BUT it's not a solution
Look man, there's a difference between the theoretical and practical. In the theoretical, there is no solution to the cold start problem. In practicality, you will likely use some existing/open data to seed the system.
o Write a Python function that uses NLP techniques (such as TF-IDF or BERT
embeddings) to process movie descriptions and recommend similar movies based on
a given user's viewing history.
o How would you integrate content-based filtering with collaborative filtering in your
code? Provide a code example that demonstrates a hybrid model.
This is one of my tasks of building content based filtering model
My idea was to scrape movie descriptions based on movieID's and then build Tfidf on whole dataset and train a linear model on the each users previous movies (This is for viewers with considerable ratings). Then predict ratings for all movies and sort them on ratings. To provide top 10 recommendations will this work. I am using movieLens 100k dataset with imdb ids.