Hey there,
I am trying to use TF-IDF with naive Bayes, but I don't understand how to calculate it and then use it with naive Bayes to classify reviews as positive or negative. As far as I understand, TF-IDF vectorizes every document in terms of all the words in the corpus. My review dataset has nearly 250k unique words across 25k reviews, so would I have to hold a 250k x 25k matrix? That would be insane, so I think I am wrong. Also, how would I use the per-word scores in naive Bayes when classifying new sentences at test time, if every word is represented as a vector of scores across documents? I would love to hear your explanations or any resources that may help, because I haven't found any good ones on the topic.
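
To make the question concrete, here is a minimal sketch of how the pieces might fit together. It rests on assumptions that may not match the usual approach: each review is kept as a sparse dict holding only the words it actually contains (so the full reviews x words matrix is never stored), and the TF-IDF weights are treated as fractional word counts in a multinomial naive Bayes with add-one smoothing. All function names (`fit_idf`, `tfidf`, `train_nb`, `predict`) and the toy reviews are made up for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase and split on whitespace; real data needs better cleaning.
    return text.lower().split()


def fit_idf(docs):
    """One smoothed IDF value per word: log((1 + N) / (1 + df)) + 1."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))  # count each word once per document
    return {w: math.log((1 + n) / (1 + c)) + 1.0 for w, c in df.items()}


def tfidf(doc, idf):
    """Sparse TF-IDF vector: only words present in the doc and seen in training."""
    counts = Counter(tokenize(doc))
    return {w: c * idf[w] for w, c in counts.items() if w in idf}


def train_nb(docs, labels, idf, alpha=1.0):
    """Multinomial naive Bayes using summed TF-IDF weights as fractional counts."""
    weight = defaultdict(lambda: defaultdict(float))  # class -> word -> weight
    class_docs = Counter(labels)
    for doc, y in zip(docs, labels):
        for w, v in tfidf(doc, idf).items():
            weight[y][w] += v
    log_prior = {y: math.log(c / len(docs)) for y, c in class_docs.items()}
    log_lik = {}
    for y in class_docs:
        total = sum(weight[y].values()) + alpha * len(idf)
        log_lik[y] = {
            w: math.log((weight[y].get(w, 0.0) + alpha) / total) for w in idf
        }
    return log_prior, log_lik


def predict(doc, idf, log_prior, log_lik):
    """Score a new text with the IDF values learned from the training reviews."""
    vec = tfidf(doc, idf)
    scores = {
        y: log_prior[y] + sum(v * log_lik[y][w] for w, v in vec.items())
        for y in log_prior
    }
    return max(scores, key=scores.get)


# Toy data, invented for this example only.
train_docs = [
    "great movie loved it",
    "wonderful acting great plot",
    "terrible movie hated it",
    "awful plot boring acting",
]
train_labels = ["pos", "pos", "neg", "neg"]

idf = fit_idf(train_docs)
log_prior, log_lik = train_nb(train_docs, train_labels, idf)
print(predict("great wonderful movie", idf, log_prior, log_lik))
```

In this sketch a new sentence is not looked up as a column of per-document scores. It gets its own sparse vector from its word counts and the stored IDF values, and words never seen in training are simply dropped.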