MovieLens Datasets—Predicting and Analyzing User Ratings of Movies

Abstract

The MovieLens datasets contain user ratings of movies together with movie and user information. In this report, we consider four predictive models of the ratings based on the available information: K-nearest neighbors, neural networks, matrix completion using singular value decomposition, and restricted Boltzmann machines. We tune all four models and compare their out-of-sample performance; matrix completion produces the best testing metrics. We then propose two exploratory analysis methods to extract insight from the selected predictive model.