Skill rating algorithms are now ubiquitous in online competitive video games. They are used not only to compare players’ strength accurately, but also to match players of similar proficiency against each other, thus ensuring fair and exciting matches. Solidly anchored in online approximate Bayesian inference, these modern algorithms require only a prior rating for each match participant and the outcome of the match to infer a posterior rating for each player. Over the last decade, efforts have been made to incorporate auxiliary information into rating computations (such as players’ scores in a match or their proficiency in different game modes) in order to obtain more accurate rating updates and generally faster convergence to the true underlying skills. However, because ratings influence matchmaking, and matchmaking in turn shapes subsequent ratings, the distribution of ratings over all players is a highly dynamic system, and this interdependence impedes our ability to fully understand the impact of potential improvements on live applications. We survey existing rating systems and explore the assessment challenges involved in enhancing and extending these algorithms.