Evaluating an External Recommender Written in Python Using LensKit

In this article, we show how to use LensKit to evaluate a recommender written in Python. We wrote it to help people who want to use LensKit’s built-in evaluation capabilities and comparison algorithms but don’t want to implement their own algorithms in Java. Evaluating an external recommender, whether in R, Python, or MATLAB, involves three primary steps:

  • Writing the recommender. We will need a simple recommender written in a language other than Java (Python in this case) that can take test data to build up a simple model and generate recommendations for a given list of test users.
  • Setting up a shim class. We will need to write a small class that teaches LensKit how to use our external algorithm.
  • Setting up LensKit evaluation. Finally, we show how to set up an experiment that uses the shim class in a LensKit eval script to evaluate the external recommender.

Note that the data we will use to test this recommender is a MovieLens rating dataset. The data consists of movie ratings, with each row in the form <userId,itemId,rating>. You can read more about the dataset here.
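As a rough illustration of the first step, here is a minimal sketch of the kind of external recommender we have in mind. It is not the recommender from the full article: the function names (load_ratings, build_model, recommend), the sample rows, and the mean-rating scoring rule are all assumptions chosen for brevity. Only the <userId,itemId,rating> row format comes from the dataset description above.

```python
import csv
import io
from collections import defaultdict


def load_ratings(lines):
    """Parse rows of the form userId,itemId,rating into (user, item, rating) tuples."""
    return [(int(u), int(i), float(r)) for u, i, r in csv.reader(lines)]


def build_model(ratings):
    """Build a toy model: each item's mean rating, plus the items each user has rated."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    seen = defaultdict(set)
    for user, item, rating in ratings:
        totals[item] += rating
        counts[item] += 1
        seen[user].add(item)
    item_means = {item: totals[item] / counts[item] for item in totals}
    return item_means, seen


def recommend(model, users, n=10):
    """Return up to n unrated items per user, ranked by mean rating (ties broken by item id)."""
    item_means, seen = model
    ranked = sorted(item_means, key=lambda item: (-item_means[item], item))
    return {
        user: [item for item in ranked if item not in seen.get(user, set())][:n]
        for user in users
    }


# Hypothetical sample rows standing in for a ratings file.
sample = io.StringIO("1,10,4.0\n1,20,3.0\n2,10,5.0\n2,30,2.0\n3,20,4.5\n")
model = build_model(load_ratings(sample))
recommendations = recommend(model, users=[1, 2, 3], n=2)
print(recommendations)
```

A non-personalized scorer like this is only a placeholder. The point is the shape of the program: read rating rows, build a model, and emit a ranked list per test user that a shim class could hand back to LensKit.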

College guidance platform PossibilityU forms research partnership with GroupLens

White House, Department of Education, and Chronicle of Higher Education highlight our college recommendation software

Two organizations have joined forces to answer important questions related to how search, discovery and recommendation online are affecting student college choices. PossibilityU, an innovative college guidance platform, and GroupLens, a research lab dedicated to recommender systems and online communities in the Department of Computer Science and Engineering at the University of Minnesota, Twin Cities, announced a research partnership to explore how college recommendations online can affect a broad range of student outcomes.

For the past three years, PossibilityU’s unique college guidance program has been matching high school students across the country with colleges that fit their talents, aspirations and family budgets. The program features guided inquiry, data visualizations and personalized, Netflix-like recommendations, all designed to give students confidence in their choices. PossibilityU also features a blended-learning video curriculum created to help students and parents become better consumers of higher education.

Taking Recommendations Improves Consumption Diversity — A Surprise Result from Exploring the Filter Bubble and MovieLens

This post describes work by Tien Nguyen being presented at WWW 2014.

Those of you following recommender systems have almost certainly heard the debate about filter bubbles.  This concept, perhaps best articulated by Eli Pariser, argues that recommenders have the potential to trap users into increasingly similar content, isolating them from the diversity of content that makes people rich learners.

We decided to test this concept empirically, using longitudinal data from MovieLens.  Specifically, we wanted to answer two questions:

  • Does the set of recommended movies get narrower as users continue to rate movies?

  • Do users consume a narrower range of movies and, if so, is this a consequence of taking recommendations?

What we found surprised us.

Tag Genome Dataset Released

Want to know how quirky a particular movie is? Or how to find the most visually appealing movies of all time? Or how to find a movie that is similar to another movie you’ve seen but less big budget and more cerebral?

The tag genome is a data structure that enables you to answer queries such as these. As described in this article, the tag genome encodes how strongly movies exhibit particular properties represented by tags (atmospheric, thought-provoking, realistic, etc.). The tag genome was computed using a machine learning algorithm on user-contributed content including tags, ratings, and textual reviews.

We’re announcing the release of a tag genome dataset, containing the relevance values for 1,128 tags and 9,734 movies. We hope you will explore this dataset and come up with new and creative ways to use it! You can find more details here.
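To make the idea concrete, here is a small sketch of the third kind of query above: find movies similar to one you have seen, but less big budget and more cerebral. Everything in it is illustrative. The movie titles, the three-tag vocabulary, and the relevance values are invented rather than taken from the released dataset, and the approach (filter on the two tags, then rank by cosine similarity) is an assumption, not the method described in the article.

```python
import math

# Hypothetical tag vocabulary; the real dataset covers 1,128 tags.
TAGS = ["big budget", "cerebral", "atmospheric"]

# Hypothetical relevance values in [0, 1], one per tag, in TAGS order.
GENOME = {
    "Movie A": [0.90, 0.40, 0.60],
    "Movie B": [0.30, 0.80, 0.60],
    "Movie C": [0.95, 0.10, 0.20],
    "Movie D": [0.20, 0.10, 0.10],
    "Movie E": [0.50, 0.50, 0.90],
}


def cosine(a, b):
    """Cosine similarity between two relevance vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def similar_but(genome, seed, less, more, tags=TAGS):
    """Rank movies similar to `seed` that score lower on `less` and higher on `more`."""
    less_i, more_i = tags.index(less), tags.index(more)
    seed_vec = genome[seed]
    scored = []
    for title, vec in genome.items():
        if title == seed:
            continue
        # Keep only candidates that move in the requested direction on both tags.
        if vec[less_i] >= seed_vec[less_i] or vec[more_i] <= seed_vec[more_i]:
            continue
        scored.append((cosine(seed_vec, vec), title))
    # Most similar first.
    return [title for _, title in sorted(scored, reverse=True)]


result = similar_but(GENOME, "Movie A", less="big budget", more="cerebral")
print(result)
```

With real data, each vector would hold one relevance value per tag in the dataset, and the hard filter could be replaced by a softer penalty if it excludes too many candidates.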