Software Nerd

Friday, October 27, 2006

The NetFlix Prize

Suppose I were to tell you how much I liked or disliked 10 movies that you have seen. I rate each with a number from 1 to 5. Now, suppose I name 10 other movies that we've both seen and ask you to guess what I would rate those 10. Could you do it? And, how accurately?

Here's an example list of rated movies:
  • Turbo: A Power Rangers Movie (5)
  • Lemony Snicket's A Series of Unfortunate Events (5)
  • Million Dollar Baby (5)
  • Macbeth (4)
  • The Simple Life of Noah Dearborn (4)
  • Terminator (4)
  • The Untouchables (4)
  • The Thomas Crown Affair (4)
  • Blade Runner (4)
  • Cold Mountain (3)
And here's a list for which one must guess the ratings:
  • Nell
  • Inside Man
  • The Constant Gardener
  • Stuart Little
  • Mr. and Mrs. Smith
  • Harry Potter and the Chamber of Secrets
  • Monty Python and the Holy Grail
  • Die Hard
  • The Way We Live Now
  • Kiki's Delivery Service
  • Maverick

Retailers like Amazon have to make recommendations based on previous user ratings. They try their best to minimize the error, assuming they have enough ratings from the particular user to work with. The online movie-rental companies try to do the same.
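One simple way to make such a guess (not necessarily what Amazon or NetFlix actually do) is to predict a rating from the ratings of users with similar taste. A toy sketch, with entirely made-up users and ratings:

```python
import math

# Toy ratings table: user -> {movie: rating on 1..5}. All data is made up.
ratings = {
    "alice": {"Die Hard": 4, "Maverick": 3},
    "bob":   {"Die Hard": 5, "Maverick": 4, "Stuart Little": 2},
    "carol": {"Die Hard": 2, "Nell": 5, "Stuart Little": 4},
}

def similarity(a, b):
    """Cosine similarity over the movies both users rated."""
    common = set(ratings[a]) & set(ratings[b])
    if not common:
        return 0.0
    dot = sum(ratings[a][m] * ratings[b][m] for m in common)
    na = math.sqrt(sum(ratings[a][m] ** 2 for m in common))
    nb = math.sqrt(sum(ratings[b][m] ** 2 for m in common))
    return dot / (na * nb)

def predict(user, movie):
    """Similarity-weighted average of other users' ratings for the movie."""
    pairs = [(similarity(user, other), r[movie])
             for other, r in ratings.items()
             if other != user and movie in r]
    weight = sum(s for s, _ in pairs)
    return sum(s * r for s, r in pairs) / weight if weight else None

print(predict("alice", "Stuart Little"))
```

With a table of thousands of real ratings instead of three invented users, the same idea (known as collaborative filtering) is one of the standard starting points for the problem.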

NetFlix claims that when they make estimates, on "average" the guess they make is about +/- 0.95 (about 1 off) from the real rating that the user eventually gives to the movie. Frankly, on a 5-point scale, with the bulk of actual ratings at 2, 3, and 4, that isn't too good. They want to improve their predictions.

They've set the following target: they want to reduce their "average" error from around +/- 0.95 to around +/- 0.85.

If that does not sound ambitious, listen to this: they will pay a million dollars ($1,000,000) to anyone who can figure out how to do it.

Yup, that's the latest web-based contest. NetFlix will provide people with sample data containing thousands of movie ratings from their customers. They will then give you a set of customer-movie pairs and ask you to guess the rating for each pair, and they will compare your guesses to the real ratings. The best team that gets its error down to +/- 0.94 or better will get a $50,000 "progress prize", but if any team can get its error below +/- 0.85, the $1,000,000 is theirs.
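The "average" error NetFlix uses for scoring is the root mean squared error (RMSE): square each guess's miss, average the squares, and take the square root. A minimal sketch of how a batch of guesses would be scored (the ratings here are invented for illustration):

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating lists."""
    assert len(predicted) == len(actual)
    total = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(total / len(predicted))

# Hypothetical guesses vs. the ratings the customers actually gave.
guesses = [4.0, 3.0, 5.0, 2.0, 4.0]
truth   = [5,   3,   4,   2,   3]

print(rmse(guesses, truth))  # three misses of 1 on five guesses -> about 0.775
```

Note that squaring before averaging punishes big misses more than small ones, so an RMSE of 0.95 does not literally mean every guess is about 1 off.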

The competition was announced on October 6th, and already 20 teams have beaten the 0.95 hurdle, so one of them will get the $50,000. The best team is down to an "average" error of 0.91.

I think it's pretty innovative of NetFlix to solicit ideas this way.
