This isn't a comment about the ML part but about movies. Netflix was part of the trends that killed movie stores. There are so many good movies out there, and movie stores were how you used to find them. Netflix, when it started, had a lot of good movies on the platform. Now they have a few gems that they keep recommending I rewatch. The overall set of quality films is now spread out over any number of competing platforms. You have to fight and search all over the place to find anything worth watching. It is a good example of enshittification caused by tech.
The only subscription I pay for these days is Criterion, on the recommendation of a cinephile friend. Most of their catalog is old, but I like discovering this stuff. We watched "The Devil's Eye" by Ingmar Bergman last night.
Yeah, it's sad how this recsys-driven business has reduced choice and availability.
Given the dimensions of the dataset, Netflix had nearly 18,000 movies in its catalog in 2006. Today they offer around 4,000. That's crazy.
"Is all we got out of activism the honor of clicking to accept cookies?"
No, we also got increased barriers to entry, leading directly to centralization, censorship, and the destruction of all that was once good about the internet.
I was on the 2nd place team in the Netflix Prize (The Ensemble). I don't think your footnote re: Salakhutdinov and Hinton is correct. Restricted Boltzmann Machines were in the Pragmatic Theory subteam's portion of the winning team's blend. See page 39 and reference 5 here:
https://www.asc.ohio-state.edu/statistics/statgen/joul_aut2009/PragmaticTheory.pdf
Note that Mnih was also a coauthor of that RBM paper.
They were also in the BellKor subteam's solution; see, e.g., the intro here:
https://www2.seas.gwu.edu/~simhaweb/champalg/cf/papers/KorenBellKor2009.pdf
Also in the BigChaos subteam's solution:
https://www.asc.ohio-state.edu/statistics/statgen/joul_aut2009/BigChaos.pdf
Thanks, I'll edit this.
To be fair, there was a ton of stuff in the winning blend (as well as ours), so it's hard to expect anyone who was not deep in the weeds of the competition to know all the models that were in there.
Part of the reason I wrote this post was to try to remember all of the details. Clearly my memory is spotty! If there's anything else you think I should add here, please let me know.
Here's a video of a talk I gave in 2010 on my experience in the competition and some of the research that came out of it, in case it's of interest:
https://www.youtube.com/watch?v=coeak1YsaYc
I have watched your talk with great interest and enjoyed it a lot. Thanks so much for the link.
Do you remember anyone using boosting methods (AdaBoost) for the same goal of blending classifiers? It has the same goal, but it focuses on the difficult-to-predict (user, movie) pairs in the training set and tries to correct the prediction mistakes by adding more classifiers to the ensemble. It seems to me an excellent match for the task (see the sketch after the links below)...
https://en.wikipedia.org/wiki/Boosting_(machine_learning)
https://direct.mit.edu/books/oa-monograph/5342/BoostingFoundations-and-Algorithms
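For concreteness, here's a rough sketch of the kind of boosting-style blend I mean, adapted to the ratings-regression setting rather than classification. This is purely illustrative: the function name, array shapes, and hyperparameters are made up, and it isn't taken from any actual competition entry.

```python
import numpy as np

def boosted_blend(P, y, n_rounds=100, lr=0.1):
    """Stagewise (boosting-style) blend of base predictors.

    P : (n_pairs, n_models) base-model predictions on a probe set
    y : (n_pairs,) true ratings
    Each round fits the current residuals with the single best base model,
    so later rounds concentrate on the (user, movie) pairs the blend still
    gets wrong -- the same spirit as AdaBoost's reweighting of hard examples.
    """
    n, m = P.shape
    pred = np.full(n, y.mean())             # start from the global mean rating
    coefs = np.zeros(m)
    for _ in range(n_rounds):
        r = y - pred                        # residuals = what is still hard
        num = (P * r[:, None]).sum(axis=0)
        beta = num / (P ** 2).sum(axis=0)   # per-model least-squares step
        j = int(np.argmax(beta * num))      # model giving the biggest SSE drop
        pred += lr * beta[j] * P[:, j]
        coefs[j] += lr * beta[j]
    return y.mean(), coefs, float(np.sqrt(np.mean((y - pred) ** 2)))
```

The returned coefficients play the role of blend weights; adding per-pair weights or shrinkage schedules would move this closer to full gradient boosting, but the basic idea of repeatedly attacking the residuals is the part that seems well matched to blending.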
Glad you enjoyed the talk. I agree that boosting methods could be well suited for blending. I don't specifically remember them being used much in the Netflix Prize, but I also don't remember the details of the approaches of the other teams that well. You could check the links to the writeups of the subteams of the winning team in my first comment to see if there was much boosting in the solutions.
I feel like any discussion of the Netflix Prize is incomplete without discussing what happened to the winning solution. Though I have heard conflicting reports, my best evaluation of the situation is that the winning solution was never used by Netflix at all, mainly because the business changed to de-emphasize explicit user ratings in favor of implicit metrics like watch time.
"the Netflix Prize also taught us a lot about the insignificance of overfitting. Constantly climbing the leaderboard did not lead to overfitting. The scores at the top of the board were pretty much the same on the private and public test sets. The evidence that leaderboard score was indicative of private score was clear by 2007. People ignored this for 15 years. Some still deny it today."
I disagree with this! In the following sense: as you point out, 90% of the success of Netflix Prize entrants was linear regression. And nobody ever worried about linear regression overfitting! It would be like worrying, "oh no, I averaged too many observations of the same thing, did I overfit?" No! The fact that sample means are often close to population means does not tell you anything about overfitting, because those are not the kind of protocols where people thought there was a potential overfitting problem.
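To illustrate why I don't find the public/private agreement surprising, here is a toy sketch with entirely synthetic data (no relation to the actual Prize files; sizes and noise levels are invented). Fit linear blend weights on a probe set, then score two disjoint holdouts standing in for the public and private test sets; at this sample size the two RMSEs come out essentially identical, as you'd expect.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pairs, n_models = 100_000, 20              # made-up sizes
true = rng.uniform(1, 5, size=n_pairs)       # stand-in for true ratings
# each "base model" = truth plus its own noise
P = true[:, None] + rng.normal(0, 0.9, size=(n_pairs, n_models))

idx = rng.permutation(n_pairs)
probe, public, private = np.split(idx, [n_pairs // 2, 3 * n_pairs // 4])

# ordinary least squares on the probe set gives the blend weights
A = np.c_[np.ones(len(probe)), P[probe]]
w, *_ = np.linalg.lstsq(A, true[probe], rcond=None)

def rmse(split):
    pred = np.c_[np.ones(len(split)), P[split]] @ w
    return np.sqrt(np.mean((pred - true[split]) ** 2))

print(f"public RMSE: {rmse(public):.4f}  private RMSE: {rmse(private):.4f}")
```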
Movie recommendation could be mostly random, while music recommendation should be more predictable? But I found that the Spotify dataset only has around 10 features.
Superb post. This brings back lots of memories.