03 July, 2010

chess rankings for photos

I have a lot of photos - around 50000. I want to have some kind of scoring system for them.

The obvious one is something like a 5-star scale: when presented with a photo, you can award it from 0 to 5 stars, with 5 being best and 0 being worst.

But that's not very granular, and it is not clear to me that well-defined standards for the 6 different scores would emerge.

I preferred something that says "this is better than that, so this should have a higher score than that".

But I don't want to have to manually put all the photos into a strict linear order (even though a numerical score implies one), and I want the system to tolerate inconsistencies (e.g. A>B, B>C, C>A) somehow.

Eventually I read about chess ranking, where each player is assigned a numerical score indicating their "goodness" and the scores are adjusted by pairwise comparisons between players - chess matches.

I adapted this for scoring my photographs. I started with the Glicko system and modified it somewhat.

The way this works is:

Photos compete against each other, as chess players compete against each other. The equivalent of a chess match is a presentation of two photos alongside each other in a web browser, with the user clicking on the photo they prefer. So users do not assign an absolute score to a photo, nor do they pick how much better one photo is than the other. They have a simple choice: "a>b" or "b>a".

Each photo is assumed to have a single numeric score, such that the difference between two scores determines the probability that one photo will win over the other. (Only the difference matters, not the absolute values: 900 vs 1000 gives the same probability as 4000 vs 4100.)
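As a rough sketch of what "the difference determines the probability" can look like: the curve below is the logistic one used by Elo-style chess ratings, with a scale of 400 points. Both the curve and the 400 are assumptions borrowed from chess ratings, and the function name `win_probability` is made up for illustration. Glicko additionally damps the difference according to how uncertain the opponent's score is, which is left out here.

```python
def win_probability(score_a: float, score_b: float, scale: float = 400.0) -> float:
    """Probability that photo A is preferred over photo B.

    Depends only on score_a - score_b, so adding the same constant
    to both scores leaves the result unchanged.
    """
    return 1.0 / (1.0 + 10.0 ** ((score_b - score_a) / scale))


# Same 100-point gap, same probability, regardless of where the scores sit.
low_pair = win_probability(1000, 900)
high_pair = win_probability(4100, 4000)
print(low_pair == high_pair)
```

With this curve, equal scores give a probability of exactly 0.5, and the two photos' probabilities always add up to 1.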

It is assumed that the score cannot be known exactly, but is approximated by a normal distribution (so there is a mean, and a standard deviation).

Adding a comparison between two photos gives information about the distributions for both photos, causing each mean and standard deviation to be changed to reflect the score more accurately, as described in the Glicko paper.
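For reference, here is a minimal sketch of the standard Glicko-1 update applied to a single vote. It follows the formulas in the Glicko paper and does not include any of my modifications. The names `Rating`, `update_one` and `record_vote` are made up for this example, and the starting values of 1500 and 350 are the paper's suggested defaults for an unrated player, not necessarily what a photo system should use.

```python
import math
from dataclasses import dataclass

Q = math.log(10) / 400  # Glicko scaling constant


@dataclass
class Rating:
    mean: float = 1500.0  # assumed starting mean (Glicko default)
    rd: float = 350.0     # assumed starting deviation (Glicko default)


def g(rd: float) -> float:
    """Damping factor: the less certain the opponent, the less a result counts."""
    return 1.0 / math.sqrt(1.0 + 3.0 * Q * Q * rd * rd / math.pi ** 2)


def expected(player: Rating, opponent: Rating) -> float:
    """Expected score of player against opponent (between 0 and 1)."""
    exponent = -g(opponent.rd) * (player.mean - opponent.mean) / 400.0
    return 1.0 / (1.0 + 10.0 ** exponent)


def update_one(player: Rating, opponent: Rating, score: float) -> Rating:
    """Return player's new rating after one comparison (score 1 = win, 0 = loss)."""
    g_opp = g(opponent.rd)
    e = expected(player, opponent)
    inv_d2 = Q * Q * g_opp * g_opp * e * (1.0 - e)  # 1 / d^2 in the paper
    denom = 1.0 / player.rd ** 2 + inv_d2
    new_mean = player.mean + (Q / denom) * g_opp * (score - e)
    new_rd = math.sqrt(1.0 / denom)
    return Rating(new_mean, new_rd)


def record_vote(winner: Rating, loser: Rating) -> tuple[Rating, Rating]:
    """Update both photos from their pre-vote ratings."""
    return update_one(winner, loser, 1.0), update_one(loser, winner, 0.0)


new_winner, new_loser = record_vote(Rating(), Rating())
print(new_winner, new_loser)
```

Note that both updates are computed from the ratings as they were before the vote. Every vote shrinks the deviation of both photos, and a photo with a large deviation moves further than one whose score is already well established.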

Of my 50000 photos from the past 5 years, about 20% have been voted on at least once.

For a recent trip to Rome, where I took about 1000 photos, it took a few hours to include each photo in at least one vote, where each comparison was an unvoted photo vs a random photo (which may or may not have been previously voted on). This does not give a huge amount of information per photo.
Once that was done, I spent some hours making other votes: sometimes random vs random, sometimes random vs the photo with the closest mean. This caused the rankings to become somewhat more refined (sometimes causing surprisingly large changes in mean score).
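The three ways of choosing a pair described above can be sketched roughly as follows. This is an illustration only: it assumes the photos live in a dictionary mapping a photo id to a record with hypothetical `mean` and `votes` fields, and the function names are made up.

```python
import random


def pick_unvoted_vs_random(photos, rng=random):
    """Pair a photo that has never been voted on with any other photo."""
    unvoted = [pid for pid, info in photos.items() if info["votes"] == 0]
    if not unvoted:
        return None  # every photo has at least one vote
    first = rng.choice(unvoted)
    second = rng.choice([pid for pid in photos if pid != first])
    return first, second


def pick_random_vs_random(photos, rng=random):
    """Pair two distinct photos chosen uniformly at random."""
    first, second = rng.sample(list(photos), 2)
    return first, second


def pick_random_vs_closest(photos, rng=random):
    """Pair a random photo with the other photo whose mean is nearest."""
    first = rng.choice(list(photos))
    second = min(
        (pid for pid in photos if pid != first),
        key=lambda pid: abs(photos[pid]["mean"] - photos[first]["mean"]),
    )
    return first, second


photos = {
    "a.jpg": {"mean": 1500.0, "votes": 0},
    "b.jpg": {"mean": 1510.0, "votes": 3},
    "c.jpg": {"mean": 1900.0, "votes": 1},
}
print(pick_unvoted_vs_random(photos))
print(pick_random_vs_closest(photos))
```

The closest-mean search here scans every photo, which is fine for a thousand photos; for a much larger collection, keeping the photos sorted by mean would avoid the full scan.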

So here are the top 3 photos from that trip:



and here are the bottom 3:



It seems to work reasonably well, though I think I need many more votes to get more accuracy. But that will come over time: as new photos are added, they'll get their scores by being compared against old photos, which will give more information about the old photos too.
