Wednesday, August 31, 2011

BORP? XI

Tomorrow will see the publication of FIDE’s latest bi-monthly rating list. This event has your humble scribe more than usually aquiver with excitement since, for the first time ever, his name will appear as one of the squillions of chessers who are officially not as good as Magnus Carlsen.

If you click on my FIDE Card today you will see that Sunningdale, Gatwick and Benasque gave me a total of 17 games for rating purposes and an expected elo of 2049. That might become a tad lower if the results from Twyford make it through in time, but either way my rating will be pretty close to the result that you get if you apply the official conversion formula of 'x8 +650' to my new ECF grade of 172. I have no particular axe to grind for or against either system, then, but I do nevertheless follow the debates about which one is best with some interest.
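As a quick check of the arithmetic, the 'x8 +650' conversion can be written as a one-liner. The function name here is only a label for the illustration, not anything official.

```python
def ecf_to_elo(ecf_grade):
    """Approximate elo equivalent of an ECF grade using the 'x8 + 650' formula."""
    return ecf_grade * 8 + 650


print(ecf_to_elo(172))  # 2026
```

For a grade of 172 that gives 2026, which is 23 points below the 2049 currently shown on the FIDE card.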

The ECF Grading regime's strong point is its simplicity. You take your opponent's rating and add fifty if you win or subtract fifty if you lose (a draw simply scores your opponent's rating), the average of all such results over the season being your grade for the following year. Tournament Performance Ratings can be calculated in one's head or, at the most, be worked out swiftly using nothing more than a pad of paper and a pencil.

Mucho KISS points it might have, but the ECF three-digit rating method definitely has its anomalies. Consider what would happen if a pair of chessers, an ECF 200 and an ECF 150, played ten games against each other and scored three wins each with the remaining four drawn. You'd think that they'd be rewarded equally, but instead of the pair of them ending up as ECF 175s the 200 would become a 150 and the 150 would become a 200. This, probably the most widely cited of the curiosities that the current system can throw up, does not sound good at all, does it?
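The grade swap can be reproduced in a few lines. This is only a sketch of the averaging rule as described above: it assumes a draw scores the opponent's grade unchanged, that these ten games are the only ones counted, and that no cap is applied to the difference between the two grades. The function name and the result labels are made up for the illustration.

```python
def ecf_grade(results):
    """Average the per-game scores: opponent's grade +50 for a win,
    -50 for a loss, unchanged for a draw (assumed)."""
    adjustment = {"win": 50, "draw": 0, "loss": -50}
    scores = [opponent + adjustment[outcome] for opponent, outcome in results]
    return sum(scores) / len(scores)


# Ten games between a 200 and a 150: three wins each, four draws.
games_for_200 = [(150, "win")] * 3 + [(150, "loss")] * 3 + [(150, "draw")] * 4
games_for_150 = [(200, "win")] * 3 + [(200, "loss")] * 3 + [(200, "draw")] * 4

print(ecf_grade(games_for_200))  # 150.0
print(ecf_grade(games_for_150))  # 200.0
```

The 200 scores 3 x 200, 3 x 100 and 4 x 150, which averages 150; the 150 scores 3 x 250, 3 x 150 and 4 x 200, which averages 200.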

For elo systems, though, the reverse seems to be true. They avoid the 'grade swap' problem - at least I think they do - but they're fiendishly complicated. In fact I'm not even going to bother trying to explain how they work. If you're interested you might want to have a peek at the wiki article on the subject, although I will mention as an aside that if you manage to finish it, you're a better man than I.

Some people say that the elo method also often fails to accurately reflect the strength of rapidly improving players. That may well be true, although I've always wondered whether that issue might be a function of the number of rated games a person plays. If so, the fact that there are more and more opportunities to play elo rated games these days - Twyford, for example, not to mention the excellent e2e4 tournaments run by Sean Hewitt and the 4NCL of course - will improve the system no end.  Presumably publishing a rating list six times a year helps ameliorate the problem too.

We can't just sub-contract the whole business to FIDE because British club games don't qualify as rateable  (the playing sessions aren't long enough).  That doesn't mean the ECF can't scrap the existing system and come up with one based on Elo principles instead, though.

British chessers, your choice is clear.



keep things as they are


or



switch to elo




BORP? Index


12 comments:

Anonymous said...

The lighthouse keepers problem, players exchanging ratings, is well known. You can even get the same effect in Elo style systems.

It's better to use the alternative interpretation. If our 200 player plays against an average field of 150 and scores 50%, he's no better than them, therefore deserves to be reduced to their level. Equally if our 150 player can score 50% at a level of 200, then he's now at their level.

Elo systems do the same, but have a longer memory. Players who are now several hundred points better than when they had their first rating can lag quite seriously. It's becoming something of a problem for some of the younger British players.

Anonymous said...

Any comments on the elo system from occasional blog writer Peter Lalic?
Think you are right Jonathan, most people only play a handful of FIDE rated games each year as opposed to maybe 30-50 ECF rated ones. As such it can take years to correct a wrong FIDE rating. ECF I assume would correct the problem more rapidly but it still needs 30 games to ignore past ratings (apart from the +50 stuff) I think. Probably best to play long tournaments during school time if you want to avoid playing seriously under-rated juniors every third game. That said I have become very aware of the "under-rated" opponent excuse to explain away my bad play. Funny, I never seem to play over-rated players when it appears I am doing well.
Andrew

Jack Rudd said...

And rapidly improving players' playing more often doesn't solve the problem - it means their ratings tend to lag less, but their opponents' ratings come down just the same.

Anonymous said...

Umm, a slight technical point: if a 200-strength player scores 50% against a 150-strength player (or a group of 150-strength players) then his/her performance will be 160 rather than 150. This is due to the 40 point rule which limits the difference in grade to 40. (The ELO system has a similar proviso.)

This doesn't, of course, detract from the gist of Jonathan's point.

Angus
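Angus's correction can be checked with a small variation on the averaging rule. This is a sketch that assumes the 40 point rule works by treating any opponent more than 40 points away as exactly 40 points away, and that a draw scores the (adjusted) opponent's grade; the function and parameter names are invented for the illustration.

```python
def ecf_grade_capped(own_grade, results, max_gap=40):
    """ECF-style average with opponents' grades clamped to within
    max_gap points of the player's own grade (assumed reading of the rule)."""
    adjustment = {"win": 50, "draw": 0, "loss": -50}
    scores = []
    for opponent, outcome in results:
        # Pull the opponent's grade into [own_grade - max_gap, own_grade + max_gap].
        effective = max(own_grade - max_gap, min(own_grade + max_gap, opponent))
        scores.append(effective + adjustment[outcome])
    return sum(scores) / len(scores)


# The same ten games: three wins each and four draws.
games_for_200 = [(150, "win")] * 3 + [(150, "loss")] * 3 + [(150, "draw")] * 4
games_for_150 = [(200, "win")] * 3 + [(200, "loss")] * 3 + [(200, "draw")] * 4

print(ecf_grade_capped(200, games_for_200))  # 160.0
print(ecf_grade_capped(150, games_for_150))  # 190.0
```

With the cap the 200 is treated as having played a 160 and the 150 as having played a 190, so the grades still cross over, just by less.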

dfan said...

Ratings can have different purposes.

Elo is meant to be predictive: it attempts to solve the problem of "What is the probability of player A winning against player B today?"

BCF seems like it is designed more to measure one's past performance than to predict one's future performance.

Of course the two goals are correlated; strong players are going to show up near the top, and weak players near the bottom, of both lists.

Anyway, deciding what goal you want a rating system to fulfill can go a long way towards helping you choose one.

Jonathan B said...

Players who are now several hundred points better than when they had their first rating can lag quite seriously. It's becoming something of a problem for some of the younger British players

I played a young fellow at Twyford who has an ECF of 151 and an elo of 1300 or so. Mind you I also played a man in his 40s/50s who was 185 ECF and 1920 elo and a guy who was 164 ECF and only in the 1800s elo.

I got rather shafted for those three! Mind you, my overall preference for ECF is more about the simplicity factor than for anything else. For club players rough and ready will do. I accept that for international players a more exact system is to be preferred.

What about the rest of you though? Which pill do you want to take? Or, to answer Dfan's question, which goal are you (we) trying to achieve?

Niall said...

A view from abroad (France). In France there's a national ranking and obviously one can become Fide-rated. But it's one or the other. Generally one starts with a national ranking, which has roughly the same value as the Fide one apparently (1500 French = 1500 Fide). Once one becomes Fide, the national ranking disappears. In my case, I went from around 1700 National to around 1470 Fide (bad perfs in my Fide tournaments, naturally). This causes me, and others, several problems.
1. A lot of tournaments aren't Fide-rated, so I'm not that motivated playing in them. Bad loss? Who cares? This includes FFE (French Chess Federation) organised events such as the Coupe Loubatière (under 1700 team of four cup) and the Coupe 2000 (same only under 2000).
2. As the Fide rating only recently went as low as 1200 (or is it 1000?) this rating band is under-populated. In French club championships, one's place in the team is determined by one's rating with a 103 point tolerance e.g. an 1800 can play on board 1, a 1900 on board 2, but if the 1900 becomes 1905, s/he must play on board 1. My lowly ranking condemns me to board 5 or 6 depending on who's available to play, where I meet 1400-1500 players (if I'm lucky) who mostly haven't turned Fide. So my wins mostly count for nothing. Considering I haven't lost in two years, it's a bit disappointing. My rating doesn't progress and I'm stuck playing the same level of player.
There are also some "rapidly-improving juniors" in the same boat in my club.

So if England adopted a Fide ranking only system, I think a lot of the motivation would disappear from league chess. I'm not saying that people play purely for points, but they can help players to fight harder in games where they otherwise mightn't be that bothered.

If the suggestion is to have a separate English elo-based system, it might be an idea to keep the three-digit numbers to avoid confusion with the Fide ratings.

Mike G said...

If the ECF system and the Elo system are fed the same data (and enough of it) they will give you equivalent grades/ ratings. Both reflect past performance and can be used to predict future results.

As pointed out above, the main problem with Elo ratings for lower rated players in the UK is down to not enough results being fed into the system, thus the ECF grade is usually more accurate than Elo for many of us.

ejh said...

If the suggestion is to have a separate English elo-based system, it might be an idea to keep the three-digit numbers to avoid confusion with the Fide ratings.

It's not really necessary. In Spain we use both, and everybody knows the difference.

Jack Rudd said...

Blogger Mike G said...

If the ECF system and the Elo system are fed the same data (and enough of it) they will give you equivalent grades/ ratings. Both reflect past performance and can be used to predict future results.


Wrong. Imagine two scenarios (in which all the players concerned play enough games to make sure no results from previous seasons come in), which are exactly the same except that in one scenario, each game gets played again with the same result.

The two scenarios result in identical grades under ECF, but under Elo, the second scenario results in the ratings' having moved twice as far from their original values.
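Jack's two scenarios can be sketched numerically. The assumptions are mine, not his: Elo expectations come from the usual logistic formula and are all computed from the rating at the start of the period, K is set to 15 purely for illustration, a draw under ECF scores the opponent's grade, and the opponents and results are invented.

```python
import math


def expected_score(rating, opponent):
    """Logistic Elo expectation for a player against one opponent."""
    return 1 / (1 + 10 ** ((opponent - rating) / 400))


def elo_change(rating, games, k=15):
    """Rating change over one period, with every expectation based on
    the rating held at the start of the period (assumed)."""
    return k * sum(score - expected_score(rating, opp) for opp, score in games)


def ecf_grade(games):
    """Season average: opponent's grade +50 for a win, -50 for a loss."""
    bonus = {1: 50, 0.5: 0, 0: -50}
    return sum(opp + bonus[score] for opp, score in games) / len(games)


# Invented results: (opponent's rating or grade, score).
elo_games = [(1900, 1), (2000, 1), (2100, 0.5)]
ecf_games = [(160, 1), (170, 1), (180, 0.5)]

# Scenario two repeats every game with the same result.
print(elo_change(2000, elo_games), elo_change(2000, elo_games * 2))
print(ecf_grade(ecf_games), ecf_grade(ecf_games * 2))
```

Under these assumptions the Elo change for the doubled schedule is twice that of the single one, while the ECF average is the same either way.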

Mike G said...

You are right to identify that there is a difference in "lag" between the systems (the same point was made earlier) - that is why I was careful to specify "enough" in what I said. Of course, if you have a steadily improving player, neither system gets it right but a player who is playing at a consistent level (or close to it) and who plays ENOUGH games, then the rating/ grade will converge, I suggest.

Jack Rudd said...

"Enough" is a dangerous concept in this context. With ECF grades, it's straightforward: the more games you play, the more your grade converges to your true strength. With Elo, the more games you play, the more your grade changes within a given rating period.

In fact, if you play sufficiently much Elo-rated chess, your rating can diverge from your true strength. (It's unlikely to happen in practice - you'd need to be playing something on the order of 50 games/month with the FIDE rating system.)