Saturday, January 05, 2008

Chess is easy, it's the rating system that's difficult



[From Chessbase.]

14 comments:

Anonymous said...

Thus demonstrating the advantage of the ECF grades (which can be calculated/checked using simple arithmetic).
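To give a feel for how simple that arithmetic is, here is a minimal sketch in Python. It assumes the commonly described ECF rule, which is not spelled out in this thread: each game scores the opponent's grade plus 50 for a win, the opponent's grade for a draw and the opponent's grade minus 50 for a loss, with the opponent's grade treated as no more than 40 points away from your own, and the new grade is the average over all games. The function names are made up for illustration, and the official rules contain further details (minimum game counts, junior adjustments) that are not modelled here.

```python
def game_points(own_grade, opponent_grade, result):
    """Points scored for one game under the assumed ECF rule."""
    # Assumption: the opponent's grade is capped to within 40 points of your own.
    capped = max(own_grade - 40, min(own_grade + 40, opponent_grade))
    # Win: +50, draw: 0, loss: -50 relative to the (capped) opponent grade.
    bonus = {"win": 50, "draw": 0, "loss": -50}[result]
    return capped + bonus


def ecf_grade(own_grade, games):
    """Average the per-game points; `games` is a list of (opponent_grade, result)."""
    points = [game_points(own_grade, opp, res) for opp, res in games]
    return sum(points) / len(points)


if __name__ == "__main__":
    season = [(160, "win"), (140, "loss"), (150, "draw")]
    print(ecf_grade(150, season))
```

Everything here is addition, subtraction and one average, which is the point being made about checking a grade by hand.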

A few years ago, when I was trying to understand Elo ratings, I found via Google what seemed to be a very helpful website. I did a few calculations comparing ECF grades with Elo ratings and there seemed to be some discrepancies. I finally traced these to the fact that the table on the website for calculating Elo ratings was not the standard one (it was different to the one in Stewart Reuben's book, for example). I contacted the author of the website and pointed this out and he said "oh dear, I copied it from somewhere" (and he couldn't remember where). This points up the danger of using a system which only a minority of players understand.

One often reads an argument along the lines "we should all be using the Elo rating system because it is inherently superior" but this ignores two important facts:

(i) there is a theoretical connection between Elo and ECF grades (Elo=600+8*ECF). The fact that this formula doesn't seem to be followed for the lower range of ECF grades (and an alternative is printed in recent grading books) is because of the sparsity of the data at the lower end (and the rules for acquiring Elo ratings). If both systems had the same data (game results) then they would produce equivalent ratings/gradings (to within a few ECF grading points).

(ii) all rating/grading systems suffer from what is known as "deflation" - the tendency for grades to reduce over time. This is caused mainly by rapidly improving players (e.g. juniors) - when their games are graded/rated it is on the basis of a rating/grade which is too low. Both Prof Elo and Sir Richard Clarke (who invented the ECF grading system) understood this and wrote about it. In the ECF system junior grades are increased (when new grades are being calculated) to counteract deflation but it looks as if this has been inadequate. Logically, FIDE should be doing the same ... although I have never seen any reference to this. FIDE has an advantage over the ECF in this area because it calculates ratings every three months (or by tournament?) but ECF could also recalculate grades more often (if the data were available).
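The formula quoted in point (i) can be written out in both directions. The sketch below is only the theoretical relationship Elo=600+8*ECF and its algebraic inverse; the function names are invented for illustration, and, as noted above, the formula does not appear to hold well at the lower end of the ECF range.

```python
def ecf_to_elo(ecf_grade):
    """Theoretical conversion quoted in point (i): Elo = 600 + 8 * ECF."""
    return 600 + 8 * ecf_grade


def elo_to_ecf(elo_rating):
    """Algebraic inverse of the same formula: ECF = (Elo - 600) / 8."""
    return (elo_rating - 600) / 8


if __name__ == "__main__":
    for ecf in (100, 150, 200):
        print(ecf, "->", ecf_to_elo(ecf))
    for elo in (1850, 2000, 2250):
        print(elo, "->", elo_to_ecf(elo))
```

One consequence of the factor of 8 is that a single ECF point corresponds to 8 Elo points, so "within a few ECF grading points" means within a few dozen Elo points.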

Support your local grading system, folks.

Mike G.

ejh said...

There's also a problem which is that we don't just want to take ECF grades and say what their Elo equivalent is: we may want to take Elo grades and say what their ECF equivalent is.

This may be if, for instance, overseas players (or indeed UK players outside England) want to play in an English tournament. Or if, for instance, an English player is playing in a non-English tournament and wants to estimate the standard of the opposition.

The usual formulae aren't too helpful here, from what I can judge playing Spanish opponents regularly and trying to estimate what, say, 1850 or 2000 or 2250 would mean if the player holding such a rating had an ECF grade...

Ed-T said...

The problem with statistical systems (like ELO) is that the ratings are influenced by the overall strength of the entire pool of players. Most internet chess servers run some variant of an ELO rating system. I tried Yahoo once and easily exceeded 2000, whereas on FICS, the best I've achieved is a few hundred points lower. If I were to play on ICC (where the GMs play), I would probably look quite pathetic.

Consequently, the conversion from ECF to USCF, FIDE and "National ELO" differs quite significantly. Check out: http://www.westlondonchess.com/node/grading
The conversions originated from the ECF Web site.

Granted, the ECF grading system is much easier to calculate, but I find it rather crude. For a start, it is only updated annually. Newcomers, who can be quite strong, enter quite low in the board order and as a result get quite a low initial grade, despite winning most of their games. It takes two seasons for their grades to approach their level of ability, by which time they have improved with experience. So the grades for a new-ish player are only vaguely accurate after two seasons. We find people come and go quite frequently (some from abroad etc) so this is a great source of inaccuracy. The lag in junior grades is similar in effect.

There are some unknowns in the ECF grading system too. How do you rate an ungraded player vs another ungraded player? I presume via some form of iteration, but the methodology is not published.

It has been shown that ELO can be made usable via the use of tables - the USCF have been doing it for a long time. Despite the added complexity, I personally prefer a more accurate system of grading and feel that ELO is slightly superior in that regard.
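For anyone wondering what those tables stand in for, here is a minimal sketch of the widely quoted logistic form of the Elo calculation. It is an illustration only: the K value of 20 is an arbitrary choice for the example, the function names are made up, and the actual FIDE and USCF regulations differ in their details (published tables, varying K-factors, bonus rules).

```python
def expected_score(rating, opponent_rating):
    """Expected score (between 0 and 1) under the common logistic Elo formula."""
    return 1 / (1 + 10 ** ((opponent_rating - rating) / 400))


def updated_rating(rating, opponent_rating, score, k=20):
    """New rating after one game; `score` is 1 for a win, 0.5 draw, 0 loss.

    k=20 is an assumed value for illustration, not an official figure.
    """
    return rating + k * (score - expected_score(rating, opponent_rating))


if __name__ == "__main__":
    # Equal ratings give an expected score of one half.
    print(expected_score(1500, 1500))
    # A 400-point favourite is expected to score about 0.91.
    print(round(expected_score(1900, 1500), 2))
    # Beating an equally rated opponent gains half of K.
    print(updated_rating(1500, 1500, 1))
```

A lookup table simply pre-computes `expected_score` for a range of rating differences, so that the update can be done with addition and multiplication alone.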

Anonymous said...

I thought it was accepted that the ELO system (as interpreted by FIDE until they started reducing the starting grade) suffered from inflation, not deflation.

Tom Chivers said...

Another deflationary factor is the increased use of computers in chess: everyone's standard has increased because of them, therefore, those increases do not show up in ratings.

ejh said...

everyone's standard has increased because of them

Why would you say so?

Tom Chivers said...

Ok - not everybody's. But enough people's that the increase in standard doesn't show up in actual ratings.

ejh said...

No, what I mean is - why do you think that the possession of computers increases the standard of OTB chess? Do you mean in the sense of their database functions, so that opening theory is better known, or something else?

Tom Chivers said...

Many things, including opening theory, but also... The 24 hour ability to practice against strong opponents - programmes &/or people. Training software, whether tactics, openings, endgames. The ability to take lessons from experts no matter where you are, eg rent-a-GM on the ICC.

ejh said...

Do you think any of those actually make a difference? I'd hate to say definitely that they don't, but I'm unconvinced that they do.

Ed-T said...

Regarding the use of computers, I have to agree that it has improved the standard of chess. Mine at least. Very simply, I get the opportunity to play more games. I analyse every over-the-board game and learn more from it. When presented with an unusual opening or new line, I get to explore it using the computer. Most people I play chess with say they use computers, often for both online play and analysis. Some even develop their own opening books. I've used tactics training programs too.

Back to the rating system for a bit; I think even the ECF grading system could benefit from a quicker update cycle (every 6 months perhaps like rapidplay grades?). The main reason for compensating juniors is their fast rate of improvement. One wouldn't need to compensate so much if the rating system actually kept up with their abilities.

Tom Chivers said...

I think so. I think I'm a better player now than I was when I was 16 or 17 - when I had more or less the same grade as now. I should really dig out some old scoresheets though, to compare the (lack of) quality directly.

Thomas said...

Just to pitch in that my experience was that the computer destroyed my playing standard by being unbeatable. Never winning had such a demoralising effect that my real game went down the toilet too.

I recently (about 2yrs ago) found the same thing when I upgraded my machine and could no longer beat gnushogi. Again, my off-simulator play started to suffer.

I've never found a chess simulator that was any good at simulating weak play as seen in humans. Turning the difficulty down to the point where I could win invariably meant the computer making bad moves that no human would make.

It's a bit like a runner training against a robot motorbike whose idea of "turning it down" is to completely stop every so often to let the runner get ahead.

Ed-T said...

The idea is to use the computer as an analysis/learning tool rather than an opponent. There are however two programs I do sometimes play against:

Chessmaster 10th edition. You play characters with certain strengths and flaws. My experience here is that while it may give away 1 point in the beginning, Chessmaster plays incredibly well for the rest of the game. Really hard work. Chessmaster XI is out, but I don't know how it compares.

Fritz 10 has a "Friend" mode and a "Sparring" mode, which I think can be beneficial for training. In the sparring mode, a light flashes to indicate a possible (minor) blunder by your computer opponent that you should be able to take advantage of. This is less frustrating than Chessmaster.