- The Mecca of Chess Programs -

Conquering the Top of the Swedish Rating List: The Mecca of Chess Programs

by Claudio Bollini

For a long time, the Swedish rating list of the SSDF has supposedly provided a reliable way to estimate the Elo of a chess program. We accepted its results uncritically and naïvely, as we watched programs reach FIDE Master level first, then IM level, and finally enter fully into the Grandmasters' league. Numeric extrapolations were carried out and ascending curves were traced, trying to predict when those silicon players would surpass the barrier of 2700-2800 Elo points.

But for some years now, a series of murky circumstances has cast shadows over the real consistency of the Swedish rating. First came the so-called affair of the "hidden" or "cooked" books: it was proven that some programmers included in their creations carefully prepared opening lines designed to exploit gaps in the book of a given rival. Again and again they crushed programs that were not prepared to react, and so climbed the list artificially. Why "artificially"? Because the gain was meaningless as a trustworthy reference for chess players.

Suppose that program "A" is stronger against humans than program "B" (in other words, it would obtain a better Elo in live tournaments), but the latter has some specially prepared hidden lines against program "A". Note that these are not necessarily better variations that would improve the quality of the opening book, but merely refutations of very particular weaknesses inside A's book. In this way, "B" is able to improve its Elo at A's expense. This is not the usual practice of professional chess, i.e. the advance preparation of opening lines against a flesh-and-blood competitor, because a Master would never repeatedly commit the same mistake in a variation where he had previously failed badly.
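The arithmetic behind this rating transfer can be illustrated with the standard Elo update rule. The starting ratings, K-factor and number of games below are illustrative assumptions of ours, not actual SSDF parameters:

```python
# Minimal sketch of the standard Elo update rule, illustrating how
# repeatedly winning the same "cooked" line inflates a rating.
# Ratings, K-factor and game count are illustrative, not SSDF's values.

def expected_score(r_a: float, r_b: float) -> float:
    """Expected score of player A against player B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def update(r_a: float, r_b: float, score_a: float, k: float = 16.0):
    """Return both new ratings after one game (score_a: 1 win, 0.5 draw, 0 loss)."""
    e_a = expected_score(r_a, r_b)
    return r_a + k * (score_a - e_a), r_b + k * ((1.0 - score_a) - (1.0 - e_a))

# Program B (nominally weaker, 2350) keeps winning the same prepared
# line against program A (2450) in automated play:
r_a, r_b = 2450.0, 2350.0
for _ in range(20):
    r_a, r_b = update(r_a, r_b, score_a=0.0)  # A loses every repetition

print(round(r_a), round(r_b))  # B has now overtaken A on the list
```

After twenty repetitions of the same trap, B's published rating exceeds A's, even though nothing about B's play against humans has changed.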

A second incident was triggered by an accusation against the popular Fritz program. The ChessBase people had allegedly provided their own interface for Fritz 5's automated games, while the rest of the commercial programs remained connected through the standard Auto232 auto-player. If the party responsible for a program is the same one who supplies its interface, some mistrust regarding secret devices is unavoidable, which resembles the hidden-books issue. The whole affair was about to blow up, but the Swedish association committed itself to standardizing all the interfaces and to accepting only the programs supplied by the distribution companies.

Surprisingly, the current crisis has nothing to do with disloyal behaviour, but with a decision that is hard to come to terms with. According to Ed's announcement on this web page, several programmers are undertaking the task of stripping large amounts of code from their evaluation functions, in order to gain speed and conquer their computer competitors by the brute force of tactical calculation.

Mark Uniacke and Ed Schröder himself have included, respectively, an "anti-human" option (Hiarcs 7) and an "anti-GM" option (Rebel 10). They advise disabling these algorithms when facing other programs, turning them into a quicker mode at the expense of strategic knowledge. Contrary to Mark, Ed has set his "against humans" algorithm as the default (with better sense, we believe). We find the reason obvious: the final addressee of a chess program is the user himself!

The seriousness of the situation is evident, and it once again calls the credibility of the SSDF rating into question. Reaching pole position has become a goal in itself, and everything must be sacrificed to this supreme objective, even the quality of the program. The motives are suspected to be commercial in nature: topping the Swedish list is still a good way to noticeably increase sales.

There is nothing necessarily wrong with marketing strategies. But what kind of product would they be offering to final users? What concrete benefit would a program specially optimized to compete against other programs bring to a serious player? We agree that if it is tuned to defeat its computerized colleagues, it would probably be an almost invincible rival at blitz chess. But if this tendency continues, privileging speed over knowledge, its inability to compose medium- and long-range plans would become apparent, no matter which time control is chosen.

In both the RGCC and CCC discussion groups, numerous examples have been posted to illustrate the stubborn inability of even the best chess programs to handle certain positions. Not even a deep and speedy search can nowadays find a solution to a variety of cases that are obvious to a beginner. A sample taken from among thousands:

White: King d4, Bishop c4, Bishop e4; Black: King d6 (both white bishops on light squares). This is an extraordinarily unusual, though not impossible, position. At a glance it is an evident draw, since two bishops on the same colour can never deliver mate; but no program is able to grasp this plain truth unless the programmer has inserted some very specific lines of code to foresee this particular case. When I state that no program can "grasp" this, I refer to the capacity to induce, from general principles, a sound judgement for a singular position like this one. Incidentally, a search of some 100 plies would be necessary to realize, by the sheer brute-force method, that the 50-move rule is unavoidable, and to admit a zero evaluation score.
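The "very specific code" the author mentions could look something like the following sketch. The function names and the piece-list representation are our own illustrative choices, not taken from any real engine; the chess fact encoded is simply that bishops confined to one square colour can never mate a bare king:

```python
# Illustrative sketch of a hand-written draw rule: king + bishops all on
# one colour versus a bare king is a dead draw, because the bishops can
# never attack squares of the other colour. Representation (lists of
# (piece_letter, algebraic_square) pairs) is our own assumption.

def square_colour(sq: str) -> int:
    """Return 0 for a dark square, 1 for a light square (e.g. 'c4' -> 1)."""
    file = ord(sq[0]) - ord("a")   # a=0 .. h=7
    rank = int(sq[1]) - 1          # 1=0 .. 8=7
    return (file + rank) % 2

def is_dead_draw(white, black) -> bool:
    """True when one side is a bare king and the other has, besides its
    king, only bishops that all stand on squares of a single colour."""
    strong, weak = (white, black) if len(white) >= len(black) else (black, white)
    if [p for p, _ in weak] != ["K"]:
        return False                       # defender must be a bare king
    extras = [(p, s) for p, s in strong if p != "K"]
    if any(p != "B" for p, _ in extras):
        return False                       # any non-bishop may still mate
    colours = {square_colour(s) for _, s in extras}
    return len(colours) <= 1               # one colour (or K vs K): no mate

# The position from the article:
white = [("K", "d4"), ("B", "c4"), ("B", "e4")]
black = [("K", "d6")]
print(is_dead_draw(white, black))  # True: both bishops on light squares
```

A few lines of static knowledge settle in microseconds what a brute-force search would need roughly 100 plies to prove; the point of the example is precisely that such judgements do not emerge from search alone.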

Programs with an ever-shrinking baggage of chess knowledge would inexorably be defeated by a Grandmaster at tournament time controls, after a few exploratory games, even while reigning unbeaten at the top of the SSDF list.

If this alarming policy continues, what would our diagnosis be of the future value of computer-versus-computer matches? As we said, we doubt that the SSDF can survive as a trustworthy reference, in human terms, for measuring the progress of programs. In our opinion, the only way (simple, direct, irrefutable) to obtain a good picture of the real strength of a program is to enter it as a contestant in human tournaments or matches. To do this, the companies would have to take a small risk with their immediate gains, betting on a change of mentality in the computer chess world. We believe this change consists in reconsidering what is the essential feature to be demanded of a program's pedigree: a brilliant record defeating its commercial competitors, or a good campaign, perhaps earning GM norms, in tournaments of international category? In other words, which is more valuable: an impressive SSDF Elo or a respectable real Elo?

We hope that both programmers and chess players will gradually adopt this perspective. Otherwise, how much more knowledge will have to be sacrificed on the altar of marketing, retracing the paths opened by almost half a century of laborious research? What will become of that formidable challenge taken up by entire generations of chess programmers, who tried hard to translate the highly complex principles of chess into computer language? Have they now given up (honourable exceptions aside), renouncing the fascinating and formidable task of simulating, ever more accurately, the games of a Grandmaster? In our Rebel 9 review we used the analogy of the Turing Test: we wish for the moment when an elite chess player, confronting a computer under tournament conditions, would be unable to tell whether or not he is playing another Master. Even more, we expect that in the near future FIDE will allow such programs to compete under fair conditions, to obtain an official Elo, and (why not?) to aspire to the World Chess Champion title.