Wednesday, February 28, 2018

To analyze using a computer part 3

December was a dark month for Stockfish. First, Stockfish was beaten by Alpha Zero (see chess.com). Next, Stockfish failed to qualify for the TCEC super-final of season 10, for the first time in years. Looking only at those facts, many would decide to stop working with Stockfish.

However, if we look at the data more closely, we see a very different picture. The defeat by Alpha Zero came down to a score of 64-36, which corresponds to a gap of about 100 Elo (see the FIDE handbook). Besides, there was a lot of criticism of how Stockfish was operated. In my previous article I already explained how easily you can get a difference of a couple of hundred rating points just by manipulating some parameters. That is without even taking into account the missing opening-book and the very limited hash-table, which definitely also affected the playing-strength of the engine. In short, a match in which Stockfish got better conditions could well reverse the final result, but of course Google's DeepMind will never allow this.
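
The 100-point figure can be checked with a small calculation. The sketch below uses the common logistic approximation of the Elo model rather than the lookup table in the FIDE handbook. The function name and the choice of formula are assumptions of this sketch, and the result can differ from the table by a few points.

```python
import math


def elo_gap_from_score(score_fraction: float) -> float:
    """Rating gap implied by a match score, using the logistic Elo approximation.

    score_fraction is the share of points scored by the winner (0.64 for 64-36).
    """
    if not 0.0 < score_fraction < 1.0:
        raise ValueError("score fraction must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score_fraction - 1.0)


gap = elo_gap_from_score(0.64)
print(round(gap))  # prints 100
```

An even score of 50% gives a gap of 0, and the gap grows quickly as the score approaches 100%, which is why a lopsided-looking 64-36 is still only about 100 points.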

Stockfish's early elimination after stage 2 of season 10 was no disgrace at all either. Contrary to previous years, the super-final was played immediately after stage 2. Consequently a lot depended on how the best engines performed against the weaker ones. Stockfish was the only engine not to lose a game but still finished half a point behind the leaders. So the 2 finalists (Komodo and Houdini) never proved their superiority. Personally I believe Stockfish is the strongest engine available today. After the final it won the TCEC-rapid, and the newest release 9 (available since the beginning of this month) leads at CCRL by 39 Elo.

Naturally Chessbase doesn't report anything about the success of Stockfish on their website. The engine destroys their market. Even when a new top-engine like Houdini 6 is introduced, it receives mainly negative comments, along the lines of: you really need to be stupid to pay 100 euro while a stronger engine is available for free. An open forum attracts extra visitors but can also damage your business. Personally I think such comments are inappropriate. First, you get more than just an engine for the 100 euro. Also, you can't expect everybody to want to work for free. There is nothing wrong with trying to make a living from creating new, interesting things.

Still, if we concentrate only on the engine, we can wonder how necessary it is to get an extra engine besides Stockfish. Do other engines have added value for us? Honestly, I doubt it for 99.9% of the players. The quality of Stockfish's analysis suffices for any player up to at least 2600 FIDE and maybe even higher. Only for some theoreticians, like correspondence-players and the world-top in otb, does it become doubtful to rely only on Stockfish. I call it doubtful because today it is really not clear whether an extra engine still brings something extra. To support this statement, I carried out a special research-project during the 2 weeks of the last Christmas holidays.

In the 6th round of the last Open Leuven, just like last summer at Gent (see evolution), I didn't stand a chance against the Belgian IM Stefan Docx. Again I was surprised in the opening and was trailing the whole game. However, this time the problem with the opening was more serious than last time. Despite many hours of analysis, I was not able to repair the system at home. In the end I had to admit the opening was not fully correct, so I should look for something else instead. However, I am rather reluctant today to learn something completely new from scratch. I play few games, so the work should be proportional. Stefan recommended the classical Dutch, as that is the most closely related to my repertoire. Just recently our current world-champion Magnus demonstrated the viability of this opening in a secret online blitz-game.
[Event "Online blitz"] [Site "?"] [Date "2017.??.??"] [Round "?"] [White "Gustafsson, J."] [Black "Carlsen, M."] [Result "0-1"] [ECO "A99"] [PlyCount "50"] [Sourcedate "2018.02.22"] [Sourceversiondate "2018.02.22"] [WhiteElo ""] [BlackElo ""] [CurrentPosition "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"] 1.Nf3 f5 2.g3 Nf6 3.Bg2 e6 4.O-O Be7 5.c4 O-O 6.Nc3 d6 7.d4 Qe8 8.b3 a5 9.Ba3 Na6 10.Rc1 Nb4 11.Qd2 Ne4 12.Nxe4 fxe4 13.Ne1 Qg6 14.Nc2 Bg5 15.e3 Nd3 16.Rcd1 e5 17.Bxe4 { (After the game Jan complained that he hadn't yet studied this opening but his defeat wasn't related to his poor knowledge.) } ( 17.dxe5 Bg4 18.Bxe4 Qxe4 19.Qxd3 Qxd3 20.Rxd3 Be2 21.Rd2 Bxf1 22.Kxf1 Be7 23.Ke2 $16 ) 17...Qxe4 18.Qxd3 Qf3 19.dxe5 Bf5 20.e4 ( 20.Nd4 Bxd3 21.Nxf3 Bxf1 22.Nxg5 Be2 23.Rd2 a4 24.Bb2 Bg4 25.exd6 axb3 26.axb3 cxd6 27.h3 $14 ) 20...Qxd3 21.Rxd3 Bxe4 22.Rc3 Bd2 23.Re3 Bxc2 24.Re2 Bb4 25.Bxb4 Bd3 0-1
Nevertheless, Magnus can probably play anything, especially at blitz, and still win. Even in that game black's position was pretty dubious for several moves. So I wasn't yet convinced that I should take up the classical Dutch in my repertoire. I needed to know more about the quality of the opening. Of course the first thing to check is recent articles and books about this opening. The e-book The Killer Dutch, published in 2015, can probably be considered the best up-to-date theoretical work on this opening today. The author is the English grandmaster Simon Williams, the biggest expert in the opening. Unfortunately, from a theoretical point of view this book falls short. Simon doesn't write for theoreticians but explains the opening purely from the viewpoint of practical tournament-play. That is understandable, as the practical player is his main audience. As a result, some critical lines are categorized a little too easily as harmless.

If no good theoretical references exist, the only thing that remains is to start your own research. However, that is easier said than done. Even an opening like the classical Dutch contains a myriad of variations today. Below you see a screenshot of my current personal opening-book, built only from games in the Megadatabase in which at least one player is rated above 2300 FIDE.
That is more than 1700 games, and we still need to add the correspondence-games and the engine-games, which could also influence the evaluation of the different lines. In my article studying openings part 2 I already explained that 100 games often take about a week to digest. So I realized in advance that I had to change my working-methods to process more than 1000 games within an acceptable time-frame. The first adaptation was to prioritize the lines played most often in practice instead of the lines recommended by the engine. Especially at a very early stage of the opening, which is the case here, we often see the engine play inferior moves compared to a strong opening-book (see also the earlier reference to the match between Alpha Zero and Stockfish). It is necessary to analyze side-lines to support the main-lines, but analyzing side-lines to detect the main-lines is mainly a waste of time.
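
Prioritizing by practical frequency can be sketched as a simple count of continuations. The snippet below is only an illustration: the function name and the toy move lists are hypothetical, not taken from the opening-book, and a real database would also need to handle transpositions.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def rank_continuations(
    games: Sequence[Sequence[str]], prefix: Sequence[str]
) -> List[Tuple[str, int]]:
    """Count the next move played after `prefix`, most frequent first.

    Each game is a list of moves in SAN. Games that do not start with
    `prefix`, or that end right after it, are ignored.
    """
    depth = len(prefix)
    counts: Counter = Counter()
    for moves in games:
        if len(moves) > depth and list(moves[:depth]) == list(prefix):
            counts[moves[depth]] += 1
    return counts.most_common()


# Toy data for illustration only; these are not real games.
toy_games = [
    ["d4", "f5", "g3", "Nf6"],
    ["d4", "f5", "g3", "e6"],
    ["d4", "f5", "c4", "Nf6"],
    ["d4", "f5", "g3", "Nf6"],
    ["Nf3", "f5", "g3"],
]

ranking = rank_continuations(toy_games, ["d4", "f5"])
print(ranking)
```

The lines at the top of such a ranking get analyzed first; the engine is then used to check them rather than to pick them.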

A second important time saving without loss of quality was expected from no longer checking everything with a 2nd engine (see my old article to analyze using a computer part 1). It is very time-consuming to switch between engines, even if 2 computers are used. That is the link to my introduction, in which I announced a special research-project. Instead of checking everything twice, I only did so for the positions I considered critical for the evaluation of the opening. These are only the positions where, according to the first engine, best play by both sides could or would still give white an edge.

In the end only 18 positions remained to be checked by a second engine. The result was stunning. Only for 3 positions was there a conflict, each time due to the smallest possible difference of only 1 hundredth of a pawn. I discussed this absurd phenomenon in my article annotations. I like to use strict boundaries to achieve a very objective method of analyzing, but in some exceptional cases this can create weird evaluations. In other words, the only conclusion of the project is that the extra analysis with the 2nd engine did not add anything substantial.
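
To illustrate how a strict boundary can turn a difference of 1 hundredth of a pawn into a conflict, here is a minimal sketch. The cut-off values and labels are hypothetical and chosen only for the example; they are not the boundaries from the article annotations.

```python
# Hypothetical cut-offs in hundredths of a pawn (centipawns).
SLIGHT_EDGE_CP = 30
CLEAR_EDGE_CP = 70


def label(eval_cp: int) -> str:
    """Map an engine evaluation (positive = good for white) to a verbal label."""
    magnitude = abs(eval_cp)
    if magnitude < SLIGHT_EDGE_CP:
        return "equal"
    side = "white" if eval_cp > 0 else "black"
    if magnitude < CLEAR_EDGE_CP:
        return f"slight edge for {side}"
    return f"clear edge for {side}"


def engines_conflict(first_cp: int, second_cp: int) -> bool:
    """Two engines 'conflict' when their evaluations fall under different labels."""
    return label(first_cp) != label(second_cp)


# 0.29 versus 0.30 straddles the boundary, so the labels differ.
print(engines_conflict(29, 30))
# 0.30 versus 0.45 differs far more, yet the labels agree.
print(engines_conflict(30, 45))
```

With any fixed boundary, two evaluations that are practically identical can end up on opposite sides of it, while a much larger gap inside one band goes unnoticed.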

It again demonstrates how drastically the engines have evolved in recent years. 10 years ago it was rare for 2 top-engines to agree so often. You could easily find positions in which one engine would tell you that white is winning while the other would state that black is winning. Extra analysis was necessary to find out which one was right. In some cases the truth was somewhere in the middle. Today we not only see top-engines prefer the same move, but sometimes they even show the same main-line. In fact this is not so surprising, as engines are getting closer to perfection. Besides, Stockfish follows an open-source strategy. Everybody is allowed to see its code and learn from it. Of course other top-engines copy ideas, which again diminishes the differences between them. In the past the top-engines were more closed.

So does this mean we can skip the 2nd engine forever? No, we can't draw that conclusion yet. That we don't see differences for the classical Dutch does not mean that none exist between the engines. The qualities of an engine do not depend on just one opening. That is also the most important reason why engine tests use a wide range of openings. Recently I detected a serious difference of evaluation in a position arising from a Spanish Breyer-opening. See the screenshot below, in which both engines are calculating in parallel.


While Stockfish claims a big advantage, Komodo states it is approximately equal. So for some special positions extra analysis remains necessary to know which engine to trust. In any case, the number of such special positions is diminishing quickly.

Theoreticians will still have to use a second engine. For players not interested in maximum quality, it no longer makes sense to buy a 2nd engine. I will also use a 2nd engine less in the future. Only for detecting small differences (0.3 pawn) do I still see added value. Today a 2nd engine has become redundant in positions with an evaluation higher than +2 or lower than -2.

Finally, some readers may wonder what I concluded about the classical Dutch. A very concise summary of the weeks of analysis can be found below.
[Event "Veterans World Cup9 Gr20"] [Site "ICCF"] [Date "2016.09.01"] [Round "?"] [White "Oppermann, Peter"] [Black "Prystenski, Arthur"] [Result "1-0"] [ECO "A96"] [WhiteElo "2291"] [BlackElo "2222"] [PlyCount "73"] [EventDate "2016.??.??"] [Sourcetitle "UltraCorrX-revised"] [Source "Tim Harding"] [Sourcedate "2017.09.15"] [Sourceversion "2"] [Sourceversiondate "2017.09.15"] [Sourcequality "1"] [CurrentPosition "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"] 1.d4 e6 2.c4 f5 3.Nf3 Nf6 4.g3 Be7 5.Bg2 O-O 6.O-O d6 7.Nc3 { (This position can be reached via different move-orders. It is the critical starting position of the classical Dutch.) } 7...Ne4 { (The English grandmaster Simon Williams calls this the modern variation. The play quickly becomes very sharp in this line, contrary to the more positional a5. Anyway this is black's best chance to equalize, as my analysis showed that a slow approach by black allows white too much initiative.) } 8.Nxe4 { (I consider this the most critical test but also against Qc2, Qd3, Bd2 and Nd2 black has no easy task to get equality.) } 8...fxe4 9.Nd2 { (Ne1 should not be underestimated by black.) } 9...d5 10.f3 { (First e3 is also possible and can lead to a transposition of the game.) } 10...Nc6 11.e3 { (Previously fxe4 was considered the mainline but today I find e3 clearer.) } ( 11.fxe4 Rxf1+ 12.Kxf1 { (This refinement to Nxf1 was discovered first in 2013 and played in correspondence chess.) } 12...dxc4 13.Nf3 b5 14.Be3 Qf8 { (The Czech correspondence-player Zdenec Nemec is the last player still willing to defend this line. Anyhow I can't say black's play is a walk in the park.) } 15.Kg1 ( 15.a4 b4 16.Kg1 Rb8 17.Rc1 Na5 18.Ne5 c3 19.bxc3 b3 20.c4 Qe8 21.Rb1 Ba6 22.Bf4 g5 23.Be3 Qxa4 24.Bh3 { (The correspondence-game Pekin,T - Nemec, Z played in 2017 was drawn after 33 moves.) } ) 15...Bb7 16.Qd2 Rd8 17.Bh3 Qf6 18.Bg5 Qf7 19.b3 cxb3 20.Bxe7 Qxe7 21.axb3 a6 22.Rc1 h6 23.Qc3 Re8 24.Qc5 { (The correspondence-game Willmann,B - Nemec,Z played in 2015 was also drawn in 51 moves.) } ) 11...exf3 12.Nxf3 b6 { (Bf6 was played more often in otb but after Qc2 recommended by the engines white is simply better.) } 13.Bd2 Bb7 14.Rc1! Qd6 15.Qc2! Rac8 16.cxd5! exd5 17.b4! { (After a series of powerful moves by white, black gets into serious trouble.) } 17...Nxb4 18.Bxb4 Qxb4 19.Bh3 Qa3 20.Ne5 Rce8 21.Be6+ Kh8 22.Nf7+ Rxf7 23.Bxf7 Rc8 24.Be6 Rd8 25.Rf3 c6 26.Bg4 Bf6 27.Kg2 Qe7 28.Qa4 Ba8 29.Bf5 g6 30.Bd3 Bg7 31.Rcf1 a5 32.Qb3 Rb8 33.Bc2 Kg8 34.h4 c5 35.Rf7 Qe6 36.dxc5 b5 37.Ra7 { (This was breathtaking chess. This level of modern correspondence-chess cannot be achieved by us mortals in otb.) } 1-0

Brabo

Addendum 21 March 2018
At http://www.chesspub.com/cgi-bin/chess/YaBB.pl?num=1369191586/75 an important improvement was shown for black by the German FM Stefan Buecker in the classical Dutch. The novelty 13...a5 instead of the played 13...Bb7 revitalizes this line.

1 comment:

  1. OK. Some problem earlier.

    You have very interesting blog. I found it only recently. I myself have not been active for some years now but if I decided to start play again and catch up all the theory I was left behind this time, surely this blog of yours is the place to start. In many ways.

    Thanks from Finland by one national CM
