Until a couple of months ago I never bothered testing chess engines. I didn't see any value in it. I would never be able to match the quality of the results CCRL publishes weekly. Besides, such work is not cheap, as you need to invest in hardware, electricity, floor space and so on. On top of that, most games played by engines are pretty boring. You are better off watching games between humans if you want drama and creativity.
However, as I mentioned in my last article, I had an open question about Leela. Neither CCRL nor other sites tell me how strong Leela is in comparison with the classical engines when both use exactly the same type of hardware. That is a problem for me. I can install Leela on my PC for free, but I only want to use it for analysis if I know the engine is one of the 2 strongest I possess. I have been using that rule for a very long time; see my article of 2012 about how I analyze. Some may consider this a bit silly, but it assures me that my opponents will likely not have any better analysis.
So in the end I decided to do the testing myself. The next question is of course how to do this job quickly, accurately and as cheaply as possible. I could use a set of puzzles, but that covers only one aspect of an engine. I prefer to test an engine by letting it play games, but I can't and don't want to do without my hardware for several months. A good compromise was a rapid match of 100 games at a rate of 15 minutes + 10 seconds increment. That should give a good indication of the playing strength. At stake was a place among my top 2 engines, so logically I chose Komodo 11 as the opponent for the match.
The next question is what to decide about the openings. Do we give the engines full freedom of choice, or do we select a number of positions which need to be played out once from each side, as TCEC does? Free choice is how we humans play our games, but it has some disadvantages. The engines will likely play openings which are not part of my repertoire. There is a risk that they play very safely and we get an abundance of draws. Finally, without an opening book Leela will play almost exclusively the same moves in the opening, so you risk seeing the same opening or even the same game several times.
Therefore I preferred to let the engines start from a predefined set of openings. Which openings to choose is then the next logical question. It didn't take me long to find a good answer. I created a new database and added a selection of 50 of my own recently played games. Next I removed the moves beyond the 10th in all games. The few duplicates I got were replaced by a few other games of mine. The final result was a nice mix of 50 positions, in some of which the balance was already disturbed. This way I avoided too high a number of draws. Besides, the engines will only play openings which have occurred before in my practice, which of course makes the match more fun to watch.
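For readers who would rather script this step than do it by hand in a database program, here is a minimal sketch of the truncation and duplicate check. It is not what I actually did (I worked in my chess database), and it rests on assumptions: each game is given as a plain list of moves in SAN, "10 moves" is read as 10 full moves (20 plies), and the function name `truncate_openings` is just a label I made up for illustration.

```python
def truncate_openings(games, full_moves=10):
    """Cut every game to its first `full_moves` moves and drop duplicate lines.

    `games` is a list of move lists (assumption: SAN strings, White's and
    Black's moves alternating). Duplicates are detected by identical move
    order only, so transpositions into the same position are NOT caught.
    """
    plies = 2 * full_moves          # one full move = one white + one black ply
    seen = set()
    openings = []
    for moves in games:
        line = tuple(moves[:plies])
        if line in seen:            # same opening line already selected
            continue
        seen.add(line)
        openings.append(list(line))
    return openings


# Toy data, cut after 1 full move to keep the example short.
toy_games = [
    ["e4", "e5", "Nf3"],
    ["e4", "e5", "Nf3", "Nc6"],
    ["d4", "d5"],
]
selected = truncate_openings(toy_games, full_moves=1)
# selected == [["e4", "e5"], ["d4", "d5"]]
```

Whenever the result contains fewer lines than wanted, extra games have to be added by hand, just as I did to get back to 50 positions.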
Finally everything was ready. In Fritz I opened the window to set up the match, as obviously I wanted to automate the whole process. First I selected Leela, next Komodo 11. I selected the right time control, and the last step was linking to my special database of 50 positions. After verifying all parameters I clicked OK and the match was under way.
The match lasted about 3 full days. I let my PC run day and night, but I did interrupt the process a few times to allow my PC to cool down, as around that time we were having temperatures around 40 degrees in Belgium. Anyway, it was very easy to continue the match from the point where I had paused.
The match was a big success which exceeded my expectations. First, it quickly became clear that both engines were very close in strength but had very different styles. Often the games got extremely interesting, and besides, they were all played from openings in my repertoire. A number of times I watched 1 or more games live, sometimes even together with my children. My children also regularly asked about the preliminary score, as we all got attached to little Leela, which despite her tactical handicap (more about that later) often managed to defeat the giant Komodo.
It left me wanting more, so I decided to organize two more such matches in the following months with newer releases of Leela. For the 3rd match I decided to replace some of the openings. If in the 2 previous matches the same color had won an opening 4 times (so irrespective of the engine), then it seemed more appropriate to select another opening for the test.
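The replacement rule is easy to write down as a small sketch. The assumptions here are mine: every game is recorded as a pair of an opening number and a result string ("1-0", "0-1" or "1/2-1/2"), and each opening was played twice per match, so 4 games over 2 matches. The function name and the sample results are made up for illustration and are not taken from my matches.

```python
from collections import defaultdict


def openings_to_replace(results, threshold=4):
    """Return the openings in which one color won at least `threshold` games.

    `results` is an iterable of (opening_id, result) pairs, with result being
    "1-0" (White wins), "0-1" (Black wins) or "1/2-1/2" (draw).
    Which engine had which color is deliberately ignored.
    """
    wins = defaultdict(lambda: {"white": 0, "black": 0})
    for opening_id, result in results:
        if result == "1-0":
            wins[opening_id]["white"] += 1
        elif result == "0-1":
            wins[opening_id]["black"] += 1
        # draws count for neither color
    return sorted(o for o, w in wins.items() if max(w.values()) >= threshold)


# Hypothetical results for 2 openings over 2 matches (4 games each).
sample = [
    (1, "1-0"), (1, "1-0"), (1, "1-0"), (1, "1-0"),      # White always wins
    (2, "1-0"), (2, "1/2-1/2"), (2, "0-1"), (2, "1-0"),  # mixed results
]
flagged = openings_to_replace(sample)
# flagged == [1]
```

Such an opening says more about the starting position than about the engines, which is why it is better replaced.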
Leela narrowly lost 2 of the matches. In the second match Leela tied with Komodo. I considered this a very unexpected and exceptionally good result on my modest computer, which is definitely not optimal for Lc0. On the other hand, the matches didn't answer my original question. The scores were too close to know for sure which of the 2 engines was the stronger. Anyway, this is not a disaster, as I got to know Leela very well over the 300 games. I now have a pretty good idea of when to use Leela for analysis.
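Why a close score over 100 games cannot settle the question can be illustrated with a rough calculation. The sketch below uses the standard logistic Elo formula and a normal approximation for the error margin; both are common practice but my own choice here, not something the match software reported. The win/draw/loss numbers in the example are purely hypothetical and are not the real scores of my matches.

```python
import math


def elo_from_score(score):
    """Elo difference implied by a score fraction strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_interval(wins, draws, losses, z=1.96):
    """Return (elo, low, high): the estimate and an approximate 95% interval.

    Uses a normal approximation of the mean score per game; it breaks down
    for very lopsided results where score +/- margin leaves the range (0, 1).
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # variance of a single game result (1, 0.5 or 0) around the mean score
    variance = (wins * (1.0 - score) ** 2
                + draws * (0.5 - score) ** 2
                + losses * (0.0 - score) ** 2) / n
    margin = z * math.sqrt(variance / n)
    return (elo_from_score(score),
            elo_from_score(score - margin),
            elo_from_score(score + margin))


# Hypothetical 100-game match: 30 wins, 42 draws, 28 losses (score 51%).
elo, low, high = elo_interval(30, 42, 28)
print(f"Elo difference: {elo:+.0f} (95% interval {low:+.0f} to {high:+.0f})")
```

With numbers like these the interval stretches far on both sides of zero, so either engine could be the stronger one. Narrowing it down meaningfully would take many more games than my hardware can spare.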
In my previous article we already got acquainted with Leela by looking at how the engine reacts in different types of positions, but it is only by replaying her games that we fully realize how different the engine is from the traditional ones. So to conclude this article I made a selection of 3 games which demonstrate very well the strengths and weaknesses of Leela. This was not so easy, as there was a very large number of beautiful games. I start with a fantastic game played from the Chigorin variation of the Spanish opening (I covered the opening recently in my article statistics). Leela sacrifices an exchange very early and, like a real boa constrictor, succeeds in slowly suffocating Black.
What is extraordinary about this game is that there is no fixed center. The battle rages over the full board, but Black never gets a chance to exploit the extra exchange.
A second game starts from a Dutch Stonewall which I encountered in one of my own games, played at the end of 2017 against the Dutch IM Xander Wemmers (see secret). In the game we see the advance of both rook pawns, which is very typical of Leela's style. Next we see a magnificent demonstration of activity. Komodo doesn't understand at all what Leela is trying to do.
Leela plays this game, like many others, with an understanding of open lines and bad bishops that is much more advanced than Komodo's.
If you have replayed the 2 previous games, then you probably start to wonder why Leela didn't destroy Komodo in the match. Well, tactically things often went completely wrong. A nice example is the next game, in which Leela sees the combination 5 moves too late.
Fans of my blog will likely have already recognized the link to my article the butterfly-effect. All the moves were already covered in that article, so it was definitely a surprise to see them all executed on the board.
I have come to enjoy testing chess engines via this kind of match. A new match won't happen immediately, as other work needs to be done first. Besides, Leela is building a new network from scratch, and today it is still much weaker than the networks of a couple of months ago. It would also be nice to have newer and stronger hardware by the time of the next match.
Brabo