Chess.com and Game Reviews: there has to be a better way

If I, a complete chess patzer, can see that your game reviews are flawed — even on maximum strength — what about those who don’t know any better?

This will be the first (and probably last) time I actually do chess analysis on this blog. Here’s the problem: chess.com is using Stockfish 17.1 on “maximum” analysis mode, but it’s not spending that time actually analyzing the opening. Here’s the game for reference, followed by chess.com’s analysis at various levels, followed by my analysis of chess.com’s analysis:

Game moves follow. I’m black.

1. e4 e6 2. Nf3 d5 3. exd5 exd5 4. Qe2+ Be7 5. d3 Nf6 6. Bg5 O-O 7. Bxf6 Bxf6 8. Nbd2 Re8 9. O-O-O Rxe2 10. Bxe2 Nc6 11. Nb3 Nb4 12. Nc5 Nxa2+ 13. Kb1 Nb4 14. Nxb7 Bxb7 15. d4 a5 16. Ne5 Qd6 17. f4 Ba6 18. Bxa6 Rxa6 19. Nxf7 Kxf7 20. Rhe1 Nc6 21. Re2 Nxd4 22. Rxd4 Bxd4 23. Rd2 Be3 24. Rxd5 Qxd5 25. f5 Qxf5 26. b4 axb4 27. Kb2 Bd4+ 28. c3 Qf2+ 29. Kb1 Qe1+ 30. Kc2 Ra2+ 31. Kd3 Qe3+ 32. Kc4 Qxc3+ 33. Kd5 Ra5+ 34. Ke4 Qe3# 0-1

Note: Feel free to zoom in on the table below. I reduced the font size so that words wouldn’t move to the next line. The table should zoom nicely; it did in my testing.

Excellent typically means a second-best move; good means a non-bad move. Great is equivalent to an exclamation point, while brilliant is equivalent to a double exclamation point (yes, I know it’s strange that great and brilliant are both better than best in this context).

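| Rating | Maximum Analysis (White) | Maximum Analysis (Black) | Deep Analysis (White) | Deep Analysis (Black) | Medium Analysis (White) | Medium Analysis (Black) | Fast Analysis (White) | Fast Analysis (Black) |
|---|---|---|---|---|---|---|---|---|
| Accuracy | 73.3 | 85.7 | 77.9 | 87.9 | 79.5 | 88.3 | 83.3 | 92.1 |
| Brilliant | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Great | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 |
| Best | 12 | 15 | 10 | 16 | 11 | 15 | 14 | 15 |
| Excellent | 3 | 8 | 7 | 8 | 4 | 10 | 4 | 13 |
| Good | 8 | 7 | 5 | 6 | 10 | 4 | 8 | 2 |
| Book | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 |
| Inaccuracy | 4 | 0 | 6 | 0 | 3 | 1 | 2 | 0 |
| Mistake | 2 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |
| Miss | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Blunder | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Game rating | 900 | 1550 | 1300 | 1600 | 1300 | 1650 | 1400 | 1650 |
| Opening | inaccurate | great | inaccurate | great | inaccurate | great | inaccurate | great |
| Middlegame | good | best | best | great | best | great | great | great |
| Endgame | | | | | | | | |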
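As a quick sanity check on the table, here is a minimal Python sketch that tallies the classified moves in each column and counts how many moves each level flags as an error (inaccuracy, mistake, miss, or blunder). The numbers are copied from the table; the names `COLUMNS`, `COUNTS`, `classified_totals`, and `flagged_errors` are mine, made up for illustration, and have nothing to do with how chess.com computes anything.

```python
# Move-classification counts copied from the review table.
# Column order: Maximum, Deep, Medium, Fast, each as (White, Black).
COLUMNS = [
    "Maximum White", "Maximum Black",
    "Deep White", "Deep Black",
    "Medium White", "Medium Black",
    "Fast White", "Fast Black",
]

COUNTS = {
    "Brilliant":  [0, 0, 0, 0, 0, 0, 0, 0],
    "Great":      [0, 1, 0, 1, 0, 1, 0, 1],
    "Best":       [12, 15, 10, 16, 11, 15, 14, 15],
    "Excellent":  [3, 8, 7, 8, 4, 10, 4, 13],
    "Good":       [8, 7, 5, 6, 10, 4, 8, 2],
    "Book":       [3, 3, 3, 3, 3, 3, 3, 3],
    "Inaccuracy": [4, 0, 6, 0, 3, 1, 2, 0],
    "Mistake":    [2, 0, 1, 0, 1, 0, 1, 0],
    "Miss":       [0, 0, 0, 0, 0, 0, 0, 0],
    "Blunder":    [0, 0, 0, 0, 0, 0, 0, 0],
}


def classified_totals():
    """Total number of classified moves in each column."""
    return {
        col: sum(row[i] for row in COUNTS.values())
        for i, col in enumerate(COLUMNS)
    }


def flagged_errors():
    """Moves labeled Inaccuracy, Mistake, Miss, or Blunder in each column."""
    flagged = ("Inaccuracy", "Mistake", "Miss", "Blunder")
    return {
        col: sum(COUNTS[name][i] for name in flagged)
        for i, col in enumerate(COLUMNS)
    }


if __name__ == "__main__":
    totals = classified_totals()
    errors = flagged_errors()
    for col in COLUMNS:
        print(f"{col}: {totals[col]} classified, {errors[col]} flagged as errors")
```

Every level classifies the same number of moves per side, so the levels disagree only about which label each move gets. White's flagged errors go from 6 on maximum to 3 on fast. Why White's total comes out two lower than Black's is not something the table explains.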

My analysis:

Chess.com rightly gives white’s 7th move a question mark, for it hands me a double threat: when I retake on f6, I’m now threatening both the pawn on b2 and the rook on a1. White needs to address this threat on move 8 with something like pawn to c3, or even rook to b1 (although probably not the latter, since white appears to be castling queenside). I get why white didn’t want to play c3, as it would destroy his attempts to castle queenside. But the rook was under a pin, so something had to be done. Chessbase believes that was the losing move.

Here’s where chess.com messes up. White played 8. Nbd2. Chess.com marked that as the best move, bringing out the knight and increasing control over the center. Yet, at this point, Stockfish already has me ahead by 4.93. How can Nbd2 be good when it immediately allows the move I did make, 8 … Re8? Now the queen is pinned to the king, and I’ll get the queen for a rook immediately. Stockfish already knows I’m comfortably ahead. Chess.com marks 9. O-O-O as a “best” move, noting, “They defended their pawn, which was under attack.” Yeah, the queen is already lost and the pawn is threatened. White can’t defend both. How about calling that move just “good”?

Chess.com correctly notes that I could have won more than a pawn on move 12, calling Nxa2+ good, when I could have gone Qe7 and threatened both the knight and the bishop, likely winning one of the two on the next move at the expense of having my queen harassed for a couple of moves. Nxa2+ does allow me the free pawn, although I do have to move back the knight right away for fear of losing it.

14. Nxb7 isn’t just an inaccuracy, it’s a mistake! Taking a pawn and losing a knight is a clear blunder here. White gains nothing in space or tempo. Black just has to convert the advantage, which I did.

What I really want to point out here is how much the “accuracy scores” and the “game rating” change with the amount of time chess.com spends on the analysis. Black’s game rating stays within a decent band (1550-1650). But white’s is all over the place (900-1400). Does white really come that close to black?
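To put numbers on that, here is a small Python sketch that works out how far each side's accuracy and game rating swing across the four analysis levels, and how big the White-to-Black gap is at each level. The figures are copied from the table above; the names `ACCURACY`, `GAME_RATING`, `spread`, and `gap_by_level` are my own, chosen for illustration.

```python
# Figures copied from the review table, ordered Maximum, Deep, Medium, Fast.
LEVELS = ["Maximum", "Deep", "Medium", "Fast"]

ACCURACY = {
    "White": [73.3, 77.9, 79.5, 83.3],
    "Black": [85.7, 87.9, 88.3, 92.1],
}

GAME_RATING = {
    "White": [900, 1300, 1300, 1400],
    "Black": [1550, 1600, 1650, 1650],
}


def spread(values):
    """Return (lowest, highest, highest - lowest) for a list of numbers."""
    low, high = min(values), max(values)
    return low, high, round(high - low, 1)


def gap_by_level(metric):
    """Black's value minus White's value at each analysis level."""
    return {
        level: round(black - white, 1)
        for level, white, black in zip(LEVELS, metric["White"], metric["Black"])
    }


if __name__ == "__main__":
    for side in ("White", "Black"):
        low, high, width = spread(ACCURACY[side])
        print(f"{side} accuracy: {low} to {high} (spread {width})")
        low, high, width = spread(GAME_RATING[side])
        print(f"{side} game rating: {low} to {high} (spread {width})")
    print("Rating gap by level:", gap_by_level(GAME_RATING))
    print("Accuracy gap by level:", gap_by_level(ACCURACY))
```

White's game rating moves by 500 points depending on the level, Black's by 100. The gap between the two players shrinks from 650 rating points on maximum to 250 on fast.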

I know it’s computer cycles, and we all want results quickly, but perhaps medium or deep analysis should be the default, given that the output didn’t take that much longer than the quick setting.

If I’m going to pay for diamond status on chess.com, don’t give me cubic zirconia results!