How does this work?
Chess Grabber does image recognition much the same way that some applications do face recognition, or character recognition. It uses all kinds of algorithms from these fields. On this page, I'll explain what transformations the picture goes through to become a FEN record.
First of all, the image is converted to grayscale, to reduce the complexity of the following steps. In the end, the most important information in chess diagrams is in dark and light pieces and squares, and it doesn't matter if they are dark brown or dark grey.
Then, the picture's colors are reduced to solid black and solid white, using a method called thresholding: every pixel that is sufficiently dark is changed to black, and other pixels become white. What I try to accomplish here is that the black squares of the chessboard in the picture become black and the white ones white, regardless of what color they had.
However, I don't know what the threshold should be on beforehand. So I guess a bit about the colors of the picture: I calculate the average luminance of the pixels and use that as a threshold. In other words, I assume that the white squares of the chess diagram are lighter than the average color of the page and the black ones are darker. As a result, I get a picture in black and white.
Now I need to find the chessboard on the image. A chessboard is easily recognized by a human because of the alternating black and while fields. That is what the computer tries to detect, too; it does it by looking at the crossings.
For each point in the picture, I check if the above-left area and the below-right area are mainly black, and if the above-right area and below left area are mainly whiteor the other way round. If that is the case, I'm probably on a crossing between black and white fields.
I divide the crossings in two groups, depending on where the black areas are ('/' or '\'.) Note that I probably find a lot of 'false' crossings, e.g. where a black piece is on a white square; I will weed them out in the next step.
Constructing the board's inner grid
A chessboard has 49 such crossing, and they are all connected. One thing to not is that the length of those connections is more or less equal, because a chess board is square. I don't know what that length is, so I will assume the following: most crossings I find belong to the chessboard, so the most connections of equal distance I find will belong to adjacent squares on the board.
So I start examining the distances between the two types of crossings. I connect every /-crossing with every \-crossing, and assemble groups of lines of equal length (horizontally or vertically.) Now the largest group is the grid of the 49 inner crossings of the chessboard.
Cutting out the board
Cutting out the board is now easy: extend the grid by 1/6 in every direction, and we have the corners of the board. Of course, the board could be skewed, or in perspective; a slightly more intelligent transformation (a quadrilateral transformation) could be used, but that is currently not implemented.
Also, I assume I'm actually trying to recognize a 8 by 8 chessboard, and not a 10 by 10 draughts board, or something even more exotic. This simplifies matters a bit.
Recognize the pieces
Under the assumptions above, I now know exactly where the pieces are on the board. I cut the board in 64 squares, and each square contains a piece, or nothing.
So now I need to recognize what is on each square. There are several algorithms that could be used. I tried the following, ordered here more or less from easy to complex:
- Pixel compare
- k-nearest neighbor
- Adaptive k-nearest neighbor
- Naive Bayes classification of pixel-lines
- Naive Bayes classification of pixel-groups
- Back-propagation neural networks
I am not an expert. I couldn't get the neural network to converge on anything, even though the problem space seems small. The Bayesian classifiers looked promising, but in the end didn't do better then simple k-NN. Adaptive k-NN improved matters a bit.
I should probably use a lot more training data than the ~100 chessboards I recognized by hand. You can help by trying to recognize chess diagrams through Chess Grabber, and leave the right FEN record in the feedback when the recognition fails!
Build the FEN record
A complete FEN record consists of more data than just the position on the board. It also has the following fields:
- Whose turn is it? 'w' or 'b'.
- What castling is still possible? White king-side ('K') or queenside ('Q'), black king-side ('k') or queenside ('q')
- En-passant field.
- Half-move number
Some of these fields can be derived from the position of the pieces: if e.g. the white king is not on its starting field, white can no longer castle. The revers is not true: if the king is on e1, it could still have moved away and back.
Chess Grabber will fix the most obvious inconsistenties, but will not try to analyse the position too thoroughly. For a correct FEN record, the values for these fields need to be specified.
What improvements can be made?
Chess Grabber is far from perfect. The board is often not found correctly, and if it is, the piece recognition leaves a lot to be desired. Currently, I still have the following plans to improve it:
- Thresholding based on results
- Currently, when the thresholding in the first step doesn't distinguish between the black and white pieces, the recognition fails. I should try a different threshold if the board is not detected, until the result is good.
- Skew boards
- Support for boards that are not exactly square and straight should be reasonably easy to implement, e.g. for photo's of demonstration boards.
- Piece recognition
- The piece recognition is not very good. I should use more input data and look into algorithms again.
Are you an image recognition expert? I could use some help to improve Chess Grabber. Please contact me if you are interested in this project!Dion Nicolaas