I would imagine that, depending on the set being played with, it might prove difficult to distinguish between the different pieces from a top-down view of the board.
Rather than relying on image recognition to determine which pieces are which, it would almost certainly be easier to simply track the pieces throughout the course of the game. You already know exactly where they started from, so after each turn it should be possible to deduce which square is now empty that wasn't previously empty, and which square is now occupied that wasn't previously occupied. This makes your image analysis much simpler as you're just determining whether each square on the board is empty or not.
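As a rough illustration of the tracking idea, here is a minimal Python sketch. It assumes your image analysis already gives you the set of occupied squares after each turn (as names like `"e4"`); the function names `starting_position` and `apply_occupancy_change` and the piece labels such as `"wP"` are made up for this example. It only handles a plain move to an empty square. Captures, castling, en passant and promotion change the occupancy pattern differently and would need extra handling (for example, also detecting the colour of the piece on each square).

```python
FILES = "abcdefgh"


def starting_position():
    """Map each occupied square (e.g. 'e2') to a piece label for a new game."""
    back_rank = "RNBQKBNR"  # pieces on files a through h
    board = {}
    for file, piece in zip(FILES, back_rank):
        board[f"{file}1"] = "w" + piece
        board[f"{file}2"] = "wP"
        board[f"{file}7"] = "bP"
        board[f"{file}8"] = "b" + piece
    return board


def apply_occupancy_change(board, occupied_after):
    """Deduce a simple move from the squares seen as occupied after a turn.

    `board` is updated in place. Returns (source, destination).
    Raises ValueError for anything other than one vacated square and one
    newly occupied square (captures, castling, etc. are not handled here).
    """
    occupied_before = set(board)
    vacated = occupied_before - occupied_after  # occupied before, empty now
    filled = occupied_after - occupied_before   # empty before, occupied now
    if len(vacated) != 1 or len(filled) != 1:
        raise ValueError(
            f"not a simple move: vacated={sorted(vacated)}, filled={sorted(filled)}"
        )
    source = next(iter(vacated))
    destination = next(iter(filled))
    board[destination] = board.pop(source)  # the piece keeps its identity
    return source, destination


# Example: the camera reports e2 empty and e4 occupied after White's turn.
board = starting_position()
seen = (set(board) - {"e2"}) | {"e4"}
move = apply_occupancy_change(board, seen)
```

After this runs, `move` is `("e2", "e4")` and `board["e4"]` holds the white pawn that started on e2, without the camera ever having to recognise what kind of piece it is.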
If there is a white pawn on e4 and a black one on e5, how would you know whether the position was established by 1. e2e4 e7e5 or by 1. e2e3 e7e6 2. e3e4 e6e5?