The main component is a heuristic that judges how good a position is, and how reliable that judgement is. Every chess engine does this, not just AlphaGo; AlphaGo just has one particular heuristic that works pretty well.
Once you have the heuristic, the tree can be trimmed:
– Eliminate branches that are much worse than other branches.
– Search randomly among seemingly equally good branches, instead of trying out everything.
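A minimal sketch of those two pruning steps, assuming a toy `evaluate` heuristic and a made-up score margin (neither is from AlphaGo itself):

```python
import random

def evaluate(position):
    # Stand-in heuristic: assign a score to each candidate branch.
    # These values are invented for illustration.
    return {"a": 0.9, "b": 0.85, "c": 0.3}[position]

def pick_branch(branches, margin=0.2):
    scores = {b: evaluate(b) for b in branches}
    best = max(scores.values())
    # Step 1: eliminate branches much worse than the best one.
    survivors = [b for b, s in scores.items() if s >= best - margin]
    # Step 2: pick randomly among the seemingly-equally-good survivors.
    return random.choice(survivors)

print(pick_branch(["a", "b", "c"]))  # "c" is pruned; prints "a" or "b"
```

Here "much worse" is just a fixed margin below the best score; a real engine makes that cutoff depend on search depth and how reliable the heuristic is.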
So the difficult task is the heuristic itself. There are a few versions of the heuristic, but one of them is an evaluation function: a function that assigns a value to a position. These are often hard-coded (explicitly described by the people designing the engine). What AlphaGo does is use a neural network to compute the evaluation function. The general structure of the evaluation function is still hard-coded, but it depends on a large number of parameters that are not predetermined. Instead, an algorithm re-adjusts these parameters as AlphaGo plays more games, so that they produce a better and better evaluation function.
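The idea of a fixed structure with learned parameters can be sketched like this. The feature names and the update rule below are illustrative assumptions, not AlphaGo's real architecture:

```python
def features(position):
    # Hard-coded structure: WHICH features to look at is fixed...
    # (hypothetical feature names, for illustration only)
    return [position["material"], position["mobility"], position["king_safety"]]

def evaluate(position, weights):
    # ...but the weights attached to them are learned, not predetermined.
    return sum(w * f for w, f in zip(weights, features(position)))

def update(weights, position, outcome, lr=0.01):
    # Nudge each weight so the evaluation moves toward the actual
    # game result (outcome: +1 for a win, -1 for a loss).
    error = outcome - evaluate(position, weights)
    return [w + lr * error * f for w, f in zip(weights, features(position))]

weights = [0.0, 0.0, 0.0]
pos = {"material": 1.0, "mobility": 0.5, "king_safety": 0.2}
for _ in range(500):
    weights = update(weights, pos, outcome=1.0)
print(evaluate(pos, weights))  # approaches 1.0 as training proceeds
```

A real network stacks many such weighted layers with nonlinearities in between, but the training loop has the same shape: play, compare the evaluation to the outcome, and adjust the parameters.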
To be more specific, AlphaGo's neural network is very much like an image-processing neural network. It detects patterns on the board as if the board were an image of black, white, and empty pixels, starting with small patterns and using those to build up larger ones. Training lets it find the patterns that are most useful for judging whether a position is good or bad.