A lot of it is heuristic based. Basically you use a heuristic that can estimate the quality of a future state, without working it all the way to completion.
For instance, in chess we might decide the count of our pieces Vs the opponents pieces.
So you can look a few steps ahead in each direction, and get a numeric quality of that branch. Then you abandon the branches where you’ve lost a bunch of pieces Vs the opponent, and focus on the more promising one.
Obviously you can come up with much better heuristics than that, e.g. weighting the importance of the pieces or maybe applying some logic about where you’d like your pieces to be on the board.
Alpha Go is an AI. It doesn’t work on decision trees in the same way as most of the other software you might use. It’s been trained through repetition and practice and has a brain that’s closer to a living creature’s brain than a computer algorithm. As such it’s more likely to see patterns and react to them rather than brute force processing of what might happen if a specific move were taken.
Humans don’t have decision trees (mostly…). We learn what works, what doesn’t, and make choices that tended to kick the ass of computers doing it the hard way. Alpha Go takes the human approach instead, building a brain and teaching it how to play the game by making it actually play the game.
The downside is the need for training. Alpha Go studied Go for who know how many years (equivalent to many human lifetimes), playing against variants of itself until the devs were satisfied with the wiring of its brain and estimating it to be good enough to beat a human grandmaster level player. And… *it did*.
Having watched both Alpha Go vs Lee Se-dol, and Alpha Star (Starcraft 2 AI) vs some humans, I have to say, both seem to fall back on some pretty absurd behaviours when losing. I can’t help but wonder if Alpha Go is actually incapable of resigning and its (official, by resignation) loss against Lee was actually a human at Google deciding to put an end to its misery. Decision tree programs usually have a good idea when they’re beaten and resign gracefully whereas Alpha Go throws a small tantrum or something… It’s not a decision tree, it has something sorta like a mind.
The main component is a heuristic to judge how good a position is, and how reliable is the judgement. This had been done by every chess engine, not just alphago. Alphago just one particular heuristic that work pretty well.
Once you have the heuristic, the tree can be trimmed:
– Eliminate branches that are much worse than other branches.
– Search randomly among seemingly-equally good branches, instead of trying out everything.
So the difficult task is the heuristic itself. There are a few versions of the heuristic, but one of them is a evaluation function. An evaluation function is a function that assign values to a position, and they’re often hard-coded (explicitly described by the people designing the engine). What alphago do is make a neural network to compute the evaluation functions. The general structure of the evaluation function is still hard-coded, but it depends on a large number of parameters that are not predetermined. Instead, there is an algorithm that re-adjust these parameters as alphago play more games, so that these parameters result in better and better evaluation function.
To be more specific, alphago’s neural network is very much the same as an image processing neural network. It detects patterns on the board as if it’s an image of black-and-white-and-empty pixels, starting with small patterns and use that to form larger patterns. The training allows it to detect more useful patterns that are relevant to evaluating whether the position is good or bad.
By exploring much less of the tree, because they focus on the better ideas with beefier evaluation. Neural Nets such as AlphaZero or Leela Chess Zero explore far fewer nodes/second, a few thousand nodes/sec on my *machine*(full CPU and GPU) versus about a million nodes/core/sec for SF. Basically, they spend more CPU cycles determining how good positions are, and explore far fewer positions. This is also, IMO, the correct human approach to take. We can’t explore the breadth of chess, but we don’t need to, we can just focus on the correct ideas.
Latest Answers