A* Heuristics

from Red Blob Games
DRAFT
13 Dec 2018

{{ Look at http://theory.stanford.edu/~amitp/GameProgramming/Heuristics.html[1] }}

{{ *None of this is specific to grids but I'm using grid examples* :-( }}

The A* heuristic is an estimate on the remaining cost to the goal. A* uses it to decide which paths to evaluate first. When the heuristic is zero, A* has no information, and explores all directions equally. When the heuristic is exact, A* has complete information, and explores the best paths first. The higher the value, the fewer nodes A* will explore, but if it goes above the exact cost, A* may not find the shortest path. On game maps we usually choose distance because it's easy to implement and gives a reasonable estimate.

{{ need separate diagram for 0-1 on exact heuristic and 0-2 on distance heuristic }}


Heuristic multiplier: {{slider}}
Heuristic add: {{h_delta}}
path length = {{pathLength}} {{(stats.map_distance | 0)}} {{(stats.path_distance | 0)}}
Diagonal movement cost: {{D2}}
explored = {{stats.explored}} frontier size = {{stats.frontier}}
Effect of heuristic on A* exploration

{{ if I want to show a chart with the tradeoff between explored space and path length, I think I need to precompute the results, which means I need to make the start and end point fixed. or maybe I can do it in webworkers }}
{{ can I "ladder up" by making the boolean explored/not into a color scale showing which areas are explored at each heuristic multiplier? }}

If the exact heuristic is so great, why don't we use it? We don't use it because it's expensive. First, we have to calculate it using Floyd-Warshall or Johnson's Algorithm, and then we have to store it in an N ✕ N table, where N is the number of nodes. If N is small, you wouldn't need to use A*. Instead, you'd save the pathfinding table. You only need A* when N is large. A large map might be over 100,000 nodes, which would require an array of 10,000,000,000 integers (40 GB).

Most of us using A* on maps will use distance heuristics. They're cheap and work reasonably well. They use 0 storage.

I'm going to write cost(A,B) for the cost of the shortest path from A to B. In simple cases, the cost is the length of the path, but we can also use A* for variable cost movement. For example, in Civilization, moving on grassland/desert tiles costs 1 and moving on forest/hill tiles costs 2.

I'm going to write distance(A,B) for the map distance between A and B. The main difference between map distance and cost is that map distance ignores obstacles and slow movement tiles (like forests). On a map with grid movement, we can use a grid distance function. For 4-way square tile movement, use Manhattan distance; for 8-way tile movement, use Diagonal distance; for 6-way hex grids, use Hex distance.

{{ diagram to show the difference between cost and distance – show N, G, the shortest path, its cost, and the distance }}

The graph cost and map distance may be measured in different "units". To construct a heuristic, we need to convert the map distance into the same units as the graph cost. Examples:

Keeping the units in mind helps me find bugs.

Grids#

In a game that uses grid movement, we can specialize the distance function to match the type of movement.

{{ diagram that lets you try different movement types (4-way, 8-way, 8-way with sqrt(2)) and different heuristics (manhattan, chebyshev, diagonal, euclidean, euclidean squared, exact}}


path length = {{pathLength}} {{(stats.map_distance | 0)}} {{(stats.path_distance | 0)}}
HeuristicMovement type
{{ M }}
{{ H }}

{{ I should probably color code the first chart so that slow vs optimal vs unoptimal are different colors, and then I can fill this table with the same colors }}

{{ chart for intuition about g/h relationship, and why they need to match: show x-axis as time on resulting path, y-axis being h, g, f values. The f value should rise over time. If you set heuristic to 0, h is always 0, f rises with g. If you set heuristic to exact cost, then f never rises. }}

{{ chart for intuition about performance: x-axis is g value, y-axis is number of nodes explored. When heuristic is 0, this will be y=x^2 I think, with few obstacles. When heuristic is exact, this will be a constant line x=1 or maybe x=4 depending on how I count. Keeping this line low is the key to performance, so the closer we get to exact, the better we do }}

TODO

  1. Inadmissible heuristics, and whether you need to drop the reprioritize check
  2. Tie breaking

Tricks

Tricks I used for these diagrams: