2

I have implemented an A* algorithm to find the shortest path between two points in a grid world. For large path lengths the algorithm takes a very long time. I was first wondering if my implementation is correct, and if any optimization could take place?

The arguments for the aStar algorithm, are the current position of you and the position you desire to travel to as (x,y) tuples.

The Node.value of a node is a direction to travel (NSEW), getAdjacentNodes() returns a list of nodes directly adjacent to this one that we can travel to.

#Perform an A* search to find the best path to the dirt
  def aStar(self, current, end):
    openSet = set()   #Set of explorable nodes
    openHeap = []     #All paths heap, lowest cost on top
    closedSet = set() #Best path so far
    curNode = Node(0, current, self.manHatDist(current, end))
    openSet.add(curNode)
    openHeap.append((curNode.cost,curNode))
    while openSet:
      curNode = heapq.heappop(openHeap)[1]
      if curNode.pos == end:
          return self.getDirections(curNode)
      openSet.remove(curNode)
      closedSet.add(curNode)
      for tile in self.getAdjacentNodes(curNode.pos):
         if tile not in closedSet:
             tile.parent = curNode
             tile.cost = self.manHatDist(curNode.pos, end) + self.euclidDist(curNode.pos, current) + curNode.cost
             if tile not in openSet:
                 openSet.add(tile)
                 heapq.heappush(openHeap, (tile.cost,tile))
    return []

  #Get the moves made to get to this endNode
  def getDirections(self, endNode):
    moves = []
    tmpNode = endNode
    while tmpNode.parent is not None:
      moves.append(tmpNode.value)
      tmpNode = tmpNode.parent
    moves.reverse()
    return moves

Node class

# Node class for A* search
class Node:
  def __init__(self, value, pos, cost):
    self.pos = pos
    self.cost = cost
    self.value = value
    self.parent = None

  def __lt__(a, b):
    if(a.cost < b.cost):
      return 1
    return 0

  def __gt__(a, b):
    if(a.cost > b.cost):
      return 1
    return 0

EDIT - Here is the getAdjacentNodes method

  #Return all possible moves from given tile as Node objects
  def getAdjacentNodes(self, curPos):
    allMoves = ['North','South','East','West']
    posMoves = []
    for direction in allMoves:
      if(self.canMove(direction, curPos)):
        posMoves.append(Node(direction, self.getLocIfMove(curPos, direction), 0))
    return posMoves

EDIT2 - Profiling result

Profile Result Pastebin Link

16
  • The code doesn't look too bad to me, I did a similar one looking for a path in a maze image about the same way. You can gain a little in your for tile in ... loop by assigning a local variable in the loop to tile so python doesn't have to look it up every time you use it. ie t = tile and use t throughout the rest of the loop, not tile. Have you tried profiling and seeing where it is hanging up most? cyrille.rossant.net/profiling-and-optimizing-python-code Commented Oct 1, 2013 at 21:22
  • Something like for t=tile in adjacentNodes ? Commented Oct 1, 2013 at 21:24
  • for tile in tiles: t = tile and use t thoughout, not tile Commented Oct 1, 2013 at 21:25
  • a better profiling link too, pymotw.com/2/profile Commented Oct 1, 2013 at 21:35
  • +1 to a line profiler. // I have several questions. Why did you choose A* (A* is hardly the best algorithm out there)? Is Numpy an option? Why do your __lt__/__gt__ classes return 1/0 and not return a.cost > b.cost? Is this Python 3? Does avoiding classes give you any speed boost (remember that tuples have a sorting order)? Commented Oct 1, 2013 at 21:42

1 Answer 1

3

This looks wrong to me:

for tile in self.getAdjacentNodes(curNode.pos):
    if tile not in closedSet:
        tile.parent = curNode
        tile.cost = self.manHatDist(curNode.pos, end) + self.euclidDist(curNode.pos, current) + curNode.cost
        if tile not in openSet:
            openSet.add(tile)
            heapq.heappush(openHeap, (tile.cost,tile))

First problem. The computation of the cost of the new tile is:

self.manHatDist(curNode.pos, end) + self.euclidDist(curNode.pos, current) + curNode.cost

but it ought to be:

curNode.cost
- self.manHatDist(curNode.pos, end)
+ self.euclidDist(curNode.pos, tile.pos)
+ self.manHatDist(tile.pos, end)

(You could avoid the subtraction in the computation of the cost of the new tile if you were cleverer about the way you represent the search nodes, but I will leave that to you.)

Second problem. Having discovered that tile is not in closedSet, you immediately assume that the best way to get to tile is via curNode. But isn't it possible that tile is already in openSet? If so, there might be another route to tile that's better than the one via curNode.* So this code ought to read:

for tile in self.getAdjacentNodes(curNode.pos):
    if tile not in closedSet:
        cost = (curNode.cost
                - self.manHatDist(curNode.pos, end)
                + self.euclidDist(curNode.pos, tile.pos)
                + self.manHatDist(tile.pos, end))
        if tile not in openSet or cost < tile.cost:
            tile.parent = curNode
            tile.cost = cost
            openSet.add(tile)
            heapq.heappush(openHeap, (cost,tile))

I can't say if this will solve your performance problems. But it might give better results.

* There couldn't be a shorter route if self.euclidDist(curNode.pos, tile.pos) is always 1. But if that's the case, why bother with the euclidDist method?

Sign up to request clarification or add additional context in comments.

10 Comments

(I don't know much about A*) What's the justification for your first point? aka. What's the subtraction needed for?
@Veedrac: you probably want to read up on A* search. If you understand the admissibility heuristic but still don't understand my first point, then do come back and ask.
@GarethRees if you look at my edit I added the getAdjacentNodes method source. In it, I never set the cost of a node to anything. Doesn't think make the line or cost < tile.cost: erroneous?
@BumSkeeter: The or operator in Python is short-circuiting, so cost < tile.cost will only be evaluated if tile not in openSet is false, and in that case tile.cost was set when tile was added to openSet.
That would be one approach. Basically you must ensure that you always generate the same Node for the same position, so that you can avoid revisiting that position in the course of your search (and so end up going round in circles). I think we have found the cause of your program's poor performance.
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.