CS 312: Algorithm Analysis
Project #5: Solving the Traveling Salesperson Problem with Branch & Bound
Overview
In this project, you will implement a branch and bound algorithm to find solutions to the traveling salesperson problem (TSP).
Objectives
Background
The TSP problem consists of the following:
Given: a directed graph and a cost associated with each edge
Return: the lowest cost complete simple tour of the graph
A complete simple tour is a path through the graph that visits every vertex in the graph exactly once and ends at the starting point, also known as a Hamiltonian cycle or Rudrata cycle. Note that, as formulated here, the TSP is an optimization problem, insofar as we are searching for the simple tour with minimum cost.
We cover branch and bound, as well as B&B solutions to the TSP, in great detail in the approximation lectures in class and the accompanying slides. The appendix below reviews the reduced cost matrix bounding function, which we discussed in class and which you will use for this problem. The appendix also reviews the "include/exclude" approach to generating child (or "successor") states in the state space, which you will also use for this project.
Provided Framework
Use the same graphical user interface that is provided for the group TSP project.
To Do
1. Write a branch and bound algorithm (your TSP solver) to find the shortest complete simple tour through the City objects in the array Cities. You will use the reduced cost matrix for your lower bound function and "include/exclude" as your state space search approach. Implement your solver in the following method: ProblemAndSolver.solveProblem()
2. Your solver should include a timeout mechanism so that it will terminate and report the best solution so far after 30 seconds of execution time. Note that we aren't concerned that you use precisely 30 seconds. Running a timer and checking the time on every iteration through your branch and bound algorithm is sufficient, if slightly imprecise. You can use timers to interrupt your search if you want to be more precise about ending exactly at 30 seconds. If the BSSF is the initial BSSF (i.e., B&B has not yet reached its first potential solution), note this with your output of the BSSF.
3. To display your solution, assign bssf to a TSPSolution that contains the path you have discovered. Then call the Program.MainForm.Invalidate() method to refresh the display. You can be creative on your initial BSSF value and it can have a significant impact on early pruning.
4. Set Program.MainForm.tbCostOfTour.Text to the cost of the tour you have discovered. Set Program.MainForm.tbElapsedTime.Text to the time that it took you to discover the solution. Also report the number of actual solutions generated. This will often be 1, as the first solution found will be optimal. We are making a box for this and other data in the build, but if not done yet, you can find a place to output it.
5. For this project, the performance analysis will focus on both time and space. You will need a mechanism to report the total number of states generated, and also the number of states pruned due to your evolving BSSF.
6. There are three levels to set your city distributions: Easy (symmetric), Normal (asymmetric), and Hard (asymmetric with some infinite distances). You can play with all of them during testing, but use only the Hard level for all of your reporting below.
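The "check the time on every iteration" idea from item 2 can be sketched as follows. This is only an illustration in Python, not part of the provided framework: the function name run_with_time_limit and the step callback are hypothetical, and step stands in for one iteration of your own search loop.

```python
import time


def run_with_time_limit(step, time_limit=30.0):
    """Call step() repeatedly until it returns False or time_limit seconds pass.

    step is a hypothetical callback representing one iteration of the search;
    it returns True while there is more work to do.
    Returns (elapsed_seconds, timed_out).
    """
    start = time.monotonic()  # monotonic clock: unaffected by system clock changes
    timed_out = False
    while True:
        # Check the clock once per iteration; slightly imprecise but simple.
        if time.monotonic() - start >= time_limit:
            timed_out = True
            break
        if not step():
            break
    return time.monotonic() - start, timed_out
```

Because the clock is only read between iterations, the search may overshoot the limit by the duration of one iteration, which matches the tolerance described in item 2.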
Report
1. [20] Include your well-commented code.
2. [10] Describe the data structures you use to represent the states.
3. [5] Describe the priority queue data structure you used and how it works.
4. [5] Describe your approach for the initial BSSF
5. [45] Include a table containing the following columns.
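| # Cities | Seed | Running time (sec.) | Cost of best tour found (*=optimal) | Max. # of stored states at a given time | # of BSSF updates | Total # of states created | Total # of states pruned |
|----------|------|---------------------|-------------------------------------|-----------------------------------------|-------------------|---------------------------|--------------------------|
| 15       | 20   |                     |                                     |                                         | 3                 |                           |                          |
| 16       | 902  |                     |                                     |                                         | 4                 |                           |                          |
|          |      | 4.2                 | 156*                                | 600                                     | 1                 | 700                       | 40                       |
|          |      | 2.8                 | 58*                                 | 412                                     | 3                 | 600                       | 324                      |
|          |      | 30                  | 213                                 | 710                                     | 2                 | 800                       | 120                      |
|          |      | 30                  | 265                                 | 615                                     | 0                 | 800                       | 155                      |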
Note that the numbers in the above table are completely made up and may or may not have any correlation with reality.
Your table must include at least 10 rows of results for 10 different problems ranging between 10 and 50 cities. The first two rows should report your results on the specific cities/seeds shown above (15/20 and 16/902). Of these problems, at least 4 must run for the full 30 seconds (before timing out and returning the best solution found so far). The # of BSSF updates is the number of times a goal was found that was better than the current BSSF. A value of 0 means the final solution was just the initial BSSF. Pruned states include (a) those that are not put on the queue because their initial bound is greater than the current BSSF, and (b) any states that are put on the priority queue but whose bound, when they are taken off the queue, is greater than the updated BSSF, allowing the state to be pruned immediately without expansion. Count only the states actually pruned (not the many potential substates of those states, which are also implicitly pruned).
6. [15] Discuss the results in the table and why you think the numbers are what they are, including how time complexity and pruned states vary with problem size.
7. [Up to 5 extra credit points] Try some creative approach(es) to encourage some amount of early "drilling down" so that you can have an improved BSSF when the time limit is exceeded. This is especially interesting as n gets bigger and you start to time out before even the first B&B potential solution has been found. Explain your approach(es) and show what (if any) improvement was attained.
Further Exploration
Here are a couple of suggestions for further exploration. They are listed without prejudice against things that are or are not on this list.
• Think of a better way to visualize the step-by-step performance of the algorithm. Perhaps show how the reduced cost matrix evolves during the search process. The visualization should do something for each state pulled from the priority queue.
• Implement a better ad hoc solution to compute your initial BSSF.
• Implement a better (and more expensive?) feasibility test to prune the space earlier.
• Implement a different search strategy (try the strategy that you didn't try in the main project: include/exclude edge, all next edges, ...)
• Implement a randomized version of your algorithm.
Appendix: Bounding Function
Suppose we are given the following instance of the traveling salesperson problem for four cities in which the symbol "i" represents infinity.
i 5 4 3
3 i 8 2
5 3 i 9
6 4 3 i
One important element of a branch and bound solution to the problem is to define a bounding function. Our bounding function requires that we find a reduced cost matrix. The reduced cost matrix gives the additional cost of including an edge in the tour relative to a lower bound. The lower bound is computed by taking the sum of the cheapest way to leave each city plus any additional cost to enter each city. This bounding function is a lower bound because any tour must leave and enter each city exactly once, but choosing such edges may not define a solution.
First, let's reduce row 1. The smallest entry in row 1 is the cheapest way to leave city A. A row is reduced by taking the smallest entry in the row, 3 in this case, and subtracting it from every entry in the row. The smallest entry (3) is also added to the lower bound. After reducing row 1, we have a bound of 3 and the following matrix:
i 2 1 0
3 i 8 2
5 3 i 9
6 4 3 i
Next, we reduce row 2 by taking the smallest entry in row 2, 2 in this case, and subtracting 2 from each entry in row 2. We add 2 to the bound and obtain the following matrix:
i 2 1 0
1 i 6 0
5 3 i 9
6 4 3 i
The remaining two rows are reduced in similar fashion. Lowest value 3 is subtracted from row 3, and 3 is likewise subtracted from row 4. The final bound is 3 + 2 + 3 + 3 = 11, and the reduced matrix so far is:
i 2 1 0
1 i 6 0
2 0 i 6
3 1 0 i
Reducing the rows only accounts for the cheapest way to leave every city. Reducing the columns accounts for any additional cost of entering every city. Column reduction is similar to row reduction. A column is reduced by finding the smallest entry in a column of the reduced cost matrix, subtracting that entry from every entry in the column, and adding the entry to the bound.
The smallest entry in the first column is 1 so we subtract 1 from each entry in column 1 and add 1 to the bound. The new bound is 11 + 1 = 12 and the new matrix is:
i 2 1 0
0 i 6 0
1 0 i 6
2 1 0 i
The remaining columns are already reduced, since they already contain a 0.
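The row and column reduction walked through above can be sketched in a few lines. This is a minimal illustration in Python rather than the project framework's language; the name reduce_matrix and the use of math.inf for the "i" entries are assumptions made for the example.

```python
import math

INF = math.inf  # stands in for the "i" (infinity) entries in the text


def reduce_matrix(matrix):
    """Row-reduce, then column-reduce a copy of matrix.

    Returns (reduced_copy, total_amount_subtracted). Rows or columns that
    already contain a 0, or that are entirely infinite, are left alone.
    """
    m = [row[:] for row in matrix]
    cost = 0
    for row in m:  # cheapest way to leave each city
        low = min(row)
        if 0 < low < INF:
            cost += low
            row[:] = [x - low for x in row]
    for j in range(len(m)):  # any additional cost to enter each city
        low = min(row[j] for row in m)
        if 0 < low < INF:
            cost += low
            for row in m:
                row[j] -= low
    return m, cost


# The four-city instance from the appendix.
costs = [
    [INF, 5, 4, 3],
    [3, INF, 8, 2],
    [5, 3, INF, 9],
    [6, 4, 3, INF],
]
reduced, bound = reduce_matrix(costs)  # bound is 12, as derived above
```

Working on a copy keeps the parent's matrix intact, which matters later when several child states are generated from the same parent.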
Include/Exclude Approach to Generating Successor States
Another important element of a branch and bound solution is to define the manner in which children (or "successor") states are expanded from a given state in the state space search. In the "include/exclude" approach, we generate two children for every parent: the left child represents the inclusion of an edge in the tour and the right child represents the exclusion of that edge. The next step is to decide which edge to include or exclude. We'll assume that we want to
1. minimize the bound on the left (include) child
and
2. maximize the bound on the right (exclude) child.
Choosing an edge so as to maximize a bound on the right child can lead to more aggressive pruning; consequently, this can be a compelling approach to finding a solution. To fulfill both of these goals, choose the edge that maximizes the difference between them (i.e., maximize bound(exclude child) - bound(include child)).
We observe that it is advisable to avoid including edges that are nonzero in the reduced matrix. If a nonzero edge in the reduced matrix is included, then the extra cost of that edge (as contained in the reduced matrix) must be added to the bound on the left side. However, we are trying to minimize the bound on the left side.
Next, get the bounds of including or excluding an edge. In the reduced matrix above, there are 5 entries that contain 0. We'll compute a pair of bounds (one for include and one for exclude) for each 0-residual-cost edge and pick the one that has the maximum right child bound and the minimum left child bound.
Start with the 0 at entry (2,1). If the edge from city 2 to 1 is included in the solution, then the rest of row 2 and column 1 can be deleted since we will leave city 2 once and enter city 1 once. We get this matrix:
i 2 1 0
i i i i
i 0 i 6
i 1 0 i
This matrix must be reduced. The cost incurred during the reduction is added to the bound on the left child. In this case, no rows or columns need to be reduced. So the bound on the left child is 12 (which is the bound on the parent).
Now for the right child: If the edge from city 2 to city 1 is excluded, then we just replace entry (2,1) in the matrix with an infinity. We now have:
i 2 1 0
i i 6 0
1 0 i 6
2 1 0 i
This matrix must be reduced and the bound increased by the amount reduced. Only column 1 must be reduced, and it is reduced by subtracting 1 from each entry. The bound on the right child is then 12 + 1 = 13.
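The two child computations for edge (2,1) can be sketched as below. This is an illustrative Python sketch, not the framework's code: the names include_child and exclude_child are hypothetical, indices are 0-based (so edge (2,1) in the text is (1, 0) here), and, like the walkthrough above, it does not yet forbid edges that would close a cycle early.

```python
import math

INF = math.inf


def reduce_matrix(matrix):
    """Row-reduce, then column-reduce a copy; return (copy, amount subtracted)."""
    m = [row[:] for row in matrix]
    cost = 0
    for row in m:
        low = min(row)
        if 0 < low < INF:
            cost += low
            row[:] = [x - low for x in row]
    for j in range(len(m)):
        low = min(row[j] for row in m)
        if 0 < low < INF:
            cost += low
            for row in m:
                row[j] -= low
    return m, cost


def include_child(matrix, bound, i, j):
    """Left child: edge i -> j is in the tour, so delete row i and column j."""
    m = [row[:] for row in matrix]
    for k in range(len(m)):
        m[i][k] = INF
        m[k][j] = INF
    m, extra = reduce_matrix(m)
    return m, bound + extra


def exclude_child(matrix, bound, i, j):
    """Right child: edge i -> j is forbidden, so only that entry becomes infinite."""
    m = [row[:] for row in matrix]
    m[i][j] = INF
    m, extra = reduce_matrix(m)
    return m, bound + extra


# Reduced parent matrix from the appendix, with bound 12.
parent = [
    [INF, 2, 1, 0],
    [0, INF, 6, 0],
    [1, 0, INF, 6],
    [2, 1, 0, INF],
]
left_matrix, left_bound = include_child(parent, 12, 1, 0)     # bound stays 12
right_matrix, right_bound = exclude_child(parent, 12, 1, 0)   # bound becomes 13
```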
Stepping back for a minute, we have now determined that the bound on including edge 2,1 is 12 and the bound on excluding edge 2,1 is 13. Can we do better using a different edge? We'll answer that question by examining all of the other 0's in the matrix.
The easy way to examine the 0's is the following. To include an edge at row i column j, look at all of the 0s in row i. If column x of row i contains a 0, look at all of the entries in column x. If the 0 in row i of column x is the only 0 in column x, then replacing row i with infinities will force a reduction in column x. So add the smallest entry in column x to the bound on including the edge at row i and column j. Perform a similar analysis for the zeros in column j. This is the bound on including an edge.
To examine the bound on excluding an edge at row i and column j, add to the parent's bound the smallest remaining entry in row i and the smallest remaining entry in column j (ignoring the entry at row i, column j itself, which becomes infinity).
A complete examination of the 0 entries in the matrix reveals that the 0s at entries (3,2) and (4,3) give the greatest right bound, 12 + 2 = 14, with the least left bound, 12. You should verify this on your own.
So we'll split on either (3,2) or (4,3); it doesn't matter which.
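One way to check this choice is to compute both bounds for every 0 entry by brute force and keep the edge with the largest difference. The sketch below does that in Python instead of using the shortcut described above; the names reduction_cost, split_bounds, and best_split_edge are hypothetical, and indices are 0-based, so (3,2) in the text appears as (2, 1).

```python
import math

INF = math.inf


def reduction_cost(matrix):
    """Amount subtracted when a copy of matrix is row- and column-reduced."""
    m = [row[:] for row in matrix]
    cost = 0
    for row in m:
        low = min(row)
        if 0 < low < INF:
            cost += low
            row[:] = [x - low for x in row]
    for j in range(len(m)):
        low = min(row[j] for row in m)
        if 0 < low < INF:
            cost += low
            for row in m:
                row[j] -= low
    return cost


def split_bounds(matrix, bound, i, j):
    """Return (include_bound, exclude_bound) for splitting on edge i -> j."""
    n = len(matrix)
    inc = [[INF if (r == i or c == j) else matrix[r][c] for c in range(n)]
           for r in range(n)]
    exc = [row[:] for row in matrix]
    exc[i][j] = INF
    return bound + reduction_cost(inc), bound + reduction_cost(exc)


def best_split_edge(matrix, bound):
    """Among the 0 entries, pick the edge maximizing exclude bound - include bound.

    Returns (difference, (i, j), include_bound, exclude_bound); ties keep the
    first edge found in row-major order.
    """
    best = None
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value != 0:
                continue
            inc, exc = split_bounds(matrix, bound, i, j)
            if best is None or exc - inc > best[0]:
                best = (exc - inc, (i, j), inc, exc)
    return best


parent = [
    [INF, 2, 1, 0],
    [0, INF, 6, 0],
    [1, 0, INF, 6],
    [2, 1, 0, INF],
]
diff, edge, left_bound, right_bound = best_split_edge(parent, 12)
# edge is (2, 1), i.e. (3,2) in the text's 1-based numbering
```

The brute-force version reduces two full matrices per candidate edge, so it costs more than the shortcut; it is shown here only because it is easy to verify against the numbers in the text.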
After deciding which edge to split on, the next step is to do the split. Doing the split generates two new reduced matrices and bounds. These are then inserted into the priority queue.
Following the branch-and-bound algorithm from the lectures, the next step is to dequeue the most promising node and repeat the process. This continues until a solution is found. You know you've found a solution when you've included enough edges to form a tour. When a solution is found, check to see if that solution improves the previous best solution (so far). If so, the new solution is the best solution so far. If the new solution is now the best solution so far, then the priority queue may be trimmed to avoid keeping unpromising states around. We iterate until the queue is empty or until time is exhausted.
Another important aspect of the algorithm is preventing early cycles and keeping track of the best solution so far. They are related. As you add edges to a partial solution (state), you'll need to keep track of which edges are part of the solution. Since a simple tour of the graph can't visit the same city twice, you'll need to delete edges from the state's residual cost matrix that might result in a city being visited twice. To do this, you'll need to know which cities have been included. The cities included in a partial solution will need to be stored (along with the matrix and bound) in each state in the priority queue.
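The dequeue, expand, and prune loop described above can be sketched generically. This Python skeleton is an illustration under several assumptions: expand and is_solution are hypothetical callbacks you would supply, a complete tour's bound is assumed to equal its cost, and states are pruned when their bound is greater than or equal to the BSSF (a state that can only tie the BSSF cannot improve it). Rather than trimming the queue when the BSSF improves, it discards stale states lazily when they are dequeued.

```python
import heapq
import itertools
import math
import time


def branch_and_bound(root_bound, root_state, expand, is_solution,
                     bssf_cost=math.inf, time_limit=30.0):
    """Best-first branch and bound over (bound, state) pairs.

    expand(state, bound) yields (child_bound, child_state) pairs and
    is_solution(state) reports whether a state is a complete tour; both are
    hypothetical callbacks. Returns (bssf_cost, bssf_state, stats).
    """
    stats = {"created": 1, "pruned": 0, "bssf_updates": 0, "max_stored": 1}
    bssf_state = None
    tie = itertools.count()  # tie-breaker so states themselves are never compared
    queue = [(root_bound, next(tie), root_state)]
    start = time.monotonic()
    while queue and time.monotonic() - start < time_limit:
        bound, _, state = heapq.heappop(queue)  # most promising = lowest bound
        if bound >= bssf_cost:
            stats["pruned"] += 1  # queued earlier, but the BSSF has improved since
            continue
        for child_bound, child in expand(state, bound):
            stats["created"] += 1
            if child_bound >= bssf_cost:
                stats["pruned"] += 1  # never placed on the queue
            elif is_solution(child):
                bssf_cost, bssf_state = child_bound, child
                stats["bssf_updates"] += 1
            else:
                heapq.heappush(queue, (child_bound, next(tie), child))
        stats["max_stored"] = max(stats["max_stored"], len(queue))
    return bssf_cost, bssf_state, stats
```

If bssf_state comes back as None, no solution better than the initial BSSF was found before the queue emptied or time ran out.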
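One possible way to delete the edges that would close a cycle too early is sketched below. It is an illustration only: the function name block_premature_cycle and the successor dictionary (mapping each city to the city that follows it in the partial solution) are assumptions, indices are 0-based, and the caller is assumed never to include an edge that has already been forbidden.

```python
import math

INF = math.inf


def block_premature_cycle(matrix, successor, i, j):
    """Record edge i -> j and forbid the edge that would close its path early.

    successor maps each city to the next city in the partial solution.
    After adding i -> j, the path containing that edge runs from `start`
    to `end`; unless the path already covers every city, the edge
    end -> start is set to infinity in matrix. Returns (start, end).
    """
    n = len(matrix)
    successor[i] = j
    predecessor = {dst: src for src, dst in successor.items()}
    start, end, edges = i, j, 1
    while start in predecessor:  # walk backward to the beginning of the path
        start = predecessor[start]
        edges += 1
    while end in successor:      # walk forward to the end of the path
        end = successor[end]
        edges += 1
    if edges < n - 1:            # closing now would skip at least one city
        matrix[end][start] = INF
    return start, end


# Example on a four-city matrix with all off-diagonal entries equal to 0.
m = [[INF if r == c else 0 for c in range(4)] for r in range(4)]
path = {}
block_premature_cycle(m, path, 1, 0)  # path 1 -> 0, so edge 0 -> 1 is forbidden
```

Storing the successor map in each state, alongside the matrix and bound, is one way to satisfy the bookkeeping described above; other representations (for example, a list of included edges) would work as well.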
Revised: August 30, 2010