TSP With Branch and Bound

CS 312: Algorithm Analysis

Project #5: Solving the Traveling Salesperson Problem with Branch & Bound

Overview

In this project, you will implement a branch and bound algorithm to find solutions to the traveling salesperson problem (TSP).

Objectives

Implement a branch and bound algorithm for finding solutions to the TSP
Consider issues with trying to solve an NP-complete problem
Further develop your ability to conduct empirical analysis

Background

The TSP consists of the following:

Given: a directed graph and a cost associated with each edge

Return: the lowest cost complete simple tour of the graph

A complete simple tour is a path through the graph that visits every vertex in the graph exactly once and ends at the starting point, also known as a Hamiltonian cycle or Rudrata cycle. Note that as formulated here, the TSP is an optimization problem, insofar as we are searching for the simple tour with minimum cost.
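As a small illustration of this definition, the following Python sketch computes the cost of a candidate tour and rejects anything that is not a complete simple tour. The function name `tour_cost`, the 0-based city numbering, and the use of an adjacency matrix with `math.inf` for missing edges are assumptions made for this example; they are not part of the provided framework. The matrix is the four-city instance from the appendix.

```python
import math

INF = math.inf

def tour_cost(cost, tour):
    """Cost of visiting `tour` in order and returning to its first city.

    Returns INF if `tour` does not visit every city exactly once or if it
    uses an edge whose cost is infinite.
    """
    n = len(cost)
    if sorted(tour) != list(range(n)):
        return INF  # not a complete simple tour
    total = 0
    for k in range(n):
        src, dst = tour[k], tour[(k + 1) % n]  # wrap around to the start
        total += cost[src][dst]
    return total

# Four-city instance from the appendix (cities numbered 0..3 here).
example = [
    [INF, 5, 4, 3],
    [3, INF, 8, 2],
    [5, 3, INF, 9],
    [6, 4, 3, INF],
]

print(tour_cost(example, [0, 3, 2, 1]))
print(tour_cost(example, [0, 1, 2, 3]))
```

Because the graph is directed, the same cycle traversed in the opposite direction can have a different cost.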

We cover branch and bound, as well as B&B solutions to the TSP, in great detail in the lectures and the accompanying slides. The appendix below reviews the reduced cost matrix bounding function, which we discussed in class and which you will use for this project. The appendix also reviews the “include-exclude” approach to generating child (or “successor”) states in the state space, which you will also use for this project.

Provided Framework

Use the same graphical user interface that is provided for the group TSP project.

To Do

1. Write a branch and bound algorithm (your TSP solver) to find the shortest complete simple tour through the City objects in the array Cities. You will use the reduced cost matrix for your lower bound function and “include-exclude” as your state space search approach. Implement your solver in the following method: ProblemAndSolver.solveProblem()

2. Your solver should include a time-out mechanism so that it will terminate and report the best solution so far after 30 seconds of execution time. Note that we aren’t concerned that you use precisely 30 seconds. Running a timer and checking the time on every iteration through your branch and bound algorithm is sufficient if slightly imprecise. You can use timers to interrupt your search if you want to be more precise about ending exactly at 30 seconds. If the BSSF is the initial BSSF (i.e. B&B has not yet reached its first potential solution), note this with your output of the BSSF.

3. To display your solution, assign bssf to a TSPSolution that contains the path you have discovered. Then call the Program.MainForm.Invalidate() method to refresh the display. Be creative in choosing your initial BSSF value, since it can have a significant impact on early pruning.

4. Set Program.MainForm.tbCostOfTour.Text to the cost of the tour you have discovered. Set Program.MainForm.tbElapsedTime.Text to the time that it took you to discover the solution. Also report the number of actual solutions generated (not including your initial BSSF). This value will often be 0 or 1, as the first solution found will be optimal.

5. For this project, the performance analysis will focus on both time and space. You will need a mechanism to report the total number of child states generated (whether they are put on the queue or not), and also the number of states pruned due to your evolving BSSF. This includes all child states generated that never get expanded, either because they are not put on the queue, pruned when dequeued, or because they never get dequeued before termination. You will also report the maximum size of the queue which is the upper bound of memory used.
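The time-out in step 2 and the counters in step 5 can be sketched together as a generic best-first loop. This is a minimal Python sketch under stated assumptions, not the framework's API: `SearchStats`, `branch_and_bound`, `expand`, and `is_solution` are hypothetical names, states are opaque values paired with their bounds, and the bound of a complete tour is assumed to equal its cost. It prunes on `>=` rather than strictly greater than, which is also an assumption.

```python
import heapq
import time
from dataclasses import dataclass

@dataclass
class SearchStats:
    states_created: int = 0   # every child generated, queued or not
    states_pruned: int = 0    # never expanded (see step 5)
    max_queue_size: int = 0
    bssf_updates: int = 0
    timed_out: bool = False

def branch_and_bound(root, expand, is_solution, initial_bssf_cost,
                     time_limit=30.0):
    """Best-first search over (bound, state) pairs with a wall-clock limit.

    `expand(state)` returns a list of (bound, child) pairs and
    `is_solution(state)` reports whether a state is a complete tour.
    """
    stats = SearchStats()
    bssf_cost, bssf_state = initial_bssf_cost, None
    counter = 0  # tie-breaker so heapq never has to compare states
    queue = [(root[0], counter, root[1])]
    start = time.monotonic()
    while queue:
        # Checking the clock once per iteration is slightly imprecise
        # but sufficient for a 30-second limit.
        if time.monotonic() - start > time_limit:
            stats.timed_out = True
            break
        bound, _, state = heapq.heappop(queue)
        if bound >= bssf_cost:          # pruned when dequeued
            stats.states_pruned += 1
            continue
        for child_bound, child in expand(state):
            stats.states_created += 1
            if child_bound >= bssf_cost:    # pruned before being queued
                stats.states_pruned += 1
            elif is_solution(child):
                bssf_cost, bssf_state = child_bound, child
                stats.bssf_updates += 1
            else:
                counter += 1
                heapq.heappush(queue, (child_bound, counter, child))
                stats.max_queue_size = max(stats.max_queue_size, len(queue))
    stats.states_pruned += len(queue)   # states never dequeued
    return bssf_cost, bssf_state, stats
```

If `bssf_state` is still `None` on return, the search never improved on the initial BSSF, which is the case the handout asks you to flag in your output.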

There are three difficulty levels for your city distributions: Easy (symmetric), Normal (asymmetric), and Hard (asymmetric with some infinite distances). You can play with all of them during testing, but use only the Hard level for all of your reporting below. Note that Easy is Euclidean, Normal is metric, and Hard is non-metric. In the Euclidean case, an optimal tour cannot cross itself; in the non-metric (Hard) case, an optimal tour may cross itself.

Report

1. [10] On-time and correct “Whiteboard Experience” submission.

2. [20] Include your well commented code.

3. [10] Explain the time and space complexity of your algorithm by showing and summing up the complexity of each subsection of your code.

4. [10] Describe the data structures you use to represent the states.

5. [5] Describe the priority queue data structure you used and how it works.

6. [5] Describe your approach for the initial BSSF.

7. [30] Include a table containing the following columns.

# Cities	Seed	Running time (sec.)	Cost of best tour found (*=optimal)	Max. # of Stored states at a given time	# of BSSF updates	Total # of States Created	Total # of States Pruned
15	20
16	902
		4.2	156*	600	1	700	40
		2.8	58*	412	3	600	324
		30	213	710	2	800	120
		30	265	615	0	800	155

Note that the numbers in the above table are completely made up and may or may not have any correlation with reality.

Your table must include at least 10 rows of results for 10 different problems ranging between 10 and 50 cities. The first two rows should report your results on the specific cities/seeds shown above (15/20 and 16/902). Of the 10 problems, 4 must run for the full 30 seconds (before timing out and returning the best solution found so far). # of BSSF updates is the number of times a goal was found which was better than the current BSSF. A value of 0 means the final solution was just the initial BSSF. Pruned states include (a) those that are not put on the queue because their initial bound is greater than the current BSSF, and (b) any states that are put on the priority queue but either have a bound greater than the updated BSSF when taken off the queue (so they can be pruned immediately without expansion) or are never taken off the queue at all. Count only the states actually pruned (not the many potential sub-states of those states, which are also implicitly pruned).

8. [10] Discuss the results in the table and why you think the numbers are what they are, including how time complexity and pruned states vary with problem size.

[Up to 5 extra credit points] Try some creative approach(es) to encourage some amount of early “drilling down” so that you can have an improved BSSF before the time limit is exceeded. This is especially interesting as n gets bigger and you start to time out before even the first B&B potential solution has been found. Explain your approach(es) and discuss how your results were affected.

To aid in your debugging, here are some results we got that you can compare against.

problem size: 14; random seed: 1; cost of tour: 3697; avg time: ~.08 seconds

problem size: 14; random seed: 2; cost of tour: 3356; avg time: ~.065 seconds

problem size: 14; random seed: 3; cost of tour: 3866; avg time: ~.26 seconds

P.S. Last year some people experienced difficulty in getting the TSP project we've given you to work with newer versions of Visual Studio. The following has fixed that problem in the past if you have it:

- When you open the solution, click upgrade

- ignore the webpage that comes up

- Open the "PROJECT" drop-down menu and click "TSP properties"

- Then change the "Target Framework" from ".NET Framework 2.0" to ".NET Framework 3.5 Client Profile"

Appendix:

Bounding Function

Suppose we are given the following instance of the traveling salesperson problem for four cities in which the symbol "i" represents infinity.

i 5 4 3

3 i 8 2

5 3 i 9

6 4 3 i

One important element of a branch and bound solution to the problem is to define a bounding function. Our bounding function requires that we find a reduced cost matrix. The reduced cost matrix gives the additional cost of including an edge in the tour relative to a lower bound. The lower bound is computed by taking the sum of the cheapest way to leave each city plus any additional cost to enter each city. This bounding function is a lower bound because any tour must leave and enter each city exactly once, but choosing such edges may not define a solution.

First, let's reduce row 1. The smallest entry in row 1 is the cheapest way to leave city A. A row is reduced by taking the smallest entry in the row, 3 in this case, and subtracting it from every entry in the row (so the smallest entry itself becomes 0). The smallest entry (3) is also added to the lower bound. After reducing row 1, we have a bound of 3 and the following matrix:

i 2 1 0

3 i 8 2

5 3 i 9

6 4 3 i

Next, we reduce row 2 by taking the smallest entry in row 2, 2 in this case, and subtracting 2 from each entry in row 2. We add 2 to the bound and obtain the following matrix:

i 2 1 0

1 i 6 0

5 3 i 9

6 4 3 i

The remaining two rows are reduced in similar fashion. Lowest value 3 is subtracted from row 3, and 3 is likewise subtracted from row 4. The final bound is 3 + 2 + 3 + 3 = 11, and the reduced matrix so far is:

i 2 1 0

1 i 6 0

2 0 i 6

3 1 0 i

Reducing the rows only accounts for the cheapest way to leave every city. Reducing the columns accounts for the cheapest way to enter every city. Column reduction is similar to row reduction. A column is reduced by finding the smallest entry in a column of the reduced cost matrix, subtracting that entry from every entry in the column, and adding the entry to the bound.

The smallest entry in the first column is 1 so we subtract 1 from each entry in column 1 and add 1 to the bound. The new bound is 11 + 1 = 12 and the new matrix is:

i 2 1 0

0 i 6 0

1 0 i 6

2 1 0 i

The remaining columns are already reduced, since they already contain a 0.
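The row and column reduction just described can be sketched in a few lines of Python. The function name `reduce_matrix` and the list-of-lists representation with `math.inf` for the "i" entries are assumptions made for this illustration. Rows or columns whose smallest entry is already 0, or that are entirely infinite, are left alone.

```python
import math

INF = math.inf

def reduce_matrix(matrix):
    """Row-reduce, then column-reduce a copy of `matrix`.

    Returns (reduced_matrix, amount_added_to_bound).
    """
    m = [row[:] for row in matrix]
    n = len(m)
    added = 0
    for i in range(n):                      # cheapest way to leave city i
        low = min(m[i])
        if 0 < low < INF:
            added += low
            m[i] = [x - low for x in m[i]]  # INF - low stays INF
    for j in range(n):                      # extra cost to enter city j
        low = min(m[i][j] for i in range(n))
        if 0 < low < INF:
            added += low
            for i in range(n):
                m[i][j] -= low
    return m, added

# The four-city instance from this appendix.
example = [
    [INF, 5, 4, 3],
    [3, INF, 8, 2],
    [5, 3, INF, 9],
    [6, 4, 3, INF],
]

reduced, bound = reduce_matrix(example)
print(bound)
for row in reduced:
    print(row)
```

On the example this reproduces the walkthrough: the row reductions contribute 3 + 2 + 3 + 3 = 11, the first column contributes 1, and the bound is 12.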

Include-Exclude Approach to Generating Successor States

Another important element of a branch and bound solution is to define the manner in which children (or “successor”) states are expanded from a given state in the state space search. In the “include-exclude” approach, we generate two children for every parent: the left child represents the inclusion of an edge in the tour and the right child represents the exclusion of that edge. The next step is to decide which edge to include or exclude. We'll assume that we want to

1. minimize the bound on the left (include) child

and

2. maximize the bound on the right (exclude) child.

Choosing an edge so as to maximize the bound on the right child can lead to more aggressive pruning; consequently, this can be a compelling approach to finding a solution. To fulfill both of these goals, choose the edge that maximizes the difference between them (i.e., maximize bound(exclude child) - bound(include child)).

We observe that it is advisable to avoid including edges that are non-zero in the reduced matrix. If a non-zero edge in the reduced matrix is included, then the extra cost of that edge (as contained in the reduced matrix) must be added to the bound on the left side. However, we are trying to minimize the bound on the left side.

Next, get the bounds of including or excluding an edge. In the reduced matrix above, there are 5 entries that contain 0. We'll compute a pair of bounds (one for include and one for exclude) for each 0-residual-cost edge and pick the one that has the maximum right child bound and the minimum left child bound.

Start with the 0 at entry (2,1). If the edge from city 2 to 1 is included in the solution, then the rest of row 2 and column 1 can be deleted since we will leave city 2 once and enter city 1 once. We get this matrix:

i 2 1 0

i i i i

i 0 i 6

i 1 0 i

This matrix must be reduced. The cost incurred during the reduction is added to the bound on the left child. In this case, no rows or columns need to be reduced. So the bound on the left child is 12 (which is the bound on the parent).

Now for the right child: If the edge between 2 and 1 is excluded, then we just replace entry 2,1 in the matrix with an infinity. We now have:

i 2 1 0

i i 6 0

1 0 i 6

2 1 0 i

This matrix must be reduced and the bound increased by the amount reduced. Only column 1 must be reduced, and it is reduced by subtracting 1 from each entry. The bound on the right child is then 12 + 1 = 13.
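The two child computations for edge (2,1) can be checked with a short Python sketch. The names `include_child` and `exclude_child` are hypothetical, indices are 0-based (so edge (2,1) is `(1, 0)`), and, to match the matrices shown above, the sketch does not yet block edges that would close an early cycle.

```python
import math

INF = math.inf

def reduce_matrix(matrix):
    """Row-then-column reduction; returns (reduced_copy, amount_added)."""
    m = [row[:] for row in matrix]
    n = len(m)
    added = 0
    for i in range(n):
        low = min(m[i])
        if 0 < low < INF:
            added += low
            m[i] = [x - low for x in m[i]]
    for j in range(n):
        low = min(m[i][j] for i in range(n))
        if 0 < low < INF:
            added += low
            for i in range(n):
                m[i][j] -= low
    return m, added

def include_child(matrix, bound, i, j):
    """Commit to edge (i, j): row i and column j become infinite."""
    m = [row[:] for row in matrix]
    for k in range(len(m)):
        m[i][k] = INF
        m[k][j] = INF
    m, extra = reduce_matrix(m)
    return m, bound + extra

def exclude_child(matrix, bound, i, j):
    """Forbid edge (i, j): only that single entry becomes infinite."""
    m = [row[:] for row in matrix]
    m[i][j] = INF
    m, extra = reduce_matrix(m)
    return m, bound + extra

# Reduced matrix and bound from the bounding-function walkthrough.
reduced = [
    [INF, 2, 1, 0],
    [0, INF, 6, 0],
    [1, 0, INF, 6],
    [2, 1, 0, INF],
]
_, include_bound = include_child(reduced, 12, 1, 0)
_, exclude_bound = exclude_child(reduced, 12, 1, 0)
print(include_bound, exclude_bound)
```

Both functions copy the parent matrix, so each child state owns its own matrix. This is simple but costs O(n^2) space per state, which is worth keeping in mind for the space analysis in the report.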

Stepping back for a minute, we have now determined that the bound on including edge 2,1 is 12 and the bound on excluding edge 2,1 is 13. Can we do better using a different edge? We'll answer that question by examining all of the other 0's in the matrix.

The easy way to examine the 0's is the following. To include an edge at row i column j, look at all of the 0s in row i. If column x of row i contains a 0, look at all of the entries in column x. If the 0 in row i of column x is the only 0 in column x, then replacing row i with infinities will force a reduction in column x. So add the smallest entry in column x to the bound on including the edge at row i and column j. Perform a similar analysis for the zeros in column j. This is the bound on including an edge.

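To compute the bound on excluding the edge at row i and column j, add to the parent's bound the smallest remaining entry in row i and the smallest remaining entry in column j (ignoring entry (i, j) itself, which becomes infinity).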
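The exclude shortcut amounts to one row minimum plus one column minimum. A minimal Python sketch, assuming the matrix is already reduced and that entry (i, j) is 0; the name `exclude_penalty` and the 0-based indices are choices made for this example:

```python
import math

INF = math.inf

def exclude_penalty(m, i, j):
    """Amount added to the parent's bound if edge (i, j) is excluded.

    Smallest entry in row i other than column j, plus smallest entry in
    column j other than row i. A result of INF means the exclude child
    cannot lead to any tour.
    """
    n = len(m)
    row_min = min(m[i][c] for c in range(n) if c != j)
    col_min = min(m[r][j] for r in range(n) if r != i)
    return row_min + col_min

reduced = [
    [INF, 2, 1, 0],
    [0, INF, 6, 0],
    [1, 0, INF, 6],
    [2, 1, 0, INF],
]
# Edge (2,1) in the handout's 1-based numbering is (1, 0) here.
print(12 + exclude_penalty(reduced, 1, 0))
```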

A complete examination of the 0 entries in the matrix reveals that the 0s at entries (3,2) and (4,3) give the greatest right-bound, 12+2, with the least left-bound, 12. You should verify this on your own.

So we'll split on either (3,2) or (4,3); it doesn't matter which.
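To verify the choice of split edge, the search over all 0 entries can be sketched as follows. This Python sketch recomputes each child's reduction cost directly rather than using the shortcut, and the names `reduction_cost`, `include_cost`, `exclude_cost`, and `best_split_edge` are hypothetical. Indices are 0-based, ties go to the first edge found, and early-cycle blocking is again left out.

```python
import math

INF = math.inf

def reduction_cost(matrix):
    """Amount a row-then-column reduction of `matrix` adds to the bound."""
    m = [row[:] for row in matrix]
    total = 0
    for i, row in enumerate(m):
        low = min(row)
        if 0 < low < INF:
            total += low
            m[i] = [x - low for x in row]
    for j in range(len(m)):
        low = min(row[j] for row in m)
        if 0 < low < INF:
            total += low
    return total

def include_cost(m, i, j):
    """Extra bound from including edge (i, j): drop row i and column j."""
    n = len(m)
    child = [[INF if r == i or c == j else m[r][c] for c in range(n)]
             for r in range(n)]
    return reduction_cost(child)

def exclude_cost(m, i, j):
    """Extra bound from excluding edge (i, j): only that entry is removed."""
    child = [row[:] for row in m]
    child[i][j] = INF
    return reduction_cost(child)

def best_split_edge(m):
    """Return (i, j, include_cost, exclude_cost) for the 0 entry with the
    largest exclude_cost - include_cost, or None if there is no 0 entry."""
    best = None
    for i in range(len(m)):
        for j in range(len(m)):
            if m[i][j] == 0:
                inc, exc = include_cost(m, i, j), exclude_cost(m, i, j)
                if best is None or exc - inc > best[3] - best[2]:
                    best = (i, j, inc, exc)
    return best

reduced = [
    [INF, 2, 1, 0],
    [0, INF, 6, 0],
    [1, 0, INF, 6],
    [2, 1, 0, INF],
]
print(best_split_edge(reduced))
```

With the parent bound of 12, the returned costs of 0 and 2 correspond to the left bound of 12 and right bound of 12 + 2 stated above. The edge returned, `(2, 1)`, is the handout's (3,2).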

After deciding which edge to split on, the next step is to do the split. Doing the split generates two new reduced matrices and bounds. These are then inserted into the priority queue.

Following the branch-and-bound algorithm from the lectures, the next step is to dequeue the most promising node and repeat the process. This continues until a solution is found. You know you've found a solution when you've included enough edges to form a tour. When a solution is found, check whether it improves on the best solution so far (BSSF). If it does, it becomes the new BSSF, and the priority queue may be trimmed to avoid keeping unpromising states around. We iterate until the queue is empty or until time is exhausted.
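Trimming the queue after a BSSF improvement might look like the sketch below. It assumes the queue is a Python `heapq` list of `(bound, tie_breaker, state)` tuples; that layout and the name `trim_queue` are assumptions for this example. An alternative is to skip trimming entirely and discard stale states as they are dequeued, which is simpler but keeps more states in memory.

```python
import heapq

def trim_queue(queue, bssf_cost):
    """Remove queued entries whose bound can no longer beat `bssf_cost`.

    `queue` is modified in place and remains a valid heap.
    Returns the number of entries removed (states pruned).
    """
    kept = [entry for entry in queue if entry[0] < bssf_cost]
    removed = len(queue) - len(kept)
    queue[:] = kept
    heapq.heapify(queue)  # filtering does not preserve the heap invariant
    return removed

queue = [(12, 0, "a"), (14, 1, "b"), (13, 2, "c"), (20, 3, "d")]
heapq.heapify(queue)
print(trim_queue(queue, 14), queue[0])
```

Each trim is linear in the queue size, so it is only worthwhile because BSSF updates are rare compared with the number of states generated.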

Another important aspect of the algorithm is preventing early cycles and keeping track of the best solution so far. They are related. As you add edges to a partial solution (state), you'll need to keep track of which edges are part of the solution. Since a simple tour of the graph can't visit the same city twice, you'll need to delete edges from the state’s residual cost matrix that might result in a city being visited twice. To do this, you'll need to know which cities have been included. The cities included in a partial solution will need to be stored (along with the matrix and bound) in each state in the priority queue.
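One possible way to track included edges and block early cycles is sketched below in Python. It is an illustration under assumptions, not the required design: each state keeps two dictionaries, `next_city` and `prev_city` (hypothetical names), and when an edge is added it walks to both ends of the partial path containing that edge and sets the entry that would close the path into a short cycle to infinity.

```python
import math

INF = math.inf

def include_edge(matrix, next_city, prev_city, i, j):
    """Record edge i -> j in this state and block the edge that would
    close a cycle shorter than a full tour.

    `matrix`, `next_city`, and `prev_city` are modified in place.
    Returns the blocked (row, column) pair, or None if nothing was blocked.
    """
    n = len(matrix)
    next_city[i] = j
    prev_city[j] = i
    if len(next_city) >= n - 1:
        return None  # the one remaining edge is allowed to close the tour
    start = i
    while start in prev_city:   # walk back to the head of the partial path
        start = prev_city[start]
    end = j
    while end in next_city:     # walk forward to the tail of the partial path
        end = next_city[end]
    matrix[end][start] = INF    # tail -> head would form an early cycle
    return (end, start)

# Four cities, all edges initially allowed at zero residual cost.
m = [[0] * 4 for _ in range(4)]
nxt, prv = {}, {}
print(include_edge(m, nxt, prv, 1, 0))  # blocks the reverse edge 0 -> 1
print(include_edge(m, nxt, prv, 0, 3))  # path is 1 -> 0 -> 3, so block 3 -> 1
```

Once n edges have been included, following `next_city` from any city recovers the tour, which is what you would store in the BSSF.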

Revised: November 18, 2014