CS 470 - Theorem Prover Programming Assignment
Another version of the unification algorithm
You will program a theorem prover based on refutation resolution. Your theorem prover will need to read in a knowledge base in normal form from a file. To make this part easier, and so that you can spend less time on the mundane problem of parsing and storing the knowledge base, we are providing a piece of code which parses the file and stores the knowledge base in appropriate data structures. This tar file includes a readme file which should help to get you going.
Go ahead and look over the provided code, and if it suits you use it, otherwise you're free to code up your own parser.
An explanation of the basic syntax for encoding the knowledge base can be found here
In addition to this syntax, we will employ another convention in order to separate the rest of the knowledge base from the refuted part in the file (in case any of you want to use this information in your resolution strategies). The refuted part will always occur after the rest of the knowledge base, and will be separated from it by a blank line in the file.
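As a minimal sketch of the blank-line convention, the file text could be split into the two parts as shown below. This assumes one clause per nonblank line. The helper name `split_knowledge_base` is hypothetical, and the clause strings in the usage example are placeholders, not the assignment's actual syntax.

```python
def split_knowledge_base(text):
    """Split file text into (kb_lines, refuted_lines) at the separating blank line.

    Assumption: every nonblank line is one clause, kept here as an opaque string.
    """
    kb, refuted = [], []
    seen_separator = False
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            # A blank line that follows some KB content starts the refuted part.
            if kb:
                seen_separator = True
            continue
        (refuted if seen_separator else kb).append(stripped)
    return kb, refuted


# Placeholder clause strings, for illustration only.
example_text = "P(x) | Q(x)\n-Q(A)\n\n-P(A)\n"
kb_part, refuted_part = split_knowledge_base(example_text)
```

Keeping the refuted clauses in their own list makes it straightforward to treat them specially later, for instance as the starting set in a set-of-support strategy.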
You can download some files to test your prover with from
You will need to use a resolution strategy in order to minimize the actual number of resolutions executed for your proof. You should also try to maintain completeness.
You do not need to automate the transformation into normal form (you can do that by hand). You will be asked to solve common FOL proofs such as the following.
a) Every ambassador speaks only to diplomats, and some ambassador speaks to someone, therefore there is a diplomat.
b) Every computer science student works harder than somebody, and everyone who works harder than someone else gets less sleep than that person. Maria is a computer science student. Therefore Maria gets less sleep than someone else.
Note that one should not only be able to query whether the last sentence is true, but also give a query of the form "does there exist some person x who gets less sleep than someone else." Your program should print out every step of the proof, why it did it (what rules it used, variable substitutions), and so forth. Any valid proof sequence is sufficient (you don't need to find more than 1 valid proof sequence). We will assume that equality (x=y) will not be used in sentences so that you will not need to implement a demodulator, and you will also not need to worry about skolem functions.
An example approach to doing the theorem prover can be found here.
You should only print out the steps in sequence that directly led to the contradiction. DO NOT print out all the dead ends and other blather. This is easy to do. When you reach a conclusion (contradiction), you know which two sentences (parents) were combined to lead to the contradiction. You also know the sentences (grandparents) which were combined to produce the parents of the conclusion, and so on, all the way back to the original sentences in the knowledge base. Anything that is not an ancestor of the conclusion did not lead to the conclusion and should not be printed out.
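The ancestor trace described above can be sketched as follows. This is one possible bookkeeping scheme, not a required one. It assumes every clause has an integer id assigned in increasing order as clauses are derived, and that a hypothetical `parents` dictionary maps each derived clause id to the pair of ids it was resolved from. Original knowledge base clauses have no entry.

```python
def proof_steps(parents, final_id):
    """Return the ids of derived clauses that are ancestors of final_id.

    parents: dict mapping a derived clause id to (parent_id_1, parent_id_2).
    The result includes final_id itself and is sorted, which matches
    derivation order under the assumption that ids increase over time.
    """
    needed = set()
    stack = [final_id]
    while stack:
        cid = stack.pop()
        # Skip clauses already collected and original KB clauses (no parents).
        if cid in needed or cid not in parents:
            continue
        needed.add(cid)
        stack.extend(parents[cid])
    return sorted(needed)


# Clauses 1-3 are original; 5 and 7 are dead ends; 6 is the contradiction.
example_parents = {4: (1, 2), 5: (3, 1), 6: (4, 3), 7: (5, 2)}
steps = proof_steps(example_parents, 6)
```

Printing only the clauses in `steps`, each with its two parents and the substitution used, gives the proof without the dead ends.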
Your assignment will be uploaded in two parts.
First you will program all necessary parts to discover potential resolutions. However, the selection of which two sentences to resolve at any given time will just be random, but with no repeat resolutions (i.e. do not resolve the same two sentences more than once). Thus, your program will need to store the knowledge base in normal form, find possible resolutions (supported by a unification routine (Section 10.2)), and execute one of the possible resolutions, until the proof is complete or there are no more possible resolutions. Your program should keep a dynamic and final count (output to the screen) of the number of resolutions executed. Your program should also output the actual proof - the sequence of resolutions and resolvents which lead to the proof. Upload this part via anonymous ftp to axon in the pub/470/upload/prover1 directory.
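For the unification routine, here is a minimal sketch of one common formulation, with an occurs check. The term representation is an assumption made for illustration and is not the one used by the provided parser: variables are strings starting with `?`, constants are other strings, and a predicate or function application is a tuple whose first element is its name.

```python
def is_var(term):
    """Assumed convention: variables are strings that start with '?'."""
    return isinstance(term, str) and term.startswith("?")


def walk(term, subst):
    """Follow variable bindings until reaching an unbound variable or a non-variable."""
    while is_var(term) and term in subst:
        term = subst[term]
    return term


def occurs(var, term, subst):
    """True if var appears inside term under the current bindings."""
    term = walk(term, subst)
    if term == var:
        return True
    if isinstance(term, tuple):
        return any(occurs(var, arg, subst) for arg in term[1:])
    return False


def unify(a, b, subst=None):
    """Return a substitution (dict) that makes a and b equal, or None on failure."""
    subst = {} if subst is None else subst
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return subst
    if is_var(a):
        return None if occurs(a, b, subst) else {**subst, a: b}
    if is_var(b):
        return None if occurs(b, a, subst) else {**subst, b: a}
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and len(a) == len(b) and a[0] == b[0]):
        # Same name and arity: unify the arguments left to right.
        for x, y in zip(a[1:], b[1:]):
            subst = unify(x, y, subst)
            if subst is None:
                return None
        return subst
    return None


bindings = unify(("Speaks", "?x", "?y"), ("Speaks", "A", "B"))
```

Bindings may chain (one variable bound to another that is itself bound), so `walk` is needed when applying the result. Before unifying literals from two different clauses, their variables should be renamed apart so that the same variable name in both clauses is not treated as one variable.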
The second part of the assignment will be to add heuristics such that the program makes more efficient choices about which sentences to resolve in order to accomplish the proof. You are free to use any heuristics you choose, but you should be guided by the discussion in Section 9.6 on resolution strategies. You should try using some combination of these heuristic techniques (see OTTER in section 10.4 for an example). With your final version you should upload an ~2 page discussion about the heuristics you tried and your evaluation of them. Your final program should be able to work in both modes: a) random (but no repeats) resolutions, and b) resolution with a heuristic strategy. Upload this part via anonymous ftp to axon in the pub/470/upload/prover2 directory.
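The two modes can share one selection routine, sketched below. The heuristic shown is a simple unit-preference-style ordering (prefer pairs that involve the shortest clause), chosen only as an illustration. It is not the required strategy, and it ignores whether a pair can actually be resolved. Clauses are assumed to be lists of literals, and `choose_pair` and `untried_pairs` are hypothetical names.

```python
import random
from itertools import combinations


def untried_pairs(clauses, tried):
    """All index pairs (i, j) with i < j that have not been resolved yet."""
    return [p for p in combinations(range(len(clauses)), 2) if p not in tried]


def choose_pair(clauses, tried, mode="random", rng=random):
    """Pick the next pair to resolve and record it in tried; None when exhausted."""
    pairs = untried_pairs(clauses, tried)
    if not pairs:
        return None
    if mode == "random":
        pair = rng.choice(pairs)
    else:
        # Prefer the pair containing the shortest clause; break ties by
        # the total number of literals in the two clauses.
        pair = min(pairs, key=lambda p: (
            min(len(clauses[p[0]]), len(clauses[p[1]])),
            len(clauses[p[0]]) + len(clauses[p[1]]),
        ))
    tried.add(pair)  # guarantees no repeat resolutions in either mode
    return pair


# Literals are placeholder strings here.
example_clauses = [["P", "Q", "R"], ["-P"], ["-Q", "S"]]
tried_pairs = set()
first = choose_pair(example_clauses, tried_pairs, mode="heuristic")
```

Recomputing every untried pair on each call is simple but quadratic in the number of clauses. A priority queue updated as new resolvents are added would scale better.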
We will call your program with the file name of the knowledge base given on the command line. When your program receives the knowledge base file name on the command line, it should by default print out the knowledge base that it is currently storing. It should then run the random resolve routine, print out all steps in the proof, and report the total number of steps and the actual amount of time it took. Next it should run your super smart heuristic resolver, print out the proof steps it discovered, and report the number of steps and time required for it. Finally it should report the amount of improvement as ratios (for example, "My smart routine took 12 steps or 75% as many steps as random, and 45 seconds or 78% as long as random"). After this your program SHOULD EXIT. DO NOT WAIT FOR USER INPUT.
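The improvement ratios could be computed along these lines. This is a sketch only: the function names are hypothetical, and the wording simply follows the example sentence in the assignment. The step counts and times would come from your own counters and timer.

```python
def percent(part, whole):
    """part as a percentage of whole, or None when whole is zero."""
    return None if whole == 0 else 100.0 * part / whole


def improvement_report(smart_steps, random_steps, smart_secs, random_secs):
    """Format the comparison between the heuristic and random runs."""
    def fmt(value):
        return "n/a" if value is None else f"{value:.0f}%"

    step_pct = percent(smart_steps, random_steps)
    time_pct = percent(smart_secs, random_secs)
    return (f"My smart routine took {smart_steps} steps or {fmt(step_pct)} "
            f"as many steps as random, and {smart_secs:.0f} seconds or "
            f"{fmt(time_pct)} as long as random")


report = improvement_report(12, 16, 45.0, 57.7)
```

The zero guard matters for very small proofs, where the random run may finish in no measurable time.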
Grading: Your program will be tested on a couple of proofs (we won't give them to you beforehand) similar to those above. Part of the grade (~70%) will be based on whether your program correctly accomplishes the proof. The other portion (~30%) will be based on your efforts to improve efficiency through the heuristic version of your program. This will be based on a) your write-up, and b) how well your heuristic version does (number of resolutions and amount of time necessary) compared to your random version and to typical programs done by your classmates. Note that when proving simple problems the amount of time necessary for your random version may be comparable to (and sometimes better than) your heuristic version. Thus, the most important criterion for this part is to minimize the number of resolutions while not using unreasonable amounts of time.