Publications (detailed list)

THIS PAGE IS NO LONGER MAINTAINED. Click here for our new publications list, which is more up-to-date.


This page contains the titles and abstracts of papers written by members of the BYU Neural Networks and Machine Learning (NNML) Research Group. Postscript files are available for most papers. A more concise list is available.


DMP3: A Dynamic Multi-Layer Perceptron Construction Algorithm

  • Authors: Tim L. Andersen and Tony R. Martinez
  • Reference: International Journal of Neural Systems, volume 2, pages 145–166, 2001.
  • BibTeX:
    @article{AndersenIJNS,
    author = {Andersen, Tim L. and Martinez, Tony R.},
    title = {{DMP3}: A Dynamic Multi-Layer Perceptron Construction Algorithm},
    journal = {International Journal of Neural Systems},
    volume = {2},
    pages = {145--166},
    year = {2001},
    }
  • Download the file: pdf

Optimal Artificial Neural Network Architecture Selection for Voting

  • Authors: Tim L. Andersen and Michael E. Rimer and Tony R. Martinez
  • Abstract: This paper studies the performance of standard architecture selection strategies, such as cost/performance and CV based strategies, for voting methods such as bagging. It is shown that standard architecture selection strategies are not optimal for voting methods and tend to underestimate the complexity of the optimal network architecture, since they only examine the performance of the network on an individual basis and do not consider the correlation between responses from multiple networks.
  • Reference: In Proceedings of the IEEE International Joint Conference on Neural Networks IJCNN’01, pages 790–795, 2001.
  • BibTeX:
    @inproceedings{andersen.ijcnn01.oas,
    author = {Andersen, Tim L. and Rimer, Michael E. and Martinez, Tony R.},
    title = {Optimal Artificial Neural Network Architecture Selection for Voting},
    booktitle = {Proceedings of the {IEEE} International Joint Conference on Neural Networks {IJCNN}'01},
    pages = {790--795},
    year = {2001},
    }
  • Download the file: pdf
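
A minimal sketch of the comparison described in the abstract above, assuming scikit-learn and a synthetic dataset rather than the paper's networks and data: it contrasts the cross-validated accuracy of single MLPs of several hidden-layer sizes with that of bagged ensembles of the same networks.

    # Hypothetical illustration, not the paper's experimental setup.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import BaggingClassifier
    from sklearn.model_selection import cross_val_score
    from sklearn.neural_network import MLPClassifier

    X, y = make_classification(n_samples=400, n_features=20, n_informative=8,
                               random_state=0)

    for hidden in (2, 8, 32):
        single = MLPClassifier(hidden_layer_sizes=(hidden,), max_iter=2000,
                               random_state=0)
        bagged = BaggingClassifier(single, n_estimators=10, random_state=0)
        print("hidden=%-3d  single CV=%.3f  bagged CV=%.3f" % (
            hidden,
            cross_val_score(single, X, y, cv=5).mean(),
            cross_val_score(bagged, X, y, cv=5).mean()))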

Optimal Artificial Neural Network Architecture Selection for Bagging

  • Authors: Tim L. Andersen and Tony R. Martinez
  • Abstract: This paper studies the performance of standard architecture selection strategies, such as cost/performance and CV based strategies, for voting methods such as bagging. It is shown that standard architecture selection strategies are not optimal for voting methods and tend to underestimate the complexity of the optimal network architecture, since they only examine the performance of the network on an individual basis and do not consider the correlation between responses from multiple networks.
  • Reference: In Proceedings of the IEEE International Joint Conference on Neural Networks IJCNN’01, pages 790–795, 2001.
  • BibTeX:
    @inproceedings{andersen_2001b,
    author = {Andersen, Tim L. and Martinez, Tony R.},
    title = {Optimal Artificial Neural Network Architecture Selection for Bagging},
    booktitle = {Proceedings of the {IEEE} International Joint Conference on Neural Networks {IJCNN}'01},
    pages = {790--795},
    year = {2001},
    }
  • Download the file: pdf

The Little Neuron that Could

  • Authors: Tim L. Andersen and Tony R. Martinez
  • Reference: In Proceedings of the IEEE International Joint Conference on Neural Networks IJCNN’99, CD paper #191, 1999.
  • BibTeX:
    @inproceedings{andersen.ijcnn1999.wag,
    author = {Andersen, Tim L. and Martinez, Tony R.},
    title = {The Little Neuron that Could},
    booktitle = {Proceedings of the {IEEE} International Joint Conference on Neural Networks {IJCNN}'99, {CD} paper #191},
    year = {1999},
    }
  • Download the file: pdf

Cross Validation and MLP Architecture Selection

  • Authors: Tim L. Andersen and Tony R. Martinez
  • Reference: In Proceedings of the IEEE International Joint Conference on Neural Networks IJCNN’99, CD paper #192, 1999.
  • BibTeX:
    @inproceedings{andersen.ijcnn99.cv,
    author = {Andersen, Tim L. and Martinez, Tony R.},
    title = {Cross Validation and {MLP} Architecture Selection},
    booktitle = {Proceedings of the {IEEE} International Joint Conference on Neural Networks {IJCNN}'99, {CD} paper #192},
    year = {1999},
    }
  • Download the file: ps, pdf

Constructing Higher Order Perceptrons with Genetic Algorithms

  • Authors: Tim L. Andersen and Tony R. Martinez
  • Reference: In Proceedings of the IEEE International Joint Conference on Neural Networks IJCNN’98, pages 1920–1925, 1998.
  • BibTeX:
    @inproceedings{andersen.ijcnn1998,
    author = {Andersen, Tim L. and Martinez, Tony R.},
    title = {Constructing Higher Order Perceptrons with Genetic Algorithms},
    booktitle = {Proceedings of the {IEEE} International Joint Conference on Neural Networks {IJCNN}'98},
    pages = {1920--1925},
    year = {1998},
    }
  • Download the file: pdf

Wagging: A learning approach which allows single layer perceptrons to outperform more complex learning algorithms

  • Authors: Tim L. Andersen and Tony R. Martinez
  • Reference: Submitted to IEEE Transactions on Neural Networks, 1997.
  • BibTeX:
    @inproceedings{andersen.ieee99.wag,
    author = {Andersen, Tim L. and Martinez, Tony R.},
    title = {Wagging: A learning approach which allows single layer perceptrons to outperform more complex learning algorithms},
    booktitle = {Submitted to {IEEE} Transactions on Neural Networks},
    year = {1997},
    }
  • Download the file: ps

Genetic Algorithms and Higher Order Perceptron Networks

  • Authors: Tim L. Andersen and Tony R. Martinez
  • Reference: In Proceedings of the International Workshop on Neural Networks and Neurocontrol, pages 217–223, 1997.
  • BibTeX:
    @inproceedings{andersen.sian97,
    author = {Andersen, Tim L. and Martinez, Tony R.},
    title = {Genetic Algorithms and Higher Order Perceptron Networks},
    booktitle = {Proceedings of the International Workshop on Neural Networks and Neurocontrol},
    pages = {217--223},
    year = {1997},
    }
  • Download the file: pdf

Using Multiple Node Types to Improve the Performance of DMP (Dynamic Multilayer Perceptron)

  • Authors: Tim L. Andersen and Tony R. Martinez
  • Abstract: This paper discusses a method for training multi-layer perceptron networks called DMP2 (Dynamic Multi-layer Perceptron 2). The method is based upon a divide and conquer approach which builds networks in the form of binary trees, dynamically allocating nodes and layers as needed. The focus of this paper is on the effects of using multiple node types within the DMP framework. Simulation results show that DMP2 performs favorably in comparison with other learning algorithms, and that using multiple node types can be beneficial to network performance.
  • Reference: In Proceedings of the IASTED International Conference on Artificial Intelligence, Expert Systems and Neural Networks, pages 249–252, 1996.
  • BibTeX:
    @inproceedings{andersen.iasted96.dmp2,
    author = {Andersen, Tim L. and Martinez, Tony R.},
    title = {Using Multiple Node Types to Improve the Performance of {DMP} (Dynamic Multilayer Perceptron)},
    booktitle = {Proceedings of the {IASTED} International Conference on Artificial Intelligence, Expert Systems and Neural Networks},
    pages = {249--252},
    year = {1996},
    }
  • Download the file: ps

The Effect of Decision Surface Fitness on Dynamic Multi-layer Perceptron Networks

  • Authors: Tim L. Andersen and Tony R. Martinez
  • Abstract: The DMP1 (Dynamic Multi-layer Perceptron 1) network training method is based upon a divide and conquer approach which builds networks in the form of binary trees, dynamically allocating nodes and layers as needed. This paper introduces the DMP1 method, and compares the performance of DMP1 when using the standard delta rule training method for training individual nodes against the performance of DMP1 when using a genetic algorithm for training. While the basic model does not require the use of a genetic algorithm for training individual nodes, the results show that the convergence properties of DMP1 are enhanced by the use of a genetic algorithm with an appropriate fitness function.
  • Reference: In Proceedings of the World Congress on Neural Networks, pages 177–181, 1996.
  • BibTeX:
    @inproceedings{andersen.wcnn96.dmp_ga,
    author = {Andersen, Tim L. and Martinez, Tony R.},
    title = {The Effect of Decision Surface Fitness on Dynamic Multi-layer Perceptron Networks},
    booktitle = {Proceedings of the World Congress on Neural Networks},
    pages = {177--181},
    year = {1996},
    }
  • Download the file: ps

Learning and Generalization with Bounded Order Rule Sets

  • Authors: Tim L. Andersen
  • Abstract: This thesis deals with the problem of inducing useful rules, or extracting critical, higher-order conjunctions of attributes, from a set of preclassified examples, where little or nothing is known about the underlying functional form of the distribution from which the examples were taken. The approach taken in this thesis differs from that normally used in that it does not limit the size of the rule search space. Rather, every possible conjunction of input attributes is considered by the learning algorithm as a potential rule component. In so doing, this thesis is attempting to determine (mainly from an empirical standpoint) how generalization performance is affected when certain areas of the search space are ignored, as compared to when the entire search space is considered. In dealing with the above question, this thesis studies several methods for inducing rules and using them for classification of novel examples. This thesis also uses results obtained with the C4.5 rule-induction method for comparison purposes, and to support the main points of the thesis. The results show that higher-order rules are not required to approximate many real world learning problems. In addition, the difficulty of generating optimal rule sets is discussed, where the measure of optimality is the complexity or size of the rule set and/or the degree of predictive accuracy, and the problem of NP-completeness is discussed in relation to these two optimality measures.
  • Reference: Master’s thesis, Brigham Young University, April 1995.
  • BibTeX:
    @mastersthesis{andersen_95a,
    author = {Andersen, Tim L.},
    title = {Learning and Generalization with Bounded Order Rule Sets},
    school = {Brigham Young University},
    month = {April},
    year = {1995},
    }
  • Download the file: ps

Learning and Generalization with Bounded Order Rule Sets

  • Authors: Tim L. Andersen and Tony R. Martinez
  • Abstract: All current rule-based methods found in the literature use some form of heuristic(s) in order to limit the size of the rule search space examined by the learning algorithm. This paper is an attempt to determine (mainly from an empirical standpoint) how generalization performance is affected when certain areas of the rule search space are ignored, as compared to when the entire search space is considered. This is done by exhaustively generating all rules for several small real-world problems and then determining how accuracy decreases as the size of the search space is iteratively reduced. The results show that higher-order rules are not required to approximate many real world learning problems. In dealing with the above question, several methods for inducing rules and using them for classification of novel examples are tested.
  • Reference: In Proceedings of the 10th International Symposium on Computer and Information Sciences, pages 419–426, 1995.
  • BibTeX:
    @inproceedings{andersen_95b,
    author = {Andersen, Tim L. and Martinez, Tony R.},
    title = {Learning and Generalization with Bounded Order Rule Sets},
    booktitle = {Proceedings of the 10th International Symposium on Computer and Information Sciences},
    pages = {419--426},
    year = {1995},
    }
  • Download the file: ps
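
A toy sketch of the bounded-order idea on made-up binary data (not the rule-induction methods evaluated in the paper): enumerate every conjunction of attribute-value literals up to a maximum order, keep the conjunctions whose matching training examples all share one class, and classify by voting among the rules an example satisfies.

    # Illustrative only; the data and the purity criterion here are invented.
    from itertools import combinations, product
    from collections import Counter

    train = [  # (binary attribute vector, class label)
        ((1, 0, 1), 'a'), ((1, 1, 1), 'a'), ((0, 0, 1), 'b'),
        ((0, 1, 0), 'b'), ((1, 0, 0), 'a'), ((0, 1, 1), 'b'),
    ]
    n_attrs = 3

    def rules_up_to_order(k):
        """Yield (rule, class); a rule is a tuple of (attribute index, value) literals."""
        for order in range(1, k + 1):
            for idxs in combinations(range(n_attrs), order):
                for vals in product((0, 1), repeat=order):
                    rule = tuple(zip(idxs, vals))
                    matched = [c for x, c in train if all(x[i] == v for i, v in rule)]
                    if matched and len(set(matched)) == 1:  # keep pure rules only
                        yield rule, matched[0]

    def classify(x, rule_set, default):
        votes = Counter(c for rule, c in rule_set if all(x[i] == v for i, v in rule))
        return votes.most_common(1)[0][0] if votes else default

    default = Counter(c for _, c in train).most_common(1)[0][0]
    for max_order in (1, 2, 3):  # grow the bounded search space
        rule_set = list(rules_up_to_order(max_order))
        preds = [classify(x, rule_set, default) for x, _ in train]
        acc = sum(p == c for p, (_, c) in zip(preds, train)) / len(train)
        print("order <= %d: %d rules, training accuracy %.2f" % (max_order, len(rule_set), acc))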

NP-Completeness of Minimum Rule Sets

  • Authors: Tim L. Andersen and Tony R. Martinez
  • Abstract: Rule induction systems seek to generate rule sets which are optimal in the complexity of the rule set. This paper develops a formal proof of the NP-Completeness of the problem of generating the simplest rule set (MIN RS) which accurately predicts examples in the training set for a particular type of generalization algorithm and complexity measure. The proof is then informally extended to cover a broader spectrum of complexity measures and learning algorithms.
  • Reference: In Proceedings of the 10th International Symposium on Computer and Information Sciences, pages 411–418, 1995.
  • BibTeX:
    @inproceedings{andersen_95c,
    author = {Andersen, Tim L. and Martinez, Tony R.},
    title = {{NP}-Completeness of Minimum Rule Sets},
    booktitle = {Proceedings of the 10th International Symposium on Computer and Information Sciences},
    pages = {411--418},
    year = {1995},
    }
  • Download the file: ps

A Provably Convergent Dynamic Training Method for Multilayer Perceptron Networks

  • Authors: Tim L. Andersen and Tony R. Martinez
  • Abstract: This paper presents a new method for training multi-layer perceptron networks called DMP1 (Dynamic Multi-layer Perceptron 1). The method is based upon a divide and conquer approach which builds networks in the form of binary trees, dynamically allocating nodes and layers as needed. The individual nodes of the network are trained using a genetic algorithm. The method is capable of handling real-valued inputs and a proof is given concerning the convergence properties of the basic model. Simulation results show that DMP1 performs favorably in comparison with other learning algorithms.
  • Reference: In Proceedings of the 2nd International Symposium on Neuroinformatics and Neurocomputers, pages 77–84, 1995.
  • BibTeX:
    @inproceedings{andersen_95d,
    author = {Andersen, Tim L. and Martinez, Tony R.},
    title = {A Provably Convergent Dynamic Training Method for Multilayer Perceptron Networks},
    booktitle = {Proceedings of the 2nd International Symposium on Neuroinformatics and Neurocomputers},
    pages = {77--84},
    year = {1995},
    }
  • Download the file: ps
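
A minimal sketch of the node-level training idea only, assuming NumPy and a toy separable dataset; the tree-building portion of DMP1 is not shown, and the genetic operators here are generic choices rather than the ones used in the paper.

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 2))
    y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)   # toy separable target
    Xb = np.hstack([X, np.ones((len(X), 1))])       # append a bias input

    def fitness(w):
        """Fraction of examples the threshold node classifies correctly."""
        return np.mean((Xb @ w > 0).astype(int) == y)

    pop = rng.normal(size=(30, 3))                  # population of weight vectors
    for generation in range(40):
        scores = np.array([fitness(w) for w in pop])
        parents = pop[np.argsort(scores)[-10:]]     # keep the fittest third
        children = []
        while len(children) < len(pop) - len(parents):
            a, b = parents[rng.integers(10, size=2)]
            mask = rng.random(3) < 0.5                                             # uniform crossover
            children.append(np.where(mask, a, b) + rng.normal(scale=0.1, size=3))  # mutation
        pop = np.vstack([parents, children])

    print("best node accuracy: %.2f" % max(fitness(w) for w in pop))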

Learning and Generalization with Bounded Order Critical Feature Sets

  • Authors: Tim L. Andersen and Tony R. Martinez
  • Abstract: It is the case that many real world learning problems exhibit a great deal of regularity. It is likely that the solutions to learning problems which exhibit such regularity can be approximated utilizing only simple (low-order) features gathered from analysis of pre-classified examples. However, little specific work has been done to demonstrate the utility of low-order features. This paper presents methods for gathering low order features from an existing data set of preclassified examples, and using these features for classification of novel patterns from the problem domain. It then conducts experiments using the methods presented on several real world classification problems and reports the results. The results show that pattern classification methods involving low-order feature sets have promise and warrant further research.
  • Reference: In Proceedings of the AI’93 Australian Joint Conference on Artificial Intelligence, page 450, 1993.
  • BibTeX:
    @inproceedings{andersen_93a,
    author = {Andersen, Tim L. and Martinez, Tony R.},
    title = {Learning and Generalization with Bounded Order Critical Feature Sets},
    booktitle = {Proceedings of the {AI}'93 Australian Joint Conference on Artificial Intelligence},
    pages = {450},
    year = {1993},
    }
  • Download the file: ps

Efficient Construction of Networks for Learned Representations with General to Specific Relationships

  • Authors: J. Cory Barker and Tony R. Martinez
  • Abstract: Machine learning systems often represent concepts or rules as sets of attribute-value pairs. Many learning algorithms generalize or specialize these concept representations by removing or adding pairs. Thus concepts are created that have general to specific relationships. This paper presents algorithms to connect concepts into a network based on their general to specific relationships. Since any concept can access related concepts quickly, the resulting structure allows increased efficiency in learning and reasoning. The time complexity of one set of learning models improves from O(n log n) to O(log n) (where n is the number of nodes) when using the general to specific structure.
  • Reference: Yfantis, Evangelos A., editor, Intelligent Systems, volume 1, pages 617–625, Kluwer Academic Publishers, 1995.
  • BibTeX:
    @article{barker_95a,
    author = {Barker, J. Cory and Martinez, Tony R.},
    title = {Efficient Construction of Networks for Learned Representations with General to Specific Relationships},
    editor = {Yfantis, Evangelos A.},
    journal = {Intelligent Systems},
    volume = {1},
    pages = {617--625},
    publisher = {Kluwer Academic Publishers},
    year = {1995},
    }
  • Download the file: ps

Eclectic Machine Learning

  • Authors: J. Cory Barker
  • Abstract: This dissertation presents a family of inductive learning systems that derive general rules from specific examples. These systems combine the benefits of neural networks, ASOCS, and symbolic learning algorithms. The systems presented here learn incrementally with good speed and generalization. They are based on a parallel architectural model that adapts to the problem being learned. Learning is done without requiring user adjustment of sensitive parameters, and noise is tolerated with graceful degradation in performance. The systems described in this work are based on features. Features are subsets of the input space. One group of learning algorithms begins with general features and specializes those features to match the problem that is being learned. Another group creates specific features and then generalizes those features. The final group combines the approaches used in the first two groups to gain the benefits of both. The algorithms are O(m log m), where m is the number of nodes in the network, and the number of inputs and output values are treated as constants. An enhanced network topology reduces time complexity to O(log m). Empirical results show that the algorithms give good generalization and that learning converges in a small number of training passes.
  • Reference: PhD thesis, Brigham Young University, February 1994.
  • BibTeX:
    @phdthesis{barker_diss,
    author = {Barker, J. Cory},
    title = {Eclectic Machine Learning},
    school = {Brigham Young University},
    month = {February},
    year = {1994},
    }
  • Download the file: ps

Proof of Correctness for ASOCS AA3 Networks

  • Authors: J. Cory Barker and Tony R. Martinez
  • Abstract: This paper analyzes Adaptive Algorithm 3 (AA3) of Adaptive Self-Organizing Concurrent Systems (ASOCS) and proves that AA3 correctly fulfills the rules presented. Several different models for ASOCS have been developed. AA3 uses a distributed mechanism for implementing rules so correctness is not obvious. An ASOCS is an adaptive network composed of many simple computing elements operating in parallel. An ASOCS operates in one of two modes: learning and processing. In learning mode, rules are presented to the ASOCS and incorporated in a self-organizing fashion. In processing mode, the ASOCS acts as a parallel hardware circuit that performs the function defined by the learned rules.
  • Reference: IEEE Transactions on Systems, Man, and Cybernetics, volume 3, pages 503–510, 1994.
  • BibTeX:
    @article{barker_94a,
    author = {Barker, J. Cory and Martinez, Tony R.},
    title = {Proof of Correctness for {ASOCS} {AA3} Networks},
    journal = {{IEEE} Transactions on Systems, Man, and Cybernetics},
    volume = {3},
    pages = {503--510},
    year = {1994},
    }
  • Download the file: ps

Generalization by Controlled Expansion of Examples

  • Authors: J. Cory Barker and Tony R. Martinez
  • Abstract: SG (Specific to General) is a learning system that derives general rules from specific examples. SG learns incrementally with good speed and generalization. The SG network is built of many simple nodes that adapt to the problem being learned. Learning is done without requiring user adjustment of sensitive parameters and noise is tolerated with graceful degradation in performance. Nodes learn important features in the input space and then monitor the ability of the features to predict output values. Learning is O(n log n) for each example, where n is the number of nodes in the network, and the number of inputs and output values are treated as constants. An enhanced network topology reduces time complexity to O(log n). Empirical results show that the model gives good generalization and that learning converges in a small number of training passes.
  • Reference: In Proceedings of The Seventh International Symposium on Artificial Intelligence, pages 142–149, 1994.
  • BibTeX:
    @inproceedings{barker_94b,
    author = {Barker, J. Cory and Martinez, Tony R.},
    title = {Generalization by Controlled Expansion of Examples},
    booktitle = {Proceedings of The Seventh International Symposium on Artificial Intelligence},
    pages = {142--149},
    year = {1994},
    }
  • Download the file: ps

GS: A Network that Learns Important Features

  • Authors: J. Cory Barker and Tony R. Martinez
  • Abstract: GS is a network for supervised inductive learning from examples that uses ideas from neural networks and symbolic inductive learning to gain benefits of both methods. The network is built of many simple nodes that learn important features in the input space and then monitor the ability of the features to predict output values. The network avoids the exponential nature of the number of features by using information gained by general features to guide the creation of more specific features. Empirical evaluation of the model on real world data has shown that the network provides good generalization performance. Convergence is accomplished within a small number of training passes. The network provides these benefits while automatically allocating and deleting nodes and without requiring user adjustment of any parameters. The network learns incrementally and operates in a parallel fashion.
  • Reference: In Proceedings of The World Congress on Neural Networks, volume 3, pages 376–380, July 1993.
  • BibTeX:
    @inproceedings{barker_93c,
    author = {Barker, J. Cory and Martinez, Tony R.},
    title = {{GS}: A Network that Learns Important Features},
    booktitle = {Proceedings of The World Congress on Neural Networks},
    volume = {3},
    pages = {376--380},
    month = {July},
    year = {1993},
    }
  • Download the file: ps

Generalization by Controlled Intersection of Examples

  • Authors: J. Cory Barker and Tony R. Martinez
  • Abstract: SG (Specific to General) is a network for supervised inductive learning from examples that uses ideas from neural networks and symbolic inductive learning to gain benefits of both methods. The network is built of many simple nodes that learn important features in the input space and then monitor the ability of the features to predict output values. The network avoids the exponential nature of the number of features by creating specific features for each example and then expanding those features; making them more general. Expansion of a feature terminates when it encounters another feature with contradicting outputs. Empirical evaluation of the model on real-world data has shown that the network provides good generalization performance. Convergence is accomplished within a small number of training passes. The network provides these benefits while automatically allocating and deleting nodes and without requiring user adjustment of any parameters. The network learns incrementally and operates in a parallel fashion.
  • Reference: In Proceedings of The Sixth Australian Joint Conference on Artificial Intelligence, pages 323–327, 1993.
  • BibTeX:
    @inproceedings{barker_93a,
    author = {Barker, J. Cory and Martinez, Tony R.},
    title = {Generalization by Controlled Intersection of Examples},
    booktitle = {Proceedings of The Sixth Australian Joint Conference on Artificial Intelligence},
    pages = {323--327},
    year = {1993},
    }
  • Download the file: ps

Learning and Generalization Controlled by Contradiction

  • Authors: J. Cory Barker and Tony R. Martinez
  • Abstract: One page overview of the SG (Specific to General) learning model.
  • Reference: In Proceedings of The International Conference on Artificial Neural Networks, 1993.
  • BibTeX:
    @inproceedings{barker_93b,
    author = {Barker, J. Cory and Martinez, Tony R.},
    title = {Learning and Generalization Controlled by Contradiction},
    booktitle = {Proceedings of The International Conference on Artificial Neural Networks},
    year = {1993},
    }
  • Download the file: ps

Extending ID3 through Discretization of Continuous Inputs

  • Authors: Rick Bertelsen and Tony R. Martinez
  • Abstract: This paper presents a mechanism to extend ID3 by classifying real valued inputs. Real valued inputs are classified through a neural network model termed the Competitive Classifier (CC). The CC forwards discrete classification results to the ID3 system, and accepts feedback from the ID3 system. Through the use of feedback, the ID3 system guides the CC into improving classifications.
  • Reference: In Proceedings of FLAIRS’94 Florida Artificial Intelligence Research Symposium, pages 122–125, 1994.
  • BibTeX:
    @inproceedings{bertelsen_94,
    author = {Bertelsen, Rick and Martinez, Tony R.},
    title = {Extending {ID3} through Discretization of Continuous Inputs},
    booktitle = {Proceedings of {FLAIRS}'94 Florida Artificial Intelligence Research Symposium},
    pages = {122--125},
    year = {1994},
    }
  • Download the file: ps
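
A loose illustration using scikit-learn under stated assumptions: an unsupervised discretizer stands in for the Competitive Classifier, the feedback loop from the decision-tree side described in the abstract is not reproduced, and an entropy-based tree plays the role of ID3.

    from sklearn.datasets import load_iris
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import KBinsDiscretizer
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_iris(return_X_y=True)
    model = make_pipeline(
        KBinsDiscretizer(n_bins=4, encode="ordinal", strategy="uniform"),  # discretize real-valued inputs
        DecisionTreeClassifier(criterion="entropy", random_state=0),       # ID3-like entropy splits
    )
    print("discretize-then-tree CV accuracy: %.3f" % cross_val_score(model, X, y, cv=5).mean())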

Automatic Feature Extraction in Machine Learning

  • Authors: Rick Bertelsen
  • Abstract: This thesis presents a machine learning model capable of extracting discrete classes out of continuous valued input features. This is done using a neurally inspired novel competitive classifier (CC) which feeds the discrete classifications forward to a supervised machine learning model. The supervised learning model uses the discrete classifications and perhaps other information available to solve a problem. The supervised learner then generates feedback to guide the CC into potentially more useful classifications of the continuous valued input features. Two supervised learning models are combined with the CC creating ASOCS-AFE and ID3-AFE. Both models are simulated and the results are analyzed. Based on these results, several areas of future research are proposed.
  • Reference: Master’s thesis, Brigham Young University, 1994.
  • BibTeX:
    @mastersthesis{bertelsen_th,
    author = {Bertelsen, Rick},
    title = {Automatic Feature Extraction in Machine Learning},
    school = {Brigham Young University},
    year = {1994},
    }
  • Download the file: ps

Automatic Composition of Themed Mood Pieces

  • Authors: Heather Chan and Dan Ventura
  • Abstract: Musical harmonization of a given melody is a nontrivial problem; slight variations in instrumentation, voicing, texture, and bass rhythm can lead to significant differences in the mood of the resulting piece. This study explores the possibility of automatic musical composition by using machine learning and statistical natural language processing to tailor a piece to a particular mood using an existing melody.
  • Reference: In Proceedings of the International Joint Workshop on Computational Creativity, pages 109–115, September 2008.
  • BibTeX:
    @inproceedings{chan.ijwcc08,
    author = {Chan, Heather and Ventura, Dan},
    title = {Automatic Composition of Themed Mood Pieces},
    booktitle = {Proceedings of the International Joint Workshop on Computational Creativity},
    pages = {109--115},
    month = {September},
    year = {2008},
    }
  • Download the file: pdf

Improved Hopfield Nets by Training with Noisy Data

  • Authors: Fred Clift and Tony R. Martinez
  • Reference: In Proceedings of the IEEE International Joint Conference on Neural Networks IJCNN’01, pages 1138–1143, 2001.
  • BibTeX:
    @inproceedings{Cliftijcnn2001,
    author = {Clift, Fred and Martinez, Tony R.},
    title = {Improved Hopfield Nets by Training with Noisy Data},
    booktitle = {Proceedings of the {IEEE} International Joint Conference on Neural Networks {IJCNN}'01},
    pages = {1138--1143},
    year = {2001},
    }
  • Download the file: pdf

Using Self-Organizing Maps to Implicitly Model Preference for a Musical Query-by-Content System

  • Authors: Kyle Dickerson and Dan Ventura
  • Abstract: The ever-increasing density of computer storage devices has allowed the average user to store enormous quantities of multimedia content, and a large amount of this content is usually music. Current search techniques for musical content rely on meta-data tags which describe artist, album, year, genre, etc. Query-by-content systems allow users to search based upon the acoustical content of the songs. Recent systems have mainly depended upon textual representations of the queries and targets in order to apply common string-matching algorithms. However, these methods lose much of the information content of the song and limit the ways in which a user may search. We have created a music recommendation system that uses Self-Organizing Maps to find similarities between songs while preserving more of the original acoustical content. We build on the design of the recommendation system to create a musical query-by-content system. We discuss the weaknesses of the naive solution and then implement a quasi-supervised design and discuss some preliminary results.
  • Reference: In Proceedings of the International Joint Conference on Neural Networks, pages 705–710, June 2009.
  • BibTeX:
    @inproceedings{dickerson.ijcnn09,
    author = {Dickerson, Kyle and Ventura, Dan},
    title = {Using Self-Organizing Maps to Implicitly Model Preference for a Musical Query-by-Content System},
    booktitle = {Proceedings of the International Joint Conference on Neural Networks},
    pages = {705--710},
    month = {June},
    year = {2009},
    }
  • Download the file: pdf
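
A toy sketch of the underlying retrieval idea, assuming NumPy and random vectors standing in for real acoustic features (it is not the system described above): train a small self-organizing map, then return the songs whose best-matching units lie closest on the map to the query's unit.

    import numpy as np

    rng = np.random.default_rng(0)
    songs = rng.normal(size=(200, 16))              # placeholder "acoustic" feature vectors
    grid_w, grid_h = 8, 8
    weights = rng.normal(size=(grid_w * grid_h, songs.shape[1]))
    coords = np.array([(i, j) for i in range(grid_w) for j in range(grid_h)], dtype=float)

    def bmu(x):
        """Index of the best-matching unit for a feature vector."""
        return np.argmin(((weights - x) ** 2).sum(axis=1))

    for epoch in range(20):                         # standard SOM training loop
        lr = 0.5 * (1 - epoch / 20)
        radius = 3.0 * (1 - epoch / 20) + 0.5
        for x in songs:
            d = np.linalg.norm(coords - coords[bmu(x)], axis=1)
            h = np.exp(-(d ** 2) / (2 * radius ** 2))   # neighborhood function
            weights += lr * h[:, None] * (x - weights)

    query = songs[0]
    song_units = np.array([bmu(x) for x in songs])
    map_dist = np.linalg.norm(coords[song_units] - coords[bmu(query)], axis=1)
    print("closest songs on the map:", np.argsort(map_dist)[1:6])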

Search Techniques for Fourier-Based Learning

  • Authors: Adam Drake and Dan Ventura
  • Abstract: Fourier-based learning algorithms rely on being able to efficiently find the large coefficients of a function’s spectral representation. In this paper we introduce and analyze techniques for finding large coefficients. We show how a previously introduced search technique can be generalized from the Boolean case to the real-valued case, and we apply it in branch-and-bound and beam search algorithms that have significant advantages over the best-first algorithm in which the technique was originally introduced.
  • Reference: In Proceedings of the International Joint Conference on Artificial Intelligence, pages 1040–1045, July 2009. (First appeared in Proceedings of the AAAI Workshop on Search in Artificial Intelligence and Robotics, 2008).
  • BibTeX:
    @inproceedings{drake2009a,
    author = {Drake, Adam and Ventura, Dan},
    title = {Search Techniques for {F}ourier-Based Learning},
    booktitle = {Proceedings of the International Joint Conference on Artificial Intelligence},
    pages = {1040--1045},
    month = {July},
    year = {2009},
    note = {(First appeared in Proceedings of the {AAAI} Workshop on Search in Artificial Intelligence and Robotics, 2008)},
    }
  • Download the file: pdf
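
A small self-contained example of the representation involved, not of the search techniques contributed by the paper: enumerate the Walsh/parity (Fourier) coefficients of a Boolean function over four variables and report the largest ones.

    from itertools import combinations, product
    import numpy as np

    n = 4
    f = lambda x: 1 if (x[0] ^ x[1]) or x[3] else -1   # arbitrary target in {-1, +1}
    X = np.array(list(product((0, 1), repeat=n)))
    y = np.array([f(x) for x in X])

    coeffs = {}
    for order in range(n + 1):
        for subset in combinations(range(n), order):
            # parity (character) function chi_S(x) = (-1)^(sum of the selected bits)
            chi = (-1) ** X[:, list(subset)].sum(axis=1) if subset else np.ones(len(X))
            coeffs[subset] = float(np.mean(y * chi))   # empirical Fourier coefficient

    for subset, c in sorted(coeffs.items(), key=lambda kv: -abs(kv[1]))[:5]:
        print("coefficient for parity over %s: %+.3f" % (subset, c))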

Sentiment Regression: Using Real-Valued Scores to Summarize Overall Document Sentiment

  • Authors: Adam Drake and Eric Ringger and Dan Ventura
  • Abstract: In this paper, we consider a sentiment regression problem: summarizing the overall sentiment of a review with a real-valued score. Empirical results on a set of labeled reviews show that real-valued sentiment modeling is feasible, as several algorithms improve upon baseline performance. We also analyze performance as the granularity of the classification problem moves from two-class (positive vs. negative) towards infinite-class (real-valued).
  • Reference: In Proceedings of the IEEE International Conference on Semantic Computing, pages 152–157, August 2008.
  • BibTeX:
    @inproceedings{drv.icsc2008,
    author = {Drake, Adam and Ringger, Eric and Ventura, Dan},
    title = {Sentiment Regression: Using Real-Valued Scores to Summarize Overall Document Sentiment},
    booktitle = {Proceedings of the {IEEE} International Conference on Semantic Computing},
    pages = {152--157},
    month = {August},
    year = {2008},
    }
  • Download the file: pdf
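
A back-of-the-envelope illustration with assumed tools (scikit-learn) and made-up reviews rather than the paper's data or models: bag-of-words features feeding a ridge regressor that maps each review to a real-valued sentiment score.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import Ridge
    from sklearn.pipeline import make_pipeline

    reviews = ["terrible plot and worse acting", "dull but watchable",
               "solid performances, a bit slow", "great film, loved every minute"]
    scores = [1.0, 4.5, 6.0, 9.5]   # invented ratings on a 0-10 scale

    model = make_pipeline(TfidfVectorizer(), Ridge(alpha=1.0))
    model.fit(reviews, scores)
    print(model.predict(["loved the acting but the plot was dull"]))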

Comparing High-Order Boolean Features

  • Authors: Adam Drake and Dan Ventura
  • Abstract: Many learning algorithms attempt, either explicitly or implicitly, to discover useful high-order features. When considering all possible functions that could be encountered, no particular type of high-order feature should be more useful than any other. However, this paper presents arguments and empirical results that suggest that for the learning problems typically encountered in practice, some high-order features may be more useful than others.
  • Reference: In Proceedings of the Joint Conference on Information Sciences, pages 428–431, July 2005.
  • BibTeX:
    @inproceedings{drake.ventura.jcis2005,
    author = {Drake, Adam and Ventura, Dan},
    title = {Comparing High-Order Boolean Features},
    booktitle = {Proceedings of the Joint Conference on Information Sciences},
    pages = {428--431},
    month = {July},
    year = {2005},
    }
  • Download the file: pdf

A Practical Generalization of Fourier-Based Learning

  • Authors: Adam Drake and Dan Ventura
  • Abstract: This paper presents a search algorithm for finding functions that are highly correlated with an arbitrary set of data. The functions found by the search can be used to approximate the unknown function that generated the data. A special case of this approach is a method for learning Fourier representations. Empirical results demonstrate that on typical real-world problems the most highly correlated functions can be found very quickly, while combinations of these functions provide good approximations of the unknown function.
  • Reference: In ICML ’05: Proceedings of the 22nd International Conference on Machine Learning, pages 185–192, New York, NY, USA, 2005. ACM Press.
  • BibTeX:
    @inproceedings{drake.ventura.icml2005,
    author = {Drake, Adam and Ventura, Dan},
    title = {A Practical Generalization of Fourier-Based Learning},
    booktitle = {ICML '05: Proceedings of the 22nd International Conference on Machine Learning},
    pages = {185--192},
    publisher = {ACM Press},
    address = {New York, NY, USA},
    year = {2005},
    }
  • Download the file: pdf

Predicting and Preventing Coordination Problems in Cooperative Learning Systems

  • Authors: Nancy Fulda and Dan Ventura
  • Abstract: We present a conceptual framework for creating Q-learning-based algorithms that converge to optimal equilibria in cooperative multiagent settings. This framework includes a set of conditions that are sufficient to guarantee optimal system performance. We demonstrate the efficacy of the framework by using it to analyze several well-known multi-agent learning algorithms and conclude by employing it as a design tool to construct a simple, novel multiagent learning algorithm.
  • Reference: In Proceedings of the International Joint Conference on Artificial Intelligence, pages 780–785, Hyderabad, India, January 2007.
  • BibTeX:
    @inproceedings{fulda.ijcai07,
    author = {Fulda, Nancy and Ventura, Dan},
    title = {Predicting and Preventing Coordination Problems in Cooperative Learning Systems},
    booktitle = {Proceedings of the International Joint Conference on Artificial Intelligence},
    pages = {780--785},
    address = {Hyderabad, India},
    month = {January},
    year = {2007},
    }
  • Download the file: pdf

Learning a Rendezvous Task with Dynamic Joint Action Perception

  • Authors: Nancy Fulda and Dan Ventura
  • Abstract: Groups of reinforcement learning agents interacting in a common environment often fail to learn optimal behaviors. Poor performance is particularly common in environments where agents must coordinate with each other to receive rewards and where failed coordination attempts are penalized. This paper studies the effectiveness of the Dynamic Joint Action Perception (DJAP) algorithm on a grid-world rendezvous task with this characteristic. The effects of learning rate, exploration strategy, and training time on algorithm effectiveness are discussed. An analysis of the types of tasks for which DJAP learning is appropriate is also presented.
  • Reference: In Proceedings of the International Joint Conference on Neural Networks, pages 627–632, Vancouver, BC, July 2006.
  • BibTeX:
    @inproceedings{fulda.ijcnn06,
    author = {Fulda, Nancy and Ventura, Dan},
    title = {Learning a Rendezvous Task with Dynamic Joint Action Perception},
    booktitle = {Proceedings of the International Joint Conference on Neural Networks},
    pages = {627--632},
    address = {Vancouver, BC},
    month = {July},
    year = {2006},
    }
  • Download the file: pdf

Incremental Policy Learning: An Equilibrium Selection Algorithm for Reinforcement Learning Agents with Common Interests

  • Authors: Nancy Fulda and Dan Ventura
  • Abstract: We present an equilibrium selection algorithm for reinforcement learning agents that incrementally adjusts the probability of executing each action based on the desirability of the outcome obtained in the last time step. The algorithm assumes that at least one coordination equilibrium exists and requires that the agents have a heuristic for determining whether or not the equilibrium was obtained. In deterministic environments with one or more strict coordination equilibria, the algorithm will learn to play an optimal equilibrium as long as the heuristic is accurate. Empirical data demonstrate that the algorithm is also effective in stochastic environments and is able to learn good joint policies when the heuristic’s parameters are estimated during learning, rather than known in advance.
  • Reference: In Proceedings of the International Joint Conference on Neural Networks, pages 1121–1126, July 2004.
  • BibTeX:
    @inproceedings{fulda.ijcnn04,
    author = {Fulda, Nancy and Ventura, Dan},
    title = {Incremental Policy Learning: An Equilibrium Selection Algorithm for Reinforcement Learning Agents with Common Interests},
    booktitle = {Proceedings of the International Joint Conference on Neural Networks},
    pages = {1121--1126},
    month = {July},
    year = {2004},
    }
  • Download the file: pdf
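
A rough sketch in the spirit of the abstract rather than a faithful reproduction of the algorithm, assuming NumPy: two independent agents nudge the probability of their last action up when a simple heuristic judges the joint outcome desirable, and down otherwise, in a 2x2 strict coordination game.

    import numpy as np

    rng = np.random.default_rng(1)
    payoff = np.array([[1.0, 0.0], [0.0, 1.0]])     # reward 1 only when actions match
    probs = [np.full(2, 0.5), np.full(2, 0.5)]      # each agent's action probabilities
    step, desired = 0.05, 1.0                       # heuristic: a reward of 1.0 is "good"

    for t in range(2000):
        a = [rng.choice(2, p=p) for p in probs]
        r = payoff[a[0], a[1]]
        for i, p in enumerate(probs):
            direction = 1.0 if r >= desired else -1.0
            p[a[i]] = np.clip(p[a[i]] + direction * step, 0.01, 0.99)
            p[1 - a[i]] = 1.0 - p[a[i]]

    print("final action probabilities:", [np.round(p, 2) for p in probs])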

Target Sets: A Tool for Understanding and Predicting the Behavior of Interacting Q-learners

  • Authors: Nancy Fulda and Dan Ventura
  • Abstract: Reinforcement learning agents that interact in a common environment frequently affect each others’ perceived transition and reward distributions. This can result in convergence of the agents to a sub-optimal equilibrium or even to a solution that is not an equilibrium at all. Several modifications to the Q-learning algorithm have been proposed which enable agents to converge to optimal equilibria under specified conditions. This paper presents the concept of target sets as an aid to understanding why these modifications have been successful and as a tool to assist in the development of new modifications which are applicable in a wider range of situations.
  • Reference: In Proceedings of the Joint Conference on Information Sciences, pages 1549–1552, September 2003.
  • BibTeX:
    @inproceedings{fulda.jcis03,
    author = {Fulda, Nancy and Ventura, Dan},
    title = {Target Sets: A Tool for Understanding and Predicting the Behavior of Interacting Q-learners},
    booktitle = {Proceedings of the Joint Conference on Information Sciences},
    pages = {1549--1552},
    month = {September},
    year = {2003},
    }
  • Download the file: pdf

Concurrently Learning Neural Nets: Encouraging Optimal Behavior in Reinforcement Learning Systems.

  • Authors: Nancy Fulda and Dan Ventura
  • Reference: In IEEE International Workshop on Soft Computing Techniques in Instrumentation, Measurement, and Related Applications (SCIMA), May 2003.
  • BibTeX:
    @incollection{fulda_2003a,
    author = {Fulda, Nancy and Ventura, Dan},
    title = {Concurrently Learning Neural Nets: Encouraging Optimal Behavior in Reinforcement Learning Systems.},
    booktitle = {{IEEE} International Workshop on Soft Computing Techniques in Instrumentation, Measurement, and Related Applications ({SCIMA})},
    month = {May},
    year = {2003},
    }
  • Download the file: pdf, ps

Dynamic Joint Action Perception for Q-Learning Agents.

  • Authors: Nancy Fulda and Dan Ventura
  • Reference: To appear in Proceedings of the 2003 International Conference on Machine Learning and Applications, Los Angeles, CA, 2003.
  • BibTeX:
    @inproceedings{fulda_2003b,
    author = {Fulda, Nancy and Ventura, Dan},
    title = {Dynamic Joint Action Perception for Q-Learning Agents.},
    booktitle = {To Appear in Proceedings of the 2003 International Conference on Machine Learning and Applications, Los Angeles, {CA}},
    year = {2003},
    }
  • Download the file: ps, pdf

Towards Automatic Shaping in Robot Navigation.

  • Authors: Todd S. Peterson and Nancy Owens and James L. Carroll
  • Reference: In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2001.
  • BibTeX:
    @inproceedings{fulda_2001a,
    author = {Peterson, Todd S. and Owens, Nancy and Carroll, James L.},
    title = {Towards Automatic Shaping in Robot Navigation.},
    booktitle = {Proceedings of the {IEEE} International Conference on Robotics and Automation ({ICRA})},
    year = {2001},
    }
  • Download the file: ps, pdf

Memory-guided Exploration in Reinforcement Learning.

  • Authors: James L. Carroll and Todd S. Peterson and Nancy Owens
  • Reference: In Proceedings of the INNS-IEEE International Joint Conference on Neural Networks (IJCNN), 2001.
  • BibTeX:
    @inproceedings{fulda_2001b,
    author = {Carroll, James L. and Peterson, Todd S. and Owens, Nancy},
    title = {Memory-guided Exploration in Reinforcement Learning.},
    booktitle = {Proceedings of the {INNS}-{IEEE} International Joint Conference on Neural Networks ({IJCNN})},
    year = {2001},
    }
  • Download the file: ps, pdf

Using a Reinforcement Learning Controller to Overcome Simulator/Environment Discrepancies.

  • Authors: Nancy Owens and Todd S. Peterson
  • Reference: In Proceedings of the IEEE Conference on Systems, Man, and Cybernetics, 2001.
  • BibTeX:
    @inproceedings{fulda_2001c,
    author = {Owens, Nancy and Peterson, Todd S.},
    title = {Using a Reinforcement Learning Controller to Overcome Simulator/Environment Discrepancies.},
    booktitle = {Proceedings of the {IEEE} Conference on Systems, Man, and Cybernetics},
    year = {2001},
    }
  • Download the file: pdf

Waffles: A Machine Learning Toolkit

  • Authors: Michael S. Gashler
  • Abstract: We present a breadth-oriented collection of cross-platform command-line tools for researchers in machine learning called Waffles. The Waffles tools are designed to offer a broad spectrum of functionality in a manner that is friendly for scripted automation. All functionality is also available in a C++ class library. Waffles is available under the GNU Lesser General Public License.
  • Reference: Journal of Machine Learning Research, volume MLOSS 12, pages 2383–2387, JMLR.org and Microtome Publishing, July 2011.
  • BibTeX:
    @article{gashler2011jmlr,
    author = {Gashler, Michael S.},
    title = {Waffles: A Machine Learning Toolkit},
    journal = {Journal of Machine Learning Research},
    volume = {MLOSS 12},
    pages = {2383--2387},
    publisher = {JMLR.org and Microtome Publishing},
    month = {July},
    year = {2011},
    issn = {1532-4435},
    url = {http://www.jmlr.org/papers/volume12/gashler11a/gashler11a.pdf},
    }
  • Download the file: pdf

Tangent Space Guided Intelligent Neighbor Finding

  • Authors: Michael S. Gashler and Tony Martinez
  • Abstract: We present an intelligent neighbor-finding algorithm called SAFFRON that chooses neighboring points while avoiding making connections between points on geodesically distant regions of a manifold. SAFFRON identifies the suitability of points to be neighbors by using a relaxation technique that alternately estimates the tangent space at each point, and measures how well the estimated tangent spaces align with each other. This technique enables SAFFRON to form high-quality local neighborhoods, even on manifolds that pass very close to themselves. SAFFRON is even able to find neighborhoods that correctly follow the manifold topology of certain self-intersecting manifolds.
  • Reference: In Proceedings of the IEEE International Joint Conference on Neural Networks IJCNN’11, pages 2617–2624, IEEE Press, 2011.
  • BibTeX:
    @incollection{gashler2011ijcnn1,
    author = {Gashler, Michael S. and Martinez, Tony},
    title = {Tangent Space Guided Intelligent Neighbor Finding},
    booktitle = {Proceedings of the {IEEE} International Joint Conference on Neural Networks {IJCNN}'11},
    pages = {2617--2624},
    publisher = {IEEE Press},
    year = {2011},
    location = {San Jose, California, U.S.A.},
    }
  • Download the file: pdf

Temporal Nonlinear Dimensionality Reduction

  • Authors: Michael S. Gashler and Tony Martinez
  • Abstract: Existing nonlinear dimensionality reduction (NLDR) algorithms make the assumption that distances between observations are uniformly scaled. Unfortunately, with many interesting systems, this assumption does not hold. We present a new technique called Temporal NLDR (TNLDR), which is specifically designed for analyzing the high-dimensional observations obtained from random-walks with dynamical systems that have external controls. It uses the additional information implicit in ordered sequences of observations to compensate for non-uniform scaling in observation space. We demonstrate that TNLDR computes more accurate estimates of intrinsic state than regular NLDR, and we show that accurate estimates of state can be used to train accurate models of dynamical systems.
  • Reference: In Proceedings of the IEEE International Joint Conference on Neural Networks IJCNN’11, pages 1959–1966, IEEE Press, 2011.
  • BibTeX:
    @incollection{gashler2011ijcnn2,
    author = {Gashler, Michael S. and Martinez, Tony},
    title = {Temporal Nonlinear Dimensionality Reduction},
    booktitle = {Proceedings of the {IEEE} International Joint Conference on Neural Networks {IJCNN}'11},
    pages = {1959--1966},
    publisher = {IEEE Press},
    year = {2011},
    location = {San Jose, California, U.S.A.},
    }
  • Download the file: pdf

Manifold Learning by Graduated Optimization

  • Authors: Michael S. Gashler and Dan Ventura and Tony Martinez
  • Abstract: We present an algorithm for manifold learning called Manifold Sculpting, which utilizes graduated optimization to seek an accurate manifold embedding. Empirical analysis across a wide range of manifold problems indicates that Manifold Sculpting yields more accurate results than a number of existing algorithms, including Isomap, LLE, HLLE, and L-MVU, and is significantly more efficient than HLLE and L-MVU. Manifold Sculpting also has the ability to benefit from prior knowledge about expected results.
  • Reference: IEEE Transactions on Systems, Man, and Cybernetics, Part B, volume PP (99), pages 1–13, 2011.
  • BibTeX:
    @article{gashler2011smc,
    author = {Gashler, Michael S. and Ventura, Dan and Martinez, Tony},
    title = {Manifold Learning by Graduated Optimization},
    journal = {{IEEE} Transactions on Systems, Man, and Cybernetics, Part B},
    volume = {PP},
    number = {99},
    pages = {1--13},
    year = {2011},
    doi = {10.1109/TSMCB.2011.2151187},
    }
  • Download the file: pdf

Decision Tree Ensemble: Small Heterogeneous Is Better Than Large Homogeneous

  • Authors: Michael S. Gashler and Christophe Giraud-Carrier and Tony Martinez
  • Abstract: Using decision trees that split on randomly selected attributes is one way to increase the diversity within an ensemble of decision trees. Another approach increases diversity by combining multiple tree algorithms. The random forest approach has become popular because it is simple and yields good results with common datasets. We present a technique that combines heterogeneous tree algorithms and contrast it with homogeneous forest algorithms. Our results indicate that random forests do poorly when faced with irrelevant attributes, while our heterogeneous technique handles them robustly. Further, we show that large ensembles of random trees are more susceptible to diminishing returns than our technique. We are able to obtain better results across a large number of common datasets with a significantly smaller ensemble.
  • Reference: In Seventh International Conference on Machine Learning and Applications, 2008. ICMLA ’08., pages 900–905, Dec. 2008.
  • BibTeX:
    @incollection{gashler2008icmla,
    author = {Gashler, Michael S. and Giraud-Carrier, Christophe and Martinez, Tony},
    title = {Decision Tree Ensemble: Small Heterogeneous Is Better Than Large Homogeneous},
    booktitle = {Seventh International Conference on Machine Learning and Applications, 2008. ICMLA '08.},
    pages = {900--905},
    month = {Dec.},
    year = {2008},
    doi = {10.1109/ICMLA.2008.154},
    }
  • Download the file: pdf
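
A simplified stand-in built from scikit-learn components, not the paper's particular mixture of tree algorithms: a small heterogeneous ensemble of differently configured trees compared against a larger homogeneous random forest on data containing many irrelevant attributes.

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier, VotingClassifier
    from sklearn.model_selection import cross_val_score
    from sklearn.tree import DecisionTreeClassifier, ExtraTreeClassifier

    X, y = make_classification(n_samples=500, n_features=30, n_informative=5,
                               random_state=0)     # most attributes carry no class information

    heterogeneous = VotingClassifier([
        ("entropy", DecisionTreeClassifier(criterion="entropy", random_state=0)),
        ("gini", DecisionTreeClassifier(criterion="gini", random_state=0)),
        ("random_splits", ExtraTreeClassifier(random_state=0)),
    ])
    homogeneous = RandomForestClassifier(n_estimators=100, random_state=0)

    for name, model in [("heterogeneous trio", heterogeneous),
                        ("homogeneous forest", homogeneous)]:
        print("%s: CV accuracy %.3f" % (name, cross_val_score(model, X, y, cv=5).mean()))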

Iterative Non-linear Dimensionality Reduction with Manifold Sculpting

  • Authors: Michael S. Gashler and Dan Ventura and Tony Martinez
  • Abstract: Many algorithms have been recently developed for reducing dimensionality by projecting data onto an intrinsic non-linear manifold. Unfortunately, existing algorithms often lose significant precision in this transformation. Manifold Sculpting is a new algorithm that iteratively reduces dimensionality by simulating surface tension in local neighborhoods. We present several experiments that show Manifold Sculpting yields more accurate results than existing algorithms with both generated and natural data-sets. Manifold Sculpting is also able to benefit from prior dimensionality reduction efforts.
  • Reference: In Platt, J.C. and Koller, D. and Singer, Y. and Roweis, S., editor, Advances in Neural Information Processing Systems 20, pages 513–520, MIT Press, Cambridge, MA, 2008.
  • BibTeX:
    @incollection{gashler2007nips,
    author = {Gashler, Michael S. and Ventura, Dan and Martinez, Tony},
    title = {Iterative Non-linear Dimensionality Reduction with Manifold Sculpting},
    editor = {Platt, J.C. and Koller, D. and Singer, Y. and Roweis, S.},
    booktitle = {Advances in Neural Information Processing Systems 20},
    pages = {513--520},
    publisher = {MIT Press},
    address = {Cambridge, MA},
    year = {2008},
    }
  • Download the file: pdf

Learning by Discrimination: A Constructive Incremental Approach

  • Authors: Christophe Giraud-Carrier and Tony R. Martinez
  • Abstract: This paper presents i-AA1*, a constructive, incremental learning algorithm for a special class of weightless, self-organizing networks. In i-AA1*, learning consists of adapting the nodes’ functions and the network’s overall topology as each new training pattern is presented. Provided the training data is consistent, computational complexity is low and prior factual knowledge may be used to “prime” the network and improve its predictive accuracy and/or efficiency. Empirical generalization results on both toy problems and more realistic tasks demonstrate promise.
  • Reference: Journal of Computers, volume 2 (7), pages 49–58, September 2007.
  • BibTeX:
    @article{cgc.jcp2007,
    author = {Giraud-Carrier, Christophe and Martinez, Tony R.},
    title = {Learning by Discrimination: A Constructive Incremental Approach},
    journal = {Journal of Computers},
    volume = {2},
    number = {7},
    pages = {49--58},
    month = {September},
    year = {2007},
    issn = {1796-203X},
    }
  • Download the file: pdf

A Constructive Incremental Learning Algorithm for Binary Classification Tasks

  • Authors: Christophe Giraud-Carrier and Tony R. Martinez
  • Reference: In Proceedings of SMCals/06, pages 213–218, 2006.
  • BibTeX:
    @inproceedings{cgc_smc2006,
    author = {Giraud-Carrier, Christophe and Martinez, Tony R.},
    title = {A Constructive Incremental Learning Algorithm for Binary Classification Tasks},
    booktitle = {Proceedings of SMCals/06},
    pages = {213--218},
    year = {2006},
    }
  • Download the file: pdf

An Efficient Metric for Heterogeneous Inductive Learning Applications in the Attribute-Value Language

  • Authors: Christophe Giraud-Carrier and Tony R. Martinez
  • Abstract: Many inductive learning problems can be expressed in the classical attribute-value language. In order to learn and to generalize, learning systems often rely on some measure of similarity between their current knowledge base and new information. The attribute-value language defines a heterogeneous multi-dimensional input space, where some attributes are nominal and others linear. Defining similarity, or proximity, of two points in such input spaces is non trivial. We discuss two representative homogeneous metrics and show examples of why they are limited to their own domains. We then address the issues raised by the design of a heterogeneous metric for inductive learning systems. In particular, we discuss the need for normalization and the impact of don’t-care values. We propose a heterogeneous metric and evaluate it empirically on a simplified version of ILA.
  • Reference: Yfantis, Evangelos A., editor, Intelligent Systems, volume 1, pages 341–350, Kluwer Academic Publishers, 1995.
  • BibTeX:
    @article{cgc_94c,
    author = {Giraud-Carrier, Christophe and Martinez, Tony R.},
    title = {An Efficient Metric for Heterogeneous Inductive Learning Applications in the Attribute-Value Language},
    editor = {Yfantis, Evangelos A.},
    journal = {Intelligent Systems},
    volume = {1},
    pages = {341--350},
    publisher = {Kluwer Academic Publishers},
    year = {1995},
    }
  • Download the file: ps
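
A generic sketch of a heterogeneous attribute-value distance in the overlap/normalized-difference style, not necessarily the metric proposed in the paper; in particular, the don't-care handling shown here is an assumption.

    def heterogeneous_distance(a, b, kinds, spans, dont_care=None):
        """kinds[i] is 'nominal' or 'linear'; spans[i] is max-min for linear attributes."""
        total = 0.0
        for x, y, kind, span in zip(a, b, kinds, spans):
            if x == dont_care or y == dont_care:
                d = 0.0                      # a don't-care value matches anything
            elif kind == 'nominal':
                d = 0.0 if x == y else 1.0   # overlap metric for nominal attributes
            else:
                d = abs(x - y) / span        # range-normalized difference for linear attributes
            total += d * d
        return total ** 0.5

    kinds = ['nominal', 'linear', 'linear']
    spans = [None, 50.0, 100.0]              # the span is unused for nominal attributes
    print(heterogeneous_distance(('red', 21, 70), ('blue', 30, None), kinds, spans))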

An Integrated Framework for Learning and Reasoning

  • Authors: Christophe Giraud-Carrier and Tony R. Martinez
  • Abstract: Learning and reasoning are both aspects of what is considered to be intelligence. Their studies within AI have been separated historically, learning being the topic of machine learning and neural networks, and reasoning falling under classical (or symbolic) AI. However, learning and reasoning are in many ways interdependent. This paper discusses the nature of some of these interdependencies, and proposes a general framework called FLARE, that combines inductive learning using prior knowledge together with reasoning. Several examples are presented that serve as a benchmark to test the framework, including classical induction, several commonsense protocols, and the use of reasoning to discover prior knowledge that can be used as a learning bias for inductive learning.
  • Reference: Journal of Artificial Intelligence Research, volume 3, pages 147–185, 1995.
  • BibTeX:
    @article{cgc_95a,
    author = {Giraud-Carrier, Christophe and Martinez, Tony R.},
    title = {An Integrated Framework for Learning and Reasoning},
    journal = {Journal of Artificial Intelligence Research},
    volume = {3},
    pages = {147--185},
    year = {1995},
    }
  • Download the file: ps

AA1*: A Dynamic Incremental Network that Learns by Discrimination

  • Authors: Christophe Giraud-Carrier and Tony R. Martinez
  • Abstract: An incremental learning algorithm for a special class of self-organising, dynamic networks is presented. Learning is effected by adapting both the function performed by the nodes and the overall network topology, so that the network grows (or shrinks) over time to fit the problem. Convergence is guaranteed on any arbitrary Boolean dataset and empirical generalisation results demonstrate promise.
  • Reference: In Proceedings of the International Conference on Artificial Neural Networks and Genetic Algorithms (ICANNGA’95), pages 45–48, 1995.
  • BibTeX:
    @inproceedings{cgc_95b,
    author = {Giraud-Carrier, Christophe and Martinez, Tony R.},
    title = {{AA1}*: A Dynamic Incremental Network that Learns by Discrimination},
    booktitle = {Proceedings of the International Conference on Artificial Neural Networks and Genetic Algorithms ({ICANNGA}'95)},
    pages = {45--48},
    year = {1995},
    }
  • Download the file: ps

Analysis of the Convergence and Generalization of AA1

  • Authors: Christophe Giraud-Carrier and Tony R. Martinez
  • Abstract: AA1 is an incremental learning algorithm for Adaptive Self-Organizing Concurrent Systems (ASOCS). ASOCS are self-organizing, dynamically growing networks of computing nodes. AA1 learns by discrimination and implements knowledge in a distributed fashion over all the nodes. This paper reviews AA1 from the perspective of convergence and generalization. A formal proof that AA1 converges on any arbitrary Boolean instance set is given. A discussion of generalization and other aspects of AA1, including the problem of handling inconsistency, follows. Results of simulations with real-world data are presented. They show that AA1 gives promising generalization.
  • Reference: Journal of Parallel and Distributed Computing, volume 26, pages 125–131, 1995.
  • BibTeX:
    @article{cgc_95c,
    author = {Giraud-Carrier, Christophe and Martinez, Tony R.},
    title = {Analysis of the Convergence and Generalization of {AA1}},
    journal = {Journal of Parallel and Distributed Computing},
    volume = {26},
    pages = {125--131},
    year = {1995},
    }
  • Download the file: ps

On Integrating Inductive Learning with Prior Knowledge and Reasoning

  • Authors: Christophe Giraud-Carrier
  • Abstract: Learning and reasoning are both aspects of what is considered to be intelligence. Their studies within AI have been separated historically, learning being the topic of neural networks and machine learning, and reasoning falling under classical (or symbolic) AI. However, learning and reasoning share many interdependencies, and the integration of the two may lead to more powerful models. This dissertation examines some of these interdependencies, and describes several models, culminating in a system called FLARE (Framework for Learning And REasoning). The proposed models integrate inductive learning with prior knowledge and reasoning. Learning is incremental, prior knowledge is given by a teacher or deductively obtained by instantiating commonsense knowledge, and reasoning is non-monotonic. Simulation results on several datasets and classical commonsense protocols demonstrate promise.
  • Reference: PhD thesis, Brigham Young University, December 1994.
  • BibTeX:
    @phdthesis{cgc_diss,
    author = {Giraud-Carrier, Christophe},
    title = {On Integrating Inductive Learning with Prior Knowledge and Reasoning},
    school = {Brigham Young University},
    month = {December},
    year = {1994},
    }
  • Download the file: ps

Seven Desirable Properties for Artificial Learning Systems

  • Authors: Christophe Giraud-Carrier and Tony R. Martinez
  • Abstract: Much effort has been devoted to understanding learning and reasoning in artificial intelligence, giving rise to a wide collection of models. For the most part, these models focus on some observed characteristic of human learning, such as induction or analogy, in an effort to emulate (and possibly exceed) human abilities. We propose seven desirable properties for artificial learning systems: incrementality, non-monotonicity, inconsistency and conflicting defaults handling, abstraction, self-organization, generalization, and computational tractability. We examine each of these properties in turn and show how their (combined) use can improve learning and reasoning, as well as potentially widen the range of applications of artificial learning systems. An overview of the algorithm PDL2, which begins to integrate the above properties, is given as a proof of concept.
  • Reference: In Proceedings of FLAIRS’94 Florida Artificial Intelligence Research Symposium, pages 16–20, 1994.
  • BibTeX:
    @inproceedings{cgc_94a,
    author = {Giraud-Carrier, Christophe and Martinez, Tony R.},
    title = {Seven Desirable Properties for Artificial Learning Systems},
    booktitle = {Proceedings of {FLAIRS}'94 Florida Artificial Intelligence Research Symposium},
    pages = {16--20},
    year = {1994},
    }
  • Download the file: ps

An Incremental Learning Model for Commonsense Reasoning

  • Authors: Christophe Giraud-Carrier and Tony R. Martinez
  • Abstract: A self-organizing incremental learning model that attempts to combine inductive learning with prior knowledge and default reasoning is described. The inductive learning scheme accounts for useful generalizations and dynamic priority allocation, and effectively supplements prior knowledge. New rules may be created and existing rules modified, thus allowing the system to evolve over time. By combining the extensional and intensional approaches to learning rules, the model remains self-adaptive, while not having to unnecessarily suffer from poor (or atypical) learning environments. By combining rule-based and similarity-based reasoning, the model effectively deals with many aspects of brittleness.
  • Reference: In Proceedings of the Seventh International Symposium on Artificial Intelligence (ISAI’94), pages 134–141, 1994.
  • BibTeX:
    @inproceedings{cgc_94b,
    author = {Giraud-Carrier, Christophe and Martinez, Tony R.},
    title = {An Incremental Learning Model for Commonsense Reasoning},
    booktitle = {Proceedings of the Seventh International Symposium on Artificial Intelligence ({ISAI}'94)},
    pages = {134--141},
    year = {1994},
    }
  • Download the file: ps

Using Precepts to Augment Training Set Learning

  • Authors: Christophe Giraud-Carrier and Tony R. Martinez
  • Abstract: The goal of learning systems is to generalize. Generalization is commonly based on the set of critical features the system has available. Training set learners typically extract critical features from a random set of examples. While this approach is attractive, it suffers from the exponential growth of the number of features to be searched. We propose to extend it by endowing the system with some a priori knowledge, in the form of precepts. Advantages of the augmented system are speed-up, improved generalization, and greater parsimony. This paper presents a precept-driven learning algorithm. Its main features include: 1) distributed implementation, 2) bounded learning and execution times, and 3) ability to handle both correct and incorrect precepts. Results of simulations on real-world data demonstrate promise.
  • Reference: In Proceedings of the First New Zealand International Two-Stream Conference on Artificial Neural Networks and Expert Systems (ANNES’93), pages 46–51, November 1993.
  • BibTeX:
    @inproceedings{cgc_93a,
    author = {Giraud-Carrier, Christophe and Martinez, Tony R.},
    title = {Using Precepts to Augment Training Set Learning},
    booktitle = {Proceedings of the First New Zealand International Two-Stream Conference on Artificial Neural Networks and Expert Systems ({ANNES}'93)},
    pages = {46--51},
    month = {November},
    year = {1993},
    }
  • Download the file: ps

A Precept-Driven Learning Algorithm

  • Authors: Christophe Giraud-Carrier
  • Abstract: Machine learning is an attempt at devising mechanisms that machines can use to learn, rather than being explicitly programmed for, real-world applications. The goal of learning systems is to generalize. Generalization is based on the set of critical features the system has available. Training set learners typically extract critical features from a random set of examples drawn from experimentation. This approach can beneficially be extended by endowing the system with some a priori knowledge, in the form of precepts. Advantages of the augmented system include speed-up, improved generalization and greater parsimony. This thesis presents a precept-driven learning algorithm. The main characteristics of the algorithm include: 1) neurally inspired architecture, 2) bounded learning and execution times, and 3) ability to handle both correct and incorrect precepts. Results of simulations on real-world data demonstrate promise.
  • Reference: Master’s thesis, Brigham Young University, April 1993.
  • BibTeX:
    @mastersthesis{cgc_th,
    author = {Giraud-Carrier, Christophe},
    title = {A Precept-Driven Learning Algorithm},
    school = {Brigham Young University},
    month = {April},
    year = {1993},
    }
  • Download the file: ps

Spatiotemporal Pattern Recognition in Liquid State Machines

  • Authors: Eric Goodman and Dan Ventura
  • Abstract: The applicability of complex networks of spiking neurons as a general purpose machine learning technique remains open. Building on previous work using macroscopic exploration of the parameter space of an (artificial) neural microcircuit, we investigate the possibility of using a liquid state machine to solve two real-world problems: stockpile surveillance signal alignment and spoken phoneme recognition.
  • Reference: In Proceedings of the International Joint Conference on Neural Networks, pages 7979–7584, Vancouver, BC, July 2006.
  • BibTeX:
    @inproceedings{goodman.ijcnn06,
    author = {Goodman, Eric and Ventura, Dan},
    title = {Spatiotemporal Pattern Recognition in Liquid State Machines},
    booktitle = {Proceedings of the International Joint Conference on Neural Networks},
    pages = {7979--7584},
    address = {Vancouver, BC},
    month = {July},
    year = {2006},
    }
  • Download the file: pdf

Effectively Using Recurrently Connected Spiking Neural Networks

  • Authors: Eric Goodman and Dan Ventura
  • Abstract: Recurrently connected spiking neural networks are difficult to use and understand because of the complex nonlinear dynamics of the system. Through empirical studies of spiking networks, we deduce several principles which are critical to success. Network parameters such as synaptic time delays and time constants and the connection probabilities can be adjusted to have a significant impact on accuracy. We show how to adjust these parameters to fit the type of problem.
  • Reference: In Proceedings of the International Joint Conference on Neural Networks, pages 1542–1547, July 2005.
  • BibTeX:
    @inproceedings{goodman.ijcnn05,
    author = {Goodman, Eric and Ventura, Dan},
    title = {Effectively Using Recurrently Connected Spiking Neural Networks},
    booktitle = {Proceedings of the International Joint Conference on Neural Networks},
    pages = {1542--1547},
    month = {July},
    year = {2005},
    }
  • Download the file: pdf

Time Invariance and Liquid State Machines

  • Authors: Eric Goodman and Dan Ventura
  • Abstract: Time invariant recognition of spatiotemporal patterns is a common task of signal processing. The liquid state machine (LSM) is a paradigm which robustly handles this type of classification. Using an artificial dataset with target pattern lengths ranging from 0.1 to 1.0 seconds, we train an LSM to find the start of the pattern with a mean absolute error of 0.18 seconds. Also, LSMs can be trained to identify spoken digits, 1-9, with an accuracy of 97.6%, even with scaling by factors ranging from 0.5 to 1.5.
  • Reference: In Proceedings of the Joint Conference on Information Sciences, pages 420–423, July 2005.
  • BibTeX:
    @inproceedings{goodman.jcis05,
    author = {Goodman, Eric and Ventura, Dan},
    title = {Time Invariance and Liquid State Machines},
    booktitle = {Proceedings of the Joint Conference on Information Sciences},
    pages = {420--423},
    month = {July},
    year = {2005},
    }
  • Download the file: pdf

Extending ASOCS to Training-Set-Style Data

  • Authors: Edward F. Hart
  • Abstract: This thesis studies extensions to help the basic ASOCS neural network handle training set data. Background is discussed on the need for neural networks. Instance set math for the ASOCS model is presented. Six data sets are gathered and mapped into the ASOCS neural net. Tests are done with the data sets to assist in the discovery of several problems. Some of these problems have solutions suggested in this thesis.

    Mutual exclusion, arbitrary discretization, One Difference minimization, noisy and missing data, and generalization using a Hamming distance metric are all discussed in the research section. As a result of these extensions to the basic ASOCS model, training-set-style data is recognized much better than without the extensions. Further research using large test sets, other generalization techniques, and different discretization techniques is suggested.
  • Reference: Master’s thesis, Brigham Young University, August 1992.
  • BibTeX:
    @mastersthesis{hart_th,
    author = {Hart, Edward F.},
    title = {Extending {ASOCS} to Training-Set-Style Data},
    school = {Brigham Young University},
    month = {August},
    year = {1992},
    }
  • Download the file: ps

Constructing Low-Order Discriminant Neural Networks Using Statistical Feature Selection

  • Authors: Eric Henderson and Tony R. Martinez
  • Reference: Journal of Intelligent Systems, volume 14, 2005.
  • BibTeX:
    @article{henderson.jis2005,
    author = {Henderson, Eric and Martinez, Tony R.},
    title = {Constructing Low-Order Discriminant Neural Networks Using Statistical Feature Selection},
    journal = {Journal of Intelligent Systems},
    volume = {14},
    year = {2005},
    }
  • Download the file: pdf

Pair Attribute Learning: Network Construction Using Pair Features

  • Authors: Eric Henderson and Tony R. Martinez
  • Abstract: We present the Pair Attribute Learning (PAL) algorithm for the selection of relevant inputs and network topology. Correlations on training instance pairs are used to drive network construction of a single-hidden layer MLP. Results on nine learning problems demonstrate 70% less complexity, on average, without a significant loss of accuracy.
  • Reference: In Proceedings of the IEEE International Joint Conference on Neural Networks IJCNN’02, pages 2556–2561, 2002.
  • BibTeX:
    @inproceedings{EricIJCNN,
    author = {Henderson, Eric and Martinez, Tony R.},
    title = {Pair Attribute Learning: Network Construction Using Pair Features},
    booktitle = {Proceedings of the {IEEE} International Joint Conference on Neural Networks {IJCNN}'02},
    pages = {2556--2561},
    year = {2002},
    }
  • Download the file: pdf

Prioritized Rule Systems

  • Authors: Brent W. Hughes
  • Abstract: Non-von Neumann architectures attempt to overcome the “word-at-a-time” bottleneck of traditional computing systems. Neural nets are a class of non-von architectures whose goal is, among other things, to learn input-output mappings using highly distributed processing and memory. A neural net consists of many simple processing elements (nodes) with modifiable links between them, allowing for a high degree of parallelism. A typical neural net has a fixed topology. It learns by modifying the “weights” or “conductances” of the links between nodes.

    Another model with similar goals, called ASOCS, learns by modifying its topology. Unlike a typical neural net, ASOCS is guaranteed to be able to represent and learn any desired mapping and to do so efficiently. This thesis presents an extension of ASOCS called Prioritized Rule Systems. The PRS abstract model provides a foundation for various architectural models that have a number of advantages over other ASOCS models. One example of such an architectural model is presented in the thesis along with a description of a program written to simulate that model. The processing and learning algorithms of the architectural model are based on theorems proved in the thesis.
  • Reference: Master’s thesis, Brigham Young University, November 1989.
  • BibTeX:
    @mastersthesis{hughes_th,
    author = {Hughes, Brent W.},
    title = {Prioritized Rule Systems},
    school = {Brigham Young University},
    month = {November},
    year = {1989},
    }
  • Download the file: ps

Improved Backpropagation Learning in Neural Networks with Windowed Momentum

  • Authors: Butch Istook and Tony R. Martinez
  • Abstract: Backpropagation, which is frequently used in neural network training, often takes a great deal of time to converge on an acceptable solution. Momentum is a standard technique that is used to speed up convergence and maintain generalization performance. In this paper we present the Windowed Momentum algorithm, which provides greater speed-up than standard momentum. Windowed Momentum is designed to use a fixed-width history of recent weight updates for each connection in a neural network. By using this additional information, Windowed Momentum gives significant speed-up over a set of applications with the same or improved accuracy. Windowed Momentum achieved an average speed-up of 32% in convergence time on 15 data sets, including a large OCR data set with over 500,000 samples. In addition to this speed-up, we examine the consequences of sample presentation order and show that Windowed Momentum is able to overcome the effects of poor presentation order while maintaining its speed-up advantages. (An illustrative sketch of such an update rule follows this entry.)
  • Reference: International Journal of Neural Systems, volume 3&4, pages 303–318, 2002.
  • BibTeX:
    @article{IstookIJNS,
    author = {Istook, Butch and Martinez, Tony R.},
    title = {Improved Backpropagation Learning in Neural Networks with Windowed Momentum},
    journal = {International Journal of Neural Systems},
    volume = {3&4},
    pages = {303--318},
    year = {2002},
    }
  • Download the file: pdf
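  • Illustrative sketch: the fixed-width history of recent weight updates described in the abstract suggests an update rule along the following lines; the averaging and scaling choices below are assumptions for illustration, not the published algorithm.
    from collections import deque
    import numpy as np

    class WindowedMomentumSGD:
        """Gradient descent whose momentum term is the scaled average of the
        last `window` weight updates (an assumption for illustration)."""
        def __init__(self, shape, lr=0.1, momentum=0.5, window=5):
            self.lr, self.momentum = lr, momentum
            self.history = deque(maxlen=window)   # fixed-width update history
            self.w = np.zeros(shape)

        def step(self, grad):
            past = np.mean(self.history, axis=0) if self.history else 0.0
            update = -self.lr * grad + self.momentum * past
            self.history.append(update)
            self.w += update
            return self.w

    # Toy usage: minimize f(w) = ||w - 1||^2, whose gradient is 2(w - 1).
    opt = WindowedMomentumSGD(shape=(3,))
    for _ in range(50):
        opt.step(2.0 * (opt.w - 1.0))
    print(opt.w)   # approaches [1, 1, 1]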

Improving Text Classification using Conceptual and Contextual Features

  • Authors: Lee S. Jensen and Tony R. Martinez
  • Abstract: The exponential growth of text available on the Internet has created a critical need for accurate, fast, and general-purpose text classification algorithms. This paper examines the improvement of broadly based text classification by using features that are easily extracted from training documents. These features represent both the conceptual and contextual properties of a target class, and include synonyms, hypernyms, term frequency, and bigrams of nouns, synonyms and hypernyms. Multiple permutations of the features are applied to three different classification models (Coordinate matching, TF*IDF, and naive Bayes) over three diverse data sets (Reuters, USENET, and folk songs). The findings are also compared to previously published results for the rule-based learner Ripper and results obtained by using another naive Bayes classifier, Rainbow. Suggestions are made about how to automatically determine which features to use, based upon the data set in question. The results demonstrate that the introduction of both conceptual and contextual features decreases the error by as much as 33%.
  • Reference: In KDD 2000 Text Mining Workshop, Boston, pages 101–102, 2000.
  • BibTeX:
    @misc{jensen,
    author = {Jensen, Lee S. and Martinez, Tony R.},
    title = {Improving Text Classification using Conceptual and Contextual Features},
    pages = {101--102},
    howpublished = {{KDD} 2000, Text Mining Workshop, Boston.},
    year = {2000},
    }
  • Download the file: ps
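  • Illustrative sketch: the conceptual features studied in the paper (synonyms, hypernyms) require a lexical resource such as WordNet, but the baseline models it compares, TF*IDF and naive Bayes over word and bigram features, can be sketched with standard tooling. The scikit-learn pipeline below is an assumed stand-in for such a baseline, not the authors' implementation.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline

    docs = ["the market rallied on strong earnings",
            "guitar chords for a traditional folk tune",
            "shares fell after the earnings report",
            "an old ballad sung at the festival"]
    labels = ["finance", "music", "finance", "music"]

    # Unigram + bigram TF*IDF features feeding a naive Bayes classifier.
    model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), MultinomialNB())
    model.fit(docs, labels)
    print(model.predict(["earnings and shares moved the market"]))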

A Data-dependent Distance Measure for Transductive Instance-based Learning

  • Authors: Jared Lundell and Dan Ventura
  • Abstract: We consider learning in a transductive setting using instance-based learning (k-NN) and present a method for constructing a data-dependent distance “metric” using both labeled training data as well as available unlabeled data (that is to be classified by the model). This new data-driven measure of distance is empirically studied in the context of various instance-based models and is shown to reduce error (compared to traditional models) under certain learning conditions. Generalizations and improvements are suggested.
  • Reference: In Proceedings of the IEEE International Conference on Systems, Man and Cybernetics, pages 2825–2830, October 2007.
  • BibTeX:
    @inproceedings{lundell.smc07,
    author = {Lundell, Jared and Ventura, Dan},
    title = {A Data-dependent Distance Measure for Transductive Instance-based Learning},
    booktitle = {Proceedings of the {IEEE} International Conference on Systems, Man and Cybernetics},
    pages = {2825--2830},
    month = {October},
    year = {2007},
    }
  • Download the file: pdf

Priority ASOCS

  • Authors: Tony R. Martinez and Brent W. Hughes and Douglas M. Campbell
  • Abstract: This paper presents an ASOCS (Adaptive Self-Organizing Concurrent System) model for massively parallel processing of incrementally defined rule systems in such areas as adaptive logic, robotics, logical inference, and dynamic control. An ASOCS is an adaptive network composed of many simple computing elements operating asynchronously and in parallel. An ASOCS can operate in either a data processing mode or a learning mode. During data processing mode, an ASOCS acts as a parallel hardware circuit. During learning mode, an ASOCS incorporates a rule expressed as a Boolean conjunction in a distributed fashion in time logarithmic in the number of rules. This paper proposes a learning algorithm and architecture for Priority ASOCS. This new ASOCS model uses rules with priorities. The new model has significant learning time and space complexity improvements over previous models.
  • Reference: Journal of Artificial Neural Networks, volume 3, pages 403–429, 1994.
  • BibTeX:
    @article{martinez_94a,
    author = {Martinez, Tony R. and Hughes, Brent W. and Campbell, Douglas M.},
    title = {Priority {ASOCS}},
    journal = {Journal of Artificial Neural Networks},
    volume = {3},
    pages = {403--429},
    year = {1994},
    }
  • Download the file: ps

A Generalizing Adaptive Discriminant Network

  • Authors: Tony R. Martinez and J. Cory Barker and Christophe Giraud-Carrier
  • Abstract: This paper overviews the AA1 (Adaptive Algorithm 1) model of the ASOCS (Adaptive Self-Organizing Concurrent Systems) approach. It also presents promising empirical generalization results of AA1 with actual data. AA1 is a topologically dynamic network which grows to fit the problem being learned. AA1 generalizes in a self-organizing fashion to a network which seeks to find features which discriminate between concepts. Convergence to a training set is both guaranteed and bounded linearly in time.
  • Reference: In Proceedings of the World Congress on Neural Networks, volume 1, pages 613–616, 1993.
  • BibTeX:
    @inproceedings{martinez_93a,
    author = {Martinez, Tony R. and Barker, J. Cory and Giraud-Carrier, Christophe},
    title = {A Generalizing Adaptive Discriminant Network},
    booktitle = {Proceedings of the World Congress on Neural Networks},
    volume = {1},
    pages = {613--616},
    year = {1993},
    }
  • Download the file: ps

Towards a General Distributed Platform for Learning and Generalization

  • Authors: Tony R. Martinez and Brent W. Hughes
  • Abstract: Different learning models employ different styles of generalization on novel inputs. This paper proposes the need for multiple styles of generalization to support a broad application base. The Priority ASOCS model (Priority Adaptive Self-Organizing Concurrent System) is overviewed and presented as a potential platform which can support multiple generalization styles. PASOCS is an adaptive network composed of many simple computing elements operating asynchronously and in parallel. The PASOCS can operate in either a data processing mode or a learning mode. During data processing mode, the system acts as a parallel hardware circuit. During learning mode, the PASOCS incorporates rules, with attached priorities, which represent the application being learned. Learning is accomplished in a distributed fashion in time logarithmic in the number of rules. The new model has significant learning time and space complexity improvements over previous models.
  • Reference: In Proceedings of the Conference on Artificial Neural Networks and Expert Systems ANNES’93, pages 216–219, 1993.
  • BibTeX:
    @inproceedings{martinez_93b,
    author = {Martinez, Tony R. and Hughes, Brent W.},
    title = {Towards a General Distributed Platform for Learning and Generalization},
    booktitle = {Proceedings of the Conference on Artificial Neural Networks and Expert Systems {ANNES}'93},
    pages = {216--219},
    year = {1993},
    }
  • Download the file: ps

A Learning Model for Adaptive Network Routing

  • Authors: Tony R. Martinez and George L. Rudolph
  • Abstract: Increasing size, complexity, and dynamics of networks require adaptive routing mechanisms. This paper proposes initial concepts towards a learning and generalization mechanism to support adaptive real-time routing. An ASOCS learning model is employed as the basic adaptive router. Generalization of routing is based not only on source/destination address, but also on such factors as packet size, priority, privacy, network congestion, etc. Mechanisms involving continual adaptation based on feedback are presented. Extensions to conventional addressing which can support learning and generalization are proposed.
  • Reference: In Proceedings of the International Workshop on Applications of Neural Networks to Telecommunications IWANNT’93, pages 183–187, 1993.
  • BibTeX:
    @inproceedings{martinez_93c,
    author = {Martinez, Tony R. and Rudolph, George L.},
    title = {A Learning Model for Adaptive Network Routing},
    booktitle = {Proceedings of the International Workshop on Applications of Neural Networks to Telecommunications {IWANNT}'93},
    pages = {183--187},
    year = {1993},
    }
  • Download the file: ps

A Survey of Neural Network Research and Fielded Applications

  • Authors: David Kemsley and Tony R. Martinez and Douglas M. Campbell
  • Abstract: This paper gives a tabular presentation of approximately one hundred current neural network applications at different levels of maturity, from research to fielded products. The goal of this paper is not to be exhaustive, but to give a sampling overview demonstrating the diversity and amount of current application effort in different areas. The paper should aid both researchers and implementors to understand the diverse and potential impact of neural networks in real world applications. Tabular information is given regarding different features of neural network application efforts including model used, types of input and output data, accuracy, and research status. An extended bibliography allows a mechanism for further study into promising areas.
  • Reference: International Journal of Neural Networks, volume 2/3/4, pages 123–133, 1992.
  • BibTeX:
    @article{kemsley_92,
    author = {Kemsley, David and Martinez, Tony R. and Campbell, Douglas M.},
    title = {A Survey of Neural Network Research and Fielded Applications},
    journal = {International Journal of Neural Networks},
    volume = {2/3/4},
    pages = {123--133},
    year = {1992},
    }
  • Download the file: ps

A Self-Adjusting Dynamic Logic Module

  • Authors: Tony R. Martinez and Douglas M. Campbell
  • Abstract: This paper presents an ASOCS (Adaptive Self-Organizing Concurrent System) model for massively parallel processing of incrementally defined rule systems in such areas as adaptive logic, robotics, logical inference, and dynamic control. An ASOCS is an adaptive network composed of many simple computing elements operating asynchronously and in parallel. This paper focuses on Adaptive Algorithm 2 (AA2) and details its architecture and learning algorithm. AA2 has significant memory and knowledge maintenance advantages over previous ASOCS models. An ASOCS can operate in either a data processing mode or a learning mode. During learning mode, the ASOCS is given a new rule expressed as a boolean conjunction. The AA2 learning algorithm incorporates the new rule in a distributed fashion in a short, bounded time. During data processing mode, the ASOCS acts as a parallel hardware circuit.
  • Reference: Journal of Parallel and Distributed Computing, volume 4, pages 303–313, 1991.
  • BibTeX:
    @article{martinez_91a,
    author = {Martinez, Tony R. and Campbell, Douglas M.},
    title = {A Self-Adjusting Dynamic Logic Module},
    journal = {Journal of Parallel and Distributed Computing},
    volume = {4},
    pages = {303--313},
    year = {1991},
    }
  • Download the file: pdf

A Self-Organizing Binary Decision Tree for Incrementally Defined Rule Based Systems

  • Authors: Tony R. Martinez and Douglas M. Campbell
  • Abstract: This paper presents an ASOCS (adaptive self-organizing concurrent system) model for massively parallel processing of incrementally defined rule systems in such areas as adaptive logic, robotics, logical inference, and dynamic control. An ASOCS is an adaptive network composed of many simple computing elements operating asynchronously and in parallel. This paper focuses on adaptive algorithm 3 (AA3) and details its architecture and learning algorithm. It has advantages over previous ASOCS models in simplicity, implementability, and cost. An ASOCS can operate in either a data processing mode or a learning mode. During the data processing mode, an ASOCS acts as a parallel hardware circuit. In learning mode, rules expressed as boolean conjunctions are incrementally presented to the ASOCS. All ASOCS learning algorithms incorporate a new rule in a distributed fashion in a short, bounded time.
  • Reference: IEEE Transactions on Systems, Man, and Cybernetics, volume 5, pages 1231–1238, 1991.
  • BibTeX:
    @inproceedings{martinez_91b,
    author = {Martinez, Tony R. and Campbell, Douglas M.},
    title = {A Self-Organizing Binary Decision Tree for Incrementally Defined Rule Based Systems},
    booktitle = {{IEEE} Transactions on Systems, Man, and Cybernetics},
    volume = {5},
    pages = {1231--1238},
    year = {1991},
    }
  • Download the file: pdf

ASOCS: Towards Bridging Neural Network and Artificial Intelligence Learning

  • Authors: Tony R. Martinez
  • Abstract: A new class of connectionist architectures is presented called ASOCS (Adaptive Self-Organizing Concurrent Systems) [3,4]. ASOCS models support efficient computation through self-organized learning and parallel execution. Learning is done through the incremental presentation of rules and/or examples. Data types include Boolean and multi-state variables; recent models support analog variables. The model incorporates rules into an adaptive logic network in a parallel and self-organizing fashion. The system itself resolves inconsistencies and generalizes as the rules are presented. After an introduction to the ASOCS paradigm, the abstract introduces current research thrusts which significantly increase the power and applicability of ASOCS models. For simplicity, we discuss only boolean mappings in the ASOCS overview.
  • Reference: In Proceedings of the 2nd Government Neural Network Workshop, 1991.
  • BibTeX:
    @inproceedings{martinez_91c,
    author = {Martinez, Tony R.},
    title = {{ASOCS}: Towards Bridging Neural Network and Artificial Intelligence Learning},
    booktitle = {Proceedings of the 2nd Government Neural Network Workshop},
    year = {1991},
    }
  • Download the file: ps

A Connectionist Method for Adaptive Real-Time Network Routing

  • Authors: Kelly C. McDonald and Tony R. Martinez and Douglas M. Campbell
  • Abstract: This paper proposes a connectionist mechanism to support adaptive real-time routing in computer networks. In particular, an Adaptive Self-Organizing Concurrent System (ASOCS) model is used as the basic network router. ASOCS are connectionist models which achieve learning and processing in a parallel and self-organizing fashion. By exploiting parallel processing the ASOCS network router addresses the increased speed and complexity in computer networks. By using the ASOCS adaptive learning paradigm, a network router can utilize more flexible routing algorithms.
  • Reference: In Proceedings of the 4th International Symposium on Artificial Intelligence, pages 371–377, 1991.
  • BibTeX:
    @inproceedings{mcdonald_91,
    author = {McDonald, Kelly C. and Martinez, Tony R. and Campbell, Douglas M.},
    title = {A Connectionist Method for Adaptive Real-Time Network Routing},
    booktitle = {Proceedings of the 4th International Symposium on Artificial Intelligence},
    pages = {371--377},
    year = {1991},
    }
  • Download the file: ps

Smart Memory: The Memory Processor Model

  • Authors: Tony R. Martinez
  • Abstract: This paper overviews and proposes a class of smart memory devices called the memory processor model. Smart memory entails the tight coupling of memory and logic. The model seeks to alleviate the von Neumann bottleneck, take advantage of technology trends, improve overall system speed, and add encapsulation advantages. Speed is increased through locality of processing, communication savings, higher-level functionality, and parallelism. Parallelism is exploited at both the micro and macro levels. Data objects are accessed through descriptors, which give the memory a meta-knowledge concerning the objects, allowing for nontraditional access mechanisms. Both data types and operations are programmable. Innovative processing schemes, coupled with emerging technology densities, allow for substantial speed-up in traditional and novel memory operations. Three important paradigms introduced are descriptor processing, where operations are accomplished without access to the actual data, associative descriptor processing, supporting highly parallel access and processing, and the single-program multiple-data method, allowing parallelism by simultaneous processing of data objects distributed amongst multiple smart memories. Examples of specific operations are presented. This paper presents initial studies into the smart memory mechanism with the goal of describing its potential and stimulating further work.
  • Reference: In IFIP International Conference, 1990, published in Modeling the Innovation: Communications, Automation and Information Systems, Carnevale, Lucertini, and Nicosia (Eds.), North-Holland, pages 481–488, 1990.
  • BibTeX:
    @inproceedings{martinez_90a,
    author = {Martinez, Tony R.},
    title = {Smart Memory: The Memory Processor Model},
    booktitle = {{IFIP} International Conference, 1990. In Modeling the Innovation: Communications, Automation and Information Systems, Carnevale, Lucertini, and Nicosia (Eds), North-Holland},
    pages = {481--488},
    year = {1990},
    }
  • Download the file: ps

Consistency and Generalization of Incrementally Trained Connectionist Models

  • Authors: Tony R. Martinez
  • Abstract: This paper discusses aspects of consistency and generalization in connectionist networks which learn through incremental training by examples or rules. Differences between training set learning and incremental rule or example learning are presented. Generalization, the ability to output reasonable mappings when presented with novel input patterns, is discussed in light of the above learning methods. In particular, the contrast between Hamming distance generalization and generalizing by high-order combinations of critical variables is overviewed. Examples of detailed rules for an incremental learning model are presented for both consistency and generalization constraints.
  • Reference: In Proceedings of the International Symposium on Circuits and Systems, pages 706–709, 1990.
  • BibTeX:
    @inproceedings{martinez_90b,
    author = {Martinez, Tony R.},
    title = {Consistency and Generalization of Incrementally Trained Connectionist Models},
    booktitle = {Proceedings of the International Symposium on Circuits and Systems},
    pages = {706--709},
    year = {1990},
    }
  • Download the file: ps

Progress in Neural Networks, ch. 5

  • Authors: Tony R. Martinez
  • Abstract: (Book chapter that overviews the first three ASOCS models.)
  • Reference: In Omidvar, Omid, editor, Progress in Neural Networks, volume 1, chapter 5: Adaptive Self-Organizing Concurrent Systems, pages 105–126, Ablex Publishing, 1990.
  • BibTeX:
    @inbook{martinez_90c,
    author = {Martinez, Tony R.},
    title = {Progress in Neural Networks, ch. 5},
    editor = {Omidvar, Omid},
    volume = {1},
    chapter = {Adaptive Self-Organizing Concurrent Systems},
    pages = {105--126},
    year = {1990},
    note = {Ablex Publishing},
    }
  • Download the file: ps

Smart Memory Architecture and Methods

  • Authors: Tony R. Martinez
  • Abstract: This paper discusses potential functionalities of smart memories. Smart memory entails the tight coupling of memory and logic. A specific architecture called the memory processor model is proposed. The model seeks to alleviate the von Neumann bottleneck, take advantage of technology trends, improve overall system speed, and add encapsulation advantages. Speed is increased through locality of processing, communication savings, higher-level functionality, and parallelism. Data objects are accessed through descriptors, which give the memory a meta-knowledge concerning the objects, allowing for nontraditional access mechanisms. Both data types and operations are programmable, and the model is streamlined for memory operations and services. Innovative processing schemes, coupled with emerging technology densities, allow for substantial fine-grain parallelism in traditional and novel memory operations. Three important paradigms introduced are descriptor processing, where operations are accomplished without access to the actual data, associative descriptor processing, supporting highly parallel access and processing, and the single-program multiple-data method, allowing parallelism by simultaneous processing of data objects distributed amongst multiple smart memories. Examples of specific operations are presented. This paper presents initial studies into the smart memory mechanism with the goal of describing its potential and stimulating further work.
  • Reference: Future Generation Computer Systems, volume 6, pages 145–162, 1990.
  • BibTeX:
    @article{martinez_90d,
    author = {Martinez, Tony R.},
    title = {Smart Memory Architecture and Methods},
    journal = {Future Generation Computer Systems},
    volume = {6},
    pages = {145--162},
    year = {1990},
    }
  • Download the file: ps

On the Pseudo Multilayer Learning of Backpropagation

  • Authors: Tony R. Martinez and M. Lindsey
  • Abstract: Rosenblatt’s convergence theorem for the simple perceptron initiated much excitement about iterative weight modifying neural networks. However, this convergence only holds for the class of linearly separable functions, which is vanishingly small compared to arbitrary functions. With multilayer networks of nonlinear units it is possible, though not guaranteed, to solve arbitrary functions. Backpropagation is a method of training multilayer networks to converge to the solution of arbitrary functions. This paper describes how classification takes place in single and multilayer networks using threshold or sigmoid nodes. It then shows that the current backpropagation method can only do effective learning on one layer of a network at a time.
  • Reference: In Proceedings of the IEEE Symposium on Parallel and Distributed Processing, pages 308–315, 1989.
  • BibTeX:
    @inproceedings{martinez_89a,
    author = {Martinez, Tony R. and Lindsey, M.},
    title = {On the Pseudo Multilayer Learning of Backpropagation},
    booktitle = {Proceedings of the {IEEE} Symposium on Parallel and Distributed Processing},
    pages = {308--315},
    year = {1989},
    }
  • Download the file: ps

Neural Network Applicability: Classifying the Problem Space

  • Authors: Tony R. Martinez
  • Abstract: The tremendous current effort to propose neurally inspired methods of computation forces closer scrutiny of the real-world application potential of these models. This paper categorizes applications into classes and particularly discusses features of applications which make them efficiently amenable to neural network methods. Computational machines do deterministic mappings of inputs to outputs and many computational mechanisms have been proposed for problem solutions. Neural network features include parallel execution, adaptive learning, generalization, and fault tolerance. Often, much effort is given to a model and applications which can already be implemented in a much more efficient way with an alternate technology. Neural networks are potentially powerful devices for many classes of applications, but not all. However, it is proposed that the class of applications for which neural networks are efficient is both large and commonly occurring in nature. Comparison of supervised, unsupervised, and generalizing systems is also included.
  • Reference: In Proceedings of the IASTED International Symposium on Expert Systems and Neural Networks, pages 41–44, 1989.
  • BibTeX:
    @inproceedings{martinez_89b,
    author = {Martinez, Tony R.},
    title = {Neural Network Applicability: Classifying the Problem Space},
    booktitle = {Proceedings of the {IASTED} International Symposium on Expert Systems and Neural Networks},
    pages = {41--44},
    year = {1989},
    }
  • Download the file: ps

ASOCS: A Multilayered Connectionist Network with Guaranteed Learning of Arbitrary Mappings

  • Authors: Tony R. Martinez
  • Abstract: This paper reviews features of a new class of multilayer connectionist architectures known as ASOCS (Adaptive Self-Organizing Concurrent Systems). ASOCS is similar to most decision-making neural network models in that it attempts to learn an adaptive set of arbitrary vector mappings. However, it differs dramatically in its mechanisms. ASOCS is based on networks of adaptive digital elements which self-modify using local information. Function specification is entered incrementally by use of rules, rather than complete input-output vectors, such that a processing network is able to extract critical features from a large environment and give output in a parallel fashion. Learning also uses parallelism and self-organization such that a new rule is completely learned in time linear with the depth of the network. The model guarantees learning of any arbitrary mapping of boolean input-output vectors. The model is also stable in that learning does not erase any previously learned mappings except those explicitly contradicted.
  • Reference: In 2nd IEEE International Conference on Neural Networks, August 1988.
  • BibTeX:
    @inproceedings{martinez_88c,
    author = {Martinez, Tony R.},
    title = {{ASOCS}: A Multilayered Connectionist Network with Guaranteed Learning of Arbitrary Mappings},
    booktitle = {2nd {IEEE} International Conference on Neural Networks},
    month = {August},
    year = {1988},
    }
  • Download the file: ps

Adaptive Parallel Logic Networks

  • Authors: Tony R. Martinez and J. J. Vidal
  • Abstract: This paper presents a detailed discussion of the ASOCS AA1 Algorithm.
  • Reference: Journal of Parallel and Distributed Computing, volume 1, pages 26–58, February 1988.
  • BibTeX:
    @article{martinez_88d,
    author = {Martinez, Tony R. and Vidal, J. J.},
    title = {Adaptive Parallel Logic Networks},
    journal = {Journal of Parallel and Distributed Computing},
    volume = {1},
    pages = {26--58},
    month = {February},
    year = {1988},
    }

On the Expedient Use of Neural Networks

  • Authors: Tony R. Martinez
  • Abstract: (Single Page Paper)
  • Reference: Neural Networks, volume 1, S1, p. 552, presented at the 1st Meeting of the International Neural Network Society, 1988.
  • BibTeX:
    @misc{martinez_88a,
    author = {Martinez, Tony R.},
    title = {On the Expedient Use of Neural Networks},
    volume = {1},
    howpublished = {Neural Networks, S1, p. 552, Presented at the 1st Meeting of the International Neural Network Society},
    year = {1988},
    }
  • Download the file: ps

Digital Neural Networks

  • Authors: Tony R. Martinez
  • Abstract: Demands for applications requiring massive parallelism in symbolic environments have given rebirth to research in models labeled as neural networks. These models are made up of many simple nodes which are highly interconnected such that computation takes place as data flows amongst the nodes of the network. To date, most models have proposed nodes based on simple analog functions, where inputs are multiplied by weights and summed, the total then optionally being transformed by an arbitrary function at the node. Learning in these systems is accomplished by adjusting the weights on the input lines. This paper discusses the use of digital (boolean) nodes as a primitive building block in connectionist systems. Digital nodes naturally engender new paradigms and mechanisms for learning and processing in connectionist networks. The digital nodes are used as the basic building block of a class of models called ASOCS (Adaptive Self-Organizing Concurrent Systems). These models combine massive parallelism with the ability to adapt in a self-organizing fashion. Basic features of standard neural network learning algorithms and those proposed using digital nodes are compared and contrasted. The latter mechanisms can lead to vastly improved efficiency for many applications.
  • Reference: In Proceedings of the 1988 IEEE Systems, Man, and Cybernetics Conference, pages 681–684, 1988.
  • BibTeX:
    @inproceedings{martinez_88b,
    author = {Martinez, Tony R.},
    title = {Digital Neural Networks},
    booktitle = {Proceedings of the 1988 {IEEE} Systems, Man, and Cybernetics Conference},
    pages = {681--684},
    year = {1988},
    }
  • Download the file: ps

Models of Parallel Adaptive Logic

  • Authors: Tony R. Martinez
  • Abstract: This paper overviews a proposed architecture for adaptive parallel logic referred to as ASOCS (Adaptive Self-Organizing Concurrent System). The ASOCS approach is based on an adaptive network composed of many simple computing elements which operate in a parallel asynchronous fashion. Problem specification is given to the system by presenting if-then rules in the form of boolean conjunctions. Rules are added incrementally and the system adapts to the changing rule-base. Adaptation and data processing form two separate phases of operation. During processing the system acts as a parallel hardware circuit. The adaptation process is distributed amongst the computing elements and efficiently exploits parallelism. Adaptation is done in a self-organizing fashion and takes place in time linear with the depth of the network. This paper summarizes the overall ASOCS concept and overviews three specific architectures.
  • Reference: In Proceedings of the 1987 IEEE Systems, Man, and Cybernetics Conference, pages 290–296, 1987.
  • BibTeX:
    @inproceedings{martinez_87a,
    author = {Martinez, Tony R.},
    title = {Models of Parallel Adaptive Logic},
    booktitle = {Proceedings of the 1987 {IEEE} Systems, Man, and Cybernetics Conference},
    pages = {290--296},
    year = {1987},
    }
  • Download the file: ps

Adaptive Self-Organizing Logic Networks

  • Authors: Tony R. Martinez
  • Abstract: Along with the development of contemporary computer science the limitations of sequential “von Neumann” machines have become more apparent. It is now becoming clear that to handle projected needs in speed and throughput, massively parallel architectures will be needed. In this dissertation we propose a special purpose architectural model that satisfies a general class of propositional logic problems in a totally distributed and concurrent fashion. The architectural model is identified as ASOCS (Adaptive Self-Organizing Concurrent System). Problem specification is incremental and takes the form of if-then rules (instances) expressed as Boolean conjunctions. Possible applications include symbolic decision systems, propositional production systems, digital pattern recognition and real-time process control. The approach is based on an adaptive network composed of many simple computing elements (nodes) which operate in a combinational and asynchronous fashion. Control and processing in the network is distributed amongst the network nodes. Adaptation and data processing form two separate phases of operation. During processing, the network acts as a parallel network of Boolean gates. Inputs and outputs of the network are also Boolean. During adaptation the network structure and the node functions can change to update the overall network function as specified. As new rules are added to the rule base, the network independently reconfigures to a logic circuit that remains both minimal and consistent with the rule base. Thus, there is no explicit programming. Desired network response is simply presented to the system, following which the network adjusts itself accordingly. Although the functionality of the network can be observed from the outside, the internal network structure is unknown. The control of the adaptive process is almost completely distributed and efficiently exploits parallelism. Most communication takes place between neighboring nodes with only minimal need for centralized processing. The network modification is performed with considerable concurrency and the adaptation time grows only linearly with the depth of the network.
  • Reference: UCLA Technical Report CSD 860093, June 1986. Ph.D. dissertation.
  • BibTeX:
    @techreport{martinez_diss,
    author = {Martinez, Tony R.},
    title = {Adaptive Self-Organizing Logic Networks},
    institution = {{UCLA} Technical Report - {CSD} 860093},
    month = {June},
    year = {1986},
    note = {Ph.D. Dissertation},
    }

Artificial Neural Network Reduction through Oracle Learning

  • Authors: Joshua Menke and Tony R. Martinez
  • Reference: Intelligent Data Analysis, volume 13 (1), pages 135–149, 2009.
  • BibTeX:
    @article{Menke.OracleJournal,
    author = {Menke, Joshua and Martinez, Tony R.},
    title = {Artificial Neural Network Reduction through Oracle Learning},
    journal = {Intelligent Data Analysis},
    volume = {13},
    number = {1},
    pages = {135--149},
    year = {2009},
    }
  • Download the file: pdf

Improving Supervised Learning by Adapting the Problem to the Learner

  • Authors: Joshua Menke and Tony R. Martinez
  • Reference: International Journal of Neural Systems, volume 19 (1), pages 1–9, 2009.
  • BibTeX:
    @inproceedings{menke_2009_adapting,
    author = {Menke, Joshua and Martinez, Tony R.},
    title = {Improving Supervised Learning by Adapting the Problem to the Learner},
    booktitle = {International Journal of Neural Systems},
    volume = {19},
    number = {1},
    pages = {1--9},
    year = {2009},
    }
  • Download the file: pdf

A Bradley-Terry Artificial Neural Network Model for Individual Ratings in Group Competitions

  • Authors: Joshua Menke and Tony R. Martinez
  • Reference: Journal of Neural Computing and Applications, volume 17 (2), pages 175–186, 2008.
  • BibTeX:
    @article{Menke.BTJournal,
    author = {Menke, Joshua and Martinez, Tony R.},
    title = {A {B}radley-{T}erry Artificial Neural Network Model for Individual Ratings in Group Competitions},
    journal = {Journal of Neural Computing and Applications},
    volume = {17},
    number = {2},
    pages = {175--186},
    year = {2008},
    }
  • Download the file: pdf

Domain Expert Approximation Through Oracle Learning

  • Authors: Joshua Menke and Tony R. Martinez
  • Reference: In Proceedings of the 13th European Symposium on Artificial Neural Networks (ESANN 2005), pages 205–210, 2005.
  • BibTeX:
    @inproceedings{menke_2005_domain,
    author = {Menke, Joshua and Martinez, Tony R.},
    title = {Domain Expert Approximation Through Oracle Learning},
    booktitle = {Proceedings of the 13th European Symposium on Artificial Neural Networks ({ESANN} 2005)},
    pages = {205--210},
    year = {2005},
    }
  • Download the file: pdf

Using Permutations Instead of Student’s t Distribution for p-values in Paired-Difference Algorithm Comparisons.

  • Authors: Joshua Menke and Tony R. Martinez
  • Reference: In Proceedings of the 2004 IEEE International Joint Conference on Neural Networks IJCNN’04, 2004.
  • BibTeX:
    @inproceedings{menke_2004_permutations,
    author = {Menke, Joshua and Martinez, Tony R.},
    title = {Using Permutations Instead of Student's t Distribution for p-values in Paired-Difference Algorithm Comparisons.},
    booktitle = {Proceedings of the 2004 IEEE Joint Conference on Neural Networks {IJCNN}'04},
    year = {2004},
    }
  • Download the file: pdf
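  • Illustrative sketch: no abstract is listed for this entry, but the paired-difference permutation test the title refers to is standard: randomly flip the signs of the per-fold (or per-dataset) differences and compare the observed mean difference against the resulting distribution. The code below is that generic test, not necessarily the paper's exact procedure.
    import random

    def paired_permutation_pvalue(scores_a, scores_b, n_permutations=10000, seed=0):
        """Two-sided sign-flip permutation test for paired score differences."""
        rng = random.Random(seed)
        diffs = [a - b for a, b in zip(scores_a, scores_b)]
        observed = abs(sum(diffs) / len(diffs))
        hits = 0
        for _ in range(n_permutations):
            flipped = [d if rng.random() < 0.5 else -d for d in diffs]
            if abs(sum(flipped) / len(flipped)) >= observed:
                hits += 1
        return hits / n_permutations

    # Example: accuracies of two learners over ten cross-validation folds.
    a = [0.91, 0.88, 0.93, 0.90, 0.89, 0.92, 0.94, 0.87, 0.90, 0.91]
    b = [0.89, 0.86, 0.91, 0.90, 0.88, 0.90, 0.93, 0.85, 0.88, 0.90]
    print(paired_permutation_pvalue(a, b))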

Simplifying OCR Neural Networks Through Oracle Learning

  • Authors: Joshua Menke and Tony R. Martinez
  • Reference: In Proceedings of the 2003 IEEE International Workshop on Soft Computing Techniques in Instrumentation, Measurement, and Related Applications, 2003. IEEE Press.
  • BibTeX:
    @inproceedings{menke_2003a,
    author = {Menke, Joshua and Martinez, Tony R.},
    title = {Simplifying {OCR} Neural Networks Through Oracle Learning},
    booktitle = {Proceedings of the 2003 {IEEE} International Workshop on Soft Computing Techniques in Instrumentation, Measurement, and Related Applications},
    publisher = {IEEE Press},
    year = {2003},
    location = {Provo, UT, U.S.A.},
    }
  • Download the file: pdf

Neural Network Simplification Through Oracle Learning

  • Authors: Joshua Menke
  • Reference: Master’s thesis, Brigham Young University, November 2002.
  • BibTeX:
    @mastersthesis{menke_2002a,
    author = {Menke, Joshua},
    title = {Neural Network Simplification Through Oracle Learning},
    school = {Brigham Young University},
    month = {November},
    year = {2002},
    }
  • Download the file: pdf

Neural Network Simplification Through Oracle Learning

  • Authors: Joshua Menke and Adam H. Peterson and Michael E. Rimer and Tony R. Martinez
  • Abstract: Often the best artificial neural network to solve a real world problem is relatively complex. However, with the growing popularity of smaller computing devices (handheld computers, cellular telephones, automobile interfaces, etc.), there is a need for simpler models with comparable accuracy. The following research presents evidence that using a larger model as an oracle to train a smaller model on unlabeled data results in 1) a simpler acceptable model and 2) improved results over standard training methods on a similarly sized smaller model. On automated spoken digit recognition, oracle learning resulted in an artificial neural network of half the size that 1) maintained comparable accuracy to the larger neural network, and 2) obtained up to a 25% decrease in error over standard training methods.
  • Reference: In Proceedings of the IEEE International Joint Conference on Neural Networks IJCNN’02, pages 2482–2497, 2002. IEEE Press.
  • BibTeX:
    @inproceedings{menke_2002b,
    author = {Menke, Joshua and Peterson, Adam H. and Rimer, Michael E. and Martinez, Tony R.},
    title = {Neural Network Simplification Through Oracle Learning},
    booktitle = {Proceedings of the {IEEE} International Joint Conference on Neural Networks {IJCNN}'02 },
    pages = {2482--2497},
    publisher = {IEEE Press},
    year = {2002},
    location = {Honolulu, Hawaii, U.S.A.},
    }
  • Download the file: pdf
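  • Illustrative sketch: the core procedure described in the abstract, a large trained network labeling unlabeled data so that a smaller network can be trained on those labels, can be outlined as below. The scikit-learn MLPs and the synthetic data are assumptions for illustration only.
    import numpy as np
    from sklearn.neural_network import MLPClassifier

    rng = np.random.default_rng(0)
    X_labeled = rng.normal(size=(200, 10))
    y_labeled = (X_labeled[:, 0] + X_labeled[:, 1] > 0).astype(int)
    X_unlabeled = rng.normal(size=(2000, 10))        # no targets available

    # Train the large "oracle" network on the labeled data.
    oracle = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=1000, random_state=0)
    oracle.fit(X_labeled, y_labeled)

    # The oracle labels the unlabeled pool; the small network is trained to mimic it.
    student = MLPClassifier(hidden_layer_sizes=(8,), max_iter=1000, random_state=0)
    student.fit(X_unlabeled, oracle.predict(X_unlabeled))

    print(student.score(X_labeled, y_labeled))       # sanity check on the labeled data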

Computational Modeling of Emotional Content in Music

  • Authors: Kristine Monteith and Tony R. Martinez and Dan Ventura
  • Reference: In Proceedings of International Conference on Cognitive Science, pages x–x, 2010.
  • BibTeX:
    @inproceedings{Kristine.Cogsci10,
    author = {Monteith, Kristine and Martinez, Tony R. and Ventura, Dan},
    title = {Computational Modeling of Emotional Content in Music},
    booktitle = {Proceedings of International Conference on Cognitive Science},
    pages = {x--x},
    year = {2010},
    }
  • Download the file: pdf

Automatic Generation of Music for Inducing Emotive Response

  • Authors: Kristine Monteith and Tony R. Martinez and Dan Ventura
  • Reference: In Proceedings of the International Conference on Computational Creativity, ICCC-X, pages 140–149, 2010.
  • BibTeX:
    @inproceedings{Kristine.ICCC10,
    author = {Monteith, Kristine and Martinez, Tony R. and Ventura, Dan},
    title = {Automatic Generation of Music for Inducing Emotive Response},
    booktitle = {Proceedings of International Conference on Computational Creativity, ICCC-X},
    pages = {140--149},
    year = {2010},
    }
  • Download the file: pdf

Using Multiple Measures to Predict Confidence in Instance Classification

  • Authors: Kristine Monteith and Tony R. Martinez
  • Reference: To appear in Proceedings of IJCNN 2010, pages x–x, 2010.
  • BibTeX:
    @inproceedings{Kristine.IJCNN10,
    author = {Monteith, Kristine and Martinez, Tony R.},
    title = {Using Multiple Measures to Predict Confidence in Instance Classification},
    booktitle = {To appear in Proceedings of IJCNN 2010},
    pages = {x--x},
    year = {2010},
    }
  • Download the file: pdf

Weighted Instance Typicality Search (WITS): A Nearest Neighbor Data Reduction Algorithm

  • Authors: Brent D. Morring and Tony R. Martinez
  • Abstract: Two disadvantages of the standard nearest neighbor algorithm are 1) it must store all the instances of the training set, thus creating a large memory footprint and 2) it must search all the instances of the training set to predict the classification of a new query point, thus it is slow at run time. Much work has been done to remedy these shortcomings. This paper presents a new algorithm WITS (Weighted-Instance Typicality Search) and a modified version, Clustered-WITS (C-WITS), designed to address these issues. Data reduction algorithms address both issues by storing and using only a portion of the available instances. WITS is an incremental data reduction algorithm with O(n^2) complexity, where n is the training set size. WITS uses the concept of Typicality in conjunction with Instance-Weighting to produce minimal nearest neighbor solutions. WITS and C-WITS are compared to three other state-of-the-art data reduction algorithms on ten real-world datasets. WITS achieved the highest average accuracy, showed fewer catastrophic failures, and stored an average of 71% fewer instances than DROP-5, the next most competitive algorithm in terms of accuracy and catastrophic failures. The C-WITS algorithm provides a user-defined parameter that gives the user control over the training-time vs. accuracy balance. This modification makes C-WITS more suitable for large problems, the very problems data reduction algorithms are designed for. On two large problems (10,992 and 20,000 instances), C-WITS stores only a small fraction of the instances (0.88% and 1.95% of the training data) while maintaining generalization accuracies comparable to the best accuracies reported for these problems.
  • Reference: Intelligent Data Analysis, volume 8 (1), pages 61–78, 2004.
  • BibTeX:
    @article{MorringIDA,
    author = {Morring, Brent D. and Martinez, Tony R.},
    title = {Weighted Instance Typicality Search ({WITS}): A Nearest Neighbor Data Reduction Algorithm},
    journal = {Intelligent Data Analysis},
    volume = {8},
    number = {1},
    pages = {61--78},
    year = {2004},
    }
  • Download the file: pdf
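
As a hedged illustration of the typicality notion mentioned in the abstract (WITS itself is an incremental algorithm that also learns instance weights), the sketch below scores each training instance by its average similarity to its own class relative to its average similarity to the other classes; atypical, borderline instances score low. The toy data and the simple similarity function are illustrative choices.

    # Illustrative "typicality" scores for training instances; not the WITS
    # algorithm itself, only the within-class vs. between-class idea behind it.
    import numpy as np

    def typicality(X, y):
        """Average similarity to same-class instances divided by average
        similarity to other-class instances (higher = more typical)."""
        d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
        sim = 1.0 / (1.0 + d)                      # similarity in (0, 1]
        scores = np.empty(len(X))
        for i in range(len(X)):
            same = (y == y[i])
            same[i] = False                        # exclude the instance itself
            scores[i] = sim[i, same].mean() / sim[i, y != y[i]].mean()
        return scores

    X = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.1], [0.5, 0.5]])
    y = np.array([0, 0, 1, 1, 0])
    print(typicality(X, y))                        # the point between the clusters scores lowest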

Improving the Separability of a Reservoir Facilitates Learning Transfer

  • Authors: David Norton and Dan Ventura
  • Abstract: We use a type of reservoir computing called the liquid state machine (LSM) to explore learning transfer. The Liquid State Machine (LSM) is a neural network model that uses a reservoir of recurrent spiking neurons as a filter for a readout function. We develop a method of training the reservoir, or liquid, that is not driven by residual error. Instead, the liquid is evaluated based on its ability to separate different classes of input into different spatial patterns of neural activity. Using this method, we train liquids on two qualitatively different types of artificial problems. Resulting liquids are shown to substantially improve performance on either problem regardless of which problem was used to train the liquid, thus demonstrating a significant level of learning transfer.
  • Reference: Proceedings of the International Joint Conference on Neural Networks, pages 2288–2293, 2009.
  • BibTeX:
    @inproceedings{norton.ijcnn09,
    author = {Norton, David and Ventura, Dan},
    title = {Improving the Separability of a Reservoir Facilitates Learning Transfer},
    booktitle = {Proceedings of the International Joint Conference on Neural Networks},
    pages = {2288--2293},
    year = {2009},
    }
  • Download the file: pdf
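
A hedged sketch of the kind of separation measure the liquid is evaluated on: given one liquid state vector per input (for example, per-neuron spike counts at readout time), score the liquid by how far apart it pushes the class centroids. The paper's measure also accounts for within-class spread; this only conveys the basic idea, and the tiny state vectors below are illustrative.

    # Simple separation score for liquid state vectors: mean distance
    # between class centroids (illustrative, not the paper's exact measure).
    import numpy as np

    def separation(states, labels):
        classes = np.unique(labels)
        centroids = np.array([states[labels == c].mean(axis=0) for c in classes])
        pairs = [(i, j) for i in range(len(classes)) for j in range(i + 1, len(classes))]
        return float(np.mean([np.linalg.norm(centroids[i] - centroids[j]) for i, j in pairs]))

    # one state vector per input example, e.g. per-neuron spike counts at readout time
    states = np.array([[5, 1, 0], [4, 2, 1], [0, 6, 5], [1, 5, 6]], dtype=float)
    labels = np.array([0, 0, 1, 1])
    print(separation(states, labels))              # larger = classes pushed further apart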

Preparing More Effective Liquid State Machines Using Hebbian Learning

  • Authors: David Norton and Dan Ventura
  • Abstract: In Liquid State Machines, separation is a critical attribute of the liquid, which is traditionally not trained. The effects of using Hebbian learning in the liquid to improve separation are investigated in this paper. When presented with random input, Hebbian learning does not dramatically change separation. However, Hebbian learning does improve separation when presented with real-world speech data.
  • Reference: In Proceedings of the IEEE International Joint Conference on Neural Networks IJCNN’06, pages 8359–8364, 2006.
  • BibTeX:
    @inproceedings{norton.ijcnn06,
    author = {Norton, David and Ventura, Dan},
    title = {Preparing More Effective Liquid State Machines Using {H}ebbian Learning},
    booktitle = {Proceedings of the {IEEE} International Joint Conference on Neural Networks {IJCNN}'06},
    pages = {8359--8364},
    year = {2006},
    }
  • Download the file: pdf
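
For reference, the core Hebbian rule is shown below in its classic rate-based form; the paper applies Hebbian learning inside a spiking liquid, which is more involved. The rule simply strengthens weights between co-active neurons, with a clip to keep them bounded. Variable names, the learning rate, and the clipping range are illustrative.

    # Minimal rate-based Hebbian update ("fire together, wire together"),
    # shown only to illustrate the rule used to adapt the liquid's weights.
    import numpy as np

    def hebbian_step(W, pre, post, eta=0.01, w_max=1.0):
        """W[i, j] is the weight from pre-neuron i to post-neuron j."""
        W = W + eta * np.outer(pre, post)          # strengthen co-active pairs
        return np.clip(W, -w_max, w_max)           # keep weights bounded

    rng = np.random.default_rng(1)
    W = rng.normal(scale=0.1, size=(50, 50))
    pre_activity = rng.random(50)
    post_activity = rng.random(50)
    W = hebbian_step(W, pre_activity, post_activity)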

Reducing Decision Tree Ensemble Size using Parallel Decision DAGs

  • Authors: Adam H. Peterson and Tony R. Martinez
  • Reference: International Journal of Artificial Intelligence Tools, volume 18 (4), pages 613–620, 2009.
  • BibTeX:
    @article{peterson.IJAIT2009,
    author = {Peterson, Adam H. and Martinez, Tony R.},
    title = {Reducing Decision Tree Ensemble Size using Parallel Decision DAGs},
    journal = {International Journal of Artificial Intelligence Tools},
    volume = {18},
    number = {4},
    pages = {613--620},
    year = {2009},
    }
  • Download the file: pdf

COD: Measuring the Similarity of Classifiers

  • Authors: Adam H. Peterson
  • Abstract: In the practice of machine learning, one must select a learning algorithm to employ for a problem. Common questions are: Are some algorithms basically the same, or are they fundamentally different? How different? Does an algorithm that solves a particular problem well exist? What algorithms should be tried for a particular problem? Could performance be improved by combining more than one solution? This research presents the COD (Classifier Output Difference) distance metric for finding similarity between classifiers and classifier families. This metric is a tool which begins to address such questions, in both theoretical and practical aspects. It goes beyond simple accuracy comparisons and provides insights about fundamental differences between learning algorithms and the effectiveness of algorithm variations, fills a niche in meta-learning, may be used to improve the effectiveness of the construction of ensemble classifiers, and can give guidance in research towards hybrid systems. This paper describes how COD works and provides examples showing its utility in providing research insights. Results from this research show that there are clearly measurable differences in the behaviors of hypotheses produced by different learning algorithms, as well as clearly measurable differences between learning paradigms.
  • Reference: Master’s thesis, Brigham Young University, January 2005.
  • BibTeX:
    @mastersthesis{peterson.thesis2004,
    author = {Peterson, Adam H.},
    title = {{COD}: Measuring the Similarity of Classifiers},
    school = {Brigham Young University},
    month = {January},
    year = {2005},
    }
  • Download the file: pdf, ps
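
The COD idea lends itself to a very small sketch: the distance between two trained classifiers is the fraction of instances on which their predictions disagree, in practice measured on held-out data and averaged over many data sets. The two scikit-learn models and the iris data below are illustrative.

    # Sketch of the COD (Classifier Output Difference) metric between two classifiers.
    import numpy as np
    from sklearn.datasets import load_iris
    from sklearn.naive_bayes import GaussianNB
    from sklearn.tree import DecisionTreeClassifier

    def cod(preds_a, preds_b):
        """Fraction of instances on which the two classifiers disagree."""
        return float(np.mean(np.asarray(preds_a) != np.asarray(preds_b)))

    X, y = load_iris(return_X_y=True)
    a = DecisionTreeClassifier(random_state=0).fit(X, y)
    b = GaussianNB().fit(X, y)
    # 0.0 = identical behavior, 1.0 = always different; computed here on the
    # training set for brevity, in practice on held-out data
    print(cod(a.predict(X), b.predict(X)))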

Estimating The Potential For Combining Learning Models

  • Authors: Adam H. Peterson and Tony R. Martinez
  • Reference: In Proceedings of the ICML Workshop on Meta-Learning, pages 68–75, 2005.
  • BibTeX:
    @inproceedings{peterson.icmlws05,
    author = {Peterson, Adam H. and Martinez, Tony R.},
    title = {Estimating The Potential For Combining Learning Models},
    booktitle = {Proceedings of the ICML Workshop on Meta-Learning},
    pages = {68--75},
    year = {2005},
    }
  • Download the file: pdf

Real-time Automatic Price Prediction for eBay Online Trading

  • Authors: Ilya Raykhel and Dan Ventura
  • Abstract: We develop a system for attribute-based prediction of final (online) auction pricing, focusing on the eBay laptop category. The system implements a feature-weighted k-NN algorithm, using evolutionary computation to determine feature weights, with prior trades used as training data. The resulting average prediction error is 16%. Mostly automatic trading using the system greatly reduces the time a reseller needs to spend on trading activities, since the bulk of market research is now done automatically with the help of the learned model. The result is a 562% increase in trading efficiency (measured as profit/hour).
  • Reference: In Proceedings of the Innovative Applications of Artificial Intelligence Conference, pages 135–140, July 2009.
  • BibTeX:
    @inproceedings{raykhel.iaai09,
    author = {Raykhel, Ilya and Ventura, Dan},
    title = {Real-time Automatic Price Prediction for eBay Online Trading},
    booktitle = {Proceedings of the Innovative Applications of Artificial Intelligence Conference},
    pages = {135--140},
    month = {July},
    year = {2009},
    }
  • Download the file: pdf
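
A hedged sketch of the attribute-weighted k-NN idea described in the abstract, not the authors' system: distances are weighted per feature, the predicted price is the mean price of the k nearest prior trades, and the weights are tuned by a toy (1+1) evolutionary loop on validation error. The synthetic "laptop" data, the value of k, and the simple search stand in for the evolved weighting used in the paper.

    # Feature-weighted k-NN price prediction with weights tuned by a toy (1+1)-ES.
    import numpy as np

    def knn_predict(X_train, y_train, x, weights, k=5):
        d = np.sqrt(((X_train - x) ** 2 * weights).sum(axis=1))
        return y_train[np.argsort(d)[:k]].mean()

    def validation_error(weights, X_tr, y_tr, X_val, y_val, k=5):
        preds = np.array([knn_predict(X_tr, y_tr, x, weights, k) for x in X_val])
        return np.mean(np.abs(preds - y_val) / y_val)        # mean relative error

    rng = np.random.default_rng(0)
    X = rng.random((300, 6))                                 # toy laptop attributes
    y = 300 + 900 * X[:, 0] + 400 * X[:, 1] + 30 * rng.normal(size=300)  # toy prices
    X_tr, y_tr, X_val, y_val = X[:200], y[:200], X[200:], y[200:]

    weights = np.ones(X.shape[1])                            # (1+1)-ES over feature weights
    best = validation_error(weights, X_tr, y_tr, X_val, y_val)
    for _ in range(200):
        cand = np.clip(weights + rng.normal(scale=0.2, size=weights.shape), 0, None)
        err = validation_error(cand, X_tr, y_tr, X_val, y_val)
        if err < best:
            weights, best = cand, err
    print(weights.round(2), round(best, 3))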

Choosing a Starting Configuration for Particle Swarm Optimization

  • Authors: Mark Richards and Dan Ventura
  • Abstract: The performance of Particle Swarm Optimization can be improved by strategically selecting the starting positions of the particles. This work suggests the use of generators from centroidal Voronoi tessellations as the starting points for the swarm. The performance of swarms initialized with this method is compared with the standard PSO algorithm on several standard test functions. Results suggest that CVT initialization improves PSO performance in high-dimensional spaces.
  • Reference: Proceedings of the Joint Conference on Neural Networks, pages 2309–2312, July 2004.
  • BibTeX:
    @inproceedings{richards.ijcnn04,
    author = {Richards, Mark and Ventura, Dan},
    title = {Choosing a Starting Configuration for Particle Swarm Optimization},
    booktitle = {Proceedings of the Joint Conference on Neural Networks},
    pages = {2309--2312},
    month = {July},
    year = {2004},
    }
  • Download the file: pdf
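
A hedged sketch of CVT-style initialization: the generators of a centroidal Voronoi tessellation of the search box are approximated by running k-means (Lloyd's algorithm) over many uniform samples, and the resulting centroids become the particles' starting positions. The bounds, the sample count, and the use of scikit-learn's KMeans are illustrative choices, not the paper's implementation.

    # Approximate CVT generators with k-means and use them as PSO starting positions.
    import numpy as np
    from sklearn.cluster import KMeans

    def cvt_initial_positions(n_particles, bounds, n_samples=20000, seed=0):
        """bounds: sequence of (low, high) pairs, one per dimension."""
        rng = np.random.default_rng(seed)
        bounds = np.asarray(bounds, dtype=float)
        samples = rng.uniform(bounds[:, 0], bounds[:, 1], size=(n_samples, len(bounds)))
        km = KMeans(n_clusters=n_particles, n_init=1, random_state=seed).fit(samples)
        return km.cluster_centers_                  # well-spread starting positions

    bounds = [(-5.12, 5.12)] * 10                   # e.g. a 10-D Rastrigin-style box
    positions = cvt_initial_positions(n_particles=30, bounds=bounds)
    print(positions.shape)                          # (30, 10)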

Dynamic Sociometry in Particle Swarm Optimization

  • Authors: Mark Richards and Dan Ventura
  • Abstract: The performance of Particle Swarm Optimization is greatly affected by the size and sociometry of the swarm. This research proposes a dynamic sociometry, which is shown to be more effective on some problems than the standard star and ring sociometries. The performance of various combinations of swarm size and sociometry on six different test functions is qualitatively analyzed.
  • Reference: Proceedings of the Joint Conference on Information Sciences, pages 1557–1560, September 2003.
  • BibTeX:
    @inproceedings{richards.jcis03,
    author = {Richards, Mark and Ventura, Dan},
    title = {Dynamic Sociometry in Particle Swarm Optimization},
    booktitle = {Proceedings of the Joint Conference on Information Sciences},
    pages = {1557--1560},
    month = {September},
    year = {2003},
    }
  • Download the file: pdf

Classification-based Objective Functions

  • Authors: Michael E. Rimer and Tony R. Martinez
  • Reference: Machine Learning, March 2006.
  • BibTeX:
    @article{rimer.ml2006,
    author = {Rimer, Michael E. and Martinez, Tony R.},
    title = {Classification-based Objective Functions},
    journal = {Machine Learning},
    month = {March},
    year = {2006},
    }
  • Download the file: pdf

CB3: An Adaptive Error Function for Backpropagation Training

  • Authors: Michael E. Rimer and Tony R. Martinez
  • Reference: Neural Processing Letters, volume 24 (1), pages 81–92, 2006.
  • BibTeX:
    @article{rimer.npl2006,
    author = {Rimer, Michael E. and Martinez, Tony R.},
    title = {{CB3}: An Adaptive Error Function for Backpropagation Training},
    journal = {Neural Processing Letters},
    volume = {24},
    number = {1},
    pages = {81--92},
    year = {2006},
    }
  • Download the file: pdf

Softprop: Softmax Neural Network Backpropagation Learning

  • Authors: Michael E. Rimer and Tony R. Martinez
  • Reference: In Proceedings of the IEEE International Joint Conference on Neural Networks IJCNN’04, pages 979–984, 2004.
  • BibTeX:
    @inproceedings{rimer.ijcnn2004.softprop,
    author = {Rimer, Michael E. and Martinez, Tony R.},
    title = {Softprop: Softmax Neural Network Backpropagation Learning},
    booktitle = {Proceedings of the IEEE International Joint Conference on Neural Networks {IJCNN}'04},
    pages = {979--984},
    year = {2004},
    }
  • Download the file: pdf

Lazy Training: Interactive Classification Learning

  • Authors: Michael E. Rimer
  • Reference: Master’s thesis, Brigham Young University, April 2002.
  • BibTeX:
    @mastersthesis{rimer.thesis2002,
    author = {Rimer, Michael E.},
    title = {Lazy Training: Interactive Classification Learning},
    school = {Brigham Young University},
    month = {April},
    year = {2002},
    }
  • Download the file: ps, pdf

Improving Speech Recognition Learning through Lazy Training

  • Authors: Michael E. Rimer and Tony R. Martinez and D. Randall Wilson
  • Abstract: Backpropagation, like most high-order learning algorithms, is prone to overfitting. We present a novel approach, called lazy training, for reducing the overfit in multiple-output networks. Lazy training has been shown to reduce the error of optimized neural networks more than half on a large OCR data set and on several problems from the UCI machine learning database. Here, lazy training is shown to be effective in a multi-layered adaptive learning system, reducing the error of an optimized backpropagation network in a speech recognition system by 55.0% on the TIDIGITS corpus.
  • Reference: In Proceedings of the IEEE International Joint Conference on Neural Networks IJCNN’02, pages 2568–2573, 2002.
  • BibTeX:
    @inproceedings{rimer.ijcnn02,
    author = {Rimer, Michael E. and Martinez, Tony R. and Wilson, D. Randall},
    title = {Improving Speech Recognition Learning through Lazy Training},
    booktitle = {Proceedings of the {IEEE} International Joint Conference on Neural Networks {IJCNN}'02},
    pages = {2568--2573},
    year = {2002},
    }
  • Download the file: ps

Improving Backpropagation Ensembles through Lazy Training

  • Authors: Michael E. Rimer and Tim L. Andersen and Tony R. Martinez
  • Abstract: Backpropagation, similar to most high-order learning algorithms, is prone to overfitting. We address this issue by introducing interactive training (IT), a logical extension to backpropagation training that employs interaction among multiple networks. This method is based on the theory that centralized control is more effective for learning in deep problem spaces in a multi-agent paradigm. IT methods allow networks to work together to form more complex systems while not restraining their individual ability to specialize. Lazy training, an implementation of IT that minimizes misclassification error, is presented. Lazy training discourages overfitting and is conducive to higher accuracy in multiclass problems than standard backpropagation. Experiments on a large, real world OCR data set have shown interactive training to significantly increase generalization accuracy, from 97.86% to 99.11%. These results are supported by theoretical and conceptual extensions from algorithmic to interactive training models.
  • Reference: In Proceedings of the IEEE International Joint Conference on Neural Networks IJCNN’01, pages 2007–2012, 2001.
  • BibTeX:
    @inproceedings{rimer.ijcnn01a,
    author = {Rimer, Michael E. and Andersen, Tim L. and Martinez, Tony R.},
    title = {Improving Backpropagation Ensembles through Lazy Training},
    booktitle = {Proceedings of the {IEEE} International Joint Conference on Neural Networks {IJCNN}'01},
    pages = {2007--2012},
    year = {2001},
    }
  • Download the file: ps

Speed Training: Improving Learning Speed for Large Data Sets

  • Authors: Michael E. Rimer and Tim L. Andersen and Tony R. Martinez
  • Abstract: Artificial neural networks provide an effective empirical predictive model for pattern classification. However, using complex neural networks to learn very large training sets is often problematic, imposing prohibitive time constraints on the training process. We present four practical methods for dramatically decreasing training time through dynamic stochastic sample presentation, a technique we call speed training. These methods are shown to be robust to retaining generalization accuracy over a diverse collection of real world data sets. In particular, the SET technique achieves a training speedup of 4278% on a large OCR database with no detectable loss in generalization.
  • Reference: In Proceedings of the IEEE International Joint Conference on Neural Networks IJCNN’01, pages 2662–2666, 2001.
  • BibTeX:
    @inproceedings{rimer.ijcnn01b,
    author = {Rimer, Michael E. and Andersen, Tim L. and Martinez, Tony R.},
    title = {Speed Training: Improving Learning Speed for Large Data Sets},
    booktitle = {Proceedings of the {IEEE} International Joint Conference on Neural Networks {IJCNN}'01},
    pages = {2662--2666},
    year = {2001},
    }
  • Download the file: ps

A Transformation Strategy for Implementing Distributed Multilayer Feedforward Networks: Backpropagation Transformation

  • Authors: George L. Rudolph and Tony R. Martinez
  • Abstract: Most Artificial Neural Networks (ANNs) have a fixed topology during learning, and often suffer from a number of short-comings as a result. Variations of ANNs that use dynamic topologies have shown ability to overcome many of these problems. This paper introduces Location-Independent Transformations (LITs) as a general strategy for implementing distributed feedforward networks that use dynamic topologies (dynamic ANNs) efficiently in parallel hardware. A LIT creates a set of location-independent nodes, where each node computes its part of the network output independent of other nodes, using local information. This type of transformation allows efficient support for adding and deleting nodes dynamically during learning. In particular, this paper presents a LIT that supports both the standard (static) multilayer backpropagation network, and backpropagation with dynamic extensions. The complexity of both learning and execution algorithms is O(q(N + log M)) for a single pattern, where q is the number of weight layers in the original network, N is the number of nodes in the widest node layer in the original network, and M is the number of nodes in the transformed network (which is linear in the number of hidden nodes in the original network). This paper extends previous work with 2-weight-layer backpropagation networks.
  • Reference: Future Generation Computer Systems, volume 6, pages 547–564, 1997.
  • BibTeX:
    @article{rudolph_97,
    author = {Rudolph, George L. and Martinez, Tony R.},
    title = {A Transformation Strategy for Implementing Distributed Multilayer Feedforward Networks: Backpropagation Transformation},
    journal = {Future Generation Computer Systems},
    volume = {6},
    pages = {547--564},
    year = {1997},
    }
  • Download the file: ps

LIA: A Location-Independent Transformation for ASOCS Adaptive Algorithm 2

  • Authors: George L. Rudolph and Tony R. Martinez
  • Abstract: Most Artificial Neural Networks (ANNs) have a fixed topology during learning, and often suffer from a number of short-comings as a result. ANNs that use dynamic topologies have shown ability to overcome many of these problems. Adaptive Self Organizing Concurrent Systems (ASOCS) are a class of learning models with inherently dynamic topologies. This paper introduces Location-Independent Transformations (LITs) as a general strategy for implementing learning models that use dynamic topologies efficiently in parallel hardware. A LIT creates a set of location-independent nodes, where each node computes its part of the network output independent of other nodes, using local information. This type of transformation allows efficient support for adding and deleting nodes dynamically during learning. In particular, this paper presents the Location-Independent ASOCS (LIA) model as a LIT for ASOCS Adaptive Algorithm 2. The description of LIA gives formal definitions for LIA algorithms. Because LIA implements basic ASOCS mechanisms, these definitions provide a formal description of basic ASOCS mechanisms in general, in addition to LIA.
  • Reference: International Journal of Neural Systems, 1996.
  • BibTeX:
    @article{rudolph.lia96,
    author = {Rudolph, George L. and Martinez, Tony R.},
    title = {{LIA}: A Location-Independent Transformation for {ASOCS} Adaptive Algorithm 2},
    journal = {International Journal of Neural Systems},
    year = {1996},
    }
  • Download the file: ps

Location-Independent Neural Network Models

  • Authors: George L. Rudolph
  • Abstract: Neural networks that use a static topology, i.e. a topology that remains fixed throughout learning, suffer from a number of short-comings. Current research is demonstrating the use of dynamic topologies in overcoming some of these problems. The Location-Independent Transformation (LIT) is a general strategy for implementing in hardware neural networks with static and dynamic topologies. LIT maps an arbitrary neural network onto a uniform tree topology which uses global broadcast and gather operations for communication. This dissertation formally defined LITs with associated execution and learning algorithms for static and dynamic versions of competitive learning, counterpropagation, 2-layer backpropagation, and multilayer feedforward networks. An LIT for ASOCS AA2 model (which is only dynamic) is also described. This is a representative set of neural network models, not an exhaustive list, providing potential guidelines for implementing other models of interest.
  • Reference: PhD thesis, Brigham Young University, Computer Science Department, August 1995.
  • BibTeX:
    @phdthesis{rudolph_diss,
    author = {Rudolph, George L.},
    title = {Location-Independent Neural Network Models},
    school = {Brigham Young University},
    address = {Computer Science Department},
    month = {August},
    year = {1995},
    }
  • Download the file: ps

An Efficient Transformation for Implementing Two-layer Feedforward Neural Networks

  • Authors: George L. Rudolph and Tony R. Martinez
  • Abstract: Most Artificial Neural Networks (ANNs) have a fixed topology during learning, and often suffer from a number of short-comings as a result. Variations of ANNs that use dynamic topologies have shown ability to overcome many of these problems. This paper introduces Location-Independent Transformations (LITs) as a general strategy for implementing distributed feedforward networks that use dynamic topologies (dynamic ANNs) efficiently in parallel hardware. A LIT creates a set of location-independent nodes, where each node computes its part of the network output independent of other nodes, using local information. This type of transformation allows efficient support for adding and deleting nodes dynamically during learning. In particular, this paper presents an LIT for dynamic Backpropagation networks with a single hidden layer. The complexity of both learning and execution algorithms is O(n+p+logm) for a single pattern, where n is the number of inputs, p is the number of outputs, and m is the number of hidden nodes in the original network.
  • Reference: Journal of Artificial Neural Networks, volume 3, pages 263–282, 1995.
  • BibTeX:
    @article{rudolph_95a,
    author = {Rudolph, George L. and Martinez, Tony R.},
    title = {An Efficient Transformation for Implementing Two-layer Feedforward Neural Networks},
    journal = {Journal of Artificial Neural Networks},
    volume = {3},
    pages = {263--282},
    year = {1995},
    }
  • Download the file: ps

A Transformation for Implementing Localist Neural Networks

  • Authors: George L. Rudolph and Tony R. Martinez
  • Abstract: Most Artificial Neural Networks (ANNs) have a fixed topology during learning, and typically suffer from a number of short-comings as a result. Variations of ANNs that use dynamic topologies have shown ability to overcome many of these problems. This paper introduces Location-Independent Transformations (LITs) as a general strategy for parallel implementation of feedforward networks that use dynamic topologies. A LIT creates a set of location-independent nodes, where each node computes its part of the network output independent of other nodes, using local information. This type of transformation allows efficient support for adding and deleting nodes dynamically during learning. This paper deals specifically with LITs for localist ANNs, localist in the sense that ultimately one node is responsible for each output. In particular, this paper presents LITs for two ANNs: a) the single-layer competitive learning network, and b) the counterpropagation network, which combines elements of supervised learning with competitive learning. The complexity of both learning and execution algorithms for both ANNs is linear in the number of inputs and logarithmic in the number of nodes in the original network.
  • Reference: Neural Parallel and Scientific Computations, volume 2, pages 173–188, 1995.
  • BibTeX:
    @article{rudolph_95b,
    author = {Rudolph, George L. and Martinez, Tony R.},
    title = {A Transformation for Implementing Localist Neural Networks},
    journal = {Neural Parallel and Scientific Computations},
    volume = {2},
    pages = {173--188},
    year = {1995},
    }
  • Download the file: ps

A Transformation for Implementing Efficient Dynamic Backpropagation Neural Networks

  • Authors: George L. Rudolph and Tony R. Martinez
  • Abstract: Most Artificial Neural Networks (ANNs) have a fixed topology during learning, and often suffer from a number of short-comings as a result. Variations of ANNs that use dynamic topologies have shown ability to overcome many of these problems. This paper introduces Location-Independent Transformations (LITs) as a general strategy for implementing distributed feed-forward networks that use dynamic topologies (dynamic ANNs) efficiently in parallel hardware. A LIT creates a set of location-independent nodes, where each node computes its part of the network output independent of other nodes, using local information. This type of transformation allows efficient support for adding and deleting nodes dynamically during learning. In particular, this paper presents an LIT for standard Backpropagation with two layers of weights, and shows how dynamic extensions to Backpropagation can be supported.
  • Reference: In Proceedings of the International Conference on Artificial Neural Networks and Genetic Algorithms, pages 41–44, 1995.
  • BibTeX:
    @inproceedings{rudolph_95c,
    author = {Rudolph, George L. and Martinez, Tony R.},
    title = {A Transformation for Implementing Efficient Dynamic Backpropagation Neural Networks},
    booktitle = {Proceedings of the International Conference on Artificial Neural Networks and Genetic Algorithms},
    pages = {41--44},
    year = {1995},
    }
  • Download the file: ps

A Transformation for Implementing Neural Networks with Localist Properties

  • Authors: George L. Rudolph and Tony R. Martinez
  • Abstract: Most Artificial Neural Networks (ANNs) have a fixed topology during learning, and typically suffer from a number of short-comings as a result. Variations of ANNs that use dynamic topologies have shown ability to overcome many of these problems. This paper introduces Location-Independent Transformations (LITs) as a general strategy for implementing feedforward networks that use dynamic topologies. A LIT creates a set of location-independent nodes, where each node computes its part of the network output independent of other nodes, using local information. This type of transformation allows efficient support for adding and deleting nodes dynamically during learning. In particular, this paper presents LITs for the single-layer competitive learning network, and the counterpropagation network, which combines elements of supervised learning with competitive learning. These two networks are localist in the sense that ultimately one node is responsible for each output. LITs for other models are presented in other papers.
  • Reference: Yfantis, Evangelos A., editor, Intelligent Systems, volume 1, pages 637–645, Kluwer Academic Publishers, 1995.
  • BibTeX:
    @incollection{rudolph_95d,
    author = {Rudolph, George L. and Martinez, Tony R.},
    title = {A Transformation for Implementing Neural Networks with Localist Properties},
    editor = {Yfantis, Evangelos A.},
    booktitle = {Intelligent Systems},
    volume = {1},
    pages = {637--645},
    publisher = {Kluwer Academic Publishers},
    year = {1995},
    }
  • Download the file: ps

Location-Independent Transformations: A General Strategy for Implementing Neural Networks

  • Authors: George L. Rudolph and Tony R. Martinez
  • Abstract: Most Artificial Neural Networks (ANNs) have a fixed topology during learning, and typically suffer from a number of short-comings as a result. Variations of ANNs that use dynamic topologies have shown ability to overcome many of these problems. This paper introduces Location-Independent Transformations (LITs) as a general strategy for implementing neural networks that use static and dynamic topologies. A LIT creates a set of location-independent nodes, where each node computes its part of the network output independent of other nodes, using local information. This type of transformation allows efficient support for adding and deleting nodes dynamically during learning. Two simple networks, the single-layer competitive learning network, and the counterpropagation network, which combines elements of supervised learning with competitive learning, are used in this paper to illustrate the LIT strategy. These two networks are localist in the sense that ultimately one node is responsible for each output. LITs for other models are presented in other papers.
  • Reference: International Journal on Artificial Intelligence Tools, volume 3, pages 417–427, 1994.
  • BibTeX:
    @article{rudolph_94a,
    author = {Rudolph, George L. and Martinez, Tony R.},
    title = {Location-Independent Transformations: A General Strategy for Implementing Neural Networks},
    journal = {International Journal on Artificial Intelligence Tools},
    volume = {3},
    pages = {417--427},
    year = {1994},
    }
  • Download the file: ps

A VLSI Implementation of a Parallel Self-Organizing Learning Model

  • Authors: Matthew Stout and George L. Rudolph and Tony R. Martinez and Linton Salmon
  • Abstract: This paper presents a VLSI implementation of the Priority Adaptive Self-Organizing Concurrent System (PASOCS) learning model that is built using a multi-chip module (MCM) substrate. Many current hardware implementations of neural network learning models are direct implementations of classical neural network structures—a large number of simple computing nodes connected by a dense number of weighted links. PASOCS is one of a class of ASOCS (Adaptive Self-Organizing Concurrent System) connectionist models whose overall goal is the same as classical neural networks models, but whose functional mechanisms differ significantly. This model has potential application in areas such as pattern recognition, robotics, logical inference, and dynamic control.
  • Reference: In Proceedings of the 12th International Conference on Pattern Recognition, volume 3, pages 373–376, 1994.
  • BibTeX:
    @inproceedings{stout_icpr,
    author = {Stout, Matthew and Rudolph, George L. and Martinez, Tony R. and Salmon, Linton},
    title = {A {VLSI} Implementation of a Parallel Self-Organizing Learning Model},
    booktitle = {Proceedings of the 12th International Conference on Pattern Recognition},
    volume = {3},
    pages = {373--376},
    year = {1994},
    }
  • Download the file: ps

A Multi-Chip Module Implementation of a Neural Network

  • Authors: Matthew Stout and Linton Salmon and George L. Rudolph and Tony R. Martinez
  • Abstract: The requirement for dense interconnect in artificial neural network systems has led researchers to seek high-density interconnect technologies. This paper reports an implementation using multi-chip modules (MCMs) as the interconnect medium. The specific system described is a self-organizing, parallel, and dynamic learning model which requires a dense interconnect technology for effective implementation; this requirement is fulfilled by exploiting MCM technology. The ideas presented in this paper regarding an MCM implementation of artificial neural networks are versatile and can be adapted to apply to other neural network and connectionist models.
  • Reference: In Proceedings of the IEEE Multi-Chip Module Conference MCMC-94, pages 20–25, 1994.
  • BibTeX:
    @inproceedings{stout_mcm,
    author = {Stout, Matthew and Salmon, Linton and Rudolph, George L. and Martinez, Tony R.},
    title = {A Multi-Chip Module Implementation of a Neural Network},
    booktitle = {Proceedings of the {IEEE} Multi-Chip Module Conference {MCMC}-94},
    pages = {20--25},
    year = {1994},
    }
  • Download the file: ps

A Location-Independent ASOCS Model

  • Authors: George L. Rudolph
  • Abstract: This thesis proposes a new Adaptive Self-Organizing Concurrent System (ASOCS) model called Location-Independent ASOCS (LIA). The location-independent properties of LIA give it improved potential for implementation using current technology over previous ASOCS models. This thesis focuses on an architectural model and execution as well as on learning algorithms for LIA. Theoretical background, an analysis of LIA, and a brief comparison with other ASOCS models also are presented. The thesis does not discuss details of a hardware implementation of LIA beyond the model level.
  • Reference: Master’s thesis, Brigham Young University, June 1991.
  • BibTeX:
    @mastersthesis{rudolph_th,
    author = {Rudolph, George L.},
    title = {A Location-Independent {ASOCS} Model},
    school = {Brigham Young University},
    month = {June},
    year = {1991},
    }
  • Download the file: ps

An Efficient Static Topology for Modeling ASOCS

  • Authors: George L. Rudolph and Tony R. Martinez
  • Abstract: ASOCS (Adaptive Self-Organizing Concurrent Systems) are a class of connectionist computational models which feature self-organized learning and parallel execution. One area of ASOCS research is the development of an efficient implementation of ASOCS using current technology. A result of this research is the Location-Independent ASOCS model (LIA). LIA uses a parallel, asynchronous network of independent nodes, which leads to an efficient physical realization using current technology. This paper reviews the behavior of the LIA model, and shows how a static binary tree topology efficiently supports that behavior. The binary tree topology allows for O(log(n)) learning and execution times, where n is the number of nodes in the network.
  • Reference: In Kohonen et al., editors, Artificial Neural Networks, pages 729–734, 1991. Elsevier Science Publishers.
  • BibTeX:
    @inproceedings{rudolph_91a,
    author = {Rudolph, George L. and Martinez, Tony R.},
    title = {An Efficient Static Topology for Modeling {ASOCS}},
    editor = {Kohonen and others},
    booktitle = {Artificial Neural Networks},
    pages = {729--734},
    publisher = {Elsevier Science Publishers},
    year = {1991},
    }
  • Download the file: ps

DNA: A New ASOCS Model With Improved Implementation Potential

  • Authors: George L. Rudolph and Tony R. Martinez
  • Abstract: A new class of high-speed, self-adaptive, massively parallel computing models called ASOCS (Adaptive Self-Organizing Concurrent Systems) has been proposed. Current analysis suggests that there may be problems implementing ASOCS models in VLSI using the hierarchical network structures originally proposed. The problems are not inherent in the models, but rather in the technology used to implement them. This has led to the development of a new ASOCS model called DNA (Discriminant-Node ASOCS) that does not depend on a hierarchical node structure for success. Three areas of the DNA model are briefly discussed in this paper: DNA’s flexible nodes, how DNA overcomes problems other models have allocating unused nodes, and how DNA operates during processing and learning.
  • Reference: In Proceedings of the IASTED International Symposium on Expert Systems and Neural Networks, pages 12–15, 1989.
  • BibTeX:
    @inproceedings{rudolph_89a,
    author = {Rudolph, George L. and Martinez, Tony R.},
    title = {{DNA}: A New {ASOCS} Model With Improved Implementation Potential},
    booktitle = {Proceedings of the {IASTED} International Symposium on Expert Systems and Neural Networks},
    pages = {12--15},
    year = {1989},
    }
  • Download the file: ps

An Instance Level Analysis of Data Complexity

  • Authors: Michael R. Smith and Tony R. Martinez and Christophe Giraud-Carrier
  • Abstract: Most data complexity studies have focused on characterizing the complexity of the entire data set and do not provide information about individual instances. Knowing which instances are misclassified and understanding why they are misclassified and how they contribute to data set complexity can improve the learning process and could guide the future development of learning algorithms and data analysis methods. The goal of this paper is to better understand the data used in machine learning problems by identifying and analyzing the instances that are frequently misclassified by learning algorithms that have shown utility to date and are commonly used in practice. We identify instances that are hard to classify correctly (instance hardness) by classifying over 190,000 instances from 64 data sets with 9 learning algorithms. We then use a set of hardness measures to understand why some instances are harder to classify correctly than others. We find that class overlap is a principal contributor to instance hardness. We seek to integrate this information into the training process to alleviate the effects of class overlap and present ways that instance hardness can be used to improve learning.
  • Reference: Machine Learning, 2013.
  • BibTeX:
    @article{smith.ml2013,
    author = {Smith, Michael R. and Martinez, Tony R. and Giraud-Carrier, Christophe},
    title = {An Instance Level Analysis of Data Complexity},
    journal = {Machine Learning},
    year = {2013},
    }
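
A hedged sketch of the instance-hardness idea: estimate, for each instance, the fraction of a diverse set of learning algorithms that misclassify it when it is held out, here via cross-validated predictions. The three scikit-learn classifiers and the iris data are illustrative; the paper uses 9 algorithms over 64 data sets.

    # Estimate instance hardness as the fraction of learners that misclassify
    # each instance under cross-validation (illustrative learner set and data).
    import numpy as np
    from sklearn.datasets import load_iris
    from sklearn.model_selection import cross_val_predict
    from sklearn.naive_bayes import GaussianNB
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_iris(return_X_y=True)
    learners = [DecisionTreeClassifier(random_state=0), GaussianNB(),
                KNeighborsClassifier(n_neighbors=5)]

    misclassified = np.zeros(len(y))
    for clf in learners:
        preds = cross_val_predict(clf, X, y, cv=5)   # each instance predicted while held out
        misclassified += (preds != y)

    hardness = misclassified / len(learners)         # 0 = easy, 1 = hard for every learner
    print(np.argsort(hardness)[-5:], hardness.max()) # indices of the hardest instances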

Improving Classification Accuracy by Identifying and Removing Instances that Should Be Misclassified

  • Authors: Michael R. Smith and Tony Martinez
  • Abstract: Appropriately handling noise and outliers is an important issue in data mining. In this paper we examine how noise and outliers are handled by learning algorithms. We introduce a filtering method called PRISM that identifies and removes instances that should be misclassified. We refer to the set of removed instances as ISMs (instances that should be misclassified). We examine PRISM and compare it against 3 existing outlier detection methods and 1 noise reduction technique on 48 data sets using 9 learning algorithms. Using PRISM the classification accuracy increases from 78.5% to 79.8% on a set of 53 data sets and is statistically significant. In addition, the accuracy on the non-outlier instances increases from 82.8% to 84.7%. PRISM achieves a higher classification accuracy than the outlier detection methods and compares favorably with the noise reduction method.
  • Reference: To appear in Proceedings of the IEEE International Joint Conference on Neural Networks (IJCNN 2011), pages x–x, August 2011.
  • BibTeX:
    @inproceedings{smith.ijcnn2011,
    author = {Smith, Michael R. and Martinez, Tony},
    title = {Improving Classification Accuracy by Identifying and Removing Instances that Should Be Misclassified},
    booktitle = {To appear in Proceedings of the IEEE International Joint Conference on Neural Networks (IJCNN 2011)},
    pages = {x--x},
    month = {August},
    year = {2011},
    }
  • Download the file: pdf

Time Series Gene Expression Prediction using Neural Networks with Hidden Layers

  • Authors: Michael R. Smith and Mark Clement and Tony Martinez and Quinn Snell
  • Abstract: A central issue in systems biology is modeling how genes interact with each other. The non-linear relationships between genes and the feedback loops in the network make modeling gene regulatory networks (GRNs) a difficult problem. In this paper we examine modeling GRNs using neural networks (NNs) with hidden layers to predict gene expression levels. Some assume that single-layer NNs are sufficient to model GRNs. However, NNs without hidden layers are only capable of modeling first-order correlations. We find that an NN with a hidden layer is better able to model gene expression data than an NN without one, and that the addition of a hidden layer results in a lower error than previous models on a benchmark data set. Recurrent NNs have also been used to handle the feedback in GRNs and the temporal nature of the data, but they lack a formal training method. To incorporate time without recurrence, we propose a novel approach to combining time series data to generate additional training points. The additional training points also address the problem that generally only a limited number of noisy data points are available to build the model. The additional data points have a smoothing effect on the predictions, providing the overall trend of each gene. The error values for models trained using the additional training points are competitive with those from recurrent NNs.
  • Reference: In Proceedings of the 7th Annual Biotechnology and Bioinformatics Symposium (BIOT 2010), pages 67–69, October 2010.
  • BibTeX:
    @inproceedings{smith_2010biot,
    author = {Smith, Michael R. and Clement, Mark and Martinez, Tony and Snell, Quinn},
    title = {Time Series Gene Expression Prediction using Neural Networks with Hidden Layers},
    booktitle = {Proceedings of the 7th Annual Biotechnology and Bioinformatics Symposium (BIOT 2010)},
    pages = {67--69},
    month = {October},
    year = {2010},
    }
  • Download the file: pdf

Super-Resolution via Recapture and Bayesian Effect Modeling

  • Authors: Neil Toronto and Bryan Morse and Kevin Seppi and Dan Ventura
  • Abstract: This paper presents Bayesian edge inference (BEI), a single-frame super-resolution method explicitly grounded in Bayesian inference that addresses issues common to existing methods. Though the best give excellent results at modest magnification factors, they suffer from gradient stepping and boundary coherence problems by factors of 4x. Central to BEI is a causal framework that allows image capture and recapture to be modeled differently, a principled way of undoing downsampling blur, and a technique for incorporating Markov random field potentials arbitrarily into Bayesian networks. Besides addressing gradient and boundary issues, BEI is shown to be competitive with existing methods on published correctness measures. The model and framework are shown to generalize to other reconstruction tasks by demonstrating BEI’s effectiveness at CCD demosaicing and inpainting with only trivial changes.
  • Reference: In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, page to appear, June 2009.
  • BibTeX:
    @inproceedings{toronto.cvpr09,
    author = {Toronto, Neil and Morse, Bryan and Seppi, Kevin and Ventura, Dan},
    title = {Super-Resolution via Recapture and Bayesian Effect Modeling},
    booktitle = {Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition},
    pages = {to appear},
    month = {June},
    year = {2009},
    }
  • Download the file: pdf

The Hough Transform’s Implicit Bayesian Foundation

  • Authors: Neil Toronto and Bryan Morse and Dan Ventura and Kevin Seppi
  • Abstract: This paper shows that the basic Hough transform is implicitly a Bayesian process—that it computes an unnormalized posterior distribution over the parameters of a single shape given feature points. The proof motivates a purely Bayesian approach to the problem of finding parameterized shapes in digital images. A proof-of-concept implementation that finds multiple shapes of four parameters is presented. Extensions to the basic model that are made more obvious by the presented reformulation are discussed.
  • Reference: In Proceedings of the IEEE International Conference on Image Processing, pages 377–380, September 2007.
  • BibTeX:
    @inproceedings{toronto.icip07,
    author = {Toronto, Neil and Morse, Bryan and Ventura, Dan and Seppi, Kevin},
    title = {The Hough Transform's Implicit Bayesian Foundation},
    booktitle = {Proceedings of the IEEE International Conference on Image Processing},
    pages = {377--380},
    month = {September},
    year = {2007},
    }
  • Download the file: pdf
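
To make the object of the proof concrete, here is the basic Hough transform for lines that the paper reinterprets: each feature point votes for every (theta, rho) bin consistent with it, and the resulting accumulator is what the paper reads as an unnormalized posterior over line parameters. The bin counts and the toy points are illustrative.

    # Basic line Hough transform; the accumulator plays the role of an
    # unnormalized posterior over (theta, rho) given the feature points.
    import numpy as np

    def hough_lines(points, n_theta=180, n_rho=200, rho_max=200.0):
        thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
        rho_edges = np.linspace(-rho_max, rho_max, n_rho + 1)
        acc = np.zeros((n_theta, n_rho))
        for x, y in points:
            rho = x * np.cos(thetas) + y * np.sin(thetas)   # rho for every theta
            idx = np.digitize(rho, rho_edges) - 1
            ok = (idx >= 0) & (idx < n_rho)
            acc[np.arange(n_theta)[ok], idx[ok]] += 1       # one vote per (theta, rho) bin
        return acc, thetas, rho_edges

    # noisy points roughly on the line y = x, i.e. theta = 135 degrees, rho = 0
    pts = [(t, t + np.random.normal(scale=0.5)) for t in range(50)]
    acc, thetas, rho_edges = hough_lines(pts)
    ti, ri = np.unravel_index(acc.argmax(), acc.shape)
    print(np.degrees(thetas[ti]), rho_edges[ri])            # peak near theta = 135, rho = 0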

Learning Quantum Operators From Quantum State Pairs

  • Authors: Neil Toronto and Dan Ventura
  • Abstract: Developing quantum algorithms has proven to be very difficult. In this paper, the concept of using classical machine learning techniques to derive quantum operators from examples is presented. A gradient descent algorithm for learning unitary operators from quantum state pairs is developed as a starting point to aid in developing quantum algorithms. The algorithm is used to learn the quantum Fourier transform, an underconstrained two-bit function, and Grover’s iterate.
  • Reference: In IEEE World Congress on Computational Intelligence, pages 2607–2612, July 2006.
  • BibTeX:
    @inproceedings{toronto2006,
    author = {Toronto, Neil and Ventura, Dan},
    title = {Learning Quantum Operators From Quantum State Pairs},
    booktitle = {IEEE World Congress on Computational Intelligence},
    pages = {2607--2612},
    month = {July},
    year = {2006},
    }
  • Download the file: pdf
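
A hedged sketch in the same spirit, though not the paper's exact algorithm: learn a unitary that maps given input states to target states by taking gradient steps on the squared error and re-unitarizing after each step via the polar decomposition. The Hadamard target, the step size, and the iteration count are illustrative.

    # Projected gradient descent for learning a unitary from (input, target) state pairs.
    import numpy as np

    def nearest_unitary(A):
        W, _, Vh = np.linalg.svd(A)
        return W @ Vh                                   # unitary factor of the polar decomposition

    H_gate = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)   # "oracle" operator
    inputs = [np.array([1, 0], complex), np.array([0, 1], complex),
              np.array([1, 1], complex) / np.sqrt(2)]
    pairs = [(s, H_gate @ s) for s in inputs]           # (input state, target state)

    U = np.eye(2, dtype=complex)
    for _ in range(500):
        grad = sum(np.outer(U @ s - t, s.conj()) for s, t in pairs)
        U = nearest_unitary(U - 0.1 * grad)             # step, then project back to U(2)

    loss = sum(np.linalg.norm(U @ s - t) ** 2 for s, t in pairs)
    print(round(loss, 8))                               # small when the pairs are reproduced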

Edge Inference for Image Interpolation

  • Authors: Neil Toronto and Dan Ventura and Bryan S. Morse
  • Abstract: Image interpolation algorithms try to fit a function to a matrix of samples in a "natural-looking" way. This paper presents edge inference, an algorithm that does this by mixing neural network regression with standard image interpolation techniques. Results on gray level images are presented. Extension into RGB color space and additional applications of the algorithm are discussed.
  • Reference: In International Joint Conference on Neural Networks, pages 1782–1787, 2005.
  • BibTeX:
    @inproceedings{ntoronto-ijcnn05,
    author = {Toronto, Neil and Ventura, Dan and Morse, Bryan S.},
    title = {Edge Inference for Image Interpolation},
    booktitle = {International Joint Conference on Neural Networks},
    pages = {1782--1787},
    year = {2005},
    }
  • Download the file: pdf

Adapting ADtrees for High Arity Features

  • Authors: Rob Van Dam and Irene Geary and Dan Ventura
  • Abstract: ADtrees, a data structure useful for caching sufficient statistics, have been successfully adapted to grow lazily when memory is limited and to update sequentially with an incrementally updated dataset. For low arity symbolic features, ADtrees trade a slight increase in query time for a reduction in overall tree size. Unfortunately, for high arity features, the same technique can often result in a very large increase in query time and a nearly negligible tree size reduction. In the dynamic (lazy) version of the tree, both query time and tree size can increase for some applications. Here we present two modifications to the ADtree which can be used separately or in combination to achieve the originally intended space-time tradeoff in the ADtree when applied to datasets containing very high arity features.
  • Reference: In Proceedings of the Association for the Advancement of Artificial Intelligence, pages 708–713, July 2008.
  • BibTeX:
    @inproceedings{vandam.aaai08,
    author = {Van Dam, Rob and Geary, Irene and Ventura, Dan},
    title = {Adapting {AD}trees for High Arity Features},
    booktitle = {Proceedings of the Association for the Advancement of Artificial Intelligence},
    pages = {708--713},
    month = {July},
    year = {2008},
    }
  • Download the file: pdf

ADtrees for Sequential Data and N-gram Counting

  • Authors: Rob Van Dam and Dan Ventura
  • Abstract: We consider the problem of efficiently storing n-gram counts for large n over very large corpora. In such cases, the efficient storage of sufficient statistics can have a dramatic impact on system performance. One popular model for storing such data derived from tabular data sets with many attributes is the ADtree. Here, we adapt the ADtree to benefit from the sequential structure of corpora-type data. We demonstrate the usefulness of our approach on a portion of the well-known Wall Street Journal corpus from the Penn Treebank and show that our approach is exponentially more efficient than the naïve approach to storing n-grams and is also significantly more efficient than a traditional prefix tree.
  • Reference: In Proceedings of the IEEE International Conference on Systems, Man and Cybernetics, pages 492–497, October 2007.
  • BibTeX:
    @inproceedings{vandam.smc07,
    author = {Van Dam, Rob and Ventura, Dan},
    title = {{AD}trees for Sequential Data and N-gram Counting},
    booktitle = {Proceedings of the {IEEE} International Conference on Systems, Man and Cybernetics},
    pages = {492--497},
    month = {October},
    year = {2007},
    }
  • Download the file: pdf
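
For contrast with the sequential ADtree, the prefix-tree baseline mentioned in the abstract can be sketched in a few lines: each n-gram is a path from the root, and every node stores how many full n-gram windows begin with the path leading to it, so prefix counts come for free. The toy corpus and the choice of n are illustrative; the paper's ADtree adaptation is more elaborate.

    # N-gram counting with a plain prefix tree (the baseline the paper compares against).
    from collections import defaultdict

    def make_node():
        return {"count": 0, "children": defaultdict(make_node)}

    def add_ngrams(root, tokens, n):
        for i in range(len(tokens) - n + 1):
            node = root
            for tok in tokens[i:i + n]:                 # walk/extend the path for this window
                node = node["children"][tok]
                node["count"] += 1                      # counts every prefix of the window
        return root

    def count(root, ngram):
        node = root
        for tok in ngram:
            if tok not in node["children"]:
                return 0
            node = node["children"][tok]
        return node["count"]

    corpus = "the dog chased the cat and the cat chased the dog".split()
    tree = add_ngrams(make_node(), corpus, n=3)
    print(count(tree, ("the", "cat")), count(tree, ("chased", "the", "dog")))   # prints: 2 1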

Robust Trainability of Single Neurons

  • Authors: Klaus-Uwe Hoeffgen and Hans-Ulrich Simon and Kevin S. Van Horn
  • Abstract: It is well known that (McCulloch-Pitts) neurons are efficiently trainable to learn an unknown halfspace from examples, using linear-programming methods. We want to analyze how the learning performance degrades when the representational power of the neuron is overstrained, i.e., if more complex concepts than just halfspaces are allowed. We show that the problem of learning a probably almost optimal weight vector for a neuron is so difficult that the minimum error cannot even be approximated to within a constant factor in polynomial time (unless RP = NP); we obtain the same hardness result for several variants of this problem. We considerably strengthen these negative results for neurons with binary weights 0 or 1. We also show that neither heuristical learning nor learning by sigmoidal neurons with a constant reject rate is efficiently possible (unless RP = NP).
  • Reference: Journal of Computer and System Sciences, volume 50 (1), pages 114–125, 1995.
  • BibTeX:
    @article{vanhorn_4,
    author = {Hoeffgen, Klaus-Uwe and Simon, Hans-Ulrich and Van Horn, Kevin S.},
    title = {Robust Trainability of Single Neurons},
    journal = {Journal of Computer and System Sciences},
    volume = {50},
    number = {1},
    pages = {114--125},
    year = {1995},
    }
  • Download the file: pdf

Learning as Optimization

  • Authors: Kevin S. Van Horn
  • Reference: PhD thesis, Brigham Young University, August 1994.
  • BibTeX:
    @phdthesis{vanhorn_6,
    author = {Van Horn, Kevin S.},
    title = {Learning as Optimization},
    school = {Brigham Young University},
    month = {August},
    year = {1994},
    }
  • Download the file: pdf

Extending Occam’s Razor

  • Authors: Kevin S. Van Horn and Tony R. Martinez
  • Abstract: Occam’s Razor states that, all other things being equal, the simpler of two possible hypotheses is to be preferred. A quantified version of Occam’s Razor has been proven for the PAC model of learning, giving sample-complexity bounds for learning using what Blumer et al. call an Occam algorithm. We prove an analog of this result for Haussler’s more general learning model, which encompasses learning in stochastic situations, learning real-valued functions, etc.
  • Reference: In Proceedings of the Third Golden West International Conference on Intelligent Systems, Las Vegas, Nevada, June 1994.
  • BibTeX:
    @inproceedings{vanhorn_5,
    author = {Van Horn, Kevin S. and Martinez, Tony R.},
    title = {Extending Occam's Razor},
    booktitle = {Proceedings of the Third Golden West International Conference on Intelligent Systems},
    address = {Las Vegas, Nevada},
    month = {June},
    year = {1994},
    }
  • Download the file: pdf

The Minimum Feature Set Problem

  • Authors: Kevin S. Van Horn and Tony R. Martinez
  • Abstract: One approach to improving the generalization power of a neural net is to try to minimize the number of non-zero weights used. We examine two issues relevant to this approach, for single-layer nets. First we bound the VC dimension of the set of linear-threshold functions that have non-zero weights for at most s of n inputs. Second, we show that the problem of minimizing the number of non-zero input weights used (without misclassifying training examples) is both NP-hard and difficult to approximate.
  • Reference: Neural Networks, volume 3, pages 491–494, 1994.
  • BibTeX:
    @article{vanhorn_3,
    author = {Van Horn, Kevin S. and Martinez, Tony R.},
    title = {The Minimum Feature Set Problem},
    journal = {Neural Networks},
    volume = {3},
    pages = {491--494},
    year = {1994},
    }
  • Download the file: pdf

The BBG Rule Induction Algorithm

  • Authors: Kevin S. Van Horn and Tony R. Martinez
  • Abstract: We present an algorithm (BBG) for inductive learning from examples that outputs a rule list. BBG uses a combination of greedy and branch-and-bound techniques, and naturally handles noisy or stochastic learning situations. We also present the results of an empirical study comparing BBG with Quinlan’s C4.5 on 1050 synthetic data sets. We find that BBG greatly outperforms C4.5 on rule-oriented problems, and equals or exceeds C4.5’s performance on tree-oriented problems.
  • Reference: In Proceedings of the 6th Australian Joint Conference on Artificial Intelligence, pages 348–355, Melbourne, Australia, November 1993.
  • BibTeX:
    @inproceedings{vanhorn_1,
    author = {Van Horn, Kevin S. and Martinez, Tony R.},
    title = {The {BBG} Rule Induction Algorithm},
    booktitle = {Proceedings of the 6th Australian Joint Conference on Artificial Intelligence},
    pages = {348--355},
    address = {Melbourne, Australia},
    month = {November},
    year = {1993},
    }
  • Download the file: pdf

The Design and Evaluation of a Rule Induction Algorithm

  • Authors: Kevin S. Van Horn and Tony R. Martinez
  • Abstract: This technical report expands on and fills in some of the details of the paper “The BBG Rule Induction Algorithm”.
  • Reference: Technical Report BYU-CS-93-11, Brigham Young University, November 1993.
  • BibTeX:
    @techreport{vanhorn_2,
    author = {Van Horn, Kevin S. and Martinez, Tony R.},
    title = {The Design and Evaluation of a Rule Induction Algorithm},
    institution = {Brigham Young University},
    number = {{BYU}-{CS}-93-11},
    month = {November},
    year = {1993},
    }
  • Download the file: pdf

A Sub-symbolic Model of the Cognitive Processes of Re-representation and Insight

  • Authors: Dan Ventura
  • Abstract: We present a sub-symbolic computational model for effecting knowledge re-representation and insight. Given a set of data, manifold learning is used to automatically organize the data into one or more representational transformations, which are then learned with a set of neural networks. The result is a set of neural filters that can be applied to new data as re-representation operators.
  • Reference: In Proceedings of ACM Creativity and Cognition, page to appear, October 2009.
  • BibTeX:
    @inproceedings{ventura.cc09,
    author = {Ventura, Dan},
    title = {A Sub-symbolic Model of the Cognitive Processes of Re-representation and Insight},
    booktitle = {Proceedings of ACM Creativity and Cognition},
    pages = {to appear},
    month = {October},
    year = {2009},
    }
  • Download the file: pdf

A Reductio Ad Absurdum Experiment in Sufficiency for Evaluating (Computational) Creative Systems

  • Authors: Dan Ventura
  • Abstract: We consider a combination of two recent proposals for characterizing computational creativity and explore the sufficiency of the resultant framework. We do this in the form of a gedanken experiment designed to expose the nature of the framework, what it has to say about computational creativity, how it might be improved and what questions this raises.
  • Reference: In Proceedings of the International Joint Workshop on Computational Creativity, pages 11–19, September 2008.
  • BibTeX:
    @inproceedings{ventura.ijwcc08,
    author = {Ventura, Dan},
    title = {A Reductio Ad Absurdum Experiment in Sufficiency for Evaluating (Computational) Creative Systems},
    booktitle = {Proceedings of the International Joint Workshop on Computational Creativity},
    pages = {11--19},
    month = {September},
    year = {2008},
    }
  • Download the file: pdf

Data-Driven Programming and Behavior for Autonomous Virtual Characters

  • Authors: Jonathan Dinerstein and Dan Ventura and Michael Goodrich and Parris Egbert
  • Abstract: We present a high-level overview of a system for programming autonomous virtual characters by demonstration. The result is a deliberative model of agent behavior that is stylized and effective, as demonstrated in five different case studies.
  • Reference: In Proceedings of the Association for the Advancement of Artificial Intelligence, pages 1450–1451, July 2008.
  • BibTeX:
    @inproceedings{dinerstein.aaai08,
    author = {Dinerstein, Jonathan and Ventura, Dan and Goodrich, Michael and Egbert, Parris},
    title = {Data-Driven Programming and Behavior for Autonomous Virtual Characters},
    booktitle = {Proceedings of the Association for the Advancement of Artificial Intelligence},
    pages = {1450--1451},
    month = {July},
    year = {2008},
    }
  • Download the file: pdf

Sub-symbolic Re-representation to Facilitate Learning Transfer

  • Authors: Dan Ventura
  • Abstract: We consider the issue of knowledge (re-)representation in the context of learning transfer and present a sub-symbolic approach for effecting such transfer. Given a set of data, manifold learning is used to automatically organize the data into one or more representational transformations, which are then learned with a set of neural networks. The result is a set of neural filters that can be applied to new data as re-representation operators. Encouraging preliminary empirical results elucidate the approach and demonstrate its feasibility, suggesting possible implications for the broader field of creativity.
  • Reference: In Creative Intelligent Systems, AAAI 2008 Spring Symposium Technical Report SS-08-03, pages 128–134, March 2008.
  • BibTeX:
    @inproceedings{ventura.aaaiss08,
    author = {Ventura, Dan},
    title = {Sub-symbolic Re-representation to Facilitate Learning Transfer},
    booktitle = {Creative Intelligent Systems, {AAAI} 2008 Spring Symposium Technical Report {SS}-08-03},
    pages = {128--134},
    month = {March},
    year = {2008},
    }

Robust Multi-Modal Biometric Fusion via SVM Ensemble

  • Authors: Sabra Dinerstein and Jon Dinerstein and Dan Ventura
  • Abstract: Existing learning-based multi-modal biometric fusion techniques typically employ a single static Support Vector Machine (SVM). This type of fusion improves the accuracy of biometric classification, but it also has serious limitations because it is based on the assumptions that the set of biometric classifiers to be fused is local, static, and complete. We present a novel multi-SVM approach to multi-modal biometric fusion that addresses the limitations of existing fusion techniques and show empirically that our approach retains good classification accuracy even when some of the biometric modalities are unavailable.
  • Reference: In Proceedings of the IEEE International Conference on Systems, Man and Cybernetics, pages 1530–1535, October 2007.
  • BibTeX:
    @inproceedings{dinerstein.smc07,
    author = {Dinerstein, Sabra and Dinerstein, Jon and Ventura, Dan},
    title = {Robust Multi-Modal Biometric Fusion via {SVM} Ensemble},
    booktitle = {Proceedings of the {IEEE} International Conference on Systems, Man and Cybernetics},
    pages = {1530--1535},
    month = {October},
    year = {2007},
    }
  • Download the file: pdf
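  • Illustrative sketch: one way to realize the multi-SVM idea described above is to train a separate SVM for each subset of modalities and, at classification time, use the SVM matching whichever modalities are actually available. The following is a minimal scikit-learn sketch under that assumption, not the paper's implementation; the modalities, data, and names are illustrative.
    # Minimal multi-SVM fusion sketch: one SVM per modality subset.
    from itertools import combinations
    import numpy as np
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    modalities = {"face": 4, "voice": 3}                        # modality name -> feature width
    X = {m: rng.normal(size=(200, d)) for m, d in modalities.items()}
    y = (X["face"][:, 0] + X["voice"][:, 0] > 0).astype(int)    # toy genuine/impostor rule

    # Train one SVM for every non-empty subset of modalities.
    names = list(modalities)
    ensemble = {}
    for r in range(1, len(names) + 1):
        for subset in combinations(names, r):
            feats = np.hstack([X[m] for m in subset])
            ensemble[subset] = SVC().fit(feats, y)

    def classify(sample):
        """Fuse only the modalities actually present in `sample`."""
        subset = tuple(m for m in names if m in sample)
        feats = np.hstack([sample[m] for m in subset]).reshape(1, -1)
        return ensemble[subset].predict(feats)[0]

    # Voice sensor unavailable: fall back to the face-only SVM.
    probe = {"face": np.array([2.0, 0.0, 0.0, 0.0])}
    print(classify(probe))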

Learning Policies for Embodied Virtual Agents Through Demonstration

  • Authors: Jonathan Dinerstein and Parris Egbert and Dan Ventura
  • Abstract: Although many powerful AI and machine learning techniques exist, it remains difficult to quickly create AI for embodied virtual agents that produces visually lifelike behavior. This is important for applications (e.g., games, simulators, interactive displays) where an agent must behave in a manner that appears human-like. We present a novel technique for learning reactive policies that mimic demonstrated human behavior. The user demonstrates the desired behavior by dictating the agent’s actions during an interactive animation. Later, when the agent is to behave autonomously, the recorded data is generalized to form a continuous state-to-action mapping. Combined with an appropriate animation algorithm (e.g., motion capture), the learned policies realize stylized and natural-looking agent behavior. We empirically demonstrate the efficacy of our technique for quickly producing policies which result in lifelike virtual agent behavior.
  • Reference: In Proceedings of the International Joint Conference on Artificial Intelligence, pages 1257–1262, Hyderabad, India, January 2007.
  • BibTeX:
    @inproceedings{dinerstein.ijcai07,
    author = {Dinerstein, Jonathan and Egbert, Parris and Ventura, Dan},
    title = {Learning Policies for Embodied Virtual Agents Through Demonstration},
    booktitle = {Proceedings of the International Joint Conference on Artificial Intelligence},
    pages = {1257--1262},
    address = {Hyderabad, India},
    month = {January},
    year = {2007},
    }
  • Download the file: pdf

Clustering Music via the Temporal Similarity of Timbre

  • Authors: Jake Merrell and Dan Ventura and Bryan Morse
  • Abstract: We consider the problem of measuring the similarity of streaming music content and present a method for modeling, on the fly, the temporal progression of a song’s timbre. Using a minimum distance classification scheme, we give an approach to classifying streaming music sources and present performance results for auto- associative song identification and for content-based clustering of streaming music. We discuss possible extensions to the approach and possible uses for such a system.
  • Reference: In IJCAI Workshop on Artificial Intelligence and Music, pages 153–164, January 2007.
  • BibTeX:
    @inproceedings{merrell.ijcai07,
    author = {Merrell, Jake and Ventura, Dan and Morse, Bryan},
    title = {Clustering Music via the Temporal Similarity of Timbre},
    booktitle = {{IJCAI} Workshop on Artificial Intelligence and Music},
    pages = {153--164},
    month = {January},
    year = {2007},
    }
  • Download the file: pdf

Geometric Task Decomposition in a Multi-agent Environment

  • Authors: Kaivan Kamali and Dan Ventura and Amulya Garga and Soundar Kumara
  • Abstract: Task decomposition in a multi-agent environment is often performed online. This paper proposes a method for sub-task allocation that can be performed before the agents are deployed, reducing the need for communication among agents during their mission. The proposed method uses a Voronoi diagram to partition the task-space among team members and includes two phases: static and dynamic. Static decomposition (performed in simulation before the start of the mission) repeatedly partitions the task-space by generating random diagrams and measuring the efficacy of the corresponding sub-task allocation. If necessary, dynamic decomposition (performed in simulation after the start of a mission) modifies the result of a static decomposition (i.e. in case of resource limitations for some agents). Empirical results are reported for the problem of surveillance of an arbitrary region by a team of agents.
  • Reference: Applied Artificial Intelligence, volume 20 (5), pages 437–456, 2006.
  • BibTeX:
    @article{Kamali.aai06,
    author = {Kamali, Kaivan and Ventura, Dan and Garga, Amulya and Kumara, Soundar},
    title = {Geometric Task Decomposition in a Multi-agent Environment},
    journal = {Applied Artificial Intelligence},
    volume = {20},
    number = {5},
    pages = {437--456},
    year = {2006},
    }
  • Download the file: pdf
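  • Illustrative sketch: the static phase described above can be approximated by scoring many random Voronoi partitions of a discretized task space and keeping the best one. In this minimal NumPy sketch the score is simply workload balance (the standard deviation of region sizes), chosen purely for illustration; it is not the paper's efficacy measure.
    # Random Voronoi partitions of a 2-D task space; keep the most balanced one.
    import numpy as np

    rng = np.random.default_rng(1)
    n_agents, n_trials = 4, 200

    # Discretize the unit-square task space into grid cells.
    gx, gy = np.meshgrid(np.linspace(0, 1, 50), np.linspace(0, 1, 50))
    cells = np.column_stack([gx.ravel(), gy.ravel()])

    best_sites, best_areas, best_score = None, None, np.inf
    for _ in range(n_trials):
        sites = rng.random((n_agents, 2))                  # random generator points
        # Assign every cell to its nearest site (its Voronoi region).
        d = np.linalg.norm(cells[:, None, :] - sites[None, :, :], axis=2)
        areas = np.bincount(d.argmin(axis=1), minlength=n_agents)
        score = areas.std()                                # smaller = more balanced workload
        if score < best_score:
            best_sites, best_areas, best_score = sites, areas, score

    print("region sizes of the best partition:", best_areas)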

Fast and Robust Incremental Action Prediction for Interactive Agents

  • Authors: John Dinerstein and Dan Ventura and Parris Egbert
  • Abstract: The ability for a given agent to adapt on-line to better interact with another agent is a difficult and important problem. This problem becomes even more difficult when the agent to interact with is a human, since humans learn quickly and behave nondeterministically. In this paper we present a novel method whereby an agent can incrementally learn to predict the actions of another agent (even a human), and thereby can learn to better interact with that agent. We take a case-based approach, where the behavior of the other agent is learned in the form of state-action pairs. We generalize these cases either through continuous k-nearest neighbor, or a modified bounded minimax search. Through our case studies, our technique is empirically shown to require little storage, learn very quickly, and be fast and robust in practice. It can accurately predict actions several steps into the future. Our case studies include interactive virtual environments involving mixtures of synthetic agents and humans, with cooperative and/or competitive relationships.
  • Reference: Computational Intelligence, volume 21 (1), pages 90–110, 2005.
  • BibTeX:
    @article{dinerstein.ci05,
    author = {Dinerstein, John and Ventura, Dan and Egbert, Parris},
    title = {Fast and Robust Incremental Action Prediction for Interactive Agents},
    journal = {Computational Intelligence},
    volume = {21},
    number = {1},
    pages = {90--110},
    year = {2005},
    }
  • Download the file: pdf
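  • Illustrative sketch: the continuous k-nearest-neighbor variant described above amounts to storing observed (state, action) cases and predicting the distance-weighted average action of the k closest stored states. The following is a minimal NumPy sketch; the bounded minimax alternative and other implementation details are omitted, and all names are illustrative.
    # Case-based action prediction via continuous k-nearest neighbor.
    import numpy as np

    class ActionPredictor:
        def __init__(self, k=3):
            self.k = k
            self.states, self.actions = [], []

        def observe(self, state, action):
            """Incrementally store one (state, action) case."""
            self.states.append(np.asarray(state, float))
            self.actions.append(np.asarray(action, float))

        def predict(self, state):
            """Distance-weighted average of the k nearest stored actions."""
            S = np.stack(self.states)
            A = np.stack(self.actions)
            d = np.linalg.norm(S - np.asarray(state, float), axis=1)
            idx = np.argsort(d)[: self.k]
            w = 1.0 / (d[idx] + 1e-9)          # closer cases count more
            return (w[:, None] * A[idx]).sum(axis=0) / w.sum()

    p = ActionPredictor(k=2)
    p.observe([0.0, 0.0], [1.0])
    p.observe([1.0, 0.0], [0.0])
    p.observe([0.0, 1.0], [0.5])
    print(p.predict([0.2, 0.1]))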

Training a Quantum Neural Network

  • Authors: Bob Ricks and Dan Ventura
  • Abstract: Quantum learning holds great promise for the field of machine intelligence. The most studied quantum learning algorithm is the quantum neural network. Many such models have been proposed, yet none has become a standard. In addition, these models usually leave out many details, often excluding how they intend to train their networks. This paper discusses one approach to the problem and what advantages it would have over classical networks.
  • Reference: In Neural Information Processing Systems, pages 1019–1026, December 2003.
  • BibTeX:
    @inproceedings{ricks.nips03,
    author = {Ricks, Bob and Ventura, Dan},
    title = {Training a Quantum Neural Network},
    booktitle = {Neural Information Processing Systems},
    pages = {1019--1026},
    month = {December},
    year = {2003},
    }
  • Download the file: pdf

Probabilistic Connections in Relaxation Networks

  • Authors: Dan Ventura
  • Abstract: This paper reports results from studying the behavior of Hopfield-type networks with probabilistic connections. As the probabilities decrease, network performance degrades. In order to compensate, two network modifications — input persistence and a new activation function — are suggested, and empirical results indicate that the modifications significantly improve network performance.
  • Reference: In Proceedings of the International Joint Conference on Neural Networks, pages 934–938, May 2002.
  • BibTeX:
    @inproceedings{ventura.ijcnn02,
    author = {Ventura, Dan},
    title = {Probabilistic Connections in Relaxation Networks},
    booktitle = {Proceedings of the International Joint Conference on Neural Networks},
    pages = {934--938},
    month = {May},
    year = {2002},
    }
  • Download the file: pdf
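  • Illustrative sketch: a Hopfield-type relaxation step in which each connection participates only with probability p, the setting whose degradation the paper studies. The compensating modifications proposed in the paper (input persistence and the new activation function) are not reproduced here; the patterns and sizes are illustrative.
    # Hopfield-style relaxation with probabilistic connections.
    import numpy as np

    rng = np.random.default_rng(2)

    patterns = np.array([[1, -1, 1, -1, 1, -1],
                         [1, 1, 1, -1, -1, -1]])
    W = sum(np.outer(p, p) for p in patterns).astype(float)
    np.fill_diagonal(W, 0.0)                       # Hebbian weights, no self-connections

    def relax(state, p_connect, steps=50):
        s = state.copy()
        for _ in range(steps):
            # Each connection is present only with probability p_connect this update.
            mask = rng.random(W.shape) < p_connect
            net = (W * mask) @ s
            s = np.where(net >= 0, 1, -1)
        return s

    probe = np.array([1, -1, 1, 1, 1, -1])          # noisy version of pattern 0
    print("p=1.0 :", relax(probe, 1.0))
    print("p=0.3 :", relax(probe, 0.3))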

Pattern Classification Using a Quantum System

  • Authors: Dan Ventura
  • Abstract: We consider and compare three approaches to quantum pattern classification, presenting empirical results from simulations.
  • Reference: In Proceedings of the Joint Conference on Information Sciences, pages 537–640, March 2002.
  • BibTeX:
    @inproceedings{ventura.jcis02,
    author = {Ventura, Dan},
    title = {Pattern Classification Using a Quantum System},
    booktitle = {Proceedings of the Joint Conference on Information Sciences},
    pages = {537--640},
    month = {March},
    year = {2002},
    }
  • Download the file: pdf

A Quantum Analog to Basis Function Networks

  • Authors: Dan Ventura
  • Abstract: A Fourier-based quantum computational learning algorithm with similarities to classical basis function networks is developed. Instead of a Gaussian basis, the quantum algorithm uses a discrete Fourier basis with the output being a linear combination of the basis. A set of examples is considered as a quantum system that undergoes unitary transformations to produce learning. The main result of the work is a quantum computational learning algorithm that is unique among quantum algorithms as it does not assume a priori knowledge of a function f.
  • Reference: In Proceedings of the International Conference on Computing Anticipatory Systems, pages 286–295, August 2001.
  • BibTeX:
    @inproceedings{ventura.casys01,
    author = {Ventura, Dan},
    title = {A Quantum Analog to Basis Function Networks},
    booktitle = {Proceedings of the International Conference on Computing Anticipatory Systems},
    pages = {286--295},
    month = {August},
    year = {2001},
    }
  • Download the file: pdf

On the Utility of Entanglement in Quantum Neural Computing

  • Authors: Dan Ventura
  • Abstract: Efforts in combining quantum and neural computation are briefly discussed and the concept of entanglement as it applies to this subject is addressed. Entanglement is perhaps the least understood aspect of quantum systems used for computation, yet it is apparently most responsible for their computational power. This paper argues for the importance of understanding and utilizing entanglement in quantum neural computation.
  • Reference: In Proceedings of the International Joint Conference on Neural Networks, pages 1565–1570, July 2001.
  • BibTeX:
    @inproceedings{ventura.ijcnn01,
    author = {Ventura, Dan},
    title = {On the Utility of Entanglement in Quantum Neural Computing},
    booktitle = {Proceedings of the International Joint Conference on Neural Networks},
    pages = {1565--1570},
    month = {July},
    year = {2001},
    }
  • Download the file: pdf

Learning Quantum Operators

  • Authors: Dan Ventura
  • Abstract: Consider the system F|x> = |y>, where F is unknown. We examine the possibility of learning the operator F inductively, drawing analogies with ideas from classical computational learning.
  • Reference: In Proceedings of the Joint Conference on Information Sciences, pages 750–752, March 2000.
  • BibTeX:
    @inproceedings{ventura.jcis00,
    author = {Ventura, Dan},
    title = {Learning Quantum Operators},
    booktitle = {Proceedings of the Joint Conference on Information Sciences},
    pages = {750--752},
    month = {March},
    year = {2000},
    }
  • Download the file: pdf

Quantum Neural Networks

  • Authors: Alexandr Ezhov and Dan Ventura
  • Abstract: This chapter outlines the research, development and perspectives of quantum neural networks – a burgeoning new field which integrates classical neurocomputing with quantum computation. It is argued that the study of quantum neural networks may give us both new understanding of brain function and unprecedented possibilities in creating new systems for information processing, including solving classically intractable problems, associative memory with exponential capacity and possibly overcoming the limitations posed by the Church-Turing thesis.
  • Reference: In Kasabov, N. (editor), Future Directions for Intelligent Systems and Information Science 2000, Physica-Verlag, 2000.
  • BibTeX:
    @incollection{ezhov.fdisis00,
    author = {Ezhov, Alexandr and Ventura, Dan},
    title = {Quantum Neural Networks},
    editor = {Kasabov, N.},
    booktitle = {Future Directions for Intelligent Systems and Information Science 2000},
    publisher = {Physica-Verlag},
    year = {2000},
    }
  • Download the file: pdf

Distributed Queries for Quantum Associative Memory

  • Authors: Alexandr Ezhov and A. Nifanova and Dan Ventura
  • Abstract: This paper discusses a model of quantum associative memory which generalizes the completing associative memory proposed by Ventura and Martinez. Similar to this model, our system is based on Grover’s well known algorithm for searching an unsorted quantum database. However, the model presented in this paper suggests the use of a distributed query of general form. It is demonstrated that spurious memories form an unavoidable part of the quantum associative memory model; however, the very presence of these spurious states provides the possibility of organizing a controlled process of data retrieval using a specially formed initial state of the quantum database and also of the transformation performed upon it. Concrete examples illustrating the properties of the proposed model are also presented.
  • Reference: Information Sciences, volume 3-4, pages 271–293, 2000.
  • BibTeX:
    @article{ezhov.is00,
    author = {Ezhov, Alexandr and Nifanova, A. and Ventura, Dan},
    title = {Distributed Queries for Quantum Associative Memory},
    journal = {Information Sciences },
    volume = {3-4},
    pages = {271--293},
    year = {2000},
    }
  • Download the file: pdf

Optically Simulating a Quantum Associative Memory

  • Authors: John Howell and John Yeazell and Dan Ventura
  • Abstract: This paper discusses the realization of a quantum associative memory using linear integrated optics. An associative memory produces a full pattern of bits when presented with only a partial pattern. Quantum computers have the potential to store large numbers of patterns and hence have the ability to far surpass any classical neural network realization of an associative memory. In this work two 3-qubit associative memories will be discussed using linear integrated optics. In addition, corrupted, invented and degenerate memories are discussed.
  • Reference: Physical Review A, volume 62, 2000. Article 42303.
  • BibTeX:
    @article{howell.pra00,
    author = {Howell, John and Yeazell, John and Ventura, Dan},
    title = {Optically Simulating a Quantum Associative Memory},
    journal = {Physical Review A},
    volume = {62},
    year = {2000},
    note = {Article 42303},
    }
  • Download the file: pdf

Quantum Associative Memory

  • Authors: Dan Ventura and Tony R. Martinez
  • Abstract: This paper combines quantum computation with classical neural network theory to produce a quantum computational learning algorithm. Quantum computation uses microscopic quantum level effects to perform computational tasks and has produced results that in some cases are exponentially faster than their classical counterparts. The unique characteristics of quantum theory may also be used to create a quantum associative memory with a capacity exponential in the number of neurons. This paper combines two quantum computational algorithms to produce such a quantum associative memory. The result is an exponential increase in the capacity of the memory when compared to traditional associative memories such as the Hopfield network. The paper covers necessary high-level quantum mechanical and quantum computational ideas and introduces a quantum associative memory. Theoretical analysis proves the utility of the memory, and it is noted that a small version should be physically realizable in the near future.
  • Reference: Information Sciences, volume 1-4, pages 273–296, 2000.
  • BibTeX:
    @article{ventura.is00,
    author = {Ventura, Dan and Martinez, Tony R.},
    title = {Quantum Associative Memory},
    journal = {Information Sciences },
    volume = {1-4},
    pages = {273--296},
    year = {2000},
    }
  • Download the file: pdf

Initializing the Amplitude Distribution of a Quantum State

  • Authors: Dan Ventura and Tony R. Martinez
  • Abstract: To date, quantum computational algorithms have operated on a superposition of all basis states of a quantum system. Typically, this is because it is assumed that some function f is known and implementable as a unitary evolution. However, what if only some points of the function f are known? It then becomes important to be able to encode only the knowledge that we have about f. This paper presents an algorithm that requires a polynomial number of elementary operations for initializing a quantum system to represent only the m known points of a function f.
  • Reference: Foundations of Physics Letters, volume 6, pages 547–559, December 1999.
  • BibTeX:
    @article{ventura.fopl99,
    author = {Ventura, Dan and Martinez, Tony R.},
    title = {Initializing the Amplitude Distribution of a Quantum State},
    journal = {Foundations of Physics Letters },
    volume = {6},
    pages = {547--559},
    month = {December},
    year = {1999},
    }
  • Download the file: pdf
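  • Illustrative sketch: the target of the initialization described above is a state with amplitude 1/sqrt(m) on exactly the m known (x, f(x)) basis states and zero elsewhere. The sketch below simply writes that target state down as a vector; the paper's contribution, reaching it with a polynomial number of elementary operations, is not reproduced, and the bit widths and known points are illustrative.
    # Target state: equal amplitude over only the m known (x, f(x)) points.
    import numpy as np

    n_input_bits, n_output_bits = 3, 2
    known_points = {0b001: 0b10, 0b100: 0b01, 0b111: 0b11}   # x -> f(x), illustrative

    N = 2 ** (n_input_bits + n_output_bits)
    state = np.zeros(N)
    for x, fx in known_points.items():
        state[(x << n_output_bits) | fx] = 1.0                # basis index |x, f(x)>
    state /= np.linalg.norm(state)                            # amplitude 1/sqrt(m) each

    print(np.nonzero(state)[0], state[np.nonzero(state)])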

A Quantum Associative Memory Based on Grover’s Algorithm

  • Authors: Dan Ventura and Tony R. Martinez
  • Abstract: Quantum computation uses microscopic quantum level effects to perform computational tasks and has produced results that in some cases are exponentially faster than their classical counterparts. The unique characteristics of quantum theory may also be used to create a quantum associative memory with a capacity exponential in the number of neurons. This paper combines two quantum computational algorithms to produce a quantum associative memory. The result is an exponential increase in the capacity of the memory when compared to traditional associative memories such as the Hopfield network. This paper covers necessary high-level quantum mechanical ideas and introduces a quantum associative memory, a small version of which should be physically realizable in the near future.
  • Reference: In Proceedings of the International Conference on Artificial Neural Networks and Genetic Algorithms, pages 22–27, April 1999.
  • BibTeX:
    @inproceedings{ventura.icannga99,
    author = {Ventura, Dan and Martinez, Tony R.},
    title = {A Quantum Associative Memory Based on Grover's Algorithm},
    booktitle = {Proceedings of the International Conference on Artificial Neural Networks and Genetic Algorithms},
    pages = {22--27},
    month = {April},
    year = {1999},
    }
  • Download the file: ps, pdf
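  • Illustrative sketch: the recall step builds on Grover-style amplitude amplification; below is a minimal NumPy state-vector simulation of plain Grover iterations (not the paper's modified recall operator), with the number of qubits and the marked pattern chosen arbitrarily for illustration.
    # State-vector simulation of Grover iterations (the recall primitive).
    import numpy as np

    n_qubits = 4
    N = 2 ** n_qubits
    marked = 0b1010                                  # pattern to recall (illustrative)

    state = np.full(N, 1 / np.sqrt(N))               # uniform superposition

    def grover_step(psi):
        psi = psi.copy()
        psi[marked] *= -1                            # oracle: flip phase of the marked pattern
        return 2 * psi.mean() - psi                  # inversion about the mean (diffusion)

    for _ in range(int(round(np.pi / 4 * np.sqrt(N)))):
        state = grover_step(state)

    print("probability of recalling the marked pattern:", state[marked] ** 2)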

Quantum Computational Intelligence: Answers and Questions

  • Authors: Dan Ventura
  • Abstract: This is a brief article discussing the interesting possibilities and potential difficulties with combining classical computational intelligence with quantum computation. See http://www.computer.org/intelligent/ex1999/pdf/x4009.pdf for a copy of the article.
  • Reference: IEEE Intelligent Systems, volume 4, pages 14–16, 1999.
  • BibTeX:
    @article{ventura.ieeeis99,
    author = {Ventura, Dan},
    title = {Quantum Computational Intelligence: Answers and Questions},
    journal = {{IEEE} Intelligent Systems },
    volume = {4},
    pages = {14--16},
    year = {1999},
    }
  • Download the file: pdf

Implementing Competitive Learning in a Quantum System

  • Authors: Dan Ventura
  • Abstract: Ideas from quantum computation are applied to the field of neural networks to produce competitive learning in a quantum system. The resulting quantum competitive learner has a prototype storage capacity that is exponentially greater than that of its classical counterpart. Further, empirical results from simulation of the quantum competitive learning system on real-world data sets demonstrate the quantum system’s potential for excellent performance.
  • Reference: In Proceedings of the International Joint Conference on Neural Networks (IJCNN’99), paper 513, 1999.
  • BibTeX:
    @inproceedings{ventura.ijcnn99a,
    author = {Ventura, Dan},
    title = {Implementing Competitive Learning in a Quantum System},
    booktitle = {Proceedings of the International Joint Conference on Neural Networks ({IJCNN}'99), paper 513},
    year = {1999},
    }
  • Download the file: ps, pdf

A Neural Model of Centered Tri-gram Speech Recognition

  • Authors: Dan Ventura and D. Randall Wilson and Brian Moncur and Tony R. Martinez
  • Abstract: A relaxation network model that includes higher order weight connections is introduced. To demonstrate its utility, the model is applied to the speech recognition domain. Traditional speech recognition systems typically consider only that context preceding the word to be recognized. However, intuition suggests that considering following context as well as preceding context should improve recognition accuracy. The work described here tests this hypothesis by applying the higher order relaxation network to consider both precedes and follows context in a speech recognition task. The results demonstrate both the general utility of the higher order relaxation network and its improvement over traditional methods on a speech recognition task.
  • Reference: In Proceedings of the International Joint Conference on Neural Networks (IJCNN’99), paper 2188, 1999.
  • BibTeX:
    @inproceedings{ventura.ijcnn99b,
    author = {Ventura, Dan and Wilson, D. Randall and Moncur, Brian and Martinez, Tony R.},
    title = {A Neural Model of Centered Tri-gram Speech Recognition},
    booktitle = {Proceedings of the International Joint Conference on Neural Networks ({IJCNN}'99), paper 2188},
    year = {1999},
    }
  • Download the file: pdf, ps

Artificial Associative Memory using Quantum Processes

  • Authors: Dan Ventura
  • Abstract: This paper discusses an approach to constructing an artificial quantum associative memory (QuAM). The QuAM makes use of two quantum computational algorithms, one for pattern storage and the other for pattern recall. The result is an exponential increase in the capacity of the memory when compared to traditional associative memories such as the Hopfield network. Further, the paper argues for considering pattern recall as a non-unitary process and demonstrates the utility of non-unitary operators for improving the pattern recall performance of the QuAM.
  • Reference: In Proceedings of the Joint Conference on Information Sciences, volume 2, pages 218–221, October 1998.
  • BibTeX:
    @inproceedings{ventura.jcis98,
    author = {Ventura, Dan},
    title = {Artificial Associative Memory using Quantum Processes},
    booktitle = {Proceedings of the Joint Conference on Information Sciences},
    volume = {2},
    pages = {218--221},
    month = {October},
    year = {1998},
    }
  • Download the file: ps, pdf

Quantum and Evolutionary Approaches to Computational Learning

  • Authors: Dan Ventura
  • Abstract: This dissertation presents two methods for attacking the problem of high dimensional spaces inherent in most computational learning problems. The first approach is a hybrid system for combining the thorough search capabilities of evolutionary computation with the speed and generalization of neural computation. This neural/evolutionary hybrid is utilized in three different settings: to address the problem of data acquisition for training a supervised learning system; as a learning optimization system; and as a system for developing neurocontrol. The second approach is the idea of quantum computational learning that overcomes the “curse of dimensionality” by taking advantage of the massive state space of quantum systems to process information in a way that is classically impossible. The quantum computational learning approach results in the development of a neuron with quantum mechanical properties, a quantum associative memory and a quantum computational learning system for inductive learning.
  • Reference: PhD thesis, Brigham Young University, Computer Science Department, August 1998.
  • BibTeX:
    @phdthesis{ventura.dissertation,
    author = {Ventura, Dan},
    title = {Quantum and Evolutionary Approaches to Computational Learning},
    school = {Brigham Young University},
    address = {Computer Science Department},
    month = {August},
    year = {1998},
    }
  • Download the file: ps

Quantum Associative Memory with Exponential Capacity

  • Authors: Dan Ventura and Tony R. Martinez
  • Abstract: Quantum computation uses microscopic quantum level effects to perform computational tasks and has produced results that in some cases are exponentially faster than their classical counterparts by taking advantage of quantum parallelism. The unique characteristics of quantum theory may also be used to create a quantum associative memory with a capacity exponential in the number of neurons. This paper covers necessary high-level quantum mechanical ideas and introduces a simple quantum associative memory. Further, it provides discussion, empirical results and directions for future work.
  • Reference: In Proceedings of the International Joint Conference on Neural Networks, pages 509–13, May 1998.
  • BibTeX:
    @inproceedings{ventura.ijcnn98a,
    author = {Ventura, Dan and Martinez, Tony R.},
    title = {Quantum Associative Memory with Exponential Capacity},
    booktitle = {Proceedings of the International Joint Conference on Neural Networks},
    pages = {509--13},
    month = {May},
    year = {1998},
    }
  • Download the file: ps, pdf

Optimal Control Using a Neural/Evolutionary Hybrid System

  • Authors: Dan Ventura and Tony R. Martinez
  • Abstract: One of the biggest hurdles to developing neurocontrollers is the difficulty in establishing good training data for the neural network. We propose a hybrid approach to the development of neurocontrollers that employs both evolutionary computation (EC) and neural networks (NN). EC is used to discover appropriate control actions for specific plant states. The survivors of the evolutionary process are used to construct a training set for the NN. The NN learns the training set, is able to generalize to new plant states, and is then used for neurocontrol. Thus the EC/NN approach combines the broad, parallel search of EC with the rapid execution and generalization of NN to produce a viable solution to the control problem. This paper presents the EC/NN hybrid and demonstrates its utility in developing a neurocontroller that demonstrates stability, generalization, and optimality.
  • Reference: In Proceedings of the International Joint Conference on Neural Networks, pages 1036–41, May 1998.
  • BibTeX:
    @inproceedings{ventura.ijcnn98b,
    author = {Ventura, Dan and Martinez, Tony R.},
    title = {Optimal Control Using a Neural/Evolutionary Hybrid System},
    booktitle = {Proceedings of the International Joint Conference on Neural Networks},
    pages = {1036--41},
    month = {May},
    year = {1998},
    }
  • Download the file: pdf, ps

Using Evolutionary Computation to Facilitate Development of Neurocontrol

  • Authors: Dan Ventura and Tony R. Martinez
  • Abstract: The field of neurocontrol, in which neural networks are used for control of complex systems, has many potential applications. One of the biggest hurdles to developing neurocontrollers is the difficulty in establishing good training data for the neural network. We propose a hybrid approach to the development of neurocontrollers that employs both evolutionary computation (EC) and neural networks (NN). The survivors of this evolutionary process are used to construct a training set for the NN. The NN learns the training set, is able to generalize to new system states, and is then used for neurocontrol. Thus the EC/NN approach combines the broad, parallel search of EC with the rapid execution and generalization of NN to produce a viable solution to the control problem. This paper presents the EC/NN hybrid and demonstrates its utility in developing a neurocontroller for the pole balancing problem.
  • Reference: In Proceedings of the International Workshop on Neural Networks and Neurocontrol, August 1997.
  • BibTeX:
    @inproceedings{ventura.sian97,
    author = {Ventura, Dan and Martinez, Tony R.},
    title = {Using Evolutionary Computation to Facilitate Development of Neurocontrol},
    booktitle = {Proceedings of the International Workshop on Neural Networks and Neurocontrol},
    month = {August},
    year = {1997},
    }
  • Download the file: pdf, ps

An Artificial Neuron with Quantum Mechanical Properties

  • Authors: Dan Ventura and Tony R. Martinez
  • Abstract: Quantum computation uses microscopic quantum level effects to perform computational tasks and has produced results that in some cases are exponentially faster than their classical counterparts. Choosing the best weights for a neural network is a time consuming problem that makes the harnessing of this quantum parallelism appealing. This paper briefly covers high-level quantum theory and introduces a model for a quantum neuron.
  • Reference: In Proceedings of the International Conference on Neural Networks and Genetic Algorithms, pages 482–485, 1997.
  • BibTeX:
    @inproceedings{ventura.icannga97,
    author = {Ventura, Dan and Martinez, Tony R.},
    title = {An Artificial Neuron with Quantum Mechanical Properties},
    booktitle = {Proceedings of the International Conference on Neural Networks and Genetic Algorithms},
    pages = {482--485},
    year = {1997},
    }
  • Download the file: pdf, ps

Concerning a General Framework for the Development of Intelligent Systems

  • Authors: Dan Ventura and Tony R. Martinez
  • Abstract: There exists on-going debate between Connectionism and Symbolism as to the nature of and approaches to cognition. Many viewpoints exist and various issues seen as important have been raised. This paper suggests that a combination of these methodologies will lead to a better overall model. The paper reviews and assimilates the opinions and viewpoints of these diverse fields and provides a cohesive list of issues thought to be critical to the modeling of intelligence. Further, this list results in a framework for the development of a general, unified theory of cognition.
  • Reference: In Proceedings of the IASTED International Conference on Artificial Intelligence, Expert Systems and Neural Networks, pages 44–47, 1996.
  • BibTeX:
    @inproceedings{ventura.iasted96,
    author = {Ventura, Dan and Martinez, Tony R.},
    title = {Concerning a General Framework for the Development of Intelligent Systems},
    booktitle = {Proceedings of the {IASTED} International Conference on Artificial Intelligence, Expert Systems and Neural Networks},
    pages = {44--47},
    year = {1996},
    }
  • Download the file: pdf, ps

Robust Optimization Using Training Set Evolution

  • Authors: Dan Ventura and Tony R. Martinez
  • Abstract: Training Set Evolution is an eclectic optimization technique that combines evolutionary computation (EC) with neural networks (NN). The synthesis of EC with NN provides both initial unsupervised random exploration of the solution space as well as supervised generalization on those initial solutions. An assimilation of a large amount of data obtained over many simulations provides encouraging empirical evidence for the robustness of Evolutionary Training Sets as an optimization technique for feedback and control problems.
  • Reference: In Proceedings of the International Conference on Neural Networks, pages 524–8, 1996.
  • BibTeX:
    @inproceedings{ventura.icnn96,
    author = {Ventura, Dan and Martinez, Tony R.},
    title = {Robust Optimization Using Training Set Evolution},
    booktitle = {Proceedings of the International Conference on Neural Networks},
    pages = {524--8},
    year = {1996},
    }
  • Download the file: ps, pdf

A General Evolutionary/Neural Hybrid Approach to Learning Optimization Problems

  • Authors: Dan Ventura and Tony R. Martinez
  • Abstract: A method combining the parallel search capabilities of Evolutionary Computation (EC) with the generalization of Neural Networks (NN) for solving learning optimization problems is presented. Assuming a fitness function for potential solutions can be found, EC can be used to explore the solution space, and the survivors of the evolution can be used as a training set for the NN which then generalizes over the entire space. Because the training set is generated by EC using a fitness function, this hybrid approach allows explicit control of training set quality.
  • Reference: In Proceedings of the World Congress on Neural Networks, pages 1091–5, 1996.
  • BibTeX:
    @inproceedings{ventura.wcnn96,
    author = {Ventura, Dan and Martinez, Tony R.},
    title = {A General Evolutionary/Neural Hybrid Approach to Learning Optimization Problems},
    booktitle = {Proceedings of the World Congress on Neural Networks},
    pages = {1091--5},
    year = {1996},
    }
  • Download the file: pdf, ps

On Discretization as a Preprocessing Step For Supervised Learning Models

  • Authors: Dan Ventura
  • Abstract: Many machine learning and neurally inspired algorithms are limited, at least in their pure form, to working with nominal data. However, for many real-world problems, some provision must be made to support processing of continuously valued data. BRACE, a paradigm for the discretization of continuously valued attributes, is introduced, and two algorithmic instantiations of this paradigm, VALLEY and SLICE, are presented. These methods are compared empirically with other discretization techniques on several real-world problems, and no algorithm clearly outperforms the others. Also, discretization as a preprocessing step is in many cases found to be inferior to direct handling of continuously valued data. These results suggest that machine learning algorithms should be designed to directly handle continuously valued data rather than relying on preprocessing or ad hoc techniques. To this end, statistical prototypes (SP/MSP) are developed and an empirical comparison with well-known learning algorithms is presented. Encouraging results demonstrate that statistical prototypes have the potential to handle continuously valued data well. However, at this point, they are not suited for handling nominally valued data, which is arguably at least as important as continuously valued data in learning real-world applications. Several areas of ongoing research that aim to provide this ability are presented.
  • Reference: Master’s thesis, Brigham Young University, Computer Science Department, April 1995.
  • BibTeX:
    @mastersthesis{ventura.thesis,
    author = {Ventura, Dan},
    title = {On Discretization as a Preprocessing Step For Supervised Learning Models},
    school = {Brigham Young University},
    address = {Computer Science Department},
    month = {April},
    year = {1995},
    }
  • Download the file: ps

Using Evolutionary Computation to Generate Training Set Data for Neural Networks

  • Authors: Dan Ventura and Tim L. Andersen and Tony R. Martinez
  • Abstract: Most neural networks require a set of training examples in order to attempt to approximate a problem function. For many real-world problems, however, such a set of examples is unavailable. Such a problem involving feedback optimization of a computer network routing system has motivated a general method of generating artificial training sets using evolutionary computation. This paper describes the method and demonstrates its utility by presenting promising results from applying it to an artificial problem similar to a real-world network routing optimization problem.
  • Reference: In Proceedings of the International Conference on Artificial Neural Networks and Genetic Algorithms, pages 468–471, 1995.
  • BibTeX:
    @inproceedings{ventura.icannga95,
    author = {Ventura, Dan and Andersen, Tim L. and Martinez, Tony R.},
    title = {Using Evolutionary Computation to Generate Training Set Data for Neural Networks},
    booktitle = {Proceedings of the International Conference on Artificial Neural Networks and Genetic Algorithms},
    pages = {468--471},
    year = {1995},
    }
  • Download the file: pdf, ps
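  • Illustrative sketch: the general recipe above, evolve candidate (input, output) pairs under a fitness function and hand the survivors to a learner as a training set, can be sketched as follows. A simple mutation hill-climber stands in for the evolutionary search and a least-squares polynomial fit stands in for the neural network; the fitness function and all names are illustrative.
    # Evolve training examples by fitness, then generalize over them with a learner.
    import numpy as np

    rng = np.random.default_rng(3)

    def fitness(x, y):
        """How good is output y for input x? (Unknown target here: y ~ sin(x).)"""
        return -abs(y - np.sin(x))

    # Evolutionary phase: for each input, mutate candidate outputs and keep the best.
    inputs = np.linspace(0, np.pi, 15)
    survivors = []
    for x in inputs:
        y = rng.normal()
        for _ in range(200):                        # simple mutation hill-climbing
            cand = y + rng.normal(scale=0.1)
            if fitness(x, cand) > fitness(x, y):
                y = cand
        survivors.append(y)

    # Generalization phase: fit a learner to the evolved training set
    # (a cubic polynomial stands in for the neural network used in the paper).
    coeffs = np.polyfit(inputs, survivors, deg=3)
    print("prediction at x=1.0:", np.polyval(coeffs, 1.0), "target:", np.sin(1.0))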

An Empirical Comparison of Discretization Models

  • Authors: Dan Ventura and Tony R. Martinez
  • Abstract: Many machine learning and neurally inspired algorithms are limited, at least in their pure form, to working with nominal data. However, for many real-world problems, some provision must be made to support processing of continuously valued data. This paper presents empirical results obtained by using six different discretization methods as preprocessors to three different supervised learners on several real-world problems. No discretization technique clearly outperforms the others. Also, discretization as a preprocessing step is in many cases found to be inferior to direct handling of continuously valued data. These results suggest that machine learning algorithms should be designed to directly handle continuously valued data rather than relying on preprocessing or ad hoc techniques.
  • Reference: In Proceedings of the 10th International Symposium on Computer and Information Sciences, pages 443–450, 1995.
  • BibTeX:
    @inproceedings{ventura.iscis95,
    author = {Ventura, Dan and Martinez, Tony R.},
    title = {An Empirical Comparison of Discretization Models},
    booktitle = {Proceedings of the 10th International Symposium on Computer and Information Sciences},
    pages = {443--450},
    year = {1995},
    }
  • Download the file: ps, pdf

Using Multiple Statistical Prototypes to Classify Continuously Valued Data

  • Authors: Dan Ventura and Tony R. Martinez
  • Abstract: Multiple Statistical Prototypes (MSP) is a modification of a standard minimum distance classification scheme that generates multiple prototypes per class using a modified greedy heuristic. Empirical comparison of MSP with other well-known learning algorithms shows MSP to be a robust algorithm that uses a very simple premise to produce good generalization and achieve parsimonious hypothesis representation.
  • Reference: In Proceedings of the International Symposium on Neuroinformatics and Neurocomputers, pages 238–245, 1995.
  • BibTeX:
    @inproceedings{ventura.isninc95,
    author = {Ventura, Dan and Martinez, Tony R.},
    title = {Using Multiple Statistical Prototypes to Classify Continuously Valued Data},
    booktitle = {Proceedings of the International Symposium on Neuroinformatics and Neurocomputers},
    pages = {238--245},
    year = {1995},
    }
  • Download the file: ps, pdf
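  • Illustrative sketch: the minimum-distance premise behind MSP, classify by the nearest class prototype, shown with hand-placed prototypes on toy data. MSP's greedy heuristic for deciding how many prototypes each class receives is not reproduced; the data and names are illustrative.
    # Minimum-distance (nearest-prototype) classification, the premise behind MSP.
    import numpy as np

    rng = np.random.default_rng(4)

    # Two-class toy data; class 1 is bimodal, so one prototype per class is too coarse.
    X0 = rng.normal(loc=[0, 0], scale=0.3, size=(30, 2))
    X1 = np.vstack([rng.normal(loc=[2, 0], scale=0.3, size=(15, 2)),
                    rng.normal(loc=[0, 2], scale=0.3, size=(15, 2))])

    # Prototypes: one mean for class 0, two means for class 1
    # (MSP would discover the number and placement greedily).
    prototypes = np.array([X0.mean(axis=0), X1[:15].mean(axis=0), X1[15:].mean(axis=0)])
    labels = np.array([0, 1, 1])

    def classify(x):
        return labels[np.linalg.norm(prototypes - x, axis=1).argmin()]

    print(classify(np.array([0.1, 1.8])))   # near the second class-1 cluster -> 1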

BRACE: A Paradigm for the Discretization of Continuously Valued Data

  • Authors: Dan Ventura and Tony R. Martinez
  • Abstract: Discretization of continuously valued data is a useful and necessary tool because many learning paradigms assume nominal data. A list of objectives for efficient and effective discretization is presented. A paradigm called BRACE (Boundary Ranking And Classification Evaluation) that attempts to meet the objectives is presented along with an algorithm that follows the paradigm. The paradigm meets many of the objectives, with potential for extension to meet the remainder. Empirical results have been promising. For these reasons BRACE has potential as an effective and efficient method for discretization of continuously valued data. A further advantage of BRACE is that it is general enough to be extended to other types of clustering/unsupervised learning.
  • Reference: In Proceedings of the Seventh Florida Artificial Intelligence Research Symposium, pages 117–121, 1994.
  • BibTeX:
    @inproceedings{ventura.flairs94,
    author = {Ventura, Dan and Martinez, Tony R.},
    title = {{BRACE}: A Paradigm for the Discretization of Continuously Valued Data},
    booktitle = {Proceedings of the Seventh Florida Artificial Intelligence Research Symposium},
    pages = {117--121},
    year = {1994},
    }
  • Download the file: pdf, ps
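  • Illustrative sketch: a much-simplified boundary-ranking step in the spirit of the BRACE/VALLEY idea, propose cut points at valleys of the attribute's histogram and rank them by how well the resulting intervals separate the classes. This is not the published algorithm; the purity criterion, data, and names are ours.
    # Boundary-ranking discretization in the spirit of BRACE/VALLEY (simplified).
    import numpy as np

    rng = np.random.default_rng(5)
    x = np.concatenate([rng.normal(0, 0.5, 100), rng.normal(3, 0.5, 100)])
    y = np.concatenate([np.zeros(100), np.ones(100)])         # class labels

    # Candidate boundaries: local minima ("valleys") of the attribute's histogram.
    counts, edges = np.histogram(x, bins=20)
    centers = (edges[:-1] + edges[1:]) / 2
    valleys = [centers[i] for i in range(1, 19)
               if counts[i] < counts[i - 1] and counts[i] <= counts[i + 1]]

    def purity(boundary):
        """Rank a single-boundary discretization by majority-class purity."""
        left, right = y[x < boundary], y[x >= boundary]
        def maj(c): return max(np.mean(c == 0), np.mean(c == 1)) if len(c) else 0.0
        return (len(left) * maj(left) + len(right) * maj(right)) / len(y)

    best = max(valleys, key=purity)
    print("chosen cut point:", round(best, 2))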

Generating Three Binary Addition Algorithms using Reinforcement Programming

  • Authors: Spencer White and Tony R. Martinez and George Rudolph
  • Reference: In Proceedings of ACMSE 2010, to appear, 2010.
  • BibTeX:
    @inproceedings{Spencer.ACMSE10,
    author = {White, Spencer and Martinez, Tony R. and Rudolph, George},
    title = {Generating Three Binary Addition Algorithms using Reinforcement Programming},
    booktitle = {Proceedings of ACMSE 2010},
    note = {to appear},
    year = {2010},
    }
  • Download the file: pdf

Generating a Novel Sort Algorithm using Reinforcement Programming

  • Authors: Spencer White and Tony R. Martinez and George Rudolph
  • Reference: In Proceedings of the Conference on Evolutionary Computation (CEC), to appear, 2010.
  • BibTeX:
    @inproceedings{Spencer.CEC10,
    author = {White, Spencer and Martinez, Tony R. and Rudolph, George},
    title = {Generating a Novel Sort Algorithm using Reinforcement Programming},
    booktitle = {Proceedings of the Conference on Evolutionary Computation (CEC)},
    note = {to appear},
    year = {2010},
    }
  • Download the file: pdf

Learning Multiple Correct Classifications from Incomplete Data using Weakened Implicit Negatives

  • Authors: Stephen Whiting and Dan Ventura
  • Abstract: Classification problems with output class overlap create difficulties for standard neural network approaches. We present a modification of a simple feedforward neural network that is capable of learning problems with output overlap, including problems exhibiting hierarchical class structures in the output. Our method of applying weakened implicit negatives to address overlap and ambiguity allows the algorithm to learn a large portion of the hierarchical structure from very incomplete data. Our results show an improvement of approximately 58% over a standard backpropagation network on the hierarchical problem.
  • Reference: In Proceedings of the International Joint Conference on Neural Networks, pages 2953–2958, July 2004.
  • BibTeX:
    @inproceedings{whiting.ijcnn04,
    author = {Whiting, Stephen and Ventura, Dan},
    title = {Learning Multiple Correct Classifications from Incomplete Data using Weakened Implicit Negatives},
    booktitle = {Proceedings of the International Joint Conference on Neural Networks},
    pages = {2953--2958},
    month = {July},
    year = {2004},
    }
  • Download the file: pdf

The General Inefficiency of Batch Training for Gradient Descent Learning

  • Authors: D. Randall Wilson and Tony R. Martinez
  • Abstract: Gradient descent training of neural networks can be done in either a batch or on-line manner. A widely held myth in the neural network community is that batch training is as fast or faster and/or more “correct” than on-line training because it supposedly uses a better approximation of the true gradient for its weight updates. This paper explains why batch training is almost always slower than on-line training, often orders of magnitude slower, especially on large training sets. The main reason is the ability of on-line training to follow curves in the error surface throughout each epoch, which allows it to safely use a larger learning rate and thus converge with fewer iterations through the training data. Empirical results on a large (20,000-instance) speech recognition task and on 26 other learning tasks demonstrate that convergence can be reached significantly faster using on-line training than batch training, with no apparent difference in accuracy.
  • Reference: Neural Networks, volume 16 (10), pages 1429–1451, Elsevier Science Ltd. Oxford, UK, UK, 2003.
  • BibTeX:
    @article{Wilson.nn03.batch,
    author = {Wilson, D. Randall and Martinez, Tony R.},
    title = {The General Inefficiency of Batch Training for Gradient Descent Learning},
    journal = {Neural Networks},
    volume = {16},
    number = {10},
    pages = {1429--1451},
    publisher = {Elsevier Science Ltd.},
    address = {Oxford, UK, UK},
    year = {2003},
    issn = {0893-6080},
    }
  • Download the file: ps, pdf
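  • Illustrative sketch: the batch versus on-line distinction comes down to when the weights are updated, once per epoch from the summed gradient versus once per example. The following is a minimal NumPy sketch on a toy linear model, a stand-in for the multilayer networks used in the paper; the learning rates and sizes are illustrative.
    # Batch vs. on-line (stochastic) gradient descent on a toy linear model.
    import numpy as np

    rng = np.random.default_rng(6)
    X = rng.normal(size=(1000, 5))
    true_w = rng.normal(size=5)
    y = X @ true_w + 0.01 * rng.normal(size=1000)

    def train(mode, lr, epochs=20):
        w = np.zeros(5)
        for _ in range(epochs):
            if mode == "batch":                 # one update per epoch, from the summed gradient
                w -= lr * (X.T @ (X @ w - y)) / len(X)
            else:                               # on-line: one update per example
                for i in rng.permutation(len(X)):
                    w -= lr * (X[i] @ w - y[i]) * X[i]
        return np.linalg.norm(w - true_w)

    print("batch   error:", train("batch",   lr=0.1))
    print("on-line error:", train("on-line", lr=0.01))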

The Need for Small Learning Rates on Large Problems

  • Authors: D. Randall Wilson and Tony R. Martinez
  • Abstract: In gradient descent learning algorithms such as error backpropagation, the learning rate parameter can have a significant effect on generalization accuracy. In particular, decreasing the learning rate below that which yields the fastest convergence can significantly improve generalization accuracy, especially on large, complex problems. The learning rate also directly affects training speed, but not necessarily in the way that many people expect. Many neural network practitioners currently attempt to use the largest learning rate that still allows for convergence, in order to improve training speed. However, a learning rate that is too large can be as slow as a learning rate that is too small, and a learning rate that is too large or too small can require orders of magnitude more training time than one that is in an appropriate range. This paper illustrates how the learning rate affects training speed and generalization accuracy, and thus gives guidelines on how to efficiently select a learning rate that maximizes generalization accuracy.
  • Reference: In Proceedings of the IEEE International Joint Conference on Neural Networks IJCNN’01, pages 115–119, 2001.
  • BibTeX:
    @inproceedings{wilson.ijcnn2001,
    author = {Wilson, D. Randall and Martinez, Tony R.},
    title = {The Need for Small Learning Rates on Large Problems},
    booktitle = {Proceedings of the {IEEE} International Joint Conference on Neural Networks {IJCNN}'01},
    pages = {115--119},
    year = {2001},
    }
  • Download the file: pdf

The Inefficiency of Batch Training for Large Training Sets

  • Authors: D. Randall Wilson and Tony R. Martinez
  • Abstract: Multilayer perceptrons are often trained using error backpropagation (BP). BP training can be done in either a batch or continuous manner. Claims have frequently been made that batch training is faster and/or more “correct” than continuous training because it uses a better approximation of the true gradient for its weight updates. These claims are often supported by empirical evidence on very small data sets. These claims are untrue, however, for large training sets. This paper explains why batch training is much slower than continuous training for large training sets. Various levels of semi-batch training used on a 20,000-instance speech recognition task show a roughly linear increase in training time required with an increase in batch size.
  • Reference: In Proceedings of the International Joint Conference on Neural Networks (IJCNN2000), volume II, pages 113–117, July 2000.
  • BibTeX:
    @inproceedings{wilson.ijcnn2000.batch,
    author = {Wilson, D. Randall and Martinez, Tony R.},
    title = {The Inefficiency of Batch Training for Large Training Sets},
    booktitle = {Proceedings of the International Joint Conference on Neural Networks ({IJCNN2000})},
    volume = {{II}},
    pages = {113--117},
    month = {July},
    year = {2000},
    }
  • Download the file: pdf, ps

Reduction Techniques for Exemplar-Based Learning Algorithms

  • Authors: D. Randall Wilson and Tony R. Martinez
  • Abstract: Exemplar-based learning algorithms are often faced with the problem of deciding which instances or other exemplars to store for use during generalization. Storing too many exemplars can result in large memory requirements and slow execution speed, and can cause an oversensitivity to noise. This paper has two main purposes. First, it provides a survey of existing algorithms used to reduce the number of exemplars retained in exemplar-based learning models. Second, it proposes six new reduction algorithms called DROP1-5 and DEL that can be used to prune instances from the concept description. These algorithms and 10 algorithms from the survey are compared on 31 datasets. Of those algorithms that provide substantial storage reduction, the DROP algorithms have the highest generalization accuracy in these experiments, especially in the presence of noise.
  • Reference: Machine Learning, volume 3, pages 257–286, March 2000.
  • BibTeX:
    @article{wilson.ml2000.drop,
    author = {Wilson, D. Randall and Martinez, Tony R.},
    title = {Reduction Techniques for Exemplar-Based Learning Algorithms},
    journal = {Machine Learning},
    volume = {3},
    pages = {257--286},
    month = {March},
    year = {2000},
    }
  • Download the file: ps, pdf
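  • Illustrative sketch: a simple decremental reduction pass, drop an instance if the instances kept so far still classify it correctly under a k-NN rule, shown on toy data. This is not any of the paper's DROP1-5 or DEL algorithms (which also consider each instance's associates); it only illustrates the kind of exemplar pruning being surveyed, and all names are illustrative.
    # A simple decremental instance-reduction pass (illustrative, not DROP1-5).
    import numpy as np

    rng = np.random.default_rng(7)
    X = np.vstack([rng.normal([0, 0], 0.5, (50, 2)), rng.normal([2, 2], 0.5, (50, 2))])
    y = np.array([0] * 50 + [1] * 50)

    def knn_label(Xref, yref, x, k=3):
        idx = np.argsort(np.linalg.norm(Xref - x, axis=1))[:k]
        return np.bincount(yref[idx]).argmax()

    keep = np.ones(len(X), bool)
    for i in range(len(X)):
        keep[i] = False                                   # tentatively drop instance i
        if knn_label(X[keep], y[keep], X[i]) != y[i]:
            keep[i] = True                                # still needed; put it back

    print("kept", keep.sum(), "of", len(X), "instances")
    print("probe near class 1:", knn_label(X[keep], y[keep], np.array([1.9, 2.1])))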

An Integrated Instance-Based Learning Algorithm

  • Authors: D. Randall Wilson and Tony R. Martinez
  • Abstract: The basic nearest-neighbor rule generalizes well in many domains but has several shortcomings, including inappropriate distance functions, large storage requirements, slow execution time, sensitivity to noise, and an inability to adjust its decision boundaries after storing the training data. This paper proposes methods for overcoming each of these weaknesses and combines these methods into a comprehensive learning system called the Integrated Decremental Instance-Based Learning Algorithm (IDIBL) that seeks to reduce storage, improve execution speed, and increase generalization accuracy, when compared to the basic nearest neighbor algorithm and other learning models. IDIBL tunes its own parameters using a new measure of fitness that combines confidence and cross-validation (CVC) accuracy in order to avoid discretization problems with more traditional leave-one-out cross-validation (LCV). In our experiments IDIBL achieves higher generalization accuracy than other less comprehensive instance-based learning algorithms, while requiring less than one fourth the storage of the nearest neighbor algorithm and improving execution speed by a corresponding factor. In experiments on 21 datasets, IDIBL also achieves higher generalization accuracy than those reported for 16 major machine learning and neural network models.
  • Reference: Computational Intelligence, volume 1, pages 1–28, 2000.
  • BibTeX:
    @article{wilson.ci2000.idibl,
    author = {Wilson, D. Randall and Martinez, Tony R.},
    title = {An Integrated Instance-Based Learning Algorithm},
    journal = {Computational Intelligence},
    volume = {1},
    pages = {1--28},
    year = {2000},
    }
  • Download the file: ps, pdf

Combining Cross-Validation and Confidence to Measure Fitness

  • Authors: D. Randall Wilson and Tony R. Martinez
  • Abstract: Neural network and machine learning algorithms often have parameters that must be tuned for good performance on a particular task. Leave-one-out cross-validation (LCV) accuracy is often used to measure the fitness of a set of parameter values. However, small changes in parameters often have no effect on LCV accuracy. Many learning algorithms can measure the confidence of a classification decision, but often confidence alone is an inappropriate measure of fitness. This paper proposes a combined measure of Cross-Validation and Confidence (CVC) for obtaining a continuous measure of fitness for sets of parameters in learning algorithms. This paper also proposes the Refined Instance-Based (RIB) learning algorithm which illustrates the use of CVC in automated parameter tuning. Using CVC provides significant improvement in generalization accuracy on a collection of 31 classification tasks when compared to using LCV.
  • Reference: In Proceedings of the International Joint Conference on Neural Networks (IJCNN’99), paper 163, 1999.
  • BibTeX:
    @inproceedings{wilson.ijcnn99.cvc,
    author = {Wilson, D. Randall and Martinez, Tony R.},
    title = {Combining Cross-Validation and Confidence to Measure Fitness},
    booktitle = {Proceedings of the International Joint Conference on Neural Networks ({IJCNN}'99), paper 163},
    year = {1999},
    }
  • Download the file: ps, pdf
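  • Illustrative sketch: one simple way to obtain a continuous fitness in the spirit of CVC is to combine leave-one-out accuracy with the average confidence assigned to each instance's correct class, so that parameter settings with identical LCV accuracy can still be distinguished. The weighting below is our simplification, not the paper's exact CVC definition; the distance-weighted k-NN and data are illustrative.
    # A continuous fitness mixing leave-one-out accuracy with classification confidence.
    import numpy as np

    rng = np.random.default_rng(8)
    X = np.vstack([rng.normal([0, 0], 0.7, (40, 2)), rng.normal([1.5, 1.5], 0.7, (40, 2))])
    y = np.array([0] * 40 + [1] * 40)

    def cvc_fitness(k):
        correct, conf = 0.0, 0.0
        for i in range(len(X)):
            d = np.linalg.norm(X - X[i], axis=1)
            d[i] = np.inf                                  # leave instance i out
            idx = np.argsort(d)[:k]
            w = 1.0 / (d[idx] + 1e-9)                      # distance-weighted votes
            votes = np.bincount(y[idx], weights=w, minlength=2)
            probs = votes / votes.sum()
            correct += probs.argmax() == y[i]
            conf += probs[y[i]]                            # confidence in the true class
        n = len(X)
        accuracy, avg_conf = correct / n, conf / n
        return (n * accuracy + avg_conf) / (n + 1)         # accuracy dominates; confidence breaks ties

    for k in (1, 3, 5, 9):
        print(f"k={k}: fitness={cvc_fitness(k):.4f}")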

The Robustness of Relaxation Rates in Constraint Satisfaction Networks

  • Authors: D. Randall Wilson and Dan Ventura and Brian Moncur and Tony R. Martinez
  • Abstract: Constraint satisfaction networks contain nodes that receive weighted evidence from external sources and/or other nodes. A relaxation process allows the activation of nodes to affect neighboring nodes, which in turn can affect their neighbors, allowing information to travel through a network. When doing discrete updates (as in a software implementation of a relaxation network), a goal net or goal activation can be computed in response to the net input into a node, and a relaxation rate can then be used to determine how fast the node moves from its current value to its goal value. An open question was whether or not the relaxation rate is a sensitive parameter. This paper shows that the relaxation rate has almost no effect on how information flows through the network as long as it is small enough to avoid large discrete steps and/or oscillation.
  • Reference: In Proceedings of the International Joint Conference on Neural Networks (IJCNN’99), paper 162, 1999.
  • BibTeX:
    @inproceedings{wilson.ijcnn99.relax,
    author = {Wilson, D. Randall and Ventura, Dan and Moncur, Brian and Martinez, Tony R.},
    title = {The Robustness of Relaxation Rates in Constraint Satisfaction Networks},
    booktitle = {Proceedings of the International Joint Conference on Neural Networks ({IJCNN}'99), paper 162},
    year = {1999},
    }
  • Download the file: pdf
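  • Illustrative sketch: a minimal discrete relaxation loop with a relaxation rate, in the spirit of the networks discussed above. The tanh squashing function and the toy setup are assumptions made for the example. A small rate gives a smooth trajectory; a rate near 1 takes large discrete steps and can oscillate, which is the regime the abstract flags.
    import numpy as np

    def relax(weights, external, rate=0.1, steps=200):
        """weights[i, j]: connection from node j to node i;
        external[i]: outside evidence fed into node i."""
        a = np.zeros(len(external))            # current activations
        for _ in range(steps):
            net = weights @ a + external       # weighted evidence arriving at each node
            goal = np.tanh(net)                # goal activation for that net input
            a += rate * (goal - a)             # move a fraction of the way toward the goal
        return a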

Advances in Instance-Based Learning Algorithms

  • Authors: D. Randall Wilson
  • Abstract: The nearest neighbor algorithm and its derivatives, which are often referred to collectively as instance-based learning algorithms, have been successful on a variety of real-world applications. However, in its basic form, the nearest neighbor algorithm suffers from inadequate distance functions, large storage requirements, slow execution speed, a sensitivity to noise and irrelevant attributes, and an inability to adjust its decision surfaces after storing the data. This dissertation presents a collection of papers that seek to overcome each of these disadvantages. The most successful enhancements are combined into a comprehensive system called the Integrated Decremental Instance-Based Learning algorithm, which in experiments on 44 applications achieves higher generalization accuracy than other instance-based learning algorithms. It also yields higher generalization accuracy than that reported for 16 major machine learning and neural network models.
  • Reference: PhD thesis, Brigham Young University, Computer Science Department, August 1997.
  • BibTeX:
    @phdthesis{wilson.phd97,
    author = {Wilson, D. Randall},
    title = {Advances in Instance-Based Learning Algorithms},
    school = {Brigham Young University},
    address = {Computer Science Department},
    month = {August},
    year = {1997},
    }
  • Download the file: ps, pdf

Improved Center Point Selection for Probabilistic Neural Networks

  • Authors: D. Randall Wilson and Tony R. Martinez
  • Abstract: Probabilistic Neural Networks (PNN) typically learn more quickly than many neural network models and have had success on a variety of applications. However, in their basic form, they tend to have a large number of hidden nodes. One common solution to this problem is to keep only a randomly-selected subset of the original training data in building the network. This paper presents an algorithm called the Reduced Probabilistic Neural Network (RPNN) that seeks to choose a better-than-random subset of the available instances to use as center points of nodes in the network. The algorithm tends to retain non-noisy border points while removing nodes with instances in regions of the input space that are highly homogeneous. In experiments on 22 datasets, the RPNN had better average generalization accuracy than two other PNN models, while requiring an average of less than one-third the number of nodes.
  • Reference: In Proceedings of the International Conference on Artificial Neural Networks and Genetic Algorithms (ICANNGA’97), pages 514–517, 1997.
  • BibTeX:
    @inproceedings{wilson.icannga97.rpnn,
    author = {Wilson, D. Randall and Martinez, Tony R.},
    title = {Improved Center Point Selection for Probabilistic Neural Networks},
    booktitle = {Proceedings of the International Conference on Artificial Neural Networks and Genetic Algorithms ({ICANNGA}'97)},
    pages = {514--517},
    year = {1997},
    }
  • Download the file: ps, pdf

Instance Pruning Techniques

  • Authors: D. Randall Wilson and Tony R. Martinez
  • Abstract: The nearest neighbor algorithm and its derivatives are often quite successful at learning a concept from a training set and providing good generalization on subsequent input vectors. However, these techniques often retain the entire training set in memory, resulting in large memory requirements and slow execution speed, as well as a sensitivity to noise. This paper provides a discussion of issues related to reducing the number of instances retained in memory while maintaining (and sometimes improving) generalization accuracy, and mentions algorithms other researchers have used to address this problem. It presents three intuitive noise-tolerant algorithms that can be used to prune instances from the training set. In experiments on 29 applications, the algorithm that achieves the highest reduction in storage also results in the highest generalization accuracy of the three methods.
  • Reference: In Fisher, D., editor, Machine Learning: Proceedings of the Fourteenth International Conference (ICML’97), pages 403–411, San Francisco, CA, 1997. Morgan Kaufmann Publishers.
  • BibTeX:
    @inproceedings{wilson.icml97.prune,
    author = {Wilson, D. Randall and Martinez, Tony R.},
    title = {Instance Pruning Techniques},
    editor = {Fisher, D.},
    booktitle = {Machine Learning: Proceedings of the Fourteenth International Conference ({ICML}'97)},
    pages = {403--411},
    publisher = {Morgan Kaufmann Publishers},
    address = {San Francisco, {CA}},
    year = {1997},
    }
  • Download the file: ps, pdf
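  • Illustrative sketch: a naive decremental pruning pass in the spirit of the reduction techniques discussed above; it is not one of the paper's three algorithms, just a simple greedy rule that drops an instance whenever doing so does not lower leave-one-out accuracy on the remaining instances.
    import numpy as np

    def knn_predict(X_ref, y_ref, x, k=3):
        d = ((X_ref - x) ** 2).sum(axis=1)
        nn = np.argsort(d)[:k]
        return np.bincount(y_ref[nn]).argmax()

    def loo_accuracy(X, y, k=3):
        hits = 0
        for i in range(len(X)):
            mask = np.arange(len(X)) != i
            hits += knn_predict(X[mask], y[mask], X[i], k) == y[i]
        return hits / len(X)

    def prune(X, y, k=3):
        keep = np.ones(len(X), dtype=bool)
        baseline = loo_accuracy(X, y, k)
        for i in range(len(X)):
            keep[i] = False                            # tentatively drop instance i
            acc = loo_accuracy(X[keep], y[keep], k)
            if acc >= baseline:
                baseline = acc                         # the drop did not hurt; make it permanent
            else:
                keep[i] = True                         # the drop hurt; restore the instance
        return X[keep], y[keep]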

Bias and the Probability of Generalization

  • Authors: D. Randall Wilson and Tony R. Martinez
  • Abstract: In order to be useful, a learning algorithm must be able to generalize well when faced with inputs not previously presented to the system. A bias is necessary for any generalization, and as shown by several researchers in recent years, no bias can lead to strictly better generalization than any other when summed over all possible functions or applications. This paper provides examples to illustrate this fact, but also explains how a bias or learning algorithm can be better than another in practice when the probability of the occurrence of functions is taken into account. It shows how domain knowledge and an understanding of the conditions under which each learning algorithm performs well can be used to increase the probability of accurate generalization, and identifies several of the conditions that should be considered when attempting to select an appropriate bias for a particular problem.
  • Reference: In Proceedings of the 1997 International Conference on Intelligent Information Systems (IIS’97), pages 108–114, 1997.
  • BibTeX:
    @inproceedings{wilson.iis97.bias,
    author = {Wilson, D. Randall and Martinez, Tony R.},
    title = {Bias and the Probability of Generalization},
    booktitle = {Proceedings of the 1997 International Conference on Intelligent Information Systems ({IIS}'97)},
    pages = {108--114},
    year = {1997},
    }
  • Download the file: ps, pdf

Improved Heterogeneous Distance Functions

  • Authors: D. Randall Wilson and Tony R. Martinez
  • Abstract: Nearest neighbor and instance-based learning techniques typically handle continuous and linear input values well, but often do not handle nominal input attributes appropriately. The Value Difference Metric (VDM) was designed to find reasonable distance values between nominal attribute values, but it largely ignores continuous attributes, requiring discretization to map continuous values into nominal values. This paper proposes three new heterogeneous distance functions, called the Heterogeneous Value Difference Metric (HVDM), Interpolated Value Difference Metric (IVDM) and the Windowed Value Difference Metric (WVDM). These new distance functions are designed to handle applications with nominal attributes, continuous attributes, or both. In experiments on 48 applications the new distance metrics achieve higher average classification accuracy than previous distance functions on applications with both nominal and continuous attributes.
  • Reference: Journal of Artificial Intelligence Research (JAIR), volume 1, pages 1–34, 1997.
  • BibTeX:
    @article{wilson.jair97.hvdm,
    author = {Wilson, D. Randall and Martinez, Tony R.},
    title = {Improved Heterogeneous Distance Functions},
    journal = {Journal of Artificial Intelligence Research ({JAIR})},
    volume = {1},
    pages = {1--34},
    year = {1997},
    }
  • Download the file: ps, pdf
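  • Illustrative sketch: a compact HVDM-style heterogeneous distance. Continuous attributes contribute a difference scaled by the attribute's standard deviation, and nominal attributes contribute a value-difference term built from per-class conditional probabilities; treat the exact normalization constants here as assumptions rather than the paper's final definition. A nearest-neighbor classifier can then call hvdm(x1, x2) in place of Euclidean distance.
    import numpy as np

    def hvdm_factory(X, y, nominal):
        """X: 2-D array (nominal columns hold integer codes), y: integer class labels,
        nominal: list of bools marking which columns are nominal."""
        classes = np.unique(y)
        sigma = X.std(axis=0)
        tables = {}                                    # P(class | attribute value) per nominal attribute
        for a, is_nom in enumerate(nominal):
            if is_nom:
                tables[a] = {v: np.array([np.mean(y[X[:, a] == v] == c) for c in classes])
                             for v in np.unique(X[:, a])}

        def hvdm(x1, x2):
            total = 0.0
            for a, is_nom in enumerate(nominal):
                if is_nom:
                    p1, p2 = tables[a].get(x1[a]), tables[a].get(x2[a])
                    d = 1.0 if p1 is None or p2 is None else np.sqrt(((p1 - p2) ** 2).sum())
                else:
                    d = abs(x1[a] - x2[a]) / (4 * sigma[a]) if sigma[a] > 0 else 0.0
                total += d * d
            return np.sqrt(total)

        return hvdm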

Real-Valued Schemata Search Using Statistical Confidence

  • Authors: D. Randall Wilson and Tony R. Martinez
  • Abstract: Many neural network models must be trained by finding a set of real-valued weights that yield high accuracy on a training set. Other learning models require weights on input attributes that yield high leave-one-out classification accuracy in order to avoid problems associated with irrelevant attributes and high dimensionality. In addition, there are a variety of general problems for which a set of real values must be found which maximize some evaluation function. This paper presents an algorithm for doing a schemata search over a real-valued weight space to find a set of weights (or other real values) for a given evaluation function. The algorithm, called the Real-Valued Schemata Search (RVSS), uses the BRACE statistical technique [Moore & Lee, 1993] to determine when to narrow the search space. This paper details the RVSS approach and gives initial empirical results.
  • Reference: In Proceedings of the 1997 Sian Ka’an International Workshop on Neural Networks and Neurocontrol, 1997.
  • BibTeX:
    @inproceedings{wilson.sian97.schema,
    author = {Wilson, D. Randall and Martinez, Tony R.},
    title = {Real-Valued Schemata Search Using Statistical Confidence},
    booktitle = {Proceedings of the 1997 Sian Ka'an International Workshop on Neural Networks and Neurocontrol},
    year = {1997},
    }
  • Download the file: ps, pdf

Instance-Based Learning with Genetically Derived Attribute Weights

  • Authors: D. Randall Wilson and Tony R. Martinez
  • Abstract: This paper presents an inductive learning system called the Genetic Instance-Based Learning (GIBL) system. This system combines instance-based learning approaches with evolutionary computation in order to achieve high accuracy in the presence of irrelevant or redundant attributes. Evolutionary computation is used to find a set of attribute weights that yields a high estimate of classification accuracy. Results of experiments on 16 data sets are shown, and are compared with a non-weighted version of the instance-based learning system. The results indicate that the generalization accuracy of GIBL is somewhat higher than that of the non-weighted system on regular data, and is significantly higher on data with irrelevant or redundant attributes.
  • Reference: In Proceedings of the International Conference on Artificial Intelligence, Expert Systems, and Neural Networks (AIE’96), pages 11–14, August 1996.
  • BibTeX:
    @inproceedings{wilson.aie96.gibl,
    author = {Wilson, D. Randall and Martinez, Tony R.},
    title = {Instance-Based Learning with Genetically Derived Attribute Weights},
    booktitle = {Proceedings of the International Conference on Artificial Intelligence, Expert Systems, and Neural Networks ({AIE}'96)},
    pages = {11--14},
    month = {August},
    year = {1996},
    }
  • Download the file: ps, pdf
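  • Illustrative sketch: the core GIBL loop reduced to its simplest form, searching for attribute weights that maximize leave-one-out nearest-neighbor accuracy. The paper uses a genetic algorithm; the (1+1) mutation-only search below is a stand-in chosen only to keep the example short.
    import numpy as np

    def loo_accuracy(X, y, w):
        hits = 0
        for i in range(len(X)):
            d = ((w * (X - X[i])) ** 2).sum(axis=1)    # attribute-weighted distance
            d[i] = np.inf                              # leave instance i out
            hits += y[d.argmin()] == y[i]
        return hits / len(X)

    def evolve_weights(X, y, generations=200, seed=0):
        rng = np.random.default_rng(seed)
        w = np.ones(X.shape[1])
        best = loo_accuracy(X, y, w)
        for _ in range(generations):
            cand = np.clip(w + rng.normal(0.0, 0.2, size=w.shape), 0.0, None)
            fit = loo_accuracy(X, y, cand)
            if fit >= best:                            # irrelevant attributes should drift toward weight 0
                w, best = cand, fit
        return w, best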

Value Difference Metrics for Continuously Valued Attributes

  • Authors: D. Randall Wilson and Tony R. Martinez
  • Abstract: Nearest neighbor and instance-based learning techniques typically handle continuous and linear input values well, but often do not handle symbolic input attributes appropriately. The Value Difference Metric (VDM) was designed to find reasonable distance values between symbolic attribute values, but it largely ignores continuous attributes, using discretization to map continuous values into symbolic values. This paper presents two heterogeneous distance metrics, called the Interpolated VDM (IVDM) and Windowed VDM (WVDM), that extend the Value Difference Metric to handle continuous attributes more appropriately. In experiments on 21 data sets the new distance metrics achieve higher classification accuracy in most cases involving continuous attributes.
  • Reference: In Proceedings of the International Conference on Artificial Intelligence, Expert Systems, and Neural Networks (AIE’96), pages 74–78, August 1996.
  • BibTeX:
    @inproceedings{wilson.aie96.ivdm,
    author = {Wilson, D. Randall and Martinez, Tony R.},
    title = {Value Difference Metrics for Continuously Valued Attributes},
    booktitle = {Proceedings of the International Conference on Artificial Intelligence, Expert Systems, and Neural Networks ({AIE}'96)},
    pages = {74--78},
    month = {August},
    year = {1996},
    }
  • Download the file: ps, pdf

Heterogeneous Radial Basis Function Networks

  • Authors: D. Randall Wilson and Tony R. Martinez
  • Abstract: Radial Basis Function (RBF) networks typically use a distance function designed for numeric attributes, such as Euclidean or city-block distance. This paper presents a heterogeneous distance function which is appropriate for applications with symbolic attributes, numeric attributes, or both. Empirical results on 30 data sets indicate that the heterogeneous distance metric yields significantly improved generalization accuracy over Euclidean distance in most cases involving symbolic attributes.
  • Reference: In Proceedings of the International Conference on Neural Networks (ICNN’96), volume 2, pages 1263–1267, June 1996.
  • BibTeX:
    @inproceedings{wilson.icnn96.hrbf,
    author = {Wilson, D. Randall and Martinez, Tony R.},
    title = {Heterogeneous Radial Basis Function Networks},
    booktitle = {Proceedings of the International Conference on Neural Networks ({ICNN}'96)},
    volume = {2},
    pages = {1263--1267},
    month = {June},
    year = {1996},
    }
  • Download the file: pdf, ps

Prototype Styles of Generalization

  • Authors: D. Randall Wilson
  • Abstract: A learning system can generalize from training set data in many ways. No single style of generalization is likely to solve all problems better than any other style, and different styles work better on some applications than others. Several generalization styles are proposed, including distance metrics, first-order features, voting schemes, and confidence levels. These generalization styles are efficient in terms of time and space and lend themselves to massively parallel architectures. Empirical results of using these generalization styles on several real-world applications are presented. These results indicate that the prototype styles of generalization presented can provide accurate generalization for many applications, and that having several styles of generalization available often permits one to be selected that works well for a particular application.
  • Reference: Master’s thesis, Brigham Young University, August 1994.
  • BibTeX:
    @mastersthesis{wilson.thesis94,
    author = {Wilson, D. Randall},
    title = {Prototype Styles of Generalization},
    school = {Brigham Young University},
    month = {August},
    year = {1994},
    }
  • Download the file: pdf, ps

The Potential of Prototype Styles of Generalization

  • Authors: D. Randall Wilson and Tony R. Martinez
  • Abstract: There are many ways for a learning system to generalize from training set data. This paper presents several generalization styles using prototypes in an attempt to provide accurate generalization on training set data for a wide variety of applications. These generalization styles are efficient in terms of time and space, and lend themselves well to massively parallel architectures. Empirical results of generalizing on several real-world applications are given, and these results indicate that the prototype styles of generalization presented have potential to provide accurate generalization for many applications.
  • Reference: In The Sixth Australian Joint Conference on Artificial Intelligence (AI ’93), pages 356–361, November 1993.
  • BibTeX:
    @inproceedings{wilson.ai93.proto,
    author = {Wilson, D. Randall and Martinez, Tony R.},
    title = {The Potential of Prototype Styles of Generalization},
    booktitle = {The Sixth Australian Joint Conference on Artificial Intelligence ({AI} '93)},
    pages = {356--361},
    month = {November},
    year = {1993},
    }
  • Download the file: ps, pdf

The Importance of Using Multiple Styles of Generalization

  • Authors: D. Randall Wilson and Tony R. Martinez
  • Abstract: There are many ways for a learning system to generalize from training set data. There is likely no one style of generalization which will solve all problems better than any other style, for different styles will work better on some applications than others. This paper presents several styles of generalization and uses them to suggest that a collection of such styles can provide more accurate generalization than any one style by itself. Empirical results of generalizing on several real-world applications are given, and comparisons are made on the generalization accuracy of each style of generalization. The empirical results support the hypothesis that using multiple generalization styles can improve generalization accuracy.
  • Reference: In Proceedings of the First New Zealand International Conference on Artificial Neural Networks and Expert Systems (ANNES), pages 54–57, November 1993.
  • BibTeX:
    @inproceedings{wilson.annes93.mult,
    author = {Wilson, D. Randall and Martinez, Tony R.},
    title = {The Importance of Using Multiple Styles of Generalization},
    booktitle = {Proceedings of the First New Zealand International Conference on Artificial Neural Networks and Expert Systems ({ANNES})},
    pages = {54--57},
    month = {November},
    year = {1993},
    }
  • Download the file: pdf, ps

A Comprehensive Case Study: An Examination of Connectionist and Machine Learning Algorithms

  • Authors: Frederick Zarndt
  • Abstract: This case study examines the performance of 16 well-known and widely-used inductive learning algorithms on a comprehensive collection (102) of learning problems. It is distinguished from other, similar studies by the number of learning models used, the number of problems examined, and the rigor with which it has been conducted. The results of future case studies, which are done with the method and tools presented here, with learning problems used in this study, and with the same rigor, should be readily reproducible and directly comparable to the results of this study. An extensive set of tools is offered.
  • Reference: Master’s thesis, Brigham Young University, June 1995.
  • BibTeX:
    @mastersthesis{Zarndt.thesis95,
    author = {Zarndt, Frederick},
    title = {A Comprehensive Case Study: An Examination of Connectionist and Machine Learning Algorithms},
    school = {Brigham Young University},
    month = {June},
    year = {1995},
    }
  • Download the file: pdf

Using Decision Trees and Soft Labeling to Filter Mislabeled Data

  • Authors: Xinchuan Zeng and Tony R. Martinez
  • Reference: Journal of Intelligent Systems, volume 17 (4), pages 331–354, 2008.
  • BibTeX:
    @article{Zeng.JIS,
    author = {Zeng, Xinchuan and Martinez, Tony R.},
    title = {Using Decision Trees and Soft Labeling to Filter Mislabeled Data},
    journal = {Journal of Intelligent Systems},
    volume = {17},
    number = {4},
    pages = {331--354},
    year = {2008},
    }
  • Download the file: pdf

Feature Weighting Using Neural Networks

  • Authors: Xinchuan Zeng and Tony R. Martinez
  • Abstract: In this work we propose a feature weighting method for classification tasks by extracting relevant information from a trained neural network. This method weights an attribute based on strengths (weights) of related links in the neural network, in which an important feature is typically connected to strong links and has more impact on the outputs. This method is applied to feature weighting for the nearest neighbor classifier and is tested on 15 real-world classification tasks. The results show that it can improve the nearest neighbor classifier on 14 of the 15 tested tasks, and also outperforms the neural network on 9 tasks.
  • Reference: In Proceedings of the IEEE International Joint Conference on Neural Networks IJCNN’04, pages 1327–1330, 2004.
  • BibTeX:
    @inproceedings{zeng_martinez_ijcnn04,
    author = {Zeng, Xinchuan and Martinez, Tony R.},
    title = {Feature Weighting Using Neural Networks},
    booktitle = {Proceedings of the {IEEE} International Joint Conference on Neural Networks {IJCNN}'04},
    pages = {1327--1330},
    year = {2004},
    }
  • Download the file: pdf
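  • Illustrative sketch: one plausible reading of "weighting an attribute by the strengths of its related links", using a single-hidden-layer network. The aggregation rule (summing absolute products of input-to-hidden and hidden-to-output weights along each path) is an assumption, not the paper's formula.
    import numpy as np

    def feature_weights(W_in, W_out):
        """W_in: (n_hidden, n_features) input-to-hidden weights of a trained network;
        W_out: (n_classes, n_hidden) hidden-to-output weights."""
        # strength of feature j = total |path weight| summed over all hidden and output units
        return (np.abs(W_out) @ np.abs(W_in)).sum(axis=0)

    def weighted_nn_predict(X_train, y_train, x, w):
        d = ((w * (X_train - x)) ** 2).sum(axis=1)     # feature-weighted distance
        return y_train[d.argmin()]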

A Noise Filtering Method Using Neural Networks

  • Authors: Xinchuan Zeng and Tony R. Martinez
  • Abstract: During the data collecting and labeling process it is possible for noise to be introduced into a data set. As a result, the quality of the data set degrades and experiments and inferences derived from the data set become less reliable. In this paper we present an algorithm, called ANR (automatic noise reduction), as a filtering mechanism to identify and remove noisy data items whose classes have been mislabeled. The underlying mechanism behind ANR is based on a framework of multi-layer artificial neural networks. ANR assigns each data item a soft class label in the form of a class probability vector, which is initialized to the original class label and can be modified during training. When the noise level is reasonably small (<30%), the non-noisy data is dominant in determining the network architecture and its output, and thus a mechanism for correcting mislabeled data can be provided by aligning the class probability vector with the network output. With a learning procedure for the class probability vector based on its difference from the network output, the probability of a mislabeled class gradually becomes smaller while that of the correct class becomes larger, which eventually causes a correction of mislabeled data after sufficient training. After training, those data items whose classes have been relabeled are then treated as noisy data and removed from the data set. We evaluate the performance of ANR on 12 data sets drawn from the UCI data repository. The results show that ANR is capable of identifying a significant portion of noisy data. An average increase in accuracy of 24.5% can be achieved at a noise level of 25% by using ANR as a training data filter for a nearest neighbor classifier, as compared to the same classifier without ANR filtering.
  • Reference: In Proceedings of the International Workshop of Soft Computing Techniques in Instrumentation, Measurement and Related Applications, 2003.
  • BibTeX:
    @inproceedings{zeng.scima2003,
    author = {Zeng, Xinchuan and Martinez, Tony R.},
    title = {A Noise Filtering Method Using Neural Networks},
    booktitle = {Proceedings of the International Workshop of Soft Computing Techniques in Instrumentation, Measurement and Related Applications},
    year = {2003},
    }
  • Download the file: ps
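  • Illustrative sketch: the soft-labeling loop behind ANR reduced to a toy setting. A plain softmax classifier stands in for the multi-layer network, and the label-update rule (nudging each class probability vector toward the model's output) follows the abstract; the learning rates and stopping rule are assumptions.
    import numpy as np

    def softmax(z):
        z = z - z.max(axis=1, keepdims=True)
        e = np.exp(z)
        return e / e.sum(axis=1, keepdims=True)

    def anr_filter(X, y, n_classes, epochs=200, lr=0.5, label_lr=0.05):
        n = len(X)
        W = np.zeros((X.shape[1], n_classes))
        P = np.eye(n_classes)[y].astype(float)         # soft labels, initialized to the given labels
        for _ in range(epochs):
            out = softmax(X @ W)
            W -= lr * X.T @ (out - P) / n              # fit the model to the current soft labels
            P += label_lr * (out - P)                  # nudge the soft labels toward the model output
            P = np.clip(P, 1e-6, None)
            P /= P.sum(axis=1, keepdims=True)
        keep = P.argmax(axis=1) == y                   # items whose label flipped are treated as noise
        return keep                                    # boolean mask of instances to retain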

Optimization by Varied Beam Search in Hopfield Networks

  • Authors: Xinchuan Zeng and Tony R. Martinez
  • Abstract: This paper shows that the performance of the Hopfield network for solving optimization problems can be improved by a varied beam search algorithm. The algorithm varies the beam search size and beam intensity during the network relaxation process. It consists of two stages: increasing the beam search parameters in the first stage and then decreasing them in the second stage. The purpose of using such a scheme is to provide the network with a better chance to find more and better solutions. A large number of simulation results based on 200 randomly generated city distributions of the 10-city traveling salesman problem demonstrated that it is capable of increasing the percentage of valid tours by 28.3% and reducing the error rate by 40.8%, compared to the original Hopfield network.
  • Reference: In Proceedings of the IEEE International Joint Conference on Neural Networks, pages 913–918, 2002.
  • BibTeX:
    @inproceedings{zeng.ijcnn2002,
    author = {Zeng, Xinchuan and Martinez, Tony R.},
    title = {Optimization by Varied Beam Search in Hopfield Networks},
    booktitle = {Proceedings of the {IEEE} International Joint Conference on Neural Networks},
    pages = {913--918},
    year = {2002},
    }
  • Download the file: ps

Graded Rescaling in Hopfield Networks

  • Authors: Xinchuan Zeng and Tony R. Martinez
  • Abstract: In this work we propose a method with the capability of improving the performance of the Hopfield network for solving optimization problems by using a graded rescaling scheme on the distance matrix of the energy function. This method controls the magnitude of rescaling by adjusting a parameter (scaling factor) in order to explore the optimal range for performance. We have evaluated different scaling factors through 20,000 simulations, based on 200 randomly generated city distributions of the 10-city traveling salesman problem. The results show that the graded rescaling can improve the performance significantly for a wide range of scaling factors. It increases the percentage of valid tours by 72.2%, reduces the error rate of tour length by 10.2%, and increases the chance of finding optimal tours by 39.0%, as compared to the original Hopfield network without rescaling.
  • Reference: In Proceedings of the International Conference on Artificial Neural Networks and Genetic Algorithms, pages 63–67, 2001.
  • BibTeX:
    @inproceedings{zeng.icannga2001,
    author = {Zeng, Xinchuan and Martinez, Tony R.},
    title = {Graded Rescaling in Hopfield Networks},
    booktitle = {Proceedings of the International Conference on Artificial Neural Networks and Genetic Algorithms},
    pages = {63--67},
    year = {2001},
    }
  • Download the file: ps

An Algorithm for Correcting Mislabeled Data

  • Authors: Xinchuan Zeng and Tony R. Martinez
  • Abstract: Reliable evaluation for the performance of classifiers depends on the quality of the data sets on which they are tested. During the collecting and recording of a data set, however, some noise may be introduced into the data, especially in various real-world environments, which can degrade the quality of the data set. In this paper, we present a novel approach, called ADE (automatic data enhancement), to correct mislabeled data in a data set. In addition to using multi-layer neural networks trained by backpropagation as the basic framework, ADE assigns each training pattern a class probability vector as its class label, in which each component represents the probability of the corresponding class. During training, ADE constantly updates the probability vector based on its difference from the output of the network. With this updating rule, the probability of a mislabeled class gradually becomes smaller while that of the correct class becomes larger, which eventually causes the correction of mislabeled data after a number of training epochs. We have tested ADE on a number of data sets drawn from the UCI data repository for nearest neighbor classifiers. The results show that for most data sets, when there exists mislabeled data, a classifier constructed using a training set corrected by ADE can achieve significantly higher accuracy than that without using ADE.
  • Reference: Intelligent Data Analysis, volume 5, pages 491–502, 2001.
  • BibTeX:
    @article{zeng.ida2001,
    author = {Zeng, Xinchuan and Martinez, Tony R.},
    title = {An Algorithm for Correcting Mislabeled Data},
    journal = {Intelligent Data Analysis},
    volume = {5},
    pages = {491--502},
    year = {2001},
    }
  • Download the file: pdf

Improving the Hopfield Network through Beam Search

  • Authors: Xinchuan Zeng and Tony R. Martinez
  • Abstract: In this paper we propose a beam search mechanism to improve the performance of the Hopfield network for solving optimization problems. The beam search readjusts the top M (M > 1) activated neurons to more similar activation levels in the early phase of relaxation, so that the network has the opportunity to explore more alternative, potentially better solutions. We evaluated this approach using a large number of simulations (20,000 for each parameter setting), based on 200 randomly generated city distributions of the 10-city traveling salesman problem. The results show that the beam search has the capability of significantly improving the network performance over the original Hopfield network, increasing the percentage of valid tours by 17.0% and reducing error rate by 24.3%.
  • Reference: In Proceedings of the IEEE International Joint Conference on Neural Networks, pages 1162–1167, 2001.
  • BibTeX:
    @inproceedings{zeng.ijcnn2001,
    author = {Zeng, Xinchuan and Martinez, Tony R.},
    title = {Improving the Hopfield Network through Beam Search},
    booktitle = {Proceedings of the {IEEE} International Joint Conference on Neural Networks},
    pages = {1162--1167},
    year = {2001},
    }
  • Download the file: ps
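  • Illustrative sketch: the readjustment step described above, applied to a vector of neuron activations early in relaxation. Pulling the top M activations toward their common mean keeps several candidate assignments "in the beam"; the blending factor is an assumption.
    import numpy as np

    def beam_adjust(activations, M=3, blend=0.5):
        """Return a copy with the top M activations moved toward their mean."""
        a = np.asarray(activations, dtype=float).copy()
        top = np.argsort(a)[-M:]                       # indices of the M most activated neurons
        a[top] = (1.0 - blend) * a[top] + blend * a[top].mean()
        return a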

Rescaling the Energy Function in the Hopfield Network

  • Authors: Xinchuan Zeng and Tony R. Martinez
  • Abstract: In this paper we propose an approach that rescales the distance matrix of the energy function in the Hopfield network for solving optimization problems. We rescale the distance matrix by normalizing each row in the matrix and then adjusting the parameter for the distance term. This scheme has the capability of reducing the effects of clustering in data distributions, which is one of the main reasons for the formation of invalid solutions. We evaluate this approach through a large number (20,000) of simulations based on 200 randomly generated city distributions of the 10-city traveling salesman problem. The result shows that, compared to those using the original Hopfield network, rescaling is capable of increasing the percentage of valid tours by 17.6%, reducing the error rate of tour length by 11.9%, and increasing the chance of finding optimal tours by 14.3%.
  • Reference: In Proceedings of the IEEE International Joint Conference on Neural Networks, volume 6, pages 498–502, 2000.
  • BibTeX:
    @inproceedings{zeng.ijcnn2000,
    author = {Zeng, Xinchuan and Martinez, Tony R.},
    title = {Rescaling the Energy Function in the Hopfield Network},
    booktitle = {Proceedings of the {IEEE} International Joint Conference on Neural Networks},
    volume = {6},
    pages = {498--502},
    year = {2000},
    }
  • Download the file: ps

Distribution-Balanced Stratified Cross-Validation for Accuracy Estimation

  • Authors: Xinchuan Zeng and Tony R. Martinez
  • Abstract: Cross-validation has often been applied in machine learning research for estimating the accuracies of classifiers. In this work, we propose an extension to this method, called distribution-balanced stratified cross-validation (DBSCV), which improves the estimation quality by providing balanced intraclass distributions when partitioning a data set into multiple folds. We have tested DBSCV on nine real-world and three artificial domains using the C4.5 decision trees classifier. The results show that DBSCV performs better (has smaller biases) than the regular stratified cross-validation in most cases, especially when the number of folds is small. The analysis and experiments based on three artificial data sets also reveal that DBSCV is particularly effective when multiple intraclass clusters exist in a data set.
  • Reference: Journal of Experimental and Theoretical Artificial Intelligence, volume 12, pages 1–12, 2000.
  • BibTeX:
    @article{zeng.jetai2000,
    author = {Zeng, Xinchuan and Martinez, Tony R.},
    title = {Distribution-Balanced Stratified Cross-Validation for Accuracy Estimation},
    journal = {Journal of Experimental and Theoretical Artificial Intelligence},
    volume = {12},
    pages = {1--12},
    year = {2000},
    }
  • Download the file: ps
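  • Illustrative sketch: one way to realize "balanced intraclass distributions" when assigning instances to folds. Each class's instances are dealt to consecutive folds in nearest-neighbor order, so nearby same-class instances land in different folds; the exact ordering rule is an assumption about the paper's procedure.
    import numpy as np

    def dbscv_folds(X, y, n_folds=10):
        folds = np.empty(len(X), dtype=int)
        for c in np.unique(y):
            idx = list(np.where(y == c)[0])
            current = idx.pop(0)
            fold = 0
            folds[current] = fold
            while idx:
                d = ((X[idx] - X[current]) ** 2).sum(axis=1)   # walk to the nearest remaining
                current = idx.pop(int(d.argmin()))             # same-class instance ...
                fold = (fold + 1) % n_folds                    # ... and deal it to the next fold
                folds[current] = fold
        return folds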

Using a Neural Network to Approximate an Ensemble of Classifiers

  • Authors: Xinchuan Zeng and Tony R. Martinez
  • Abstract: Several methods (e.g., Bagging, Boosting) of constructing and combining an ensemble of classifiers have recently been shown capable of improving accuracy of a class of commonly used classifiers (e.g., decision trees, neural networks). The accuracy gain achieved, however, is at the expense of a higher requirement for storage and computation. This storage and computation overhead can decrease the utility of these methods when applied to real-world situations. In this paper, we propose a learning approach which allows a single neural network to approximate a given ensemble of classifiers. Experiments on a large number of real-world data sets show that this approach can substantially save storage and computation while still maintaining accuracy similar to that of the entire ensemble.
  • Reference: Neural Processing Letters, volume 12, pages 225–237, 2000.
  • BibTeX:
    @article{zeng.npl2000,
    author = {Zeng, Xinchuan and Martinez, Tony R.},
    title = {Using a Neural Network to Approximate an Ensemble of Classifiers},
    journal = {Neural Processing Letters},
    volume = {12},
    pages = {225--237},
    year = {2000},
    }
  • Download the file: pdf, ps
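  • Illustrative sketch: approximating an ensemble with a single network by training the network to reproduce the ensemble's averaged class-probability outputs rather than the raw labels. The specific models (bagged decision trees, an MLP regressor fit to the soft targets) are stand-ins, not the paper's setup.
    import numpy as np
    from sklearn.ensemble import BaggingClassifier
    from sklearn.neural_network import MLPRegressor
    from sklearn.tree import DecisionTreeClassifier

    def approximate_ensemble(X, y):
        ensemble = BaggingClassifier(DecisionTreeClassifier(), n_estimators=25).fit(X, y)
        soft_targets = ensemble.predict_proba(X)       # the ensemble's averaged class probabilities
        single = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000).fit(X, soft_targets)
        return ensemble, single

    def predict_with_single(single, X):
        return np.asarray(single.predict(X)).argmax(axis=1)   # highest approximated probability wins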

A New Activation Function in the Hopfield Network for Solving Optimization Problems

  • Authors: Xinchuan Zeng and Tony R. Martinez
  • Abstract: This paper shows that the performance of the Hopfield network for solving optimization problems can be improved by using a new activation (output) function. The effects of the activation function on the performance of the Hopfield network are analyzed. It is shown that the sigmoid activation function in the Hopfield network is sensitive to noise in the neurons. The reason is that the sigmoid function is most sensitive in the range where noise is most predominant. A new activation function that is more robust against noise is proposed. The new activation function has the capability of amplifying the signals between neurons while suppressing noise. The performance of the new activation function is evaluated through simulation. Compared with the sigmoid function, the new activation function reduces the error rate of tour length by 30.6% and increases the percentage of valid tours by 38.6% during simulation on 200 randomly generated city distributions of the 10-city traveling salesman problem.
  • Reference: In Proceedings of the International Conference on Neural Networks and Genetic Algorithms, pages 67–72, 1999.
  • BibTeX:
    @inproceedings{zeng.icannga99a,
    author = {Zeng, Xinchuan and Martinez, Tony R.},
    title = {A New Activation Function in the Hopfield Network for Solving Optimization Problems},
    booktitle = {Proceedings of the International Conference on Neural Networks and Genetic Algorithms},
    pages = {67--72},
    year = {1999},
    }
  • Download the file: ps

Improving the Performance of the Hopfield Network By Using A Relaxation Rate

  • Authors: Xinchuan Zeng and Tony R. Martinez
  • Abstract: In the Hopfield network, a solution of an optimization problem is obtained after the network is relaxed to an equilibrium state. This paper shows that the performance of the Hopfield network can be improved by using a relaxation rate to control the relaxation process. Analysis suggests that the relaxation process has an important impact on the quality of a solution. A relaxation rate is then introduced to control the relaxation process in order to achieve solutions with better quality. Two types of relaxation rate (constant and dynamic) are proposed and evaluated through simulations based on 200 randomly generated city distributions of the 10-city traveling salesman problem. The result shows that using a relaxation rate can decrease the error rate by 9.87% and increase the percentage of valid tours by 14.0% as compared to those without using a relaxation rate. Using a dynamic relaxation rate can further decrease the error rate by 4.2% and increase the percentage of valid tours by 0.4% as compared to those using a constant relaxation rate.
  • Reference: In Proceedings of the International Conference on Neural Networks and Genetic Algorithms, pages 73–77, 1999.
  • BibTeX:
    @inproceedings{zeng.icannga99b,
    author = {Zeng, Xinchuan and Martinez, Tony R.},
    title = {Improving the Performance of the Hopfield Network By Using A Relaxation Rate},
    booktitle = {Proceedings of the International Conference on Neural Networks and Genetic Algorithms},
    pages = {73--77},
    year = {1999},
    }
  • Download the file: ps

Extending the Power and Capacity of Constraint Satisfaction Networks

  • Authors: Xinchuan Zeng and Tony R. Martinez
  • Abstract: This work focuses on improving the Hopfield network for solving optimization problems. Although much work has been done in this area, the performance of the Hopfield network is still not satisfactory in terms of valid convergence and quality of solutions. We address this issue in this work by combining a new activation function EBA and a new relaxation procedure CR in order to improve the performance of the Hopfield network. Each of EBA and CR has been individually demonstrated capable of substantially improving the performance. The combined approach has been evaluated through 20,000 simulations based on 200 randomly generated city distributions of the 10-city traveling salesman problem. The result shows that combining the two methods is able to further improve the performance. Compared to CR without combining with EBA, the combined approach increases the percentage of valid tours by 21.0% and decreases the error rate by 46.4%. As compared to the original Hopfield method (using neither EBA nor CR), the combined approach increases the percentage of valid tours by 245.7% and decreases the error rate by 64.1%.
  • Reference: In Proceedings of the International Joint Conference on Neural Networks, pages 432–437, 1999.
  • BibTeX:
    @inproceedings{zeng.ijcnn99,
    author = {Zeng, Xinchuan and Martinez, Tony R.},
    title = {Extending the Power and Capacity of Constraint Satisfaction Networks},
    booktitle = {Proceedings of the International Joint Conference on Neural Networks},
    pages = {432--437},
    year = {1999},
    }
  • Download the file: ps

A New Relaxation Procedure in the Hopfield Network for Solving Optimization Problems

  • Authors: Xinchuan Zeng and Tony R. Martinez
  • Abstract: When solving an optimization problem with a Hopfield network, a solution is obtained after the network is relaxed to an equilibrium state. The relaxation process is an important step in achieving a solution. In this paper, a new procedure for the relaxation process is proposed. In the new procedure, the amplified signal received by a neuron from other neurons is treated as the target value for its activation (output) value. The activation of a neuron is updated directly based on the difference between its current activation and the received target value, without using the updating of the input value as an intermediate step. A relaxation rate is applied to control the updating scale for a smooth relaxation process. The new procedure is evaluated and compared with the original procedure in the Hopfield network through simulations based on 200 randomly generated instances of the 10-city traveling salesman problem. The new procedure reduces the error rate by 34.6% and increases the percentage of valid tours by 194.6% as compared with the original procedure.
  • Reference: Neural Processing Letters, volume 10, pages 1–12, 1999.
  • BibTeX:
    @article{zeng.npl99,
    author = {Zeng, Xinchuan and Martinez, Tony R.},
    title = {A New Relaxation Procedure in the Hopfield Network for Solving Optimization Problems},
    journal = {Neural Processing Letters},
    volume = {10},
    pages = {1--12},
    year = {1999},
    }
  • Download the file: ps

Improving the Performance of the Hopfield Network for Solving Optimization Problems

  • Authors: Xinchuan Zeng
  • Reference: Master’s thesis, Brigham Young University, Computer Science Department, December 1998.
  • BibTeX:
    @mastersthesis{zeng.thesis98,
    author = {Zeng, Xinchuan},
    title = {Improving the Performance of the Hopfield Network for Solving Optimization Problems},
    school = {Brigham Young University},
    address = {Computer Science Department},
    month = {December},
    year = {1998},
    }
  • Download the file: ps
