A Programming Paradigm for Machine Learning, with a Case Study of Bayesian Networks.

Lloyd Allison,
ACSC2006, pp.103-111, January 2006.

Abstract: Inductive programming is a new machine learning paradigm that combines functional programming, for writing statistical models, with information theory, to prevent overfitting. Type-classes specify general properties that models must have. Many statistical models, estimators and operators have polymorphic types. Useful operators combine models and estimators to form new ones; functional programming's compositional style is a great advantage in this domain. Complementing this, information theory provides a compositional measure of the complexity of a model from its parts.
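As a rough illustration of the ideas in the abstract, the following Haskell sketch shows a type-class for models, a polymorphic operator that combines two models into one, and a two-part message length that adds up over the parts. Every name here (Model, Simple, modelCost, dataCost, msgLength, pairOf, fairCoin, fairDie) is hypothetical and is not taken from the paper's code; the zero cost for stating a parameterless model is likewise an assumption of the sketch.

```haskell
module Main where

-- All names below are hypothetical; this is not the paper's library.

-- | General properties a model must have: a cost (in nats) for stating
-- the model itself, and a cost for stating one datum under the model.
class Model m where
  modelCost :: m d -> Double
  dataCost  :: m d -> d -> Double

-- | A simple concrete model: a fixed statement cost plus a
-- negative log probability function over the data-space 'd'.
data Simple d = Simple Double (d -> Double)

instance Model Simple where
  modelCost (Simple c _) = c
  dataCost  (Simple _ f) = f

-- | Two-part message length: the model, then the data given the model.
msgLength :: Model m => m d -> [d] -> Double
msgLength m ds = modelCost m + sum (map (dataCost m) ds)

-- | Operator: combine two independent models into a model of pairs.
-- Both parts of the message length are sums over the components.
pairOf :: (Model m1, Model m2) => m1 a -> m2 b -> Simple (a, b)
pairOf m1 m2 =
  Simple (modelCost m1 + modelCost m2)
         (\(x, y) -> dataCost m1 x + dataCost m2 y)

-- | Fair coin: no parameters to state (assumed cost 0), log 2 nats per toss.
fairCoin :: Simple Bool
fairCoin = Simple 0 (const (log 2))

-- | Fair six-sided die: log 6 nats per roll (the range 1..6 is not checked).
fairDie :: Simple Int
fairDie = Simple 0 (const (log 6))

sample :: [(Bool, Int)]
sample = [(True, 3), (False, 6)]

main :: IO ()
main = do
  -- Message length under the combined model ...
  print (msgLength (pairOf fairCoin fairDie) sample)
  -- ... and the sum of the message lengths under the two parts.
  print (msgLength fairCoin (map fst sample) + msgLength fairDie (map snd sample))
```

The two printed values agree up to floating-point rounding, which is the sense in which the complexity measure composes. A model that overfits would pay for its extra parameters in modelCost.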

[Figure: Missing Person Bayes net (15-variable, mixed).]

Inductive programming is illustrated by a case study of Bayesian networks. Networks are built from classification (decision) trees. Trees are built from partitioning functions and models on data-spaces. Trees, and hence networks, are general as a natural consequence of the method. The networks handle discrete and continuous variables, and missing values. Finally, the Bayesian networks are applied to a challenging data set on lost persons.
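The construction described above, networks from trees and trees from partitioning functions and leaf models, might be sketched in Haskell as follows. This is only a minimal sketch under stated assumptions: all names (Tree, condCost, bernoulli, threshold, costX, treeY, networkCost) are hypothetical, the probabilities are made up for illustration, routing missing values to their own branch is one possible treatment rather than necessarily the paper's, and only the cost of the data is computed, not the cost of stating the tree structure.

```haskell
module Main where

-- All names below are hypothetical; none is taken from the paper's code.

-- | Cost, in nats, of stating one value under some distribution.
type Cost o = o -> Double

-- | A tree is either a leaf model of the output, or a partitioning
-- function on the input that selects one of several subtrees.
data Tree i o
  = Leaf (Cost o)
  | Fork (i -> Int) [Tree i o]

-- | Cost of an output given an input: follow the partitions to a leaf.
condCost :: Tree i o -> i -> o -> Double
condCost (Leaf c)         _ o = c o
condCost (Fork part subs) i o = condCost (subs !! part i) i o

-- | Leaf model for a Boolean output with Pr(True) = p.
bernoulli :: Double -> Cost Bool
bernoulli p b = negate (log (if b then p else 1 - p))

-- | Partition a possibly missing continuous input at a threshold.
-- Sending missing values to a third branch is an assumption of this sketch.
threshold :: Double -> Maybe Double -> Int
threshold _ Nothing  = 2
threshold t (Just x) = if x < t then 0 else 1

-- | One record: a continuous variable that may be missing, and a Boolean.
type Record = (Maybe Double, Bool)

-- | Root node: x is missing with probability 0.1 (assumed); otherwise it is
-- uniform on [0,100) and stated to an accuracy of 1 unit (assumed).
costX :: Cost (Maybe Double)
costX Nothing  = negate (log 0.1)
costX (Just _) = negate (log 0.9) + log 100

-- | Child node: y depends on x through a tree with three leaves.
treeY :: Tree (Maybe Double) Bool
treeY = Fork (threshold 50)
  [Leaf (bernoulli 0.9), Leaf (bernoulli 0.2), Leaf (bernoulli 0.5)]

-- | The network's cost for a record is the sum of its nodes' costs.
networkCost :: [Record -> Double] -> Record -> Double
networkCost nodes r = sum [node r | node <- nodes]

network :: [Record -> Double]
network = [costX . fst, \(x, y) -> condCost treeY x y]

main :: IO ()
main = mapM_ (print . networkCost network)
  [(Just 12.5, True), (Just 80, True), (Nothing, False)]
```

Because Tree is polymorphic in both its input and output types, the same definition serves discrete and continuous variables; only the partitioning functions and leaf models change.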

Keywords:  inductive inference, functional programming, Haskell, minimum length encoding, statistical models, Bayesian networks.

[paper.pdf], [paper.ps] or pdf@[acm.org]['06].

Also see [JFP05].