Probability Model Type Sufficiency

L. J. Fitzgibbon, L. Allison and J. W. Comley, International Conference on Intelligent Data Engineering and Automated Learning (IDEAL-2003), Hong Kong, 21-23 March 2003, doi:10.1007/978-3-540-45080-1_102.

Abstract: We investigate the role of sufficient statistics in generalized probabilistic data mining and machine learning software frameworks. We discuss some issues involved in specifying a statistical model type and show that it is beneficial to explicitly include a sufficient statistic, together with functions for manipulating it, in the model type's specification. Instances of such types can then be used by generalized learning algorithms while maintaining optimal learning time complexity. Examples are given for incremental learning and for data partitioning problems (e.g. change-point problems, decision trees and mixture models).
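
Illustrative sketch (not taken from the paper): the idea of packaging a sufficient statistic with a model type can be shown with a univariate Gaussian, whose sufficient statistic is the count, sum and sum of squares of the data. All names below (GaussStat, stat_of, neg_log_likelihood, best_change_point) are hypothetical, and a plain maximum-likelihood cost is assumed for simplicity; the paper's own framework and scoring criterion may differ. Because statistics can be added and subtracted, a generic change-point search can score every split in constant time after a single pass over the data.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class GaussStat:
    """Hypothetical sufficient statistic for a univariate Gaussian."""
    n: int = 0        # number of observations
    s: float = 0.0    # sum of observations
    ss: float = 0.0   # sum of squared observations

    def __add__(self, other: "GaussStat") -> "GaussStat":
        # Statistic of two disjoint data sets taken together.
        return GaussStat(self.n + other.n, self.s + other.s, self.ss + other.ss)

    def __sub__(self, other: "GaussStat") -> "GaussStat":
        # Statistic of the data left after removing a subset.
        return GaussStat(self.n - other.n, self.s - other.s, self.ss - other.ss)


def stat_of(xs: Sequence[float]) -> GaussStat:
    """Compute the statistic directly from raw data in one pass."""
    return GaussStat(len(xs), sum(xs), sum(x * x for x in xs))


def neg_log_likelihood(st: GaussStat, min_var: float = 1e-9) -> float:
    """Negative log-likelihood of the fitted Gaussian, using the statistic only.

    Assumption: maximum-likelihood fit, with the variance floored at min_var.
    """
    if st.n == 0:
        return 0.0
    ssd = max(st.ss - st.s * st.s / st.n, 0.0)  # sum of squared deviations
    var = max(ssd / st.n, min_var)
    return 0.5 * st.n * math.log(2.0 * math.pi * var) + ssd / (2.0 * var)


def best_change_point(xs: Sequence[float]) -> Optional[int]:
    """Return k minimising cost(xs[:k]) + cost(xs[k:]), or None if len(xs) < 2.

    Prefix statistics make each candidate split O(1), so the search is O(n).
    """
    prefix = [GaussStat()]
    for x in xs:
        prefix.append(prefix[-1] + GaussStat(1, x, x * x))
    total = prefix[-1]

    best_k, best_cost = None, math.inf
    for k in range(1, len(xs)):
        cost = neg_log_likelihood(prefix[k]) + neg_log_likelihood(total - prefix[k])
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k
```

The same addition also covers incremental learning in this sketch: a new observation x updates a stored statistic via st + GaussStat(1, x, x * x) without revisiting earlier data. Subtracting sums of squares can lose floating-point precision on data with a large mean and a small spread, so a production version would likely use a more numerically stable update.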