
## Local score metrics

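We use the following conventions to identify counts in the database $D$ and a network structure $B_S$. Let $r_i$ ($1 \le i \le n$) be the cardinality of $x_i$. We use $q_i$ to denote the cardinality of the parent set of $x_i$ in $B_S$, that is, the number of different values to which the parents of $x_i$ can be instantiated. So, $q_i$ can be calculated as the product of cardinalities of nodes in $pa(x_i)$, $q_i = \prod_{x_j \in pa(x_i)} r_j$. Note $pa(x_i) = \emptyset$ implies $q_i = 1$. We use $N_{ij}$ ($1 \le i \le n$, $1 \le j \le q_i$) to denote the number of records in $D$ for which $pa(x_i)$ takes its $j$th value. We use $N_{ijk}$ ($1 \le i \le n$, $1 \le j \le q_i$, $1 \le k \le r_i$) to denote the number of records in $D$ for which $pa(x_i)$ takes its $j$th value and for which $x_i$ takes its $k$th value. So, $N_{ij} = \sum_{k=1}^{r_i} N_{ijk}$. We use $N$ to denote the number of records in $D$.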
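The counting conventions above can be illustrated with a minimal Python sketch. It assumes a hypothetical data layout that is not part of the Weka implementation: each record is a tuple of integer-coded values, and the names `local_counts`, `records` and `cardinality` are chosen here for illustration only.

```python
from collections import Counter
from math import prod


def local_counts(records, child, parents, cardinality):
    """Return (q_i, N_ij, N_ijk) for a single node x_i.

    records     -- list of tuples of integer-coded values (assumed layout)
    child       -- column index of x_i
    parents     -- list of column indices forming pa(x_i)
    cardinality -- cardinality[c] is the number of values of column c
    """
    # q_i is the product of the parents' cardinalities; an empty product is 1.
    q_i = prod(cardinality[p] for p in parents)
    n_ij = Counter()   # keyed by parent configuration j
    n_ijk = Counter()  # keyed by (parent configuration j, child value k)
    for rec in records:
        j = tuple(rec[p] for p in parents)
        n_ij[j] += 1
        n_ijk[(j, rec[child])] += 1
    return q_i, n_ij, n_ijk


# Toy database with two binary variables; column 0 is the parent of column 1.
records = [(0, 0), (0, 1), (1, 1), (1, 1)]
q_i, n_ij, n_ijk = local_counts(records, child=1, parents=[0], cardinality=[2, 2])
```

Only configurations that actually occur in the data are stored, so a missing key corresponds to a count of zero.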

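Let the entropy metric $H(B_S, D)$ of a network structure $B_S$ and database $D$ be defined as

$$
H(B_S, D) = -N \sum_{i=1}^{n} \sum_{j=1}^{q_i} \sum_{k=1}^{r_i} \frac{N_{ijk}}{N} \log \frac{N_{ijk}}{N_{ij}} \tag{2}
$$

and the number of parameters $K$ as

$$
K = \sum_{i=1}^{n} (r_i - 1) \cdot q_i \tag{3}
$$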
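As a rough illustration of equations (2) and (3), the following Python sketch computes both quantities for a small database. The data layout (tuples of integer-coded values), the `structure` dictionary mapping each node to its parent list, and the use of the natural logarithm are assumptions made for this example, not details taken from the Weka implementation.

```python
from collections import Counter
from math import log, prod


def entropy_and_params(records, structure, cardinality):
    """Return (H, K) for a structure given as {node: [parent, ...]}."""
    h = 0.0
    k = 0
    for child, parents in structure.items():
        n_ij, n_ijk = Counter(), Counter()
        for rec in records:
            j = tuple(rec[p] for p in parents)
            n_ij[j] += 1
            n_ijk[(j, rec[child])] += 1
        # -N * (N_ijk / N) * log(N_ijk / N_ij) simplifies to
        # -N_ijk * log(N_ijk / N_ij); zero counts contribute nothing.
        h -= sum(c * log(c / n_ij[j]) for (j, _), c in n_ijk.items())
        # (r_i - 1) free parameters for each of the q_i parent configurations.
        q_i = prod(cardinality[p] for p in parents)
        k += (cardinality[child] - 1) * q_i
    return h, k


records = [(0, 0), (0, 1), (1, 1), (1, 1)]
structure = {0: [], 1: [0]}  # node 0 has no parents, node 1 has parent 0
h, k = entropy_and_params(records, structure, cardinality=[2, 2])
# For this toy example k == 3 and h equals 6 * log(2).
```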

AIC metric. The AIC metric $Q_{AIC}(B_S, D)$ of a Bayesian network structure $B_S$ for a database $D$ is

$$
Q_{AIC}(B_S, D) = H(B_S, D) + K \tag{4}
$$

A term $P(B_S)$ representing prior information over network structures can be added [1], but it is ignored for simplicity in the Weka implementation.

MDL metric. The minimum description length metric $Q_{MDL}(B_S, D)$ of a Bayesian network structure $B_S$ for a database $D$ is defined as

$$
Q_{MDL}(B_S, D) = H(B_S, D) + \frac{K}{2} \log N \tag{5}
$$
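Both metrics add a complexity penalty to the entropy term, and differ only in how that penalty scales. A minimal Python sketch, assuming $H$, $K$ and $N$ have already been computed and that the natural logarithm is used (the function names are hypothetical):

```python
from math import log


def aic_score(h, k):
    """AIC metric: entropy plus the number of parameters."""
    return h + k


def mdl_score(h, k, n):
    """MDL metric: entropy plus (K / 2) * log N."""
    return h + 0.5 * k * log(n)


# Example values: H = 6 * log(2), K = 3, N = 4 records.
h, k, n = 6 * log(2), 3, 4
aic = aic_score(h, k)
mdl = mdl_score(h, k, n)
```

With the formulas as written here, both the entropy and the penalty are non-negative costs, so a smaller value corresponds to a better trade-off between fit and complexity. The MDL penalty grows with $\log N$, while the AIC penalty does not depend on the number of records.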

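Bayesian metric. The Bayesian metric $Q_{Bayes}(B_S, D)$ of a Bayesian network structure $B_S$ for a database $D$ is

$$
Q_{Bayes}(B_S, D) = P(B_S) \prod_{i=1}^{n} \prod_{j=1}^{q_i} \frac{\Gamma(N'_{ij})}{\Gamma(N'_{ij} + N_{ij})} \prod_{k=1}^{r_i} \frac{\Gamma(N'_{ijk} + N_{ijk})}{\Gamma(N'_{ijk})}
$$

where $P(B_S)$ is the prior on the network structure (taken to be constant and hence ignored in the Weka implementation) and $\Gamma(\cdot)$ is the gamma function. $N'_{ij}$ and $N'_{ijk}$ represent choices of priors on counts, restricted by $N'_{ij} = \sum_{k=1}^{r_i} N'_{ijk}$. With $N'_{ijk} = 1$ (and thus $N'_{ij} = r_i$), we obtain the K2 metric [5]

$$
Q_{K2}(B_S, D) = P(B_S) \prod_{i=1}^{n} \prod_{j=1}^{q_i} \frac{(r_i - 1)!}{(r_i - 1 + N_{ij})!} \prod_{k=1}^{r_i} N_{ijk}!
$$

With $N'_{ijk} = 1/(r_i \cdot q_i)$ (and thus $N'_{ij} = 1/q_i$), we obtain the BDe metric [8].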
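Because the products of gamma functions overflow quickly, such a metric is usually evaluated in log space. The Python sketch below computes the logarithm of one node's factor with `math.lgamma`, for the two prior choices described above. The data layout, the function name `log_bayes_local` and the `prior` argument are assumptions made for illustration and do not reflect the Weka implementation.

```python
from collections import Counter
from math import lgamma, prod


def log_bayes_local(records, child, parents, cardinality, prior="K2"):
    """Log of node x_i's factor in the Bayesian metric (P(B_S) omitted)."""
    r_i = cardinality[child]
    q_i = prod(cardinality[p] for p in parents)
    if prior == "K2":
        a_ijk = 1.0                # N'_ijk = 1
    elif prior == "BDe":
        a_ijk = 1.0 / (r_i * q_i)  # N'_ijk = 1 / (r_i * q_i)
    else:
        raise ValueError("prior must be 'K2' or 'BDe'")
    a_ij = r_i * a_ijk             # N'_ij is the sum of N'_ijk over k

    n_ij, n_ijk = Counter(), Counter()
    for rec in records:
        j = tuple(rec[p] for p in parents)
        n_ij[j] += 1
        n_ijk[(j, rec[child])] += 1

    # Configurations with a zero count contribute log(1) = 0, so it is
    # enough to iterate over the configurations observed in the data.
    score = sum(lgamma(a_ij) - lgamma(a_ij + c) for c in n_ij.values())
    score += sum(lgamma(a_ijk + c) - lgamma(a_ijk) for c in n_ijk.values())
    return score


records = [(0, 0), (0, 1), (1, 1), (1, 1)]
k2 = log_bayes_local(records, child=0, parents=[], cardinality=[2, 2], prior="K2")
bde = log_bayes_local(records, child=1, parents=[0], cardinality=[2, 2], prior="BDe")
```

The log score of a complete structure is then the sum of these per-node terms. For the first call, the factorial form of the K2 metric gives $\frac{1!}{5!} \cdot 2! \cdot 2! = \frac{1}{30}$, which matches `exp(k2)`.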

Remco Bouckaert 2008-05-12