Conditional independence tests in Weka differ slightly from the
standard tests described in the literature. To test whether variables
x and y are conditionally independent given a set of variables Z,
a network structure with arrows z -> y for all z in Z is compared with
one with arrows x -> y together with z -> y for all z in Z.
The test is performed using any of the score metrics described in Section 2.1.
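The score comparison described above can be sketched as follows. This is a minimal illustration, not Weka's implementation: the names x, y and Z for the two variables and the conditioning set, the helper local_score, and the BIC/MDL-style penalty are all assumptions made for the example, and the parameter count uses only the parent configurations observed in the data.

```python
import math
from collections import Counter

def local_score(data, child, parents):
    """Illustrative BIC/MDL-style local score (higher is better).

    data is a list of dicts mapping variable name -> discrete value.
    The metric is an assumption for this sketch, not a Weka metric.
    """
    n = len(data)
    parent_counts = Counter(tuple(row[p] for p in parents) for row in data)
    joint = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
    # Log-likelihood of the child given its parent configuration.
    log_lik = sum(c * math.log(c / parent_counts[cfg])
                  for (cfg, _), c in joint.items())
    # Approximate parameter count from the configurations seen in the data.
    child_card = len({row[child] for row in data})
    n_params = len(parent_counts) * (child_card - 1)
    return log_lik - 0.5 * n_params * math.log(n)

def is_cond_independent(data, x, y, z):
    """Compare y's score with parents Z against parents Z plus x.

    x and y are treated as independent given Z when the extra
    arrow x -> y does not improve the score (an assumed decision rule).
    """
    without_x = local_score(data, y, sorted(z))
    with_x = local_score(data, y, sorted(z) + [x])
    return with_x <= without_x
```

With data in which x, z and y always take the same value, the sketch reports x and y as dependent given the empty set and independent given {z}, because z already explains y completely.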
At the moment, only the ICS [11] and CI algorithms are implemented.
The ICS algorithm proceeds in two steps: first, find a skeleton (the undirected
graph with an edge between two nodes iff there is an arrow between them in the
network structure); second, direct all the edges in the skeleton to get a DAG.
Starting with a complete undirected graph, we try to find conditional independencies
<x, y | Z> in the data. For each pair of nodes x, y, we consider sets Z
starting with cardinality 0, then 1, up to a user-defined maximum. Furthermore,
the set Z is a subset of the nodes that are neighbors of both x and y. If an
independency is identified, the edge between x and y is removed from the skeleton.
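The skeleton search described above can be sketched as below. The function name find_skeleton, the node labels x and y, the conditioning set z, and the independent callback (which stands in for the score-based test) are assumptions for illustration, and the loop order (cardinality outermost) is one reasonable choice rather than a statement about Weka's code.

```python
from itertools import combinations

def find_skeleton(nodes, independent, max_cardinality):
    """Remove edges from a complete undirected graph.

    independent(x, y, z) is a caller-supplied test returning True when
    x and y are judged independent given the set z.
    Returns (adjacency dict, dict of separating sets keyed by node pair).
    """
    adj = {v: set(nodes) - {v} for v in nodes}  # complete graph
    sepsets = {}
    for size in range(max_cardinality + 1):     # cardinality 0, 1, ...
        for x, y in combinations(nodes, 2):
            if y not in adj[x]:
                continue                        # edge already removed
            # Candidate conditioning sets: common neighbors of x and y.
            common = sorted(adj[x] & adj[y])
            for z in combinations(common, size):
                if independent(x, y, set(z)):
                    adj[x].discard(y)
                    adj[y].discard(x)
                    # Cache the set that justified removing the edge.
                    sepsets[frozenset((x, y))] = set(z)
                    break
    return adj, sepsets
```

The cached separating sets are what the edge-directing step later consults.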
The first step in directing arrows is to check, for every configuration
x -- z -- y where x and y are not connected in the skeleton, whether z is in
the set Z of variables that justified removing the link between x and y
(cached in the first step). If z is not in Z, we can assign the direction
x -> z <- y.
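This first directing step can be sketched as follows. The names orient_v_structures, adj and sepsets are hypothetical, as are the labels x, y for the two unconnected end nodes and z for the middle node. The skeleton is assumed to be a dict mapping each node to its set of neighbors, and the cached separating sets a dict keyed by unordered node pairs.

```python
from itertools import combinations

def orient_v_structures(adj, sepsets):
    """Return directed arrows (tail, head) for colliders x -> z <- y.

    For each middle node z with two neighbors x and y that are not
    themselves connected, the arrows are directed into z when z is not
    in the set that justified removing the x-y edge.
    """
    arrows = set()
    for z in adj:
        for x, y in combinations(sorted(adj[z]), 2):
            if y in adj[x]:
                continue  # x and y are connected: not a candidate
            # A missing cache entry is treated as the empty set here.
            if z not in sepsets.get(frozenset((x, y)), set()):
                arrows.add((x, z))
                arrows.add((y, z))
    return arrows
```

For a skeleton x -- z -- y whose x-y edge was removed with an empty conditioning set, the sketch yields the arrows x -> z and y -> z; if the cached set contains z, nothing is directed.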
Finally, a set of graphical rules is applied [11] to direct the remaining arrows.
Rule 1: i->j--k & i-/-k => j->k
Rule 2: i->j->k & i--k => i->k
Rule 3:            m
                  /|\
                 i | k    => m->j
      i->j<-k     \|/
                   j
Rule 4:            m
                  / \
                 i---k    => i->m & k->m
      i->j        \ /
                   j
Rule 5: if no edges are directed then take a random one (first we can find)
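As an illustration of how such rules can be applied repeatedly, the sketch below implements only Rule 1 and Rule 2; Rules 3 to 5 are omitted. The function name and the edge representation (undirected edges as frozenset pairs, arrows as (tail, head) tuples) are assumptions for the example, not Weka's data structures.

```python
def apply_rules_1_and_2(undirected, directed):
    """Direct undirected edges with Rule 1 and Rule 2 until nothing changes.

    undirected: set of frozenset({a, b}) edges
    directed:   set of (tail, head) arrows
    """
    undirected = set(undirected)
    directed = set(directed)

    def adjacent(a, b):
        return (frozenset((a, b)) in undirected
                or (a, b) in directed or (b, a) in directed)

    changed = True
    while changed:
        changed = False
        for edge in list(undirected):
            p, q = tuple(edge)
            for a, b in ((p, q), (q, p)):   # try directing a -> b
                # Rule 1: i -> a -- b with i and b not adjacent.
                rule1 = any(h == a and i != b and not adjacent(i, b)
                            for i, h in directed)
                # Rule 2: a -> m -> b together with a -- b.
                rule2 = any(t == a and (m, b) in directed
                            for t, m in directed)
                if rule1 or rule2:
                    undirected.discard(edge)
                    directed.add((a, b))
                    changed = True
                    break
    return undirected, directed
```

Starting from the arrow i->j and the undirected edge j--k, with i and k not adjacent, the sketch directs the edge as j->k by Rule 1.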
The ICS algorithm comes with the following options.
Since the ICS algorithm focuses on recovering causal structure rather than
finding the optimal classifier, the Markov blanket correction can be applied
afterwards.
Specific options:
The maxCardinality option determines the largest size of the set Z to be
considered in conditional independence tests <x, y | Z>.
The scoreType option is used to select the scoring metric.