MERLIN 2.0
MERLIN 2.0 (Model Extraction by Regular Language INference) is an
inductive logic programming (ILP) system that uses an overly general
hypothesis in the form of a logic program together with sets of
positive and (optionally) negative examples in order to find an
inductive hypothesis which entails all positive examples but no
negative examples.
MERLIN 2.0 is particularly suited for learning recursive
hypotheses. Apart from being able to learn from positive examples only
and invent new predicates, one of the main features of the system is
that it may infer both base clauses and recursive clauses from
a single example. This contrasts with traditional covering techniques,
which produce at most one clause from each example and furthermore
need particular examples from which base clauses but no recursive
clauses are to be induced.
MERLIN 2.0 first tries to find SLD-refutations for the positive
examples using the overly general theory. Viewing the sequences
of input clauses in these refutations as strings in a formal language,
the system induces a finite-state automaton that can generate all
positive sequences (and none of the negative sequences). The automaton
is then used to specialise the overly general hypothesis, a process in
which new predicates are invented.
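The automaton induction step described above can be illustrated with a small sketch. It is not MERLIN's actual implementation (which is written in SICStus Prolog); it uses a generic prefix-tree-plus-greedy-state-merging scheme as a stand-in. The clause labels `base` and `rec`, all function names and the example sequences are assumptions made purely for illustration.

```python
from itertools import combinations

def build_pta(sequences):
    """Prefix tree acceptor: one state per distinct prefix, state 0 is the start."""
    trans, accepting, n_states = {}, set(), 1
    for seq in sequences:
        state = 0
        for sym in seq:
            if (state, sym) not in trans:
                trans[(state, sym)] = n_states
                n_states += 1
            state = trans[(state, sym)]
        accepting.add(state)
    return trans, accepting, n_states

def quotient(trans, accepting, block):
    """Merge states via block (state -> representative); may be nondeterministic."""
    edges = {}
    for (src, sym), dst in trans.items():
        edges.setdefault((block[src], sym), set()).add(block[dst])
    return edges, {block[s] for s in accepting}

def accepts(edges, accepting, start, seq):
    """Standard set-of-states simulation of a nondeterministic automaton."""
    current = {start}
    for sym in seq:
        current = set().union(*(edges.get((s, sym), set()) for s in current))
        if not current:
            return False
    return bool(current & accepting)

def induce(positives, negatives):
    """Greedily merge state pairs, keeping a merge only if no negative is accepted."""
    trans, accepting, n_states = build_pta(positives)
    block = {s: s for s in range(n_states)}
    for a, b in combinations(range(n_states), 2):
        rep_a, rep_b = block[a], block[b]
        if rep_a == rep_b:
            continue
        trial = {s: (rep_a if r == rep_b else r) for s, r in block.items()}
        edges, acc = quotient(trans, accepting, trial)
        if not any(accepts(edges, acc, trial[0], neg) for neg in negatives):
            block = trial
    edges, acc = quotient(trans, accepting, block)
    return edges, acc, block[0]

# Hypothetical clause-label sequences taken from refutations of the examples.
positives = [("base",), ("rec", "base"), ("rec", "rec", "base")]
negatives = [(), ("rec",), ("base", "base")]
edges, accepting, start = induce(positives, negatives)
print(accepts(edges, accepting, start, ("rec", "rec", "rec", "base")))
```

Merging states can only enlarge the accepted language, so every positive sequence stays covered; only the negative sequences need to be rechecked after each trial merge. With the example sequences above, the result is a two-state automaton that applies `rec` any number of times and then `base` once.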
There are two modes in which MERLIN 2.0 may be run:
- Learning from positive and negative examples
This mode employs a hill-climbing technique for finding the minimal
automaton that can generate all positive and no negative sequences.
Note that running MERLIN 2.0 in this mode produces the same
hypotheses as the previous version, MERLIN 1.0. This mode requires no further
parameters to be specified, but is not very useful in the absence of
negative examples.
- Learning from positive examples only
This mode employs an incremental hill-climbing technique for inducing
the most probable Hidden Markov Model structure that can generate the
positive sequences X, by maximising log P(Ms) + log P(X|Ms), where log
P(Ms) is the natural logarithm of the prior of the model structure Ms
and log P(X|Ms) is the logarithm of the probability of the sequences
given Ms. Note that in the current version of the system, the negative
examples are completely disregarded in this mode, i.e. it is not
checked whether negative examples are covered or not. Furthermore,
minimisation of the induced automaton is currently not employed, which
often results in redundant predicates being included in the final
hypothesis.
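The score log P(Ms) + log P(X|Ms) used in the positive-only mode can be made concrete with a rough sketch. Several simplifications are assumed here and are not taken from MERLIN itself: the structure is a deterministic automaton rather than a general Hidden Markov Model, transition probabilities are maximum-likelihood estimates from the sequences, and the prior `log_prior` is a hypothetical one that charges a fixed cost per transition. The names `END`, `tree` and `merged` are likewise illustrative.

```python
import math
from collections import Counter

END = "$"  # hypothetical end-of-sequence marker

def log_likelihood(trans, start, sequences):
    """log P(X|Ms) with maximum-likelihood transition probabilities."""
    counts = Counter()  # (state, symbol) -> number of times taken
    visits = Counter()  # state -> number of times a choice was made there
    for seq in sequences:
        state = start
        for sym in tuple(seq) + (END,):
            counts[(state, sym)] += 1
            visits[state] += 1
            if sym != END:
                state = trans[(state, sym)]
    return sum(c * math.log(c / visits[s]) for (s, _sym), c in counts.items())

def log_prior(trans, n_states, n_symbols):
    """Assumed prior: every transition costs log(n_states * n_symbols) nats."""
    return -len(trans) * math.log(n_states * n_symbols)

def score(trans, start, n_states, n_symbols, sequences):
    return log_prior(trans, n_states, n_symbols) + log_likelihood(trans, start, sequences)

X = [("base",), ("rec", "base"), ("rec", "rec", "base")]

# Structure 1: a prefix tree with one state per distinct prefix (6 states).
tree = {(0, "base"): 1, (0, "rec"): 2, (2, "base"): 3,
        (2, "rec"): 4, (4, "base"): 5}
# Structure 2: a merged, recursive structure (2 states).
merged = {(0, "base"): 1, (0, "rec"): 0}

score_tree = score(tree, 0, 6, 2, X)
score_merged = score(merged, 0, 2, 2, X)
print(score_tree, score_merged)
```

In this toy comparison the merged structure fits the sequences slightly less well than the prefix tree, but it obtains the higher total score because the assumed prior penalises the larger structure more heavily. A hill-climbing search would therefore prefer it.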
For algorithmic details, see
Boström H., "Theory-Guided Induction of Logic Programs by Inference of Regular Languages",
Proc. of the 13th International Conference on Machine Learning, Morgan Kaufmann (1996) pp 46-53
Boström H., "Predicate Invention and Learning from Positive Examples Only",
Proc. of the Tenth European Conference on Machine Learning, Springer Verlag (1998) pp 226-237
MERLIN 2.0 was implemented in SICStus Prolog 3 #5 by Dr. Henrik Boström.
How to get the program
MERLIN 2.0 can be obtained in two forms: as SICStus Prolog source code
or as a stand-alone application for SUN OS 5.4. The former requires
that SICStus 3 has been installed (or that the code is adapted to suit
your own Prolog system, which should not be too difficult a task, as most
built-in predicates used are standard). The latter requires
that Tcl 7.5 / Tk 4.1 has been installed.
The source code for MERLIN 2.0 can be downloaded by clicking the following item:
MERLIN 2.0 (SICStus version)
(112 kb when uncompressed)
The stand-alone version of MERLIN 2.0 can be downloaded by clicking the following item:
MERLIN 2.0 (SUNOS version)
(1.5 MB when uncompressed)
Assuming that you have saved the file as MERLIN2.tar.gz, expand it as follows:
i) gunzip MERLIN2.tar.gz
ii) tar -xvf MERLIN2.tar
This creates the directory MERLIN2.
The SICStus version of MERLIN 2.0 is started by giving the command
'sicstus -l merlin2' in this directory.
The stand-alone version of MERLIN 2.0 is started by giving the command 'merlin'
in the directory MERLIN2 (in the current version it is not possible
to start the program from other directories).
For questions and bug reports, please send an email to henke@dsv.su.se.