[Termtools] (sub) categories, specification of input format, test cases

Johannes Waldmann waldmann at imn.htwk-leipzig.de
Wed May 31 12:48:25 CEST 2006


Dear colleagues.

I am looking forward to the competition. Does anyone have interesting
guesses (e.g. how many of last year's unsolved problems will be solved,
how many CPU hours will be needed, by how many problems
first and second place will differ,
how many problems unsolved in the first round (1 minute)
will be solved in the extension round (5 minutes), etc.)?
I'd suggest we post the questions and guesses here,
just for our entertainment. (After the competition,
we can give awards for the "best guesses".)


But now, seriously.

The rules define the concept of "category" (SRS, TRS, LP)
but not "sub-category" (as seen in last year's results table).

A prover must register for a (full) category,
so it must behave correctly
on all combinations of THEORY, STRATEGY,
and relative and conditional rules, right?
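
For concreteness: as I read the TPDB syntax, a single problem file
may combine several of these annotations, e.g. (the rules here are
made up, only to illustrate the shape of the input):

  (VAR x y)
  (THEORY (AC plus))
  (STRATEGY INNERMOST)
  (RULES
    plus(0, y) -> y
    plus(s(x), y) -> s(plus(x, y))
  )

A registered prover would have to handle every such combination,
at least by answering DONTKNOW.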

That means that if a prover decides to say DONTKNOW
for relative termination problems, it will lose some points
(in the total score) with respect to another prover
that emits at least some proofs (e.g. by treating "->=" as "->").
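
For illustration (again with made-up rules), a relative termination
problem might look like:

  (VAR x)
  (RULES
    f(s(x)) -> f(x)
    g(x) ->= x
  )

A prover that reads "->=" simply as "->" and proves termination of the
resulting union system has in particular proved relative termination,
so it scores on this problem while the DONTKNOW prover does not.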

And what are the implications for submitting secret problems?
Does each prover (author) submit up to 10 problems per category
(regardless of subcategory)?


The input format http://www.lri.fr/~marche/tpdb/format.html
is semantically underspecified in several places, e.g.:
what happens for "(STRATEGY INNERMOST) (STRATEGY OUTERMOST) ..."?
what theories are pre-defined (A, C, AC, I guess)?
what is "(THEORY (A x) (C x))", do these declarations accumulate?

Why am I so concerned about this? I think one of the goals
of the competition is to attract interest from "users"
(i.e. researchers outside our community) who might want to apply
automated termination provers to their problems
(much the same way as some of us now apply SAT solvers).
But software (e.g. a termination prover) can only be used sensibly
if it is clearly specified, and this of course starts
with the syntax and semantics of the input language.
Note that termination provers are quite likely to be applied
in software/specification verification, so we should try to be
formally exact in all details. If we neglect this,
we will discourage users, especially those with a "formal" education.
And if we do not explore all corner cases of the specification
(and of the provers) now, then our users will certainly stumble
on them, much to our embarrassment.

To make a concrete proposal, I plan to build a collection
of TRS/SRS problems (test cases) that show
(a) all expected uses of the specification
    (all sensible combinations of annotations to problems)
(b) all expected mis-uses (i.e. input that should be rejected (*)
    because it has semantic problems, e.g. conflicting strategies,
    see above)
I think that this problem set could be run
against any prover that enters next year's competition.
So, if you have any suggestions for test cases, please send them to me
or to this list (do we have a public archive by now?).


(*) This shows another weak point of the current specification:
at the moment, a prover's answer is YES, NO, or DONTKNOW,
but we need another return value (e.g. ERROR) that is used
when the prover detects inconsistent input. As I said, we don't need this
for the competition immediately (because Claude can pre-check the input),
but our users are going to need it. (And I would need it
to automate the testing as described above.)
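
To show what I mean by automated testing, here is a sketch (in
Haskell; it assumes that a prover is called with the problem file
name as its argument and prints its answer on the first line of
standard output):

  import System.Process ( readProcessWithExitCode )

  -- run one prover on one problem file and compare the first
  -- line of its output with the expected answer
  -- (YES, NO, DONTKNOW, or, once we have it, ERROR)
  check :: FilePath -> FilePath -> String -> IO Bool
  check prover problem expected = do
      ( _code, out, _err ) <- readProcessWithExitCode prover [ problem ] ""
      case lines out of
          ( answer : _ ) -> return ( answer == expected )
          [] -> return False

Running this over the whole collection then just means counting
the mismatches.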

Best regards,
-- 
-- Johannes Waldmann -- Tel/Fax (0341) 3076 6479/80 --
---- http://www.imn.htwk-leipzig.de/~waldmann/ -------
