[Termination tools] Rules of the WST competition

Tue Apr 27 09:44:43 CEST 2004

>>>>> "H" == H Zantema <hzantema at win.tue.nl> writes:

    H> I think a main goal of the competition is to be able to compare the 
    H> power of various tools in a more objective way than by reading papers
    H> about these tools. I am afraid that this desired objectivity is
    H> hard to reach if

    H> * the competition is only on systems from some data base while 
    H>   the moderators of this database are participants too, and

I'm not a moderator of the database: I included every problem that
anyone submitted, and I myself contributed only a very small part of
the TPDB. I really don't understand your doubts about my objectivity.

    H> * there is an obvious way to perform well in the competition by a 
    H>   simple fake program.

But a tool doing that will be "disqualified" in the sense that nobody
will trust it anymore. Its author will have the word "cheater" on his
face for the rest of his life.

    H> Other areas like chess programs once started by only doing brute force
    H> computation, but now also use databases with well-known results. 
    H> I do not see why this could not happen in the termination community, 
    H> and then this format will not work any more.

Indeed, if a termination tool is able to use a database of well-known
problems in a clever way, that is, not just by syntactic recognition
but, for example, by recognizing a shape that triggers some heuristic,
I think that would be nice.

    H> But for understanding why sometimes the one tool
    H> outperforms the other there is nothing wrong with searching for such
    H> examples incidentally, not only for competition-as-a-sport but also 
    H> from a scientific point of view. 

Yes, this is the goal, and such examples should be added in the TPDB.

    H> This is indeed a problem. For some categories like term rewriting there
    H> is already quite a list, but for some other categories (like string
    H> rewriting) the list is still too poor for being the basis for a serious
    H> competition.

So, again, contributions to the TPDB are welcome.

    >> I think the best thing for future WST competitions would be to
    >> encourage growth of the TPDB (with cleaning, removing duplicates,
    >> organizing problems differently, etc.), so that a selection of random
    >> problems would become the best way of running the WST competition.
    >> Unfortunately, Albert Rubio had to give up his task of
    >> maintaining the TPDB, so I had to do the porting to the new format
    >> myself; hence the TPDB has not grown much since last year. Regarding
    >> that, I would like to thank the people who sent me mails pointing out
    >> mistakes and proposing new examples. I think there are no more
    >> mistakes now. Any ideas for organizing the TPDB differently, and any
    >> suggestions, are welcome.  Volunteers are welcome.

    H> Indeed moderating TPDB is a substantial ongoing job. I think it is good
    H> that more than one person is responsible for this. I assume there will
    H> be candidates for this job (including me). Let's decide in Aachen.

    >> So, for this year, I will follow my first idea of running all tools on
    >> all the TPDB. If some of you are willing to set up another
    >> "competition" on a set of examples that will come from another source,
    >> please do so.  If this set of examples is made available in the TPDB
    >> format, I can run my wrapper on it; that is no problem for me since it
    >> runs automatically.

    H> I agree that for the categories having sufficiently many systems in
    H> TPDB this is the best decision for now. Other formats have advantages
    H> and disadvantages, but it is not wise to change the format drastically
    H> only a few weeks before the competition. 

    H> However, I already mentioned that for string rewriting the data base
    H> is still too poor, and maybe for other categories too. In order to
    H> force extension of these categories I propose that for these 
    H> categories the participants may submit 10 systems before some date
    H> (say, one week before WST). Then these systems will be added to 
    H> the data base and the competition will be on the full data base 

It is OK with me, you can submit systems to the TPDB whenever you like. 

    H> including these new systems. I hope that also term rewrite tools
    H> will participate on string rewriting, and submit new systems.

    H> One final point: I would like it to be decided that only valid 'YES' 
    H> answers count, and not 'NO' answers. As I said before, techniques for
    H> proving non-termination are quite different from proving termination;
    H> these areas should not be mixed. As most tools do not have facilities
    H> for proving non-termination this would hardly influence the competition,
    H> but it is good to be clear about the rules. Can you adjust the rules
    H> for this? 

I will make two classifications, one counting the NO and one not
counting the NO.
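
As an illustration only, the two classifications could be computed
along the following lines. This is a minimal sketch, not the actual
competition wrapper: the function names, the tool names, the sample
data and the "MAYBE" answer are all assumptions made for the example.

```python
def score(answers, count_no):
    """Count the valid answers that earn a point.

    answers:  list of (answer, valid) pairs; the answer is "YES", "NO"
              or "MAYBE" ("MAYBE" is an assumed don't-know answer).
    count_no: if True, valid "NO" answers score as well as valid "YES".
    """
    scoring = {"YES", "NO"} if count_no else {"YES"}
    return sum(1 for answer, valid in answers if valid and answer in scoring)


def classifications(results):
    """Return two rankings of tools: (counting NO, not counting NO)."""
    def rank(count_no):
        return sorted(results,
                      key=lambda tool: score(results[tool], count_no),
                      reverse=True)
    return rank(True), rank(False)


# Hypothetical results for two made-up tools on four problems each.
results = {
    "tool_a": [("YES", True), ("NO", True), ("NO", True), ("MAYBE", True)],
    "tool_b": [("YES", True), ("YES", True), ("MAYBE", True), ("YES", False)],
}

with_no, without_no = classifications(results)
print("counting NO:    ", with_no)
print("not counting NO:", without_no)
```

With this made-up data the two rankings differ: tool_a leads when NO
answers count, tool_b leads when they do not.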

-- 
| Claude Marché           | mailto:Claude.Marche at lri.fr |
| LRI - Bât. 490          | http://www.lri.fr/~marche/  |
| Université de Paris-Sud | phoneto: +33 1 69 15 64 85  |
| F-91405 ORSAY Cedex     | faxto: +33 1 69 15 65 86    |