Word sense disambiguation
(WSD) was recognised as a central (and difficult) problem in the very
first paper on computer treatment of language, Weaver's memorandum (Weaver,
1949). Since then, there has been continuous research on WSD, in the
context of various sub-fields (machine translation, information retrieval,
content analysis, natural language understanding, etc.; for a recent
survey, see Ide and Véronis,
1998). An impressive array of methods has been proposed, and occasionally
rediscovered, over the years, and various claims of efficiency have been
made. However, it is extremely difficult to compare the results, and therefore
the methods: the texts, words and sense lists used differ widely
across studies, as do the evaluation protocols and metrics. Under
the auspices of ACL-SIGLEX and EURALEX, the SENSEVAL
evaluation exercise is attempting for the first time to run an ARPA-like
competition between WSD systems.
Discussions among the SENSEVAL
program committee members highlighted the differences in existing linguistic
resources (corpora, dictionaries, etc.) between English and other languages,
and the committee decided to organise within SENSEVAL a specific competition for Romance
languages, called ROMANSEVAL. A six-month test campaign is planned in coordination
with the ARCADE
project on multilingual text alignment, whose word track will use the same
corpus and test words. Results will be presented at the SENSEVAL
workshop in September 1998.