DATA FOR AI Projects (Senseval 3: April 2004)

Multilingual lexical sample [23 teams]

The goal of this task is to create a framework for evaluating systems that perform machine translation, with a focus on the translation of ambiguous words. The task will be very similar to the lexical sample task, except that rather than using the sense inventory from a dictionary, we will follow the suggestion of Resnik and Yarowsky and use the translations of the target words into a second language as the "inventory". The contexts will be in English, and the tags for the target words will be their translations in a second language. We plan to select words with various degrees of "interlingual ambiguity", to create a complete picture of the various problems that may appear in this task. At the moment, we plan on two language pairs, English-French and English-Hindi, with an estimated 50 ambiguous words per pair. The data will be collected via the Open Mind Word Expert (bilingual edition).
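The tagging scheme described above can be illustrated with a minimal sketch: the "sense inventory" for an English target word is simply the set of its translations in the second language, following Resnik and Yarowsky. The example words, translations, and data format below are hypothetical illustrations, not the actual task data.

```python
# Hypothetical sketch: English contexts tagged with the French
# translation of the ambiguous target word (English-French pair).
from collections import Counter

# Each instance: (English context, target word, translation tag).
instances = [
    ("He sat on the river bank to fish.", "bank", "rive"),
    ("She deposited the check at the bank.", "bank", "banque"),
    ("The bank approved the loan request.", "bank", "banque"),
]

def translation_inventory(instances, target):
    # The "inventory" for a target word is the set of observed
    # translation tags, replacing a dictionary sense inventory.
    return sorted({tag for _, word, tag in instances if word == target})

def most_frequent_translation(instances, target):
    # A trivial baseline, analogous to the most-frequent-sense
    # baseline of the standard lexical sample task.
    counts = Counter(tag for _, word, tag in instances if word == target)
    return counts.most_common(1)[0][0]

print(translation_inventory(instances, "bank"))      # ['banque', 'rive']
print(most_frequent_translation(instances, "bank"))  # 'banque'
```

Words with high "interlingual ambiguity" would have larger inventories, i.e. many distinct translation tags across contexts.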
Coordinators:
Automatic Labeling of Semantic Roles [36 teams]
Trial data: available from the task webpage
Word-sense disambiguation has frequently been criticized as a task in
search of a reason. Heretofore, the focus of disambiguation has been on
the sense inventory and has not examined the major reason why we would
have lexical knowledge bases: how the meanings would be represented and
thus made available for use in natural language processing applications.
An important baseline study for automatic labeling of semantic roles
(following the FrameNet paradigm) has recently appeared in the
literature ("Automatic Labeling of Semantic Roles" by Daniel Gildea and
Daniel Jurafsky). The FrameNet project has assembled a body of
hand-labeled data, and this study has established a set of suitable
metrics for evaluating the performance of an automatic system. The
proposed Senseval-3 task would call for the development of systems to
meet the same objectives as the Gildea and Jurafsky study. The data for
this task would be a sample of the FrameNet hand-annotated data.
Evaluation of systems would follow the metrics of the Gildea and
Jurafsky study.
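The kind of evaluation described above can be sketched in simplified form: systems are scored by precision and recall over the frame elements they identify, in the spirit of the metrics used in the Gildea and Jurafsky study. The span representation and the example roles below are illustrative assumptions, not the actual FrameNet annotation format or official scorer.

```python
# Simplified exact-match scoring sketch for semantic role labeling.
# A role assignment is a (start, end, role) tuple over token indices;
# a prediction is correct only if span and role label both match.

def score(gold, predicted):
    gold_set, pred_set = set(gold), set(predicted)
    correct = len(gold_set & pred_set)
    precision = correct / len(pred_set) if pred_set else 0.0
    recall = correct / len(gold_set) if gold_set else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Illustrative data: one correct role, one with a wrong label.
gold = [(0, 1, "Agent"), (3, 5, "Theme")]
predicted = [(0, 1, "Agent"), (3, 5, "Goal")]
print(score(gold, predicted))  # (0.5, 0.5, 0.5)
```

Partial-credit variants (e.g. scoring span boundaries and role labels separately) are also possible; the exact-match version above is only the simplest case.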
Coordinator: Ken Litkowski (ken@clres.com)
Datasets