DATA FOR AI Projects (Senseval 3: April 2004)

Multilingual lexical sample [23 teams]

The goal of this task is to create a framework for evaluating systems that perform machine translation, with a focus on the translation of ambiguous words. The task will be very similar to the lexical sample task, except that rather than using the sense inventory from a dictionary, we will follow the suggestion of Resnik and Yarowsky and use the translations of the target words into a second language as the "inventory". The contexts will be in English, and the tags for the target words will be their translations in a second language. We plan to select words with various degrees of "interlingual ambiguity", to create a complete picture of the various problems that may appear in this task. At the moment, we plan on two language pairs, English-French and English-Hindi, with an estimated 50 ambiguous words per pair. The data will be collected via the Open Mind Word Expert (bilingual edition).
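The tagging scheme described above can be illustrated with a minimal sketch: the "sense inventory" for an English target word is simply the set of its translations in the second language, following Resnik and Yarowsky. The example words, translations, and data format below are hypothetical illustrations, not the actual task data.

```python
# Hypothetical sketch: English contexts tagged with the French
# translation of the ambiguous target word (English-French pair).
from collections import Counter

# Each instance: (English context, target word, translation tag).
instances = [
    ("He sat on the river bank to fish.", "bank", "rive"),
    ("She deposited the check at the bank.", "bank", "banque"),
    ("The bank approved the loan request.", "bank", "banque"),
]

def translation_inventory(instances, target):
    # The "inventory" for a target word is the set of observed
    # translation tags, replacing a dictionary sense inventory.
    return sorted({tag for _, word, tag in instances if word == target})

def most_frequent_translation(instances, target):
    # A trivial baseline, analogous to the most-frequent-sense
    # baseline of the standard lexical sample task.
    counts = Counter(tag for _, word, tag in instances if word == target)
    return counts.most_common(1)[0][0]

print(translation_inventory(instances, "bank"))      # ['banque', 'rive']
print(most_frequent_translation(instances, "bank"))  # 'banque'
```

Words with high "interlingual ambiguity" would have larger inventories, i.e. many distinct translation tags across contexts.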
Coordinators:
Automatic Labeling of Semantic Roles [36 teams]
Trial data: available from the task webpage
Word-sense disambiguation has frequently been criticized as a task in
search of a reason. Heretofore, the focus of disambiguation has been on
the sense inventory and has not examined the major reason why we would
have lexical knowledge bases: how the meanings would be represented and
thus made available for use in natural language processing applications.
An important baseline study for automatic labeling of semantic roles
(following the FrameNet paradigm) has recently appeared in the
literature ("Automatic Labeling of Semantic Roles" by Daniel Gildea and
Daniel Jurafsky). The FrameNet project has assembled a body of
hand-labeled data, and this study has established a set of suitable
metrics for evaluating the performance of an automatic system. The
proposed Senseval-3 task would call for the development of systems to
meet the same objectives as the Gildea and Jurafsky study. The data for
this task would be a sample of the FrameNet hand-annotated data.
Evaluation of systems would follow the metrics of the Gildea and
Jurafsky study.
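The kind of evaluation described above can be sketched in simplified form: systems are scored by precision and recall over the frame elements they identify, in the spirit of the metrics used in the Gildea and Jurafsky study. The span representation and the example roles below are illustrative assumptions, not the actual FrameNet annotation format or official scorer.

```python
# Simplified exact-match scoring sketch for semantic role labeling.
# A role assignment is a (start, end, role) tuple over token indices;
# a prediction is correct only if span and role label both match.

def score(gold, predicted):
    gold_set, pred_set = set(gold), set(predicted)
    correct = len(gold_set & pred_set)
    precision = correct / len(pred_set) if pred_set else 0.0
    recall = correct / len(gold_set) if gold_set else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Illustrative data: one correct role, one with a wrong label.
gold = [(0, 1, "Agent"), (3, 5, "Theme")]
predicted = [(0, 1, "Agent"), (3, 5, "Goal")]
print(score(gold, predicted))  # (0.5, 0.5, 0.5)
```

Partial-credit variants (e.g. scoring span boundaries and role labels separately) are also possible; the exact-match version above is only the simplest case.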
Coordinator: Ken Litkowski (ken@clres.com)
Datasets