HPSG



Head-Driven Phrase Structure Grammar (HPSG) is the latest link in the chain of Context-Free Grammars (CFGs), which have found extensive use in computational linguistics. These grammars are easy to implement on computers because they separate the structure of a sentence from the context in which it is uttered.

The particulars of HPSG were first spelled out formally by Pollard and Sag in their work of 1994. That work was not complete, however, as grammatical formulations for many constructions were not provided. The grammar has evolved continuously since then, and is still evolving, with researchers around the world contributing the missing pieces.

This grammar is particularly easy to implement on computers because it drastically reduces the number of rules and syntactic formulations. For example, all the complexities of the English language are handled by 6 rules and a small number of principles. This simplicity, however, comes with a corresponding increase in the complexity of the word-list, as every lexical entry is highly articulated to cover all the structures and interpretations in which the word can take part. Consider the following sentences:

  1. Rajat dines.
  2. Rajat devours the pizza.
  3. Rajat dines the pizza.
  4. Rajat devours.

Both "dines" and "devours" are verbs that have to do with eating. Yet sentences 1 and 2 are grammatical, while 3 and 4 are not. HPSG accounts for this difference through the lexical entries for these words, which record what each verb must combine with.
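The contrast between the four sentences can be sketched in a few lines of Python. This is only a toy illustration: the `LEXICON` table, the category label "NP", and the function `is_licensed` are assumptions made for this example, not part of any published HPSG implementation.

```python
# Toy lexicon: each verb lists the complements it requires after the subject.
# Entries and category labels are illustrative assumptions.
LEXICON = {
    "dines": [],        # takes no object
    "devours": ["NP"],  # requires exactly one noun-phrase object
}


def is_licensed(verb, complements):
    """Return True if the complements match the verb's requirements exactly."""
    return LEXICON[verb] == list(complements)


# Each sentence is reduced to its verb and the categories of its complements.
sentences = [
    ("Rajat dines.", "dines", []),
    ("Rajat devours the pizza.", "devours", ["NP"]),
    ("Rajat dines the pizza.", "dines", ["NP"]),
    ("Rajat devours.", "devours", []),
]

for text, verb, complements in sentences:
    verdict = "grammatical" if is_licensed(verb, complements) else "ungrammatical"
    print(f"{text:28s} {verdict}")
```

The grammar itself stays the same for both verbs; only the lexical entries differ, which is the trade-off described above.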

HPSG is a head-driven grammar, which means that each phrase is assigned a head that controls the grammatical behaviour of that phrase. For example, in the phrase "the cat in the socks", the entity being talked about is the cat, and the word "cat" controls the grammatical behaviour, such as plurality, of the whole phrase.

Lexical Entries in HPSG

As already mentioned, the word-entries in HPSG are highly detailed, specifying under what circumstances a particular word can be used and in combination with which words and phrases. The conditions of usage for a word are stored in the Content and Context values of the entry. The conditions on co-occurrence with other entries are governed by the sub-category list. All three combine to give the Synsem (Syntax+Semantics) value of the word. For example, the detailed entry for the word "she" is as follows:

The lexical entries we use are not as detailed, but they capture all the essential features of the above structure.
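Because a lexical entry is a nested set of attribute-value pairs, it can be approximated as a nested dictionary. The sketch below is a simplified guess at an entry for "she", loosely modelled on the way such entries are usually presented following Pollard and Sag; the exact attribute names, the values, and the helper `get_path` are assumptions for illustration, and structure sharing between Content and Context is only indicated by a comment.

```python
# Simplified, assumed entry for "she" as a nested dictionary.
SHE = {
    "PHON": ["she"],
    "SYNSEM": {
        "CATEGORY": {
            "HEAD": {"pos": "noun", "CASE": "nom"},
            "SUBCAT": [],  # a pronoun needs nothing else to be complete
        },
        "CONTENT": {
            "INDEX": {"PER": "3rd", "NUM": "sing", "GEND": "fem"},
            "RESTR": [],
        },
        "CONTEXT": {
            # Background condition: the referent is female.
            # "INDEX" stands in for a pointer to the CONTENT index above.
            "BACKGR": [{"RELN": "female", "INST": "INDEX"}],
        },
    },
}


def get_path(feature_structure, path):
    """Follow a sequence of attribute names through a nested structure."""
    for attribute in path:
        feature_structure = feature_structure[attribute]
    return feature_structure


print(get_path(SHE, ["SYNSEM", "CATEGORY", "HEAD", "CASE"]))
print(get_path(SHE, ["SYNSEM", "CONTENT", "INDEX", "NUM"]))
```

A full implementation would use typed feature structures with real structure sharing rather than plain dictionaries; the dictionary form is only meant to show how Content, Context, and the sub-category list sit together under Synsem.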

Rules and Principles

All the complex constructions of the English language are reduced to a set of 6 grammatical rules and a small number of principles. The rules specify whether a given phrase is grammatical or not, and the principles say how phrases behave once their parts have been combined. The 6 rules and 2 of the most important principles are listed below:

RULES: A phrase is grammatical if

  1. The sub-category list is empty and the head-daughter is a phrase.
  2. The sub-category list is of length one and the head-daughter is a word.
  3. The sub-category list is empty and the head-daughter is a word.
  4. Head-Marker rule: Marker-daughter's head is a marker.
  5. Head-Adjunct rule: The adjunct-daughter's MOD value is the same as the SYNSEM value of the head-daughter.
  6. Head-Filler rule: The head-daughter is of the form [HEAD verb[VFORM finite], SUBCAT <>], and the filler-daughter's LOCAL value is the same as the head-daughter's SLASH value.
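The first three rules depend on only two facts: how long the phrase's sub-category list is, and whether the head-daughter is a word. A minimal sketch of that check follows. The function name `matching_rule` is hypothetical, and the sketch assumes that rule 1 applies when the head-daughter is a phrase, so that rules 1 and 3 stay distinct.

```python
def matching_rule(subcat, head_is_word):
    """Return the number (1, 2 or 3) of the rule a phrase satisfies, or None.

    subcat       : the phrase's remaining sub-category list
    head_is_word : True if the head-daughter is a word, False if it is a phrase
    """
    if len(subcat) == 0 and not head_is_word:
        return 1  # assumed reading of rule 1: phrasal head-daughter
    if len(subcat) == 1 and head_is_word:
        return 2
    if len(subcat) == 0 and head_is_word:
        return 3
    return None  # rules 4 to 6 need more information than this sketch tracks


# "devours the pizza": the verb is a word and only the subject is still missing.
print(matching_rule(["NP"], head_is_word=True))
# "Rajat devours the pizza": the head-daughter is the phrase "devours the pizza".
print(matching_rule([], head_is_word=False))
```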

PRINCIPLES:

  1. Head Feature Principle: The head of a phrase is identical to the head of its head-daughter.
  2. Subcategorization Principle: The sub-category list of a phrase is the sub-category list of its head-daughter minus those members that have already been satisfied.
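
The two principles can be illustrated together with a small Python sketch. The function `project`, the dictionary layout, and the labels "NP[nom]" and "NP[acc]" are assumptions made for this example; the sketch ignores word order and everything except the head and the sub-category list.

```python
def project(head_daughter, satisfied):
    """Build the mother of a phrase from its head-daughter.

    head_daughter : dict with "HEAD" and "SUBCAT" (a list of category labels)
    satisfied     : categories supplied by the non-head daughters
    """
    remaining = list(head_daughter["SUBCAT"])
    for category in satisfied:
        if category not in remaining:
            raise ValueError(f"{category!r} is not on the sub-category list")
        remaining.remove(category)
    return {
        # Head Feature Principle: the mother shares the head-daughter's head.
        "HEAD": head_daughter["HEAD"],
        # Subcategorization Principle: what is left after removing
        # the members that were satisfied in this phrase.
        "SUBCAT": remaining,
    }


devours = {
    "HEAD": {"pos": "verb", "VFORM": "finite"},
    "SUBCAT": ["NP[nom]", "NP[acc]"],  # subject and object
}

verb_phrase = project(devours, ["NP[acc]"])     # "devours the pizza"
sentence = project(verb_phrase, ["NP[nom]"])    # "Rajat devours the pizza"

print(verb_phrase["SUBCAT"])
print(sentence["SUBCAT"])
print(sentence["HEAD"] is devours["HEAD"])
```

The sentence ends up with an empty sub-category list and the same head as the verb "devours", which is what it means for the verb to be the head of the whole sentence.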