On Lexical Aggregation and Ordering

Hercules Dalianis & Eduard Hovy

Abstract

Aggregation is the process of removing redundant information during language generation while preserving the information to be conveyed. Various types of aggregation (syntactic, lexical, referential) have been identified in other work. This paper investigates lexical aggregation, the process by which a set of items is replaced with a single new lexeme that encompasses the same meaning. It can be divided into two major types, Bounded and Unbounded. With Bounded lexical aggregation (where the aggregator lexeme covers a closed set of concepts), the redundancy is obvious and the aggregation must be carried out, whereas Unbounded lexical aggregation has to be licensed by other factors, such as the hearer's goals. Furthermore, although lexical aggregation operates over lexis, interactions between syntactic and lexical aggregation necessitate careful ordering of their respective rules. We performed an experiment to determine the optimal ordering(s) by applying several aggregation rules, in all permutations, to the clauses of a text plan, which were also permuted. The paper describes the somewhat surprising results.
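
The shape of the permutation experiment can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the clause representation (subject, predicate, objects), the two toy rules `group_objects` (a stand-in for a syntactic aggregation rule) and `bounded_lexical` (a stand-in for Bounded lexical aggregation), the closed set behind "the weekend", and the example clauses.

```python
from itertools import permutations

# Hypothetical closed set: the aggregator lexeme covers exactly these items.
BOUNDED = {frozenset({"Saturday", "Sunday"}): "the weekend"}

# Hypothetical text-plan clauses as (subject, predicate, objects).
CLAUSES = [
    ("shop", "opens on", ("Saturday",)),
    ("shop", "opens on", ("Sunday",)),
    ("cafe", "opens on", ("Saturday",)),
]


def group_objects(clauses):
    """Toy syntactic rule: merge clauses that share subject and predicate."""
    merged = {}
    for subj, pred, objs in clauses:
        merged.setdefault((subj, pred), []).extend(objs)
    return [(s, p, tuple(o)) for (s, p), o in merged.items()]


def bounded_lexical(clauses):
    """Toy Bounded lexical rule: replace a complete closed set with its lexeme."""
    out = []
    for subj, pred, objs in clauses:
        lexeme = BOUNDED.get(frozenset(objs))
        out.append((subj, pred, (lexeme,) if lexeme else objs))
    return out


def run_all(clauses, rules):
    """Apply every rule ordering to every clause ordering; collect the outcomes."""
    results = {}
    for rule_order in permutations(rules):
        key = tuple(rule.__name__ for rule in rule_order)
        for clause_order in permutations(clauses):
            current = list(clause_order)
            for rule in rule_order:
                current = rule(current)
            results.setdefault(key, set()).add(frozenset(current))
    return results


if __name__ == "__main__":
    for order, outcomes in run_all(CLAUSES, [group_objects, bounded_lexical]).items():
        print(order, len(outcomes), "distinct outcome(s)")
```

In this toy setting the lexical rule can fire only after the grouping rule has collected both days into one clause, which is one simple way in which rule ordering can change the result. It makes no claim about the orderings the paper actually found optimal.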
