Aggregation as a Subtask of Text and Sentence Planning

Hercules Dalianis

Abstract

Natural language generation is the technique of letting a computer automatically create natural language, e.g. English, Chinese or Greek, out of a computational representation. To generate natural language from computational representations, a number of processes must be carried out. One task within the process called sentence planning is aggregation. Aggregation is the process that removes redundancies during the generation of a natural language discourse without losing any information. Aggregation, which has been called ellipsis or coordination in linguistics, makes text more fluent and easier to read. While people aggregate all the time without thinking about it, the contents of software engineering tools, databases, expert systems, etc., are often highly redundant and need aggregation before being paraphrased into natural language. This paper summarizes a larger work [Dalia96], which addresses various aspects of aggregation. When do we need to carry out aggregation? What types of aggregation are there? Are there any general rules for how to aggregate? How are the rules related to each other? Aggregation may give rise to ambiguities: how can we resolve them? How is aggregation related to the other generation processes?
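
As a rough illustration of what removing redundancy without losing information can mean, the sketch below merges facts that share a subject and a verb into one coordinated sentence. It is not one of the rules from [Dalia96]: the (subject, verb, object) triple representation and the function name `aggregate` are assumptions made only for this example.

```python
def aggregate(facts):
    """Merge (subject, verb, object) triples that share subject and verb.

    Hypothetical helper for illustration only; it coordinates the objects
    of matching triples so that no fact is dropped.
    """
    groups = {}  # dicts keep insertion order, so output follows input order
    for subject, verb, obj in facts:
        groups.setdefault((subject, verb), []).append(obj)

    sentences = []
    for (subject, verb), objects in groups.items():
        if len(objects) == 1:
            obj_text = objects[0]
        else:
            # Coordinate the objects: "a, b and c"
            obj_text = ", ".join(objects[:-1]) + " and " + objects[-1]
        sentences.append(f"{subject} {verb} {obj_text}.")
    return sentences


facts = [
    ("John", "owns", "a car"),
    ("John", "owns", "a house"),
    ("Mary", "owns", "a boat"),
]
print(aggregate(facts))
```

Without aggregation the three facts would be rendered as three sentences, two of which repeat "John owns"; the sketch emits one sentence per subject and verb pair while keeping every object.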
