Hercules Dalianis
Abstract
The content of real-world databases, knowledge bases, database models, and formal
specifications is often highly redundant and needs to be aggregated before these
representations can be successfully paraphrased into natural language.
To generate natural language from these representations, a number of processes must
be carried out. One of these processes is sentence planning, during which aggregation is performed.
Aggregation, which has been called ellipsis or coordination in linguistics, is the
process that removes redundancies during the generation of a natural language discourse
without losing any information.
This article addresses various aspects of aggregation: When do we need it? What types
of aggregation exist? Are there any general rules for aggregation? How can we resolve
the ambiguities introduced by aggregation? How is aggregation related to other generation
processes?
The article describes a set of corpus studies that focus on aggregation, provides
a set of aggregation rules, and, finally, shows how these rules are implemented in
a couple of prototype systems.
We further develop the concept of aggregation and discuss it in connection with the
growing literature on the subject. This work offers a new tool for the sentence planning
phase of natural language generation systems.