John W. N. van Smaalen, PhD Candidate
Dept. of Surveying and Remote Sensing, Wageningen Agricultural University
Office: Hesselink van Suchtelenweg 6
Snail mail: P.O. Box 339, 6700 AH Wageningen, The Netherlands
Email: [email protected]
'A typical objective of generalization is to capture the essential character of some phenomena and remove unnecessary spatial and attribute detail.'
But which information is essential and which is unnecessary? Like the structuring of the original data, this depends strongly on the purpose. Consequently, the generalization strategy to be followed also depends on the intended use of the generalized dataset.
When looking at the various generalization processes distinguished by different authors, we find that most tend to divide generalization into two distinct parts.
'Geographical information abstraction mainly concerns managing geographical meaning in databases, and map generalization mainly concerns structuring map presentations. For these reasons, it is convenient and useful to separate geographical information abstraction and map generalization.' [Nyerges '91]
Nyerges' distinction roughly corresponds to the distinction between conceptual generalization (= information abstraction) and cartographic generalization (= map generalization) that will be used in the research presented here. Thus the process of generalization can be divided into conceptual generalization and cartographic generalization (Figure 1).
Figure 1: Conceptual vs. Cartographic generalization
In cartography, generalization is mostly done by hand, interactively. Although organizations involved in the production of geographic and cartographic material are highly interested in automating the generalization process, there is no complete solution yet. So far, the emphasis has been on the graphic representation rather than on the underlying semantic structure.
The research will concentrate primarily on thematic and topological issues. Both topology and thematics contain semantic information, so the semantics of the data will play a central role. The objective is to develop a framework for conceptual generalization as well as a working prototype. The proposed method will not try to imitate the generalization actions of human professionals. In the first place, it would be very difficult, if not impossible, to include all considerations taken into account by human cartographers. In addition, human interpretations, although proven more than usable, are not always consistent. This research avoids describing generalization in terms of scale (e.g. "generalizing from scale 1:1,000 to 1:25,000"). Instead, thematic issues and aggregation hierarchies will completely direct the generalization process.
Table 1: Terminology used (refers to the representation of reality in a model, not reality itself).
Term                   Example
'object class'         employees
'object'               employee Peter Peterson
'attribute'            salary of employees
'attribute value'      $4000 salary of Peter Peterson
'relationship type'    employees work at departments
'relationship'         Peter Peterson works at Sales
The strategy of the research is based on the definition of a powerful data structure to describe the basic information (see [Richardson '93]). Elementary terrain features and their mutual relationships are described using geometric and thematic components. The data will be stored in vector format, structured according to the Formal Data Structure [Molenaar '89], an object-based model (for the terminology used, see Table 1). The value of the object-oriented approach is now widely recognised for many applications, including generalization:
'An object-oriented approach is a good framework for implementing a prototype generalization system. It allows to correctly identify objects, their attributes and their behaviors.' [Kilpeläinen '92], [Armstrong & Bennett '90]
A characteristic of the object model is that not only attributes, but also functions can be assigned to the object class(es). In generalization this means that generalization functions, applicable for the geographical feature classes (rivers, roads, fields etc.), can be stored with the object class description.
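As a rough illustration of storing generalization functions with the object class description, consider the Python sketch below. It is not the Formal Data Structure and not the prototype described here: the class `ObjectClass`, the method names, the function `drop_short_roads`, and the 100 m threshold are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Assumption: an object is represented as a plain dict of attribute values.
Obj = Dict[str, object]
GeneralizationFn = Callable[[List[Obj]], List[Obj]]


@dataclass
class ObjectClass:
    """Hypothetical object class description: attributes plus functions."""
    name: str
    attributes: List[str]
    generalizations: Dict[str, GeneralizationFn] = field(default_factory=dict)

    def add_generalization(self, label: str, fn: GeneralizationFn) -> None:
        # The function is stored with the class, not with individual objects.
        self.generalizations[label] = fn

    def generalize(self, label: str, objects: List[Obj]) -> List[Obj]:
        # Look up the stored function by label and apply it to the objects.
        return self.generalizations[label](objects)


def drop_short_roads(roads: List[Obj]) -> List[Obj]:
    # Illustrative selection function; the 100 m threshold is arbitrary.
    return [r for r in roads if r["length_m"] >= 100]


road = ObjectClass("road", ["name", "length_m"])
road.add_generalization("select", drop_short_roads)

roads = [{"name": "A", "length_m": 2500}, {"name": "B", "length_m": 40}]
kept = road.generalize("select", roads)
```

Because the function is attached to the class description, a different feature class (rivers, fields) could carry its own function under the same label.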
Conceptual generalization based on a combination of classification hierarchies (top-down), topological/spatial analysis (bottom-up) and rules seems to be the most promising approach towards a more powerful system. Furthermore, it is important to establish whether the main aim is thematic abstraction, spatial abstraction, or a combination of both.
If the aim is a combination of both thematic and spatial abstraction, care must be taken that thematic abstraction does not become a subordinate issue. Special attention should therefore be given to the behavior of attributes when generalizing.
The system described is primarily being developed to generalize detailed cadastral and topographic data but is designed to provide a more generic approach. It consists of two clearly distinguishable parts (Figure 2):
Figure 2. Functional description of the system.
Figure 3. Example of a semantic network including generalization rules.
The system will provide a graphic interface to define, visualize and change class relationships. Figure 3 gives an example of such an interface. Using this interface the 'expert user' builds a semantic network containing the relationships between the object classes. Examples of these relationships are:
Figure 4. Operations used in conceptual generalization.
Generalization rules also refer to one or more object classes and can therefore be included in the same structure.
Figure 3 also shows that not only object classes with a spatial representation (building, parcel) but also other object classes (resident, address) are included, because these can also play a role in the generalization process. An example is the aggregation of residential areas based on the income of the residents.
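The income example can be sketched in Python as follows. This is only an illustration under stated assumptions: the parcel identifiers, the income figures, the adjacency list, the two-class income split with its 50,000 threshold, and the use of a union-find structure to merge neighbours are all hypothetical choices, not part of the system described.

```python
from statistics import mean

# Hypothetical data: parcel id -> incomes of the residents living there.
residents = {
    "p1": [30000, 34000],
    "p2": [31000],
    "p3": [90000, 85000],
    "p4": [88000],
}
# Hypothetical topology: pairs of adjacent parcels.
adjacent = [("p1", "p2"), ("p2", "p3"), ("p3", "p4")]


def income_class(incomes, threshold=50000):
    # Thematic classification derived from a non-spatial class (resident).
    return "high" if mean(incomes) >= threshold else "low"


def aggregate(residents, adjacent):
    # Union-find: merge adjacent parcels that fall in the same income class.
    parent = {p: p for p in residents}

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path compression
            p = parent[p]
        return p

    for a, b in adjacent:
        if income_class(residents[a]) == income_class(residents[b]):
            parent[find(a)] = find(b)

    groups = {}
    for p in residents:
        groups.setdefault(find(p), []).append(p)
    return sorted(sorted(g) for g in groups.values())


areas = aggregate(residents, adjacent)
```

Here the attribute of a non-spatial class (income of residents) decides which spatial objects (parcels) are aggregated, while adjacency restricts merging to neighbours.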
To define the semantic network, the 'expert user' is assisted by procedures that automatically derive as much information as possible from the (type of) dataset used. The attribute structure (common attributes) of the objects will be used to derive initial classification hierarchies. The attribute values will be used to provide measures of similarity for different object classes. If possible, the spatial attributes (geometry, topology) will also be used for these purposes.
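One possible way to propose initial classification hierarchies from common attributes is sketched below in Python. The Jaccard index on attribute names is a measure substituted here for illustration; the text does not specify one. The class names, attribute names, and the 0.5 threshold are hypothetical, and the sketch covers only the attribute structure, not similarity based on attribute values.

```python
def jaccard(a, b):
    # Share of attributes two classes have in common (0.0 to 1.0).
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hypothetical object classes with their attribute names.
classes = {
    "house": {"address", "floor_area", "owner"},
    "factory": {"address", "floor_area", "owner", "industry_type"},
    "river": {"name", "width"},
}


def propose_superclasses(classes, threshold=0.5):
    # Suggest pairs of classes that could share a superclass, together
    # with the common attributes that the superclass would carry.
    names = sorted(classes)
    proposals = []
    for i, x in enumerate(names):
        for y in names[i + 1:]:
            if jaccard(classes[x], classes[y]) >= threshold:
                common = sorted(classes[x] & classes[y])
                proposals.append((x, y, common))
    return proposals


proposals = propose_superclasses(classes)
```

The proposals are only a starting point; in the approach described, the 'expert user' would accept, reject, or adjust them through the graphic interface.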
Implementing classification relationships and other relationships as a semantic network assures flexibility. Apart from the semantic network, which describes the relationships between the object classes, a network will be constructed containing the (elementary) generalization rules and operations and their mutual dependencies. This rule network is linked to the semantic network through the object classes to which the rules apply (Figure 3). The order in which the elementary generalization operations should be executed is considered essential and will be included in the rule network.
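A rule network with execution-order dependencies can be sketched as a directed graph that is sorted topologically. The Python example below uses the standard-library `graphlib` module (Python 3.9 or later); the rule names, their dependencies, and the mapping to object classes are invented for illustration and are not taken from Figure 3.

```python
from graphlib import TopologicalSorter

# Hypothetical rule network: rule -> rules that must be executed first.
depends_on = {
    "reclassify_parcels": set(),
    "aggregate_parcels": {"reclassify_parcels"},
    "remove_small_areas": {"aggregate_parcels"},
    "simplify_boundaries": {"aggregate_parcels", "remove_small_areas"},
}

# Hypothetical link to the semantic network: rule -> object classes it applies to.
applies_to = {
    "reclassify_parcels": {"parcel"},
    "aggregate_parcels": {"parcel"},
    "remove_small_areas": {"parcel", "building"},
    "simplify_boundaries": {"parcel"},
}

# A topological sort yields an execution order that respects all dependencies.
order = list(TopologicalSorter(depends_on).static_order())


def rules_for(object_class):
    # Rules applicable to one object class, in a valid execution order.
    return [rule for rule in order if object_class in applies_to[rule]]
```

A cyclic dependency between rules would make `static_order()` raise `graphlib.CycleError`, which is one way an inconsistent rule network could be detected early.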
In summary: phase 1 deals with the definition of generic relationships at class level, whereas phase 2 deals with specific relationships between the actual objects. In phase 2 the individual objects are known and located, so the relationships will generally have a more spatial character.
The dataset to be generalized is stratified in order to select the generalization rules applicable to each area. An inference mechanism (rule-based reasoning) will be used to execute the generalization rules.
User interaction will be limited to the absolute minimum in the second phase. After the required rule parameters have been adjusted, the generalization process is executed without interruption. Frequently used sets of parameters could be stored as predefined 'profiles', like the user-defined settings of a word processor.
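The combination of stratification, rule selection per stratum, and stored parameter profiles might look roughly like the Python sketch below. It is a simplification under stated assumptions: rules are applied sequentially instead of through an inference mechanism, each object is assumed to carry a pre-assigned `stratum` attribute, and the profile names, the `min_area_m2` parameter, and the rule `drop_small` are hypothetical.

```python
from typing import Callable, Dict, List

Obj = Dict[str, object]
Params = Dict[str, float]
Rule = Callable[[List[Obj], Params], List[Obj]]


def drop_small(objs: List[Obj], params: Params) -> List[Obj]:
    # Illustrative rule: remove objects below a minimum area.
    return [o for o in objs if o["area_m2"] >= params["min_area_m2"]]


# Hypothetical predefined profiles (stored parameter sets).
PROFILES: Dict[str, Params] = {
    "coarse": {"min_area_m2": 500.0},
    "fine": {"min_area_m2": 50.0},
}

# Hypothetical rule selection per stratum.
RULES_BY_STRATUM: Dict[str, List[Rule]] = {"urban": [drop_small], "rural": []}


def run(dataset: List[Obj], profile_name: str) -> List[Obj]:
    params = PROFILES[profile_name]
    # Stratify: group the objects by their stratum attribute.
    strata: Dict[str, List[Obj]] = {}
    for o in dataset:
        strata.setdefault(str(o["stratum"]), []).append(o)
    # Apply the rules selected for each stratum, without user interaction.
    result: List[Obj] = []
    for stratum, objs in strata.items():
        for rule in RULES_BY_STRATUM.get(stratum, []):
            objs = rule(objs, params)
        result.extend(objs)
    return result


dataset = [
    {"id": 1, "stratum": "urban", "area_m2": 120},
    {"id": 2, "stratum": "urban", "area_m2": 900},
    {"id": 3, "stratum": "rural", "area_m2": 80},
]
coarse_ids = [o["id"] for o in run(dataset, "coarse")]
fine_ids = [o["id"] for o in run(dataset, "fine")]
```

Switching profile changes only the parameters, not the rules or their selection per stratum, which mirrors the idea of storing frequently used settings separately from the process itself.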