CLustering using Indifference and Preference relations

Get Complete Project Material File(s) Now! »

Modelling a decision aiding process

As part of the decision aiding process, several steps can be identified. These are the problem situation, the problem formulation, the selection of an evaluation model and the extraction of a final recommendation [Tsoukiàs 2007]. They may not be all explored during each decision aiding process, nor in any particular order, In general, one of the first steps that need to be taken, when considering a decision problem, is to situate the problem.
Some of the main questions that should be answered during this step are [Tsouk-iàs 2007]:
• who has a problem?
• why is this a problem?
• who decides on this problem?
• what are the stakes?
• etc.
Although the dm may already consider some of these questions before commit-ting to making a decision, the answers to these questions should also be explored when employing the aid of an analyst. During this step, the set of actors which take part in the decision process, the stakes each of them brings and the resources they commit on their stakes need to be identified. Exploring this step is also beneficial in helping to eliminate any misunderstandings between the dm and the analyst.

Formulating the problem

While the previous step has a more descriptive purpose, the step of formulating the problem has a more constructive character. At this point the analyst needs to reduce the real problem to a formal one. If the model is accepted by the dm, then diﬀerent methods can be used afterwards in order to solve the decision problem.
At this point, the alternatives, or potential decisions, need to be identified, along with the set of points of view on them, i.e. how the alternatives are evaluated, and finally the objective, or what the dm expects to find at the end of this process.
Several decision aiding approaches will stop after this step, as they consider that understanding the problem is equivalent to solving it. Other approaches consider the formulation of the problem as given and will only consider the next steps. Regardless of these approaches, the step of proposing a problem formulation is very important in the decision aid process and is used further to construct the evaluation model.

The evaluation model

This step represents the part that most classical decision aiding approaches focus on. With the alternatives defined and the criteria, which express the features of the alternatives that are important to the dm, constructed, diﬀerent models and approaches can be employed in order to find a formal solution. A certain model, which is able to capture the preferences of the dm on the set of alternatives, needs to be first selected and tuned accordingly. This step is very important as the final result will be strongly reflected by this choice. The model is then used to construct a global relation between the alternatives which is then exploited by a selected method in order to provide a solution. An important aspect when selecting the evaluation model is to have this model emerge naturally from the problem formulation and not by first choosing a specific method and then formulating the problem around it [Tsoukiàs 2007].

The final recommendation

The final recommendation is constructed by translating the formal solution from the previous step into a clear and easily understandable recommendation from the point of view of the dm.
Several aspects need to be validated when constructing this result:
• the analyst needs to be sure that the model is formally correct;
• the dm needs to be sure the he understands the model and that it reflects his preferences;
• the recommendation needs to be accepted or not by the dm, following which the reasons for this decision should be properly understood.

Modelling preferences

Finding a preference model and tuning it so that it is able to capture the point of view of the dm is a crucial step in the decision aiding process. We will focus in this section on the two major approaches to modelling preferences: utility function and outranking relations. Before detailing them, it is important to state that regardless of the chosen model, one of the following binary relations may be extracted for any pair of alter-natives [Roy 1996, Greco et al. 2010]:
• Indiﬀerence (I): a reflexive and symmetric relation that appears between al-ternatives that are considered equivalent by the dm.
• Strict preference (P): a non-reflexive and asymmetric relation between an al-ternatives that is clearly favoured by the dm instead of the other.
• Weak preference (Q): a non-reflexive and asymmetric relation between an alternative which is either strictly preferred or indiﬀerent to another.
• Incomparability (R): a non-reflexive and symmetric relation between two al-ternatives that cannot be placed in either of the previous relations.

Value functions

The Multi-attribute Value Theory (mavt) [Keeney and Raiffa 1976] avoids the diﬃculty of the multidimensional character of the alternatives by creating a unique criterion that aggregates all the criteria together. In this way all the dimen-sions that define an alternative are reduced to a single one, the vector of evaluations of the alternatives to a single value, or score. This approach assumes that the preferences of the decision maker can be mod-elled as a weak order over the set of alternatives X and models the complete and transitive binary relation on X using an overall value function U [Roy 1971b, Keeney and Raiffa 1976] in such a way that: x y ⇐⇒ U (x) > U (y), ∀x, y ∈ X. The most common overall value function is the additive form [Fishburn 1964, Fishburn 1965, Fishburn 1970], which is equal to the sum of all the marginal value functions ui: U (x) = ui(xi), ∀x ∈ X, i∈I. The marginal value function ui determines the score given to an alternative on criterion i ∈ F . One particular and well studied form of the overall value function is the weighted sum, where we associate with every criterion i ∈ F a weight wi: U (x) = wi.xi, ∀x ∈ X. i∈F. The weights, in this case, have the role of trade-oﬀs between criteria, e.g. a low value on one criterion may be compensated by a high value on another, if first criterion has a lower weight than that of the second.
Let us notice that the additive form makes an important assumption on the independence of the criteria. If the criteria are not independent, we may use diﬀer-ent models, like for instance the Choquet integral [Choquet 1953] or the Sugeno integral [Sugeno 1974], which also consider possible interactions between criteria. These interactions can be positive, when, for instance, the impact of one criterion is reinforced by the impact of another, or negative, when two criteria are partially redundant. In comparison to the additive function, the Choquet integral and the Sugeno integral have more parameters in the form of weights, which are called capacities for the first and fuzzy measures for the second. Several common issues exist when expressing preferences using value functions. The first lies in the selection of the value function and its form, i.e. its parameters. This is a diﬃcult task and diﬀerent ways to elicit these parameters have been pro-posed. We will mention several of them later in this section. A second issue revolves around the conversion of qualitative scales into quantitative ones. This conversion is needed due to the way in which value functions are defined. For example, trans-forming the scale {good, medium, bad} into {3, 2, 1} would impose the fact that a good evaluation is considered three times better than a bad evaluation, which may not be the case. One advantage of representing the preferences of the dm in this way is the fact that it produces a complete weak order on the set of alternatives. This allows us to compare the alternatives together with ease. We will always be able to construct either a relation of preference (P) between two alternatives, if one has a higher score than the other, or one of indiﬀerence (I), if their scores are equal, but never a relation of incomparability (R).

READ Curved space-time Dirac equation in the honeycomb and triangular lattice

Outranking relations

Instead of reducing the evaluations of each alternative to a single score, another approach is to compare the evaluations between pairs of alternatives systematically and aggregate these comparisons into a binary relation called an outranking rela-tion.
There are several methods of constructing these relations, among the most com-monly used being the electre methods [Keeney and Raiffa 1976, Roy 1993] and the promethee methods [Brans and Mareschal 2002]. We mention also the rubis method [Bisdorff et al. 2008], which is defined in a bipolar setting. We will detail below the principle on which all these methods build the outranking relation.
An alternative x is said to outrank another alternative y if [Roy 1993]:
1. there is a qualified majority of weighted criteria on which x is performing at least as good as y.
2. there is no criterion on which y seriously outperforms x.
The first statement is modelled through local concordance degrees, which mea-sure for each criterion i ∈ F if the evaluation of x on i is at least as good as that of y. Generally, these degrees are valued on a scale from 0 to 1. If x is not at least as good as y on criterion i, then the local concordance degree is 0, if x is at least as good as y then this degree is 1. If there is a measure of uncertainty with respect to the two statements, a value between 0 and 1 may be used, depending on the particular type of outranking relation begin used.
The criteria on which x has an evaluation at least as good as y are then put into balance, by additionally using a set of criteria importance weights and constructing a global concordance degree. The importance weights have, in this case, a diﬀerent meaning than the weights used in value functions. As the outranking methods are inspired from the voting procedures in social choice theory, each criterion can be seen as a group of voters having the same opinion, while the weight assigned to them represents their strength, as in number. On the other hand, the weights using in value functions are interpreted as trade-oﬀs between criteria. The condition of having the criteria independent is also required in the case of outranking relations.
Unlike in mavt, dealing with both qualitative and quantitative scales is much simpler in this case. We no longer need to convert a qualitative scale into a numerical representation, but only to define an operator for assessing if an evaluation is at least as good as another or not. In order to deal with potential imprecision in the evaluations of the alternatives on the set of criteria, some of these operators take into account discrimination thresholds. In most cases, two such thresholds are defined on each criterion: an indiﬀerence threshold and a preference threshold, which are used to define intervals inside which a diﬀerence between evaluations can be considered as indiﬀerent, or clearly significant respectively. Discrimination thresholds are also used to deal with the subjective perception of the dm with respect to the criteria evaluations. For instance, if a dm considers the problem of buying a car, focusing equally on only two criteria, the price and the fuel consumptions, if one of the cars is more expensive but has a fuel consumption lower than the second, then he is faced with a dilemma. If the first car is more expensive with only a little amount, which the dm considers negligible, then the initial dilemma is removed. He is therefore able to select the slightly more expensive car.

Table of contents :

Résumé
Introduction
I On the problem of clustering in MCDA
1 On clustering
1.1 An introduction to cluster analysis
1.1.1 Clustering and Classification
1.1.2 Fields of application
1.1.3 Clustering challenges
1.2 Measuring similarity
1.2.1 Data characteristics
1.2.2 Measures on quantitative scales
1.2.3 Measures on qualitative scales
1.3 Overview of clustering approaches
1.3.1 Partitioning clustering approaches
1.3.2 Hierarchical clustering approaches
1.3.3 Other approaches
2 Multiple criteria aid for decisions
2.1 Multiple criteria decision aiding
2.1.1 A brief introduction
2.1.2 Decision aiding
2.1.3 Modelling a decision aiding process
2.2 Modelling preferences
2.2.1 Value functions
2.2.2 Outranking relations
2.2.3 Eliciting the model parameters
2.3 The classical decision problems
2.3.1 The choice problem
2.3.2 The sorting problem
2.3.3 The ranking problem
3 On the mcda clustering problem
3.1 State-of-the-art of clustering in mcda
3.1.1 Overview
3.1.2 Multicriteria Relational Clustering
3.1.3 Multicriteria Ordered Clustering
3.2 From data analysis to mcda
3.2.1 General comparison
3.2.2 Illustrative examples
3.2.3 Detailed analysis of the relations
3.3 The problem of clustering in mcda
3.3.1 Defining clustering in mcda
3.3.2 Motivations for clustering
II Tackling the problem of clustering in mcda
4 Modelling the clustering problem
4.1 Working context
4.1.1 Credibility calculus
4.1.2 Modelling the preferential relations
4.2 Comparing sets of alternatives
4.2.1 Using representatives
4.2.2 Aggregating pairwise comparisons
4.2.3 Properties of the relations
4.3 Clustering objectives
4.3.1 Non-relational clustering
4.3.2 Relational and ordered clustering
4.3.3 Additional objectives
5 Solving the clustering problem
5.1 Initial considerations
5.1.1 On solving the problem
5.1.2 Existing and new approaches
5.2 Self-organizing maps for mcda clustering
5.2.1 General description
5.2.2 K-means and K-medoids
5.2.3 New topologies and adaptivity
5.3 CLIP: CLustering using Indifference and Preference relations
5.3.1 First step: Partitioning using indifferences
5.3.2 Approximative approach for the first step
5.3.3 Second step: Refining using preferences
6 Validating the clustering approaches
6.1 Generating the benchmarks
6.1.1 Benchmarks characteristics
6.1.2 Generating the evaluations
6.1.3 Analysing the benchmarks
6.2 SOM adaptations
6.2.1 The disconnected SOM on indifferences
6.2.2 The chained SOM on indifferences
6.2.3 The ordered SOM on indifferences
6.3 CLIP and other approaches
6.3.1 First step of CLIP
6.3.2 Second step of CLIP
6.3.3 Comparative analysis
III Exploring large preferential datasets
7 Measures for characterising sets of alternatives
7.1 The preference model
7.1.1 The bipolar valued outranking relation
7.1.2 Modelling the relation through linear constraints
7.1.3 Formulating a mixed-integer linear program
7.2 Descriptive measures for sets of alternatives
7.2.1 Central profiles
7.2.2 Bounding profiles
7.2.3 Separating profiles
7.3 Validating the approaches
7.3.1 Extracting central profiles
7.3.2 Extracting bounding profiles
7.3.3 Extracting separating profiles
8 Clustering for large mcda applications
8.1 Clustering large datasets
8.1.1 General introduction
8.1.2 Limitations and challenges
8.1.3 Main approaches
8.2 mcda clustering algorithms for large datasets
8.2.1 A Divide and Conquer approach
8.2.2 Incremental clustering
8.2.3 Clustering through samples
8.3 Validating the approaches
8.3.1 Describing the experiments
8.3.2 Initial analysis
8.3.3 Comparative analysis
9 Case Study: Mining Toxics Release Inventory data
9.1 Describing the available data
9.1.1 Selecting the data on chemicals toxicity
9.1.2 Selecting the data on facility practices
9.1.3 Analysing the selected data
9.2 Defining the problem
9.2.1 Formulating the problem
9.2.2 Structuring the problem
9.2.3 The preference model
9.3 Solving the problem
9.3.1 Finding the result
9.3.2 Analysis using representatives
9.3.3 Analysis using bounding profiles
Summary of the main achievements
Perspectives and future work
Concluding remarks
Bibliography