The Minimum Weighted K-Cardinality Tree Problem

MALLBA  (TIC1999-0754-C03)

We thank the Barcelona team of the Mallba Project for their support while conducting this research. We are grateful to Arne Lökketangen, Matthias Ehrgott, Horst W. Hamacher and Jörg Freitag for providing us with their benchmarks and executable codes for k-Cardinality Tree, and also for their interest in our work.



 
The Minimum Weighted K-Cardinality Problem is the combinatorial optimization problem of finding in a given graph a subree of minimal weight with a fixed number K of edges.  It was first described in [9].  Its structure was investigated in detail in [10].  For a mathematical formulation see [13].

There are several applications which require to find a connected subset of a given graph. The first application is in oil field leasing in the Norwegian part of the North Sea (see [11]).  The policy of the Norwegian Government is to give blocks of an oil field to the companies allowing them to explore these fields for six years. After this period at least half of the block has to be retorned, under restriction that the returned part is connected. This problem can be modeled mathematically using a grid graph, the nodes of which represent the subsquares of the blocks. Subsquares have a size of one longitudinal by one latitudinal minute. Edges are introduced between two nodes if the corresponding subsqueares are neighbours. The objective is then to find a connected subgraph of the grid graph of minimal value and containing at least half of the nodes. The part of the block which is represented by this sugraph is the returned. The values of the subsquares will be established by the company based on their explorations during the first six years.

A second possible application is in the area of facilities layout and graph partitioning (see [12]). In facilities layout we consider a total office area or machine hall which has to be subdivided to accommodate the particular rooms. an office is the a connected subarea consisting of a given number alof unit squares. Feasible solutions are provided by a partition off a grid graph G into L connected subgraphs of al nodes, l=1,...,L.
 


   Implementation of MINKTREE with the TS skeleton

We have used the TS skeleton to implement the Quadratic Assignment Problem.  The design part of an instanced problem is equal to the design part of the skeleton, just because this part describes the internal working of the method and that part is generic enough not to depend on the problem.  The part to be filled by the filler user depends totally on the problem to be instanciated. All files (skeleton filler's part and skeleton designer's part) are available in the compressed file MINKTREE.tar.gz. If you are really interested in our implementation please send a mail to Maria J. Blesa or Fatos Xhafa and we can send you some information.


Internal details:

To make an instance of the MINKTREE problem with our skeleton some entities and functionalities of the method are needed to be described in detail. Roughly speaking, we can consider the problem, the solution and the movement as the main participant entities in the Tabu Search method. These entities have become C++ classes, so the filler user has to fill their attributes and methods in order to describe them. The methods of these (and other) classes needed for the TS skeleton to work are described (by the header) in the TS skeleton itself, but the majority of the attributes have to be introduced by the user because the attributes will describe the data structures to implement the entities represented in each C++ class.

An structure for maintaining and managing the tabu movements must be supplied.  A list structure is given by default with the skeleton code (file Tabu1List.cc), but the user can implementate a more efficient and particular one (taking benefit from the concrete problem that he/she instanciates). To include the default implementation (based on lists) the filler user only has to link his/her instanciated classes with the file Tabu1List.cc .  Something like this would appear in the Makefile:

        MINKTREE.list.seq:   Min_kTree.imp.o  Min_kTree.o  Tabu1List.o Main.o
        $(CXX) $(LDFLAGS) $^ -lL -lmallba -lpvm3 -o $@

To use an alternative implementation of the TabuStorage class,  a .cc file has to be filled with a source code for all its methods, and then link this .cc file with the rest of the classes.  Something like this would appear in the Makefile:

        MINKTREE.seq:   Min_kTree.imp.o  Min_kTree.o  AlternativeStructure.o Main.o
$(CXX) $(LDFLAGS) $^ -lL -lmallba -lpvm3 -o $@

In the MINKTREE Problem two standard lists are needed : one to maintain the movements that has added some edge to the tree under construction, and the second list to maintain the movements that has removed some edge of the tree.  The implementation and management of these lists has been implemented in the file Tabu2List.cc.   As you may imagine by now, the Makefile includes something like this:

        MINKTREE.seq:   Min_kTree.imp.o  Min_kTree.o  Tabu2List.o Main.o
$(CXX) $(LDFLAGS) $^ -lL -lmallba -lpvm3 -o $@

Here we describe some details of how the C++ classes of the TS skeleton have been filled in order to describe and implement the MINKTREE problem:
 
Problem Description:
We have to construct a minimum-weigted tree of a fixed cardinality. This tree has to be constructed over a weighted undirected graph.
Attributes:
  • The cardinality of the tree to be constructed.
  • The number of nodes of the graph.
  • The undirected weighted graph. 
Details of some methods:
All the attributes are read by the >> operator. The values are expected in this order:  the cardinality K, the number of nodes of the graph and the graph itself. The weighted graph expected to be represented as a adjacency matrix where the item (x,y) is the weight of the edge between nodes x and y.
Solution Description:
A solution is a tree with k (the cardinality) edges embedded in the weighted graph (see Problem).
Attributes:
  • The problem that this solution solves.
  • The edges penalized while diversificating the search (implemented as a set of edges). This is needed for internal reasons of the implementation.
  • The edges rewarded while intensificating the search (implemented as a set of edges).
  • The tree consisting of k edges (implemented as a set of edges). This is needed for internal reasons of the implementation.
  • The nodes of the tree (implemented as a set of nodes). This is needed for internal reasons of the implementation.
Details of some methods:
  • The initial solution is created with a greedy algorithm that constructs a tree starting with the less weighted edge of the graph and selecting the less weighted edges having a node in the constructed  part of the tree.
  • The algorithm can escape from a region by applying the method escape to the current solution. This method constructs a new initial solution but choosing a random edge as the starting edge of the greedy algorithm. 
  • A movement is applied by removing one edge of the tree an inserting a different one not in the tree already.
  • The fitness of a solution is computed as the sum of the weights of the edges in the tree.
  • Given a movement to be applied, the delta function is computed as the weight of the edge to be inserted minus the weight of the edge to be removed.
  • A movement has aspiration with respect to a solution if it makes the fitness decrease.
  • A solution is penalized by doubling the weight of its edges.
  • A solution is rewarded by halving the weihgt of its edges.
Movement Description:
A movement is defined as a replacement of one edge.  Then, it consist of two edges: the edge to be removed and the edge to be inserted. If the edge to be inserted have both end-points (both nodes) in the tree the movement is said to be STATIC, otherwise it is call to be DINAMIC (note that the edge to be inserted has to have an end-point node in the tree in order to create a tree, not a forest).
Attributes:
  • The tabu life.
  • The type (static or dinamic).
  • The edge to be removed.
  • The edge to be inserted.
Tabu Storage Description:
The tabu storage allows the management of the tabu movements.  In the MINKTREE problem it consists of two lists: one for the edges recently added to the tree and other for the edges recently removed from the tree. 
Attributes:
  • The setup parameters.
  • The problem.
  • The list of edges recently added to the tree.
  • The list of edges recently removed from the tree.
Details of some methods:
When a movement is made tabu, its "edge to insert" is added to the list of recently added edges and its "edge to remove" is added to the list of recently removed edges.

A movement is considered tabu if the edge to be removed is in the list of added edges or if the edge to be inserted is list of removed edges. 

Choose best move Description:
This method implements the search itself.  The search space consists of all the edges that incide in the current tree (this are the edges that can be inserted) combined with all the edges of the tree that can be removed (depending on the edge chosen to be inserted). 

All the incident edges are treated as a candidate edge to be inserted in the current tree.  Depending on its end-point nodes they are treated in a different way:

  • If both nodes of the incident edge are already in the tree:

  • Inserting this edge in the three will make a cycle, so the edge to be removed has to be one of this cycle (we have decided that it is the one in the cycle with maximum weight).  This case will cause a static movement.
     
  • If only one node of the incident edge is already in the tree:

  • The edge to be removed in this case has to be a leaf-edge, i.e. an edge with one vertex of degree greater than one (shared node), and the other vertex of degree equal to one (not shared node).  Among all the leaf-edges of the tree we choose the one with maximum weight. This case will cause a dinamic movement.


Inputs needed:

Two inputs are needed for the execution of a MINKTREE instance:


These input values can be stored in an unique ascii file and introduced to the program using the << operator to read them in the right order.
 


   Brief comparison with other benchmarks
 

Some benchmarks have been generated by us with a random graph generator program that Arne Lokketangen sent us. With this tool he made some experiments presented in [8].  He also sent us his Tabu Search implementation for the MINKTREE problem.  We have compared his results with our ones for the cardinality K=20,  and that is what we have obtained:

NOTE: The time is expressed in seconds. The Jörnsten & Lokketangen's benchmark is a windows-based application that allows only the execution of 1 trial. We have repeated the executions 10 times in order to equal the configuration used for their executions and the configuration used by us.  So time in Jörnsten & Lokketangen's benchmark is aproximated by the time spent in a trial  (the total aproximated time is expressed as the time spent in a trial per 10 trials).


Results for JÖ-LO (Jörnsten & Lokketangen) benchmark with K=20:
 

Benchmark #edges Best fitness 
obtained by
Jörnsten & 
Lokketangen
Best results obtained by us
1st round

(ba2 - 10 trials, 1000 it/trial)
2nd round

(ba2 - 10 trials, 1000 it/trial)
3rd round

(ba4 - 10 trials, 1000 it/trial)
4th round

(ba1 - 10 trials, 1000 it/trial)
setup
parameters
used
fitness
aprox. time
(sec)
fitness
time
(sec)
fitness
time
(sec)
fitness
time
(sec)
fitness
time
(sec)
g25-4-01.dat 50 219
 46 x10
219
17.59
219
19.47
219
18.63
219
18.19
g25-4-02.dat 
48
607
 53 x10
607
15.09
607
15.90
607
15.32
607
15.84
g25-4-03.dat
50
464
 45 x10
464
13.07
464
13.19
464
12.97
464
13.86
g25-4-04.dat
48
620
 50 x10
620
14.49
620
14.78
620
14.70
620
15.53
g25-4-05.dat
50
573
 50 x10
573
16.06
573
16.18
573
15.87
573
16.83
g50-4-01.dat
98
466
 51 x10
460
14.92
463
15.38
463
15.58
464
15.93
g50-4-02.dat
99
421
 51 x10
421
15.35
421
15.56
421
15.50
421
16.40
g50-4-03.dat
99
580
 51 x10
577
14.15
565
14.71
565
14.31
577
14.88
g50-4-04.dat
100
434
 51 x10
434
12.16
434
12.59
434
12.68
434
13.35
g50-4-05.dat
100
396
 52 x10
391
13.57
392
14.04
392
13.60
392
14.25
g75-4-01.dat
149
366
 52 x10
366
16.11
366
16.36
366
16.22
366
17.25
g75-4-02.dat
150
307
 52 x10
295
16.34
295
16.33
295
16.38
295
17.02
g75-4-03.dat
149
421
 53 x10
421
16.87
428
16.39
426
16.32
428
17.00
g75-4-04.dat
150
432
 53 x10
432
23.35
432
16.56
432
16.64
432
17.51
g75-4-05.dat
150
284
 54 x10
284
37.43
284
15.05
284
14.92
284
15.71
g100-4-01.dat
200
363
 55 x10
363
38.36
363
17.63
363
17.21
363
19.06
g100-4-02.dat
200
338
 55 x10
340
46.54
338
22.37
338
19.24
340
22.31
g100-4-03.dat
200
412
 55 x10
412
44.09
412
21.66
412
18.96
412
21.01
g100-4-04.dat
199
442
 57 x10
442
44.02
442
18.60
442
17.99
442
19.56
g100-4-05.dat
199
388
 56 x10
397
44.85
397
18.49
397
19.10
401
20.35
g200-4-01.dat
400
348
 61 x10
 308
61.71
308
27.68
308
26.19
308
29.25
g200-4-02.dat
400
299
 62 x10
 299
60.31
299
26.97
299
25.63
299
28.71
g200-4-03.dat
400
300
 62 x10
 300
60.82
300
27.55
300
25.87
300
28.27
g200-4-04.dat
399
310
 62 x10
 312
57.33
312
26.27
312
24.31
312
26.14
g200-4-05.dat
399
380
 broken
 357
66.71
357
29.60
360
27.39
360
30.47
g400-4-01.dat
800
253
 82 x10
 253
60.17
253
24.91
253
23.50
253
26.08
g400-4-02.dat
799
338
 74 x10
 349
57.02
355
24.36
355
23.83
355
24.98
g400-4-03.dat
799
302
 76 x10
 306
62.28
306
25.85
306
24.32
306
26.53
g400-4-04.dat
800
315
 76 x10
 314
58.30
314
24.10
328
23.03
328
25.22
g400-4-05.dat
800
320
 78 x10
 328
61.27
328
25.43
328
23.76
328
25.58
g1000-4-01.dat
2000
270
110 x10
 277
71.08
267
28.52
277
26.66
277
29.43
g1000-4-02.dat
1999
321
 120 x10
 292
72.25
292
28.65
292
26.28
292
29.57
g1000-4-03.dat
2000
295
 124 x10
 308
70.45
308
27.96
309
26.13
309
28.60
g1000-4-04.dat
2000
306
 120 x10
 312
 71.96
316
28.21
316
26.08
316
28.76
g1000-4-05.dat
2000
284
 131 x10
 280
 69.73
280
27.76
280
25.90
280
28.05

You can also obtain all these benchmarks compressed in JOLO-benchmark.tar.gz .

Squares in light red mean worst fitness with relation to the best fitness obtained by Jörnsten & Lokketangen, and squares in light green mean better fitness wrt. the same results.  Remember that all these experiments are configured to build a 20-cardinality minimum weighted tree, i.e. k=20.

The name of each benchmark has a meaning.  All of them start by letter g because they contain graphs, and end by .dat because they are data.  Then the middle part of the name refers to:

NumberOfVertices-EdgesPerVertex-InstanceNumber

So, for example, benchmark g200-4-03.dat contains (the third instance generated of) a 200-vertices graph, and each of these 200 vertices has degree 4.  Edges are generated randomly and the seed used to randomize is different for each of them.
 


Results for EHR-FRE-HA (Ehrgott & Freitag & Hamacher) benchmark with K=3..28:
 

Benchmark #edges Best fitness obtained by
Ehrgott, Freitag & Hamacher
Best results obtained by us
1st round

(quite loaded machine - 10 trials, 1000 it/trial)
2nd round

(ba2 - 10 trials, 1000 it/trial)
3rd round

(ba4 - 10 trials, 1000 it/trial)
DPT
DDP
fitness
time
fitness time
fitness
time
(sec)
fitness
time
(sec)
fitness
time
(sec)
gr30_10.k3 211
12
 
12
 
12
01.33
12
02.09
12
02.15
gr30_10.k4
18
 
18
 
18
01.50
18
00.83
18
00.81
gr30_10.k5
27
 
25
 
25
01.58
25
00.96
25
00.90
gr30_10.k6
32
 
31
 
31
01.80
31
01.07
31
01.08
gr30_10.k7
36
 
36
 
36
01.93
36
01.24
36
01.16
gr30_10.k8
42
 
42
 
42
02.12
43
01.34
42
01.35
gr30_10.k9
49
 
49
 
49
02.25
49
01.51
49
01.48
gr30_10.k10
58
 
54
 
54
02.73
54
15.49
54
15.34
gr30_10.k11
68
 
60
 
60
02.71
60
01.90
60
01.86
gr30_10.k12
72
 
67
 
67
02.95
67
02.13
71
02.04
gr30_10.k13
76
 
76
 
76
03.12
76
02.25
76
02.20
gr30_10.k14
83
 
83
 
83
03.31
83
02.49
83
02.42
gr30_10.k15
92
 
92
 
92
03.71
92
02.62
92
02.66
gr30_10.k16
102
 
102
 
102
03.96
102
02.95
102
02.90
gr30_10.k17
109
 
109
0.11
109
04.51
109
03.24
109
03.13
gr30_10.k18
118
 
118
0.11
118
04.72
118
03.61
119
03.60
gr30_10.k19
128
 
128
 
128
05.22
128
04.05
129
04.05
gr30_10.k20
139
 
139
0.16
139
05.71
139
04.26
139
04.25
gr30_10.k21
149
 
149
0.16
149
06.33
149
05.26
149
04.93
gr30_10.k22
160
 
160
0.16
160
06.52
160
05.11
160
05.30
gr30_10.k23
171
 
171
0.22
171
07.11
171
05.49
171
05.84
gr30_10.k24
184
 
184
0.22
184
07.77
184
06.42
184
06.03
gr30_10.k25
200
 
200
0.28
200
08.42
200
06.90
200
06.85
gr30_10.k26
218
 
218
0.28
218
09.12
218
07.42
218
07.40
gr30_10.k27
234
 
234
0.27
234
 
234
08.07
234
07.88
gr30_10.k28
252
 
252
0.33
252
 
252
 17.80
252
08.27
 
 
   
 
 
 
 
 
 
 
 
gr30_20.k3
201
11
 
11
0.05
11
04.63
11
00.77
11
00.68
gr30_20.k4
18
 
18
0.05
18 
03.14
18
00.81
18
00.77
gr30_20.k5
 22
 
22
0.05 
22
03.36
22
00.92
22
00.88
gr30_20.k6
 30
 
30
 
30
07.33
30
01.05
30
01.01
gr30_20.k7
 35
0.05
35
 
35
05.81
35
01.15
35
01.15
gr30_20.k8
 41
0.05
41
0.06 
40
09.79
40
01.29
40
01.30
gr30_20.k9
 46
0.05
46
0.06 
46
06.01
46
01.43
46
01.40
gr30_20.k10
 53
0.05
53
0.06 
53
08.57
53
01.60
53
01.62
gr30_20.k11
 59
0.05
59
0.06
59
05.94
59
01.74
59
01.76
gr30_20.k12
 66
0.11
66
0.06 
66
12.69
66
01.88
66
01.89
gr30_20.k13
 73
0.11
73
0.10 
73
10.45
73
02.13
73
02.06
gr30_20.k14
 80
0.11
80
0.05 
80
09.50
80
02.31
80
02.36
gr30_20.k15
 89
0.11
88
 
88
14.08
88
02.56
88
02.62
gr30_20.k16
 98
0.17
95
0.11 
95
08.83
95
02.76
95
02.82
gr30_20.k17
 104
0.11
104
0.11 
104
14.87
104
03.34
104
03.29
gr30_20.k18
 113
0.17
113
0.17 
113
04.64
113
03.62
113
03.57
gr30_20.k19
 122
0.16
122
0.17 
122
05.20
122
04.09
122
03.90
gr30_20.k20
 139
0.16
139
0.17 
139
06.06
139
04.73
139
04.91
gr30_20.k21
 155
0.22
155
0.17 
155
06.60
155
05.42
155
05.03
gr30_20.k22
 172
0.22
172
0.16
172
07.14
172
06.02
172
06.18
gr30_20.k23
 189
0.27
189
0.22
189
07.73
189
06.59
189
06.55
gr30_20.k24
 208
0.28
208
0.22 
208
08.27
208
07.33
208
07.22
gr30_20.k25
 230
0.28
230
0.22 
230
09.78
234
08.77
234
08.63
gr30_20.k26
 249
0.28
249
0.27 
249
10.24
249
07.99
249
08.75
gr30_20.k27
 275
0.33
275
0.28 
275
11.40
275
09.63
275
09.89
gr30_20.k28
 316
0.33
316
0.28
316
15.77
316
13.82
316
13.63

You can also obtain all these benchmarks compressed in FREITAG_benchmark.tar.gz .

The name of each benchmark has a meaning.  All of them start by letter gr because they contain grid graphs. Then the middle part of the name refers to:

NumberOfVertices_InstanceNumber.Kcardinality

So, for example, benchmark gr30_20.k25.dat refers to instance 20th of a 30-nodes graph.  This file is the benchmark to construct a 25-edges tree of minimum weight over this graph. So gr30_20.k26.dat is the benchmark to construct a 26-edges tree of minimum weight over the same graph.

We have only took 2 graphs from the original benchmarks that Jörg Freitag sent us in order to vary the cardinality (K) of the tree to be constructred and compare with his results.
 
 
 


MALLBA
Last update: Tue Sep 5  2000