Èñòî÷íèê: http://www.springerlink.com/content/99vvynnyq901tylm/
Graph drawing algorithms are a basic enabling technology which is used to help with the understanding of large sets of inter-related data. By presenting data in a visual form it can often be easily digested by the user and both regular patterns and anomalies can be more easily identified. However most data sets do not contain any explicit information on how they should be laid out for easy viewing, although normally such a layout will depend on the relationships between pieces of data. Thus if we model the data points with the vertices of a graph and the relationships with the edges we can use graph based technology and, in particular, graph drawing algorithms to infer a ‘good’ layout from an arbitrary data set based on the relationships.
The motivation behind our interest in graph drawing, and in fact behind our approach to addressing the
problem, both arise from our work in the field of graph partitioning. More often than not, the types of
graph we wish to partition arise from computational meshes which need to be distributed amongst the
processors of a parallel computer to simulate some real life process or phenomenon. It is fairly normal to
have access to the coordinate information of the mesh and hence some form of coordinate system for the
graph vertices (e.g. if each vertex represents a triangular or tetrahedral element one might simply position
graph vertices at the centroid of that element). However, we are sometimes required to partition graphs
either with no coordinate information or for which we do not have access to the coordinate information.
Our initial interest in graph drawing therefore arose in trying to visualise the graph in order to help explain
partitioning results and debug algorithms. In particular we are interested in:
(a) the micro structure of the graph – this can sometimes give a good indication of whether the graph is
derived froma mesh or not and possibly if the local partition refinement algorithms are likely to work
well;
(b) the macro structure of the graph – this can sometimes give an indication of how easy the graph is to
partition in a global sense (e.g. a long thin domain may be easier to partition than a dense ball like
structure).
Graph partitioning techniques have alsomotivated our approach to addressing the graph drawing problem.
In recent years it has been recognised that an effective way of both accelerating graph partitioning
algorithms and/or, perhaps more importantly, giving them a global perspective is to use multilevel techniques.
The idea is to match pairs of vertices to form clusters, use the clusters to define a new graph and
recursively iterate this procedure until the graph size falls below some threshold. The coarsest graph is
then partitioned (possibly with a crude algorithm) and the partition is successively refined on all the graphs
starting with the coarsest and ending with the original. This sequence of contraction followed by repeated
expansion/refinement loops is known as the multilevel partitioning and has been successfully developed
as a strategy for overcoming the localised nature of the Kernighan-Lin (KL), [11], and other optimisation algorithms.
Themultilevel partitioning paradigmwas first proposed by Barnard & Simon, [1], as a method of
speeding up spectral bisection and improved by both Hendrickson & Leland, [9] and Bui & Jones, [2], who
generalised it to encompass local refinement algorithms. Several algorithms for carrying out the matching
of vertices have been devised by Karypis & Kumar, [10].
Perhaps more interestingly in the context of graph drawing we have recently used modifications of
these ideas to optimise partitions with respect to the aspect ratio or compactness of subdomains, [16], and
with respect to the mapping of the subdomains onto various heterogeneous communications networks,
[15] (for example to map onto a 1D array of processors it may be beneficial to slice the domain so that
each subdomain, as with each processor, has only 2 neighbours). These examples are of particular interest
because, although most of the modifications aremade to the local refinement algorithms (local in the sense
that they typically involve operations confined to the immediate neighbours of a vertex), the result is that
the subdomains are somehow shaped in a global sense (i.e. either given a compact shape or shaped so that
the subdomain graph matches the faster links in the processor graph). The crucial point is that by using
a suitable local refinement scheme in a multilevel framework we can achieve global quality within the
optimisation. This is an important consideration for graph drawing; the localised positioning of a vertex
relative to fixed neighbours is actually fairly easy and it is the global untangling of the graph which is more
difficult or time consuming. We therefore aim to use the multilevel ideas to both enhance the layout and
accelerate the graph drawing process.
In fact such ideas have already been suggested in the literature. For example, Fruchterman & Reingold,
[8], suggest the possible use of ‘a multigrid technique that allows whole portions of the graph to be moved’
whilst Davidson & Harel, [3], suggest a multilevel approach to ‘expedite the SA [simulated annealing]
process’. However, we have not yet found an implementation of such ideas.
A related but somewhat different idea is that of multilevel drawings, e.g. [7]. Rather than using the
multilevel process to create a good layout of the original graph, a multilevel graph is created, either by
natural clustering which exists in the graph or by artificial means similar to those applied here. Each
level is drawn on a plane at a different height and the entire multilevel structure can then be used to aid
understanding of the graph at multiple abstraction levels, [6].
In this section we describe how we combine a multilevel the optimisation ideas drawn from graph partitioning with our variant of a force directed placement algorithm.
As previously stated, the inspiration behind our graph drawing algorithm is the multilevel partitioning paradigm. Once again, the idea is to group vertices to form clusters, use the clusters to define a new graph and recursively iterate this procedure until the graph size falls below some threshold. The coarsest graph is then given an initial layout and the layout is successively optimised on all the graphs starting with the coarsest and ending with the original. The algorithm does not actually operate simultaneously on multiple levels of the graph (as, for example, a multigrid algorithm might) but instead optimises the layout at each level and then interpolates the result onto the next level down. raph coarsening. There are many ways to create a coarser graph Gl+1(Vl+1;El+1)from Gl(V;El)and clustering algorithms are an active area of research within the field of graph drawing amongst others, e.g. [12]. Often such clustering algorithms seek to retain the more important structural features of the graph in order that the visualisation of each level is meaningful in itself. However, here we are only interested in the drawing of the original graph. As such we seek a fast and efficient (i.e. not necessarily optimal) algorithm that is Sigradual – if too many vertices are clustered together in one step it may depreciate the benefits of the multilevel paradigm and in particular inhibit the force directed placement algorithm, as applied to Gl, from making use of the positioning obtained for Gl+1; uniform – we also require that the coarsening does not change the inherent properties of the graph differently between different regions. To suit these requirements we choose (as in the partitioning algorithm) a coarsening approach known as matching in which vertices are matched with at most one neighbour so that clusters are thus formed of at most two vertices. Computing a matching is equivalent to finding a maximal independent subset of graph edges which are then collapsed to create the coarser graph. The set is independent if no two edges in the set are incident on the same vertex (so no two edges in the set are adjacent), and maximal if no more edges can be added to the set without breaking the independence criterion. Having found such a set, each selected edge is collapsed and the vertices, u1;2Vlsay, at either end of it are merged to form a new vertex v2Vl+1with weight jv=ju1+j2. The problem of computing a matching of the vertices is known as the maximum cardinality matching problem. Although there are optimal algorithms to solve this problem, they are of at least O(V2:5), e.g. [13]. Unfortunately this is too slow for our purposes and, since it is not too important for the multilevel process to solve the problemoptimally, we use a variant of the edge contraction heuristic proposed by Hendrickson & Leland, [9]. Their method of constructing a maximal independent subset of edges is to create a randomly ordered list of the vertices and visit them in turn, matching each unmatched vertex with an unmatched neighbouring vertex (or with itself if no unmatched neighbours exist). Matched vertices are removed from the list. If there are several unmatched neighbours the choice of which to match with can be random, but in order to keep the coarser graphs as uniform as possible, and after some experimentation, we choose to match with the neighbouring vertex with the smallest weight (note that even if the original graph G0is unweighted, Glfor l=1;2:will be weighted). The initial layout. Having constructed the series of graphs until the number of vertices in the coarsest graph is smaller than some threshold, the normal practice of the multilevel partitioning strategy is to carry out an initial partition. In terms of graph drawing we can think of this as an analogue to the initial layout. If the graph is coarsened down to 2 vertices (which because of the mechanisms of the coarsening will be connected by a single weighted edge) we can simply place these vertices at random and set the initial equilibrium distance to be the distance between them. Layout interpolation. Having optimised the layout on a graph Gl, it is interpolated onto its parent Gl??1. The interpolation itself is a trivial matter and matched pair of vertices, v1;2Vl??1, are placed at the same position as the cluster, v2Vl, which represents them.
At each level we use a force-directed placement algorithm to draw the graph, Gl, and more importantly to provide initial positions for the parent graph Gl??1. There has been considerable research into graph drawing and a survey can be found in [4]. Here we are interested in straight-line drawing schemes and, in particular, spring-embedder or force-directed placement algorithms. The original concept came from a paper by Eades, [5], and is constructed around the idea of replacing vertices by rings or hinges and edges by springs. The vertices are given initial positions, usually random, and the system is released so that the springs move the vertices to a minimal energy state (i.e. so that the springs are compressed or extended as little as possible). From the point of view of the multilevel approach it is attractive as it is an incremental scheme which iterates to convergence and which can reuse a previously calculated initial layout. The particular variant of force-directed placement that we use is based on an algorithm by Fruchterman & Reingold (FR), [8], itself a variation of Eades’ original algorithm. We have made a number of modifications based on our experience with it and, in particular, because of the additional problems associated with drawing very large graphs. In principle however, it should be possible to use any iterative incremental algorithm for this part of the multilevel graph drawing, although in practice different algorithms can be somewhat sensitive and require a certain amount of tuning.
We have described our algorithmas if positioning the vertices fromscratch, but one of the attractive features of spring embedder type algorithms is that they can make use of existing layouts and (hopefully) improve them. Our algorithm is no exception although two minor issues must be addressed. Firstly the coarsening will start with an existing layout and so when any two vertices are clustered together the cluster should be positioned at the mid-point between the vertices. More importantly, the coarsening thresholdmust be changed. If the graph is coarsened to 2 vertices then the algorithm will more or less proceed as if starting from scratch and all existing information will be lost (since the new layout is completely dependent on the initial layout). If, however, the coarsening stops after the coarsest graph size falls below some threshold Tthen a certain amount of information will be retained. There is no automatic method of choosing Tand to some extent it must depend on how ‘good’ the existing layout is (something which is very difficult to quantify); the better the layout the lower the threshold. Interestingly this has an exact analogue in multilevel graph partitioning when an existing but perhaps suboptimal partition needs to be repartitioned (for example to load-balance a dynamically changing mesh). Themore the graph is coarsened the better the quality of the partition is likely to be. However, if the existing partition is already of high quality, the more coarsening takes place, the more the partition will change and as a result, in a parallel simulation, the more data will need to migrate from one processor to another, [14]. This reflects the ‘globality’ of the multilevel paradigm; more coarsening implies a more global aspect to the solution but may move the new solution further away from the previous solution. We have not tested the redrawing facility but it is implemented and the user can set the threshold Tat run time.
We have described amultilevel algorithmfor force directed graph drawing. The algorithmdoes not actually operate simultaneously on multiple levels of the graph (as, for example, a multigrid algorithm might) but instead, inspired by the multilevel partitioning paradigm, optimises the layout at each level and then interpolates the result onto the next level down. The algorithm is fast, e.g. about a second for a 500 vertex graph, under 30 seconds for a 2D layout of up to around 10,000 vertices and about 10 to 20 minutes for 75-100,000 vertices. It also seems to work robustly on a range of different graphs. We have not particularly tried to address graphs for which the technique might not work. It is likely that very dense graphs or even those such as mesh100 which have a dense substructure are never going to be good candidates for any graph drawing algorithm and ours is no exception. However it is also possible that graphs containing vertices of very high degree may not particularly suit the multilevel process. We have not dealt with disconnected graphs but feel that this requires only a few minor modifications. So far we have tested the algorithm on a number of different graphs mostly drawn from our a suite of examples assembled to test graph partitioning algorithms. Many of these derive from unstructured meshes and as such are relatively homogeneous in both vertex degree and local adjacency patterns. An obvious source of further research is to test the technique on graphs arising from different areas (e.g. models of communications networks or the internet). Our algorithm also allows vertex weights and although we have only tested this in the context of themultilevel procedure, its use with weighted graphsmight provide further interesting insights. In addition we believe, partly because of our experience in dynamic repartitioning algorithms, that the multilevel process is well suited to handling dynamically changing graphs and this looks to be a fruitful topic for future research. Finally we suspect that further work on some of the parameters of the algorithm would enhance its robustness and efficiency. In particular the calculation of the natural spring length kseems almost too simple to be effective. In addition, we feel that better investigation of the repulsive forces might alleviate some of the (minor) problems demonstrated in the examples. On a larger scale we would also be interested in investigating techniques for which kdoes not have to be constant but can vary throughout the graph.
- S. T. Barnard and H. D. Simon. A Fast Multilevel Implementation of Recursive Spectral Bisection for Partitioning Unstructured Problems. Concurrency: Practice & Experience, 6(2):101–117, 1994.
- T.N. Bui and C. Jones. A Heuristic for Reducing Fill-In in SparseMatrix Factorization. In R. F. Sincovec et al., editor, Parallel Processing for Scientific Computing, pages 445–452. SIAM, 1993.
- R. Davidson and D. Harel. Drawing Graphs Nicely using Simulated Annealing. ACM Trans. Graphics, 15(4):301–331, 1996.