计算机科学学术论文写作 (4).pdf
HERMAN ET AL.:GRAPH VISUALIZATION AND NAVIGATION IN INFORMATION VISUALIZATION 1 Graph Visualization and Navigation in Information Visualization:a Survey Ivan Herman,Member,IEEE CS Society,Guy Melanon,and M.Scott Marshall AbstractThis is a survey on graph visualization and navigation techniques,as used in information visualization.Graphs appear in numerous applications such as web browsing,statetransition diagrams,and data structures.The ability to visualize and to navigate in these potentially large,abstract graphs is often a crucial part of an application.Information visualization has specific requirements,which means that this survey approaches the results of traditional graph drawing from a different perspective.Index TermsInformation visualization,graph visualization,graph drawing,navigation,focus+context,fisheye,clustering.1 Introduction lthough the visualization of graphs is the subject of this survey,it is not about graph drawing in general.Excellent bibliographic surveys4,34,books5,or even online tutorials26 exist for graph drawing.Instead,the handling of graphs is considered with respect to information visualization.Information visualization has become a large field and“subfields”are beginning to emerge(see for example Card et al.16 for a recent collection of papers from the last decade).A simple way to determine the applicability of graph visualization is to consider the following question:is there an inherent relation among the data elements to be visualized?If the answer to the question is“no”,than data elements are“unstructured”and the goal of the information visualization system might be to help discover relations among data through visual means.If,however,the answer to the question is“yes”,then the data can be represented by the nodes of a graph,with the edges representing the relations.Information visualization research dealing with unstructured data has a distinct flavour.However,such research is not the subject of this survey.Instead,our discussion focuses on representations of structured data,i.e.,where graphs are the fundamental structural representation of the data.Information visualization has specific requirements,which means that we will approach the results of traditional graph drawing from a different perspective than the other surveys.1.1 Typical Application Areas Graph visualization has many areas of application.Most people have encountered a file hierarchy on a computer system.A file hierarchy can be represented as a tree(a special type of graph).It is often necessary to navigate through the file hierarchy in order to find a particular file.Anyone who has done this has probably experienced a few of the problems involved in graph visualization:“Where am I?”“Where is the file that Im looking for?”Other familiar types of graphs include the hierarchy illustrated in an organisational chart and taxonomies that portray the relations between species.Web site maps are another application of graphs as well as browsing history.In biology and chemistry,graphs are applied to evolutionary trees,phylogenetic trees,molecular maps,genetic maps,biochemical pathways,and protein functions.Other areas of application include objectoriented systems(class browsers),data structures(compiler data structures in particular),realtime systems(statetransition diagrams,Petri nets),data flow diagrams,subroutinecall graphs,entity relationship diagrams(e.g.UML and database structures),semantic networks and knowledgerepresentation diagrams,project management(PERT diagrams),logic programming(SLDtrees),VLSI(circuit schematics),virtual reality(scene graphs),and document management systems.Note that the information isnt always guaranteed to be in a purely hierarchical format this necessitates techniques which can deal with more general graphs than trees.1.2 Key Issues in Graph Visualisation The size of the graph to view is a key issue in graph visualization.Large graphs pose several difficult problems.If the number of elements is large it can compromise performance or even reach the limits of the viewing platform.Even if it is possible to layout and display all the elements,the issue of viewability or usability arises,because it will become impossible to discern between nodes and edges(see Figure 1,although this tree is by no means a very complex one).In fact,usability becomes an issue even before the problem of discernability is reached.It is well known that comprehension and detailed analysis of data in graph structures is easiest when the size of the displayed graph is small.In general,displaying an entire large graph may give an indication of the overall structure or a location within it but makes it difficult to comprehend.These issues form the context for most of this survey.A The authors are with the Centre for Mathematics and Computer Sciences(CWI),Amsterdam,The Netherlands Email:I.Hermancwi.nl,G.Melanconcwi.nl,M.S.Marshallcwi.nl 2000 IEEE.Personal use of this material is permitted.However,permission to reprint/republish this material for advertising or promotionalpurposes or for creating new collective works for resale or redistribution to servers or lists,or to reuse any copyrighted component of this workin other works must be obtained from the IEEE.2 IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS,VOL.6,NO.X,XXX,2000 Other than the usual reference to information overload and the occasional reference to some of the gestalt principles,papers in information visualization rarely apply cognitive science and human factors.This is for no lack of trying;very few of the findings in cognitive science have practical applications at this time and very few usability studies have been done.Cognitive aspects are undoubtedly a subject for future research.For this reason,an objective evaluation of the merits of a given approach is difficult;the reader has to bear this limitation in mind when various techniques are presented.1 The rest of this survey is organized as follows:In Section 2,we try to give an impression of graph layout issues and limitations with regard to scaleability.Then,we discuss several approaches to navigation of large graphs(Section 3),followed by methods of reducing visual complexity through reorganisation of the data(Section 4).Afterwards,we discuss a few application systems that implement many of the techniques described in this survey(Section 5).To help the reader pursue further research and development,we have listed the various sources of information that we found particularly important for graph visualization(Section 6)and provided an extensive list of references.2 Graph Layout This section looks at the current results in graph drawing and layout algorithms,but from the point of view of graph visualization in information visualization.As we will see,this point of view differs,in many respects,from the traditional view of the Graph Drawing community.We will give an account of the available results and discuss their relevance for graph visualization,although,in general,we will not go too far into the technical details.For those desiring more information,we recommend the excellent book of Battista et al.5 as one of the best starting points.1 Wares new book123 may become an important source of information in this area,although,at the time of finalization of this manuscript,only a draft version is available,which does not allow a thorough judgement.2.1 Background of Graph Drawing The Graph Drawing community2 grew around the yearly Symposia on Graph Drawing(GD XX conferences),which were initiated in 1992 in Rome.SpringerVerlag publishes the proceedings of the conference in the LNCS series,which contains new layout algorithms,theoretical results on their efficiency or limitations,and systems demonstrations.The recent electronic Journal of Graph Algorithms and Applications is dedicated to papers concerned with design and analysis of graph algorithms,as well as with experiences and applications.The basic graph drawing problem can be put simply:given a set of nodes with a set of edges(relations),calculate the position of the nodes and the curve to be drawn for each edge.Of course,this problem has always existed,for the simple reason that a graph is often defined by its drawing.Indeed,Euler himself relied on a drawing to solve the“Knigsberger Brckenproblem”in his 1736 paper(see the recent book of Jungnickel74).The annotated bibliography by Battista et al.4 gathers hundreds of papers studying what a good drawing of a graph is.That is where the problem becomes more intricate:it requires the definition of properties and a classification of layouts according to the type of graphs to which they can be applied.For example,a familiar property is planarity whether it is possible to draw a graph on the plane with no edge crossings.Layout algorithms may be categorized with respect to the type of layout they generate.For example,grid layouts position nodes of a graph at points with integer coordinates.Other categories of layouts are defined by the methodology on which they are based.For example,nondeterministic approaches form a category that uses algorithms such as forcedirected models or simulated annealing.Each class of graphs and layouts thus generates its own set of problems.Planarity,for example,raises problems such as:2 http:/www.cs.brown.edu/people/rt/gd.html Fig.1 A tree layout for a moderately large graph HERMAN ET AL.:GRAPH VISUALIZATION AND NAVIGATION IN INFORMATION VISUALIZATION 3 Planarity tests for graphs:is it possible to draw a graph without edgecrossings?Planar layout algorithms according to various constraints:given that a graph is planar,find a layout satisfying a group of constraints.Many constraints in use are also expressed in terms of aesthetic rules imposed on the final layout.Nodes and edges must be evenly distributed,edges should all have the same length,edges must be straight lines,isomorphic substructures should be displayed in the same manner,edgecrossings should be kept to a minimum,etc.1 Trees have received the most attention in the literature.Consequently,additional aesthetics rules have also been formulated for them.For example,nodes with equal depth should be placed on a same horizontal line,distance between sibling nodes is usually fixed,etc.See again the book of Battista et al.5 for further examples.The Reingold and Tilford algorithm for trees103,121(see Figure 1)is a good example of a layout algorithm achieving these aesthetics goals.Isomorphic subtrees are laid out in exactly the same way,and distance between nodes is a parameter of the algorithm.On the other hand,the more straightforward and naive algorithm for displaying a tree,consisting of distributing the available horizontal space to subtrees according to their number of leaves,actually fails to achieve some of the aesthetic rules listed above.Although the adjective“aesthetic”is used,some rules were originally motivated by more practical issues.For instance,minimisation of the full graph area might be an important criterion in applications.Some of the rules clearly apply to a certain category of graphs or layouts only,others have a more“absolute”character.Furthermore,each of the rules defines an associated optimisation problem,used in a number of nondeterministic layout algorithms22.There has been some work lately which questions the absolute character of those rules,however.Usability studies were conducted in order to evaluate the relevance of these aesthetics for the enduser.Purchase100demonstrates that“reducing the crossings is by far the most important aesthetic,while minimizing the number of bends and maximizing symmetry have a lesser effect”.Her work concludes by prioritizing these aesthetics;see also Purchase et al.101,102 for more details.Other authors10,29,86 report differences in the perception of a graph depending on its layout.Unfortunately,usability studies necessitate a great effort,both to realize the experimentation itself and to analyse its results properly,but we regard this line of work as essential for information visualization as well.They have recently gained credibility in the graph visualization community as well,recognizing their contribution to help focus on important issues in the area.A wide variety of tasks related to graph drawing have been studied:layering a graph,turning it into an acyclic directed graph,planarisation of a graph,minimizing the area occupied 1Actually,some aesthetics are quite arbitrary and are not seen as absolute rules any more100,101.Wares book123 is also an interesting source of information for this topic.by a layout,minimizing the number of bends in edges,etc.Unfortunately,many of the associated algorithms are too complex to be practical for applications.On the positive side,this has motivated the development of effective heuristics to overcome the complexity of some of these problems5,34.In graph visualization,a major problem that needs to be addressed is the size of the graph.Few systems can claim to deal effectively with thousands of nodes,although graphs with this order of magnitude appear in a wide variety of applications.NicheWorks126or H3Viewer94are among the few systems that claim to handle data sets with thousands of elements.The size of a graph can make a normally good layout algorithm completely unusable.In fact,a layout algorithm may produce good layouts for graphs of several hundred nodes,but this does not guarantee that it will scale up to several thousand nodes.For example,Figure 1 illustrates a tree with a few hundred nodes laid out using the classical Reingold and Tilford algorithm.The high density of the layout comes as no surprise,and changing particular parameters of the algorithm will not improve the picture for the graph.Other 2D layout techniques could be used,but most layout algorithms suffer from the same problem.Because the layout is so dense,interaction with the graph becomes difficult.Occlusions in the picture make it impossible to navigate and query about particular nodes.The use of 3D or of nonEuclidean geometry have also been proposed to alleviate these problems.Sections 2.4 and 2.5 provide more details about these techniques.However,beyond a certain limit,no algorithm will guarantee a proper layout of large graphs.There is simply not enough space on the screen.In fact,from a cognitive perspective,it does not even make sense to display a very large amount of data.Consequently,a first step in the visualization process is often to reduce the size of the graph to display.As a result,classical layout algorithms remain usable tools for visualization,but only when combined with these techniques.Other properties of a layout algorithm can be critical when navigating through a graph.The concept of predictability has been identified as an important and necessary aspect of layout algorithms61,99.What is meant by predictability is that two different runs of the algorithm,involving the same or similar graphs,should not lead to radically different visu