# Research on complex network layout algorithm based on grid point matching method

### Abstract

Clearly visualized networks greatly help in understanding complex systems, and drawing them requires efficient layout algorithms. Compared with force-directed algorithms, grid-optimization algorithms have the advantage of avoiding node overlap by positioning nodes on the points of a two-dimensional square grid. However, for networks of several thousand nodes, the computation costs of existing grid layouts are too high to meet visualization requirements. To improve computation performance, this article proposes a new algorithm based on grid point matching, named grid-based layout (GBL), which draws complex networks in three steps. First, a graph clustering algorithm divides the network into several modules of closely connected nodes; then all modules are placed in separate regions of the two-dimensional space as the global layout; finally, a grid layout algorithm positions the child nodes within these modules as local optimization. With GBL, users can gain insight into the global topological structure of a network as well as the detailed connectivity within its modules. In particular, an improved weight strategy is designed to speed up the optimization process. Compared with the latest available grid layout algorithm, GBL performs better in computing time, edge–edge crossings, node–edge crossings, relative edge length and connectivity F-measure.

### 1. Introduction

Modelling and simulation of complex networks are among the most important tasks in network analysis and knowledge discovery.
For social, biological and internet networks, visualization seeks to organize the network structure in a manner that helps users gain insight into the most interesting patterns and relationships within the data, such as groups in scientific collaboration networks, components of biological pathways and autonomous systems of the internet [1–5]. With the development of information technologies, experimental datasets are growing sharply in size and complexity, which makes network visualization much more difficult. Closely connected nodes in complex networks are usually grouped into a module that fulfils a specific function, and modules are often organized in a hierarchical structure. Such modular architecture helps significantly to clarify network complexity [1,6].

A wide variety of network layout algorithms aim to place connected nodes of a graph close to each other on the two-dimensional plane. Force-directed layout is a notable algorithm available in most popular visualization platforms (e.g. Cytoscape [7] and CellDesigner [8]); it models a network as a mechanical system in which nodes exert repulsive forces and edges exert attractive forces. A drawback of force-directed layout, however, is that densely connected nodes tend to be placed too close together and easily overlap [9]. To solve this problem, a variant named grid layout [10] was proposed, in which nodes are placed on square grid points to avoid node overlap at the expense of computation performance. Grid layout succeeded in generating compact layouts and has attracted much attention [11–17]. To decrease the computation cost, Kojima et al. [13,14] proposed a method termed sweep calculation to speed up the layout process. He et al. [15] designed the LucidDraw grid layout algorithm (LGL), in which a neighbourhood-test strategy is used to lower computation time.
Previous grid layout algorithms work fairly well for small- and medium-sized networks of several hundred nodes, whereas for larger networks the computation time increases rapidly. In this article, we propose a new grid optimization algorithm named grid-based layout (GBL) that focuses on reducing visual complexity by exploiting the modular properties of networks, and that adopts a new optimization strategy to improve computation performance. Three main steps are taken to optimize the node positions (Fig. 1). First, the network is preprocessed into several modules by a graph clustering algorithm; then all modules are placed on the two-dimensional plane at the global level; finally, the coordinates of the nodes within each module are optimized at the local level.

Fig. 1. (a) An example network. (b) Network preprocessing. (c) Global layout and the initialization of the nodes' coordinates. (d) Local optimization: the positions of all nodes are locally optimized with the grid layout algorithm.

### 2. Method

A network can generally be transformed into a graph in which nodes are represented as geometric points and edges between nodes as straight lines. Under this drawing convention, a layout is generated once the coordinates of all nodes have been determined by an optimization algorithm.

#### 2.1 Step 1: network preprocessing

For a given network, the node set is first partitioned by a clustering algorithm into several subsets of varying size, termed modules. Such clustering algorithms group closely connected nodes by placing them in the same module and separate sparsely connected nodes by placing them in different modules.
In our method, the node set is partitioned by multi-level clustering [18], a heuristic graph clustering algorithm that coarsens the network by iteratively merging clusters and then refines the resulting clusters by iteratively moving individual nodes between them. Multi-level clustering offers relatively high network modularity and low computation time; its computational complexity is $$O(m \log n)$$, where $$m$$ and $$n$$ are the numbers of edges and nodes, respectively. An example network of 28 nodes (Fig. 1a) is used to illustrate the GBL procedure; Fig. 1b shows the result of preprocessing by multi-level clustering, in which the node set is partitioned into five subsets, the modules of the example network.

#### 2.2 Step 2: global layout

This step places all modules resulting from Step 1 on the two-dimensional plane. As shown in Fig. 1c, the space is divided into several fan-shaped areas of varying size to hold the modules. The size of each fan-shaped area (determined by its central angle $$\theta$$) is proportional to the number of nodes in the module, i.e. a module with more nodes is assigned a relatively larger area to avoid node overlap. In this step, the coordinates of the nodes within each module are initialized to random values in preparation for local optimization.

The computation cost of the global layout results mainly from this initialization procedure. To avoid node overlap, the algorithm has to check repeatedly, through a nested loop, whether the coordinates (i.e. abscissa and ordinate) of any two nodes are identical. For a network of $$n$$ nodes, the nested loop takes $$O(n^{2})$$ time to complete the global layout.

#### 2.3 Step 3: local optimization

The procedure of local optimization. On the basis of the global layout, the positions of the child nodes embedded in the modules are optimized, as shown in Fig. 1d.
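As a recap of Step 2 before the details of local optimization, the sector allocation and random initialization can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' Java implementation: the function names, the `radius` and `max_tries` parameters, the module sizes and the use of a set for the overlap check (in place of the nested loop described in the text) are all hypothetical choices.

```python
import math
import random


def sector_angles(module_sizes):
    """Return one (start, end) angle pair in radians per module.

    Each central angle is proportional to the module's node count,
    so the sectors together cover the full circle.
    """
    total = sum(module_sizes)
    if total <= 0:
        raise ValueError("modules must contain at least one node")
    sectors, start = [], 0.0
    for size in module_sizes:
        theta = 2.0 * math.pi * size / total
        sectors.append((start, start + theta))
        start += theta
    return sectors


def initial_positions(module_sizes, radius, seed=0, max_tries=10_000):
    """Place every node on a distinct random grid point inside its sector."""
    rng = random.Random(seed)
    occupied = set()   # grid points already taken by some node
    positions = []     # positions[k] lists the grid points of module k
    for (start, end), size in zip(sector_angles(module_sizes), module_sizes):
        points = []
        for _ in range(size):
            for _ in range(max_tries):
                angle = rng.uniform(start, end)
                r = rng.uniform(0.0, radius)
                # Rounding snaps to the grid; a point near a sector
                # border may land slightly outside the sector.
                point = (round(r * math.cos(angle)), round(r * math.sin(angle)))
                if point not in occupied:
                    break
            else:
                raise RuntimeError("sector too crowded; increase radius")
            occupied.add(point)
            points.append(point)
        positions.append(points)
    return positions


# Hypothetical module sizes for a 28-node network split into five modules.
sizes = [8, 6, 5, 5, 4]
sectors = sector_angles(sizes)
layout = initial_positions(sizes, radius=15)
```

The set lookup makes each overlap check constant time on average; the nested-loop check described in the text would give the same result at $$O(n^{2})$$ cost.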
We adopted a strategy similar to the original grid layout [10,15], with modifications to improve computation performance. Given a network of $$n$$ nodes, the layout $$\textbf{R}$$ is denoted by $$\textbf{R} = \textbf{R}(\textbf{r}_{1}, \textbf{r}_{2}, \ldots, \textbf{r}_{n})$$, where $$\textbf{r}_{i} = (x_{i}, y_{i})$$ are the coordinates of node $$i$$. The original grid optimization [10] seeks the best layout by optimizing a cost function $$f(\textbf{W}, \textbf{R})$$ and positions all nodes on grid points to avoid node overlap:

$$f(\textbf{W}, \textbf{R}) = \sum_{i<j}^{n} f_{ij}(\textbf{W}, \textbf{R}) \qquad (1)$$

The cost between nodes $$i$$ and $$j$$ is $$f_{ij}(\textbf{W}, \textbf{R}) = w_{ij}\, d(\textbf{r}_{i}, \textbf{r}_{j})$$, where $$w_{ij}$$ is the interaction weight between nodes $$i$$ and $$j$$, which describes how the two nodes interact; the weights of all node pairs constitute the weight matrix $$\textbf{W}$$. The term $$d(\textbf{r}_{i}, \textbf{r}_{j})$$ is the Manhattan distance between nodes $$i$$ and $$j$$. In general, node pairs with higher positive weights are positioned closer together and node pairs with negative weights farther apart during the optimization procedure.

As shown in Fig. 2, the procedure begins with a random layout $$\textbf{R}$$, which is subjected to local optimization that moves every single node to a neighbouring vacant grid point so as to reduce the cost $$f(\textbf{W}, \textbf{R})$$. To escape local minima, a layout $$\textbf{R}'$$ is generated by a perturbation procedure that moves each node to a randomly chosen adjacent grid point with a given probability. The perturbed layout $$\textbf{R}'$$ is optimized once more, and the layout with the lower cost is selected as the next input; the process is repeated niter times.

Fig. 2. The pseudo-code of local optimization.
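The cost function of Eq. (1) and the optimize, perturb and re-optimize loop described above can be sketched in Python as follows. This is a minimal sketch and not the pseudo-code of Fig. 2: the bounded `size` x `size` grid, the 8-point neighbourhood, the perturbation probability `prob`, the restriction of perturbation moves to vacant points and all function names are assumptions made for illustration.

```python
import random

# Assumption: a node may move to any of its 8 surrounding grid points.
NEIGHBOURS = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)
              if (dx, dy) != (0, 0)]


def manhattan(p, q):
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def layout_cost(weights, layout):
    """f(W, R): sum over i < j of w_ij * d(r_i, r_j), as in Eq. (1)."""
    n = len(layout)
    return sum(weights[i][j] * manhattan(layout[i], layout[j])
               for i in range(n) for j in range(i + 1, n))


def vacant_neighbours(point, occupied, size):
    """Adjacent points inside the size x size grid that no node occupies."""
    x, y = point
    for dx, dy in NEIGHBOURS:
        cand = (x + dx, y + dy)
        if 0 <= cand[0] < size and 0 <= cand[1] < size and cand not in occupied:
            yield cand


def local_minimise(weights, layout, size):
    """Greedily move single nodes to vacant neighbours while the cost drops."""
    layout = list(layout)
    improved = True
    while improved:
        improved = False
        for i in range(len(layout)):
            best, best_cost = layout[i], layout_cost(weights, layout)
            for cand in vacant_neighbours(layout[i], set(layout), size):
                trial = layout[:i] + [cand] + layout[i + 1:]
                cost = layout_cost(weights, trial)
                if cost < best_cost:
                    best, best_cost = cand, cost
            if best != layout[i]:
                layout[i] = best
                improved = True
    return layout


def perturb(layout, size, prob, rng):
    """Move each node, with probability prob, to a random vacant neighbour."""
    layout = list(layout)
    for i in range(len(layout)):
        if rng.random() < prob:
            options = list(vacant_neighbours(layout[i], set(layout), size))
            if options:
                layout[i] = rng.choice(options)
    return layout


def optimise(weights, layout, size, niter=10, prob=0.3, seed=0):
    """Keep the cheaper of the current and the perturbed, re-optimized layout."""
    rng = random.Random(seed)
    best = local_minimise(weights, layout, size)
    for _ in range(niter):
        trial = local_minimise(weights, perturb(best, size, prob, rng), size)
        if layout_cost(weights, trial) < layout_cost(weights, best):
            best = trial
    return best


# Hypothetical three-node example with one negative weight.
weights = [[0, 30, -20],
           [30, 0, 10],
           [-20, 10, 0]]
start = [(0, 0), (4, 4), (2, 0)]
final = optimise(weights, start, size=5)
```

Bounding the grid matters here: with negative weights the cost could otherwise be lowered indefinitely by pushing nodes apart. The sketch recomputes the full cost for every trial move, which is simple but slow; an incremental update of only the terms involving the moved node would be the natural refinement.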
One main disadvantage of the original grid optimization is the high time cost of computing the weight matrix $$\textbf{W}$$. For a network of $$n$$ nodes with $$n \times n$$ adjacency matrix $$\textbf{A}$$, the original grid layout algorithm [10] obtains $$\textbf{W}$$ by multiplying $$\textbf{A}$$ by itself, which has a computational complexity of $$O(n^{3})$$. To lower the computation cost, a two-stage search strategy for setting the weight matrix $$\textbf{W}$$ is proposed and integrated into the local optimization algorithm through the function getWeight() (Fig. 2, line 2).

Two-stage search strategy. The weight matrix is generated from the path lengths (i.e. graph distances) between node pairs. In Fig. 3, for example, node A has three adjacent nodes B, C and D, so the path length between A and each of them is defined as 1; node A is connected to nodes E and F via node D, so the path length between A and E or F is defined as 2. In the first stage, the algorithm searches for node pairs with path length 1; in the second stage, it searches for node pairs with path length 2 (Fig. 3b and c, respectively).

Fig. 3. The two-stage search strategy.

The algorithm sets the weights according to the criterion that closely connected nodes (smaller path lengths) are assigned higher weights and sparsely connected nodes (larger path lengths) are assigned lower or negative weights. As shown in Fig. 3d, the weight values (e.g. 30, 10, $$-$$20) are set empirically and can be adjusted according to users' needs.

The computation cost of the local layout derives mainly from setting the weight matrix and from the perturbation process.
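The two-stage search described above can be sketched in Python as follows. This is an illustrative sketch, not the authors' getWeight() function: the function name and signature are hypothetical, the mapping of the example values (30 for path length 1, 10 for path length 2, $$-$$20 for all other pairs) is an assumed reading of Fig. 3d, and the edge list assumes the example graph contains only the connections mentioned in the text.

```python
def two_stage_weights(n, edges, w_near=30, w_mid=10, w_far=-20):
    """Build a symmetric weight matrix from path lengths 1 and 2."""
    adjacency = [set() for _ in range(n)]
    for u, v in edges:
        if u != v:
            adjacency[u].add(v)
            adjacency[v].add(u)

    # Every pair starts with the "far" weight; the diagonal is unused.
    weights = [[0 if i == j else w_far for j in range(n)] for i in range(n)]

    # Stage 1: node pairs with path length 1 (direct neighbours).
    for i in range(n):
        for j in adjacency[i]:
            weights[i][j] = w_near

    # Stage 2: node pairs with path length 2 (common neighbour, not adjacent).
    for i in range(n):
        for k in adjacency[i]:
            for j in adjacency[k]:
                if j != i and j not in adjacency[i]:
                    weights[i][j] = w_mid
    return weights


# Nodes of the Fig. 3 example as indices: A=0, B=1, C=2, D=3, E=4, F=5.
# Assumed edges: A-B, A-C, A-D, D-E, D-F.
edges = [(0, 1), (0, 2), (0, 3), (3, 4), (3, 5)]
W = two_stage_weights(6, edges)
```

With these assumptions, A and D receive weight 30, A and E receive 10, and B and E (path length 3) receive $$-$$20. The sketch walks adjacency sets rather than the dual loops over all node pairs described in the text; filling the matrix still takes $$O(n^{2})$$ time.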
For a network of $$n$$ nodes, the two-stage search requires $$O(n^{2})$$ time to assign the weight values through two nested loops, and the local layout algorithm also spends $$O(n^{2})$$ time searching for vacant grid points on the two-dimensional plane during the perturbation process. The computation cost of the local layout is therefore $$O(n^{2}) + O(n^{2}) = O(n^{2})$$.

### 3. Algorithm parameter and computation complexity

The number of iterations niter in local optimization is the crucial parameter that controls computation time. A small niter usually yields drawings of relatively low quality, whereas a large niter is unnecessary because improvements become harder and harder to obtain in later iterations. After repeated testing in practice, we chose a moderate value of $$niter = 10$$, which is usually enough to generate satisfactory layouts. As discussed in Section 2, for a network of $$m$$ edges and $$n$$ nodes, the running time of the three steps is $$O(m \log n) + O(n^{2}) + niter \times O(n^{2}) = O(m \log n) + O(n^{2})$$.

### 4. Implementation

GBL is implemented in Java and integrated into a visualization graphical user interface (GUI) developed on top of JGraph, an open-source graph visualization library also written in Java (https://github.com/jgraph/jgraphx). With the rich graphical functionality of JGraph, the GUI provides user-friendly operations on network drawings, such as zooming in and out, showing and hiding labels, and moving nodes. Figure 4a is an example drawing of a global layout that includes 14 modules, each marked with a different shape and colour. All modules are placed on fan-shaped areas around a centre point, and the child nodes within the modules are positioned randomly, without local optimization, when the iteration parameter niter $$=$$ 0. Figure 4b is the drawing that results from local optimization with niter $$=$$ 10, in which the positions of the modules remain relatively stable.
GBL and the visualization GUI can be freely downloaded from http://jsjxy.jstu.edu.cn/Detail.aspx?DepartColumnId=14.

Fig. 4. A layout of the yeast cell cycle regulatory network (200 nodes) visualized by GBL in the JGraph-based GUI. The network is divided into 14 modules by the multi-level clustering algorithm. (a) Global layout: the modules are placed on fan-shaped areas of varying size, and the child nodes within the modules are positioned randomly (niter $$=$$ 0). (b) Local optimization: the positions of the child nodes embedded in the modules are optimized on the basis of the global layout (niter $$=$$ 10).

### 5. Results and discussion

As shown in Table 1, 11 example networks were selected to evaluate and compare the algorithms. They comprise five types of network (regulatory, protein–protein interaction, metabolic, social and scientific collaboration, and communication) that represent diverse topological properties. Note that the average node degrees of the biological networks are relatively lower than those of the other networks.

Table 1. Test networks

| No. | Network name | Network type | Number of nodes | Number of edges | Average node degree | Ref. |
|---|---|---|---|---|---|---|
| 1 | Yeast cell cycle | Regulatory network | 200 | 270 | 2.700 | [19] |
| 2 | Uetz-screen | Protein–protein interaction | 263 | 292 | 2.179 | [20] |
| 3 | Ito-core | Protein–protein interaction | 426 | 568 | 2.556 | [21] |
| 4 | Y2H-CCSB | Protein–protein interaction | 964 | 1598 | 3.200 | [22] |
| 5 | PAO1 | Metabolic network | 1294 | 1590 | 2.449 | [23] |
| 6 | S. cerevisiae iFF708 | Metabolic network | 2879 | 5616 | 3.884 | [24] |
| 7 | Aspergillus niger | Metabolic network | 3401 | 7193 | 4.229 | [25] |
| 8 | Facebook combined | Facebook social network | 4039 | 88234 | 43.69 | [26] |
| 9 | Aspergillus oryzae | Metabolic network | 4976 | 11042 | 4.446 | [27] |
| 10 | CA-GrQc | Co-author network of relativity theory study | 5242 | 14496 | 5.528 | [26] |
| 11 | p2p-Gnutella08 | Gnutella peer-to-peer communication network | 6301 | 20777 | 6.595 | [26] |

As discussed in Section 2, GBL is more similar to grid layouts than to force-directed ones. There are five main existing grid layout algorithms: the original grid layout [10], Cerebral [12], SCCB-grid layout [13], LGL [15] and hybrid grid layout [17].
According to [15], LGL performs clearly better than the original grid layout and Cerebral, which are only suitable for drawing networks of several hundred nodes. The SCCB-grid layout also fails to meet the requirements of real-time drawing of complex networks with several thousand nodes because of its high time complexity [13]. Among the available algorithms, LGL and hybrid grid layout attain relatively better performance than the other grid layouts and are suitable for visualizing complex networks. Unfortunately, for networks with high average node degree, hybrid grid layout usually does not work and cannot output any layout. LGL was therefore chosen as the reference algorithm for comparison with GBL. Five evaluations [17] were carried out, described as follows.

Computation efficiency: the computation speed of the algorithm.

Ratio of edge–edge crossings: the number of edge–edge crossings divided by the number of edge combinations; a smaller ratio means lower visual complexity.

Ratio of node–edge crossings: the number of node–edge crossings divided by the number of node–edge combinations; a smaller ratio reduces misreadings of network connectivity caused by edges passing over nodes.

Relative edge length: the sum of all edge lengths divided by the product of the area of the layout space and the number of edges. It examines whether the nodes are distributed evenly over the layout space; a smaller relative edge length means a better layout.

Connectivity F-measure: evaluates whether densely connected nodes are positioned together; a larger value indicates a better layout (higher modularity) [10,14,28].

Figure 5 compares the computation efficiency of LGL and GBL. GBL achieves relatively high computation speed owing to the new method of setting the weight matrix, and a network of 6000 nodes can be drawn in around 100 s.
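Returning to the evaluation criteria defined above, the ratio of edge–edge crossings can be sketched in Python as follows. This is a hedged illustration rather than the evaluation code used in the article: it assumes that "edge combinations" means all unordered pairs of edges, that two edges sharing an endpoint never count as crossing, and that only proper (interior) intersections are counted. The function names and the small example layout are hypothetical.

```python
from itertools import combinations


def orientation(a, b, c):
    """Sign of the cross product (b - a) x (c - a): 1, -1 or 0."""
    value = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (value > 0) - (value < 0)


def segments_cross(p1, p2, q1, q2):
    """True if the two segments intersect at a point interior to both."""
    return (orientation(p1, p2, q1) * orientation(p1, p2, q2) < 0
            and orientation(q1, q2, p1) * orientation(q1, q2, p2) < 0)


def edge_crossing_ratio(layout, edges):
    """Number of crossing edge pairs divided by the number of edge pairs."""
    pairs = list(combinations(edges, 2))
    if not pairs:
        return 0.0
    crossings = sum(
        1 for (a, b), (c, d) in pairs
        if len({a, b, c, d}) == 4  # skip pairs that share an endpoint
        and segments_cross(layout[a], layout[b], layout[c], layout[d])
    )
    return crossings / len(pairs)


# Hypothetical four-node layout: the two diagonals of a square cross once.
layout = {0: (0, 0), 1: (2, 2), 2: (0, 2), 3: (2, 0)}
edges = [(0, 1), (2, 3), (0, 2)]
ratio = edge_crossing_ratio(layout, edges)
```

Here three edge pairs exist and only the two diagonals cross, so the ratio is one third. The pairwise test is quadratic in the number of edges, which is acceptable for a sketch but would need a sweep-line or spatial index for dense networks.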
In addition, the computation cost of LGL increases sharply for networks with high average node degree, such as the Facebook network (4039 nodes, average node degree $$=$$ 43.69) and CA-GrQc (5242 nodes, average node degree $$=$$ 5.528). By comparison, the time cost of GBL increases steadily as a quadratic function, consistent with its $$O(n^{2})$$ complexity. This is understandable, because all high-degree nodes are preprocessed by the graph clustering algorithm and assigned to modules, which greatly lowers the computation cost.

Fig. 5. The comparison of computation efficiency between LGL and GBL. All evaluations are averages of 10 runs ($$niter = 10$$) on a Dell laptop (OS: Windows 7 64-bit; CPU: Intel Core i7-4790, 3.60 GHz; memory: 16.00 GB).

As shown in Fig. 6a–c, GBL performs better than LGL in the ratio of edge–edge crossings, the ratio of node–edge crossings and relative edge length; dense connectivity within modules and sparse connectivity between modules are the main reasons for this. Figure 6d indicates that GBL also has a better connectivity F-measure than LGL, which is to be expected because GBL groups densely connected nodes into modules.

Fig. 6. The comparison of algorithm characterization between LGL and GBL. (a) Ratio of edge–edge crossings, (b) ratio of node–edge crossings, (c) relative edge length and (d) connectivity F-measure.

### 6. Conclusions

GBL is a new grid layout algorithm that displays the inherent modular structure of a network while revealing the linkage details of the nodes within modules, and that avoids node overlap by means of grid optimization. Owing to the introduction of a graph clustering algorithm and the two-stage search strategy for setting the weight matrix, GBL outperforms other grid layouts in network visualization characteristics such as computation speed, edge–edge crossings, node–edge crossings, relative edge length and connectivity F-measure. To meet the requirements of real-time drawing, GBL speeds up the layout process dramatically and generates an acceptable layout within several minutes for a network of several thousand nodes. It should also be noted that grid-optimization algorithms, including GBL, have relatively high time costs compared with force-directed algorithms. They are therefore not suitable for visualizing large-scale networks, and we intend to design a further optimized grid layout algorithm to visualize larger networks.

### Funding

National Natural Science Foundation of China (61472166); Natural Science Foundation of Jiangsu Province of China (BK20161199); and Jiangsu Overseas Research & Training Program for University Prominent Young & Middle-aged Teachers and Presidents (201613).

### Author contributions

The basic idea was conceived by S.H. and F.Y. This idea was developed by S.H. and Y.L., who then conceived a new idea and developed it. F.Y. and D.G. took part in design and software evaluation. S.H. and F.Y. wrote the article. All authors read and approved the final article.

### References

1. Hartwell, L. H., Hopfield, J. J., Leibler, S. & Murray, A. W. (1999) From molecular to modular cell biology. Nature, 402, C47–C52.
2. Barabasi, A. L. & Oltvai, Z. N. (2004) Network biology: understanding the cell's functional organization. Nat. Rev. Genet., 5, 101–113.
3. Galvo, V., Miranda, J. G. V., Andrade, Jr R. F. S., Andrade, J. S. A., Gallos, L. K. & Makse, H. A. (2010) Modularity map of the network of human cell differentiation. Proc. Natl. Acad. Sci. USA, 107, 5750–5755.
4. Weatheritt, R. J., Luck, K., Petsalaki, E., Davey, N. E. & Gibson, T. J. (2012) The identification of short linear motif-mediated interfaces within the human interactome. Bioinformatics, 28, 976–982.
5. Hu, Z., Mellor, J., Wu, J., Kanehisa, M., Stuart, J. M. & DeLisi, C. (2007) Towards zoomable multidimensional maps of the cell. Nat. Biotechnol., 25, 547–554.
6. Tuikkala, J., Vahamaa, H., Salmela, P., Nevalainen, O. & Aittokallio, T. (2012) A multilevel layout algorithm for visualizing physical and genetic interaction networks, with emphasis on their modular organization. BioData Mining, 5, 2.
7. Shannon, P., Markiel, A., Ozier, O., Baliga, N. S., Wang, J. T., Ramage, D., Amin, N., Schwikowski, B. & Ideker, T. (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res., 13, 2498–2504.
8. Funahashi, A., Matsuoka, Y., Jouraku, A., Morohashi, M., Kikuchi, N. & Kitano, H. CellDesigner 3.5: a versatile modeling tool for biochemical networks. Proc. IEEE, 96, 1254–1265.
9. Thomas, M. J. F. & Edward, M. R. (1991) Graph drawing by force-directed placement. Software-Practice & Experience, 21, 1129–1164.
10. Li, W. & Kurata, H. (2005) A grid layout algorithm for automatic drawing of biochemical networks. Bioinformatics, 21, 2036–2042.
11. Kato, M., Nagasaki, M., Doi, A. & Miyano, S. (2005) Automatic drawing of biological networks using cross cost and subcomponent data. Genome Inform., 16, 22–31.
12. Barsky, A., Gardy, J. L., Hancock, R. E. W. & Munzner, T. (2007) Cerebral: a Cytoscape plugin for layout of and interaction with biological networks using subcellular localization annotation. Bioinformatics, 23, 1040–1042.
13. Kojima, K., Nagasaki, M., Jeong, E., Kato, M. & Miyano, S. (2007) An efficient grid layout algorithm for biological networks utilizing various biological attributes. BMC Bioinformatics, 8, 76.
14. Kojima, K., Nagasaki, M. & Miyano, S. (2008) Fast grid layout algorithm for biological networks with sweep calculation. Bioinformatics, 24, 1433–1441.
15. He, S., Mei, J., Shi, G., Wang, Z. & Li, W. (2010) LucidDraw: efficiently visualizing complex biochemical networks within MATLAB. BMC Bioinformatics, 11, 31.
16. Kojima, K., Nagasaki, M. & Miyano, S. (2010) An efficient biological pathway layout algorithm combining grid-layout and spring embedder for complicated cellular location information. BMC Bioinformatics, 11, 335.
17. Inoue, K., Shimozono, S., Yoshida, H. & Kurata, H. (2012) Application of approximate pattern matching in two dimensional spaces to grid layout for biochemical network maps. PLoS One, 7, e37739.
18. Noack, A. & Rotta, R. (2009) Multi-level Algorithms for Modularity Clustering. In: Experimental Algorithms. SEA 2009 (Vahrenhold J. eds). Lecture Notes in Computer Science, vol. 5526. Heidelberg, Berlin: Springer.
19. Kurata, H., Matoba, N. & Shimizu, N. (2003) CADLIVE for constructing a large-scale biochemical network based on a simulation-directed notation and its application to yeast cell cycle. Nucleic Acids Res., 31, 4071–4084.
20. Uetz, P., Giot, L., Cagney, G., Mansfield, T. A., Judson, R. S., Knight, J. R., Lockshon, D., Narayan, V., Srinivasan, M., Pochart, P., Qureshi-Emili, A., Li, Y., Godwin, B., Conover, D., Kalbfleisch, T., Vijayadamodar, G., Yang, M., Johnston, M., Fields, S. & Rothberg, J. M. (2000) A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae. Nature, 403, 623–627.
21. Ito, T., Chiba, T., Ozawa, R., Yoshida, M., Hattori, M. & Sakaki, Y. (2001) A comprehensive two-hybrid analysis to explore the yeast protein interactome. Proc. Natl. Acad. Sci. USA, 98, 4569–4574.
22. Yu, H., Braun, P., Yildirim, M. A., Lemmens, I., Venkatesan, K., Sahalie, J., Hirozane-Kishikawa, T., Gebreab, F., Li, N., Simonis, N., Hao, T., Rual, J.-F., Dricot, A., Vazquez, A., Murray, R. R., Simon, C., Tardivo, L., Tam, S., Svrzikapa, N., Fan, C., de Smet, A.-S., Motyl, A., Hudson, M. E., Park, J., Xin, X., Cusick, M. E., Moore, T., Boone, C., Snyder, M., Roth, F. P., Barabási, A.-L., Tavernier, J., Hill, D. E. & Vidal, M. (2008) High-quality binary protein interaction map of the yeast interactome network. Science, 322, 104–110.
23. Oberhardt, M. A., Puchalka, J., Fryer, K. E., Martins dos Santos, V. A. P. & Papin, J. A. (2008) Genome-scale metabolic network analysis of the opportunistic pathogen Pseudomonas aeruginosa PAO1. Journal of Bacteriology, 190, 2790–2803.
24. Forster, J., Famili, I., Fu, P., Palsson, B. & Nielsen, J. (2003) Genome-scale reconstruction of the Saccharomyces cerevisiae metabolic network. Genome Res., 13, 244–253.
25. Andersen, M. R., Nielsen, M. L. & Nielsen, J. (2008) Metabolic model integration of the bibliome, genome, metabolome and reactome of Aspergillus niger. Mol. Syst. Biol., 4, 178.
26. Leskovec, J., Kleinberg, J. & Faloutsos, C. (2007) Graph evolution: densification and shrinking diameters. ACM Transactions on Knowledge Discovery from Data, 1, 1–42.
27. Vongsangnak, W., Olsen, P., Hansen, K., Krogsgaard, S. & Nielsen, J. (2008) Improved annotation through genome-scale metabolic modeling of Aspergillus oryzae. BMC Genomics, 9, 245.
28. Yamada, T., Saito, K. & Ueda, N. (2003) Cross-Entropy Directed Embedding of Network Data. Proceedings of the Twentieth International Conference on Machine Learning: 2003 (Fawcett T. & Mishra N. eds). Washington DC: Kluwer Academic Publishers, pp. 832–839.

© The authors 2017. Published by Oxford University Press. All rights reserved.

Journal of Complex Networks, Oxford University Press


Journal of Complex Networks, Volume 6 (1) – Feb 1, 2018
10 pages

Publisher: Oxford University Press
ISSN: 2051-1310
eISSN: 2051-1329
DOI: 10.1093/comnet/cnx026

### Abstract

Abstract Clearly visualized networks provide a great help in understanding complex systems, which require designing efficient layout algorithm to draw the network diagrams. Compared with force-directed algorithm, the algorithms of grid-optimization series have the advantage of avoiding node overlap through positioning nodes on square grid points of twodimensional. However, for networks of several thousand nodes, the computation costs of extant grid layouts are too high to meet the visualization requirements. In order to improve computation performance, this article proposes a new grid point matching based algorithm named grid-based layout (GBL) with three procedures to draw complex networks. Firstly, graph clustering algorithm is applied to divide network into several modules constituted of closely connected nodes, then all modules are placed on separated two-dimensional space as global layout, and finally, the grid layout algorithm is applied to position the child nodes within these modules as local optimization. With GBL help, users can gain insights of global topological structure of network as well as detailed connectivity within these modules. In particular, an improved weight strategy is designed to speed up optimization process. Compared with latest available grid layout algorithm, GBL shows relatively better performances in computing time, edge–edge crossings, node–edge crossings, relative edge lengths and connectivity F-measures. 1. Introduction Modelling and simulation of complex networks are some of the most important tasks in the fields of network analysis and knowledge discovery. For social, biological and internet networks, their visualization seeks to organize the network structures in a manner that helps the users to gain insights into the most interesting patterns and relationships within the data, such as groups of scientific collaboration networks, components of biological pathways and autonomous systems of internet [1–5]. 
With the developments of information technologies, experimental datasets are increasing sharply in their size and complexity, bring a great difficult to network visualization. Closely connected nodes in complex networks are usually grouped as a module to fulfil a specific function and modules are often organized in a hierarchical structure. Such modular architecture contributes significantly to clarify the network complexity [1,6]. There exists a wide variety of network layout algorithms that aim to place connected nodes of a graph close to each other on two-dimensional plane. Force directed is a notable layout algorithm available in most popular visualization platforms (e.g. Cytoscape [7] and CellDesigner [8] etc.), where a network is modelled as a mechanical system with nodes having repulsive force and edges having attractive force. However, a drawback of force directed is that the densely connected nodes tend to locate too close and easily overlap each other [9]. To solve the problem, a variation algorithm named grid layout [10] is proposed that nodes are placed on square grid points to avoid node overlap with the expense of computation performance. Grid layout succeeded in generating compact layouts and attracted many concerns [11–17]. To decrease computation cost, Kojima et al. [13,14] proposed a method termed sweep calculation to speed up layout process. He et al. [15] designed an algorithm named LucidDraw grid layout algorithm (LGL), where a neighbourhood-test strategy was used to low computation time. Previous grid layout algorithms work fairly well for small- and medium-sized networks of several hundred networks, whereas larger networks, the computation time is increasing rapidly. In the article, we proposed a new grid optimization algorithm named grid-based layout algorithm (GBL) that is specifically focused on clarifying visual complexity with network modular properties, as well as takes a new optimization strategy to improve computation performance. 
Three main steps are taken to optimize the node positions (Fig. 1). First, the network is preprocessed into several modules by a graph clustering algorithm; then all modules are placed on the two-dimensional plane at the global level; finally, the coordinates of the nodes within all modules are optimized at the local level.

Fig. 1. (a) An example network. (b) Network preprocessing. (c) Global layout and the initialization of the nodes' coordinates. (d) Local optimization: the positions of all nodes are locally optimized with the grid layout algorithm.

## 2. Method

A network can generally be transformed into a graph in which nodes are represented as geometric points and edges between nodes are represented as straight lines. Under this drawing convention, a layout is generated once the coordinates of all nodes have been determined by an optimization algorithm.

### 2.1 Step 1: network preprocessing

For a given network, the node set is first partitioned by a clustering algorithm into several subsets of varying size, termed modules. Such clustering algorithms group closely connected nodes by placing them in the same module and separate sparsely connected nodes by placing them in different modules. In our method, the node set is partitioned by multi-level clustering [18], a heuristic graph clustering algorithm that coarsens the network by iteratively merging clusters and then refines the resulting clusters by iteratively moving individual nodes between them.
Multi-level clustering has the advantages of relatively high network modularity and low computation time; its computational complexity is $$O(m \log n)$$, where $$m$$ and $$n$$ are the numbers of edges and nodes, respectively. The example network of 28 nodes shown in Fig. 1a is used to illustrate the GBL procedure. Fig. 1b shows the result of preprocessing it with multi-level clustering: the node set is partitioned into five subsets, which are the modules of the example network.

### 2.2 Step 2: global layout

The aim of this step is to place all modules resulting from step 1 on the two-dimensional plane. As shown in Fig. 1c, the space is divided into several fan-shaped areas of varying size in which the modules are placed. The size of each fan-shaped area (determined by its central angle $$\theta$$) is proportional to the number of nodes in the module, i.e. a module with more nodes is assigned a relatively larger area to avoid node overlaps. In this step, the coordinates of the nodes within the modules are initialized to random values in preparation for local optimization.

The computation cost of the global layout results mainly from this initialization procedure. To avoid node overlap, the algorithm has to check repeatedly, through a nested loop, whether the coordinates (i.e. abscissa and ordinate) of any two nodes are identical. For a network of $$n$$ nodes, the nested loop takes $$O(n^{2})$$ time to complete the global layout.

### 2.3 Step 3: local optimization

The local optimization procedure is as follows. On the basis of the global layout, the positions of the child nodes within the modules are optimized as shown in Fig. 1d. We adopt a strategy similar to the original grid layout [10,15], with modifications to improve computation performance. Given a network of $$n$$ nodes, the layout $$\mathbf{R}$$ is denoted by $$\mathbf{R} = \mathbf{R}(\mathbf{r}_{1}, \mathbf{r}_{2}, \ldots, \mathbf{r}_{n})$$, where $$\mathbf{r}_{i} = (x_{i}, y_{i})$$ are the coordinates of node $$i$$.
The original grid optimization [10] seeks the best layout by optimizing a cost function $$f(\mathbf{W}, \mathbf{R})$$ while positioning all nodes on grid points to avoid node overlaps:

$$f(\mathbf{W}, \mathbf{R}) = \sum_{i<j}^{n} f_{ij}(\mathbf{W}, \mathbf{R}) \qquad (1)$$

The cost between nodes $$i$$ and $$j$$ is $$f_{ij}(\mathbf{W}, \mathbf{R}) = w_{ij}\, d(\mathbf{r}_{i}, \mathbf{r}_{j})$$, where $$w_{ij}$$ is the interaction weight of nodes $$i$$ and $$j$$, which describes how the two nodes interact; the weights of all node pairs constitute the weight matrix $$\mathbf{W}$$. The term $$d(\mathbf{r}_{i}, \mathbf{r}_{j})$$ is the Manhattan distance between nodes $$i$$ and $$j$$. In general, node pairs with a higher positive weight are positioned closer together and node pairs with a negative weight are positioned further apart during the optimization procedure.

As shown in Fig. 2, the procedure begins with a random layout $$\mathbf{R}$$, which is subjected to local optimization: every single node is moved to a neighbouring vacant grid point whenever this reduces the cost $$f(\mathbf{W}, \mathbf{R})$$. To escape local minima, a layout $$\mathbf{R}'$$ is generated by a perturbation procedure that moves each node to a randomly chosen adjacent grid point with a given probability. The perturbed layout $$\mathbf{R}'$$ is optimized once more, and the layout with the lower cost is selected as the next input. The optimization process is repeated until niter iterations have been performed.

Fig. 2. The pseudo-code of local optimization.

One main disadvantage of the original grid optimization is the high time cost of computing the weight matrix $$\mathbf{W}$$. For a network of $$n$$ nodes, if $$\mathbf{A}$$ is the $$n \times n$$ adjacency matrix, the original grid layout algorithm [10] obtains $$\mathbf{W}$$ by multiplying $$\mathbf{A}$$ by $$\mathbf{A}$$, which has a computational complexity of $$O(n^{3})$$.
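The cost in Eq. (1) can be made concrete with a small sketch. The class and method names (`GridCost`, `manhattan`, `cost`) and the example weights and coordinates are illustrative assumptions, not part of the GBL implementation; the code only shows how a compact placement of positively weighted pairs yields a lower cost than a scattered one.

```java
final class GridCost {
    /** Manhattan distance between two grid points given as {x, y}. */
    static int manhattan(int[] a, int[] b) {
        return Math.abs(a[0] - b[0]) + Math.abs(a[1] - b[1]);
    }

    /** f(W, R) = sum over i < j of w_ij * d(r_i, r_j), following Eq. (1). */
    static long cost(int[][] w, int[][] r) {
        long f = 0;
        for (int i = 0; i < r.length; i++) {
            for (int j = i + 1; j < r.length; j++) {
                f += (long) w[i][j] * manhattan(r[i], r[j]);
            }
        }
        return f;
    }

    public static void main(String[] args) {
        // Hypothetical weights: pairs (0,1) and (1,2) attract, pair (0,2) repels.
        int[][] w = {{0, 30, -20}, {30, 0, 30}, {-20, 30, 0}};
        int[][] compact = {{0, 0}, {1, 0}, {2, 0}};
        int[][] scattered = {{0, 0}, {4, 3}, {1, 0}};
        System.out.println(cost(w, compact));   // prints 20
        System.out.println(cost(w, scattered)); // prints 370
    }
}
```

In the compact layout the two attracting pairs sit on adjacent grid points while the repelling pair is two steps apart, so the optimizer would prefer it over the scattered layout.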
To lower the computation cost, a two-stage search strategy for setting the weight matrix $$\mathbf{W}$$ is proposed and integrated into the local optimization algorithm through the function getWeight() (see Fig. 2, line 2).

In the two-stage search strategy, the weight matrix is generated from the path lengths (i.e. graph distances) between node pairs. In the example of Fig. 3, node A has three adjacent nodes B, C and D, so the path length between A and each of them is defined as 1; node A is connected to nodes E and F via node D, so the path length between A and each of E and F is defined as 2. In the first stage the algorithm searches for node pairs with path length $$= 1$$, and in the second stage it searches for node pairs with path length $$= 2$$, as shown in Fig. 3b and c, respectively.

Fig. 3. The two-stage search strategy.

The algorithm sets the weight values according to the criterion that closely connected nodes (smaller path lengths) are assigned higher weights and sparsely connected nodes (larger path lengths) are assigned lower or negative weights. As shown in Fig. 3d, the weight values (e.g. 30, 10, $$-20$$) are set empirically and can be adjusted according to users' needs.

The computation cost of the local layout derives mainly from setting the weight matrix and from the perturbation process. For a network of $$n$$ nodes, the two-stage search requires $$O(n^{2})$$ time to assign the weight values through two nested loops, and the local layout algorithm also spends $$O(n^{2})$$ time searching for vacant grid points on the two-dimensional plane during perturbation. The computation cost of the local layout is therefore $$O(n^{2}) + O(n^{2}) = O(n^{2})$$.

## 3. Algorithm parameter and computation complexity

The number of iterations niter in the local optimization is the crucial parameter that controls computation time.
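To illustrate how niter drives the running time, the following is a minimal sketch of the perturb-and-reoptimize loop of Section 2.3, including a simple two-stage weight assignment. It is not the authors' implementation. The class name `GblLocalSketch`, the 4-neighbourhood, the perturbation probability of 0.5, the bounded square grid and the mapping of the example weights 30, 10 and -20 to path lengths 1, 2 and greater than 2 are all assumptions made for illustration.

```java
import java.util.Random;

final class GblLocalSketch {
    // Example weights quoted in the text; mapping them to path length 1, 2 and >2 is an assumption.
    static final int W_ONE = 30, W_TWO = 10, W_FAR = -20;
    // 4-neighbourhood on the grid (assumption; the text only says "neighbouring").
    static final int[][] STEP = {{1, 0}, {-1, 0}, {0, 1}, {0, -1}};

    /** Two-stage search on a symmetric adjacency matrix. */
    static int[][] getWeight(boolean[][] adj) {
        int n = adj.length;
        int[][] w = new int[n][n];
        for (int i = 0; i < n; i++)                     // stage 1: path length 1
            for (int j = 0; j < n; j++)
                if (i != j) w[i][j] = adj[i][j] ? W_ONE : W_FAR;
        for (int i = 0; i < n; i++)                     // stage 2: path length 2
            for (int k = 0; k < n; k++) {
                if (!adj[i][k]) continue;
                for (int j = 0; j < n; j++)
                    if (j != i && adj[k][j] && !adj[i][j]) w[i][j] = W_TWO;
            }
        return w;
    }

    /** Cost of Eq. (1): sum over i < j of w_ij times the Manhattan distance. */
    static long cost(int[][] w, int[][] r) {
        long f = 0;
        for (int i = 0; i < r.length; i++)
            for (int j = i + 1; j < r.length; j++)
                f += (long) w[i][j] * (Math.abs(r[i][0] - r[j][0]) + Math.abs(r[i][1] - r[j][1]));
        return f;
    }

    /** True if (x, y) lies on the size-by-size grid and no node occupies it. */
    static boolean vacant(int[][] r, int x, int y, int size) {
        if (x < 0 || y < 0 || x >= size || y >= size) return false;
        for (int[] p : r) if (p[0] == x && p[1] == y) return false;
        return true;
    }

    /** Greedy step: move a node to a neighbouring vacant point whenever the cost drops. */
    static void localOptimize(int[][] w, int[][] r, int size) {
        boolean improved = true;
        while (improved) {
            improved = false;
            for (int[] p : r)
                for (int[] s : STEP) {
                    int ox = p[0], oy = p[1], nx = ox + s[0], ny = oy + s[1];
                    if (!vacant(r, nx, ny, size)) continue;
                    long before = cost(w, r);
                    p[0] = nx; p[1] = ny;
                    if (cost(w, r) < before) improved = true;
                    else { p[0] = ox; p[1] = oy; }      // undo a move that does not help
                }
        }
    }

    static int[][] copy(int[][] r) {
        int[][] q = new int[r.length][];
        for (int i = 0; i < r.length; i++) q[i] = r[i].clone();
        return q;
    }

    /** Each node jumps to a random adjacent vacant point with probability prob. */
    static int[][] perturb(int[][] r, int size, double prob, Random rnd) {
        int[][] q = copy(r);
        for (int[] p : q) {
            if (rnd.nextDouble() >= prob) continue;
            int[] s = STEP[rnd.nextInt(STEP.length)];
            if (vacant(q, p[0] + s[0], p[1] + s[1], size)) { p[0] += s[0]; p[1] += s[1]; }
        }
        return q;
    }

    /** niter rounds of perturb-and-reoptimize; the cheaper layout survives each round. */
    static int[][] run(boolean[][] adj, int[][] start, int size, int niter, long seed) {
        Random rnd = new Random(seed);
        int[][] w = getWeight(adj);
        int[][] best = copy(start);
        localOptimize(w, best, size);
        for (int it = 0; it < niter; it++) {
            int[][] cand = perturb(best, size, 0.5, rnd);
            localOptimize(w, cand, size);
            if (cost(w, cand) < cost(w, best)) best = cand;
        }
        return best;
    }
}
```

Each round performs one perturbation and one local optimization pass, so the total work grows linearly with niter. For brevity the sketch recomputes the full cost for every candidate move, which is slower than an incremental update would be.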
A small niter usually results in drawings of relatively low quality, whereas a large niter is unnecessary because improvements become harder and harder to obtain in later iterations. After repeated testing in practice, we chose a moderate value of $$niter = 10$$, which is usually enough to generate satisfactory layouts. As discussed in Section 2, for a network of $$m$$ edges and $$n$$ nodes, the running time of the three steps is $$O(m \log n) + O(n^{2}) + niter \times O(n^{2}) = O(m \log n) + O(n^{2})$$.

## 4. Implementation

GBL is implemented in Java and integrated into a visualization graphical user interface (GUI) developed on top of JGraph, an open source graph visualization library also written in Java (https://github.com/jgraph/jgraphx). With the support of the rich graphical functionality of JGraph, the GUI provides user-friendly operations on network drawings such as zooming in and out, showing and hiding labels and moving nodes. Figure 4a is an example drawing of a global layout that includes 14 modules, each marked with a different shape and colour. All modules are placed on fan-shaped areas around a centre point, and the child nodes within these modules are positioned randomly with no local optimization, i.e. with the iteration parameter niter $$= 0$$. Figure 4b is the drawing that results after local optimization with niter $$= 10$$; the positions of the modules remain relatively stable. GBL and the visualization GUI can be freely downloaded from http://jsjxy.jstu.edu.cn/Detail.aspx?DepartColumnId=14.

Fig. 4. A layout of the yeast cell cycle regulatory network of 200 nodes visualized by GBL in the JGraph-based GUI. The network is divided into 14 modules by the multi-level clustering algorithm. (a) Global layout: the modules are placed on fan-shaped areas of varying size and the child nodes within these modules are positioned randomly (niter $$= 0$$). (b) Local optimization: the positions of the child nodes within the modules are optimized on the basis of the global layout (niter $$= 10$$).

## 5. Results and discussions

As shown in Table 1, 11 example networks were selected to evaluate and compare the algorithms. They cover five different types of networks (regulatory, protein–protein interaction, metabolic, social and scientific collaboration, and communication) that represent diverse topological properties. Note that the average node degrees of the biological networks are relatively lower than those of the other networks.

Table 1. Test networks

| No. | Network name | Network type | Number of nodes | Number of edges | Average node degree | Ref. |
|---|---|---|---|---|---|---|
| 1 | Yeast cell cycle | Regulatory network | 200 | 270 | 2.700 | [19] |
| 2 | Uetz-screen | Protein–protein interaction | 263 | 292 | 2.179 | [20] |
| 3 | Ito-core | Protein–protein interaction | 426 | 568 | 2.556 | [21] |
| 4 | Y2H-CCSB | Protein–protein interaction | 964 | 1598 | 3.200 | [22] |
| 5 | PAO1 | Metabolic network | 1294 | 1590 | 2.449 | [23] |
| 6 | S. cerevisiae iFF708 | Metabolic network | 2879 | 5616 | 3.884 | [24] |
| 7 | Aspergillus niger | Metabolic network | 3401 | 7193 | 4.229 | [25] |
| 8 | Facebook combined | Facebook social network | 4039 | 88234 | 43.69 | [26] |
| 9 | Aspergillus oryzae | Metabolic network | 4976 | 11042 | 4.446 | [27] |
| 10 | CA-GrQc | Co-author network of relativity theory study | 5242 | 14496 | 5.528 | [26] |
| 11 | p2p-Gnutella08 | Gnutella peer-to-peer communication network | 6301 | 20777 | 6.595 | [26] |

As discussed in Section 2, GBL is more similar to the grid layouts than to the force-directed ones. There are five main existing grid layout algorithms: the original grid layout [10], Cerebral [12], SCCB-grid layout [13], LGL [15] and hybrid grid layout [17].
According to the report in [15], LGL clearly outperforms the original grid layout and Cerebral, which are only suitable for drawing networks of several hundred nodes. The SCCB-grid layout also does not meet the requirements of real-time drawing of complex networks with several thousand nodes because of its high time complexity [13]. Among the available algorithms, LGL and hybrid grid layout attain relatively better performance than the other grid layouts and are suitable for the visualization of complex networks. Unfortunately, for networks with high average node degree, hybrid grid layout usually does not work and cannot output any layout result. LGL was therefore chosen as the reference algorithm to be compared with GBL.

Five different evaluations [17] were carried out, described as follows.

- Computation efficiency: the computation speed of the algorithm.
- Ratio of edge–edge crossings: the number of edge–edge crossings divided by the number of edge combinations. A smaller ratio means lower visual complexity.
- Ratio of node–edge crossings: the number of node–edge crossings divided by the number of node–edge combinations. A smaller ratio reduces the risk of misreading network connectivity when edges cross nodes.
- Relative edge length: the sum of the lengths of all edges divided by the product of the area of the layout space and the number of edges. It is used to examine whether the nodes are distributed evenly over the layout space; a smaller relative edge length means a better layout.
- Connectivity F-measure: evaluates whether densely connected nodes are positioned together; a larger value indicates a better layout (higher modularity) [10,14,28].

Figure 5 compares the computation efficiency of LGL and GBL. GBL achieves a relatively high computation speed owing to the new method of setting the weight matrix, and a network of about 6000 nodes can be drawn in around 100 s.
In addition, the computation cost of LGL increases sharply for networks of high average node degree, such as the Facebook network (4039 nodes, average node degree $$= 43.69$$) and CA-GrQc (5242 nodes, average node degree $$= 5.528$$). By comparison, the time cost of GBL shows a stable increase that follows a quadratic function, in line with the complexity of $$O(n^{2})$$. This is understandable, because all high-degree nodes are preprocessed by the graph clustering algorithm and assigned to modules, which greatly lowers the computation cost.

Fig. 5. Comparison of computation efficiency between LGL and GBL. All evaluations are based on averages of 10 runs ($$niter = 10$$) on a Dell laptop (OS: Windows 7 64 bit, CPU: Intel Core i7-4790, 3.60 GHz, Memory: 16.00 GB).

As shown in Fig. 6a–c, GBL performs considerably better than LGL in the ratio of edge–edge crossings, the ratio of node–edge crossings and relative edge length. Dense connectivity within modules and sparse connectivity between modules are the main reasons for the success of GBL in these respects. Figure 6d indicates that GBL has a better connectivity F-measure than LGL, which is expected because GBL groups densely connected nodes into modules.

Fig. 6. Comparison of algorithm characteristics between LGL and GBL. (a) Ratio of edge–edge crossings, (b) ratio of node–edge crossings, (c) relative edge length and (d) connectivity F-measure.

## 6. Conclusions

GBL is a new grid layout algorithm that displays the inherent modular structure of a network while still revealing the linkage details of nodes within modules, and that avoids node overlaps with the help of grid optimization. Owing to the introduction of a graph clustering algorithm and the two-stage search strategy for setting the weight matrix, GBL outperforms the grid layout it was compared with in network visualization characteristics such as computation speed, edge–edge crossings, node–edge crossings, relative edge length and connectivity F-measure. GBL speeds up the layout process dramatically and generates an acceptable layout within several minutes for a network of several thousand nodes. It should also be noted that algorithms of the grid-optimization family, including GBL, require relatively high time costs compared with those of the force-directed family. They are therefore not suitable for the visualization of large-scale networks, and we would like to design a further optimized grid layout algorithm to visualize larger networks.

## Funding

National Natural Science Foundation of China (61472166); Natural Science Foundation of Jiangsu Province of China (BK20161199); and Jiangsu Overseas Research & Training Program for University Prominent Young & Middle-aged Teachers and Presidents (201613).

## Author contributions

The basic idea was conceived by S.H. and F.Y. This idea was developed by S.H. and Y.L., who then conceived a new idea and developed it. F.Y. and D.G. took part in design and software evaluation. S.H. and F.Y. wrote the article. All authors read and approved the final article.

## References

1. Hartwell, L. H., Hopfield, J. J., Leibler, S. & Murray, A. W. (1999) From molecular to modular cell biology. Nature, 402, C47–C52.
2. Barabasi, A. L. & Oltvai, Z. N. (2004) Network biology: understanding the cell's functional organization. Nat. Rev. Genet., 5, 101–113.
3. Galvão, V., Miranda, J. G. V., Andrade, Jr R. F. S., Andrade, J. S. A., Gallos, L. K. & Makse, H. A. (2010) Modularity map of the network of human cell differentiation. Proc. Natl. Acad. Sci. USA, 107, 5750–5755.
4. Weatheritt, R. J., Luck, K., Petsalaki, E., Davey, N. E. & Gibson, T. J. (2012) The identification of short linear motif-mediated interfaces within the human interactome. Bioinformatics, 28, 976–982.
5. Hu, Z., Mellor, J., Wu, J., Kanehisa, M., Stuart, J. M. & DeLisi, C. (2007) Towards zoomable multidimensional maps of the cell. Nat. Biotechnol., 25, 547–554.
6. Tuikkala, J., Vahamaa, H., Salmela, P., Nevalainen, O. & Aittokallio, T. (2012) A multilevel layout algorithm for visualizing physical and genetic interaction networks, with emphasis on their modular organization. BioData Mining, 5, 2.
7. Shannon, P., Markiel, A., Ozier, O., Baliga, N. S., Wang, J. T., Ramage, D., Amin, N., Schwikowski, B. & Ideker, T. (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res., 13, 2498–2504.
8. Funahashi, A., Matsuoka, Y., Jouraku, A., Morohashi, M., Kikuchi, N. & Kitano, H. CellDesigner 3.5: a versatile modeling tool for biochemical networks. Proc. IEEE, 96, 1254–1265.
9. Fruchterman, T. M. J. & Reingold, E. M. (1991) Graph drawing by force-directed placement. Software-Practice & Experience, 21, 1129–1164.
10. Li, W. & Kurata, H. (2005) A grid layout algorithm for automatic drawing of biochemical networks. Bioinformatics, 21, 2036–2042.
11. Kato, M., Nagasaki, M., Doi, A. & Miyano, S. (2005) Automatic drawing of biological networks using cross cost and subcomponent data. Genome Inform., 16, 22–31.
12. Barsky, A., Gardy, J. L., Hancock, R. E. W. & Munzner, T. (2007) Cerebral: a Cytoscape plugin for layout of and interaction with biological networks using subcellular localization annotation. Bioinformatics, 23, 1040–1042.
13. Kojima, K., Nagasaki, M., Jeong, E., Kato, M. & Miyano, S. (2007) An efficient grid layout algorithm for biological networks utilizing various biological attributes. BMC Bioinformatics, 8, 76.
14. Kojima, K., Nagasaki, M. & Miyano, S. (2008) Fast grid layout algorithm for biological networks with sweep calculation. Bioinformatics, 24, 1433–1441.
15. He, S., Mei, J., Shi, G., Wang, Z. & Li, W. (2010) LucidDraw: efficiently visualizing complex biochemical networks within MATLAB. BMC Bioinformatics, 11, 31.
16. Kojima, K., Nagasaki, M. & Miyano, S. (2010) An efficient biological pathway layout algorithm combining grid-layout and spring embedder for complicated cellular location information. BMC Bioinformatics, 11, 335.
17. Inoue, K., Shimozono, S., Yoshida, H. & Kurata, H. (2012) Application of approximate pattern matching in two dimensional spaces to grid layout for biochemical network maps. PLoS One, 7, e37739.
18. Noack, A. & Rotta, R. (2009) Multi-level algorithms for modularity clustering. In: Experimental Algorithms. SEA 2009 (Vahrenhold, J. ed.). Lecture Notes in Computer Science, vol. 5526. Berlin, Heidelberg: Springer.
19. Kurata, H., Matoba, N. & Shimizu, N. (2003) CADLIVE for constructing a large-scale biochemical network based on a simulation-directed notation and its application to yeast cell cycle. Nucleic Acids Res., 31, 4071–4084.
20. Uetz, P., Giot, L., Cagney, G., Mansfield, T. A., Judson, R. S., Knight, J. R., Lockshon, D., Narayan, V., Srinivasan, M., Pochart, P., Qureshi-Emili, A., Li, Y., Godwin, B., Conover, D., Kalbfleisch, T., Vijayadamodar, G., Yang, M., Johnston, M., Fields, S. & Rothberg, J. M. (2000) A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae. Nature, 403, 623–627.
21. Ito, T., Chiba, T., Ozawa, R., Yoshida, M., Hattori, M. & Sakaki, Y. (2001) A comprehensive two-hybrid analysis to explore the yeast protein interactome. Proc. Natl. Acad. Sci. USA, 98, 4569–4574.
22. Yu, H., Braun, P., Yildirim, M. A., Lemmens, I., Venkatesan, K., Sahalie, J., Hirozane-Kishikawa, T., Gebreab, F., Li, N., Simonis, N., Hao, T., Rual, J.-F., Dricot, A., Vazquez, A., Murray, R. R., Simon, C., Tardivo, L., Tam, S., Svrzikapa, N., Fan, C., de Smet, A.-S., Motyl, A., Hudson, M. E., Park, J., Xin, X., Cusick, M. E., Moore, T., Boone, C., Snyder, M., Roth, F. P., Barabási, A.-L., Tavernier, J., Hill, D. E. & Vidal, M. (2008) High-quality binary protein interaction map of the yeast interactome network. Science, 322, 104–110.
23. Oberhardt, M. A., Puchalka, J., Fryer, K. E., Martins dos Santos, V. A. P. & Papin, J. A. (2008) Genome-scale metabolic network analysis of the opportunistic pathogen Pseudomonas aeruginosa PAO1. Journal of Bacteriology, 190, 2790–2803.
24. Forster, J., Famili, I., Fu, P., Palsson, B. & Nielsen, J. (2003) Genome-scale reconstruction of the Saccharomyces cerevisiae metabolic network. Genome Res., 13, 244–253.
25. Andersen, M. R., Nielsen, M. L. & Nielsen, J. (2008) Metabolic model integration of the bibliome, genome, metabolome and reactome of Aspergillus niger. Mol. Syst. Biol., 4, 178.
26. Leskovec, J., Kleinberg, J. & Faloutsos, C. (2007) Graph evolution: densification and shrinking diameters. ACM Transactions on Knowledge Discovery from Data, 1, 1–42.
27. Vongsangnak, W., Olsen, P., Hansen, K., Krogsgaard, S. & Nielsen, J. (2008) Improved annotation through genome-scale metabolic modeling of Aspergillus oryzae. BMC Genomics, 9, 245.
28. Yamada, T., Saito, K. & Ueda, N. (2003) Cross-entropy directed embedding of network data. Proceedings of the Twentieth International Conference on Machine Learning: 2003 (Fawcett, T. & Mishra, N. eds). Washington DC: Kluwer Academic Publishers, pp. 832–839.

© The authors 2017. Published by Oxford University Press. All rights reserved.

### Journal

Journal of Complex Networks, Oxford University Press

Published: Feb 1, 2018
