Parallel computations of local PageRank problem based on Graphics Processing Unit
Lai, Siyan; Shao, Bo; Xu, Ying; Lin, Xiaola
doi: 10.1002/cpe.4245
pmid: N/A
In this paper, we study how to improve the performance of the Markov chain Monte Carlo method for solving the local PageRank problem in a general‐purpose graphics processing unit (GPU) environment. Because a large number of dangling vertices requires large storage space and thus slows down the Markov chain process, we propose a reordering strategy that compresses the storage space and reduces the computational complexity of the Markov chain process. In our performance study, after parallelizing and optimizing the proposed algorithm on the GPU, the reordering strategy is up to 12× faster than the basic method on graphs with a high proportion of dangling vertices. Our investigation shows that the variance of the random walks determines the number of random walks needed in the computation; we therefore introduce low‐discrepancy sequences to enhance performance. Moreover, the low‐discrepancy sequences are loaded into on‐chip shared memory to accelerate fetching, with a warp scheduling scheme that takes bank conflicts into account. A series of experiments has been conducted to evaluate the optimization efficiency. Compared with fetching data from off‐chip global memory, the shared‐memory‐based strategy achieves a speedup of over 10×. The experiments also indicate that the size of shared memory has a significant impact on the parallelism of the proposed method.
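The random-walk idea behind a Monte Carlo estimate of local PageRank can be sketched as follows. This is a minimal sequential illustration, not the authors' GPU algorithm: the function name `mc_local_pagerank`, the damping factor of 0.85, and the rule that a walk reaching a dangling vertex jumps back to the source are all assumptions made for the example. It uses ordinary pseudo-random numbers rather than low‐discrepancy sequences.

```python
import random

def mc_local_pagerank(out_links, source, num_walks=20000, damping=0.85, seed=0):
    """Estimate PageRank personalized to `source` from random-walk endpoints.

    out_links maps every vertex to a list of its out-neighbours
    (an empty list marks a dangling vertex).
    """
    rng = random.Random(seed)
    visits = {v: 0 for v in out_links}
    for _ in range(num_walks):
        v = source
        # Continue the walk with probability `damping`, stop otherwise.
        while rng.random() < damping:
            neighbours = out_links[v]
            # Assumption: a dangling vertex sends the walk back to the source.
            v = rng.choice(neighbours) if neighbours else source
        visits[v] += 1  # the endpoint of the walk is one sample
    return {v: count / num_walks for v, count in visits.items()}

# Tiny hypothetical graph: 0 -> 1 -> 2, where vertex 2 is dangling.
graph = {0: [1], 1: [2], 2: []}
scores = mc_local_pagerank(graph, source=0)
print(scores)
```

The standard error of each estimate shrinks roughly with the square root of `num_walks`, which is why the variance of the walks governs how many walks are needed.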
Modeling analysis of Intelligent Manufacturing System based on SDN
Bai, Yun
doi: 10.1002/cpe.4270
pmid: N/A
Intelligent manufacturing involves software that is fundamentally composed of computing, storage, and networking resources. Software‐defined networking (SDN) is a new network architecture whose design concept is the separation of the network control plane from the data forwarding plane and the realization of programmable control. This paper analyzes the characteristics and meaning of intelligent manufacturing systems, identifies the problems facing them, and then discusses the benefits of SDN. On this basis, it proposes a concept model that combines intelligent manufacturing systems with SDN. The article studies intelligent manufacturing in depth from the aspects of technical connotation, equipment model, etc.
Comprehensive multi‐objective model to remote sensing data processing task scheduling problem
Xing, Lining; Li, Wen; He, Minfan; Tan, Xu
doi: 10.1002/cpe.4248
pmid: N/A
Scientific scheduling of limited resources plays an important role in remote sensing data processing. This paper characterizes remote sensing data processing task scheduling as a novel comprehensive multi‐objective model. In the proposed model, the task scheduling problem is divided into a task dispensation sub‐problem and a task scheduling sub‐problem, with hundreds of variables considered. To solve this problem effectively, a Bayes belief model is applied to generate the initial dispensation plan, and a learnable ant colony optimization is proposed to solve the task scheduling sub‐problem. Experimental results suggest that the proposed comprehensive multi‐objective model and its solving methods are feasible and efficient for remote sensing data processing task scheduling, and that they also promote interoperability among heterogeneous and dispersed processing centers. The model and method of this paper can provide a valuable reference for solving other complex scheduling problems.
Secure communication scheme analysis via complex networks
Ding, Yong; Xiong, Ning; He, En; Li, Kezan
doi: 10.1002/cpe.4282
pmid: N/A
Recently, some works have introduced novel ways to construct complex networks from embedded time series, which provide new insights into the organizational properties of the time series in phase space. In this paper, we attempt to answer the fundamental question of “how much information regarding the dynamic property of the original time series can be extracted from these networks.” To this end, we propose a new method for reconstructing time series from the networks. We compare the time series reconstructed from these networks with that reconstructed from the recurrence plot. We find that these networks contain topological information of the embedded time series to a certain degree. In general, they are more powerful than the recurrence plot method in the reconstruction of embedded time series. In addition, we study a new generalized projective synchronization (GPS) of coupled complex dynamical networks with different sizes via feedback control and impulsive control. Based on the stability analysis of the impulsive system, a network synchronization criterion is established. These works may find potential applications in secure communication via networks.
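For readers unfamiliar with the recurrence plot used as the baseline here, the sketch below builds one from a delay‐embedded series. It only illustrates the standard textbook construction; the abstract does not specify the networks or the reconstruction method the authors use, and the helper names `delay_embed` and `recurrence_matrix` and the parameter values are assumptions chosen for the example.

```python
import math

def delay_embed(series, dim, tau):
    """Map a scalar series to points (x[i], x[i+tau], ..., x[i+(dim-1)*tau])."""
    n = len(series) - (dim - 1) * tau
    return [tuple(series[i + k * tau] for k in range(dim)) for i in range(n)]

def recurrence_matrix(points, eps):
    """R[i][j] = 1 when embedded points i and j lie within distance eps."""
    n = len(points)
    return [[1 if math.dist(points[i], points[j]) <= eps else 0
             for j in range(n)]
            for i in range(n)]

# Hypothetical toy series; real use would start from measured data.
points = delay_embed([0, 1, 2, 3, 4], dim=2, tau=1)
R = recurrence_matrix(points, eps=1.5)
for row in R:
    print(row)
```

Reading `R` with the diagonal removed as an adjacency matrix gives one simple way to turn an embedded time series into a network.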
Design and optimisation of an efficient HDF5 I/O Kernel for massive parallel fluid flow simulations
Ertl, Christoph; Frisch, Jérôme; Mundani, Ralf‐Peter
doi: 10.1002/cpe.4165
pmid: N/A
More and more massively parallel codes running on several hundreds of thousands of cores are entering the computational science and engineering domain, allowing high‐fidelity computations on up to trillions of unknowns for very detailed analyses of the underlying problems. Such runs typically produce gigabytes of data, hindering both efficient storage and (interactive) data exploration. Advanced approaches based on inherently distributed data formats such as hierarchical data format version 5 (HDF5) become necessary here to avoid long latencies when storing the data and to support fast (random) access when retrieving the data for visual processing. This paper presents considerations and implementation aspects of an I/O kernel based on HDF5 that supports fast checkpointing, restarting, and selective visualisation using a single shared output file for an existing computational fluid dynamics framework. This functionality is achieved by including the framework's hierarchical data structure in the file, which also opens the door for additional steering functionality. Finally, the performance of the kernel's write routines is presented. Bandwidths close to the theoretical peak on modern supercomputing clusters were achieved by avoiding file‐locking and using collective buffering.
Fitting long‐tailed distribution to empirical data
Gil, Joseph (Yossi); Monni, Cristina
doi: 10.1002/cpe.4223
pmid: N/A
Power laws can fit a variety of distributions coming from real data, so a systematic approach to measuring the accuracy of fitting algorithms is essential. We discuss the limits of the analysis of empirical fat‐tailed distributions, which can describe a variety of evolving systems, both natural and man‐made. An algorithm to fit fat‐tailed distributions is presented and tested against samples drawn from the power‐law, Yule, log‐normal, and Weibull distributions. We compute the parameters defining the shape of each distribution and test the results against simulations. We compare our method with another state‐of‐the‐art technique for estimating the parameters of empirical distributions. The accuracy of the estimations is discussed, and we conclude that our method, based on a weighted iterated χ² test, performs better than the other. Our algorithm is general and can be applied to any numerical dataset.
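To make the idea of testing a fit against samples of a known distribution concrete, the sketch below draws values from a continuous power law and recovers the exponent. It deliberately uses the standard maximum‐likelihood estimator, alpha = 1 + n / sum(ln(x_i / x_min)), and not the weighted iterated χ² method of the paper, whose details the abstract does not give. The function names, the sample size, and the true exponent 2.5 are assumptions for the example.

```python
import math
import random

def sample_power_law(n, alpha, x_min, rng):
    """Draw n values with density proportional to x**(-alpha) for x >= x_min
    using inverse-transform sampling."""
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0))
            for _ in range(n)]

def fit_power_law_mle(data, x_min):
    """Maximum-likelihood estimate of alpha from the values >= x_min."""
    tail = [x for x in data if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    return 1.0 + len(tail) / log_sum

rng = random.Random(42)
samples = sample_power_law(50000, alpha=2.5, x_min=1.0, rng=rng)
alpha_hat = fit_power_law_mle(samples, x_min=1.0)
print(alpha_hat)
```

Repeating the experiment with different seeds and sample sizes gives an empirical picture of the estimator's accuracy, which is the kind of comparison the abstract describes.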
A generic parallel pattern interface for stream and data processing
del Rio Astorga, David; Dolz, Manuel F.; Fernández, Javier; García, J. Daniel
doi: 10.1002/cpe.4175
pmid: N/A
Current parallel programming frameworks aid developers to a great extent in implementing applications that exploit parallel hardware resources. Nevertheless, developers require additional expertise to use and tune them properly so that they operate efficiently on specific parallel platforms. Moreover, porting applications between different parallel programming models and platforms is not straightforward and demands considerable effort and specific knowledge. In addition, the lack of high‐level parallel pattern abstractions in those frameworks further increases the complexity of developing parallel applications. To address these issues, this paper proposes GRPPI, a generic and reusable parallel pattern interface for both stream processing and data‐intensive C++ applications. GRPPI provides a layer between developers and existing parallel programming frameworks targeting multi‐core processors, such as C++ threads, OpenMP, and Intel TBB, and accelerators, such as CUDA Thrust. Furthermore, thanks to its high‐level C++ application programming interface and pattern composability features, GRPPI allows users to easily expose parallelism in sequential applications via standalone patterns or compositions of patterns. We evaluate this interface using an image processing use case and demonstrate its benefits from the usability, flexibility, and performance points of view. Furthermore, we analyze the impact of using stream and data pattern compositions on CPUs, GPUs, and heterogeneous configurations.
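The core idea of a pattern interface that hides the execution backend can be illustrated with a small sketch. This is not GRPPI's API, which is a C++ interface; the classes `SequentialExecution` and `ThreadedExecution` and the function `map_stages` are hypothetical names invented for this example, which only shows how the same pattern call can run on interchangeable backends.

```python
from concurrent.futures import ThreadPoolExecutor

class SequentialExecution:
    """Backend that applies a function to each item in order."""
    def map(self, func, items):
        return [func(x) for x in items]

class ThreadedExecution:
    """Backend that applies a function using a pool of worker threads."""
    def __init__(self, workers=4):
        self.workers = workers

    def map(self, func, items):
        with ThreadPoolExecutor(max_workers=self.workers) as pool:
            return list(pool.map(func, items))  # preserves input order

def map_stages(execution, items, *stages):
    """Apply each stage to all items in turn, using the given backend."""
    for stage in stages:
        items = execution.map(stage, items)
    return items

def add_one(x):
    return x + 1

def double(x):
    return x * 2

seq_result = map_stages(SequentialExecution(), range(5), add_one, double)
par_result = map_stages(ThreadedExecution(workers=2), range(5), add_one, double)
print(seq_result, par_result)
```

Switching backends changes only the first argument, so the application code describing the stages stays the same.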