A Survey on Distributed Machine Learning
Verbraeken, Joost; Wolting, Matthijs; Katzy, Jonathan; Kloppenburg, Jeroen; Verbelen, Tim; Rellermeyer, Jan S.
doi: 10.1145/3377454
pmid: N/A
The demand for artificial intelligence has grown significantly over the past decade, and this growth has been fueled by advances in machine learning techniques and the ability to leverage hardware acceleration. However, to increase the quality of predictions and render machine learning solutions feasible for more complex applications, a substantial amount of training data is required. Although small machine learning models can be trained with modest amounts of data, the input for training larger models such as neural networks grows exponentially with the number of parameters. Since the demand for processing training data has outpaced the increase in computation power of computing machinery, there is a need to distribute the machine learning workload across multiple machines, turning the centralized system into a distributed one. These distributed systems present new challenges: first and foremost, the efficient parallelization of the training process and the creation of a coherent model. This article provides an extensive overview of the current state of the art in the field by outlining the challenges and opportunities of distributed machine learning over conventional (centralized) machine learning, discussing the techniques used for distributed machine learning, and providing an overview of the systems that are available.
A Survey of Profit Optimization Techniques for Cloud Providers
Cong, Peijin; Xu, Guo; Wei, Tongquan; Li, Keqin
doi: 10.1145/3376917
pmid: N/A
As the demand for computing resources grows, cloud computing becomes more and more popular as a pay-as-you-go model, in which the computing resources and services are provided to cloud users efficiently. For cloud providers, the typical goal is to maximize their profits. However, maximizing profits in a highly competitive cloud market is a huge challenge for cloud providers. This article surveys profit optimization techniques that increase cloud provider profitability through service quality improvement, service pricing, energy consumption reduction, and virtual network function (VNF) deployment. The strategy of improving user service quality is discussed first, followed by the pricing strategy for cloud resources to maximize revenue. Then, this article summarizes the techniques for cloud data centers to reduce server power consumption. Finally, various heuristic algorithms for VNF deployment in the cloud are described, which reduce the cost of cloud providers while maintaining performance. We classify research works based on components of profit and methods used to demonstrate similarities and differences in these studies. We hope this survey will provide researchers with insights into cloud profit optimization techniques.
Computer-Generated Holograms for 3D Imaging
Sahin, Erdem; Stoykova, Elena; Mäkinen, Jani; Gotchev, Atanas
doi: 10.1145/3378444
pmid: N/A
Holography is usually considered the ultimate way to visually reproduce a three-dimensional scene. Computer-generated holography constitutes an important branch of holography, which enables visualization of artificially generated scenes as well as real three-dimensional scenes recorded under white-light illumination. In this article, we present a comprehensive survey of methods for synthesis of computer-generated holograms, classifying them into two broad categories: wavefront-based methods and ray-based methods. We examine their modern implementations in terms of the quality of reconstruction and computational efficiency. Because speckle suppression is an integral part of computer-generated holography, we devote a special section to it, which is also organized into two categories following the classification of the underlying computer-generated hologram methods.
A Survey of Hierarchical Energy Optimization for Mobile Edge Computing
Cong, Peijin; Zhou, Junlong; Li, Liying; Cao, Kun; Wei, Tongquan; Li, Keqin
doi: 10.1145/3378935
pmid: N/A
With the development of wireless technology, various emerging mobile applications are attracting significant attention and drastically changing our daily lives. Applications such as augmented reality and object recognition impose stringent delay requirements and demand powerful processing capability, which exerts enormous pressure on mobile devices with limited resources and energy. In this article, a survey of techniques for mobile device energy optimization is presented in a hierarchy of device design and operation, computation offloading, wireless data transmission, and cloud execution of offloaded computation. Energy management strategies for mobile devices from hardware and software aspects are first discussed, followed by energy-efficient computation offloading frameworks for mobile applications that trade application response time for device energy consumption. Then, techniques for efficient wireless data communication to reduce transmission energy are summarized. Finally, the execution mechanisms of application components or tasks in various clouds are described to provide energy-saving opportunities for mobile devices. We classify the research works based on key characteristics of devices and applications to emphasize their similarities and differences. We hope that this survey will give researchers insights into energy management mechanisms on mobile devices and will underscore the importance of optimizing device energy consumption, motivating further research in this area.
A Survey on Renamings of Software Entities
Li, Guangjie; Liu, Hui; Nyamawe, Ally S.
doi: 10.1145/3379443
pmid: N/A
More than 70% of characters in the source code are used to label identifiers. Consequently, identifiers are one of the most important sources for program comprehension. Meaningful identifiers are crucial to understanding and maintaining programs. However, for reasons like constrained schedules, inexperience, and unplanned evolution, identifiers may fail to convey the semantics of the entities associated with them. As a result, such entities should be renamed to improve software quality. However, manual renaming and recommendation are tedious, time consuming, and error prone, whereas automating the process of renaming is challenging: (1) it involves complex natural language processing to understand the meaning of identifiers; (2) it also involves difficult semantic analysis to determine the role of software entities. Researchers have proposed a number of approaches and tools to facilitate renamings. We present a survey on existing approaches and classify them into identification of renaming opportunities, execution of renamings, and detection of renamings. We find that there is an imbalance between the three types of approaches, and that most implementations of the approaches and most evaluation datasets are not publicly available. We also discuss the challenges and present potential research directions. To the best of our knowledge, this survey is the first comprehensive study on renamings of software entities.
A Survey of Blockchain-Based Strategies for Healthcare
De Aguiar, Erikson Júlio; Faiçal, Bruno S.; Krishnamachari, Bhaskar; Ueyama, Jó
doi: 10.1145/3376915
pmid: N/A
Blockchain technology has been gaining visibility owing to its ability to enhance the security, reliability, and robustness of distributed systems. Several areas have benefited from research based on this technology, such as finance, remote sensing, data analysis, and healthcare. Data immutability, privacy, transparency, decentralization, and distributed ledgers are the main features that make blockchain an attractive technology. However, healthcare records that contain confidential patient data make this system very complicated because there is a risk of a privacy breach. This study aims to address research into the applications of blockchain in the healthcare area. It sets out by discussing the management of medical information, as well as the sharing of medical records, image sharing, and log management. We also discuss papers that intersect with other areas, such as the Internet of Things, the management of information, tracking of drugs along their supply chain, and aspects of security and privacy. As we are aware that there are other surveys of blockchain in healthcare, we analyze and compare both the positive and negative aspects of their papers. Finally, we seek to examine the concepts of blockchain in the medical area by assessing their benefits and drawbacks, thus giving guidance to other researchers in the area. Additionally, we summarize the methods used in healthcare per application area and show their pros and cons.
Graph Generators
Bonifati, Angela; Holubová, Irena; Prat-Pérez, Arnau; Sakr, Sherif
doi: 10.1145/3379445
pmid: N/A
The abundance of interconnected data has fueled the design and implementation of graph generators reproducing real-world linking properties or gauging the effectiveness of graph algorithms, techniques, and applications manipulating these data. We consider graph generation across multiple subfields, such as Semantic Web, graph databases, social networks, and community detection, along with general graphs. Despite the disparate requirements of modern graph generators throughout these communities, we analyze them under a common umbrella, covering their functionalities, practical usage, and supported operations. We argue that this classification serves the need to provide scientists, researchers, and practitioners with the right data generator for their work. This survey provides a comprehensive overview of the state-of-the-art graph generators by focusing on those that are pertinent and suitable for several data-intensive tasks. Finally, we discuss open challenges and missing requirements of current graph generators along with their future extensions to new emerging fields.
Tools for Reduced Precision Computation
Cherubin, Stefano; Agosta, Giovanni
doi: 10.1145/3381039
pmid: N/A
The use of reduced precision to improve performance metrics such as computation latency and power consumption is a common practice in the embedded systems field. This practice is emerging as a new trend in High Performance Computing (HPC), especially when new error-tolerant applications are considered. However, standard compiler frameworks do not support automated precision customization, and manual tuning and code transformation remain the approach usually adopted in most domains. In recent years, researchers have been studying ways to improve the automation of this process. This article surveys this body of work, identifying the critical steps of this process, the most advanced tools available, and the open challenges in this research area. We conclude that, while several mature tools exist, there is still a gap to close, especially for tools based on static analysis rather than profiling, as well as for integration within mainstream, industry-strength compiler frameworks.
Multi-Label Active Learning Algorithms for Image Classification
Wu, Jian; Sheng, Victor S.; Zhang, Jing; Li, Hua; Dadakova, Tetiana; Swisher, Christine Leon; Cui, Zhiming; Zhao, Pengpeng
doi: 10.1145/3379504
pmid: 34421185
Image classification is a key task in image understanding, and multi-label image classification has become a popular topic in recent years. However, the success of multi-label image classification is closely related to how the training set is constructed. As active learning aims to construct an effective training set by iteratively selecting the most informative examples for which to query labels from annotators, it was introduced into multi-label image classification. Accordingly, multi-label active learning is becoming an important research direction. In this work, we first review existing multi-label active learning algorithms for image classification. These algorithms can be categorized into two top-level groups according to two aspects: sampling and annotation. The most important component of multi-label active learning is the design of an effective sampling strategy that actively selects the examples with the highest informativeness from an unlabeled data pool, according to various information measures. Thus, different informativeness measures are emphasized in this survey. Furthermore, this work also investigates in depth the existing challenges and future prospects in multi-label active learning, with a focus on four core aspects: example dimension, label dimension, annotation, and application extension.