Leveraging Machine Learning Models to Improve Smart Contract Security: A Survey of Vulnerabilities and Detection Methods
Alsunaidi, Shikah J.; Aljamaan, Hamoud; Hammoudeh, Mohammad
doi: 10.1145/3772367
pmid: N/A
Smart Contracts (SCs), self-executing programs on blockchain platforms, are transforming industries such as banking, healthcare, and supply chains through automated, trustless transactions. However, their inherent vulnerabilities have led to severe financial and operational losses, with large-scale exploits causing substantial economic damage. Machine Learning (ML) has emerged as a promising approach for SC vulnerability detection, yet its effectiveness, adaptability, and generalizability remain insufficiently explored. This article comprehensively classifies current Ethereum SC vulnerabilities and attacks. It also surveys 108 ML-based detection methods, covering both traditional models and a structured taxonomy of advanced approaches such as GNN-based, LLM-based, contrastive learning, ensemble, hybrid, meta-learning, and transfer learning techniques. The strengths, limitations, and practical challenges of these methods are systematically analyzed, with particular attention to factors such as detection stages, classification problems, dataset characteristics, feature engineering, performance evaluation, generalizability, detection capability, model aging, and ethical and privacy implications. Additionally, existing datasets on SC vulnerabilities are reviewed and consolidated. By integrating these insights, this work provides actionable guidelines and a foundation for building secure, resilient, and trustworthy SC ecosystems.
Survey on Deep Face Restoration: From Non-blind to Blind and Beyond
Li, Wenjie; Wang, Mei; Zhang, Kai; Li, Juncheng; Li, Xiaoming; Zhang, Yuhang; Gao, Guangwei; Ma, Zhanyu
doi: 10.1145/3778162
pmid: N/A
Face restoration (FR) is a specialized field within image restoration that aims to restore low-quality (LQ) face images to high-quality (HQ) ones. Recent advances in deep learning have led to significant progress in FR methods. In this article, we begin by examining the prevalent factors responsible for real-world LQ images and introduce the degradation techniques used to synthesize LQ images. We also discuss notable benchmarks commonly used in the field. Next, we categorize FR methods by task and explain their evolution. Furthermore, we explore the facial priors commonly used in restoration and discuss strategies to enhance their effectiveness. In the experimental section, we evaluate state-of-the-art FR methods across various tasks using a unified benchmark and analyze their performance from different perspectives. Finally, we discuss real-world applications and challenges in the field of FR and propose potential directions for future advancements. The open-source repository corresponding to this work can be found at https://github.com/24wenjie-li/Awesome-Face-Restoration.
Review of Explainable Graph-Based Recommender Systems
Markchom, Thanet; Liang, Huizhi; Ferryman, James
doi: 10.1145/3772273
pmid: N/A
Explainability of recommender systems has become essential to ensure users’ trust and satisfaction. Various types of explainable recommender systems have been proposed, including explainable graph-based recommender systems. This review article discusses state-of-the-art approaches of these systems and categorizes them based on three aspects: learning methods, explaining methods, and explanation types. It also explores the commonly used datasets, explainability evaluation methods, and future directions of this research area. Compared with the existing review articles, this article focuses on explainability based on graphs and covers the topics required for developing novel explainable graph-based recommender systems.
A Comprehensive Survey of Threat Intelligence Research: A Measurement-Based Study
Furumoto, Keisuke; Morikawa, Tomohiro; Kolehmainen, Antti; Silverajan, Bilhanan; Takahashi, Takeshi; Inoue, Daisuke
doi: 10.1145/3772280
pmid: N/A
Multiple cyber-security-related sources, referred to as threat intelligence sources, are commonly used to counter sophisticated cyber attacks such as advanced persistent threat attacks and ransomware. In this article, in addition to describing various threat intelligence sources, we analyze research trends based on taxonomies for research purpose, research approach, and research datasets. We provide an extensive review of over 200 studies related to cyber threat intelligence published between 2001 and 2025 and examine the trends of representative research. The survey shows that there are issues related to datasets, such as the evaluation results depending on which vendors are included in the dataset. Therefore, we also conduct a measurement study to provide a detailed description of collected datasets. To the best of our knowledge, this is the first study to conduct a measurement study on a dataset to uncover insights for constructing a well-balanced dataset. We also identify open issues and challenges that need to be addressed in the future.
Multi-Step Reasoning with Large Language Models, a Survey
Plaat, Aske; Wong, Annie; Verberne, Suzan; Broekens, Joost; Van Stein, Niki; Bäck, Thomas
doi: 10.1145/3774896
pmid: N/A
Large language models (LLMs) with billions of parameters exhibit in-context learning abilities, enabling few-shot learning on tasks that the model was not specifically trained for. Traditional models achieve breakthrough performance on language tasks, but do not perform well on basic reasoning benchmarks. However, a new in-context learning approach, Chain-of-thought, has demonstrated strong multi-step reasoning abilities on these benchmarks. The research on LLM reasoning abilities started with the question of whether LLMs can solve grade school math word problems, and has expanded to other tasks in the past few years. This article reviews the field of multi-step reasoning with LLMs. We propose a taxonomy that identifies different ways to generate, evaluate, and control multi-step reasoning. We provide in-depth coverage of core approaches and open problems, and we propose a research agenda for the near future. We find that multi-step reasoning approaches have progressed beyond math word problems, and can now successfully solve challenges in logic, combinatorial games, and robotics, sometimes by first generating code that is then executed by external tools. Many studies in multi-step methods use reinforcement learning for finetuning, external optimization loops, in-context reinforcement learning, and self-reflection.
Hardware-Level QoS Enforcement Features: Technologies, Use Cases, and Research Challenges
Larsson, Oliver; Metsch, Thijs; Klein, Cristian; Elmroth, Erik
doi: 10.1145/3774317
pmid: N/A
Recent advancements in commodity server processors have enabled dynamic hardware-based quality-of-service (QoS) enforcement. These features have gathered increasing interest in research communities due to their versatility and wide range of applications. Thus, there exists a need to understand how scholars leverage hardware QoS enforcement in research, understand strengths and shortcomings, and identify gaps in current state-of-the-art research. This article observes relevant publications, presents a novel taxonomy, discusses the approaches used, and identifies trends. Furthermore, an opportunity is recognized for QoS enforcement utilization in service-based cloud computing environments, and open challenges are presented.
A Survey on Human Preference Learning for Aligning Large Language Models
Jiang, Ruili; Chen, Kehai; Bai, Xuefeng; He, Zhixuan; Li, Juntao; Yang, Muyun; Zhao, Tiejun; Nie, Liqiang; Zhang, Min
doi: 10.1145/3773279
pmid: N/A
The recent surge in versatile large language models (LLMs) demonstrates remarkable success across a wide range of contexts. A key factor contributing to this success is LLM alignment, in which human preference learning plays a decisive role in steering the models’ capabilities toward fulfilling human objectives. In this survey, we review the progress in human preference learning within a unified framework, aiming to provide a comprehensive perspective on established methodologies while exploring avenues to further advance LLM alignment. Specifically, we categorize human preference feedback based on data sources and formats, summarize techniques for human preference modeling and usage, and present an overview of prevailing evaluation protocols for LLM alignment. Finally, we discuss the existing challenges and identify potential directions for future research, with a particular emphasis on generalizability, transferability, and controllability.
A Survey of Adaptation of Large Language Models to Idea and Hypothesis Generation: Downstream Task Adaptation, Knowledge Distillation Approaches and Challenges
Oyelade, Olaide N; Wang, Hui; Rafferty, Karen
doi: 10.1145/3774628
pmid: N/A
Idea and hypothesis generation are creative processes that demand a significant level of reasoning. Methods such as brainstorming, analytical reasoning, inductive reasoning, and other forms of reasoning have proven useful in advancing research in this domain. Machine learning techniques have been widely investigated to address these challenging tasks. However, they are limited and lack the reasoning these tasks require, and the emergence of language models has reignited research in this direction. Large language models (LLMs) have emerged as the current state of the art for generative tasks and for language understanding. Models such as BERT, BARD, GPT and LLaMa have architectures that are mostly transformer-based. These models deliver impressive results in downstream tasks such as text classification, sentiment analysis, language inference, question answering, text summarization, and named entity recognition, among others. However, the need to adapt these models to the emerging downstream tasks of idea and hypothesis generation has uncovered a new research opportunity. In this study, a systematic literature review is carried out to clarify how LLMs have been applied to the classical downstream tasks and then to motivate the adaptation of LLMs to idea and hypothesis generation. Furthermore, the study examines techniques applied to customization and knowledge distillation, with the aim of contextualizing these methods for idea and hypothesis generation. We then explore the limitations of LLM-based research efforts on idea and hypothesis generation. A detailed technical discussion of the findings is presented, and we provide a novel high-level conceptual framework to describe and summarize them. Potential insights into combining knowledge graphs, causal inference, logic reasoning, and LLM distillation in idea and hypothesis generation are also discussed. Finally, challenges in adapting LLMs to idea and hypothesis generation are discussed.
Indexing Techniques for Graph Reachability Queries
Zhang, Chao; Bonifati, Angela; Özsu, M. Tamer
doi: 10.1145/3776737
pmid: N/A
We survey graph reachability indexing techniques for efficiently processing reachability queries in two popular graph models: plain graphs and edge-labeled graphs. Reachability queries determine whether a directed path exists between a source and a target vertex, forming a core class of navigational queries in graph analytics. Reachability indexes are specialized data structures that accelerate such query processing. Work on this topic goes back four decades; we include 33 of the proposed techniques. Plain graphs consist of only vertices and edges, with reachability queries checking for the existence of a path. Edge-labeled graphs extend plain graphs by adding labels to edges, and their queries further impose constraints on the labels along the path. We categorize indexing techniques for both plain and edge-labeled graphs and discuss them based on this classification, using representative methods to illustrate key ideas. We discuss the main challenges within each category and how these might be addressed in other approaches. We conclude with a discussion of the open challenges and future research directions, along the lines of integrating reachability indexes into modern graph database management systems. This survey serves as a comprehensive resource for researchers and practitioners interested in the advancements, techniques, and challenges of reachability indexing in graph analytics.
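As a rough illustration of the two query types described in this abstract, the sketch below answers a reachability query by plain breadth-first search, with no index at all; this is the baseline that reachability indexes aim to speed up. The function name `reachable`, the `(source, target, label)` edge triples, and the "allowed label set" form of the constraint are assumptions made for this example and are not taken from the survey.

```python
from collections import deque


def reachable(edges, source, target, allowed_labels=None):
    """Return True if a directed path leads from source to target.

    edges: iterable of (u, v, label) triples (hypothetical format).
    allowed_labels: None for a plain-graph query (labels ignored),
    or a set of labels that every edge on the path must carry.
    """
    # Build the adjacency lists, dropping edges whose label is not allowed.
    adjacency = {}
    for u, v, label in edges:
        if allowed_labels is None or label in allowed_labels:
            adjacency.setdefault(u, []).append(v)

    # Standard breadth-first search from the source vertex.
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return False


# Toy edge-labeled graph: a -> b -> c -> d
EDGES = [("a", "b", "knows"), ("b", "c", "worksFor"), ("c", "d", "knows")]

plain_result = reachable(EDGES, "a", "d")                 # plain query
constrained_result = reachable(EDGES, "a", "d", {"knows"})  # label-constrained query
```

On this toy graph the plain query succeeds, while the query restricted to `knows` edges fails because the only path uses a `worksFor` edge. Each such search costs time linear in the size of the graph, which is the per-query cost an index is meant to avoid.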
Blockchain in the Digital Twin Context: A Comprehensive Survey
Li, Dun; Han, Dezhi; Crespi, Noel; Minerva, Roberto; Raza, Syed Mohsan; Farahbakhsh, Reza; Liang, Wei; Zheng, Zibin
doi: 10.1145/3772366
pmid: N/A
Digital twin (DT) technology integrates Internet of Things (IoT), communication networks, and sensor systems through high-fidelity modeling and multi-dimensional simulation, enabling dynamic mapping and real-time optimization of physical objects. However, DT development still faces several challenges, including cross-platform interoperability limitations, excessive latency in real-time scenarios, security vulnerabilities in distributed deployments, and the complexity of accurately modeling multi-modal systems. Blockchain (BC) enhances the security and functional scope of DTs across diverse applications. This survey begins by introducing the core principles of BC and DT, and then investigates the rationale for and benefits of their integration. From a data-centric perspective, we explore how Blockchain-empowered Digital Twins (BCDTs) enhance data storage, secure exchange, privacy protection, and system interoperability. The survey further explores the architecture of BCDT systems, covering network topology, functional modules, platform design, and representative prototypes, offering insights into real-world applications. In addition, we survey how BCDT supports the convergence of key Industry 4.0 technologies, including the Internet of Things, vehicle networks, unmanned aerial systems, artificial intelligence, federated learning, 5G mobile networks, and software-defined networking. Industrial-grade BCDT-supported applications are highlighted, providing a solid foundation for further research. Finally, we analyze the challenges faced by BCDT and offer suggestions for further research in the field.