Multi-Mode Online Knowledge Distillation for Self-Supervised Visual Representation Learning
Song, Kaiyou; Xie, Jin; Zhang, Shan; Luo, Zimeng
doi: 10.48550/arxiv.2304.06461
pmid: N/A
Abstract: Self-supervised learning (SSL) has made remarkable progress in visual representation learning. Some studies combine SSL with knowledge distillation (SSL-KD) to boost the representation learning performance of small models. In this study, we propose a Multi-mode Online Knowledge Distillation method (MOKD) to boost self-supervised visual representation learning. Different from existing SSL-KD methods that transfer knowledge from a static pre-trained teacher to a student, in MOKD, two different models learn collaboratively in a self-supervised manner. Specifically, MOKD consists of two distillation modes: self-distillation and cross-distillation modes. Among them, self-distillation performs self-supervised learning for each model independently, while cross-distillation realizes knowledge interaction between different models. In cross-distillation, a cross-attention feature search strategy is proposed to enhance the semantic feature alignment between different models. As a result, the two models can absorb knowledge from each other to boost their representation learning performance. Extensive experimental results on different backbones and datasets demonstrate that two heterogeneous models can benefit from MOKD and outperform their independently trained baseline. In addition, MOKD also outperforms existing SSL-KD methods for both the student and teacher models.
Topological Guided Actor-Critic Modular Learning of Continuous Systems with Temporal Objectives
Li, Lening; Qian, Zhentian
doi: 10.48550/arxiv.2304.10041
pmid: N/A
Abstract: This work investigates the formal policy synthesis of continuous-state stochastic dynamic systems given high-level specifications in linear temporal logic. To learn an optimal policy that maximizes the satisfaction probability, we take a product between a dynamic system and the translated automaton to construct a product system on which we solve an optimal planning problem. Since this product system has a hybrid product state space that results in reward sparsity, we introduce a generalized optimal backup order, in reverse to the topological order, to guide the value backups and accelerate the learning process. We provide the optimality proof for using the generalized optimal backup order in this optimal planning problem. Further, this paper presents an actor-critic reinforcement learning algorithm when topological order applies. This algorithm leverages advanced mathematical techniques and enjoys the property of hyperparameter self-tuning. We provide proof of the optimality and convergence of our proposed reinforcement learning algorithm. We use neural networks to approximate the value function and policy function for hybrid product state space. Furthermore, we observe that assigning integer numbers to automaton states can rank the value or policy function approximated by neural networks. To break the ordinal relationship, we use an individual neural network for each automaton state's value (policy) function, termed modular learning. We conduct two experiments. First, to show the efficacy of our reinforcement learning algorithm, we compare it with baselines on a classic control task, CartPole. Second, we demonstrate the empirical performance of our formal policy synthesis framework on motion planning of a Dubins car with a temporal specification.
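The idea of ordering value backups in reverse to the topological order can be illustrated on a toy acyclic graph. This is only a minimal sketch under strong assumptions, not the algorithm from the paper: the states `q0`, `q1`, `q2`, `acc`, the function `backup_in_reverse_topological_order`, and the model in which each edge is an action that succeeds with a given probability (and otherwise leads to a zero-value sink) are all hypothetical. It shows why processing successors first lets every state be backed up exactly once.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def backup_in_reverse_topological_order(successors, goal):
    """Compute maximal goal-reaching probabilities on an acyclic graph.

    successors maps a state to a list of (next_state, success_probability)
    pairs. Each pair is treated as one action (an assumption of this toy).
    """
    # TopologicalSorter expects node -> predecessors and emits predecessors
    # first. Passing successors instead makes it emit every successor before
    # the state itself, which is the reverse of the topological order.
    graph = {u: [v for v, _ in edges] for u, edges in successors.items()}
    order = list(TopologicalSorter(graph).static_order())

    value = {}
    for u in order:
        if u in goal:
            value[u] = 1.0  # the specification is already satisfied here
        else:
            # All successors were backed up earlier, so one pass suffices.
            value[u] = max(
                (p * value[v] for v, p in successors.get(u, [])),
                default=0.0,
            )
    return order, value


# Hypothetical automaton-like graph with a single accepting state "acc".
toy_successors = {
    "q0": [("q1", 0.5), ("q2", 0.9)],
    "q1": [("acc", 1.0)],
    "q2": [("acc", 0.5)],
}
order, value = backup_in_reverse_topological_order(toy_successors, {"acc"})
```

In this toy, the accepting state is valued first and the initial state last, so the sparse reward at `acc` propagates backwards in a single sweep.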
LibCity: A Unified Library Towards Efficient and Comprehensive Urban Spatial-Temporal Prediction
Jiang, Jiawei; Han, Chengkai; Jiang, Wenjun; Zhao, Wayne Xin; Wang, Jingyuan
doi: 10.48550/arxiv.2304.14343
pmid: N/A
Abstract: As deep learning technology advances and more urban spatial-temporal data accumulates, an increasing number of deep learning models are being proposed to solve urban spatial-temporal prediction problems. However, there are limitations in the existing field, including open-source data being in various formats and difficult to use, few papers making their code and data openly available, and open-source models often using different frameworks and platforms, making comparisons challenging. A standardized framework is urgently needed to implement and evaluate these methods. To address these issues, we propose LibCity, an open-source library that offers researchers a credible experimental tool and a convenient development framework. In this library, we have reproduced 65 spatial-temporal prediction models and collected 55 spatial-temporal datasets, allowing researchers to conduct comprehensive experiments conveniently. By enabling fair model comparisons, designing a unified data storage format, and simplifying the process of developing new models, LibCity is poised to make significant contributions to the spatial-temporal prediction field.
Using Offline Data to Speed-up Reinforcement Learning in Procedurally Generated Environments
Andres, Alain; Schäfer, Lukas; Villar-Rodriguez, Esther; Albrecht, Stefano V.; Del Ser, Javier
doi: 10.48550/arxiv.2304.09825
pmid: N/A
Abstract: One of the key challenges of Reinforcement Learning (RL) is the ability of agents to generalise their learned policy to unseen settings. Moreover, training RL agents requires large numbers of interactions with the environment. Motivated by the recent success of Offline RL and Imitation Learning (IL), we conduct a study to investigate whether agents can leverage offline data in the form of trajectories to improve the sample-efficiency in procedurally generated environments. We consider two settings of using IL from offline data for RL: (1) pre-training a policy before online RL training and (2) concurrently training a policy with online RL and IL from offline data. We analyse the impact of the quality (optimality of trajectories) and diversity (number of trajectories and covered level) of available offline trajectories on the effectiveness of both approaches. Across four well-known sparse reward tasks in the MiniGrid environment, we find that using IL for pre-training and concurrently during online RL training both consistently improve the sample-efficiency while converging to optimal policies. Furthermore, we show that pre-training a policy from as few as two trajectories can make the difference between learning an optimal policy at the end of online training and not learning at all. Our findings motivate the widespread adoption of IL for pre-training and concurrent IL in procedurally generated environments whenever offline trajectories are available or can be generated.
Kernel-level Rootkit Detection, Prevention and Behavior Profiling: A Taxonomy and Survey
Nadim, Mohammad; Lee, Wonjun; Akopian, David
doi: 10.48550/arxiv.2304.00473
pmid: N/A
Abstract: Kernel-level rootkits are among the most elusive types of malware in recent times and pose significant challenges to computer security systems. A kernel-level rootkit can hide its presence and malicious activities by modifying the kernel control flow, by hooking in the kernel space, or by manipulating kernel objects. Because kernel-level rootkits change the kernel, they are difficult for user-level security tools to detect. In the past few years, many approaches have been proposed to detect kernel-level rootkits. It is not difficult for an attacker to evade a signature-based kernel-level rootkit detection system by slightly modifying the existing signature. To detect evolving kernel-level rootkits, researchers have proposed and experimented with many detection systems. In this paper, we survey traditional kernel-level rootkit detection mechanisms in the literature and propose a structured kernel-level rootkit detection taxonomy. We discuss the strengths and weaknesses or challenges of each detection approach. Prevention techniques and literature on profiling kernel-level rootkit behavior are also included in this survey. The paper ends with future research directions for kernel-level rootkit detection.
Compact Distance Oracles with Large Sensitivity and Low Stretch
Bilò, Davide; Choudhary, Keerti; Cohen, Sarel; Friedrich, Tobias; Krogmann, Simon; Schirneck, Martin
doi: 10.48550/arxiv.2304.14184
pmid: N/A
Abstract: An $f$-edge fault-tolerant distance sensitive oracle ($f$-DSO) with stretch $\sigma \geq 1$ is a data structure that preprocesses an input graph $G$. When queried with the triple $(s,t,F)$, where $s, t \in V$ and $F \subseteq E$ contains at most $f$ edges of $G$, the oracle returns an estimate $\widehat{d}_{G-F}(s,t)$ of the distance $d_{G-F}(s,t)$ between $s$ and $t$ in the graph $G-F$ such that $d_{G-F}(s,t) \leq \widehat{d}_{G-F}(s,t) \leq \sigma d_{G-F}(s,t)$. For any positive integer $k \ge 2$ and any $0 < \alpha < 1$, we present an $f$-DSO with sensitivity $f = o(\log n/\log\log n)$, stretch $2k-1$, space $O(n^{1+\frac{1}{k}+\alpha+o(1)})$, and an $\widetilde{O}(n^{1+\frac{1}{k} - \frac{\alpha}{k(f+1)}})$ query time. Prior to our work, there were only three known $f$-DSOs with subquadratic space. The first one by Chechik et al. [Algorithmica 2012] has a stretch of $(8k-2)(f+1)$, depending on $f$. Another approach is storing an $f$-edge fault-tolerant $(2k-1)$-spanner of $G$. The bottleneck is the large query time due to the size of any such spanner, which is $\Omega(n^{1+1/k})$ under the Erdős girth conjecture. Bilò et al. [STOC 2023] gave a solution with stretch $3+\varepsilon$, query time $O(n^{\alpha})$ but space $O(n^{2-\frac{\alpha}{f+1}})$, approaching the quadratic barrier for large sensitivity. In the realm of subquadratic space, our $f$-DSOs are the first ones that guarantee, at the same time, large sensitivity, low stretch, and non-trivial query time. To obtain our results, we use the approximate distance oracles of Thorup and Zwick [JACM 2005], and the derandomization of the $f$-DSO of Weimann and Yuster [TALG 2013], that was recently given by Karthik and Parter [SODA 2021].
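The query interface $(s,t,F)$ and the stretch guarantee defined above can be made concrete with a brute-force baseline. This is a minimal sketch, not the oracle from the paper: it assumes an unweighted, undirected graph stored as an adjacency dictionary, does no preprocessing at all, and simply runs a breadth-first search in $G-F$ on every query, which gives stretch 1 at the cost of linear query time. The names `exact_dso_query`, `within_stretch`, and `cycle` are hypothetical.

```python
from collections import deque


def exact_dso_query(adj, s, t, faults):
    """Return the hop distance from s to t in G - F, or None if disconnected.

    adj maps each vertex to its neighbours (undirected, unweighted graph,
    an assumption of this sketch); faults is an iterable of failed edges.
    """
    banned = {frozenset(edge) for edge in faults}
    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        if u == t:
            return dist[u]
        for v in adj[u]:
            # Skip failed edges and vertices that were already reached.
            if frozenset((u, v)) in banned or v in dist:
                continue
            dist[v] = dist[u] + 1
            queue.append(v)
    return None


def within_stretch(estimate, true_dist, sigma):
    """Check the guarantee d <= estimate <= sigma * d from the definition."""
    return true_dist <= estimate <= sigma * true_dist


# Hypothetical example graph: the 4-cycle 0-1-2-3-0.
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
no_fault = exact_dso_query(cycle, 0, 1, [])
one_fault = exact_dso_query(cycle, 0, 1, [(0, 1)])
two_faults = exact_dso_query(cycle, 0, 1, [(0, 1), (3, 0)])
```

Failing the edge between 0 and 1 forces the detour 0-3-2-1, and failing both edges at vertex 0 disconnects it. An approximate oracle with stretch $\sigma$ may return any estimate for which `within_stretch` holds, but never an underestimate.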
Pointless Global Bundle Adjustment With Relative Motions Hessians
Rupnik, Ewelina; Pierrot-Deseilligny, Marc
doi: 10.48550/arxiv.2304.05118
pmid: N/A
Abstract: Bundle adjustment (BA) is the standard way to optimise camera poses and to produce sparse representations of a scene. However, as the number of camera poses and features grows, refinement through bundle adjustment becomes inefficient. Inspired by global motion averaging methods, we propose a new bundle adjustment objective which does not rely on image features' reprojection errors yet maintains precision on par with classical BA. Our method averages over relative motions while implicitly incorporating the contribution of the structure in the adjustment. To that end, we weight the objective function by local hessian matrices - a by-product of local bundle adjustments performed on relative motions (e.g., pairs or triplets) during the pose initialisation step. Such hessians are extremely rich as they encapsulate both the features' random errors and the geometric configuration between the cameras. These pieces of information propagated to the global frame help to guide the final optimisation in a more rigorous way. We argue that this approach is an upgraded version of the motion averaging approach and demonstrate its effectiveness on both photogrammetric datasets and computer vision benchmarks.
Bayesian Federated Learning: A Survey
Cao, Longbing; Chen, Hui; Fan, Xuhui; Gama, Joao; Ong, Yew-Soon; Kumar, Vipin
doi: 10.48550/arxiv.2304.13267
pmid: N/A
Abstract: Federated learning (FL) demonstrates its advantages in integrating distributed infrastructure, communication, computing and learning in a privacy-preserving manner. However, the robustness and capabilities of existing FL methods are challenged by limited and dynamic data and conditions, complexities including heterogeneities and uncertainties, and analytical explainability. Bayesian federated learning (BFL) has emerged as a promising approach to address these issues. This survey presents a critical overview of BFL, including its basic concepts, its relations to Bayesian learning in the context of FL, and a taxonomy of BFL from both Bayesian and federated perspectives. We categorize and discuss client- and server-side and FL-based BFL methods and their pros and cons. The limitations of the existing BFL methods and the future directions of BFL research further address the intricate requirements of real-life FL applications.