Multimedia Tools and Applications

journal article

LitStream Collection

An efficient crack detection and leakage monitoring in liquid metal pipelines using a novel BRetN and TCK-LSTM techniques

Sankarasubramanian, Praveen

2025 Multimedia Tools and Applications

doi: 10.1007/s11042-024-20170-6

Pipeline systems are currently the safest, most economical, and most efficient means of transporting petroleum products and other chemical fluids. However, faults in pipelines cause resource wastage and environmental pollution. Most existing works focus on either surface Crack Detection (CD) or Leakage Detection (LD) of pipes, using limited features. Hence, an efficient crack detection and leakage monitoring framework is proposed based on Acoustic Emission (AE) signal and AE image features, using the new Berout Retina Net (BRetN) and Tent Chaotic Kaiming-centric Long Short Term Memory (TCK-LSTM) methodologies. The process begins with gathering the input data, followed by preprocessing. Then, cracks are detected using BRetN, and the features of the AE signals are extracted. In parallel, the AE signal is transformed into an AE image using the Continuous Wavelet Transform (CWT). The AE image features are then extracted and integrated with the AE signal features. Next, the optimal features are selected using the Gorilla Troops Optimizer (GTO). Finally, the TCK-LSTM model is used to detect the leakage level of the pipeline. The experimental outcomes show that the proposed framework detected crack and leakage levels with 98.14% accuracy, 95.37% precision, and 98.84% specificity when compared with existing techniques.
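The signal-to-image step in the abstract above can be illustrated with a small scalogram computation. This is only a sketch under stated assumptions: the abstract does not name the mother wavelet or the scales, so the complex Morlet wavelet, the parameter `w0 = 6`, and the function names `morlet` and `cwt_scalogram` are illustrative choices, and the transform is computed by direct summation rather than with any particular library.

```python
import cmath
import math


def morlet(u, w0=6.0):
    """Complex Morlet mother wavelet (assumed choice, small correction term omitted)."""
    return math.pi ** -0.25 * cmath.exp(1j * w0 * u) * math.exp(-0.5 * u * u)


def cwt_scalogram(signal, scales, w0=6.0):
    """Return |CWT| as a list of rows, one row per scale, one column per sample.

    Direct O(n^2) summation per scale; fine for short signals, slow for long ones.
    """
    n = len(signal)
    image = []
    for s in scales:
        row = []
        for t in range(n):
            acc = 0j
            for k in range(n):
                # Correlate the signal with the scaled, shifted, conjugated wavelet.
                acc += signal[k] * morlet((k - t) / s, w0).conjugate()
            row.append(abs(acc) / math.sqrt(s))
        image.append(row)
    return image


# Hypothetical stand-in for an AE signal: a sinusoid at 0.1 cycles per sample.
signal = [math.sin(2 * math.pi * 0.1 * k) for k in range(128)]
# For this wavelet, a scale s responds most strongly near frequency w0 / (2*pi*s).
scales = [6.0 / (2 * math.pi * f) for f in (0.2, 0.1, 0.05)]
scalogram = cwt_scalogram(signal, scales)
```

Each row of `scalogram` is one scale and each column one time sample, so the nested list can be treated as a grayscale image. In this toy input the middle row, whose scale matches the sinusoid's frequency, carries most of the energy.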

journal article

LitStream Collection

Augmented reality without SLAM

Gholami, Aminreza; Nasihatkon, Behrooz; Soryani, Mohsen

2025 Multimedia Tools and Applications

doi: 10.1007/s11042-024-20154-6

Most augmented reality (AR) pipelines compute the camera’s pose in each frame and then project virtual objects into 2D. Camera pose estimation is commonly implemented as a SLAM (Simultaneous Localisation and Mapping) algorithm. However, SLAM systems are often limited to scenarios where the camera intrinsics remain fixed or are known in all frames. This paper presents an initial effort to circumvent the pose estimation stage altogether and directly compute 2D projections using epipolar constraints. To achieve this, we first calculate the fundamental matrices between the keyframes and each new frame. The 2D locations of objects can then be triangulated by finding the intersection of epipolar lines in the new frame. We propose a robust algorithm that can handle situations where some of the fundamental matrices are entirely erroneous. Most notably, we introduce a depth-buffering algorithm that relies solely on the fundamental matrices, eliminating the need to compute 3D point locations in the target view. By utilizing fundamental matrices, our method remains effective even when all intrinsic camera parameters vary over time. Our proposed approach achieved sufficient accuracy, even with more degrees of freedom in the solution space.
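The epipolar-line intersection described in the abstract above can be sketched in a few lines. This is a minimal illustration, not the authors' robust algorithm: it uses exactly two keyframes, assumes both fundamental matrices are correct, and assumes the convention x_new^T F x_key = 0. The function name `transfer_point` and the toy matrices are hypothetical.

```python
def mat_vec(m, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]


def cross(a, b):
    """Cross product; for homogeneous 2D lines it yields their intersection point."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def transfer_point(F_a, x_a, F_b, x_b):
    """Locate a point in the new frame from its positions in two keyframes.

    F_a, F_b: fundamental matrices mapping keyframe points to epipolar lines
              in the new frame (assumed convention: x_new^T F x_key = 0).
    x_a, x_b: (x, y) positions of the same scene point in each keyframe.
    """
    line_a = mat_vec(F_a, [x_a[0], x_a[1], 1.0])
    line_b = mat_vec(F_b, [x_b[0], x_b[1], 1.0])
    p = cross(line_a, line_b)
    if abs(p[2]) < 1e-12:
        raise ValueError("epipolar lines are (nearly) parallel; no stable intersection")
    return (p[0] / p[2], p[1] / p[2])


# Toy setup: identity intrinsics and pure translations, so F is a skew matrix.
# Scene point (1, 2, 5); keyframe A at the origin, keyframe B shifted by (1, 0, 0),
# new frame shifted by (0, 1, 0).
F_a = [[0.0, 0.0, 1.0], [0.0, 0.0, 0.0], [-1.0, 0.0, 0.0]]
F_b = [[0.0, 0.0, 1.0], [0.0, 0.0, 1.0], [-1.0, -1.0, 0.0]]
predicted = transfer_point(F_a, (0.2, 0.4), F_b, (0.4, 0.4))
```

In this toy setup the scene point projects to (0.2, 0.6) in the new frame, which is where the two epipolar lines meet. With real data the two lines can be nearly parallel, which is one reason more than two keyframes and a robust estimator are useful.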

journal article

LitStream Collection

Deception detection with multi-scale feature and multi-head attention in videos

Yuan, Shusen; Zhou, Guanqun; Xing, Hongbo; Jiang, Youjun; Cao, Yewen; Yang, Mingqiang

2025 Multimedia Tools and Applications

doi: 10.1007/s11042-024-20124-y

Detecting deception in videos has been a challenging task, especially in real-world situations. In this study, we extracted the facial action units from micro-expressions, and then calculated the frequency and the number of occurrences of each action unit. To capture information at different scales, we proposed a scheme that combines a Multi-Scale Feature (MSF) model with Multi-Head Attention (MHA). The MSF model consists of two CNNs with different convolution kernels, with GELU as the activation function. The MHA model was designed to divide the input features into different subspaces and generate attention for each subspace to make the features more effective. We evaluated our proposed method on the Real-life Trial dataset and achieved an accuracy of 87.81%. The results show that the MSF and MHA models can increase the accuracy of the deception detection task. The comparative experiment demonstrates the effectiveness of our proposed method.
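The idea of dividing features into subspaces and attending within each one can be sketched as follows. This is a simplified illustration, not the paper's model: it omits the learned query, key, value, and output projections that a trained multi-head attention layer has, and the function names are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def split_heads(vec, num_heads):
    """Cut one feature vector into num_heads equal-sized subspaces."""
    d = len(vec) // num_heads
    return [vec[h * d:(h + 1) * d] for h in range(num_heads)]


def multi_head_self_attention(tokens, num_heads):
    """Scaled dot-product self-attention applied separately in each subspace.

    tokens: list of equal-length feature vectors. Returns a list of the same shape.
    No learned projections are applied (a deliberate simplification).
    """
    dim = len(tokens[0])
    if dim % num_heads != 0:
        raise ValueError("feature dimension must be divisible by num_heads")
    d = dim // num_heads
    per_token = [split_heads(t, num_heads) for t in tokens]
    outputs = [[] for _ in tokens]
    for h in range(num_heads):
        sub = [p[h] for p in per_token]  # this head's slice of every token
        for i, q in enumerate(sub):
            scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in sub]
            weights = softmax(scores)
            mixed = [sum(w * v[j] for w, v in zip(weights, sub)) for j in range(d)]
            outputs[i].extend(mixed)  # concatenate head outputs
    return outputs


features = [[1.0, 0.0, 0.5, 0.5], [0.0, 1.0, 0.5, 0.5], [1.0, 1.0, 0.0, 1.0]]
attended = multi_head_self_attention(features, num_heads=2)
```

Each head computes its own attention weights from its own slice of the features, so different subspaces can emphasise different tokens.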

journal article

LitStream Collection

An optimized cluster validity index for identification of cancer mediating genes

Hazra, Subir; Ghosh, Anupam

2025 Multimedia Tools and Applications

doi: 10.1007/s11042-024-20105-1

One of the major challenges in bioinformatics lies in identifying the gene expressions that are modified in a person affected by a medical ailment. Focused research on such identification has led to multiple proposals centred on clustering of gene expressions. Moreover, while clustering proves to be an effective way to demarcate the affected gene expression vectors, there has been global research on the cluster count that optimizes the gene expression variations among the clusters. This study proposes a new index, called the mean-max index (MMI), to determine the cluster count that divides the data collection into the ideal number of clusters depending on gene expression variations. MMI works on the principle of minimizing the intra-cluster variations among the members and maximizing the inter-cluster variations. The study has been conducted on publicly available datasets comprising gene expressions for three diseases, namely lung disease, leukaemia, and colon cancer. The sample counts for normal and diseased subjects are 10 and 86 for lung disease, 43 and 13 for leukaemia, and 18 and 18 for colon cancer, respectively. The gene expression vectors for the three diseases comprise 7129, 22283, and 6600 respectively. Three clustering models have been used for this study, namely k-means, partition around medoid, and fuzzy c-means, all using the proposed MMI technique for finalizing the cluster count. The comparative analysis shows that the proposed MMI is able to recognize many more true positive (biologically enriched) cancer mediating genes than other cluster validity indices, and it can be considered superior to the others, with accuracy enhanced by 85%.
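The principle behind such an index, low intra-cluster variation together with high inter-cluster variation, can be illustrated with a generic ratio. The abstract does not give the MMI formula, so the sketch below is explicitly not the MMI: it is a simple stand-in that divides the mean point-to-centroid distance by the smallest centroid-to-centroid distance, and the function names are hypothetical.

```python
import math


def centroid(points):
    """Component-wise mean of a list of equal-length points."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def compactness_separation_ratio(points, labels):
    """Generic validity score (NOT the paper's MMI); lower means better clustering.

    Numerator: mean distance from each point to its own cluster centroid.
    Denominator: smallest distance between any two cluster centroids.
    Requires at least two clusters with distinct centroids.
    """
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    cents = {lab: centroid(ps) for lab, ps in clusters.items()}
    intra = sum(math.dist(p, cents[lab]) for p, lab in zip(points, labels)) / len(points)
    labs = list(cents)
    inter = min(math.dist(cents[a], cents[b])
                for i, a in enumerate(labs) for b in labs[i + 1:])
    return intra / inter


# Toy data: two tight groups far apart, scored under a good and a bad labelling.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = compactness_separation_ratio(points, [0, 0, 1, 1])
bad = compactness_separation_ratio(points, [0, 1, 0, 1])
```

To choose a cluster count with a score of this kind, one would run the clustering for several candidate counts and keep the count with the best score.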

journal article

LitStream Collection

A survey on blockchain security for electronic health record

G, Chandini A; Basarkod, P. I

2025 Multimedia Tools and Applications

doi: 10.1007/s11042-024-19883-5

Numerous healthcare organizations keep track of patients’ medical information with an Electronic Health Record (EHR). Nowadays, patients demand instant access to their medical records. Hence, Deep Learning (DL) methods are employed in electronic healthcare sectors for medical image processing and smart supply chain management. Various approaches have been presented for protecting patients' healthcare data using blockchain; however, there are concerns regarding the security and privacy of patient medical records in the health industry, where data can be accessed instantly. Blockchain-based security combined with DL approaches helps to solve this problem, but the DL-based blockchain methods still need improvement with respect to the privacy and security of patient data and to access control strategies, alongside developments in the supply chain. The survey provides a clear picture of the DL-based strategies used in electronic healthcare data storage and security, along with the integrity verification approaches. It also provides a comparative analysis to demonstrate the effectiveness of various blockchain-based EHR handling techniques. Moreover, future directions are provided to overcome the limitations of existing techniques in blockchain security for EHRs.

journal article

LitStream Collection

Auto-proctoring using computer vision in MOOCs system

Dang, Tuan Linh; Hoang, Nguyen Minh Nhat; Nguyen, The Vu; Nguyen, Hoang Vu; Dang, Quang Minh; Tran, Quang Hai; Pham, Huy Hoang

2025 Multimedia Tools and Applications

doi: 10.1007/s11042-024-20099-w

The COVID-19 outbreak has caused a significant shift towards virtual education, where Massive Open Online Courses (MOOCs), such as EdX and Coursera, have become prevalent distance learning mediums. Online exams are also gaining popularity, but they pose a risk of cheating without proper supervision. Online proctoring can significantly improve the quality of education, and with the addition of extended modules on MOOCs, the incorporation of artificial intelligence in the proctoring process has become more accessible. Despite the advancements in machine learning-based cheating detection in third-party proctoring tools, there is still a need for optimization and adaptability of such systems for massive simultaneous user requirements of MOOCs. Therefore, we have developed an examination monitoring system based on advanced artificial intelligence technology. This system is highly scalable and can be easily integrated with our existing MOOCs platform, daotao.ai. Experimental results demonstrated that our proposed system achieved a 95.66% accuracy rate in detecting cheating behaviors, processed video inputs with an average response time of 0.517 seconds, and successfully handled concurrent user demands, thereby validating its effectiveness and reliability for large-scale online examination monitoring.

journal article

LitStream Collection

Cubixel: a novel paradigm in image processing using three-dimensional pixel representation

Aburass, Sanad

2025 Multimedia Tools and Applications

doi: 10.1007/s11042-024-20081-6

This paper introduces the innovative concept of the Cubixel—a three-dimensional representation of the traditional pixel—alongside the derived metric, Volume of the Void (VoV), which measures spatial disparities within images. By converting pixels into Cubixels, we can analyze the image’s 3D properties, thereby enriching image processing and computer vision tasks. Utilizing Cubixels, we’ve developed algorithms for advanced image segmentation, edge detection, texture analysis, and feature extraction, yielding a deeper comprehension of image content. Our empirical experimental results on benchmark images and datasets showcase the applicability of these concepts. Further, we discuss future applications of Cubixels and VoV in various domains, particularly in medical imaging, where they have the potential to significantly enhance diagnostic processes. By interpreting images as complex ‘urban landscapes’, we envision a new frontier for deep learning models that simulate and learn from diverse environmental conditions. The integration of Cubixels into deep learning architectures promises to revolutionize the field, providing a pathway towards more intelligent, context-aware artificial intelligence systems. With this groundbreaking work, we aim to inspire future research that will unlock the full potential of image data, transforming both theoretical understanding and practical applications. Our code is available at https://github.com/sanadv/Cubixel.

journal article

LitStream Collection

An optimized congestion control protocol in cellular network for improving quality of service

V, Sandhya S.; Joshi, S. M.

2025 Multimedia Tools and Applications

doi: 10.1007/s11042-024-20126-w

In recent decades, Cellular Networks (CN) have been used broadly in communication technologies. The most critical challenge in CNs is congestion control, owing to the distributed mobile environment. Some approaches, like mobile edge computing, congestion control systems, machine learning, and heuristic models, have failed to prevent congestion in CNs. The reason for this problem is the lack of a continuous monitoring function at every time interval. So, in the present study, a novel Golden Eagle-based Primal–dual Congestion Management (GEbPDCM) has been developed for the Long-Term Evolution (LTE) Ad hoc On-demand Distance Vector (AODV) network. Here, the Golden Eagle function provides continuous monitoring of data congestion. The main objective of this research is to improve the Quality of Service (QoS) by optimizing congestion control. The QoS is measured by different metrics, such as delay, packet delivery ratio (PDR), throughput, packet loss, and energy consumption. Initially, the nodes were created in the MATLAB environment, and the GEbPDCM was activated to predict the data load and estimate the node density to measure the node status. Then, the excess data load was migrated to another node with free status to control congestion. Finally, the efficiency of the proposed model was measured in terms of delay, PDR, throughput, packet loss, and energy consumption. The proposed model achieved a high throughput of 97.1 Mbps and a PDR of 97.1, while reducing delay to 67.4 ms and energy consumption to 50.6 mJ. Hence, the present model is suitable for the LTE network.

journal article

LitStream Collection

Predicting eye-tracking assisted web page segmentation

Sulayfani, Abdullah; Eraslan, Sukru; Yesilada, Yeliz

2025 Multimedia Tools and Applications

doi: 10.1007/s11042-024-20202-1

Different kinds of algorithms have been proposed to identify the visual elements of web pages for different purposes, such as improving web accessibility and measuring web page visual quality and aesthetics. One group of these algorithms identifies the elements by analyzing the source code and visual representation of web pages, whereas another group discovers the attractive elements by analyzing the eye movements of users. A previous approach proposes combining these two approaches to consider both the source code and visual representation of web pages and users’ eye movements on those pages. The result of that approach can be considered eye-tracking-assisted web page segmentation. However, since the eye-tracking data collection procedure is elaborate, time-consuming, and expensive, and it is not feasible to collect eye-tracking data for each page, we aim to develop a model to predict such segmentation without requiring eye-tracking data. In this paper, we present our experiments with different Machine and Deep Learning algorithms and show that the K-Nearest Neighbour (KNN) model yields the best results in prediction. We present a KNN model that predicts eye-tracking-assisted web page segmentation with an F1-score of 78.74%. This work shows how a Machine Learning algorithm can automate web page segmentation driven by eye-tracking data.
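For readers unfamiliar with the K-Nearest Neighbour model named in the abstract above, a minimal classifier looks like the sketch below. The abstract does not list the features or labels used, so the two-dimensional feature vectors and the labels "segment" and "no-segment" are purely hypothetical placeholders, as is the function name `knn_predict`.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Label a query point by majority vote among its k nearest training points."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training data: each vector stands for features of a page element.
train_x = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
train_y = ["segment", "segment", "segment", "no-segment", "no-segment", "no-segment"]

near_first_group = knn_predict(train_x, train_y, (0.2, 0.2))
near_second_group = knn_predict(train_x, train_y, (5.5, 5.5))
```

An odd `k` avoids ties when there are two classes. In practice the features would usually be scaled first so that no single feature dominates the distance.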

journal article

Open Access Collection

Efficient compressed storage and fast reconstruction of large binary images using chain codes

Strnad, Damjan; Žlaus, Danijel; Nerat, Andrej; Žalik, Borut

2025 Multimedia Tools and Applications

doi: 10.1007/s11042-024-20199-7

Large binary images are used in many modern applications of image processing. For instance, they serve as inputs or target masks for training machine learning (ML) models in computer vision and image segmentation. Storing large binary images in limited memory and loading them repeatedly on demand, which is common in ML, calls for efficient image encoding and decoding mechanisms. In this paper, we propose an encoding scheme for efficient compressed storage of large binary images based on chain codes, and introduce a new single-pass algorithm for fast parallel reconstruction of raster images from the encoded representation. To test the efficiency of the proposed method, we use three large real-life binary masks derived from vector layers of single-class objects: a building cadaster, a woody vegetation landscape feature map, and a road network map. We show that the masks encoded by the proposed method require significantly less storage space than standard lossless compression formats. We further compared the proposed method for mask reconstruction from chain codes with a recent state-of-the-art algorithm, and achieved between 12% and 33% faster reconstruction on test data.
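The basic idea of a chain code, storing a region boundary as a start point plus a sequence of small direction codes, can be sketched as follows. This illustrates the general concept only: the abstract does not say which chain code variant the authors use, so the Freeman-style 4-direction code, the direction numbering, and the function names here are assumptions, and the sketch recovers only the boundary path, not the filled raster.

```python
# Assumed convention: image coordinates with x to the right and y downward.
# Codes: 0 = right, 1 = up, 2 = left, 3 = down.
STEPS = {0: (1, 0), 1: (0, -1), 2: (-1, 0), 3: (0, 1)}
CODES = {step: code for code, step in STEPS.items()}


def encode_chain(path):
    """Turn a 4-connected list of (x, y) points into (start point, direction codes)."""
    codes = [CODES[(x1 - x0, y1 - y0)]
             for (x0, y0), (x1, y1) in zip(path, path[1:])]
    return path[0], codes


def decode_chain(start, codes):
    """Rebuild the list of boundary points from a start point and direction codes."""
    x, y = start
    path = [(x, y)]
    for c in codes:
        dx, dy = STEPS[c]
        x, y = x + dx, y + dy
        path.append((x, y))
    return path


# Closed boundary of a small square, walked clockwise on screen.
square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1), (0, 0)]
start, codes = encode_chain(square)
restored = decode_chain(start, codes)
```

Each step needs only 2 bits in this variant, compared with a full coordinate pair per boundary point, which is where the storage saving of chain-code representations comes from.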
