Lin, ZeSheng
1 Introduction
Data classification is a core task in the machine learning domain [1]. Extreme learning machines (ELM) [2], as a fast and efficient machine learning algorithm, have been widely used in classification and prediction tasks across various fields [3]. By merging kernel functions with the ELM method, the kernel extreme learning machine (KELM) approach provides significant enhancements. With their robust generalization performance and rigorous mathematical basis, kernel-based learning methods can effectively boost the adaptability and accuracy of the model without sacrificing the benefits of ELM [4]. Nevertheless, the KELM model may encounter local minimum problems [5] and exhibit weak controllability, because its generalization performance is closely related to parameter selection [6–8]; metaheuristic algorithms are therefore needed to improve model prediction performance. Metaheuristic optimization methods such as particle swarm optimization (PSO) [9], the grey wolf optimization algorithm (GWO) [10], and the firefly algorithm (FA) [11] are frequently employed to fine-tune the parameters of KELM. The pursuit of the global optimum poses a difficulty for all metaheuristic algorithms as they strive to maintain a balance between exploration and exploitation [12]. Thus, they may require more iterations and mechanisms to discover the global best solution, and the search for the best algorithm for a given optimization problem remains a challenge. A swarm intelligence optimization technique called the whale optimization algorithm (WOA) was proposed in 2016 [13]. It optimizes the objective function by simulating how whale groups search for food in the ocean. Compared with many heuristic optimization algorithms, the strong global optimization performance of WOA can be attributed to its adoption of natural behaviors observed in whales.
Therefore, the whale optimization algorithm has been a popular choice among researchers for enhancing the performance of KELM in numerous studies [14–16]. Although these studies show strong advantages in solving many problems, they also come with limitations such as slow convergence and a tendency to fall into local optima. WOA still has certain flaws when solving complex engineering problems; for example, it can easily get stuck in local optima, resulting in fluctuations in the model's predictive performance. Therefore, the algorithm needs further improvement to stabilize its optimization behavior. To mitigate these problems, this paper presents a technique for optimizing the performance of KELM in classification tasks based on an enhanced adaptive whale optimization algorithm (EAWOA-KELM). An enhanced version of the WOA algorithm is proposed through the implementation of several strategies: T-distribution probability perturbation, Levy flight, novel nonlinear control parameters, and average position updating. In this study, 21 test functions from three benchmark function sets were used to assess the effectiveness of EAWOA. Finally, the proposed model is applied to classify data on a range of social issues, serving as strong evidence of its reliability and superiority. The remainder of this paper is arranged as follows: relevant studies are discussed in Section 2, while Section 3 presents an overview of the structure and working mechanisms of KELM and WOA. The proposed EAWOA and its improvement details are introduced in Section 4. Section 5 presents a series of experiments and discussions aimed at confirming the efficacy of EAWOA and its enhancement of KELM. Finally, Section 6 summarizes the main findings and future plans for the entire study.
2 Related work
Nowadays, machine learning algorithms like ELM are extensively utilized for prediction and classification purposes, and the way their parameters are set significantly influences the algorithm's results. Hence, various studies employ intelligent optimization algorithms such as WOA to identify the most suitable parameters for machine learning algorithms. Intelligent statistical technologies represented by extreme learning machines (ELM) have gained widespread usage in the domains of forecasting [17] and categorization [18]. ELM is a feedforward neural network with a single hidden layer, but it was later developed to incorporate deep structures which can perform representation learning by stacking encoders [19]. Unlike other neural networks, ELM stands out for its speed, as it has fewer layers and does not use backpropagation for parameter adjustment. Over the past few years, ELM has been a prominent subject of investigation due to its remarkable efficiency and user-friendly nature. It has also been effectively applied in numerous research domains, including missing data handling, imbalance correction, and activity recognition [20]. The development of KELM is rooted in ELM and involves converting ELM's random mapping into a kernel-based mapping, which not only removes the need to decide the number of hidden-layer neurons, but also yields strong generalization capabilities [21]. Although great progress has been made on both the theory and practice of ELM, one issue that is still not well considered is how to select optimal kernel parameters when kernel tricks are applied to ELM. In other words, choosing appropriate parameters is very important for KELM performance. Among the available approaches, using metaheuristic algorithms to select parameters is a current mainstream method [22]. Optimization is centered on identifying the most suitable solution or parameter value from a group of choices under certain conditions.
Metaheuristic algorithms usually draw on the wisdom of natural sciences such as biology or physics to find optimal solutions [23]. Nevertheless, there are restrictions regarding the speed and consistency of algorithm convergence, so researchers continue to propose more efficient metaheuristic algorithms. Based on their sources of inspiration, metaheuristic algorithms can typically be grouped into three categories: evolution, physics, and group dynamics. Algorithms based on biological evolution include the genetic algorithm (GA) [24], differential evolution (DE) [25], evolutionary strategy (ES) [26], etc. Algorithms based on physical principles include water evaporation optimization (WEO) [27], simulated annealing (SA) [28], charged system search (CSS) [29], etc. Typical representatives of swarm intelligence optimization algorithms include PSO, GWO, FA, and others. Observing the survival tactics of animals has produced numerous innovative optimization algorithms, including the Sparrow Search Algorithm (SSA) [30], the Gorilla Troops Optimizer (GTO) [31], WOA, etc. Moreover, different metaheuristic algorithms or concepts have distinct benefits and can work well together. Accordingly, many researchers have suggested improved hybrid algorithms that combine ideas from different algorithms or natural phenomena to enhance the effectiveness of the original algorithms, such as the Improved Corrective Smoothed Particle Method [32], Exploratory Cuckoo Search [33], and the Improved Sparrow Search Algorithm with HDPM [34]. With its proven success in solving optimization problems, the WOA algorithm has gained widespread recognition in fields such as traffic networks [35] and path planning [36, 37]. In spite of its evident effectiveness over other advanced algorithms, it remains challenged by factors such as sluggish convergence, inadequate exploration potential, and entrapment in local optima [38].
For example, WOA uses a coefficient vector A that is heavily influenced by the convergence factor α, which determines whether the WOA algorithm performs the exploration phase or encircles prey. In WOA, the exploration phase is chosen when |A| ≥ 1. But as the number of iterations increases into the latter half, the WOA algorithm tends to execute only the exploitation phase and weakens the exploration process [38]. Many strategies have been put forth to improve the efficiency of WOA. Chen et al. proposed several multistrategy approaches to better balance exploration and encircling prey [39, 40]. Some researchers apply the Levy flight trajectory to the WOA algorithm to increase solution diversity [41, 42]. Fan et al. introduced a novel ESSAWOA algorithm which incorporates an enhanced SSA and WOA [43]. By combining WOA and PSO, Huang et al. proposed whale particle optimization (WPO) to promote a greater range of particles [44]. Furthermore, there are advanced whale optimization methods that take cues from physical phenomena. For instance, Tang et al. proposed a novel WOA algorithm that incorporates the concept of atom-like differential evolution, which models whale behavior as quantum mechanical behavior [45]. Utilizing the benefits of alternative algorithms can help overcome the limitations of WOA. During the development of WOA, Bilal H et al. developed a refined variation of WOA called the Island-based Whale Optimization Algorithm by integrating island models to boost diversity [46]. Subsequently, they combined the hill-climbing algorithm with the WOA algorithm, resulting in WOABHC, where the hill-climbing algorithm serves to maintain diversity within the algorithm [47]. Although the above algorithms have made significant progress in improving WOA, there is still room for development in global optimization, convergence speed, and other aspects.
3 Methods
This section presents the fundamental principles behind the models and algorithms discussed in this paper. The details are outlined as follows: Section 3.1 provides a thorough examination of the architecture of KELM. Section 3.2 provides a detailed introduction to the whale optimization algorithm, offering an in-depth exploration of its three main stages: encircling prey, the exploitation phase, and the exploration phase. Fig 1 displays the schematic diagram of KELM, while Algorithm 1 presents the pseudocode of WOA. By delving into these components, this paper seeks to shed light on the approach of blending KELM and WOA to deal with parameter optimization challenges and enhance predictive accuracy.
Fig 1. Model of ELM. https://doi.org/10.1371/journal.pone.0309741.g001
3.1 Kernel Extreme Learning Machine
ELM is a highly efficient neural network known for its exceptional learning capabilities and performance. Owing to its single-hidden-layer network and the absence of a backpropagation algorithm, ELM is able to achieve impressive results. Fig 1 illustrates the network design of ELM, which is comparable to that of a conventional neural network in many ways, except that neural networks use gradient descent to adjust parameters, while ELM's hidden-layer parameters are set randomly, so its running speed is much faster. However, it is prone to problems such as unstable training results and unsatisfactory generalization ability. To fortify the system and enhance its overall effectiveness, the kernel extreme learning machine (KELM) adopts kernel mapping in place of the random mapping used in ELM, leading to improved performance and generalization capability. The mathematical representation of ELM can be simplified to Eq (1).
f(x) = h(x)β = Hβ (1)
where x and f(x) represent the input and output of the model respectively, h(x) or H represents the output matrix obtained by feeding x into the hidden layer, and β defines the weight matrix connecting the hidden neurons and the output layer, which can be expressed as Eq (2):
β = H^T (I/C + H H^T)^(-1) E (2)
where H^T is the transposed matrix of H, I is the identity matrix, C is the regularization coefficient, and E is the expected output matrix. The kernel matrix Ω in KELM can be expressed as Eq (3):
Ω = H H^T, with Ω(i, j) = h(x_i) · h(x_j) = K(x_i, x_j) (3)
This paper uses the radial basis function (RBF) as the kernel function, and its calculation formula is:
K(x, y) = exp(−‖x − y‖² / S) (4)
where S = 2δ². The standard output of KELM can then be expressed as Eq (5):
f(x) = [K(x, x_1), …, K(x, x_N)] (I/C + Ω)^(-1) E (5)
In fact, the effectiveness of the KELM model relies heavily on the chosen values of the regularization coefficient C and the kernel parameter S. In other words, different (C, S) combinations directly influence KELM's forecasting capability, so choosing an appropriate (C, S) combination is very important for KELM. Selecting these parameters poses a nonlinear problem, making it a daunting task to obtain the best solution using conventional methods. To deal with this problem effectively, this paper puts forth an improved WOA to optimize the KELM parameters.
3.2 Whale Optimization Algorithm (WOA)
WOA is a biologically-inspired algorithm created for optimization purposes. By mimicking the social habits of whale groups, it is able to search globally and optimize locally for complex optimization problems. The main principles of WOA are as follows. Suppose there is a group of whales searching in a d-dimensional space; then the whale numbered i can be expressed as X_i = (x_i1, x_i2, …, x_id). Each whale's position serves as a candidate solution to the problem. The entire process of WOA can be divided into three stages: encircling prey, the exploitation phase, and the search for prey.
3.2.1 Encircling prey. This stage simulates the behavior of whales contracting around and surrounding their prey after discovering it.
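As a concrete illustration, the KELM computation of Eqs (2)–(5) can be sketched in a few lines of NumPy. This is a minimal sketch, not the paper's implementation: the class and parameter names (`KELM`, `C_reg`, `S`) are hypothetical, and a one-hot target matrix plays the role of the expected output E.

```python
# Minimal KELM sketch with an RBF kernel, following Eqs (2)-(5).
# Hypothetical names: C_reg is the regularization coefficient C,
# S is the kernel parameter S = 2*delta^2.
import numpy as np

def rbf_kernel(X, Y, S):
    # K(x, y) = exp(-||x - y||^2 / S), computed for all pairs of rows
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / S)

class KELM:
    def __init__(self, C_reg=1.0, S=1.0):
        self.C_reg, self.S = C_reg, S

    def fit(self, X, E):
        # beta = (I/C + Omega)^-1 E, with Omega = K(X, X)  (kernelized Eq (2))
        self.X = X
        Omega = rbf_kernel(X, X, self.S)
        n = Omega.shape[0]
        self.beta = np.linalg.solve(np.eye(n) / self.C_reg + Omega, E)
        return self

    def predict(self, Xq):
        # f(x) = [K(x, x_1), ..., K(x, x_N)] beta  (Eq (5))
        return rbf_kernel(Xq, self.X, self.S) @ self.beta
```

For classification, E can be a one-hot label matrix, and the predicted class of a query point is the argmax of the corresponding output row.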
Nevertheless, because the optimal location in the search space is unknown, the WOA algorithm presumes that the position with the minimum fitness value in the current iteration is the prey's position. Let X* represent this prey; the other whales will then surround it. The calculation formulas for this process are Eqs (6) to (9):
D = |C · X*(t) − X(t)| (6)
X(t + 1) = X*(t) − A · D (7)
A = 2α · r1 − α (8)
C = 2 · r2 (9)
where t is the current iteration number, A and C are coefficients, the whale's current position is X(t), its updated position is X(t + 1), and X* is the whale position that obtains the optimal value. α is the convergence factor, with expression α = 2(1 − t/tmax), where tmax represents the number of iterations; α linearly decreases from 2 to 0 during the iteration process. r1 and r2 are random vectors with components between 0 and 1.
3.2.2 Exploitation phase. The exploitation phase demonstrates another way for whales to approach prey, which complements the encircling behavior. The behavior involves a spiraling motion, commonly referred to as spiral bubble-net feeding. The logarithmic spiral model can be defined as Eqs (10) and (11):
X(t + 1) = D′ · e^(bl) · cos(2πl) + X*(t) (10)
D′ = |X*(t) − X(t)| (11)
where b is a constant defining the shape of the logarithmic spiral and l is a random number in [−1, 1].
3.2.3 Search for prey (exploration phase). Using only the first two encircling strategies limits thorough examination of the space, causing WOA to get trapped in local optima. To combat this limitation, WOA introduces a random strategy in the exploration phase. When |A| ≥ 1, a whale is chosen at random as prey, and the other whales move toward it. This technique broadens the whales' search area and can lead to better locations. The process is expressed by Eqs (12) and (13); it is similar to Eqs (6)–(9), except that the surrounded prey is different.
D = |C · Xrand − X(t)| (12)
X(t + 1) = Xrand − A · D (13)
The pseudocode of the WOA algorithm is presented in Algorithm 1.
Algorithm 1 Whale Optimization Algorithm
1: Initialize the position of the whale group: Xi (i = 1, 2, …, k)
2: Calculate the fitness value of each whale, and use the whale with the smallest fitness value as the optimal whale X*
3: while t < Tmax do
4:  for each whale do
5:   Update parameters α, A, C, l, and p
6:   if p < 0.5 then
7:    if |A| < 1 then
8:     Employ Eq (7) to update the current position of the whale
9:    else
10:     Select a random whale Xrand
11:     Employ Eq (13) to update the current position of the whale
12:    end if
13:   else
14:    Employ Eq (10) to update the current position of the whale
15:   end if
16:  end for
17:  Apply boundary constraints to each search agent and calculate the fitness of each whale
18:  Update the optimal whale X*
19:  t ← t + 1
20: end while
21: return optimal whale X*
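Algorithm 1 can be condensed into a short NumPy sketch. This is an illustrative implementation of the standard WOA, not the paper's code: scalar coefficients A and C are used for simplicity, the spiral constant b is fixed to 1, and the population size and iteration budget are arbitrary choices.

```python
# Compact sketch of the standard WOA (Algorithm 1), minimizing a
# user-supplied objective over a box [lo, hi]^dim.
import numpy as np

def woa(objective, dim, lo, hi, n_whales=20, t_max=200, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.uniform(lo, hi, (n_whales, dim))        # step 1: initialize positions
    best = min(X, key=objective).copy()             # step 2: optimal whale X*
    for t in range(t_max):
        alpha = 2.0 * (1 - t / t_max)               # alpha decreases linearly 2 -> 0
        for i in range(n_whales):
            A = 2 * alpha * rng.random() - alpha    # Eq (8), scalar for simplicity
            C = 2 * rng.random()                    # Eq (9)
            p = rng.random()
            l = rng.uniform(-1, 1)
            if p < 0.5:
                if abs(A) < 1:                      # encircling prey, Eqs (6)-(7)
                    X[i] = best - A * np.abs(C * best - X[i])
                else:                               # exploration, Eqs (12)-(13)
                    Xr = X[rng.integers(n_whales)]
                    X[i] = Xr - A * np.abs(C * Xr - X[i])
            else:                                   # spiral update, Eqs (10)-(11), b = 1
                D1 = np.abs(best - X[i])
                X[i] = D1 * np.exp(l) * np.cos(2 * np.pi * l) + best
        X = np.clip(X, lo, hi)                      # boundary constraints
        cand = min(X, key=objective)
        if objective(cand) < objective(best):       # update the optimal whale X*
            best = cand.copy()
    return best
```

For example, `woa(lambda x: float((x ** 2).sum()), dim=2, lo=-10.0, hi=10.0)` minimizes the 2-D sphere function.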
4 Enhanced Adaptive Whale Optimization Algorithm
The overall flowchart of the proposed EAWOA is shown in Fig 2. In WOA's encircling prey phase and exploitation phase, the other whales naturally move towards the optimal whale. By mutating the optimal whale, the algorithm's performance can be significantly improved.
Hence, a novel perturbation strategy based on adaptive T-distributions has been developed to perturb the optimal position and generate better solutions. Secondly, the convergence factor α is changed from a linear decrease to a nonlinear decrease, and an inertia-like weight ω is added. With this improvement, the algorithm can conduct flexible searches at various stages. In addition, inspired by the grey wolf optimization algorithm, this paper uses the average position of three outstanding whales to replace the original random whale in the prey search stage. This process is similar to three excellent hunters surrounding their prey, enhancing the probability that the WOA algorithm finds a better value in the decision space. Finally, an innovative Levy flight is performed on the whale positions to improve global search ability. This new Levy flight scheme allows the whale optimization algorithm to explore better solutions in different directions, enhancing the diversity of the whale population.
Fig 2. The flowchart of the proposed EAWOA. https://doi.org/10.1371/journal.pone.0309741.g002
4.1 Adaptive T-distribution perturbation strategy
4.1.1 T-distribution perturbation strategy. The T-distribution is an important distribution in statistics, commonly used in statistical inference for small sample sizes, parameter estimation, and hypothesis testing. The T-distribution's probability density function can be described by Eq (14):
f(x) = Γ((v + 1)/2) / (√(vπ) · Γ(v/2)) · (1 + x²/v)^(−(v + 1)/2) (14)
where v is the number of degrees of freedom and Γ is the gamma function.
4.1.2 Adaptive T-distribution perturbation strategy. In this paper, we propose a novel T-distribution perturbation to strengthen the optimal whale's spatial exploration capability. Since the algorithm risks falling into a local optimum in the later stages of iteration, the perturbation strategy is strengthened during this period.
The process of the adaptive T-distribution strategy is shown in Eqs (15) and (16), where C1 = 0.64 and C2 = 0.04 in this paper. trnd(t) represents a random vector drawn from a T-distribution whose degrees of freedom equal the current iteration number. A coefficient s between (0, 1) is used to moderate the disturbance intensity of the optimal whale. As the number of iterations increases, s becomes smaller, so (1 − s) becomes larger, indicating that the degree of variation increases in the later period, which helps the optimal whale position escape local optima in the later rounds. (15) (16) To guarantee an improved position after the disturbance, a greedy approach is applied: the fitness values of the new and old positions are compared to determine whether a position update is necessary. Eq (17) represents the greedy strategy:
X* = X_new if f(X_new) < f(X*), otherwise X* remains unchanged (17)
4.2 Dynamic adaptive weight adjustment
4.2.1 Nonlinear control parameter. In general, a swarm intelligence optimization algorithm comprises two separate stages: global search and local search. Ideally, an algorithm uses strong global search capabilities to determine the spatial region of the optimal value within the decision space, and then uses local search to obtain the precise value. Therefore, the key to achieving high search performance is effectively coordinating exploration and exploitation capabilities. However, the original WOA algorithm's control parameter α decreases linearly, limiting its ability to fully utilize its search capabilities. Hence, this paper proposes a nonlinear control parameter α, expressed as Eqs (18) and (19). (18) (19) where βmax = 2.0, βmin = 0, b = 2, c = 0. Fig 3 illustrates the changes in the convergence factor α before and after improvement.
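The perturbation-plus-greedy scheme of Section 4.1 can be sketched as follows. Note the hedging: the exact forms of Eqs (15)–(16) are not reproduced here, so the decaying coefficient `s` (built from the stated constants C1 = 0.64 and C2 = 0.04) and the mutation expression are assumed stand-ins that only exhibit the stated behaviour (s shrinks as iterations grow, the perturbation weight (1 − s) grows, and the greedy rule of Eq (17) never accepts a worse position).

```python
# Sketch of the adaptive T-distribution perturbation with greedy
# acceptance (Eq (17)). The decay of s and the mutation formula are
# ASSUMED illustrations, not the paper's exact Eqs (15)-(16).
import numpy as np

def t_perturb_best(best, fitness, t, t_max, C1=0.64, C2=0.04, rng=None):
    rng = rng or np.random.default_rng()
    s = C1 * (1 - t / t_max) + C2            # assumed decay: s in (0, 1), shrinking over time
    trnd = rng.standard_t(df=max(t, 1), size=best.shape)  # trnd(t): df = current iteration
    candidate = s * best + (1 - s) * best * trnd          # assumed mutation shape
    # Greedy strategy (Eq (17)): keep whichever position has lower fitness
    return candidate if fitness(candidate) < fitness(best) else best
```

Because of the greedy rule, the returned position is never worse than the input, which is exactly the guarantee Eq (17) provides.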
This change of α enables the whale swarm to make bigger movements during the initial phases and smaller adjustments later on. It can be seen from Fig 3 that the control parameter α changes nonlinearly and dynamically. In contrast to the initial control parameter, the enhanced α is initially larger and gradually becomes smaller, which enables the algorithm to perform extensive global and local searches during these two phases respectively.
Fig 3. Convergence factor α change graph. https://doi.org/10.1371/journal.pone.0309741.g003
4.2.2 The adjustment weight ω. To achieve a more harmonious balance between the exploitation and exploration stages, the adjustment weight ω is set as Eqs (20) and (21), where βmax = 0.9, βmin = 0.4. (20) (21) Similar to the principle of the nonlinear control parameter α, the adaptive weight ω further heightens the algorithm's effectiveness in global and local search. Applying ω to the three position update Eqs (7), (10) and (13) gives the new position update Eqs (22)–(24). As the number of iterations increases, ω decreases nonlinearly from 0.9 to 0.4, which helps each individual make larger updates in the later stage and thus escape from local optima. (22) (23) (24)
4.3 Learning strategy based on Levy random flight
The Levy flight model is an efficient form of random walk for enhancing algorithm diversity due to the random direction and length of its motion. Many creatures in nature adopt this random walk strategy to increase the possibility of finding food. Inspired by this, Levy flight can be introduced into whale movements to increase the algorithm's search possibilities. This paper introduces a learning strategy that performs a Levy flight on the particles after the position update. Its expression is given as Eqs (25) and (26), where A1 and C1 have the same meaning as A and C in the encircling prey phase.
step represents the Levy flight vector, β = 1.5, and μ and v represent random vectors with the same dimensions as whale individuals. (25) (26)
4.4 New random whale in exploration phase
Many algorithms include a strategy to reduce local-optimum situations. In WOA, this strategy is expressed as Eq (24): a random whale is chosen instead of the optimal whale. Despite this, if the chosen whale falls into a poor region, it will cause other poor-performing whales to gather there, and the already converged algorithm will return to a scattered state, wasting computing resources. Inspired by the grey wolf optimization algorithm, this paper selects the 2nd- to 4th-best individual whales as hunters, uses their average position as the prey, and replaces the original random whale with it. This preserves the randomness of the algorithm while allowing it to search towards a more promising space. The process is expressed by Eq (27), where Xrank2, Xrank3 and Xrank4 respectively represent the whale positions with fitness values ranked 2nd to 4th:
Xrand = (Xrank2 + Xrank3 + Xrank4) / 3 (27)
By integrating the strategies described above, this paper develops the EAWOA model. Algorithm 2 contains the pseudocode of the EAWOA algorithm.
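The two ingredients above can be sketched in NumPy. The Levy step is drawn with Mantegna's algorithm, a common way to generate Levy-stable steps with β = 1.5 from two Gaussian vectors μ and v; this is an assumed illustration, since the paper's exact Eqs (25)–(26) (which also involve A1 and C1) are not reproduced here. The averaged prey follows the description of Eq (27) directly.

```python
# Sketch of a Levy step vector (Mantegna's algorithm, beta = 1.5) and the
# averaged "prey" that replaces the random whale (Eq (27)). The Levy
# formula is an ASSUMED stand-in for the paper's Eqs (25)-(26).
import math
import numpy as np

def levy_step(dim, beta=1.5, rng=None):
    rng = rng or np.random.default_rng()
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
             / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    mu = rng.normal(0.0, sigma, dim)     # random vector mu
    v = rng.normal(0.0, 1.0, dim)        # random vector v
    return mu / np.abs(v) ** (1 / beta)  # heavy-tailed step: mu / |v|^(1/beta)

def averaged_prey(X, fit):
    # Eq (27): mean of the 2nd- to 4th-best whales (by fitness) acts as prey
    idx = np.argsort(fit)[1:4]
    return X[idx].mean(axis=0)
```

The heavy tail of the Levy step occasionally produces long jumps, which is what gives the population its extra diversity.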
Algorithm 2 Enhanced Adaptive Whale Optimization Algorithm (EAWOA)
1: Initialize the position of the whale group: Xi (i = 1, 2, …, k)
2: Calculate the fitness value of each whale, and use the whale with the smallest fitness value as the optimal whale X*
3: while t < Tmax do
4:  Perform adaptive T-distribution perturbation on the optimal whale (Eq 16)
5:  for each whale do
6:   Update parameters α (Eq 19), ω (Eq 21), A, C, l, and p
7:   if p < 0.5 then
8:    if |A| < 1 then
9:     Employ Eq 22 to update the current position of the whale
10:    else
11:     Employ Eq 27 to obtain the random whale
12:     Employ Eq 24 to update the current position of the whale
13:    end if
14:   else
15:    Employ Eq 23 to update the current position of the whale
16:   end if
17:  end for
18:  Perform Levy flight on each whale using Eq 25
19:  Apply boundary constraints to each search agent and calculate their fitness
20:  Update the optimal whale X*
21:  t ← t + 1
22: end while
23: return optimal whale X*
4.5 The whole model of EAWOA-KELM
Different from other neural network algorithms, KELM does not adjust its weights through backpropagation, so its operational efficiency is very high. However, manual parameter selection results in low accuracy. Therefore, this paper uses the EAWOA algorithm in place of manual parameter selection to increase the precision of KELM. The flowchart of EAWOA optimizing KELM is shown in Fig 4.
Fig 4. Flowchart of KELM parameters optimized by EAWOA. https://doi.org/10.1371/journal.pone.0309741.g004
Firstly, the regularization parameter C and kernel parameter S are mapped to the whale position (C, S). The fitness function for EAWOA is then tied to KELM's performance: the goal is to identify the whale position (C, S) that yields the most precise KELM classification.
4.1 Adaptive T-distribution perturbation strategy

4.1.1 T-distribution perturbation strategy. The T-distribution is an important distribution in statistics, commonly used in statistical inference for small samples, parameter estimation, and hypothesis testing. Its probability density function is given by Eq 14. (14) where v is the number of degrees of freedom and Γ is the gamma function.

4.1.2 Adaptive T-distribution perturbation strategy. In this paper, we propose a novel T-distribution perturbation to strengthen the optimal whale's spatial exploration capability. Because the algorithm risks falling into local optima in the later stages of iteration, the perturbation strategy is strengthened during this period. The adaptive T-distribution strategy is given by Eqs (15) and (16). (15) (16) where C1 = 0.64 and C2 = 0.04 in this paper, and trnd(t) is a random vector drawn from the T-distribution whose degrees-of-freedom parameter is the current iteration number. A coefficient s in (0, 1) moderates the disturbance intensity applied to the optimal whale. As the iterations increase, s becomes smaller and (1 − s) becomes larger, so the degree of variation grows in later rounds, which helps the optimal whale escape local optima. To guarantee an improved position after the disturbance, a greedy approach is then applied.
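As a minimal Python sketch of this perturb-then-select step: the exact forms of Eqs (15) and (16) are not reproduced in this excerpt, so the decreasing coefficient `s` below is a hypothetical stand-in; only the iteration-dependent degrees of freedom and the greedy acceptance of Eq (17) follow the text directly.

```python
import numpy as np

def t_perturb(x_best, fitness, t, t_max, c1=0.64, c2=0.04, rng=None):
    """Adaptive T-distribution perturbation with greedy acceptance (a sketch).

    The schedule for s is a hypothetical stand-in: s lies in (0, 1) and
    shrinks as iterations grow, so (1 - s) and the disturbance grow late.
    """
    if rng is None:
        rng = np.random.default_rng()
    s = c1 * (1.0 - t / t_max) + c2  # hypothetical decreasing coefficient
    # Random vector from the T-distribution; degrees of freedom = iteration count
    trnd = rng.standard_t(df=max(t, 1), size=np.shape(x_best))
    x_new = x_best + (1.0 - s) * trnd * x_best
    # Greedy strategy (Eq 17): keep the perturbed position only if it is better
    return x_new if fitness(x_new) < fitness(x_best) else x_best
```

Because of the greedy test, the returned position never has a worse fitness than the input position.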
The fitness value of the new position is compared with that of the old position to decide whether to update. Eq (17) expresses the greedy strategy: (17)

4.2 Dynamic adaptive weight adjustment

4.2.1 Nonlinear control parameter.
In general, a swarm intelligence optimization algorithm comprises two stages: global search and local search. Ideally, an algorithm first uses strong global search capability to locate the region of the decision space that contains the optimum, and then uses local search to refine the solution. The key to high search performance is therefore how effectively exploration and exploitation are coordinated. However, the original WOA's control parameter α decreases linearly, limiting its ability to fully exploit the algorithm's search capabilities. Hence, this paper proposes a nonlinear control parameter α, expressed by Eqs (18) and (19). (18) (19) where βmax = 2.0, βmin = 0, b = 2, c = 0. Fig 3 illustrates the convergence factor α before and after the improvement. The modified α lets the whale swarm sustain larger movements in the initial phase and smaller adjustments later. As Fig 3 shows, the control parameter α changes nonlinearly and dynamically: compared with the original, the enhanced α starts larger and decreases gradually, enabling extensive global search in the early phase and fine-grained local search in the later phase.

Fig 3. Convergence factor α change graph. https://doi.org/10.1371/journal.pone.0309741.g003

4.2.2 The adjustment weight ω. To achieve a better balance between the exploration and exploitation stages, the adjustment weight ω is defined by formulas 20 and 21, where βmax = 0.9, βmin = 0.4. (20) (21) Similar in principle to the nonlinear control parameter α, the adaptive weight ω further improves the algorithm's effectiveness in global and local search. Applying ω to the three position-update Eqs 7, 10 and 13 yields the new position-update Eqs 22–24.
As the number of iterations increases, ω decreases nonlinearly from 0.9 to 0.4, which helps each individual make larger updates in the later stage and thus escape from local optima. (22) (23) (24)
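The two schedules described above can be sketched as follows. The exact expressions of Eqs (18)–(21) are not reproduced in this excerpt, so the cosine decay below is a hypothetical stand-in that only matches the stated endpoints (α: 2.0 → 0, ω: 0.9 → 0.4) and the described large-early, small-late nonlinear behaviour.

```python
import numpy as np

def alpha_schedule(t, t_max, beta_max=2.0, beta_min=0.0):
    """Hypothetical nonlinear convergence factor standing in for Eqs (18)-(19):
    decays from beta_max at t=0 to beta_min at t=t_max, fast late, slow early."""
    return beta_min + (beta_max - beta_min) * 0.5 * (1 + np.cos(np.pi * t / t_max))

def omega_schedule(t, t_max, beta_max=0.9, beta_min=0.4):
    """Hypothetical inertia weight standing in for Eqs (20)-(21):
    decays nonlinearly from 0.9 to 0.4 over the run."""
    return beta_min + (beta_max - beta_min) * 0.5 * (1 + np.cos(np.pi * t / t_max))
```

Any smooth monotone curve with the same endpoints would serve the same illustrative purpose; the point is that both parameters stay large early (favoring exploration) and shrink late (favoring exploitation).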
4.3 Learning strategy based on Levy random flight

The Levy flight model is an efficient form of random walk for enhancing algorithm diversity, owing to the random direction and length of its movements. Many creatures in nature adopt this random-walk strategy to increase the chance of finding food. Inspired by this, Levy flight is introduced into the whales' movements to widen the algorithm's search. This paper applies a Levy-flight learning strategy to each particle after its position update, expressed by Eqs 25 and 26, where A1 and C1 have the same meaning as A and C in the encircling-prey phase, step represents the Levy flight vector, β = 1.5, and μ and v are random vectors with the same dimensions as a whale individual. (25) (26)

4.4 New random whale in exploration phase

Many algorithms include a strategy for reducing the risk of local optima; in WOA this strategy is expressed as Eq 24, where a random whale is chosen instead of the optimal whale. However, if the chosen whale lies in a poor region of the space, other poorly performing whales will gather around it, and an already converged population returns to a scattered state, wasting computing resources. Inspired by the grey wolf optimization algorithm, this paper selects the 2nd- to 4th-best whales as hunters, takes their average position as the prey, and replaces the original random whale with it.
This preserves the randomness of the algorithm while steering the search towards a more promising region. The process is expressed by Eq 27, where Xrank2, Xrank3 and Xrank4 respectively denote the whale positions ranked 2nd to 4th by fitness. (27)

With the enhancements described above, this paper obtains the EAWOA model. Algorithm 2 gives its pseudo code.

Algorithm 2 Enhanced Adaptive Whale Optimization Algorithm (EAWOA)
1: Initialize the positions of the whale group: Xi (i = 1, 2, …, k)
2: Calculate the fitness of each whale; take the whale with the smallest fitness as the optimal whale X*
3: while t < Tmax do
4:  Perform adaptive T-distribution perturbation on the optimal whale (Eq 16)
5:  for each whale do
6:   Update parameters α (Eq 19), ω (Eq 21), A, C, l, and p
7:   if p < 0.5 then
8:    if |A| < 1 then
9:     Employ Eq 22 to update the current position of the whale
10:    else
11:     Employ Eq 27 to obtain the random whale
12:     Employ Eq 24 to update the current position of the whale
13:    end if
14:   else
15:    Employ Eq 23 to update the current position of the whale
16:   end if
17:  end for
18:  Perform Levy flight on each whale using Eq 25
19:  Apply boundary constraints to each search agent and calculate their fitness
20:  Update the optimal whale X*
21:  t ← t + 1
22: end while
23: return optimal whale X*

4.5 The whole model of EAWOA-KELM

Unlike other neural network algorithms, the weights of the KELM algorithm are randomly generated, so no back-propagation is needed, which makes KELM highly efficient to run. However, manual parameter selection leaves KELM with low accuracy. This paper therefore uses the EAWOA algorithm in place of manual parameter selection to increase the precision of KELM. The flowchart of EAWOA optimizing KELM is shown in Fig 4.
Fig 4. Flowchart of KELM parameters optimized by EAWOA. https://doi.org/10.1371/journal.pone.0309741.g004

First, the regularization parameter C and kernel parameter S are mapped to a whale position (C, S). The performance of KELM defines the fitness function for the optimizer, so the search identifies the combination (C, S) that yields the most precise KELM classification. The whole process splits the data into training and testing sets in a fixed proportion; after training, the test set is classified and the classification accuracy is output. By optimizing the parameters of the KELM algorithm, namely the kernel parameter and the regularization parameter, the optimal prediction model can be obtained faster.

5 Experiment and result analysis

The experimental part is divided into two parts. Section 5.1 assesses the effectiveness of the proposed EAWOA, and Section 5.2 shows that the KELM model optimized by EAWOA improves the accuracy of classification tasks.

5.1 Experiment about EAWOA

5.1.1 Algorithm parameter settings and test functions. This paper applies 21 benchmark functions from CEC 2005 [48], CEC 2017 [49] and CEC 2022 [50] to assess the optimization capability of EAWOA. These functions are described in Tables 1–3, and some of them, both unimodal and multimodal, are shown in Fig 5. They reflect optimization problems ranging from simple to complex. To guarantee a fair and objective comparison, the parameters are set as follows: the whale population size and the number of iterations are 100 and 1000 respectively, and the dimension settings of each benchmark function are also given in Tables 1–3.
All models are run independently 30 times to obtain two evaluation indicators: mean and standard deviation.

Table 1. Descriptions of the selected CEC 2005 benchmark functions. https://doi.org/10.1371/journal.pone.0309741.t001

Table 2. Descriptions of the selected CEC 2017 benchmark functions. https://doi.org/10.1371/journal.pone.0309741.t002

Table 3. Descriptions of the selected CEC 2022 benchmark functions. https://doi.org/10.1371/journal.pone.0309741.t003

Fig 5. Images of partial benchmark functions. https://doi.org/10.1371/journal.pone.0309741.g005

5.1.2 Comparison of EAWOA with other metaheuristic algorithms. To demonstrate the advantages of the proposed model, it is compared with three popular metaheuristic algorithms: PSO, GWO, and in particular the standard WOA. Each algorithm is run independently 30 times under the parameters specified in Section 5.1.1, and its mean and standard deviation are recorded; lower values indicate better optimization and higher stability. Table 4 shows the statistical results of each model. According to Table 4, the proposed EAWOA achieves better outcomes than the compared state-of-the-art methods. On the 10 benchmark functions of CEC 2005 in particular, it attains the optimal value on 7 functions, indicating its competitiveness against other strong metaheuristic algorithms. Moreover, across the 21 benchmark functions of CEC 2005, CEC 2017 and CEC 2022, EAWOA achieves better results than the standard WOA, which fully demonstrates its effectiveness and adaptability.
These functions include unimodal, multimodal, and high-dimensional cases, indicating that on complex problems the EAWOA algorithm can still converge to high accuracy, and on many functions it even reaches the theoretical optimal value.

Table 4. Statistical results of the proposed algorithm and other metaheuristic algorithms. https://doi.org/10.1371/journal.pone.0309741.t004

At the same time, the analysis reveals that the standard deviation of EAWOA is consistently lower than that of WOA, with the exception of the F5 test function. Compared with all the algorithms, EAWOA attains the best standard deviation on the majority of test functions; on seven functions such as F1, F2 and F3, the standard deviation reaches the ideal value of 0. These findings indicate that EAWOA exhibits higher convergence stability and better robustness than PSO, GWO, and WOA.

5.1.3 Comparison of EAWOA with other WOA variants. To further demonstrate the advantages of EAWOA, this part selects WOA and 4 recent WOA variants for comparison: WOA_LFDE [41], eWOA [51], MWOA [52] and MSWOA [53]. Likewise, 30 independent trials were performed on EAWOA and the other improved WOA variants with the parameters described in Section 5.1.1, and the final means and standard deviations were calculated. Table 5 presents the comparison results, indicating that EAWOA performs favorably. With a record of 10 wins, 10 losses, and 1 draw, it ranks level with WOA_LFDE, and it holds clear advantages over the other three improved models: against eWOA it won on 9 benchmark functions, lost on 5, and tied on 7; against MWOA it won on 13, lost on 1, and tied on 7.
Against MSWOA, it won on 16 benchmark functions, lost on 2, and tied on 3. These results show that EAWOA outperforms each improved WOA algorithm on more test functions than it loses, and it attains the theoretical optimal value on seven functions such as F1, F2, and F3. As for standard deviation, EAWOA shows smaller values than the other advanced algorithms on 14 of the 21 functions, highlighting its stability advantage.

Table 5. Statistical results of various algorithms. https://doi.org/10.1371/journal.pone.0309741.t005

This experiment makes it apparent that the EAWOA algorithm not only surpasses the standard WOA in global and local search capability, but also holds considerable advantages over other improved versions of WOA.

5.1.4 Convergence speed analysis. Besides the fitness value, the efficiency of the EAWOA algorithm can also be evaluated by its convergence speed, and this section examines that advantage. Fig 6 compares EAWOA with the standard WOA and the improved MSWOA and WOA_LFDE. The nonlinear convergence parameters α and ω effectively strengthen convergence, while the other strategies also play a key role in locating the global optimum, helping EAWOA approach the optimal value quickly. In contrast to the other three algorithms, the convergence curve of EAWOA drops rapidly, allowing it to reach the theoretical optimal solution quickly on most functions. The trend in Fig 6 shows the evolution curve of EAWOA decreasing markedly as the iterations increase. On many basic test functions such as F1–F4, including unimodal functions, it reaches the best solution rapidly.
On many functions it even reaches convergence hundreds of iterations earlier than the other algorithms, for example on F11 of CEC 2005. EAWOA also shows considerable convergence-speed advantages on the other multimodal or hybrid functions. Comparison with Table 5 makes it evident that EAWOA combines faster convergence with high optimization accuracy, demonstrating fast convergence and strong global search capability. Although WOA_LFDE obtained the same ranking as EAWOA in the previous experiments, it converged more slowly than the EAWOA algorithm.

Fig 6. Convergence curves of various improved WOA algorithms. https://doi.org/10.1371/journal.pone.0309741.g006

The experiments in Sections 5.1.1 to 5.1.4 show that the EAWOA model proposed in this paper holds considerable advantages over several current metaheuristic algorithms, the standard WOA, and several strong WOA variants, and that it reaches convergence faster during the iterative process.

5.2 Experiment about EAWOA-KELM

5.2.1 Classification datasets. This paper uses seven classification datasets to validate the advantages of the improved KELM: SDAS [54], NPHA [55], TME [56], SRP [57], Abalone [58], Balance-Scale [59] and Glass [60]. SDAS is a dataset on students' performance in higher education. It contains 36 features, including gender, academic performance, and parental qualification, and 3 classes: dropout, enrolled, and graduate. To ensure an even class distribution, we took 500 samples for each category of students, totaling 1500 samples. NPHA is a dataset that predicts the number of doctors an elderly person visits within a year; it has 714 examples, 14 features, and 3 categories. TME is a dataset that classifies music by emotion.
It contains 400 examples and uses 50 features to divide music into 4 emotional types: happy, sad, angry, and relaxed. SRP is a dataset for accent detection and recognition; it contains 329 individual English words spoken by native speakers from six different nations. Abalone is a dataset that predicts the age of abalone from 8 characteristics; to make the class distribution more uniform, we selected 100 samples for each age category from 5 to 14 years, forming 1000 samples in total. The Balance-Scale dataset describes the balance results of a scale, consisting of 625 data points, each with 4 attributes and a category indicating whether the scale tilts left, tilts right, or is balanced. The last dataset, Glass, consists of 214 data points and uses 9 attributes to categorize 6 types of glass.

5.2.2 EAWOA-KELM for classification data tasks. In this section, eight KELM-based models are used to classify the seven datasets above, namely KELM, SSA-KELM, GTO-KELM, eWOA-KELM, WOA_LFDE-KELM, MWOA-KELM, MSWOA-KELM, and our proposed EAWOA-KELM. Following the datasets introduced in Section 5.2.1, the experimental parameters are set as follows: in each classification task the dataset is shuffled and split into training and test sets at a 4:1 ratio; the population size for enhancing KELM through the metaheuristic algorithm is set to 40; the dimension is set to 2, representing the two optimized parameters C and S; and the number of iterations is set to 100. Evaluation is based on accuracy, recall, and F1 score, averaged over five runs. The outcomes are detailed in Table 6.

Table 6. Performance metrics of various classifiers on different datasets.
https://doi.org/10.1371/journal.pone.0309741.t006

Table 6 shows that the classification results of KELM improve significantly after optimization by the different improved models, indicating that using metaheuristic algorithms to optimize the (C, S) parameter combination has a marked effect on the KELM model. Across the seven datasets, EAWOA-KELM outperformed WOA-KELM in the classification tasks, with improvements of more than 5% on certain datasets among the models tested. EAWOA-KELM also outperformed the other enhanced algorithms, showcasing the strong global search capability of the EAWOA algorithm and its ability to avoid getting stuck in local optima while tuning the KELM parameters.

5.2.3 Experimental selection of parameters βmin and βmax. This paper introduces a new inertia weight ω into the position update of the enhanced whale optimization algorithm, as outlined in formulas 22–24, with the expression of ω provided in formulas 20 and 21. Altering the values of βmin and βmax in ω leads to different optimization results for the EAWOA algorithm. To quickly determine the optimal value range of these parameters, this paper selects the SDAS dataset from Table 6 as the standard test set. The EAWOA-KELM model was executed 5 times, with the average accuracy, recall, and F1 score serving as the basis for assessment. The detailed results are provided in Table 7.

Table 7. Performance metrics for different βmin and βmax values. https://doi.org/10.1371/journal.pone.0309741.t007

The βmin and βmax values set the lower and upper bounds on the extent of individual position adjustment in the EAWOA algorithm, specifically defining the level of self-preservation.
Hence, these parameters play a crucial role in determining the final classification outcome. To find an effective pairing of βmin and βmax within a practical range, this study enumerates combinations with βmax ranging from 0.6 to 0.9 and βmin from 0.1 to 0.5, in steps of 0.1, giving 20 combinations in total. The impact of βmin and βmax on the results is evident from Table 7: varying these values leads to differences of more than 6% in accuracy, recall, and F1 score. The experiments show that for a fixed βmax, the best result is achieved when βmin is 0.5 less than βmax. Among the combinations, (0.4, 0.9), (0.3, 0.8), and (0.1, 0.6) all excelled on the three indicators, and (0.2, 0.7) outperformed the rest on two indicators while falling just 0.008% below (0.1, 0.7) in F1 score. Overall, (0.4, 0.9) achieved the most favorable outcomes.
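The 20 parameter combinations examined above can be enumerated programmatically. In the sketch below, `evaluate` is a hypothetical stand-in for one averaged EAWOA-KELM scoring pass (e.g. mean F1 over 5 runs on the SDAS set); only the grid itself follows the text.

```python
from itertools import product

def grid_search(evaluate):
    """Enumerate the 20 (beta_min, beta_max) pairs of Table 7 and return the
    pair with the highest score under the supplied evaluation function."""
    beta_maxs = [0.6, 0.7, 0.8, 0.9]          # beta_max range 0.6-0.9
    beta_mins = [0.1, 0.2, 0.3, 0.4, 0.5]     # beta_min range 0.1-0.5
    combos = list(product(beta_mins, beta_maxs))
    assert len(combos) == 20                  # 5 x 4 pairs, as in Table 7
    return max(combos, key=lambda bm: evaluate(*bm))
```

With a real scoring function plugged in, the returned pair would correspond to the best-performing row of Table 7, which in the paper's experiments is (0.4, 0.9).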
https://doi.org/10.1371/journal.pone.0309741.t001 Download: PPT PowerPoint slide PNG larger image TIFF original image Table 2. Descriptions of the selected CEC 2017 benchmark functions. https://doi.org/10.1371/journal.pone.0309741.t002 Download: PPT PowerPoint slide PNG larger image TIFF original image Table 3. Descriptions of the selected CEC 2022 benchmark functions. https://doi.org/10.1371/journal.pone.0309741.t003 Download: PPT PowerPoint slide PNG larger image TIFF original image Fig 5. Images of partial benchmark functions. https://doi.org/10.1371/journal.pone.0309741.g005 5.1.2 Comparison of EAWOA with other metaheuristic algorithms. In order to demonstrate the advantages of the put forward model in this paper, it underwent comparison with three popular metaheuristic algorithms: PSO, GWO, and especially the standard WOA. Running independently 30 times under the parameters specified in Section 5.1.1, each algorithm has its average value and standard deviation documented. Improved optimization and algorithm stability are achieved with lower average value and standard deviation. Table 4 shows the statistic results of each model. According to Table 4, the proposed EAWOA demonstrates the best outcomes than the state-of-the-art methods. Especially on the 10 benchmark functions of CEC 2005, it even achieved optimal values on 7 functions, this indicating its competitive in contrast to other exceptional metaheuristic algorithms. In addition, in the 21 benchmark functions of CEC2005, CEC 2017 and CEC 2022, EAWOA achieved better results than the standard WOA, which fully improves its effectiveness and adaptability. These functions include unimodal, multi-front, high-dimensional characteristics, indicating that in complex problems, the EAWOA algorithm can still converge to a higher accuracy, and can even solve for many functions to the theoretical optimal value. Download: PPT PowerPoint slide PNG larger image TIFF original image Table 4. 
Statistic results between the proposed algorithm and other metaheuristic algorithms. https://doi.org/10.1371/journal.pone.0309741.t004 Simultaneously, the analysis reveals that the standard deviation of EAWOA is consistently lower than that of WOA, with the exception of the F5 test function discussed in this study. In comparison to all algorithms, EAWOA has attained the best standard deviation value on the majority of test functions, especially in the seven functions such as F1, F2, and F3, the standard deviation mean has reached the ideal value of 0. The experimental findings indicate that the EAWOA algorithm exhibits higher convergence stability and better robustness compared to other algorithms such as PSO, GWO, and WOA. 5.1.3 Comparison of EAWOA with other WOA variants. In order to reflect on the advantages of the EAWOA, this part selects WOA and 4 other rencent WOA algorithm variants for comparison: WOA_LFDE [41], eWOA [51], MWOA [52] and MSWOA [53]. Likewise, 30 independent trials were performed on EAWOA and other upgraded WOA, employing the identical parameters as described in Section 5.1.1, and the final averages and standard deviations were calculated. Table 5 presents the comparison results, indicating that EAWOA exhibits favorable performance: With a record of 10 wins, 10 losses, and 1 draw, it ranks the same as WOA_LFDE. Compared to the other three improved models, it has obvious advantages. Compared to EWOA, it won on 9 benchmark functions, lost on 5 benchmark functions, and tied on 7 benchmark functions. Compared to MWOA, it won on 13 benchmark functions, lost on 1 test function, and benchmark on 7 test functions. Compared to MSWOA, it won on 16 benchmark functions, lost on 2 benchmark functions, and tied on 3 benchmark functions. The results above demonstrate that EAWOA outperforms individual improved WOA algorithms more frequently on test functions, and it attains the theoretical optimal values on seven functions, such as F1, F2, and F3. 
When it comes to standard deviation, EAWOA consistently shows smaller values compared to other advanced algorithms on 14 out of 21 functions, highlighting its stability advantage. Download: PPT PowerPoint slide PNG larger image TIFF original image Table 5. Statistical results of various algorithms. https://doi.org/10.1371/journal.pone.0309741.t005 In this experiment, it is apparent that the EAWOA algorithm not only surpasses the standard WOA in the context of global and local search capabilities, but also exhibits considerable benefits over other improved versions of WOA. 5.1.4 Convergence speed analysis. In addition to assessing the fitness value, the EAWOA algorithm’s efficiency can also be evaluated by convergence speed. This section delves into the benefits of its fast convergence speed. Fig 6 offers a comparison between EAWOA and standard WOA, the improved MSWOA and WOA_LFDE. The nonlinear convergence parameters α and ω can skilfully strengthen convergence. Simultaneously, other strategies also perform a key function in finding the global optimum, helping EAWOA quickly approach the optimal value. In contrast to the other three algorithm, the convergence curve of the EAWOA algorithm decreases rapidly, making it possible to quickly reach the theoretical optimal solution for most functions. The trend in Fig 6 shows a significant decrease in the evolution curve of the EAWOA algorithm as the number of iterations increases. It has the ability to rapidly reach the best solution when dealing with numerous basic test functions like F1-F4, including single-mode functions. Even reaching convergence on many functions hundreds of iterations earlier than other algorithms, such as F11 of CEC 2005. Furthermore, EAWOA shows considerable advantages in convergence speed for other multi-modal or mixed functions. 
5.1.1 Algorithm parameter settings and test functions. This paper applied 21 benchmark functions from CEC 2005 [48], CEC 2017 [49] and CEC 2022 [50] to assess the optimization capabilities of EAWOA. These functions are described in Tables 1–3, and images of some of them, covering both unimodal and multimodal cases, are displayed in Fig 5. They represent optimization problems ranging from simple to complex scenarios. To guarantee a fair and objective comparison, the parameters were set as follows: the whale population size and the number of iterations are set to 100 and 1000 respectively, and the dimension settings of each benchmark function are shown in Tables 1, 2 and 3. All models are run independently 30 times to obtain two evaluation indicators: the mean and the standard deviation. Table 1. Descriptions of the selected CEC 2005 benchmark functions. https://doi.org/10.1371/journal.pone.0309741.t001 Table 2.
Descriptions of the selected CEC 2017 benchmark functions. https://doi.org/10.1371/journal.pone.0309741.t002 Table 3. Descriptions of the selected CEC 2022 benchmark functions. https://doi.org/10.1371/journal.pone.0309741.t003 Fig 5. Images of partial benchmark functions. https://doi.org/10.1371/journal.pone.0309741.g005 5.1.2 Comparison of EAWOA with other metaheuristic algorithms. To demonstrate the advantages of the model put forward in this paper, it was compared with three popular metaheuristic algorithms: PSO, GWO, and in particular the standard WOA. Each algorithm was run independently 30 times under the parameters specified in Section 5.1.1, and its average value and standard deviation were documented; lower values of both indicate better optimization accuracy and algorithm stability. Table 4 shows the statistical results of each model. According to Table 4, the proposed EAWOA demonstrates better outcomes than the other methods. On the 10 benchmark functions of CEC 2005 in particular, it achieved the optimal values on 7 functions, indicating its competitiveness against other excellent metaheuristic algorithms. In addition, across the 21 benchmark functions of CEC 2005, CEC 2017 and CEC 2022, EAWOA achieved better results than the standard WOA, which fully demonstrates its effectiveness and adaptability. These functions include unimodal, multimodal, and high-dimensional characteristics, indicating that on complex problems the EAWOA algorithm can still converge with high accuracy, and can even reach the theoretical optimal value on many functions. Table 4. Statistic results between the proposed algorithm and other metaheuristic algorithms.
https://doi.org/10.1371/journal.pone.0309741.t004 Simultaneously, the analysis reveals that the standard deviation of EAWOA is consistently lower than that of WOA, with the exception of the F5 test function discussed in this study. In comparison with all algorithms, EAWOA attained the best standard deviation value on the majority of test functions; on seven functions, such as F1, F2, and F3, the standard deviation even reached the ideal value of 0. The experimental findings indicate that the EAWOA algorithm exhibits higher convergence stability and better robustness than other algorithms such as PSO, GWO, and WOA. 5.1.3 Comparison of EAWOA with other WOA variants. To highlight the advantages of EAWOA, this part selects WOA and 4 other recent WOA variants for comparison: WOA_LFDE [41], eWOA [51], MWOA [52] and MSWOA [53]. Likewise, 30 independent trials were performed on EAWOA and the other upgraded WOA variants, employing the identical parameters described in Section 5.1.1, and the final averages and standard deviations were calculated. Table 5 presents the comparison results, indicating that EAWOA exhibits favorable performance. Against WOA_LFDE it recorded 10 wins, 10 losses, and 1 draw, ranking the same as that algorithm. Compared to the other three improved models, it has obvious advantages. Compared to eWOA, it won on 9 benchmark functions, lost on 5, and tied on 7. Compared to MWOA, it won on 13 benchmark functions, lost on 1, and tied on 7. Compared to MSWOA, it won on 16 benchmark functions, lost on 2, and tied on 3. These results demonstrate that EAWOA outperforms the individual improved WOA algorithms on more test functions, and it attains the theoretical optimal values on seven functions, such as F1, F2, and F3.
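For readers reproducing such tallies, the pairwise win/loss/tie counts above can be computed directly from per-function mean fitness values. A minimal sketch in Python, using hypothetical numbers rather than the paper's actual results:

```python
# Tally pairwise wins/losses/ties between two minimizers from their
# per-function mean fitness values (lower is better).
def tally(means_a, means_b, tol=1e-12):
    wins = losses = ties = 0
    for a, b in zip(means_a, means_b):
        if abs(a - b) <= tol:
            ties += 1          # means agree within tolerance
        elif a < b:
            wins += 1          # algorithm A achieved a lower mean
        else:
            losses += 1
    return wins, losses, ties

# Hypothetical mean values on three benchmark functions.
eawoa_means = [0.0, 1.2e-8, 3.4]
other_means = [0.0, 5.6e-3, 2.9]
print(tally(eawoa_means, other_means))  # counts from EAWOA's perspective
```

A tolerance is used because two runs that both reach the theoretical optimum may differ only by floating-point noise.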
When it comes to standard deviation, EAWOA shows smaller values than the other advanced algorithms on 14 out of 21 functions, highlighting its stability advantage. Table 5. Statistical results of various algorithms. https://doi.org/10.1371/journal.pone.0309741.t005 In this experiment, it is apparent that the EAWOA algorithm not only surpasses the standard WOA in global and local search capabilities, but also exhibits considerable benefits over other improved versions of WOA. 5.1.4 Convergence speed analysis. In addition to the fitness value, the EAWOA algorithm’s efficiency can also be evaluated by convergence speed. This section delves into the benefits of its fast convergence. Fig 6 offers a comparison between EAWOA and the standard WOA, the improved MSWOA, and WOA_LFDE. The nonlinear convergence parameters α and ω effectively strengthen convergence, while the other strategies also play a key role in finding the global optimum, helping EAWOA quickly approach the optimal value. In contrast to the other three algorithms, the convergence curve of the EAWOA algorithm decreases rapidly, making it possible to quickly reach the theoretical optimal solution for most functions. The trend in Fig 6 shows a significant decrease in the evolution curve of the EAWOA algorithm as the number of iterations increases. It can rapidly reach the best solution on many basic test functions such as F1-F4, including unimodal functions, and on many functions, such as F11 of CEC 2005, it even reaches convergence hundreds of iterations earlier than the other algorithms. Furthermore, EAWOA shows considerable advantages in convergence speed on the other multimodal or mixed functions.
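Convergence curves such as those in Fig 6 plot the best-so-far fitness against the iteration number. As an illustration only, the sketch below records such a curve for a toy random-search optimizer on the sphere function; the optimizer, bounds, and settings are stand-ins, not the EAWOA update rules:

```python
import random

def sphere(x):
    # Classic unimodal benchmark: f(x) = sum of squares, minimum 0 at the origin.
    return sum(v * v for v in x)

def best_so_far_curve(obj, dim=5, iters=100, seed=0):
    """Record the best-so-far fitness at each iteration (toy random search)."""
    rng = random.Random(seed)
    best = float("inf")
    curve = []
    for _ in range(iters):
        x = [rng.uniform(-5, 5) for _ in range(dim)]
        best = min(best, obj(x))   # keep the best value seen so far
        curve.append(best)
    return curve

curve = best_so_far_curve(sphere)
# A best-so-far curve is monotone non-increasing by construction.
assert all(a >= b for a, b in zip(curve, curve[1:]))
```

Plotting one such curve per algorithm on the same axes yields the kind of comparison shown in Fig 6.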
By comparing with Table 5, it is evident that EAWOA not only exhibits faster convergence but also achieves high optimization accuracy, demonstrating fast convergence and strong global search capabilities. Although WOA_LFDE obtained the same ranking as EAWOA in the previous experiments, it exhibited slower convergence than the EAWOA algorithm. Fig 6. Convergence curves of various improved WOA algorithms. https://doi.org/10.1371/journal.pone.0309741.g006 The experiments in Sections 5.1.1 to 5.1.4 show that the EAWOA model proposed in this paper has considerable advantages over some current metaheuristic algorithms, the standard WOA, and some excellent variants of WOA. In addition, EAWOA also reaches convergence faster during the iterative process. 5.2 Experiment about EAWOA-KELM 5.2.1 Classification datasets. This paper uses seven classification datasets to validate the advantages of the improved KELM: SDAS [54], NPHA [55], TME [56], SRP [57], Abalone [58], Balance-Scale [59] and Glass [60]. SDAS is a dataset related to students’ performance in higher education. It contains 36 features and 3 classes; the features include gender, academic performance, parental qualification, etc., while the classes are dropout, enrolled, and graduate. To ensure an even distribution of data, we took 500 samples for each category of students, totaling 1500 samples. NPHA is a dataset that predicts the number of doctors an elderly person visits within a year. It has 714 examples, including 14 features and 3 categories. TME is a dataset that classifies music based on emotions. It contains 400 examples and uses 50 features to divide music into 4 emotional types: happy, sad, angry, and relaxed. SRP is a dataset for accent detection and recognition; it contains 329 individual English words spoken by native speakers originating from six different nations.
Abalone is a dataset that predicts the age of abalone based on 8 characteristics. To make the classification dataset more uniform, we selected 100 samples for each category from 5 to 14 years old, forming a total of 1000 samples. The Balance-Scale dataset describes the balance results of a scale; it consists of 625 data points, each with 4 attributes and a category indicating whether the scale is tilted to the left, tilted to the right, or balanced. The last dataset, Glass, consists of 214 data points and uses 9 attributes to categorize 6 different types of glass. 5.2.2 EAWOA-KELM for classification data tasks. In this section, eight KELM-based models are used to classify the seven datasets mentioned above, namely KELM, SSA-KELM, GTO-KELM, eWOA-KELM, WOA_LFDE-KELM, MWOA-KELM, MSWOA-KELM and our proposed EAWOA-KELM. For the datasets introduced in Section 5.2.1, the experimental parameters are set as follows: in each classification task, the dataset is shuffled and then split into a training set and a test set at a 4:1 ratio. The population size for enhancing KELM through the metaheuristic algorithms is set to 40 and the dimension to 2, representing the two optimized parameters C and S; the number of iterations is set to 100. Evaluation was based on accuracy, recall, and F1 score, with the average value computed over five runs. The outcomes are detailed in Table 6. Table 6. Performance metrics of various classifiers on different datasets. https://doi.org/10.1371/journal.pone.0309741.t006 From Table 6, it can be seen that the classification results of KELM on the datasets improve significantly after optimization by the different improved models. This indicates that using metaheuristic algorithms to optimize the (C, S) parameter combination has a significant effect on the improvement of KELM models.
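To illustrate how a metaheuristic can score a candidate (C, S) pair, the sketch below implements a minimal RBF-kernel KELM with the standard closed form β = (Ω + I/C)⁻¹T and returns test accuracy as the fitness. The toy two-blob data, the function names, and the specific kernel form exp(−‖a−b‖²/S) are illustrative assumptions, not the paper's exact setup:

```python
import numpy as np

def rbf_kernel(A, B, s):
    # Gaussian kernel K(a, b) = exp(-||a - b||^2 / s); assumed kernel form.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / s)

def kelm_fitness(params, Xtr, ytr, Xte, yte):
    """Fitness of a (C, S) pair: test accuracy of a kernel ELM classifier.
    Standard KELM closed form: beta = (Omega + I/C)^-1 T."""
    C, S = params
    T = np.eye(ytr.max() + 1)[ytr]                      # one-hot targets
    omega = rbf_kernel(Xtr, Xtr, S)                     # training kernel matrix
    beta = np.linalg.solve(omega + np.eye(len(Xtr)) / C, T)
    pred = rbf_kernel(Xte, Xtr, S) @ beta               # test outputs
    return (pred.argmax(1) == yte).mean()

# Hypothetical toy data: two well-separated Gaussian blobs.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (40, 2)), rng.normal(4, 1, (40, 2))])
y = np.array([0] * 40 + [1] * 40)
idx = rng.permutation(80)
tr, te = idx[:64], idx[64:]                             # 4:1 train/test split
acc = kelm_fitness((10.0, 2.0), X[tr], y[tr], X[te], y[te])
```

A metaheuristic such as EAWOA would repeatedly call `kelm_fitness` with different candidate (C, S) vectors and keep the best-scoring pair.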
Across the seven datasets, EAWOA-KELM outperformed WOA-KELM in the classification tasks, showing an improvement of more than 5% on certain datasets among the nine models tested. EAWOA-KELM also outperformed the other enhanced algorithms, showcasing the strong global search capability of the EAWOA algorithm in this study and its ability to avoid getting stuck in local optima while tuning the KELM parameters. 5.2.3 Experimental selection of parameters βmin and βmax. This paper introduces a new inertia weight ω into the position update of the enhanced whale optimization algorithm, as outlined in formulas 22-24, with the expression of ω provided in formulas 20 and 21. Altering the values of βmin and βmax in ω leads to different optimization results for the EAWOA algorithm. To quickly determine the optimal value range of the parameters βmin and βmax, this paper selects the SDAS dataset from Table 6 as the standard test dataset. The EAWOA-KELM model was executed 5 times, with the average values of accuracy, recall, and F1 score serving as the basis for assessment. The detailed results are provided in Table 7. Table 7. Performance metrics for different βmin and βmax values. https://doi.org/10.1371/journal.pone.0309741.t007 The βmin and βmax values set the lower and upper bounds on the extent of an individual’s position adjustment in the EAWOA algorithm, that is, the degree to which the current position is preserved. Hence, these parameters play a crucial role in determining the final classification outcome. To discover a more effective pairing of βmin and βmax within a practical range, this study evaluates the two parameters jointly: with βmax ranging from 0.6 to 0.9 and βmin from 0.1 to 0.5, 20 combinations were formed, covering intervals of varying size between 0 and 1.
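Assuming both parameters are varied in steps of 0.1 (which is consistent with the stated ranges producing exactly 20 pairs), the candidate combinations can be enumerated as follows; the step size is an inference, as the paper's exact grid is not spelled out here:

```python
# Enumerate the 20 candidate (beta_min, beta_max) pairs described above:
# beta_max in {0.6, 0.7, 0.8, 0.9}, beta_min in {0.1, ..., 0.5}, step 0.1.
beta_maxs = [0.6, 0.7, 0.8, 0.9]
beta_mins = [0.1, 0.2, 0.3, 0.4, 0.5]
pairs = [(lo, hi) for hi in beta_maxs for lo in beta_mins]
assert len(pairs) == 20  # 4 x 5 grid

# Each pair bounds the inertia weight omega used in the position update
# (formulas 20-21 in the paper). Interval widths beta_max - beta_min vary
# between 0.1 and 0.8 on this grid.
widths = sorted({round(hi - lo, 1) for lo, hi in pairs})
```

Running the EAWOA-KELM model on each of these pairs and averaging over several runs reproduces the protocol behind Table 7.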
The impact of the βmin and βmax values on the results is evident from the findings in Table 7: variations in these values lead to differences of more than 6% in accuracy, recall, and F1 score. The experiments show that when βmax is fixed, the best result is achieved when βmin is 0.5 less than βmax. Within this group, (0.4, 0.9), (0.3, 0.8), and (0.1, 0.6) all excelled in the three indicators, while (0.2, 0.7) outperformed the rest in two indicators and is just 0.008% lower than (0.1, 0.7) in F1 score. The most favorable outcomes were achieved by (0.4, 0.9) among these combinations.
6 Conclusions and future work The kernel extreme learning machine holds significant importance in the realm of machine learning and is extensively employed for data classification. To optimize its efficiency in solving classification problems, an innovative model, EAWOA, is proposed in this paper. Multiple strategies were used in EAWOA, including innovative T-distribution perturbations, nonlinear parameters, a novel Levy flight, and a strategy of surrounding prey guided by 3 excellent whales. This improved algorithm successfully addresses the problems inherent in the original WOA. Based on 21 benchmark functions and 7 classification datasets, the experimental results have demonstrated the superiority of the EAWOA and EAWOA-KELM models in global search, convergence speed, classification accuracy, and other aspects. By fusing the whale optimization algorithm with the kernel extreme learning machine, this technique elevates data classification accuracy and efficiency. Despite the positive experimental results demonstrating the effectiveness of EAWOA in optimizing KELM, there are still some limitations. First, EAWOA needs to be reset after each optimization process, despite its ability to potentially optimize parameters for a wider range of machine learning classifiers. Second, the incorporation of extra Levy random flights and adaptive factors in EAWOA increases its complexity, leading to a higher demand for computing resources and longer execution times.
Hence, further research is essential to address the shortcomings of this model. EAWOA will be employed in upcoming studies to enhance the performance of additional classifiers. Future work will also explore ways to decrease the computational complexity of the model in order to reduce the overall running time. Moreover, EAWOA-KELM can also make more flexible adjustments to deal with constrained optimization problems. Acknowledgments I express my sincere gratitude to the editor and reviewers for their meticulous review process and invaluable suggestions that have been instrumental in enhancing the quality of this work. TI - Optimizing Kernel Extreme Learning Machine based on a Enhanced Adaptive Whale Optimization Algorithm for classification task JO - PLoS ONE DO - 10.1371/journal.pone.0309741 DA - 2025-01-03 UR - https://www.deepdyve.com/lp/public-library-of-science-plos-journal/optimizing-kernel-extreme-learning-machine-based-on-a-enhanced-cfUgVVaDbM SP - e0309741 VL - 20 IS - 1 DP - DeepDyve ER -