I. Introduction

With the continuous development of medical informatics, the amount of medical data is growing rapidly. Medical data is a general term for the data and information generated in fields such as consultation services, disease prevention, and health checkups. These data mainly include electronic medical records, medical images, laboratory data, sign indicator data, and personal health data. Medical data are a crucial diagnostic basis for doctors, providing comprehensive treatment clues and supporting more accurate diagnoses. Disease diagnosis is the cornerstone of prevention and treatment: through medical data analysis and early symptom detection, the type, characteristics, and severity of a disease can be determined, providing the basis for early intervention and treatment. Diagnostic accuracy directly affects the success rate of disease treatment. Improving this accuracy not only increases patients' cure rate, survival rate, survival time, and quality of life but also reduces their medical costs.

However, current medical data are characterized by large volume, diverse types, high dimensionality, high value but low value density, and real-time requirements [1]. These challenges lead to high time costs, low diagnostic accuracy, and a reliance on empirical knowledge in disease diagnosis and research [2], hindering improvements in patient recovery, survival, and quality of life as well as reductions in healthcare costs [3]. In this context, integrating machine learning techniques into medical diagnosis is emerging as a significant research trend [4, 5]. Nonetheless, raw medical data contain numerous irrelevant and redundant features [6], which not only obstruct data analysis but also lead to the ’curse of dimensionality’ [7]. Consequently, effectively extracting the essential information from raw data is important for enhancing both the accuracy and efficiency of diagnosis.

Feature selection (FS) has long been a critical research domain in machine learning. The objective of FS is to identify the most relevant and effective feature subsets [8, 9]. FS is particularly important in medical data processing, where it helps select the most representative and useful features, thereby simplifying the model and enhancing the efficiency and accuracy of data processing. Additionally, FS improves model interpretability, enabling doctors to understand the decision-making process, which in turn increases their trust in the model. Finally, robust models can be established by eliminating irrelevant and redundant features [10]. However, searching the feature subspace for useful feature subsets is an NP-hard optimization problem [11–13]. Search methods can be broadly categorized into three types: complete search algorithms, heuristic search algorithms, and random search algorithms [14]. The first type is rarely used because it requires considerable computing power and is strongly affected by changes in data size. Heuristic search algorithms generally have moderate search capability; they are prone to becoming trapped in local optima and cannot effectively handle the combinatorial explosion of the feature subset solution space [15].
Random search algorithms, represented by metaheuristic algorithms, use stochastic methods to obtain feature subsets, allowing a larger search space to be explored and local optima to be avoided more effectively [16, 17]. Metaheuristics are primarily divided into single-solution-based and population-based algorithms [18]. The former optimizes a single solution, whereas the latter maintains a group of solutions, termed a ’population’, in each iteration; this approach is more effective at avoiding local optima [19]. Population-based metaheuristics can be further divided into evolution-based algorithms, human-based algorithms, physics-based algorithms, sports-based algorithms, light-based intelligent algorithms, and swarm intelligence algorithms [20, 21]. Inspired by natural biological populations, swarm intelligence algorithms offer innovative solutions for subspace searching [22]. Key examples include particle swarm optimization (PSO) [23], ant colony optimization (ACO) [24], the cuckoo optimization algorithm (COA) [25], the bat algorithm (BA) [26], the gray wolf optimizer (GWO) [27], the whale optimization algorithm (WOA) [28], the salp swarm algorithm (SSA) [29], the nutcracker optimizer algorithm (NOA) [30], manta ray foraging optimization (MRFO) [31], the African vulture optimization algorithm (AVOA) [32], and Harris hawks optimization (HHO) [33]. These algorithms process medical data effectively via distributed computing, and their information-sharing mechanisms significantly improve model efficiency and adaptability. Their inherent flexibility and robustness ensure stable performance even in the presence of individual errors or failures. Importantly, swarm intelligence algorithms excel at preventing premature convergence and achieve high optimization precision through collaborative decision-making and strategic search methodologies.

Swarm intelligence algorithms have achieved significant success in medical data FS. However, these algorithms find it difficult to balance exploration and exploitation, maintain population diversity, ensure convergence, and tune parameters [34, 35]. Moreover, many studies that apply swarm intelligence algorithms to medical data feature selection address simple and limited problems, targeting only specific diseases or datasets; their limited generalizability is poorly suited to today's rapidly evolving medical data needs. In addition, some algorithms are inefficient on large-scale data. Therefore, given the characteristics of medical data, such as the variability and instability of the data and feature volume [36], more flexible and adaptive processing methods are needed.

The recently introduced mountain gazelle optimizer (MGO) is parameter-free and is therefore unaffected by parameter settings [37–39]. In addition, the algorithm balances exploration and exploitation well by using four different mechanisms at all optimization stages. Moreover, because the MGO employs many solution vectors, it has an excellent ability to escape from local optima and can explore the entire optimization space. According to the experimental results of Abdollahzadeh et al., the MGO is strong at solving continuous problems, performing well as both the population size and the dimensionality change. However, the MGO still has limitations in terms of solution diversity, local search ability, and escaping local optima.
Moreover, the current MGO is mainly used to solve continuous problems, and no binary version exists for feature selection. This situation prompted us to improve the MGO and apply it to feature selection tasks. In this study, we first propose an improved mountain gazelle optimizer named IMGO, which uses an iterative chaotic map with infinite collapses (ICMIC) to initialize the gazelle population, uses nonlinear factors to control the coefficient vectors, includes a spiral perturbation mechanism to perturb the positions of individuals, and performs a deep search of the neighborhood of the optimal individual. To verify the improved performance of this algorithm, it is evaluated on 23 benchmark functions, and its superiority is demonstrated in comparison experiments with the original algorithm and 8 well-known or recently proposed metaheuristic algorithms, namely, the WOA, PSO, GWO, the marine predator algorithm (MPA) [40], the NOA, the Kepler optimization algorithm (KOA) [41], the SSA, and the BA. A binary version of IMGO (BIMGO) is then developed to handle the feature selection task. The proposed BIMGO algorithm is evaluated on 16 medical datasets and compared with the binary versions of the eight metaheuristic algorithms mentioned above. The experimental results show that BIMGO outperforms the comparison algorithms in terms of the fitness value, the number of selected features, and convergence. In addition, the Wilcoxon rank-sum test at the 5% significance level verifies that BIMGO performs significantly better than the competing algorithms on most of the datasets.

The main contributions of this paper are summarized as follows:

1) To address the low population diversity of the original algorithm, ICMIC chaotic mapping is used instead of random initialization in the population initialization stage, improving the diversity of the population.

2) A nonlinear control factor is introduced to enhance the global search in the early stage and the local search in the late stage of the algorithm, improving its search efficiency.

3) To enhance the local search ability of the algorithm and escape local optima, a spiral perturbation mechanism is adopted to perturb the positions of individual gazelles during the iteration process.

4) The neighborhood of the optimal individual is searched at the end of each iteration, effectively enhancing the exploitation capability of the algorithm.

5) A binary version of the IMGO is developed to apply the proposed algorithm to the feature selection task, making it well suited to feature selection in medical data.

The remainder of this paper is structured as follows. Section II describes related progress on using swarm intelligence algorithms to analyze medical data. The MGO algorithm and the proposed IMGO and BIMGO algorithms are presented in Section III. In Section IV, the IMGO is evaluated on 23 benchmark functions, and the applicability of the BIMGO for extracting effective features from medical data is verified. The last section presents the conclusions and future research directions.

II. Related works

Swarm intelligence algorithms have shown significant advantages in feature selection tasks, as represented by medical data, through mechanisms such as distributed computing, information sharing, and collaborative decision making.
This section reviews related work on the use of swarm intelligence algorithms for medical data analysis, ranging from early research on identifying important disease factors to more recent research on medical diagnosis aids. Chen et al. [42] presented a combination of PSO and the 1-nearest neighbor (1-NN) mechanism, which can effectively identify important factors in patients with obstructive sleep apnea (OSA). Lin et al. [43] combined the endocrine-based PSO algorithm with the artificial bee colony (ABC) algorithm and used support vector machines (SVMs) to perform classification on specific medical datasets. Brahim Sahmad et al. [44] integrated the binary firefly algorithm, quickly identifying high-quality solutions and significantly improving classification accuracy on medical datasets. Anter et al. [45] introduced a hybrid crow search optimization algorithm that combines chaos theory with the fuzzy c-means algorithm; its performance on medical datasets demonstrated its effectiveness and stability. Sahlol et al. [46] combined fractional-order calculus with the marine predator algorithm (MPA), achieving high classification accuracy on a COVID-19 dataset. Asghari Varzaneh et al. [47] applied the horse herd optimization algorithm (HOA) to predict the intubation risk of hospitalized patients with COVID-19, efficiently identifying critical predictive factors. Nadimi Shahraki et al. [48] deployed a binary version of the quantum-based avian navigation optimizer algorithm (BQANA) with a threshold method for high-dimensional medical datasets; the method outperforms nine well-known binary metaheuristic algorithms in optimal feature subset selection. Elgamal et al. [49] suggested an improved reptile search algorithm (IRSA) incorporating chaos theory and simulated annealing, which effectively improves the search capability and performs better than the original algorithm and comparison methods on medical datasets. Wang et al. [50] introduced SVM-MPA, a combination of the MPA with an SVM, and constructed an effective subject-independent anterior cruciate ligament (ACL) defect detection model to provide an accurate preoperative auxiliary testing method for the clinical diagnosis of ACL deficiency. Mohammad H. Nadimi-Shahraki [51] enhanced the WOA with new mechanisms and search strategies, proposing the E-WOA and its binary version, the BE-WOA, which were verified on medical disease datasets; the test results showed that the E-WOA outperformed the latest optimization methods, and the algorithm was also successfully applied to diagnose COVID-19, providing a feasible model for diagnostic medical treatment. Braik et al. [52] proposed three methods based on the Capuchin search algorithm, namely the ECSA, PCSA, and SCSA, combined their binary versions with a k-nearest neighbor (KNN) classifier, and demonstrated their performance on medical datasets. To address the challenge of rapidly increasing glaucoma infections, Singh et al. [53] proposed a hybrid algorithm based on emperor penguin optimization and bacterial foraging optimization to extract effective features from retinal fundus baseline images, minimizing the number of features while improving classification accuracy; this method can assist overworked ophthalmologists and prevent individuals from losing vision.
To address the slow convergence and the imbalance between exploration and exploitation of the hunger games search (HGS), Hashim et al. [54] proposed an improved version named mHGS, which solves the feature selection problem well on Parkinson's disease phonation datasets. Neggaz et al. [55] proposed MRFOSCA, an enhanced variant of the manta ray foraging optimizer (MRFO) that uses trigonometric operators inspired by the sine cosine algorithm (SCA), effectively alleviating convergence to local minima; this approach was used to solve feature selection problems represented by medical datasets.

These studies show that various swarm intelligence algorithms have been used to select important features from medical data to assist in data analysis, as summarized in Table 1. However, the aforementioned swarm intelligence algorithms have some limitations when dealing with medical datasets. First, some studies target only specific, single medical datasets and lack the universality needed for general medical data analysis. Second, some studies focus on low-dimensional datasets and may not be able to analyze current high- and ultrahigh-dimensional medical data. Third, there are issues of unbalanced search strategies and low population diversity. Finally, some studies involve medical data only incidentally rather than focusing on them specifically, and the evaluation criteria used do not account for the characteristics of medical data.

Table 1. Related research. https://doi.org/10.1371/journal.pone.0307288.t001

To address these issues, in this study, the BIMGO algorithm is developed for feature selection on medical data, aiming to improve the robustness of the algorithm, the diversity of the solutions, and the balance of the search strategies. The algorithm is applied to feature selection tasks on medical data of multiple dimensionalities.

III. Proposed algorithm

A. Mountain gazelle optimizer

The MGO is a novel swarm intelligence algorithm [39] inspired by the social behavior and group life of mountain gazelles living around the Arabian Peninsula. Mountain gazelles form three groups during their lives: female (maternity) herds, bachelor herds of young males, and territorial male herds [56]. During the optimization process, every gazelle can become a member of any of the three groups. Because young male gazelles are not yet mature or strong enough to mate or to control a female herd, selecting them as the search population (one-third of the population in the MGO) incurs minimal cost. In the MGO, the adult male gazelle holding a territory represents the best global solution. The MGO is mathematically modeled using four mechanisms, described as follows.

1) Solitary territorial males. When male gazelles age and become sufficiently muscular, they choose areas far from other territories to establish and protect their own territory. Young male gazelles engage in fights when they attempt to challenge territorial males for territory or female gazelles. Adult territorial males are influenced by young males and the current optimal individual. The territory of an adult male is modeled by Eq (1): (1) where malegazelle represents the location vector of the best individual, and ri1 and ri2 are random integers with values of either 1 or 2. BH is the coefficient vector of the young male herd, influenced by search agents and nonsearch agents, and is determined by Eq (2).
The value of F, denoting the weight affected by the iteration, is calculated by Eq (3). (2) (3) In Eq (2), ra represents the interval in which young individuals are present, and Xra is a random solution (a young male) within this interval. Mpr is the average of randomly selected search agents, r1 and r2 are random values ranging from 0 to 1, and N is the population size. In Eq (3), N1 is a random number drawn from a standard normal distribution in the problem dimensions, exp is the exponential function, Iter denotes the current iteration, and MaxIter is the maximum number of iterations. To enhance the search capability, a coefficient vector Cofi is proposed: (4) where r3 and r4 are random parameters ranging from 0 to 1, N2, N3, and N4 are random numbers distributed within the dimensions of the problem, and a is a control parameter determined by Eq (5): (5) Eq (5) shows that a depends on the iteration process, and its value ranges over [−2, −1).

2) Maternity herds. The continuation of the life cycle of the mountain gazelle group depends on male gazelles reproducing with individuals in the female herd. Male gazelles play an important role when young males attempt to mate with females or when females give birth. Female herds are influenced by the best individual of the current iteration, randomized search agents, and young males; this behavior is expressed by Eq (6): (6) where BH is calculated using Eq (2), Cof1,i and Cof2,i are random coefficient vectors calculated through Eq (4), ri3 and ri4 are random integers with values of either 1 or 2, and Xrand is the position vector of a random individual.

3) Bachelor male herds. When young male gazelles grow into adults, they establish their own territories and attempt to control female gazelles. During this process, young males engage in violent fights with older males. This behavior is expressed by Eq (7): (7) where X(t) is the position of the individual in the current iteration, ri5 and ri6 are random integers with values of either 1 or 2, and r6 in Eq (8) is a random number ranging from 0 to 1. D denotes the coefficient vector influenced by the current and optimal individuals, calculated by Eq (8): (8)

4) Migration to search for food. The maintenance of the gazelle life cycle depends on food consumption. Gazelles continuously search for food and migrate, leveraging their impressive running and jumping abilities. This process is expressed by Eq (9): (9) where ub and lb represent the upper and lower limits of the problem, respectively, and r7 is a random value between 0 and 1.

All four mechanisms are applied to every individual to produce a new generation, which is then added to the population; this reproduction step effectively enlarges the population, similar to duplicating it. When each generation is complete, the whole population is sorted in ascending order of fitness. In general, the best individuals are retained in the group, while the poor individuals are removed.
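To make this generational structure concrete, the following minimal Python sketch reproduces the flow described above. It is not the authors' implementation: the four mechanism updates of Eqs (1), (6), (7), and (9) are left as placeholder callables (their exact formulas are given in the cited MGO paper), and the function name, signature, and control-factor argument `a` are illustrative assumptions.

```python
import numpy as np

def mgo_generation(population, fitness_fn, mechanisms, a, lb, ub):
    """One MGO generation as described above: every gazelle produces new candidates
    through the four mechanisms (TSM, MH, BMH, MSF), the offspring are added to the
    population, the whole group is sorted by fitness, and only the best N survive.
    `mechanisms` are placeholder callables standing in for Eqs (1), (6), (7), and (9);
    `a` is the control factor of Eq (5). Signature and names are illustrative only."""
    n = population.shape[0]
    fitness = np.array([fitness_fn(x) for x in population])
    best = population[np.argmin(fitness)]                  # territorial adult male
    offspring = [np.clip(m(x, best, population, a), lb, ub)
                 for x in population for m in mechanisms]  # new generation
    merged = np.vstack([population] + offspring) if offspring else population
    order = np.argsort([fitness_fn(x) for x in merged])    # ascending order of fitness
    return merged[order[:n]]                               # best individuals are retained
```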
B. Improved mountain gazelle optimizer

The MGO adopts four different mechanisms to achieve high accuracy and fast convergence. However, it still has shortcomings, such as an imbalance between exploration and exploitation and a tendency to fall into local optima. To improve the capability of the MGO and achieve better feature selection in medical data analysis, this paper presents an improved mountain gazelle optimizer. The IMGO includes four innovations. First, ICMIC mapping is used as the initialization method, ensuring the diversity of the population and improving the early exploration capability of the algorithm. Second, the control factor a in the coefficient vector is replaced with a nonlinear factor that balances the global and local search capacities and improves search efficiency. Third, a spiral perturbation mechanism is utilized to improve the local search capability. Finally, a search is performed in the optimal individual's neighborhood to further enhance exploitation.

1) Initialization with ICMIC mapping. In the MGO, the population is initialized randomly. This strategy limits the algorithm's performance because it does not ensure the diversity of solutions and may cause the algorithm to fall into local optima. To address this problem, we use chaotic mapping as the initialization method. The initial population obtained in this way covers the entire solution space, enhancing the global search capability. Additionally, a population created through chaotic mapping is uniformly distributed, reducing the likelihood of becoming trapped in local optima and contributing to a faster convergence speed [57, 58]. Logistic mapping and tent mapping are the most common chaotic maps; however, they have a limited number of folds in their iterative regions. In contrast, the ICMIC [59] is an iterative chaotic map with infinite folds. Moreover, the ICMIC has high Lyapunov exponents and therefore stronger chaotic properties than other chaotic mappings, along with the advantages of initial-value sensitivity and a uniform distribution. Therefore, the ICMIC is used to initialize the population in this paper. The ergodicity of the ICMIC overcomes the shortcomings of traditional initialization by providing better diversity in the initial population, preventing premature convergence, and improving the accuracy and convergence of global optimization. Eq (10) gives the mathematical expression of the ICMIC: (10) where Xi ∈ [−1, 0) ∪ (0, 1] is the i-th gazelle individual and α ∈ (0, 1) is the control parameter. A good chaotic sequence can only be obtained when α > 0.6; hence, we set α = 0.7, and Eq (10) is then replaced by Eq (11): (11) Algorithm 1 details the population initialization.

Algorithm 1. Population initialization using ICMIC mapping.
ε = 0.7π; % set the parameter value
generate a random number ω0 within the range [−1, 0) ∪ (0, 1];
for each gazelle i from 1 to N do
  ; % apply the ICMIC chaotic map
  ; % map the chaotic value to the gazelle position
end for
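The following Python sketch illustrates one way to realize Algorithm 1. Because Eqs (10) and (11) and the update lines of Algorithm 1 are not reproduced in the text, the sketch assumes the standard ICMIC form x_{k+1} = sin(ε / x_k) with ε = 0.7π (matching the parameter set in Algorithm 1); the per-dimension vectorization and the scaling of the chaotic values from [−1, 1] to the problem bounds are likewise illustrative choices, not the authors' exact formulation.

```python
import numpy as np

def icmic_init(n_gazelles, dim, lb, ub, eps=0.7 * np.pi, seed=None):
    """Population initialization with the ICMIC chaotic map (sketch of Algorithm 1).
    The map x_{k+1} = sin(eps / x_k) is the standard ICMIC; mapping the chaotic
    values to [lb, ub] is an assumed step, as that line of Algorithm 1 is not
    reproduced in the extracted text."""
    rng = np.random.default_rng(seed)
    w = rng.uniform(-1.0, 1.0, size=dim)        # ω0 in [-1, 0) ∪ (0, 1]
    w[w == 0.0] = 0.5                           # avoid division by zero
    population = np.empty((n_gazelles, dim))
    for i in range(n_gazelles):
        w = np.sin(eps / w)                     # ICMIC chaotic iteration
        population[i] = lb + (w + 1.0) / 2.0 * (ub - lb)   # scale to [lb, ub]
    return population

# Example: 30 gazelles in a 10-dimensional search space bounded by [-100, 100].
pop = icmic_init(30, 10, lb=-100.0, ub=100.0, seed=1)
```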
2) Nonlinear control factor. Cofr is a coefficient vector randomly selected in each iteration; it participates in three of the four mechanisms, namely the TSM, MH, and BMH, and its value is largely determined by the control factor a in Eq (5). According to Eq (5), the control factor increases linearly from −2 to −1, as shown in Fig 1(A). If the algorithm becomes stuck at a nonideal point in the initial phase, this constant change rate makes it prone to premature convergence to local optima. Therefore, we introduce a nonlinear control factor to strengthen the global search during the early stages, aiming for a comprehensive search of the solution domain, and to strengthen the local search in the later stages, seeking better solutions within the known range. The control factor a is accordingly changed from Eq (5) to Eq (12): (12) where t represents the current iteration and MaxIter is the maximum number of iterations.

Fig 1. Iterative chart of the control factor. https://doi.org/10.1371/journal.pone.0307288.g001

As shown in Fig 1(B), the initial slope of the nonlinear control factor is large, so it changes quickly early on, which is more favorable than the linear factor for widening the search range and enhancing the global search ability in the initial iterations. When approaching MaxIter, the linear factor still changes at a uniform rate, so the search range cannot be effectively controlled and the convergence ability of the algorithm is reduced; the nonlinear control factor, by contrast, changes slowly, effectively controlling the search range, enhancing the local search ability, accelerating convergence, and improving the quality of the feasible solutions found.

3) Spiral perturbation mechanism. New individuals are continuously added during the iteration process through the four mechanisms. However, the original algorithm does not update the positions of existing gazelles, which limits its local search capability and prevents fast convergence. Therefore, to increase the local search capacity, a spiral perturbation mechanism for gazelle individuals is introduced. The spiral search strategy was proposed in the WOA to model how whale populations encircle their prey; during the iteration process, individual whales use this strategy to update their positions, increasing the diversity of individuals while maintaining the convergence speed of the algorithm. Inspired by this spiral search, in this paper the current individuals are perturbed after the gazelle population is updated; the fitness values of the perturbed individuals are then compared with those before the perturbation, and the better individuals are retained. The perturbation is guided by the global optimal individual of the current iteration and follows the spiral search of Eq (13): (13) where malegazelle is the optimal individual, c is the spiral shape constant, and l is the path coefficient, a random number within the range [−1, 1]. By introducing the spiral perturbation mechanism and keeping the elite individuals among those before and after the perturbation, the local search capability of the current individual and the convergence speed are effectively enhanced. This approach also increases the diversity of individuals and improves the overall search efficiency.
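A brief sketch of the perturbation-and-selection step is given below. Since Eq (13) itself is not reproduced in the extracted text, the WOA-style logarithmic spiral that the mechanism is said to be inspired by, X' = |malegazelle − X| · exp(c·l) · cos(2πl) + malegazelle, is used as a stand-in; the function name and the greedy elite selection follow the description above and assume a minimization problem.

```python
import numpy as np

def spiral_perturbation(population, fitness, best, fitness_fn, c=1.0):
    """Spiral perturbation with elite selection, as described for the IMGO.
    The WOA-style spiral is used in place of Eq (13); c is the spiral shape
    constant and l is the path coefficient drawn from [-1, 1]."""
    new_pop, new_fit = population.copy(), fitness.copy()
    for i, x in enumerate(population):
        l = np.random.uniform(-1.0, 1.0)            # path coefficient
        dist = np.abs(best - x)                     # distance to the optimal individual
        candidate = dist * np.exp(c * l) * np.cos(2.0 * np.pi * l) + best
        f = fitness_fn(candidate)
        if f < new_fit[i]:                          # keep the better of the two positions
            new_pop[i], new_fit[i] = candidate, f
    return new_pop, new_fit
```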
4) Optimal individual neighborhood search. The gazelle algorithm uses four mechanisms, the TSM, MH, BMH, and MSF, for optimization, primarily strengthening its global search capability; however, its limited local search ability makes convergence difficult. To achieve a better balance between exploration and exploitation throughout all optimization stages, and given the high probability that globally optimal solutions lie within the optimal individuals and their neighborhoods, we introduce an optimal individual neighborhood search strategy. This method records the optimal gazelle and the suboptimal gazelle in each iteration and defines the area between them as the optimal individual's neighborhood. A local search is then conducted within this neighborhood, and a random individual BNt within the neighborhood is generated by: (14) where random represents a randomly selected gazelle within the neighborhood, ubbn and lbbn are the upper and lower bounds of the neighborhood, respectively, and t is the current iteration number. Finally, the fitness value of the individual obtained from Eq (14) is compared with the current best value to update the optimal individual malegazelle. The optimal individual neighborhood search strategy strengthens the local search around the optimal and suboptimal individuals, effectively improving the exploitation and convergence abilities of the algorithm.

5) Algorithm framework. The process of the proposed IMGO is shown in Fig 2. Initially, the IMGO employs the ICMIC to initialize the gazelle population. In each iteration, the nonlinear control factor is calculated, and the IMGO updates the population in parallel using the four mechanisms. A spiral perturbation is then applied to adjust the positions of the gazelle individuals. Finally, the algorithm searches the optimal individual's neighborhood to update the optimal individual. By improving the local exploitation capability and balancing the global and local searches, the algorithm effectively enhances the convergence speed and overall performance. The pseudocode of the IMGO is given in Algorithm 2.

Fig 2. The flowchart of IMGO. https://doi.org/10.1371/journal.pone.0307288.g002

Algorithm 2. Pseudocode of IMGO.
1: Initialize population size N and maximum number of iterations T;
2: Set all parameters;
3: Initialize the population based on Algorithm 1; % initialization using ICMIC
4: Calculate the fitness of each gazelle;
5: While (current iteration t < T) do
6:   Calculate the nonlinear control factor a using Eq (12);
7:   Update the population with the TSM, MH, BMH, and MSF mechanisms;
8:   Apply the spiral perturbation of Eq (13) and retain the better of each individual before and after perturbation;
9:   Perform the optimal individual neighborhood search of Eq (14) and update malegazelle;
10:  Sort the population and keep the N best gazelles;
11: End While
12: Return the best gazelle malegazelle;
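As a complement to Algorithm 2, the Python sketch below shows how the pieces above fit together in a single iteration loop, reusing the icmic_init and spiral_perturbation sketches. It is only an illustration under stated assumptions: the four MGO mechanism updates remain placeholder callables, the nonlinear schedule for a is illustrative because Eq (12) is not reproduced here (it merely mimics the shape described for Fig 1(B)), and the neighborhood bounds are taken elementwise between the best and second-best gazelles, which is one reading of Eq (14).

```python
import numpy as np

def imgo(fitness_fn, dim, lb, ub, n=30, max_iter=500, mechanisms=()):
    """Illustrative IMGO main loop following Fig 2 / Algorithm 2 (minimization)."""
    pop = icmic_init(n, dim, lb, ub)                          # Algorithm 1
    fit = np.array([fitness_fn(x) for x in pop])
    for t in range(max_iter):
        # Illustrative nonlinear schedule: fast change early, slow change late,
        # staying in [-2, -1); the paper's exact Eq (12) is not reproduced here.
        a = -2.0 + np.sqrt(t / max_iter)
        best = pop[np.argmin(fit)]
        # Four MGO mechanisms (placeholders for Eqs (1), (6), (7), (9)) produce offspring.
        offspring = [np.clip(m(x, best, pop, a), lb, ub) for x in pop for m in mechanisms]
        if offspring:
            pop = np.vstack([pop] + offspring)
            fit = np.append(fit, [fitness_fn(x) for x in offspring])
            order = np.argsort(fit)[:n]                       # keep the N best gazelles
            pop, fit = pop[order], fit[order]
        # Spiral perturbation with elite selection (stand-in for Eq (13)).
        pop, fit = spiral_perturbation(pop, fit, pop[np.argmin(fit)], fitness_fn)
        # Neighborhood search between the best and second-best gazelles (reading of Eq (14)).
        i_best, i_second = np.argsort(fit)[:2]
        lb_bn = np.minimum(pop[i_best], pop[i_second])
        ub_bn = np.maximum(pop[i_best], pop[i_second])
        bn = lb_bn + np.random.rand(dim) * (ub_bn - lb_bn)
        f_bn = fitness_fn(bn)
        if f_bn < fit[i_best]:
            pop[i_best], fit[i_best] = bn, f_bn
    i = np.argmin(fit)
    return pop[i], fit[i]
```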