Deep learning-based pectoralis muscle volume segmentation method from chest computed tomography image using sagittal range detection and axial slice-based segmentation
Yong, Hwan Seok. PLoS ONE 18(9): e0290950, September 5, 2023. https://doi.org/10.1371/journal.pone.0290950

Introduction

Computed tomography (CT) is widely used for diagnosis, patient evaluation, and monitoring of body composition changes. Volumetric techniques are essential for diagnostic purposes, and recent efforts focus on deriving non-invasive imaging biomarkers for various diseases [1, 2]. Hounsfield unit values enable assessment of the volume and status of muscle, adipose tissue, and bone structures.

Chronic obstructive pulmonary disease (COPD) is a systemic disease with significant consequences, particularly in its relationship with sarcopenia [3–15]. As COPD progresses, it affects parenchymal function and the respiratory muscles, leading to decreased activity and reduced muscle quality, oxygen utilization, and exercise capacity. Sarcopenia, characterized by diminished skeletal muscle mass, has been closely linked to COPD, as well as to increased morbidity and mortality in various conditions, such as trauma and different cancers. CT plays a crucial role in diagnosing and monitoring sarcopenia by analyzing changes in muscle mass.

The pectoralis major is a crucial respiratory muscle covering a large part of the chest and connecting the shoulder and abdomen. Studies have demonstrated that its volume and density are closely associated with various pulmonary diseases [7, 9, 13, 15–26]. Respiratory muscle features are associated with parenchymal biomarkers, including airflow limitation severity and diffusing capacity for carbon monoxide [21, 22], used in diagnosing diseases such as asthma and COPD. McDonald et al. found that the CT-derived pectoralis muscle area offers valuable indicators of COPD morbidity and is potentially more predictive than the body mass index [19]. Additionally, patients with high-density respiratory muscles who exercise may experience reduced symptoms associated with pulmonary disease, such as decreased shortness of breath [7, 23–25].

However, body composition assessment has usually been conducted manually by an experienced radiologist; muscle measurements tend to be performed manually on specific slice locations of the CT volume that correlate with the volume of interest. In-slice muscle area measurements may be distorted owing to muscle contraction, breath-holding, and patient positioning during the CT scanning process. The repeatability of the process can be assured by long-term education and experience; however, it is still time-consuming and expensive.

Previous studies of muscle segmentation on CT images have primarily relied on traditional image processing techniques, such as thresholding, region growing, and contouring, which may suffer from inaccuracies because of variations in image quality, contrast, and noise [27–29]. Some recent studies have attempted to employ machine learning and deep-learning approaches for muscle segmentation, showing promising results in terms of accuracy and automation [30–32]. González et al. proposed an efficient algorithm for segmenting the pectoralis muscle and adjacent structures, such as subcutaneous fat and the pectoralis minor [33], demonstrating good performance on partially labeled datasets. However, as the primary goal was not automatic segmentation within a volume, the axial range was delimited. Meanwhile, recent deep learning-based approaches similar to that of our study have been reported [34–36], but these primarily focus on measuring the pectoralis major area at specific locations, such as the aortic arch, in contrast to our comprehensive segmentation approach.
In this study, we aimed to address these limitations by developing a deep-learning-based automated volumetric pectoralis muscle segmentation method that can measure volumes and muscle mass for various applications; this comprehensive approach may prove useful in a range of clinical settings.

Materials and methods

Overall procedures

The proposed segmentation method uses deep-learning techniques to combine a muscle-area detection model and a segmentation model. The detection model was used to identify the vertical range of the pectoralis muscle from the patient's CT volume dataset. Then, the segmentation model segmented the actual pectoralis muscle area from each CT image slice. The method was designed to separate the detection and segmentation stages to address the weight imbalance in the training dataset. This separation also allows for a more efficient deep-learning process and improves the performance of the voxel-based segmentation model. Fig 1 shows a diagram illustrating the overall process flow of the proposed method.

Fig 1. Overall process flow diagram of the proposed method. https://doi.org/10.1371/journal.pone.0290950.g001

Detection model

A binary classification model was designed to detect the vertical range of the pectoralis muscle from a patient image voxel dataset. The model was based on the modified EfficientNet (B6) model introduced by Tan et al. [37, 38]. As compound scaling was the primary concern of the model design, it showed promising results compared with other classification models on the dataset of the current study, with relatively fewer parameters and better efficiency. A fully connected layer was added after the last layer to use the feature extraction function of the model. Adaptive average pooling was used to linearize the intermediate result of the model because the length of the CT image dataset may vary owing to scan conditions and parameters, which may lead to a nonlinear input data range. The training dropout ratio was set to 0.3 to avoid overfitting [39], and an additional linear function was adopted to determine the output feature size.

Because overall human organ structure information is needed for vertical range detection, multiple coronal plane CT images from various locations were fused into multichannel images. With this method, some image-based information, including textures and precise volume sizes, is lost; however, the approach is relatively lightweight and produces accurate results because the model only aims to detect the vertical range. For the training dataset, multichannel images of the patient combined with 2.5-dimensional sagittal information were analyzed. The labeled area masks in the coronal plane were set as the annotation data and used for the training dataset. The volume data of the CT image were interpolated into smaller volumes with isotropic voxel distance to avoid anisotropic transformation arising from the difference between the physical and voxel distances. From the patient CT volume image, three coronal two-dimensional (2D) images were acquired at the 3, 4, and 5 half-of-quarter positions of the voxelized isotropic volume. Fig 2 shows the overall multichannel image fusion process and sample results.

Fig 2. Multichannel coronal plane image generation method. (a) Three coronal two-dimensional images are acquired from the patient's isotropic volume at the 3, 4, and 5 half-of-quarter positions, providing different perspectives of the pectoralis muscle area. (b) Fused multichannel image (left) and its labeled data (right) from the manually labeled pectoralis muscle area, demonstrating the combination of the three coronal images into a single multichannel image for model input and the corresponding ground-truth segmentation provided by expert radiologists. https://doi.org/10.1371/journal.pone.0290950.g002
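The study does not publish code, but a minimal sketch of the multichannel input construction described above is given below. It assumes the "half-of-quarter" positions correspond to 3/8, 4/8, and 5/8 of the anteroposterior depth of the isotropic volume and uses an assumed soft-tissue Hounsfield unit window; the function name and parameter values are illustrative only.

```python
import numpy as np
from scipy.ndimage import zoom

def make_multichannel_coronal(volume_hu, spacing_zyx, target_spacing=1.0):
    """Resample a CT volume to isotropic voxels and fuse three coronal
    slices (assumed at 3/8, 4/8, and 5/8 of the anteroposterior depth)
    into a single 3-channel image for the range-detection model."""
    # Isotropic interpolation to avoid anisotropic distortion
    factors = [s / target_spacing for s in spacing_zyx]
    iso = zoom(volume_hu, factors, order=1)

    # Coronal slices are indexed along the y (anteroposterior) axis of a (z, y, x) volume
    depth_y = iso.shape[1]
    positions = [int(depth_y * f) for f in (3 / 8, 4 / 8, 5 / 8)]
    channels = [iso[:, p, :] for p in positions]

    # Stack into (3, z, x) and normalize HU with an assumed soft-tissue window
    fused = np.stack(channels, axis=0).astype(np.float32)
    fused = np.clip((fused + 200.0) / 400.0, 0.0, 1.0)  # window [-200, 200] HU (assumption)
    return fused
```

The resulting three-channel array could then be resized and fed to the EfficientNet-based range classifier described above.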
In total, 10 images were acquired from each patient by varying the acquisition position within the volume to augment the dataset and improve the robustness of the model. A binary cross-entropy loss function with logits was applied because the range result was expected to be binary. The outputs were thresholded to obtain binary predictions, which were then compared with the ground-truth labels. Accuracy was calculated as the proportion of correctly classified instances in each mini-batch, using a threshold for binary predictions. For prediction, the same procedure for acquiring multichannel coronal plane images was performed on an isotropically interpolated CT image volume. For better accuracy and to prevent vertical leakage, five images were acquired by varying the acquisition position within the volume and input to predict the pectoralis major muscle range. The median of the predicted ranges from the five images was used as the final range. Upon determining the vertical range of the pectoralis muscle, the corresponding slices were fed into the segmentation model to accurately delineate the axial area of the pectoralis muscle.

Segmentation

For the segmentation model, the conventional 2D Attention U-Net model [40] was modified with an additional fully connected layer and an attention block for additional feature attention in the upsampling and downsampling processes. Fig 3 shows a graphical illustration of the model. The Attention U-Net architecture was designed to increase the weight of the target area by adding an attention gate to the skip connection path, which provides spatial information in the U-Net learning process.

Fig 3. Graphical illustration of the proposed segmentation model based on the conventional 2D attention U-Net. An additional attention block is attached to each sampling process. https://doi.org/10.1371/journal.pone.0290950.g003

The segmentation model comprised six layers in both the encoder and decoder paths, with a stride of two and a scale factor of two. Each layer consisted of a feature map with 16–512 channels. In every layer, two convolution operations were performed, maintaining the same number of channels, width, and height. Batch normalization was applied, followed by the Rectified Linear Unit as the activation function. Several modifications were made to the conventional 2D U-Net model to better address the challenges of pectoralis muscle segmentation on CT images. Skip connections and attention blocks were added to the decoder path to provide spatial information during learning and apply soft attention to the areas being learned. The final layer was enhanced with an attention block for additional feature attention during the upsampling and downsampling processes.
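The exact attention block used in this work is not published; the following is a minimal sketch of an additive attention gate in the style of Attention U-Net [40], of the kind that could be attached to a skip connection as described. Channel sizes and the equal-resolution assumption are illustrative and not taken from the paper.

```python
import torch
import torch.nn as nn

class AttentionGate(nn.Module):
    """Additive attention gate applied to a skip connection (Attention U-Net style)."""
    def __init__(self, gate_ch, skip_ch, inter_ch):
        super().__init__()
        self.w_g = nn.Sequential(nn.Conv2d(gate_ch, inter_ch, 1, bias=False),
                                 nn.BatchNorm2d(inter_ch))
        self.w_x = nn.Sequential(nn.Conv2d(skip_ch, inter_ch, 1, bias=False),
                                 nn.BatchNorm2d(inter_ch))
        self.psi = nn.Sequential(nn.Conv2d(inter_ch, 1, 1),
                                 nn.BatchNorm2d(1),
                                 nn.Sigmoid())
        self.relu = nn.ReLU(inplace=True)

    def forward(self, gate, skip):
        # gate: decoder feature, skip: encoder feature; equal spatial size assumed here
        attn = self.psi(self.relu(self.w_g(gate) + self.w_x(skip)))  # (N, 1, H, W) weights in [0, 1]
        return skip * attn  # re-weight the skip connection toward the target region
```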
For post-processing, a simple connected-component detection algorithm was employed to remove unnecessary disconnected small areas and prevent leakage, resulting in cleaner and more precise muscle boundaries. These modifications to the conventional 2D U-Net model were designed to address the specific challenges of pectoralis muscle segmentation from CT images and contribute to the improved performance of the proposed model. The Dice loss function was employed to optimize the model, and accuracy was computed as the proportion of correctly classified pixels, using a threshold to binarize the predicted segmentation.

Following the segmentation of the pectoralis muscle in each axial slice, a post-processing step was added to reduce the inter-slice gaps and enhance the overall segmentation contour quality. This step involved a series of morphological operations, including dilation and erosion, to smooth the segmented boundaries and fill in any minor holes. Additionally, a 3D connectivity analysis was carried out to exclude any non-pectoralis major components that might have been misclassified during segmentation. Finally, inter-slice interpolation was used to ensure consistency and continuity between the segmented slices, thereby minimizing discrepancies in the volume calculation. These post-processing steps are crucial for improving the overall robustness and accuracy of the segmentation, contributing to a more reliable quantitative analysis of the pectoralis muscle.

Datasets: Active learning

The training dataset for the models was expanded using a multi-stage active learning process. This method improved the performance of the models while reducing the labeling burden and increasing time efficiency. Over 32,000 CT image slices from 178 patients were used for the initial training and validation datasets. The ground-truth labels for the pectoralis muscle area were manually segmented by three radiologists experienced in chest CT interpretation. The readers were blinded to the patients' clinical information and independently delineated the pectoralis major muscle on each CT slice. In cases of disagreement between the two primary readers, a consensus was reached through discussion and consultation with the third, senior radiologist to ensure the accuracy and reliability of the ground-truth labels. The manual segmentation process involved identifying the pectoralis major muscle boundaries on axial CT slices and drawing contours around the muscle area. This process was repeated for all relevant CT slices. The final muscle volume was calculated by summing the muscle area across all slices and multiplying by the slice thickness. Minor flaws in the resulting pectoralis muscle volume mask labels were edited, and the corrected masks were used for additional training to facilitate transfer learning. The dataset was randomly divided into training and validation datasets at a ratio of 8:2, and the data were augmented with random rotation, mirroring, elastic deformation, and scaling, among others. For better accuracy and robustness of the initial training model, the segmented results of additional CT cases were generated using the initially trained model. These results were visually graded and modified by experienced radiologists and used as the training dataset for the next stage of the model. The overall characteristics of the patient dataset used in the active learning process are presented in Table 1.

Table 1. Overall characteristics of the patient dataset. https://doi.org/10.1371/journal.pone.0290950.t001
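As a rough illustration of the slice-level augmentation mentioned above, the sketch below applies random rotation, mirroring, and an elastic deformation of the Simard-style displacement-field type to an image and its label mask. The parameter ranges and deformation strength are assumptions for illustration and are not the values used in the study.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates, rotate

def augment_slice(image, mask, rng=np.random):
    """Randomly rotate, mirror, and elastically deform a CT slice and its mask
    (illustrative parameter ranges, not those of the study)."""
    # Random rotation within +/- 10 degrees, applied identically to image and mask
    angle = rng.uniform(-10, 10)
    image = rotate(image, angle, reshape=False, order=1, mode="nearest")
    mask = rotate(mask, angle, reshape=False, order=0, mode="nearest")

    # Random horizontal mirroring
    if rng.rand() < 0.5:
        image, mask = image[:, ::-1], mask[:, ::-1]

    # Elastic deformation: smooth random displacement field
    alpha, sigma = 30.0, 5.0  # assumed strength and smoothness
    dy = gaussian_filter(rng.rand(*image.shape) * 2 - 1, sigma) * alpha
    dx = gaussian_filter(rng.rand(*image.shape) * 2 - 1, sigma) * alpha
    y, x = np.meshgrid(np.arange(image.shape[0]), np.arange(image.shape[1]), indexing="ij")
    coords = np.array([y + dy, x + dx])
    image = map_coordinates(image, coords, order=1, mode="nearest")
    mask = map_coordinates(mask, coords, order=0, mode="nearest")
    return image, mask
```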
For the final training and validation of the model, 7,796 patient CT image cases were obtained from Korea University Guro Hospital. Of these, 5,131 cases were images of patients with COPD, while the remaining 2,665 cases were of normal individuals from the diagnostic cohort. All CT examinations were performed using a diagnostic routine with normal- and low-dose CT scans, and the CT images were reconstructed using a standard sharpness kernel with a simple iterative reconstruction method, which is the standard image reconstruction protocol for chest and lung imaging. The study was conducted in accordance with the Declaration of Helsinki and approved by the Institutional Ethics Committee of Korea University Guro Hospital (IRB No. 2018GR0179 and 2021GR0335). The ethics committee waived the requirement for informed consent owing to the retrospective nature of the study. The overall manual segmentation was performed using commercial image archiving and labeling software (AVIEW Metrics, Coreline Soft, Seoul, Korea) and an additional in-house-developed merging module. Fig 4 shows a sample case of a labeled patient CT image in the training dataset.

Fig 4. Sample case of a patient CT image with an overlaid label of the pectoralis major muscle. https://doi.org/10.1371/journal.pone.0290950.g004

The noise content of CT images varies owing to many factors, such as imaging conditions and patient size and shape. We assumed that better segmentation results could be achieved if standardization and leveling of CT image noise were performed first. Therefore, noise reduction was applied to all CT images as a preprocessing step using commercial image denoising software (ClariCT.AI, ClariPi, Seoul, Republic of Korea).

Training setup

The deep-learning models were designed using the PyTorch library (1.8.1) on acceptable hardware with an NVIDIA RTX 2080 Ti GPU. Training on the entire dataset was performed with dual NVIDIA V100 GPUs on cloud-based hardware provided by the National IT Industry Promotion Agency.

Validation process during training and visual assessment

The results of the proposed method were evaluated using Dice similarity coefficient (DSC) values, which allow comparative analysis of the ground truth and predicted values. The DSC is the ratio of the overlapping area between the predicted value and the ground truth to the sum of both areas, and was calculated as follows: DSC = 2|A ∩ B| / (|A| + |B|), where A is the volume segmented with the proposed method and B is the volume labeled by experienced radiologists. The accuracy of the proposed segmentation method is higher when the value is closer to 1.
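A minimal sketch of this DSC computation for binary masks follows; the function and array names are illustrative.

```python
import numpy as np

def dice_similarity(pred_mask, gt_mask, eps=1e-8):
    """DSC = 2|A ∩ B| / (|A| + |B|) for binary masks A (prediction) and B (ground truth)."""
    a = pred_mask.astype(bool)
    b = gt_mask.astype(bool)
    intersection = np.logical_and(a, b).sum()
    return 2.0 * intersection / (a.sum() + b.sum() + eps)
```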
The segmentation results of the proposed method were visually assessed by experienced radiologists, technicians, and clinicians. The results were graded from 1 to 10 points based on accuracy, leakage, and under/over-segmentation.

Evaluation of the method based on patient follow-up data

Patient chest CT follow-up data were used to evaluate the proposed segmentation method. For better accuracy of the overall process on the evaluation dataset, another 250 pairs of scans from patients without any particular musculoskeletal findings were included. The patients' CT datasets were collected from Korea University Guro Hospital, and the interval between the two CT scans of each pair was less than 3 months. The CT datasets were processed using the proposed method, and the segmentation results were visually assessed by experienced radiologists to eliminate outliers and bad cases, such as a truncated muscle area caused by the region of interest (ROI) selected during scanning. The differences in volume and density of the pectoralis muscle area were calculated by subtracting the measurements derived from the predicted masks and ROIs of the two CT datasets.
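As a sketch of how the volume and mean density of a predicted mask might be derived and compared between the two scans of a follow-up pair, the following assumes millimeter voxel spacing and a simple percent ratio; the unit handling and ratio definition are illustrative rather than the study's exact procedure.

```python
import numpy as np

def muscle_volume_and_density(ct_hu, mask, spacing_zyx):
    """Volume (mL) and mean density (HU) of the masked pectoralis region."""
    voxel_ml = np.prod(spacing_zyx) / 1000.0      # mm^3 per voxel -> mL
    volume_ml = mask.sum() * voxel_ml
    mean_hu = float(ct_hu[mask > 0].mean()) if mask.any() else float("nan")
    return volume_ml, mean_hu

def followup_ratio(baseline, followup):
    """Percent ratio of a follow-up measurement (volume or density) to baseline."""
    return 100.0 * followup / baseline
```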
Results

The segmentation results of the proposed method are presented in Figs 5 and 6. The comparison between the labeled dataset and its predicted result is presented in Fig 5, and the overall volume of the predicted pectoralis muscle is presented in Fig 6.

Fig 5. The labeled image of the patient CT dataset and its predicted result using the proposed method. https://doi.org/10.1371/journal.pone.0290950.g005

Fig 6. Sample results of the predicted pectoralis muscle area from the patient CT image dataset. https://doi.org/10.1371/journal.pone.0290950.g006

In the proposed method, the training accuracy and loss value for the pectoralis range detection model were 0.9954 and 0.0725, respectively; for the segmentation model, the corresponding values were 0.9836 and 0.1725. Upon visual assessment by experienced radiologists, our method received a score of 9.3 out of 10 points for pectoralis muscle range detection and 8.7 points for overall mask segmentation. The radiologists noted that while the overall segmentation performance was good, there were occasional minor issues, such as leakage into mammary gland tissue. The overall visual assessment results are presented in Table 2.

Table 2. Visual assessment scores of the proposed method in the active learning process. https://doi.org/10.1371/journal.pone.0290950.t002

The comparison result from the range detection model is presented in Fig 7.

Fig 7. Sample result of the predicted range from the patient CT image data. (a) Multichannel fused coronal image used as input data, providing a comprehensive representation of the pectoralis muscle area. (b) Predicted range results from the range detection model, illustrating the model's ability to localize the pectoralis muscle range in the input data. (c) Binarized range area, showing the final range of the pectoralis muscle after post-processing steps, such as connected-component detection. CT, computed tomography. https://doi.org/10.1371/journal.pone.0290950.g007

In the evaluation of the proposed method, the volume and density of the predicted pectoralis muscle from two different CT datasets were compared.
The volume comparison showed quite small changes despite the different acquisition times and scanning parameters. The ratios of volume and density between the paired scans had standard deviations of 20.68% and 2.14%, respectively. After excluding outliers identified in the visual assessment process, the standard deviations were reduced to 12.97% and 2.05%, respectively. The calculated average ratios and absolute differences of volume and density are presented in Table 3.

Table 3. Validation result comparing the pectoralis muscle area in two different CT image datasets. https://doi.org/10.1371/journal.pone.0290950.t003

Discussion

In this study, we developed a deep-learning-based automated volumetric pectoralis muscle segmentation method capable of measuring volume and muscle mass for various applications, with encouraging results. The segmentation procedure was divided into two steps because we considered that the CT image should be modified minimally; any filtering or interpolation could degrade the segmentation result, as the values are highly dependent on the Hounsfield unit distribution, which derives from delicate CT physics. Therefore, we used the CT volume as-is: the range detection of the CT volume was separated into an independent model, and the CT slice images were fed unchanged to the 2D segmentation model.

The range detection method using multichannel images fused from multiple coronal images was designed because the characteristics of the pectoralis muscle range differ from other ranges of the dataset; it is denser than the chest area, sparser than the abdominal area, and includes specific structures, such as the bronchus and shoulder bone. We considered that the axial/sagittal information should be integrated to obtain the precise range; however, we did not wish to process the entire volume merely to obtain a simple axial range. Therefore, we designed a multichannel image intended to reduce the image dimension without losing all the structural information of the volume.

We defined the pectoralis major muscle based on various studies [14, 19]; the ranges are usually set with the clavicle point as the upper origin, the bicipital groove as the insertion, and the point of disconnection from the abdominal muscle as the lower bound. However, in this study, the measurement of the muscle was conducted using CT, and the mid-clavicle point tended to be unattached, while the bicipital groove was excluded from the chest CT imaging range. Interestingly, with this range discrimination, the accuracy of manual range delineation was relatively lower than that of the proposed method. Annotating the precise extent of the muscle range in CT images appears to be a considerably difficult task, even for a specialist.

The efficiency and accuracy of each component can be improved by dividing the process into the two proposed models. We therefore also designed a three-dimensional (3D) U-Net model and compared its initial results with those of the proposed model; a large amount of segmentation leakage into the abdomen occurred with the 3D U-Net model. In comparison, leakage occurred relatively rarely with the proposed model, and we assume that the accuracy of range detection was enhanced by implementing a dedicated, separate model and additional data augmentation, which could lead to a more accurate final result.
The model derives segmentation results in the form of a 3D volume; however, the upper and lower bounds of the pectoralis major muscle derived with the model remain debatable, which might be owing to the lack of a fully covered muscle label dataset. Nevertheless, the results on 2D slices were very encouraging and accurate, and the separation from the pectoralis minor muscle was also adequate. Therefore, the proposed method can be used as a segmentation method for 2D slice images.

For data preprocessing, we used CT denoising software because we believed that noise in the CT image might affect the calculation of image features. Currently, most CT scans are performed using a low-dose protocol following the ALARA ("as low as reasonably achievable") radiation safety principle, which states that a dose with no direct benefit should be avoided, however small. The software provides denoising that can be applied to low-dose CT and achieves almost the same level of noise reduction as the iterative reconstruction (IR) techniques of major CT vendors. Moreover, because the software does not require raw projection data from the scanning process, the denoising technique can be applied to retrospective CT data. Owing to these advantages, the software has been used for data preprocessing in various CT imaging and validation studies.

The utilization of pectoralis muscle features as an imaging biomarker holds promise; however, manual measurements of these features at specific slice locations in CT volumes can be prone to inaccuracies due to muscle contraction, breath-hold level, and patient positioning during the scanning process [1, 19, 41–46]. This can lead to distorted results and difficulty in accurately measuring the volume of interest.

The earlier range detection model was designed with the Inception v3 model; however, it was changed to EfficientNet (B6) because this model showed almost the same or better results with a smaller model size and better convenience in coding, variant input types, and templates. EfficientNet simultaneously considers three scaling factors, depth, width, and resolution, which we explored through various studies and methods. Existing models rarely consider multiple scaling factors simultaneously; typically, only the degree of depth or width scaling is adjusted. EfficientNet achieves higher accuracy with fewer parameters through compound scaling, and we leveraged the model for feature extraction.

For the segmentation model, various models were developed and tested, including U-Net, TransFuse-L, HarDNet-MSEG, and Attention U-Net; among them, the Attention U-Net was selected because it showed the best performance and focuses attention on specific areas of medical images, making it suitable for segmenting this dataset. We evaluated the segmentation performance with four key metrics: the mean Dice coefficient, mean intersection over union, precision, and accuracy. Precision and accuracy were both calculated pixel-wise. We compared our method with other state-of-the-art and typical methods using the same dataset. The proposed segmentation method performed well in segmenting the pectoralis muscle.
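For reference, a minimal sketch of the pixel-wise metrics named above (intersection over union, precision, and accuracy) for a thresholded binary prediction is shown below; the threshold value is an assumption.

```python
import numpy as np

def pixelwise_metrics(pred_prob, gt_mask, threshold=0.5):
    """IoU, precision, and pixel accuracy for a thresholded binary segmentation."""
    pred = pred_prob >= threshold
    gt = gt_mask.astype(bool)
    tp = np.logical_and(pred, gt).sum()
    fp = np.logical_and(pred, ~gt).sum()
    fn = np.logical_and(~pred, gt).sum()
    tn = np.logical_and(~pred, ~gt).sum()
    iou = tp / (tp + fp + fn + 1e-8)
    precision = tp / (tp + fp + 1e-8)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return iou, precision, accuracy
```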
The mean Dice coefficient and mean intersection over union of the proposed segmentation model were 0.972 and 0.949, respectively, whereas those of TransFuse-L, HarDNet-MSEG, and conventional U-Net were 0.896/0.818, 0.938/0.888, and 0.940/0.891, respectively.

Regarding the obtained results, it is worth noting the variability introduced by the inclusion of the upper bound of the pectoralis muscle, which is attached to the arm. This region is not typically fully scanned in chest CT and hence contributes to the variability of the results. The average ratios of volume and density showed a considerable standard deviation, particularly when this region was included. Upon excluding outliers during the visual assessment, which did not account for the arm range, a decrease in standard deviation was observed. However, we postulate that the improved accuracy was not solely due to outlier exclusion but was also likely attributable to the exclusion of the arm-attached region of the pectoralis muscle. Future work may address this variability by improving the segmentation accuracy at the boundaries of the muscle and refining the scanning technique to cover the entire muscle region.

This study has a limitation in that pectoralis major muscle segmentation has a relatively small impact on the imaging biomarker domain. However, it has considerable potential in that the amount of muscle development and loss is a direct factor associated with sarcopenia, which is currently a major concern and a powerful index in the field of rehabilitation medicine.

Conclusion

The deep learning-based pectoralis muscle segmentation method developed in this study demonstrated promising results in accurately segmenting the pectoralis muscle in chest CT images. The proposed method has the potential to be employed for measuring the pectoralis muscle volume, mass, and density in various clinical and research settings. By facilitating the quantitative assessment of muscle-related parameters, this technique could contribute to the investigation of muscle-related disorders and the evaluation of patient outcomes, further expanding the method's potential utility.