TY - JOUR AU - Guda, Chittibabu AB - Introduction It is estimated that the human body is inhabited by an average of 500–1000 different microbial species [1,2]. The complex role of the human microbiome in shaping human health and disease has been extensively investigated in recent years owing to advances in metagenome sequencing technologies [3–5]. The generation of massive volumes of sequencing data poses new challenges for data analytics and the interpretation of results in microbiome research. Likewise, advanced analytical and visualization tools in this research arena can improve our ability to understand the roles of microbes in diverse environments and how they interact with each other and their human hosts. Thorough analysis of microbiome data consists of two essential components: upstream community profiling, which includes alpha and beta diversity analysis, taxonomic profiling, and abundance estimation; and downstream characterization of microbial communities, such as differential abundance estimation and functional and metabolic profiling. In recent years, several data analysis and visualization methods have been developed for microbiome data analysis [6–8]. For 16S rRNA data, raw sequence reads are first processed and clustered into operational taxonomic units (OTUs) using established tools such as MOTHUR [9], QIIME [10], and QIIME2 [11]. Whole genome sequencing reads are processed using DIAMOND+MEGAN [12], which aligns all reads against a protein reference database. In addition, tools such as Vegan [13], MicrobiomeAnalyst [14], MicrobiomeR [15], Metavizr [16], Microbiome helper [17], Phyloseq [18], Animalcules [19], and wiSDOM [20] were developed for data analysis and visualization, each with its own capabilities and limitations. A comparative analysis of the tasks performed by popular metagenomics tools is provided in Table 1. Table 1. 
Comparison of MetaDAVis and other popular microbiome analysis tools. https://doi.org/10.1371/journal.pone.0319949.t001 Current metagenomics tools mainly support taxonomic profiling and abundance estimation, alpha and beta diversity analysis, dimension reduction visualization, and differential abundance estimation. Many independent statistical and machine-learning approaches have also been developed for these tasks, but they are not well integrated with the existing tools, which limits users’ options for testing and visualizing the output of different methods. For instance, even though QIIME2 and Mothur provide excellent analytical and visualization tools, they do not support whole genome sequencing (WGS) data analysis. Microbiome helper has a collection of scripts in multiple programming languages to facilitate interaction and interoperability among multiple tools but offers limited interactive visualization capabilities. STAMP [21], on the other hand, enables statistical analysis of taxonomic and functional profiles with various visualization tools, but it lacks abundance distribution and diversity index analyses. Some R-based metagenomics packages, such as MicrobiomeR, provide only command-line workflows. Metavizr offers a graphical user interface (GUI) with limited metagenomic visualizations. Similarly, R Shiny-based applications such as Phyloseq have useful tools for annotation, visualization, and diversity analysis but do not provide abundance analysis. More recent tools such as Animalcules offer good interactive features in command-line and GUI modes for alpha/beta diversity and differential abundance analysis between two conditional groups, but not for multiple-group comparisons; Animalcules also lacks a web interface. Lastly, wiSDOM, an R Shiny standalone and web-based application, provides many diversity profiling and statistical analysis functions but works only with 16S rRNA data (Table 1). 
Moreover, most existing tools require programming expertise and significant effort from the user to install and configure programming languages such as R, Matlab, and Python on local servers. To address most of these issues, we present here an interactive Metagenome Data Analysis and Visualization (MetaDAVis) tool, implemented as an R-based Shiny application with a web interface. The rich set of features offered by MetaDAVis is presented in Table 1 in comparison to existing methods. The tool offers six functional modules, each of which performs a subset of tasks based on the chosen option using multiple methods. The novelty of MetaDAVis lies in its interactive design: it takes the user’s choice of methods and variables as input and provides publication-quality plots and result tables that can be downloaded in different formats. Design and implementation The R packages listed in S1 Table were used to create and implement MetaDAVis, which can be installed through GitHub. It requires R version 4.4.2 or higher and Shiny package version 1.10.0 or higher. After loading the dependent libraries in R, users can launch the R Shiny GUI on a desktop using shiny::runGitHub("MetaDAVis", "gudalab"), or access the tool on the web at https://www.gudalab-rtools.net/MetaDAVis (S1 Fig). Example datasets used for the development work include 16S rRNA reads (NCBI SRA: SRP128892) [22] and whole genome sequencing reads (NCBI SRA: SRP108707) [23] from inflammatory bowel disease studies, which were processed using Qiime2, Diamond, and MEGAN in our previous studies [24]. Input file formats Our application accepts files in .txt, .tsv, or .csv format. Users can directly upload Level 7 Qiime2 results generated using Greengenes or Silva (be sure to remove the metadata column if it is included in the level7.csv file from Qiime2). Additionally, the application supports MEGAN data from whole metagenome sequences. 
For Qiime2 input files, the first column serves as an index containing sample IDs, while the second to Nth columns represent taxonomy (S2A Fig). For MEGAN input files, the first seven columns (Level_1 to Level_7) are followed by sample names (S2B Fig). If users wish to upload their own files, the first seven columns should contain Kingdom, Phylum, Class, Order, Family, Genus, and Species, followed by sample names (S2C Fig). Metadata files must have two columns. The first column should list sample IDs that match those in the count data input, while the second column, Condition, holds a user-defined categorical variable with two or more groups, such as “case” and “control” (S2D Fig). Users can also refer to our example count data and metadata files available on the tool’s upload page (S3 Fig) or the example datasets in our GitHub repository (https://github.com/GudaLab/MetaDAVis). Guidelines Once the input files are uploaded, each of the six modules in MetaDAVis can be used independently and in any order. The tool was tested on Linux (RedHat and Ubuntu) and on Windows 10 and 11. A user’s manual is available at https://www.gudalab-rtools.net/MetaDAVis/manual/MetaDAVis_manual.pdf to aid in the installation and use of the application. Summary tables were developed using the DataTables (DT) package and display up to 100 rows of results, while the entire tables can be downloaded as .csv files. Similarly, graphical plots can be downloaded in multiple formats using the downloadHandler function from the shiny package. We implemented custom color options using the RColorBrewer package, allowing users to download figures with the same color coding in all modules except for the correlation analysis and MaAsLin3 results. We also provide example data (‘Example Data (To test our tool)’) under the ‘Select Input Format’ section. 
This feature helps demonstrate our tool’s performance, offering reassurance to users. Results MetaDAVis is a versatile program that accepts the output of several primary data analysis tools such as MEGAN or Qiime2 as input, performs interactive downstream analyses, and generates a variety of visually appealing plots in different formats that can be directly used for presentations and publications. We designed MetaDAVis to include six functional modules that cover the commonly used tasks in metagenomic data analysis. These include 1) Taxonomic distribution, 2) Taxonomic diversity, 3) Dimension reduction, 4) Correlation analysis, 5) Heatmap generation, and 6) Differential abundance analysis (between two or more groups). Each module performs a distinct task in the workflow, where users have the ability to select various thresholds and algorithmic and visualization parameters to generate custom plots. Fig 1 illustrates the outputs from the tasks performed by MetaDAVis. All R packages with corresponding references, GitHub links, and the task performed by each module are summarized in S1 Table. Publication-quality results from MetaDAVis analysis can be downloaded in seven different formats: JPG, TIFF, PDF, SVG, BMP, EPS, and PS. Fig 1. Workflow and example output of MetaDAVis. 
(a) Group and sample-based abundance stacked plot, (b) Alpha diversity and beta diversity box plot, (c) 3D PCA, t-SNE and UMAP orientation plots, (d) taxa and sample-based correlation plot, (e) heatmap of the abundance values, (f) differential abundance analysis, which generates box plots for grouped and individual significant taxa, a volcano plot, and a heatmap; two-group comparisons are implemented with the Wilcoxon Rank Sum test, t-test, metagenomeSeq, DESeq2, Limma-Voom, edgeR, lefser, and MaAsLin3; multiple-group comparisons are implemented with the Kruskal-Wallis test and ANOVA. https://doi.org/10.1371/journal.pone.0319949.g001 Data summary and distribution analysis MetaDAVis accepts the output formats of Qiime2 (generated using Greengenes or Silva) or MEGAN (.csv and .tsv) as input to generate relative abundance and taxonomic distribution plots according to the user-selected options (Fig 1A). Relative abundance plots can be generated at seven hierarchical taxonomic levels (Kingdom, Phylum, Class, Order, Family, Genus, and Species), and results can be visualized in multiple plots and tables. For example, distribution box plots for individual samples and their comparison groups are shown in S4A–S4C Fig. Diversity analysis Alpha diversity is calculated using read count or relative abundance data within a sample and compared between groups. We have implemented seven different methods (Observed, Chao1, ACE, Shannon, Simpson, Inverse Simpson, and Fisher) from the phyloseq package [18] for α-diversity calculation, and the results can be visualized as box or violin plots (Fig 1B) with their summary table (S5A–S5D Fig). Users can also perform the Wilcoxon test to display the statistical significance (p-value) using the microbiomeutilities package [25]. Beta diversity is calculated using the phyloseq and vegan [13] packages (Fig 1B). We have incorporated the adonis2 function and defined parameters such as the diversity method, the number of permutations, and the square root of dissimilarities. 
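As a rough illustration of the diversity measures named above, the following Python sketch computes the Shannon and Simpson indices for single samples and the Bray-Curtis dissimilarity between two samples. It is not the MetaDAVis implementation, which relies on the phyloseq and vegan R packages; Python is used here only for illustration, the function names and the two toy samples are hypothetical, and the Simpson index is assumed to be reported in its 1 - sum(p_i^2) form.

```python
import math

def shannon(counts):
    """Shannon index H = -sum(p_i * ln(p_i)) over taxa with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def gini_simpson(counts):
    """Simpson index in its 1 - sum(p_i^2) form (assumed convention)."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two samples over the same taxa."""
    shared_total = sum(x + y for x, y in zip(a, b))
    return sum(abs(x - y) for x, y in zip(a, b)) / shared_total

# Hypothetical read counts for four taxa in two samples.
even_sample = [10, 10, 10, 10]    # reads spread evenly across taxa
skewed_sample = [40, 0, 0, 0]     # a single taxon dominates

print(shannon(even_sample), gini_simpson(even_sample))
print(bray_curtis(even_sample, skewed_sample))
```

Both alpha diversity indices are higher for the evenly distributed sample than for the sample dominated by one taxon, and the Bray-Curtis value is zero only when the two samples have identical counts.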
Users can choose any one of the methods (bray-curtis, jaccard, manhattan, euclidean, canberra, kulczynski, gower, altGower, morisita, horn, clark, mountford, raup, binomial, chao, cao, mahalanobis, chisq, chord, hellinger, aitchison, and robust.aitchison) for beta diversity calculation using the integrated distance matrices. In addition, users can select an ordination method (PCoA, NMDS, DCA, RDA, or MDS) for assessing the between-sample microbial diversity (S5E–S5I Fig). Dimension reduction A critical step in any data analysis is visualizing and summarizing highly variable data in a lower-dimensional space. We implemented three commonly used dimensionality reduction techniques (Fig 1C): principal components analysis (PCA) in 2D and 3D, plotted with coord_equal(ratio = 1) to keep a consistent scale [26]; t-distributed stochastic neighbor embedding (t-SNE) [27]; and uniform manifold approximation and projection (UMAP) [28]. PCA is a linear dimensionality reduction method in which the first axes (up to three here) explain the maximum amount of variation. In contrast, t-SNE and UMAP are non-linear methods for mapping data to a lower-dimensional embedding. We have incorporated six methods (counts, rclr, hellinger, pa, rank, and relabundance) from the scater package [29] to plot the t-SNE and UMAP dimension reductions. The plotted dimension reduction values are provided in separate tables (S6A–S6L Fig). Correlation analysis We implemented both taxon-based and sample-based correlations using GGally, which is an extension to ggplot2 [30], using the ggcorr function to compute Pearson, Kendall, and Spearman correlations (Fig 1D). Users can check the correlation for each condition separately or select multiple options together using the dropdown menu. Similarly, sample-based correlations can be calculated separately for each group of samples under specific conditions or combined across conditions. 
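To show how two of the correlation methods mentioned above can differ, here is a minimal Python sketch of Pearson and Spearman correlation between the abundances of two taxa across samples. It is an illustration only and does not reproduce the ggcorr-based R code used by MetaDAVis; the function names and the abundance values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation: covariance scaled by both standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical abundances of two taxa across five samples.
taxon_a = [1, 2, 3, 4, 5]
taxon_b = [1, 4, 9, 16, 25]   # increases with taxon_a, but not linearly

print(pearson(taxon_a, taxon_b), spearman(taxon_a, taxon_b))
```

Because taxon_b rises monotonically but non-linearly with taxon_a, the rank-based Spearman coefficient reaches 1 (up to floating-point error) while the Pearson coefficient stays below 1, which is one reason a user might compare methods before interpreting a correlation plot.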
Correlation plots and summary matrices can be generated by the user with their method of choice (S7A–S7C Fig). Generating heatmaps To visualize the diversity of relative abundance among the samples, we have implemented a heatmap using ComplexHeatmap [31] and scales [32]. It offers options to display or hide the row and column names and cladograms; clustering methods for rows and columns, namely single, complete, average (UPGMA), mcquitty (WPGMA), median (WPGMC), and centroid (UPGMC); and normalization methods, namely scale, minmax, log, row normalization, column normalization, and none (Fig 1E; S8A and S8B Fig). Differential abundance analysis For pairwise comparison, the generalized linear model-based methods DESeq2 [33] and edgeR [34], together with the two-sample t-test, the Wilcoxon Rank Sum test, metagenomeSeq [35], limma-voom [36], lefser [37], an R implementation of Linear Discriminant Analysis Effect Size (LEfSe), and MaAsLin3 (Microbiome Multivariable Association with Linear Models) [38], were used to identify taxa with different abundances between two groups. We converted the raw count values to relative frequencies using the formula Relative Frequency = (Subgroup frequency / Total frequency) × 100 for the Wilcoxon Rank Sum test (wilcox.test) and t-test (t.test) statistical analyses. In contrast, metagenomeSeq, DESeq2, Limma-Voom, edgeR, lefser, and MaAsLin3 have built-in algorithms to find statistically significant biomarkers. To account for multiple testing, biomarker candidates can be filtered using a user-specified p-value or false discovery rate (FDR, q-value) from the Benjamini-Hochberg procedure [39]. Users have the flexibility to adjust the FDR or p-value threshold based on their needs (default is < 0.05). Results can be downloaded as grouped or individual box plots for each taxon, or as volcano plots or heatmaps of the significant taxa (Fig 1F; S9A–S9F Fig), with summary tables (S2 Table). 
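The relative frequency conversion and the Benjamini-Hochberg adjustment described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the code used by MetaDAVis: the function names, the counts, and the p-values are hypothetical, and the adjustment follows the standard step-up definition of the procedure.

```python
def relative_frequency(counts):
    """Convert raw counts to percentages: (count / total) * 100."""
    total = sum(counts)
    return [100.0 * c / total for c in counts]

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values (q-values), in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical raw counts of three taxa in one sample.
percentages = relative_frequency([50, 30, 20])

# Hypothetical raw p-values for four taxa from a two-group test.
raw_p = [0.01, 0.04, 0.03, 0.20]
q_values = benjamini_hochberg(raw_p)

# Keep taxa passing the default threshold of q < 0.05.
significant = [i for i, q in enumerate(q_values) if q < 0.05]
print(percentages, q_values, significant)
```

In this toy example three taxa have raw p-values below 0.05, but only the first remains below the threshold after adjustment, which illustrates why filtering on the FDR is stricter than filtering on the raw p-value.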
MaAsLin3 generates multiple tables and figures, and we provide these result files in a compressed zip format for ease of access. For analyses involving multiple-group comparisons, such as control, case 1, and case 2, we implemented the Kruskal-Wallis test (kruskal.test) and ANOVA (analysis of variance) to identify differentially abundant taxonomic markers. In addition, a post-hoc test that calculates p-values for pairwise comparisons among the groups was implemented for the Kruskal-Wallis test using the dunn.test package. Likewise, TukeyHSD was used for post-hoc testing under ANOVA. The Benjamini-Hochberg FDR or p-value threshold is applied together with the post-hoc test. These results can be downloaded in the same way as those from the two-group comparisons. MetaDAVis provides a graphical user interface through R/Shiny, which can be used even by those without prior programming knowledge. Tools such as vegan, Mothur, MicrobiomeR, Microbiome helper, and Qiime2 offer only command-line interfaces, which limits their use by those without prior programming experience. Furthermore, several tools (as presented in Table 1) impose the burden of importing/exporting additional packages, which also requires programming skills. MetaDAVis can be installed locally for standalone use or accessed via a user-friendly web interface to analyze both 16S rRNA and WGS data (Table 1). Hence, it offers more flexibility for use by both seasoned programmers and non-programmers. The MetaDAVis application offers a rich set of options in each module, allowing users to choose among a variety of methods and perform highly customizable analyses of microbiome sequencing data. 
For example, it allows users to analyze data at seven different taxonomic levels, provides multiple options for data normalization and distribution analysis, facilitates visualization of data using PCA, t-SNE, and UMAP with multiple methods within each approach, offers multiple methods to carry out differential abundance analysis, and supports differential analysis and visualization for pairwise and group comparisons. Each of the six modules in MetaDAVis can be used in no particular order using outputs from other methods as inputs, which adds considerable flexibility for users to build and carry out customizable data analysis pipelines. The primary advantage of using MetaDAVis over existing methods is the ease of accessing many independent statistical and machine-learning tools seamlessly in one platform to carry out highly specialized and refined microbiome data analyses using 16S rRNA or WGS datasets. Another advantage is its rich set of visualization tools and graphical outputs. Each module generates publication-quality plots and summary tables, where the images can be downloaded in seven different formats and the tables can be downloaded as .csv files for further use. MetaDAVis is highly flexible for customization of data pipelines and can be broadly used without any programming background. We believe that the MetaDAVis tool is a unique and highly versatile platform that broadly supports microbiome research. 
Supporting information S1 Table. List of R packages used to develop MetaDAVis. https://doi.org/10.1371/journal.pone.0319949.s001 (DOCX) S2 Table. Output table for the significant taxa by using various methods. https://doi.org/10.1371/journal.pone.0319949.s002 (DOCX) S1 Fig. The MetaDAVis web application. https://doi.org/10.1371/journal.pone.0319949.s003 (TIF) S2 Fig. Supported input file formats for MetaDAVis. (A) Qiime2 output (Level 7), (B) MEGAN output file, (C) user-defined file format, and (D) metadata file applicable to all three formats. https://doi.org/10.1371/journal.pone.0319949.s004 (TIF) S3 Fig. Data upload page with display options and example summary display. Example data are provided for the Qiime2 and MEGAN output formats. If users have data in a different output format, the file should be prepared according to the taxa count file format. https://doi.org/10.1371/journal.pone.0319949.s005 (TIF) S4 Fig. Distribution plots. (A) Choice of distribution plot and output format; (B) Box plot for comparison groups; and (C) Box plots for individual samples. https://doi.org/10.1371/journal.pone.0319949.s006 (TIF) S5 Fig. Diversity analysis. (A) Choice of alpha diversity method from seven different methods (Observed, Chao1, ACE, Shannon, Simpson, Inverse Simpson, Fisher) or All_combined; (B) Violin plot showing the Simpson diversity; (C and D) Shannon diversity plot with corresponding values in a table; (E) Selected choice of beta diversity method (bray-curtis) with other options; corresponding (F) bar plot, (G) dot plot, (H) values in a table, and (I) adonis2 function table. https://doi.org/10.1371/journal.pone.0319949.s007 (TIF) S6 Fig. Orientation analysis. 
(A-C) The choice of PCA-2D and the plot with frames and summary table of sample coordinate positions shown for PC1 and PC2; (D-F) PCA 3-D selection and the 3-D plot and summary table of sample coordinate positions shown for PC1, PC2, and PC3; (G-I) t-SNE with selected options, two-dimensional plots with the selected rclr method, and corresponding summary table; (J-L) UMAP with selected options, condition-based and cluster-based (K = 5) UMAP plots with the selected rclr method, and corresponding summary table. https://doi.org/10.1371/journal.pone.0319949.s008 (TIF) S7 Fig. Correlation analysis. (A) Input selection for taxa-based correlation analysis using the condition option; (B) Taxa-based correlation plot using the Pearson method; and (C) summary table. Similar method selection and results are implemented for sample-based correlation analysis. https://doi.org/10.1371/journal.pone.0319949.s009 (TIF) S8 Fig. Heatmap generation. (A) Input selection for heatmap analysis; the user can adjust the row and column text size and cladograms; and (B) Heatmap for the selected taxonomy level, showing sample names in rows and family names in columns with a cladogram. Scale values represent colors in the heatmap and condition groups. https://doi.org/10.1371/journal.pone.0319949.s010 (TIF) S9 Fig. Differential abundance analysis. (A) Input selection for the Wilcoxon Rank Sum test; (B) Grouped box plot, where the x-axis represents taxa and the y-axis represents log10(relative frequency); (C) An individual box plot for each taxon, where the x-axis represents the condition and the y-axis represents relative frequency; (D) Volcano plot, where the x-axis represents log10(mean relative abundance) and the y-axis represents Log2FC; (E) Heatmap for significantly identified taxa; (F) Summary table for the Wilcoxon Rank Sum test. Similar input is needed for the remaining pairwise methods, such as metagenomeSeq, DESeq2, Limma-Voom, and edgeR, and for the multiple-group comparison methods, the Kruskal-Wallis test and ANOVA. 
https://doi.org/10.1371/journal.pone.0319949.s011 (TIF) Acknowledgments The authors would like to thank the Bioinformatics and Systems Biology Core (BSBC) facility at UNMC for providing the computational infrastructure and support. TI - MetaDAVis: An R shiny application for metagenomic data analysis and visualization JF - PLoS ONE DO - 10.1371/journal.pone.0319949 DA - 2025-04-07 UR - https://www.deepdyve.com/lp/public-library-of-science-plos-journal/metadavis-an-r-shiny-application-for-metagenomic-data-analysis-and-6M4Qr3OFnr SP - e0319949 VL - 20 IS - 4 DP - DeepDyve ER -