Habitat suitability maps for Australian flora and fauna under CMIP6 climate scenarios
Archibald, Carla L; Summers, David M; Graham, Erin M; Bryan, Brett A
doi: 10.1093/gigascience/giae002
pmid: 38442145
Background: Spatial information about the location and suitability of areas for native plant and animal species under different climate futures is an important input to land use and conservation planning and management. Australia, renowned for its abundant species diversity and endemism, often relies on modeled data to assess species distributions because of the country's vast size and the difficulty of conducting on-ground surveys at such a large scale. The objective of this article is to develop habitat suitability maps for Australian flora and fauna under different climate futures.
Results: Using MaxEnt, we produced Australia-wide habitat suitability maps under RCP2.6-SSP1, RCP4.5-SSP2, RCP7.0-SSP3, and RCP8.5-SSP5 climate futures for 1,382 terrestrial vertebrates and 9,251 vascular plants at 5 km² resolution, available for open access. This represents 60% of all Australian mammal species, 77% of amphibian species, 50% of reptile species, 71% of bird species, and 44% of vascular plant species. We also provide tabular data summarizing the total quality-weighted habitat area of each species under different climate scenarios and time periods.
Conclusions: The spatial data supplied can help identify important and sensitive locations for species under various climate futures. Additionally, the supplied tabular data can provide insights into the impacts of climate change on biodiversity in Australia. These habitat suitability maps can be used as input data for landscape and conservation planning or species management, particularly under different climate change scenarios in Australia.
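The "total quality-weighted habitat area" mentioned in the abstract can be illustrated with a minimal sketch. It assumes the quantity is the sum of each grid cell's suitability score (0 to 1) multiplied by the cell's area; the abstract does not give the exact formula, and the function names, the toy grids, and the 25 km² cell area below are all hypothetical.

```python
def quality_weighted_area(suitability_grid, cell_area_km2):
    """Sum suitability * cell area over a grid of scores in [0, 1].

    Cells set to None are treated as no-data (e.g. ocean) and skipped.
    Assumption: this weighting is illustrative, not the authors' exact method.
    """
    total = 0.0
    for row in suitability_grid:
        for score in row:
            if score is None:
                continue
            total += score * cell_area_km2
    return total


def percent_change(current_area, future_area):
    """Relative change in quality-weighted area between two time periods."""
    if current_area == 0:
        raise ValueError("current area is zero; percent change is undefined")
    return (future_area - current_area) / current_area * 100.0


# Hypothetical 2 x 2 grids for one species; the cell area is an assumed value.
current = [[0.8, 0.5], [None, 0.2]]
future = [[0.4, 0.5], [None, 0.0]]

area_now = quality_weighted_area(current, cell_area_km2=25.0)
area_future = quality_weighted_area(future, cell_area_km2=25.0)
print(area_now, area_future, percent_change(area_now, area_future))
```

Comparing the two totals per species, scenario, and time period is one way the tabular summaries could be used to flag species whose suitable habitat shrinks under a given climate future.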
Multi-omic dataset of patient-derived tumor organoids of neuroendocrine neoplasms
Alcala, Nicolas; Voegele, Catherine; Mangiante, Lise; Sexton-Oates, Alexandra; Clevers, Hans; Fernandez-Cuesta, Lynnette; Dayton, Talya L; Foll, Matthieu
doi: 10.1093/gigascience/giae008
pmid: 38451475
Background: Organoids are 3-dimensional experimental models that recapitulate the anatomical and functional structure of an organ. Although they are a promising experimental model for precision medicine, patient-derived tumor organoids (PDTOs) have so far been developed for only a fraction of tumor types.
Results: We have generated the first multi-omic dataset (whole-genome sequencing [WGS] and RNA-sequencing [RNA-seq]) of PDTOs from the rare and understudied pulmonary neuroendocrine tumors (n = 12; 6 grade 1, 6 grade 2) and provide data from other rare neuroendocrine neoplasms: small intestine (ileal) neuroendocrine tumors (n = 6; 2 grade 1 and 4 grade 2) and large-cell neuroendocrine carcinoma (n = 5; 1 pancreatic and 4 pulmonary). This dataset includes a matched sample from the parental sample (primary tumor or metastasis) for a majority of samples (21/23) and longitudinal sampling of the PDTOs (1 to 2 time points), for a total of n = 47 RNA-seq and n = 33 WGS. Here we provide quality control for each technique, the raw and processed data, and all scripts for genomic analyses to ensure optimal reuse of the data. In addition, we report gene expression data and somatic small variant calls and describe how they were generated, in particular how we used WGS somatic calls to train a random forest classifier to detect variants in tumor-only RNA-seq. We also report all histopathological images used for medical diagnosis: hematoxylin and eosin–stained slides, brightfield images, and immunohistochemistry images of protein markers of clinical relevance.
Conclusions: This dataset will be critical to future studies relying on this PDTO biobank, such as drug screens for novel therapies and experiments investigating the mechanisms of carcinogenesis in these understudied diseases.
Data processing solutions to render metabolomics more quantitative: case studies in food and clinical metabolomics using Metabox 2.0
Wanichthanarak, Kwanjeera; In-on, Ammarin; Fan, Sili; Fiehn, Oliver; Wangwiwatsin, Arporn; Khoomrung, Sakda
doi: 10.1093/gigascience/giae005
pmid: 38488666
In classic semiquantitative metabolomics, metabolite intensities are affected by biological factors and other unwanted variations. A systematic evaluation of data processing methods is crucial to identify adequate processing procedures for a given experimental setup. Current comparative studies focus mostly on peak area data rather than on absolute concentrations. In this study, we evaluated data processing methods to find those whose outputs were most similar to the corresponding absolute quantified data. We examined the data distribution characteristics, fold difference patterns between 2 metabolites, and sample variance. We used 2 metabolomic datasets, from a retail milk study and a lupus nephritis cohort, as test cases. When studying the impact of data normalization, transformation, scaling, and combinations of these methods, we found that the cross-contribution compensating multiple standard normalization (ccmn) method, followed by square root data transformation, was most appropriate for a well-controlled study such as the milk study dataset. For the lupus nephritis cohort study, only ccmn normalization could slightly improve the data quality of the noisy cohort. Since the assessment accounted for the resemblance between processed data and the corresponding absolute quantified data, our results offer a useful guideline for processing metabolomic datasets within a similar context (food and clinical metabolomics). Finally, we introduce Metabox 2.0, which enables thorough analysis of metabolomic data, including data processing, biomarker analysis, integrative analysis, and data interpretation. It was successfully used to process and analyze the data in this study. An online web version is available at http://metsysbio.com/metabox.
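The "normalize, then transform, then compare fold differences" workflow described above can be sketched as follows. This is only an illustration under stated assumptions: a naive division by a single internal standard stands in for ccmn (which is a different, model-based method), and all function names and intensity values are hypothetical toy data, not results from the study or the Metabox 2.0 API.

```python
import math


def normalize_by_internal_standard(intensities, standard):
    """Divide each sample's intensity by its internal-standard intensity.

    Assumption: a simple stand-in for normalization, NOT the ccmn method.
    """
    return [x / s for x, s in zip(intensities, standard)]


def sqrt_transform(values):
    """Square root transformation, which compresses large intensities."""
    return [math.sqrt(v) for v in values]


def fold_difference(values_a, values_b):
    """Ratio of mean levels of metabolite A to metabolite B across samples."""
    mean_a = sum(values_a) / len(values_a)
    mean_b = sum(values_b) / len(values_b)
    return mean_a / mean_b


# Hypothetical peak areas for two metabolites in three samples.
peak_a = [800.0, 1800.0, 3200.0]
peak_b = [200.0, 450.0, 800.0]
standard = [2.0, 2.0, 2.0]  # hypothetical internal-standard peak areas

norm_a = normalize_by_internal_standard(peak_a, standard)
norm_b = normalize_by_internal_standard(peak_b, standard)

fold_normalized = fold_difference(norm_a, norm_b)
fold_sqrt = fold_difference(sqrt_transform(norm_a), sqrt_transform(norm_b))
print(fold_normalized, fold_sqrt)
```

In an evaluation of the kind the abstract describes, fold differences like these would be computed for each candidate processing pipeline and compared with the fold difference obtained from the absolute quantified concentrations of the same metabolites.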
A reference genome of Commelinales provides insights into the commelinids evolution and global spread of water hyacinth (Pontederia crassipes)
Huang, Yujie; Guo, Longbiao; Xie, Lingjuan; Shang, Nianmin; Wu, Dongya; Ye, Chuyu; Rudell, Eduardo Carlos; Okada, Kazunori; Zhu, Qian-Hao; Song, Beng-Kah; Cai, Daguang; Junior, Aldo Merotto; Bai, Lianyang; Fan, Longjiang
doi: 10.1093/gigascience/giae006
pmid: 38486346
Commelinales belongs to the commelinids clade, which also comprises Poales, the order that includes the most important monocot species, such as rice, wheat, and maize. No reference genome of Commelinales is currently available. Water hyacinth (Pontederia crassipes or Eichhornia crassipes), a member of Commelinales, is one of the most devastating aquatic weeds, although it is also grown as an ornamental and medicinal plant. Here, we present a chromosome-scale reference genome of the tetraploid water hyacinth with a total length of 1.22 Gb (over 95% of the estimated size) across 8 pseudochromosome pairs. Using representative genomes, we reconstructed a phylogeny of the commelinids, which supported Zingiberales and Commelinales being sister lineages of Arecales and shed light on the controversial relationship of the orders. We also reconstructed ancestral karyotypes of the commelinids clade and confirmed that the ancestral commelinid genome had 8 chromosomes rather than 5 as previously reported. Gene family analysis revealed contraction of disease-resistance genes during polyploidization of water hyacinth, likely a result of fitness requirements for its role as a weed. Genetic diversity analysis using 9 water hyacinth lines from 3 continents (South America, Asia, and Europe) revealed very closely related nuclear genomes and almost identical chloroplast genomes, and provided clues about the global dispersal of water hyacinth. The genomic resources of P. crassipes reported here contribute a crucial missing link for the commelinid species and offer novel insights into their phylogeny.