Background: Feed intake and body weight gain are economically important inputs and outputs of beef production systems. The purpose of this study was to discover differentially expressed genes that will be robust for feed intake and gain across a large segment of the cattle industry. Transcriptomic studies often suffer from issues with reproducibility and cross-validation. One way to improve reproducibility is by integrating multiple datasets via meta-analysis. RNA sequencing (RNA-Seq) was performed on longissimus dorsi muscle from 80 steers (5 cohorts, each with 16 animals) selected from the outside fringe of a bivariate gain and feed intake distribution to understand the genes and pathways involved in feed efficiency. In each cohort, 16 steers were selected from one of four gain and feed intake phenotypes (n = 4 per phenotype) in a 2 × 2 factorial arrangement with gain and feed intake as main effect variables. Each cohort was analyzed as a single experiment using a generalized linear model and results from the 5 cohort analyses were combined in a meta-analysis to identify differentially expressed genes (DEG) across the cohorts. Results: A total of 51 genes were differentially expressed for the main effect of gain, 109 genes for the intake main effect, and 11 genes for the gain x intake interaction (P < 0.05). A jackknife sensitivity analysis showed that, in corrected general, the meta-analysis produced robust DEGs for the two main effects and their interaction. Pathways identified from over-represented genes included mitochondrial energy production and oxidative stress pathways for the main effect of gain due to DEG including GPD1, NDUFA6, UQCRQ, ACTC1, and MGST3. For intake, metabolic pathways including amino acid biosynthesis and degradation were identified, and for the interaction analysis the pathways identified included GADD45, pyridoxal 5’phosphate salvage, and caveolar mediated endocytosis signaling. Conclusions: Variation among DEG identified by cohort suggests that environment and breed may play large roles in the expression of genes associated with feed efficiency in the muscle of beef cattle. Meta-analyses of transcriptome data from groups of animals over multiple cohorts may be necessary to elucidate the genetics contributing these types of biological phenotypes. Keywords: Beef cattle, Differential expression, Feed efficiency, RNA-Seq, Meta-analysis, Transcriptome * Correspondence: email@example.com Brittney N. Keel and Christina M. Zarek contributed equally to this work. USDA, ARS, U.S. Meat Animal Research Center, Clay Center, NE 68933, USA Full list of author information is available at the end of the article © The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. Keel et al. BMC Genomics (2018) 19:430 Page 2 of 11 Background and genetic effects) also contributes to reproducibility Feed costs are the major component of production costs issues. One way to improve reproducibility is by integrat- in the beef cattle industry, accounting for 55–75% of ing multiple datasets via meta-analysis. Meta-analysis total production costs [1–3]. One way to potentially re- procedures have been previously shown to produce duce these costs is to improve efficiency of beef cattle. results that are more likely to be valid in independent Feed intake and weight gain are two measureable compo- datasets [13–15]. nent phenotypes that are often used to characterize the A major aim of this study was to discover differentially feed efficiency of an animal. Dry matter intake (DMI), re- expressed genes that will be robust across a large seg- sidual feed intake (RFI), average daily gain (ADG), and ment of the cattle industry. As such, crossbred steers feed conversion ratio (FCR) have been shown to be under from a population representing 19 Bos taurus and Bos genetic control, with heritabilities estimated between 0.2 indicus breeds were used in this study. Many of the pre- and 0.5 [3, 4], indicating that these traits could be im- vious studies in cattle have used RFI as the phenotype, proved through selection. which is the difference between actual and expected feed National cattle evaluations routinely include informa- intake. The calculation of RFI is based on several factors tion from gain to predict genetic merit for growth. How- including, ADG, average daily feed intake (ADFI), ever, individual feed intake measurements are difficult growth rate, weight, and efficiency of growth . In and expensive to obtain. Hence, information is lacking order to gain a more detailed understanding of the role for generation of predictions of total genetic merit for of DEGs in feed efficiency we incorporated two compo- feed intake and efficiency in major cattle breeds. An im- nents of RFI: ADG and ADFI. Samples were collected proved understanding of the regulation of genes under- from animals in five separate feeding trials, and a meta- lying efficiency could improve the effectiveness of analysis procedure was implemented to identify DEGs selection for efficiency as well as significantly reduce the across the cohorts that could be attributed to gain and cost of doing so. feed intake, as well as the gain by intake interaction. Skeletal muscle is responsible for approximately 25% of an animal’s requirements for maintenance energy due to Methods its size and involvement with energy production [5, 6]. Bo- Animal care and use vine skeletal muscle has been the subject of several tar- The U.S. Meat Animal Research Center (USMARC) geted gene expression studies, attributable to its link with Animal Care and Use Committee reviewed and ap- feed efficiency via roles in mitochondrial energy produc- proved all animal procedures. The procedures for tion [5–8]. To date, a small number of studies have exam- handling cattle complied with the Guide for the Care and ined the bovine skeletal muscle transcriptome and its role Use of Agricultural Animals in Agricultural Research and in feed efficiency [9, 10]; however, none of these encom- Teaching . pass more than one contemporary group (or cohort) of cattle. Weber et al.  conducted a study comparing dif- Population ferential gene expression of low and high RFI animals A total of 80 steers, originating from the continuous using RNA sequencing (RNA-Seq) from skeletal muscle of phase of the USMARC Germplasm Evaluation project 16 Angus steers sired by one high RFI bull and one low , were used in this study. The Germplasm Evaluation RFI bull. Genes involved in fat deposition, immune/in- project is a breeding program intended to develop sev- flammatory function, and cell damage were identified as eral populations of cattle with a high percentage of top differentially expressed among these animals. Another U.S. beef breeds, including Angus, Beefmaster, Brahman, skeletal muscle transcriptome study was performed by Brangus, Brown Swiss, Charolais, Chiangus, Gelbvieh, Guo et al.  on 48 Brahman steers that identified cell Hereford, Limousin, Maine Anjou, Red Angus, Romosi- cycle, extracellular matrix and fat deposition genes in- nuano, Bonsmara, South Devon, Salers, Santa Gertrudis, volved in gain. Shorthorn, and Simmental. Non-reproducibility of results is a major problem in high-dimensional experiments such as gene expression Gain and feed intake analysis, where thousands of hypotheses are being tested Crossbred steers were collected as groups of 16 animals simultaneously . Due to the cost of sequencing, from 5 cohorts. Feed intake and body weight gain were RNA-Seq experiments are typically performed on a small measured on steers from 2012 to 2014 (Additional number of biological replicates, which limits their power file 1(A)). Steers were 350 ± 54 days of age at the begin- for detecting differentially expressed genes (DEG). Add- ning of the feeding trial and trials lasted 64–92 days, itionally, variability between studies due to technical differ- during which theywerefed acorn-basedfinishing diet ences (e.g., sample preparation, library protocols, batch (Tables 1 and 2). Steers were housed in pens (15.2 × 45.7 m) effects) as well as biological differences (e.g. environmental in a facility that was equipped with an Insentec Roughage Keel et al. BMC Genomics (2018) 19:430 Page 3 of 11 Table 1 Length of time and diets used for feed trials Year Born Time of Slaughter Start Date End Date Days on Study Diet 2011 Spring 2012 04/11/12 06/14/12 64 ME02 2011 Fall 2012 07/10/12 10/10/12 92 ME01 2012 Spring 2013 04/16/13 06/19/13 64 ME01 2012 Fall 2013 07/24/13 10/16/13 84 TM01 2013 Fall 2014 07/29/14 10/15/14 78 TM01 Diet composition provided in Table 2 Intake Control Feeding System (Insentec B.V., Marknesse, Animals with medical or health issues that might affect The Netherlands). Each pen housed approximately 50 gain or intake were also excluded from selection steers and had 8 feed bunks. Steers had access to all of the (Additional file 1(C)). A summary of the gain and feed in- feed bunks in the pen and the bunks measured individual take means, minimums, and maximums for each quadrant feed intake. Feeders and waterers were located inside an in each cohort is provided in Table 3. open-sided barn. The barn was 4.8 m high at the front After the feed trial ended, selected animals were and covered 162 m of the pen. comingled in a pen with ad libitum access to the same Steers were weighed on the first two and last two days diet. Length of time between the feed trial and slaughter and every three weeks during the study. Total gain was varied slightly by cohort: cohort 1 was 12–18 days, co- calculated by regressing body weight on days on study hort 2 was 19–22 days, cohort 3 was 5–8 days, cohort 4 using a quadratic polynomial. Average daily gain was cal- was 20–28 days, and cohort 5 was 11–14 days. Animals culated as total body weight gained divided by days on were allowed to consume feed and water until they were study. Sixteen steers were selected from each cohort by weighed on the day of slaughter and transported to the ranking them based on their standardized distance from US Meat Animal Center abattoir (under 6.4 km). Cohort the bivariate mean (of ADG and ADFI) assuming a bi- 5 was transported to a small commercial abattoir and pro- variate normal distribution with calculated correlation cessing plant approximately 20 miles away. Aside from between ADG and ADFI. The four steers with the great- transport location and distance, all other parameters est deviation from the bivariate mean in each Cartesian remained the same. On each day of slaughter, four of the quadrant were sampled. This resulted in the selection of animals selected (one from each phenotypic group) were 16 steers, 4 steers from each of 4 quadrants: the high stunned with captive bolt, exsanguinated, and processed gain-high intake quadrant, the high gain-low intake serially within a three-hour time frame. quadrant, the low gain-high intake quadrant, and the low gain-low intake quadrant (Fig. 1). To ensure that Tissue collection and RNA isolation breed composition was not confounded with quadrant Tissue collection and RNA extraction were performed within cohort, steers were selected to ensure that each using the same procedures in each cohort. A longissimus quadrant had multiple breeds represented. In the event a dorsi sample between the sixth and seventh rib from the sire breed was over represented within a quadrant, a steer right side of the carcass was taken 25–30 min post with the next highest ranking of a different breed was se- lected. Breed percentages for the selected animals in each quadrant in each cohort are shown in Additional file 1(B). Table 2 Composition of diets used in feed trials ME01 ME02 TM01 Dry rolled corn 57.35% 0% 57.35% Ground alfalfa hay 8% 8% 8% High moisture corn 0% 57.75% 0% Steakmaker® 4.25% 4.25% 0% Steakmaker with Tylan 0% 0% 4.25% Urea 0.4% 0% 0.4% Fig. 1 Total gain versus total dry matter intake over the trial period Wet distiller’s grains with solubles 30% 30% 30% was plotted for all animals (n = 80) used in this study. Cohorts are Manufactured by Land O’Lakes (Arden Hills, MN) represented by the color of the dots Tylan manufactured by Elanco Animal Health (Greenfield, IN) Keel et al. BMC Genomics (2018) 19:430 Page 4 of 11 Table 3 Summary statistics for ADG and ADFI (Kg/day) in the animals selected from each of the cohorts Mean ADG Min. ADG Max. ADG Mean ADFI Min. ADFI Max. ADFI Spring 2012 High Gain-High Intake 2.22 1.98 2.49 26.83 22.08 31.10 Low Gain-High Intake 1.55 1.42 1.64 23.48 21.83 25.61 Low Gain-Low Intake 1.21 1.05 1.52 14.54 12.25 18.97 High Gain-Low Intake 1.85 1.75 2.05 15.29 13.91 16.65 Fall 2012 High Gain-High Intake 2.26 2.09 2.36 8.49 8.00 8.91 Low Gain-High Intake 1.68 1.53 1.90 8.24 7.8 8.55 Low Gain-Low Intake 1.54 1.43 1.70 5.81 5.30 6.20 High Gain-Low Intake 2.02 1.35 2.31 7.31 6.19 9.70 Spring 2013 High Gain-High Intake 2.06 1.90 2.26 13.82 12.40 14.65 Low Gain-High Intake 1.28 1.14 1.51 12.22 11.41 13.40 Low Gain-Low Intake 0.90 0.86 0.99 8.48 7.39 9.22 High Gain-Low Intake 1.70 1.56 1.79 9.35 8.60 10.04 Fall 2013 High Gain-High Intake 2.23 1.93 2.43 14.40 12.09 16.53 Low Gain-High Intake 1.48 1.02 1.84 13.51 12.06 14.93 Low Gain-Low Intake 1.61 1.33 1.88 8.84 8.52 9.43 High Gain-Low Intake 1.98 1.90 2.07 9.08 7.51 9.98 Fall 2014 High Gain-High Intake 2.43 2.20 2.63 12.27 10.92 13.67 Low Gain-High Intake 1.65 1.51 1.76 10.52 10.10 10.79 Low Gain-Low Intake 1.47 1.10 1.73 7.12 6.74 7.37 High Gain-Low Intake 2.24 2.19 2.31 9.13 9.06 9.19 exsanguination. Sample collection time frame was consist- RNA sequencing ent across cohorts. The sample was diced into approxi- Samples were prepared for RNA sequencing with the Illu- mately one cubic centimeter pieces, flash frozen in liquid mina TruSeq Stranded mRNA High Throughput Sample nitrogen, and stored at − 80 °C until RNA isolation. Muscle kit and protocol (Illumina Inc., San Diego, CA, USA). The samples (50 to 100 mg) from the 80 animals were homoge- libraries were quantified with qRT-PCR using the NEBNext nized with a six station Omni Prep homogenizer (Omni Library Quant Kit (New England Biolabs, Inc., Beverly, International, Kennesaw, GA, USA) in one milliliter MA, USA) on a CFX384 thermal cycler (Bio-Rad, Hercules, of Trizol. Total RNA was isolated according to the manu- CA, USA), and the quality of the library was determined facturer’s protocol and was resuspended in 50–100 μL with an Agilent Bioanalyzer DNA kit (Santa Clara, CA, RNase-free water. USA). The libraries were diluted with Tris-HCL 10 mM, Genomic DNA was removed from the total RNA with pH 8.5 with 0.1% Tween 20 to 10 nM samples (Teknova, the Qiagen RNeasy mini-kit (Valenci, CA, USA), accord- Hollister, CA. USA). The libraries were pooled into eight ing to the manufacturer’s protocol. The concentration of pools of 12 libraries in each according to Illumina’sdual- the RNA was determined with a Nanodrop 8000 spectro- index protocol (samples were submitted for sequencing photometer (Thermo Scientific, Wilmington, DE, USA). with 16 libraries from another RNA-Seq study). All 80 sam- The average 260/280 ratio was 2.04, with a range of ples were paired-end sequenced with 150 cycle high output 1.95–2.13. An Agilent Bioanalyzer RNA 6000 nano kit sequencing kits for the Illumina NextSeq instrument. (Santa Clara, CA, USA) was used to determine the RNA integrity number (RIN). Only samples with a RIN of 7.0 Processing RNA-Seq data and higher were used for the RNA sequencing. The aver- Read alignment of the RNA-Seq data was carried out as age RIN was 7.8, with a range of 7.0–8.5. follows. First, quality of the raw paired-end sequence Keel et al. BMC Genomics (2018) 19:430 Page 5 of 11 reads in individual fastq files was assessed using FastQC This test provides a meta P-value, and classical procedures (Version 0.11.5; www.bioinformatics.babraham.ac.uk/ for multiple testing correction can be applied to obtain P- projects/fastqc), and reads were trimmed to remove values adjusted to control the false discovery rate. The adapter sequences and low-quality bases using the Trim- Benjamini-Hochberg method  was used to correct for momatic software (Version 0.35) . The remaining multiple testing. Genes with adjusted meta P-value ≤0.05 reads were mapped to the UMD 3.1 genome assembly were considered statistically significant. DEGs associated using Tophat2 (Version 2.1.1) , and the NCBI anno- with a main effect and also the interaction term were tation for UMD 3.1 was used to guide the alignment. excluded from the main effect list of DEGs since this We used the HTSeq package  to estimate the count indicates that the main effect is dependent on the of uniquely mapped reads for each of the 28,451 anno- interaction term. tated genes in the NCBI UMD 3.1 gene transfer format (GTF) file. Genes with low read counts, < 15 reads in at Jackknife reproducibility analysis least 16 samples, were removed resulting in a set of Robustness of the results was evaluated using a jackknife 13,511 genes to be used in our downstream analysis. sensitivity analysis; i.e. the meta-analysis procedure was repeated multiple times, each time with removal of a Meta-analysis of differential gene expression single cohort from the baseline group of cohorts . Recently, generalized linear models (GLMs) have been utilized for analyzing differential gene expression in Functional analysis of DEGs RNA-Seq experimental designs involving multiple ex- Functions of DEGs were determined using the PAN- planatory factors [22–24]. In such a GLM, analysis of THER classification system (Version 11.0) . Enrich- deviance (ANODEV) is used to identify DEGs associated ment analysis of gene function was performed using with individual factors and their interactions. We used PANTHER’s implementation of the binomial test of the following GLM with a negative binomial link func- overrepresentation. Significance of gene ontology (GO) tion, which simultaneously considers two explanatory terms was assessed using the default Ensembl Bos taurus variables, gain (H vs. L) and intake (H vs. L), to analyze GO annotation as background for the enrichment ana- differential expression: lysis. Data from PANTHER was considered statistically significant at Bonferroni corrected P ≤ 0.05. Y ¼ Gain þ Intake þ Gain Intake ð1Þ QIAGEN ingenuity® pathway analysis We used the R package DESeq2  to perform our Ingenuity® Pathway Analysis (IPA®, QIAGEN Redwood differential expression analysis. DESeq2 uses a negative City, CA; https://www.qiagenbioinformatics.com/products/ binomial distribution to model gene read counts and ingenuity-pathway-analysis/) was used to deduce direct shrinkage estimators to estimate the per-gene negative and indirect molecular relationships among differentially binomial dispersion parameters. expressed genes for the gain main effect, intake main ef- The function nbinomLRT, which performs a likelihood- fect, and gain x intake interaction. Each of the data sets ratio test between a full and a reduced negative binomial was imported with a Flexible Format using Gene symbol as GLM, was used to test three separate null hypotheses. the identifier. A core analysis was performed on genes in Null hypothesis 1 tested whether each gene was signifi- each set, where a P-value for each network is calculated ac- cantly affected by gain, null hypothesis 2 tested whether cording to the fit of the users set of significant genes and each gene was significantly affected by intake, and null hy- the size of the network. P-values were considered statisti- pothesis 3 tested whether each gene was significantly af- cally significant at Benjamini-Hochberg adjusted P ≤ 0.05. fected by the interaction between gain and intake. In the meta-analysis, each cohort was analyzed separ- Results ately using the GLM in Eq. (1). The raw P-values for Sequencing throughput, read mapping, and read counts each gene from each of the five analyses were combined RNA-Seq libraries from the longissimus dorsi muscle tis- using Fisher’s method , which combines P-values sue of 80 steers with divergent components of feed effi- from each experiment into one test statistic defined as ciency were sequenced. We generated over 6 billion 75- bp paired-end reads using an Illumina NextSeq instru- X ¼ −2 ln p ; ð2Þ gs ment. The range of raw sequence reads per sample was s¼1 23.4 million to 194.7 million, with an average of 79.9 where p denotes the raw P-value obtained from gene g in million reads per sample (Additional file 1(D)). Adapter gs experiment s and S is the number of experiments being sequences and low quality bases were trimmed with the combined. Under the null hypothesis, the test statistic Trimmomatic software, which resulted in approximately X follows a χ distribution with 2S degrees of freedom. a 0.02% reduction in the number of reads across the 80 Keel et al. BMC Genomics (2018) 19:430 Page 6 of 11 samples. The resulting high quality reads were mapped to Additional file 5). The number of DEGs identified in the the Bos taurus UMD 3.1 genome assembly with an aver- jackknife analyses for the intake main effect varied more age 93.4% overall mapping rate. Computing read counts than for gain, with 219, 60, 56, 59, and 38 DEGs for the for each gene and filtering out genes with low read counts jackknife analysis that removed Cohort 1, 2, 3, 4, and 5, re- resulted in a set of 13,511 genes to be used for down- spectively (Jacknife P < 0.05 in Additional file 6). The jack- stream analysis. knife analyses for the interaction effect identified 21, 16, 2, 3, and 6 DEGs for the jackknife analysis that removed Meta-analysis of DEGs associated with gain and feed Cohort 1, 2, 3, 4, and 5, respectively (Jacknife P <0.05 in intake across cohorts Additional file 7), which was highly similar to the number Selection of individuals with extreme gain and feed in- of DEGs identified in the full meta-analysis (11 DEGs). take phenotypes to be used in this study was done For the gain main effect, there were no DEGs that within cohort. Figure 1 shows ADG versus ADFI over were robust enough to pass all five jackknife analyses. the feeding period for all animals (n = 80) used in this Eleven DEGs failed only one jackknife analysis, while 19, study. This plot clearly illustrates that there is a segrega- 18, 3, and 0 DEGs failed to pass 2, 3, 4, and 5 jackknife tion of phenotypes across cohorts. analyses, respectively. For the intake main effect, there Since our goal was to identify DEGs that could explain were 2 DEGs, ZNF775 and CST6, that passed all five the overall variation in gain and feed intake across all co- jackknife analyses, and there were 2 DEGs, TMEM120A horts, we performed a meta-analysis of differential expres- and LOC508916, that failed all five jackknife analyses. sion across the 5 cohorts. In this procedure differential Thirty DEGs failed only one jackknife analysis, while 32, gene expression analysis was performed independently for 32, and 11 DEGs failed to pass 2, 3, and 4 jackknife ana- each cohort using the GLM shown in Eq. (1). There was lyses, respectively. For the interaction effect, there were variation among the cohorts in the number of genes iden- no DEGs that passed all five tests and none that failed tified as differentially expressed with P ≤ 0.05 after FDR all five, while 3, 6, 1, and 1 DEGs failed to pass 1, 2, 3, correction (data not shown). For the gain main effect, and 4 tests, respectively. there were 4, 0, 14, 10, and 0 DEGs in Cohort 1, 2, 3, 4, and 5, respectively. For the intake main effect, 0, 0, 14, 10, Function of DEGs and 0 genes were identified as differentially expressed. PANTHER gene ontology analysis of the DEGs indicated The analysis of the gain by intake interaction produced 0, that genes that were significant for the gain main effect 0, 14, 10, and 0 DEGs. were involved in catalytic activity (47.4%), binding (31.6%), Raw P-values for each gene from each individual cohort structural molecule activity (15.8%), and antioxidant activ- analysis were then combined using Fisher’smethod. After ity (5.3%). No GO terms were significantly over- or under- multiple testing correction, we identified 51 significant represented in this gene set. genes for the gain main effect, 109 genes that were signifi- Similar to the gain main effect genes, genes that were cant for the intake main effect, and 11 significant genes for significant for the intake main effect were involved in the gain x intake interaction (Additional files 2, 3 and 4). binding (38.8%), catalytic activity (36.7%), transporter ac- Significant genes were inspected for consistency, defined as tivity (10.2%), receptor activity (8.2%), structural mol- having the same log-fold change direction across all 5 co- ecule activity (4.1%), and signal transducer activity (2%). horts. We found that only 4 significant genes (LOC515676, Again, no GO terms were significantly over- or under- UQCRQ, NPR3, C5H12orf5) were consistent for the gain represented in this set. main effect, while 8 DEGs (IQANK1, LOC101904159, Genes that were significant for the gain x intake inter- LOC101904117, CD163, MCHR1, MFSD4, OAT, TNNI1) action were involved in catalytic activity (50%), binding were consistent for the intake main effect. No DEGs were (16.7%), transporter activity (16.7%), and structural mol- consistent for the gain x intake interaction. ecule activity (16.7%). Enrichment analysis of GO terms did not identify any over- or under-represented GO Jackknife analysis terms in these genes. Robustness of the results were assessed using a jackknife sensitivity analysis, where for each term in the model Ingenuity pathway analysis (IPA) five separate meta-analyses were performed each omit- Ingenuity® Pathway Analysis was performed in order to ting a single cohort. The results are shown in Additional characterize the functional consequences of gene expres- files 5, 6 and 7. For the gain main effect, the jackknife sion differences for the gain main effect, intake main ef- analyses produced similar numbers of DEGs to that of fect, and gain x intake interaction. IPA identified thirteen the original meta-analysis (51 DEGs): 69, 25, 17, 44, and significant canonical pathways for the DEGs associated 51 DEGs for the jackknife analysis that removed Co- with the gain main effect (Table 4). The top five canonical hort 1, 2, 3, 4, and 5, respectively (Jacknife P <0.05 in pathways included mitochondrial dysfunction represented Keel et al. BMC Genomics (2018) 19:430 Page 7 of 11 Table 4 Significant pathways for DEGs associated with the gain main effect identified using IPA Pathway P-value DEGs in Pathway Mitochondrial Dysfunction 0.00296 BCL2, NDUFA6, UQCRQ Glycerol-3-phosphate shuttle 0.00672 GPD1 Glycerol degradation I 0.0101 GPD1 Death receptor signaling 0.0107 ACTC1, BCL2 VEGF signaling 0.013 ACTC1, BCL2 Oxidative phosphorylation 0.0145 NDUFA6, UQCRQ Phosphatidylethanolamine biosynthesis II 0.0151 ETNK2 Pancreatic adenocarcinoma signaling 0.0169 BCL2, RALGDS Gai signaling 0.0174 NPR3, RALGDS nNOS signaling in skeletal muscle cells 0.025 SNTA1 Glutathione redox reactions I 0.0397 MGST3 NRF2-mediated oxidative stress response 0.0418 ACTC1, MGST3 ILK signaling 0.043 ACTC1, LIMS2 Corrected for multiple testing using Benjamini Hochberg method by the genes BCL2, NDUFA6, UQCRQ (P < 0.01), these genes included cellular movement, cell cycle, cellular glycerol-3-phosphate shuttle due to GPD1 (P <0.01), gly- development, cellular growth and proliferation, and cell cerol degradation I from gene GPD1 (P = 0.0101), death death and survival. receptor signaling from DEGs BCL2, ACTC1 (P = 0.0107), Lastly, for the interaction effect there were four signifi- and VEGF signaling due to ACTC1, BCL2 (P = 0.013). Mo- cant pathways (Table 6), GADD45 signaling due to lecular and cellular functions related to these genes in- GADD45B (P <0.01), pyridoxal 5′-phosphate salvage cluded amino acid metabolism, molecular transport, small pathway due to PDXK (P < 0.001), caveolar-mediated molecule biochemistry, cellular growth and proliferation, endocytosis signaling due to ALB (P = 0.0327), and ATM and cell death and survival. signaling due to GADD45B (P = 0.0368). Molecular and For the intake main effect, IPA identified twelve sig- cellular functions related to these genes included cellular nificant pathways (Table 5), including 4-hydroxyproline movement, amino acid metabolism, antigen presentation, degradation I due to HOGA1 (P < 0.01), methyglyoxal carbohydrate metabolism, and cell death and survival. degradation I because of HAGHL (P =0.012), D- glucuronate degradation I due to DCXR (P = 0.0120), Discussion glycerol-3-phosphate shuttle because of GPD1 (P =0.016), To date, there have been very few transcriptome meta- and arginine degradation (arginase pathway) due to OAT analyses for livestock species reported in the literature. (P = 0.016). Molecular and cellular functions related to RNA-Seq experiments, especially those performed in Table 5 Significant pathways for DEGs associated with the intake main effect identified using IPA Pathway P-value DEGs in Pathway 4-hydroxyproline degradation I 0.00803 HOGA1 Methylglyoxal degradation I 0.012 HAGHL D-glucuronate degradation I 0.012 DCXR Glycerol-3-phosphate shuttle 0.016 GPD1 Arginine degradation I (arginase pathway) 0.016 OAT Arginine biosynthesis IV 0.0239 OAT Proline biosynthesis II (from arginine) 0.0239 OAT Arginine degradation VI (arginase 2 pathway) 0.0239 OAT Glycerol degradation I 0.0239 GPD1 CNTF signaling 0.0267 CNTFR, HRAS Citrulline biosynthesis 0.0318 OAT PPARa/RXRa activation 0.0362 CHD5, GPD1, HRAS Corrected for multiple testing using Benjamini Hochberg method Keel et al. BMC Genomics (2018) 19:430 Page 8 of 11 Table 6 Significant pathways for DEGs associated with the gain Genes passing all five jackknife analyses can be consid- by intake interaction effect identified using IPA ered highly robust, as they are not dependent on any Pathway P-value DEGs in Pathway one cohort. Genes that fail multiple jackknife tests can also be interpreted as robust, where the higher number GADD45 signaling 0.00886 GADD45B of failed tests indicates greater robustness. This inter- Pyridoxal 5′-phosphate salvage pathway 0.03 PDXK pretation can be derived as follows. If a gene fails only Caveolar-mediated endocytosis signaling 0.0327 ALB one jackknife test, this indicates that the meta P-value is ATM signaling 0.0368 GADD45B being driven by the P-value arising from this single co- Corrected for multiple testing using Benjamini Hochberg method hort. Hence, there may be some cohort bias for that gene. On the other hand, if a gene fails multiple jack- knife tests, then the meta P-value is being driven by the livestock, are routinely performed on a small number of P-values of multiple cohorts, i.e. there is a reduced level biological replicates due to the cost of sequencing. The of cohort bias. limited power in these studies coupled with both tech- We saw that only two genes passed all five jackknife nical and biological variability between studies can lead tests and two genes failed all five jackknife tests for the to issues with reproducibility and cross-validation. Meta- intake main effect, and none passed or failed all five tests analysis can help improve research findings in these for gain and interaction. That is, there were more highly types of studies by eliminating false-positive findings robust genes observed for the intake main effect than pertaining to experimental and design conditions. More- the gain main effect. It has been shown that DMI is a over, integrating data across multiple experiments may moderately repeatable trait, while ADG exhibits low re- enable extraction of deeper biological insights compared peatability [29, 30]. In general, DEGs associated with to that achieved through single-study analysis. To our both main effects tended to be moderately robust, with knowledge, this is the first RNA-Seq study in livestock 41.1 and 41.3% of DEGs failing at least 3 jackknife ana- where data was collected over multiple cohorts, each co- lyses. Genes being driven by a single cohort in the meta- hort serving as a separate experiment, and analyzed analysis (i.e. those failing exactly one jackknife test) rep- using a meta-analysis procedure. resent potential false-positives. The addition of more co- The purpose of this study was to identify genes differ- horts to the meta-analysis should efficiently remove entially expressed in the muscle of beef cattle associated those that are false findings, as increasing the number of with gain and feed intake that will be robust across the large P-values in the multiplication performed in Fisher’s cattle industry for selection of more feed efficient ani- method will increase the meta P-value. Moreover, adding mals. There were a total of 19 breeds (plus MARC II more cohorts to the meta-analysis will increase the ro- and MARC III composites) represented in the study with bustness of the results. all but three of them, Bonsmara, Romosinuano, and Some of the robust genes in this study have been pre- South Devon, represented in multiple cohorts and in viously identified as candidate genes for feed efficiency more than one phenotypic class (for example, animals or as DEG associated with feed efficiency in livestock. with Chiangus as a portion of their breed-of-origin are For example, CST6, which passed all 5 jackknife tests for represented in the low gain-high intake class in Fall intake was also found to be differentially expressed in 2012 and the low gain- high intake and low gain-low in- the muscle tissue of pigs with variation in RFI . In take classes in Fall 2013). Moreover, both fall and spring addition, one of the genes that failed all 5 jackknife tests seasons over 3 years are represented among the five co- for intake was LOC508916, which is potentially a car- horts. The rationale for this design was to generate data boxylesterase 1-like gene. Two carboxylesterase genes that would include the most robust drivers of gene ex- (CES1, CES3) were identified as two of the most highly pression affecting feed intake and gain. differentially expressed genes in the adipose tissue of The variation among the lists of genes identified as dif- pigs with low RFI . ferentially expressed by cohort in this study underscores One of the four genes identified in the interaction ana- the importance of including animals from more than one lysis was albumin (ALB). Plasma and serum levels of al- cohort of livestock to obtain biologically relevant data for bumin have been associated previously with feed complex traits. Validation of transcriptomic or proteomic efficiency in steers and lambs. Paula et al.  demon- data is likely to produce poor reproducibility from study strated that the most efficient lambs had lower serum al- to study due to the large amount of biological variation bumin concentrations. This phenomenon was also from sources that include breed and environmental fac- identified in the plasma of beef steers . Another tors. For this reason, we chose to measure reproducibility study by Connell et al.  showed a relationship be- by replication validity rather than in independent data, tween serum albumin levels and DMI in sheep. In this such as cross-validation. study, the transcript abundance of ALB is higher in Keel et al. BMC Genomics (2018) 19:430 Page 9 of 11 animals with low gain and high intake. Our analysis in- pathways, and cell signaling pathways. This work is a cluded gain and intake phenotypes, suggesting that the first step in integrating sequence data from multiple co- level of albumin transcripts may either be influencing or horts to identify potential biomarkers related to the gain responding to both phenotypes, rather than intake alone. and feed intake of beef cattle. Further study is needed to The gene UQCRQ was identified as being associated understand the role of natural variation in the skeletal with gain. This gene had the same direction of expres- muscle and its contribution to feed efficiency. sion in all five cohorts of animals (i.e., lower transcript abundance with higher gain). The UQCRQ gene is in- Additional files volved in mitochondrial energy production and has also Additional file 1: (A) Cohort and age of animals selected for this study. been associated with feed efficiency in previous studies. (B) Breed percentages of phenotypic quadrants for each cohort (HH = high Kong et al.  identified higher transcript abundance gain, high intake; LH = low gain, high intake; LL = low gain, low intake; of UQCRQ in the rumen tissue of animals with low RFI. HL = high gain, low intake). (C) Rationale for exclusion of animals within each cohort. (D) Sequencing statistics. (XLSX 27 kb) While it is difficult to cross compare these two pheno- Additional file 2: DEGs associated with the gain main effect in the types, it is of interest that this gene and the mitochondrial individual cohort and meta analyses. Genes are ordered by adjusted energy pathways were identified in both studies. One im- meta-P-value. The individual cohort cells for DEGs identified in the portant detail to note about the UQCRQ gene is that it meta-analysis are colored according to the sign of their log2 fold change, where green indicates up-regulation and red indicates down-regulation was one of the genes that failed only one of the jackknife in high gain. Genes with all gray cell indicate those that were excluded tests, indicating potential cohort bias. As mentioned be- because they were also significant for the gain by intake interaction term. fore, data from additional cohorts is needed to determine (XLSX 1224 kb) if this gene is indeed a robust biomarker of gain. Additional file 3: DEGs associated with the intake main effect in the individual cohort and meta analyses. Genes are ordered by adjusted Recently, lower expression of MYOZ2 was detected in meta-P-value. The individual cohort cells for DEGs identified in the the muscle of chickens with high feed efficiency , meta-analysis are colored according to the sign of their log2 fold change, while another study found the expression of MYOZ2 in where green indicates up-regulation and red indicates down-regulation in high intake. Genes with all gray cell indicate those that were excluded a different population of chickens to be opposite in dir- because they were also significant for the gain by intake interaction term. ection, with lower expression among birds with higher (XLSX 1241 kb) feed efficiency . We found MYOZ2 to be expressed Additional file 4: DEGs associated with the gain by intake interaction in higher transcript abundance in four of the five cohorts effect in the individual cohort and meta analyses. Genes are ordered by adjusted meta-P-value. The individual cohort cells for DEGs identified in of steers with higher gain. Similar to the two studies in the meta-analysis are colored according to the sign of their log2 fold which MYOZ2 was associated with feed efficiency in change, where green indicates up-regulation and red indicates down- chickens, this gene does not show 100% concordance in regulation in low gain, high intake. (XLSX 1218 kb) its direction of gene expression. The differences in expres- Additional file 5: Jackknife sensitivity analysis results for the DEGs associated with the gain main effect. Genes with all gray cell indicate sion studies in chickens were potentially attributed to dif- those that were excluded because they were also significant for the gain ferences in genetics or other factors . In our study, by intake interaction term. Jackknife 1 P-value gives the adjusted P-value there is some variation in breed representation among our for the meta-analysis with Cohort 1 removed, Jackknife 2 P-value gives the adjusted P-value for the meta-analysis with Cohort 2 removed, and so on. phenotypic groups, but there is also variation in environ- Yellow cells indicate jackknife analyses where the P-value was insignificant, i.e. ment for each of the 5 cohorts, which may also be contrib- the gene failed to pass the jackknife analysis. (XLSX 987 kb) uting to the variation in the direction of gene expression. Additional file 6: Jackknife sensitivity analysis results for the DEGs Most of the DEGs identified in our analysis exhibited associated with the intake main effect. Genes with all gray cell indicate those that were excluded because they were also significant for the gain variation in the direction of expression among cohorts. by intake interaction term. Jackknife 1 P-value gives the adjusted P-value This supports the hypothesis that the expression of for the meta-analysis with Cohort 1 removed, Jackknife 2 P-value gives DEGs may be environmentally driven. It is also possible the adjusted P-value for the meta-analysis with Cohort 2 removed, and so on. Yellow cells indicate jackknife analyses where the P-value was insignificant, i.e. that other genes that were not significant in our differ- the gene failed to pass the jackknife analysis. (XLSX 1095 kb) ential expression analysis are involved in the regulation Additional file 7: Jackknife sensitivity analysis results for the DEGs of these DEGs and pathways. Future work will focus on associated with the gain by intake interaction. Jackknife 1 P-value gives the using gene expression profiles and clustering analysis to adjusted P-value for the meta-analysis with Cohort 1 removed, Jackknife 2 P-value gives the adjusted P-value for the meta-analysis with Cohort 2 identify additional regulatory genes that may play a role removed, and so on. Yellow cells indicate jackknife analyses where the in ADG and ADFI phenotypes. P-value was insignificant, i.e. the gene failed to pass the jackknife analysis. (XLSX 1030 kb) Conclusions Data presented here demonstrate that finishing beef cat- Abbreviations tle with divergent ADFI and ADG phenotypes have gene ADFI: Average daily feed intake; ADG: Average daily gain; ANODEV: Analysis of deviance; DEG: Differentially expressed gene(s); DMI: Dry matter intake; expression differences that indicate that there are poten- DNA: Deoxyribonucleic acid; FCR: Feed conversion ratio; GLM: Generalized tially differences in mitochondrial energy production linear model; GO: Gene ontology; IPA: Ingenuity pathway analysis; and oxidative stress pathways, amino acid metabolism NCBI: National Center for Biotechnology Information; PANTHER: Protein Keel et al. BMC Genomics (2018) 19:430 Page 10 of 11 analysis through evolutionary relationships; RFI: Residual feed intake; muscle of beef heifers phenotypically divergent for residual feed intake. J RNA: Ribonucleic acid; RNA-Seq: Ribonucleic acid sequencing Anim Sci. 2013;91:159–67. 7. Fonseca LFS, Gimenez DFJ, Mercadante MEZ, Boniha SFM, Ferro JA, Baldi F, Acknowledgements Souza FRP, Albuquerque LG. Expression of genes related to mitochondrial The authors wish to acknowledge Linda Flathman for her outstanding function in Nellore cattle divergently ranked on residual feed intake. Mol technical and laboratory assistance; the USMARC Core Laboratory for Biol Rep. 2015;42:559–65. performing the sequencing; Troy Gramke and Sam Nejezchleb for their 8. Lindholm-Perry AK, Kuehn LA, Oliver WT, Sexten AK, Miles JR, Rempel LA, assistance with sample collection; and the USMARC Cattle Operations and Cushman RA, Freetly HC. Adipose and muscle tissue gene expression of Abattoir staff. Mention of a trade name, proprietary product, or specified two genes (NCAPG nad LCORL) located in a chromosomal region equipment does not constitute a guarantee or warranty by the USDA and associated with cattle feed intake and gain. PLoS One. 2013;8(11):e80882. does not imply approval to the exclusion of other products that may be 9. Tizioto PC, Coutino LL, Oliveira PSN, Cesar ASM, Diniz WJS, Lima AO, suitable. The USDA is an equal opportunity provider and employer. Rocha MI, Decker JE, Schnabel RD, Mourão GB, Tullio RR, Zerlotini A, Taylor JF, Regitano LCA. Gene expression differences in longissimus Funding muscle of Nelore steers genetically divergent for residual feed intake. This work was funded by CRIS #3040–31000-097-00D of the Agricultural Sci Rep. 2016;6:39493. Research Service, a division of the US Department of Agriculture. 10. Weber KL, Welly BT, Van Eenennaam AL, Young AE, Porto-Neto LR, Reverter A, Rincon G. Identification of gene networks for residual feed intake in Angus Availability of data and materials cattle using genomic prediction and RNA-seq. PLoS One. 2016;11:e0152274. Raw RNA sequence data for each the 80 animals used in this study was 11. Guo B, Greenwood PL, Café LM, Zhou G, Zhang W, Dalrymple BP. submitted to the National Center for Biotechnology Information Sequence Transcriptome analysis of cattle muscle identifies potential markers for skeletal Read Archive (SRA) with Accession Number SRP092904. muscle growth rate and major cell types. BMC Genomics. 2015;16:177. 12. Shi L, Jones WD, Jensen RV, Harris SC, Perkins RG, Goodsaid FM, Guo L, Authors’ contributions Croner LJ, Boysen C, Fang H, et al. The balance of reproducibility, sensitivity, BNK: Processed and analyzed RNA sequencing data, designed and implemented and specificity of lists of differentially expressed genes in microarray studies. the meta-analysis procedure, and drafted the manuscript; CMZ: Assisted with BMC Bioinformatics. 2008;9(Suppl 9):S10. design of the experiment, collected samples, performed laboratory work and 13. Li MD, Burns TC, Morgan AA, Khatri P. Integrated multi-cohort drafted the manuscript; JWK: Participated in discussions regarding data analyses; transcriptional meta-analysis of neurodegenerative diseases. Acta LAK: Participated in discussions regarding data analyses; WMS: Participated in Neuropathol Commun. 2014;2:93. discussions regarding data analyses; WTO: Participated in discussions regarding 14. Andres-Terre M, McGuire HM, Pouliot Y, Bongen E, Sweeney TE, Tato CM, interpretation of results; HCF: Performed animal phenotyping and analyses for Khatri P. Integrated, multi-cohort analysis identifies conserved transcriptional animal selection, collected samples; AKL-P: Conceived and designed experiments, signatures across multiple respiratory viruses. Immunity. 2015;43:1199–211. participated in discussions regarding data analyses, and drafted the manuscript. 15. Sweeney TE, Braviak L, Tato CM, Khatri P. Genome-wide expression for All authors made significant contributions editing the manuscript. All authors read diagnosis of pulmonary tuberculosis: a multicohort analysis. Lancet Respir and approved the final manuscript. Med. 2016;4:213–24. 16. Arthur JPF, Herd RM. Residual feed intake in beef cattle. Rev Bras Zootec. Ethics approval 2008;37:269–79. The U.S. Meat Animal Research Center (USMARC) Animal Care and Use 17. FASS. Guide for care and use of agricultural animals in agricultural research Committee reviewed and approved all animal procedures. and teaching. Savoy, IL: Fed. Animal Science Society; 1999. 18. Schiermiester LN, Thallman RM, Kuehn LA, Kachman SD, Spangler ML. Competing interests Estimation of breed-specific heterosis effects for birth, weaning, and An author of this manuscript, Warren Snelling, is a member of the BMC yearling weight in cattle. J Anim Sci. 2015;93:46–52. Genomics editorial board as an Associate Editor. 19. Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30:2114–20. 20. Trapnell C, Pachter L, Salzberg SL. TopHat: discovering splice junctions with Publisher’sNote RNA-Seq. Bioinformatics. 2009;25:1105–11. Springer Nature remains neutral with regard to jurisdictional claims in published 21. Anders S, Pyl PT, Huber W. HTSeq- a Python framework to work with high- maps and institutional affiliations. throughput sequencing data. Bioinformatics. 2015;31:166–9. 22. Anders S, Huber W. Differential expression of RNA-Seq data at the gene Author details level- the DESeq package. Heidelberg, Germany: European Molecular USDA, ARS, U.S. Meat Animal Research Center, Clay Center, NE 68933, USA. Biology Laboratory (EMBL); 2012. Current Affiliation: UT Southwestern Medical Center, Dallas, TX 75390, USA. 23. Love M, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014;15:1. Received: 9 February 2017 Accepted: 9 May 2018 24. Robinson MD, McCarthy DJ, Smyth GK. edgeR: s Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2010;26:139–40. References 25. Fisher RA. Statistical methods for research workers. Edinburgh: Oliver and 1. Moore SS, Mujibi FD, Sherman EL. Molecular basis for residual feed intake in Boyd; 1932. beef cattle. J Anim Sci. 2009;87(Suppl):E41–7. 26. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and 2. Trenkle A, Willham RL. Beef production efficiency. Science. 1977;198:1009–15. powerful approach to multiple testing. J R Stat Soc Ser B. 1995;57:289–300. 3. Aloha JK, Hill RA. Input Factors Affecting Profitability: A changing paradigm 27. Miller RG. The jackknife:a review. Biometrika. 1974;61:1–15. and a challenging time. 2012: Feed Efficiency in the Beef Industry, First 28. Mi H, Poudel S, Muruganujan A, Casagrande JT, Thomas PD. PANTHER Edition. John Wiley & Sons. version 10: expanded protein families and functions, and analysis tools. 4. Rolfe KM, Snelling WM, Nielsen MK, Freetly HC, Ferrell CL, Jenkins TG. Nucleic Acids Res. 2016;44(D1):D336–42. Genetic and phenotypic parameter estimates for feed intake and other 29. Culbertson MM, Speidel SE, Peel RK, Cockrum RR, Thomas MG, Enns RM. traits in growing beef cattle, and opportunities for selection. J Anim Sci. Optimum measurement period for evaluating feed intake traits in beef 2011;89(11):3452–9. 5. Kelly AK, Waters SM, McGee M, Fonseca RG, Carberry C, Kenny DA. mRNA cattle. J Anim Sci. 2015;93:2482–7. expression of genes regulating oxidative phosphorylation in the muscle of 30. Wang Z, Nkrumah JD, Basarab JA, Goonewardene LA, Okine EK, Crews DH beef cattle divergently ranked on residual feed intake. Phsyiological Jr, Moore SS. Test duration for growth, feed intake, and feed efficiency in Genomics. 2011;42:12–23. beef cattle using the GrowSafe system. J Anim Sci. 2006;84:2289–98. 6. Kelly AK, Waters SM, McGee M, Browne JA, Magee DA, Kenny DA. 31. Jing L, Hou Y, Wu H, Miao Y, Li X, Cao J, Brameld JM, Parr T, Zhao S. Expression of key genes of the somatotropic axis in longissimus dorsi Transcriptome analysis of mRNA and miRNA in skeletal muscle indicates an Keel et al. BMC Genomics (2018) 19:430 Page 11 of 11 important network for differential residual feed intake in pigs. Sci Rep. 2015;5: 32. Louveau I, Vincent A, Tacher S, Glibert H, Gondret F. Increased expressions of genes and proteins involved in mitochondrial oxidation and antioxidant pathway in adipose tissue of pigs selected for low residual feed intake. J Anim Sci. 2016;94:5042–54. 33. Paula EFE, Souza DF, Monteiro ALG, Almeida Santana MH, Gilaverte S, Junior PR, Dittrich L. Residual feed intake and hematological and metabolic blood profiles of Ile de France lamb. R Bras Zootec. 2013;42:806–12. 34. Montanholi YR, Haas LS, Swanson KC, Coomber BL, Yamashiro S, Miller SP. Liver morphometrics and metabolic blood profile across divergent phenotypes for feed efficiency in the bovine. Acta Vet Scand. 2017;59:24. 35. Connell A, Calder AG, Anderson SE, Lobley GE. Hepatic protein synthesis in he sheep:effect of intake as monitored by use of stable-isotope-labelled glycine, leucine and phenylalanine. Brit J Nutr. 1997;77:255–71. 36. Kong RSG, Liang G, Chen Y, Stothard P, Guan LL. Transcriptome profiling of the rumen epithelium of beef cattle differing in residual feed intake. BMC Genomics. 2016;17:592. 37. Bottje W, Kong B-W, Reverter A, Waardenberg AJ, Lassiter K, Hudson NJ. Progesterone signaling in broiler skeletal muscle is associated with divergent feed efficiency. BMC Systems Biol. 2017;11:29. 38. Zhou N, Lee WR, Abasht B. Messenger RNA sequencing and pathway analysis provide novel insights into the biological basis of chickens’ feed efficiency. BMC Genomics. 2015;16:195.
– Springer Journals
Published: Jun 4, 2018