Genome sequencing and protein domain annotations of Korean Hanwoo cattle identify Hanwoo-specific immunity-related and other novel genes

Genome sequencing and protein domain annotations of Korean Hanwoo cattle identify Hanwoo-specific... Background: Identification of genetic mechanisms and idiosyncrasies at the breed-level can provide valuable information for potential use in evolutionary studies, medical applications, and breeding of selective traits. Here, we analyzed genomic data collected from 136 Korean Native cattle, known as Hanwoo, using advanced statistical methods. Results: Results revealed Hanwoo-specific protein domains which were largely characterized by immunoglobulin function. Furthermore, domain interactions of novel Hanwoo-specific genes reveal additional links to immunity. Novel Hanwoo-specific genes linked to muscle and other functions were identified, including protein domains with functions related to energy, fat storage, and muscle function that may provide insight into the mechanisms behind Hanwoo cattle’s uniquely high percentage of intramuscular fat and fat marbling. Conclusion: The identification of Hanwoo-specific genes linked to immunity are potentially useful for future medical research and selective breeding. The significant genomic variations identified here can crucially identify genetic novelties that are arising from useful adaptations. These results will allow future researchers to compare and classify breeds, identify important genetic markers, and develop breeding strategies to further improve significant traits. Keywords: Cattle, Hanwoo, Genome sequencing, Protein domain, Unaligned read assembly, DNA-Seq Background Consequently, one of the main goals of the meat pro- Hanwoo is a Korean native taurine breed of cattle that duction industry worldwide is to increase the incidence has been around since 2000 BC. Although their original of this trait [2]. Given this focus, several studies have in- primary purpose was to serve as farming and transporta- vestigated gene expression patterns with the primary tion cattle, the rapid growth of the Korean economy that goal of determining which genes are responsible for occurred in the 1960’s and its associated food demands Hanwoo-specific high fat concentration [3–7]. led to this breed being used as a main source of meat Here we gathered genomic data from 136 Hanwoo cat- [1]. Since then, the demand for this product in Korea tle that we analyzed using advanced statistical methods. has skyrocketed. This is due to the high percentage of We show that investigation of the genome of this unique fat marbling in Hanwoo meat, a characteristic that is set of cattle individuals with the general goal of identify- unique to the breed. Hanwoo loin muscles have approxi- ing breed-level idiosyncrasies can provide valuable infor- mately 24% intramuscular fat content [2]. The quality mation for potential use in evolutionary studies, medical and price of meat is often determined by fat marbling. applications, and breeding of selective traits. The goal is to enhance our understanding of characteristics of beef * Correspondence: lim.dj@korea.kr cattle breeds with unique adaptations and beneficial Animal Genomics & Bioinformatics Division, National Institute of Animal traits that have not yet been well elucidated. This would Science, RDA, 77 Chuksan-gil, Kwonsun-gu, Suwon 441-706, Republic of Korea make it possible to selectively breed for these traits in Full list of author information is available at the end of the article © The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. Caetano-Anolles et al. BMC Genetics (2018) 19:37 Page 2 of 13 other breeds of cattle worldwide to improve meat quality LEADING:10, TRAILING:10, MINLEN:80), respectively. and revolutionize the field of meat production. Then, high-quality sequence reads were mapped to the Bos taurus reference genome (UMD 3.1) using Bow- Methods tie2.2.6 [10] with default settings in order to extract un- Alignment of unaligned reads for the detection for novel aligned reads. Removal of duplicate reads was performed genes using the Hanwoo whole genome using Picard (ver 1.06) and indexing, sorting, and un- Blood samples for whole genome sequencing were aligned read extraction was performed using Samtools obtained from 136 Korean beef cattle (Hanwoo) individ- v1.3.1 [11]. GATK v3.4.46 [12–14] was used for local uals reared at the Hanwoo Improvement Center of the realignment and recalibration of the alignment (blue National Agricultural Cooperative Federation (Seosan, boxes on the pipeline figure; Fig. 1). A summary of se- Chungnam, Korea). Indexed shotgun paired-end (PE) quencing data is provided in Additional file 1: Table S1. libraries with 500 bp average length inserts were Since we are interested in information originating generated from these samples using the TruSeq Nano from the sample itself and not detected from the refer- DNA Library Prep Kit (Illumina, USA) following the ence sequence, we created an assembled genome at the standard Illumina sample-preparation protocol. Briefly, scaffold level to discover whether unaligned reads actu- 200 ng of gDNA was fragmented using a Covaris M220 ally constitute functional units (genes) on their own focused-ultrasonicator (Woburn, MA, USA) to produce genome. This scaffold was created from one randomly fragments with a median size of ~ 500 bp. The frag- selected sample from our pool of samples. The Broad In- mented DNA was subjected to end repair, A-tailing, stitute’s stand-alone ALLPATHS-LG fragment read error and indexed adapter ligation (~ 125 bp adapter). correction module [15, 16] was used for error correction Adapter-ligated DNA of 550 to 650 bp in length was as a precursor to de novo assembly. De novo assembly amplified using PCR for 8 cycles. The size-selected was performed using an Iterative De Bruijn Assembler libraries were analyzed using the Agilent 2100 Bioana- of Uneven Depth (IDBA_UD: [17, 18], an iterative De lyzer (Agilent Technologies) to determine the size Bruijn graph de novo assembler for short reads distribution and to check for adapter contamination. sequencing data that utilizes paired-end reads to The resulting libraries were sequenced using the assemble highly uneven low-depth regions. This tool Illumina HiSeq 2500 (2x125bp paired-end sequences) is useful for optimizing the length gap problem and and NextSeq500 (2x150bp paired-end sequences) iterating different K-mer length (green boxes on the Next-Gen sequencers. pipeline figure; Fig. 1). The bioinformatics pipeline used in this study is For unaligned read alignments, we extracted reads for described in Figs. 1 and 2. Quality control for per-base each sample that was not aligned to the reference quality of reads and removal of potential adaptor se- genome. Using the extracted unaligned reads (blue boxes quences was performed using fastQC v0.11.4 [8] and on the pipeline figure; Fig. 1) and the assembled Trimmomatic v0.36 [9] software (seed mismatches:2, scaffold-level genome (green boxes on the pipeline palindrome clip threshold:30, simple clip threshold:10, figure; Fig. 1) of each sample, alignment of unaligned Fig. 1 Detailed unaligned read assembly pipeline. Green squares represent the first stage of analysis- assembly of scaffold-level genome; blue squares represent the second stage of analysis- extraction of unaligned reads; yellow squares represent the third and final stage of analysis- gene prediction and functional annotation Caetano-Anolles et al. BMC Genetics (2018) 19:37 Page 3 of 13 areas of the genome; and (3) Mining the uncovered genes and associated domains to identify important gene functions and networks involved in positive traits. A summary of representative reference genome builds via short read assembly is presented in Table 1.We mapped unaligned reads against the reference genome and extracted information to a depth of 10× (meaning Fig. 2 Simplified pipeline of unaligned read assembly that each base was sequenced an average of 10 times). We predicted a total of 614 gene regions using scaffolds reads to the scaffold was carried out using Bowtie2 containing locations higher than depth of 10×. Of the (remapping). The identified remapped sequences 614 genes, 283 genes were covered by unaligned reads throughout the sequence were assumed to represent with at least depth of 10×. Hanwoo-specific sequences. These resulting regions Cross-referencing of protein sequences from the 283 constitute regions that are distinctive from the reference. genes against the Pfam database identified associated We performed depth profiling to diminish the possibility protein domains covering a total of 168 scaffolds. Over- of false positives. We identified scaffolds containing all, 311 Pfam protein domains were identified when locations meeting our depth cutoff of 10× (an arbitrary using data filtered for sequences with an average cutoff selected for result filtering), and used the collected mapped base depth coverage of less than 10×. These scaffolds for gene prediction using the gene prediction numbers suggest that there was more than one affili- program Augustus 3.1.0. Out of the resulting 614 pre- ated domain identified for some gene regions. Due to dicted genes, we extracted protein sequences covered by space limitations, Table 2 lists significantly identified unaligned reads with at least depth of 10×. (E- value <1XE-100) Pfam protein family domain ana- The resulting total of 283 protein sequences were lysis results. An extended list of significantly identi- cross-referenced against the Pfam database of protein fied Pfam domains with E-value <1E-40 is presented families (pfam.xfam.org;[19]) using the protein domain in Additional file 2: Table S2. detection program InterProScan-5.15-54.0 in order to identify protein domains affiliated with those areas of Hanwoo-specific genes linked to immunity the genome. In order to assign meaning and infer the A number of domains were largely characterized by im- function of these domains, we searched for these identi- mune system function. Selected immune system-related fied domains within DOMINE (http://domine.utdalla- genes are shown in Table 3. Six of the seven domains s.edu/cgi-bin/Domine), a database of known and shown are associated to the immunoglobulin function, predicted protein domain interactions [20, 21]. Using while the remaining domain is associated with the inter- Interpro [22], we obtained GO (Gene Ontology) Cellular feron group of signaling proteins, which is crucial for Component (CC), Molecular Function (MF), and Bio- the immune system response as well. logical Process (BP) terms for each individual domain The interferon-alpha/beta receptor is a cell surface re- [23]. Next, gene ontology results were summarized and ceptor made up of one chain with two subunits, IFNAR1 visualized with the online tool REVIGO (http://revi- and IFNAR2. The interferon receptors have antiviral, go.irb.hr;[24]) to better interpret our results. Next, using Table 1 Summary of the results of representative reference REVIGO’s Interactive Graph tool [24] and exporting genome build via short read assembly (> = 1 kb) results into the Cytoscape software package [25], we Base pairs Percent (%) created a graph-based visualization of the identified Number of scaffolds 295,265 100 terms for each GO category. Using the above described methodologies and annota- Residue counts A 701,475,984 29.24 tions we were able to align and map genome sequences C 498,799,879 20.79 as well as predict genes that may be related to G 498,441,379 20.78 Hanwoo-specific characteristics. T 692,999,844 28.89 N 7,063,142 0.29 Results and discussion Total 2,398,780,228 100 Research objectives and genome build summary Our main research objectives included: (1) Assembling Sequence lengths Minimum 1000 and mapping unaligned reads in order to identify and Maximum 136,625 predict genes in Hanwoo cattle; (2) Cross-referencing re- Average 8124.16 sults against a comprehensive protein domain database N50 13,528 in order to identify protein domains affiliated with those Caetano-Anolles et al. BMC Genetics (2018) 19:37 Page 4 of 13 Table 2 Significantly identified (E- value <1XE-100) Pfam protein family domain analysis results Gene name Length Source Accession Description Start Stop E-value scaffold_2197.g59.t1 581 Pfam PF00063 Myosin head (motor domain) 30 575 5.60E-207 scaffold_1285.g30.t1 417 Pfam PF15718 Domain of unknown function (DUF4673) 116 412 4.50E-154 scaffold_6851.g129.t1 391 Pfam PF03028 Dynein heavy chain and region D6 of dynein motor 2 390 1.50E-120 scaffold_13817.g209.t1 758 Pfam PF01403 Sema domain 59 467 2.30E-117 scaffold_29068.g344.t1 348 Pfam PF16021 Programmed cell death protein 7 33 344 3.00E-114 scaffold_15941.g224.t1 887 Pfam PF04849 HAP1 N-terminal conserved region 1 249 2.60E-108 scaffold_5769.g113.t1 246 Pfam PF00244 14–3-3 protein 5 238 3.60E-107 scaffold_1936.g56.t1 564 Pfam PF08235 LNS2 (Lipin/Ned1/Smp2) 300 525 1.30E-104 antiproliferative, and immunomodulatory functions, as that an understanding of the evolution and expression of well as being highly involved in pregnancy [26, 27]. mammalian immune system genes has important impli- Interferon-τ, a type I interferon, has been shown to pre- cations for human health. Bovine antibodies have been vent a return to ovarian cyclicity after conception to of particular interest, as they exhibit prophylactic and ensure the continuation of the pregnancy in ruminant therapeutic properties in response to several human and ungulate species; this interferon appears to be the main animal infectious diseases [33–36]. Additionally, re- factor responsible for prevention of degradation of the searchers have recently developed transgenic calves that corpus luteum [28, 29]. produce human immunoglobulin, speaking to the in- In addition to these reproductive roles, this receptor is credible importance of cattle as model organisms for the responsible for binding type 1 interferons interferon–α study of human immunity and disease [37]. Secondly, and –β and activating the JAK-STAT signaling pathway, understanding the molecular and genetic basis of im- which is associated with DNA-transcription and the munity in cattle breeds can not only serve to further our expression of genes related to immunity, proliferation, understanding of the breeds, but also to provide genetic and differentiation, among others [30]. The JAK-STAT information which can be used for selective breeding in pathway has primary functions related to immunity. In order to improve performance and survival of livestock. fact, drug therapies that aim to turn down the immune Immunity in cattle varies vastly by breed. For example, response of the body and modulate host responses to African cattle are known for their incredible resistance disease and infection target this pathway [31]. The to tick and gastrointestinal parasite infestations, traits expression of the interferon group of signaling proteins that have developed in response to thousands of years of in our Hanwoo cattle samples suggests that Hanwoo evolution in the harsh environments of Africa. A par- may have breed-specific immune system functions that ticularly amazing adaptation is the resistance of several are not yet well understood. African breeds to trypanosomiasis, also known as sleeping Our analysis also identified associated protein domains sickness [38]. Identification of genes responsible for which are largely characterized by the immunoglobulin immunity and introduction of identified immunity-related function. These results are particularly salient given the genes in cattle breeds that are productive but highly significance of these kinds of results for medical research susceptible to disease may improve their resistance, sur- and selective breeding. The bovine immune system has vival, and productivity. Understanding genetic features been a topic of interest to researchers for quite some controlling these mechanisms will allow researchers to de- time now, mainly due to two reasons [32]. The first is velop appropriate breeding strategies. Table 3 Selected immune system-related genes and affiliated protein domains Gene name Length Source Accession Description Start Stop E-value scaffold_2520.g67.t1 1075 Pfam PF13895 Immunoglobulin domain 425 491 5.40E-09 scaffold_19370.g263.t1 508 Pfam PF07679 Immunoglobulin I-set domain 381 454 8.30E-07 scaffold_8624.g151.t1 512 Pfam PF13895 Immunoglobulin domain 10 73 3.90E-08 scaffold_14147.g214.t1 159 Pfam PF07679 Immunoglobulin I-set domain 27 112 3.40E-24 scaffold_13817.g209.t1 758 Pfam PF00047 Immunoglobulin domain 550 628 5.10E-09 scaffold_5779.g114.t1 142 Pfam PF07679 Immunoglobulin I-set domain 35 70 6.20E-07 scaffold_46987.g437.t1 460 Pfam PF09294 Interferon-alpha/beta receptor, fibronectin type III 44 142 1.20E-17 Caetano-Anolles et al. BMC Genetics (2018) 19:37 Page 5 of 13 More generally, research in immunoglobulin genetics provide insight into the mechanisms behind Hanwoo is particularly salient for several reasons. Although re- cattle’s uniquely high percentage of intramuscular fat search into the genetic aspects of and expression of and fat marbling. For example, LNS2 (Lipin/Ned1/ genes related to immunoglobulin has been widely con- Smp2) domain, which includes Lipin, was significantly ducted in humans and mice, research in this field is lack- identified (E-value = 1.30E-104) in our data (Table 2). ing when it comes to livestock breeds, particularly cattle. Lipin, encoded by the Lpin1 gene, is a powerful gene Information is still needed to complete previous infor- which largely controls how the body produces, stores, mation, including the number of available gene segments and uses fat. Mice deficient in Lipin do not develop ei- and gene families. This kind of information can be used ther diet-induced or genetic obesity [47]. Additionally, in the future to study and create synthetic recombinant enhanced Lipin expression has been shown to promote species-specific antibodies, which could be used to treat adiposity in mice [48]. and prevent infectious diseases. Additionally, the Myosin head (motor domain) protein domain, which is associated with muscle function, was Domain interactions of Hanwoo-specific genes reveal significantly identified (E-value = 5.60E-207, Table 2). additional links to immunity Myosin is a chief component of myofibril filaments, Additionally, more general consideration of significantly which are responsible for muscle contraction. Myosin identified protein family domains from the Pfam database also actively participates in the conversion of ATP chem- provided information needed to further understanding the ical energy to mechanical energy through its interaction breed-specific molecular mechanisms of Hanwoo cattle. with Actin [49]. Additionally, the Dynein heavy chain Table 2 lists highly significantly identified (E-value and region D6 of the dynein motor domain and 14– <1XE-100) Pfam domains. In order to assign meaning and 3-3 protein domain were significantly identified infer the function of these domains, which include several (E-values = 1.50E-120,3.60E-107 respectively), both of not well understood but highly significant protein domains, which are also largely responsible for ATP energy we searched for these identified domains within DOMINE conversion [50–52]. These results suggest that these (http://domine.utdallas.edu/cgi-bin/Domine), adatabaseof proteins domains are those which are primarily re- known and predicted protein domain interactions [20, 21]. sponsible for providing energy to the muscle and pos- Among these, several interesting results reveal the genetic sibly causing the breed-specific high percentage of intricacies of the Hanwoo genome and its functions. intramuscular fat that is observed in Hanwoo cattle. Several of the most significantly identified protein Several of the other identified domains, such as the domains appear to be closely linked with immune sys- HAP1 N-terminal conserved region domain, were tem function, further supporting our previous findings. found to lack interactions with any other domains For example, the significantly identified Sema domain and their specific roles in cattle have not been well (E-value = 2.30E-117) appears to be primarily associated established. As we learn more about these proteins with immune system function. The Sema domain not and their functions in the future, we may be able to only forms interactions with the Immunoglobulin do- better interpret these results. main, but also interacts with the Thrombospondin type 1 (TSP-1) domain, which has been shown to control im- Interpretation of gene ontology terms associated with mune regulation. Thrombospondin, an extremely large the entire set of Pfam domains multi-domain glycoprotein, is crucial to certain mecha- As previously discussed, we were able to identify 311 nisms related to angiogenesis, cell proliferation, and im- Pfam domains mapping to 168 scaffolds not shared with mune responses [39] such as the chemotactic response common cattle. We then filtered that list and kept only to tissue damage and the facilitation of phagocytosis of the highest hits. Within that short list, we revealed high damaged cells [40–42]. Mice deficient in TSP-1 are more enrichment for muscle and immunology genes. However, susceptible to inflammation and injury, either as a side this approach provides a very limited look at our results. effect of drugs or as a result of gene activation [43–46]. Thus, we aimed to further explore Hanwoo-specific do- Given the strong role of this protein domain in immun- mains by analyzing the enrichment of functional cat- ity, our identification of this pathway here once again egories associated with each individual domain of the confirms that there are unique functions of immunity at entire list. Using Interpro [22], we obtained GO (Gene play operating specifically in the Hanwoo genome. Ontology) Cellular Component (CC), Molecular Func- tion (MF), and Biological Process (BP) terms for each in- Hanwoo-specific genes linked to muscle and other dividual domain [23]. Next, gene ontology results were functions summarized and visualized with the online tool REVIGO Significantly identified protein domains with functions (http://revigo.irb.hr;[24] to better interpret our results. related to energy, fat storage, and muscle function may Tables 4, 5, and 6 summarize the BP, CC, and MF GO Caetano-Anolles et al. BMC Genetics (2018) 19:37 Page 6 of 13 terms, respectively. REVIGO calculates “frequency” and between the nodes of our graph (GO terms) represent the “uniqueness” values, with frequency representing the top 3% strongest pairwise similarities between terms [24]. proportion of the specified GO term within the entire The BP GO term visualization (Fig. 3) can be charac- Bos taurus species-specific Uniprot protein annotation terized by a large number of un-connected solo terms database, and uniqueness determining within the input- and shows a large diversity of biological processes being ted list whether a term is an outlier when compared affected, meaning that a large rewiring of functionality is semantically to the list as a whole [24]. embedded in the new genes acquired by Hanwoo cattle. Next, using REVIGO’s Interactive Graph tool [24] and Note that the most significant term is the most specific, exporting results into the Cytoscape software package ‘centriole replication’, which is also connected to the [25], we created a graph-based visualization of the iden- general term ‘microtubule-based movement’; dynein tified terms for each GO category. Figures 3, 4, and 5 (significantly identified from our data) moves along mi- display visualizations of BP, CC, and MF GO terms, crotubules, so this term may reflect the biological pro- respectively. The radius of the bubbles represents the cesses responsible for dynein’s role in ATP energy generality of the specified term; a small bubble implies conversion. This is quite unique and unexpected, since it higher specificity. The p-value of each GO term is repre- signals an important role of cell division [53]. The sec- sented by the color shading of each bubble, with darker ond group of more significant terms are less specific but colors representing higher significance. The edges all related to transport, particularly ‘anion transport’, Table 4 Summary of enriched Gene Ontology (GO) biological process (BP) terms among total identified Pfam protein family domains a b term_ID description Frequency log10 p-value Uniqueness GO:0008152 metabolic process 62.92% −4.2076 0.974 GO:0007154 cell communication 28.75% −5.9208 0.866 GO:0006139 nucleobase-containing compound metabolic process 28.16% −31.0655 0.776 GO:0007165 signal transduction 26.76% −18.9208 0.794 GO:0006810 transport 19.48% −32.6383 0.765 GO:0006355 regulation of transcription, DNA-templated 14.27% −8.9586 0.608 GO:0006464 cellular protein modification process 14.26% −11.7212 0.62 GO:0007186 G-protein coupled receptor signaling pathway 8.87% −25.4318 0.803 GO:0006508 proteolyis 7.74% −32.6778 0.741 GO:0006811 ion transport 7.05% −12.2076 0.744 GO:0055114 oxidation-reduction process 6.85% −11.699 0.831 GO:0055085 transmembrane transport 6.54% −32.6383 0.714 GO:0016192 vesicle-mediated transport 4.60% −30.7447 0.785 GO:0006886 intracelular protein transport 3.09% −30.7447 0.792 GO:0016567 protein ubiquitination 2.40% −14.3468 0.673 GO:0006820 anion transport 2.07% −61.2147 0.769 GO:0006457 protein folding 1.00% −19.9208 0.736 GO:0007018 microtubule-based movement 1.00% −45.6021 0.843 GO:0035023 regulation of Rho protein signal transduction 0.92% −11.1675 0.832 GO:0016573 histone acetylation 0.60% −51.3372 0.635 GO:0043401 steroid hormone mediated signaling pathway 0.46% −28.6383 0.839 GO:0045454 cell redox homeostasis 0.40% −27.0269 0.865 GO:0000413 protein peptidyl-prolyl isomerization 0.24% −19.9208 0.709 GO:0006400 tRNA modification 0.20% −34.7447 0.692 GO:0018149 peptide cross-linking 0.17% −13.4318 0.731 GO:0007099 centriole replication 0.08% − 153.347 0.832 GO:0009396 folic acid-containing compound biosynthetic process 0.05% −11.699 0.786 Frequency represents the proportion of the specified GO term within the entire Bos Taurus species-specific Uniprot protein annotation database. Higher frequencies represent more general and common terms, while terms with a lower frequency are rare and specific Uniqueness represents whether a term is an outlier when compared semantically to the list as a whole Caetano-Anolles et al. BMC Genetics (2018) 19:37 Page 7 of 13 Table 5 Summary of enriched Gene Ontology (GO) cellular component (CC) terms among total identified Pfam protein family domains a b term_ID description Frequency log10 p-value Uniqueness GO:0005622 intracellular 63.18% −4.2076 0.875 GO:0016020 membrane 47.23% −5.7959 0.872 GO:0016021 integral component of membrane 29.77% −10.8239 0.832 GO:0005634 nucleus 27.70% −10.6576 0.688 GO:0005739 mitochondrion 9.22% −12.5686 0.636 GO:0005740 mitochondrial envelope 3.09% −32.3565 0.576 GO:0031012 extracellular matrix 2.00% −15 0.77 GO:0016459 myosin complex 0.38% − 206.252 0.589 GO:0030286 dynein complex 0.20% −45.6021 0.594 Frequency represents the proportion of the specified GO term within the entire Bos Taurus species-specific Uniprot protein annotation database. Higher frequencies represent more general and common terms, while terms with a lower frequency are rare and specific Uniqueness represents whether a term is an outlier when compared semantically to the list as a whole Fig. 3 Visualization of significantly identified Gene Ontology (GO) Biological Process (BP) terms. The radius of the bubbles represents the generality of the specified term (a small bubble implies higher specificity). The p-value of each GO term is represented by the color shading of each bubble (darker colors representing higher significance). The edges between the nodes of our graph (GO terms) represent the top 3% strongest pairwise similarities between terms Caetano-Anolles et al. BMC Genetics (2018) 19:37 Page 8 of 13 Table 6 Summary of enriched Gene Ontology (GO) molecular function (MF) terms among total identified Pfam protein family domains a b term_ID description Frequency log10 p-value Uniqueness GO:0003700 sequence-specific DNA binding transcription factor activity 5.30% −10.6576 0.958 GO:0003712 transcription cofactor activity 1.78% −39.8861 0.957 GO:0003824 catalytic activity 37.22% −13.4559 0.972 GO:0004871 signal transducer activity 11.94% −25.4318 0.927 GO:0004930 G-protein coupled receptor activity 7.87% −27.6198 0.925 GO:0005089 Rho guanyl-nucleotide exchange factor activity 0.48% −11.699 0.956 GO:0005216 ion channel activity 2.49% −13.9208 0.896 GO:0015075 ion transmembrane transporter activity 5.28% −5.7959 0.895 GO:0016773 phosphotransferase activity, alcohol group as acceptor 4.60% −58.7447 0.771 GO:0004672 protein kinase activity 3.87% − 14.1308 0.772 GO:0043015 gamma-tubulin binding 0.10% −25.8539 0.878 GO:0008017 microtubule binding 1.06% −16.4815 0.848 GO:0019001 guanyl nucleotide binding 2.48% −25.4318 0.846 GO:0005509 calcium ion binding 3.94% −19.0655 0.863 GO:0005515 protein binding 26.71% −7.3665 0.92 GO:0004488 methylenetetrahydrofolate dehydrogenase (NADP+) activity 0.02% −13.4559 0.878 GO:0003755 peptidyl-prolyl cis-trans isomerase activity 0.26% −19.9208 0.876 GO:0003777 microtubule motor activity 0.51% −45.6021 0.814 GO:0005544 calcium-dependent phospholipid binding 0.17% −19.0655 0.844 GO:0042802 identical protein binding 4.77% −25.8539 0.878 GO:0031683 G-protein beta/gamma-subunit complex binding 0.16% −25.4318 0.883 GO:0043565 sequence-specific DNA binding 4.31% −10.6576 0.855 GO:0016787 hydrolase activity 15.05% −7.8861 0.841 GO:0008484 sulfuric ester hydrolase activity 0.11% −31.7447 0.832 GO:0004181 metallocarboxypeptidase activity 0.15% −32.6778 0.809 GO:0004222 metalloendopeptidase activity 0.79% −15 0.79 GO:0019901 protein kinase binding 1.80% −7.2007 0.885 GO:0008479 queuine tRNA-ribosyltransferase activity 0.02% −34.7447 0.849 GO:0003676 nucleic acid binding 21.33% −8.6021 0.849 GO:0005102 receptor binding 6.56% −62.1871 0.875 GO:0004402 histone acetyltransferase activity 0.24% −51.3372 0.812 GO:0003677 DNA binding 10.28% −9.5376 0.843 GO:0008408 3′-5′ exonuclease activity 0.18% −31.0655 0.827 GO:0016746 transferase activity, transferring acyl groups 1.42% −30.9208 0.805 GO:0003723 RNA binding 7.68% −5.9208 0.847 GO:0046872 metal ion binding 20.96% −4.5376 0.845 GO:0004550 nucleoside diphosphate kinase activity 0.14% −55.585 0.816 GO:0008236 serine-type peptidase activity 1.32% −7.1739 0.793 GO:0008270 zinc ion binding 6.73% −4.2076 0.856 GO:0004129 cytochrome-c oxidase activity 0.43% −12.5686 0.809 GO:0003924 GTPase activity 1.03% −25.4318 0.806 GO:0005524 ATP binding 8.83% −14.1308 0.751 GO:0000166 nucleotide binding 14.40% −7.3565 0.823 GO:0003810 protein-glutamine gamma-glutamyltransferase activity 0.07% −14.7696 0.823 Caetano-Anolles et al. BMC Genetics (2018) 19:37 Page 9 of 13 Table 6 Summary of enriched Gene Ontology (GO) molecular function (MF) terms among total identified Pfam protein family domains (Continued) a b term_ID description Frequency log10 p-value Uniqueness GO:0005543 phospholipid binding 1.33% −16.6021 0.829 GO:0035091 phosphatidylinositol binding 0.84% −17 0.829 Frequency represents the proportion of the specified GO term within the entire Bos Taurus species-specific Uniprot protein annotation database. Higher frequencies represent more general and common terms, while terms with a lower frequency are rare and specific Uniqueness represents whether a term is an outlier when compared semantically to the list as a whole which may be associated with ATP energetics. Another chemical energy to mechanical energy and serve crucial uniqueness is the steroid hormone mediated signaling functions for muscle function. The connectivity of these pathway. Sex steroid hormones play a critical role in the nodes within our network visualization signifies that regulation of muscle, muscle strength, and growth and these two components work together and are potentially maintenance of muscle mass [54]. While identification significant in Hanwoo-specific characteristics, such as of this GO term most likely can be attributed to the their high percentage of intramuscular fat. The rest of aforementioned relationship between steroid hor- the terms are generic, independent CC terms that mones and muscle development, as a result of the include nucleus and membrane. breed-specific unique high-fat muscle development, it The MF GO term visualization (Fig. 5) can be charac- may also be due to the practices under which Han- terized by high connectivity, with the most significant woo are reared in order to enhance the natural fat values grouped together. Microtubule motor activity, marbling in their meat, such as feeding time and diet. another microtubule function related term, was also For example, cattle are fed a high-concentration grain identified at the molecular function level, once again diet as opposed to grass-feeding [55]. Diet has been suggesting ATP energetics at play. A unique feature of shown to have an effect on steroid hormones [56], this visualization, compared to the BP and CC visualiza- which may also in part explain the identification of tions, is the presence of 4 unconnected graphs as this GO term here. opposed to many unconnected terms or a single con- The CC GO term visualization (Fig. 4) can be charac- nected group. The first group features solely terms terized by a single connected group consisting of four related to binding. This group contains the following terms: dynein complex, myosin complex, mitochondrial terms: Sequence-specific DNA binding, DNA binding, envelope, and mitochondrion. As previously mentioned, RNA binding, Nucleic acid binding, ATP binding, the Myosin Head and Dynein heavy chain protein do- Phospholipid binding, Calcium-dependent phospholipid mains were found significantly identified in our results- binding, Guanyl nucleotide binding, Metal ion binding, both of which participate in the conversion of ATP Zinc ion binding, and Calcium ion binding. The second Fig. 4 Visualization of significantly identified Gene Ontology (GO) Cellular Component (CC) terms. The radius of the bubbles represents the generality of the specified term (a small bubble implies higher specificity). The p-value of each GO term is represented by the color shading of each bubble (darker colors representing higher significance). The edges between the nodes of our graph (GO terms) represent the top 3% strongest pairwise similarities between terms Caetano-Anolles et al. BMC Genetics (2018) 19:37 Page 10 of 13 Fig. 5 Visualization of significantly identified Gene Ontology (GO) Molecular Function (MF) terms. The radius of the bubbles represents the generality of the specified term (a small bubble implies higher specificity). The p-value of each GO term is represented by the color shading of each bubble (darker colors representing higher significance). The edges between the nodes of our graph (GO terms) represent the top 3% strongest pairwise similarities between terms group consists of three connected terms: Ion Channel ac- activity, alcohol group as acceptor. Transferases are tivity, Methylenetetrahydrofolate dehydrogenase (NADP+) enzymes which are responsible for catalyzation of the activity, and Cytochrome-c oxidase activity. The third transfer of certain functional groups from one molecule to group consists of six connected terms: Sulfuric ester another. They are essential for countless biochemical hydrolase activity, 3′-5′ exonuclease activity, Microtubule processes throughout the body. In cattle specifically, it has motor activity, GTPase activity, Serine-type peptidase been shown that the activity of transferases is critical for activity, and Metallocarboxypeptidase activity. embryo development [57]. The expression of genes with The fourth and final group consists of 5 terms related to transferase activity function varies between abnormal and the activity of transferases: Nucleoside diphosphate kinase normal pregnancies [58, 59]. Therefore, the expression of activity, Transferase activity, transferring acyl groups, these transferase GO terms may be due to their role in Protein-glutamine gamma-glutamyltransferase activity, healthy pregnancy and development. However, interest- Histone acetyltransferase activity, and Phosphotransferase ingly, results of previous studies have demonstrated a Caetano-Anolles et al. BMC Genetics (2018) 19:37 Page 11 of 13 correlation between certain transferase activity genes, such Availability of data and materials All information supporting the results of this manuscript are included within the as GPAT1 and ATGL, and intramuscular fat content in article and additional files. The sequences of genes predicted through assembly Korean Cattle [60]. These previously identified results, of unaligned reads have been uploaded as a Additional files 1, 2, 3 and 4. when taken along with the comparatively high expression Authors’ contributions and connectivity of GO terms related to transferase KCA conceived the project and designed scientific objectives. DL performed activity, suggests that there may be unique mechanisms of Hanwoo sample collection and data generation. BHC conducted transferase activity in Hanwoo cattle which influences experimental design of the Hanwoo population. WK, SS, and KK performed genome annotation and data analysis. KCA performed comparative analysis their development and may perhaps be a factor impacting using bioinformatics tools. KCA interpreted data and wrote the manuscript. their species-specific high percentage of intramuscular fat. HK and DL organized and supervised the project. All authors read and approved the final manuscript. Conclusions Ethics approval and consent to participate The information unearthed from the comparison of No ethics statement was required for the collection of DNA samples. DNA was extracted either from artificial insemination bull semen straws or from breeds and identification of genetic variation in this blood samples obtained from the Hanwoo Improvement Center of the study will be invaluable for future research on the mo- National Agricultural Cooperative Federation (HICNACF) with the permission lecular determinants that have been bred in Hanwoo of the owners. The protocol was approved by the Committee on the Ethics of Animal Experiments of the National Institute of Animal Science (Permit cattle. Results revealed Hanwoo-specific protein domains Number: NIAS2015–774). which were largely characterized by immunoglobulin function. Furthermore, domain interactions of Competing interests The authors declare that they have no competing interests. Hanwoo-specific genes reveal additional links to immun- ity. Hanwoo-specific genes linked to muscle and other Publisher’sNote functions were identified, including protein domains Springer Nature remains neutral with regard to jurisdictional claims in with functions related to energy, fat storage, and muscle published maps and institutional affiliations. function that may provide insight into the mechanisms Author details behind Hanwoo cattle’s uniquely high percentage of Interdisciplinary Program in Bioinformatics, Seoul National University, intramuscular fat and fat marbling. Analyzing the whole 2 Kwan-ak St. 599, Kwan-ak Gu, Seoul 151-741, Republic of Korea. Department Hanwoo genome and reporting significant genomic vari- of Agricultural Biotechnology and Research Institute for Agriculture and Life Sciences, Seoul National University, Seoul 151-921, Republic of Korea. ations is crucial to identifying genetic novelties that are CHO&KIM genomics, Main Bldg. #514, SNU Research Park, Seoul National arising from useful adaptations. Similarly, such analysis University Mt.4-2, NakSeoungDae, Gwanakgu, Seoul 151-919, Republic of will allow future researchers to compare and classify Korea. Animal Genomics & Bioinformatics Division, National Institute of Animal Science, RDA, 77 Chuksan-gil, Kwonsun-gu, Suwon 441-706, Republic breeds, identify important genetic markers, and develop of Korea. breeding strategies to further improve traits of economic value and biological significance. Received: 15 November 2017 Accepted: 14 May 2018 Additional files References 1. Lee S-H, Park B-H, Sharma A, Dang C-G, Lee S-S, Choi T-J, Choy Y-H, Kim H- C, Jeon K-J, Kim S-D, et al. Hanwoo cattle: origin, domestication, breeding Additional file 1: Table S1. Summary of sequencing data (DOCX 28 kb) strategies and genomic selection. J Anim Sci Technol. 2014;56(1):2. Additional file 2: Table S2. Significantly identified (E- value <1XE-40) 2. Jo C, Cho SH, Chang J, Nam KC. Keys to production and processing of Hanwoo Pfam protein family domain analysis results. (DOCX 17 kb) beef: a perspective of tradition and science. Anim Front. 2012;2(4):32–8. 3. Kim HJ, Sharma A, Lee SH, Lee DH, Lim DJ, Cho YM, Yang BS, Lee SH. Additional file 3: FASTA sequences for scaffolds which have locations Genetic association of PLAG1, SCD, CYP7B1 and FASN SNPs and their effects with depth > 10×. (XLSX 9907 kb) on carcass weight, intramuscular fat and fatty acid composition in Hanwoo Additional file 4: Protein sequences which have locations with depth > steers (Korean cattle). Anim Genet. 2016;48. https://doi.org/10.1111/age. 10×. (XLSX 101 kb) 4. Hwang YH, Joo ST. Fatty acid profiles of ten muscles from high and low marbled (quality grade 1++ and 2) Hanwoo steers. Korean J Food Sci Anim Abbreviations Resour. 2016;36(5):679–88. BP: Biological process; CC: Cellular component; GO: Gene ontology; 5. Sudrajad P, Sharma A, Dang CG, Kim JJ, Kim KS, Lee JH, Kim S, Lee SH. MF: Molecular function Validation of single nucleotide polymorphisms associated with carcass traits in a commercial Hanwoo population. Asian Australas J Anim Sci. 2016; Acknowledgements 29(11):1541–6. We acknowledge the support from different institutions and their personnel 6. Li XZ, Park BK, Hong BC, Ahn JS, Shin JS. Effect of soy lecithin on total providing help for the sampling of cattle (Hanwoo Improvement Center of cholesterol content, fatty acid composition and carcass characteristics in the the National Agricultural Cooperative Federation, HICNACF) and cattle Longissimus dorsi of Hanwoo steers (Korean native cattle). Anim Sci J. 2016; keepers for their assistance and permission to sample their herds. 7. Cho SH, Kang G, Seong P, Kang S, Sun C, Jang S, Cheong JH, Park B, Hwang I. Meat quality traits as a function of cow maturity. Anim Sci J. 2016; Funding 8. Andrews S: FastQC: a quality control tool for high throughput sequence This work was supported by Agenda (PJ01251902) of the National Institute data. Reference Source 2010. of Animal Science, Rural Development Administration (RDA), Republic of 9. Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina Korea. sequence data. Bioinformatics 2014:30(15):2114-20. Caetano-Anolles et al. BMC Genetics (2018) 19:37 Page 12 of 13 10. Langmead B, Salzberg SL. Fast gapped-read alignment with bowtie 2. Nat 33. Casswall TH, Nilsson HO, Bjorck L, Sjostedt S, Xu L, Nord CK, Boren T, Methods. 2012;9(4):357–9. Wadstrom T, Hammarstrom L. Bovine anti-helicobacter pylori antibodies for 11. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis oral immunotherapy. Scand J Gastroenterol. 2002;37(12):1380–5. G, Durbin R. Genome project data processing S: the sequence alignment/ 34. Hammarstrom L, Gardulf A, Hammarstrom V, Janson A, Lindberg K, Smith CI. map format and SAMtools. Bioinformatics. 2009;25(16):2078–9. Systemic and topical immunoglobulin treatment in immunocompromised patients. Immunol Rev. 1994;139:43–70. 12. McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, 35. Weiner C, Pan Q, Hurtig M, Boren T, Bostwick E, Hammarstrom L. Passive Garimella K, Altshuler D, Gabriel S, Daly M, et al. The genome analysis toolkit: a MapReduce framework for analyzing next-generation DNA immunity against human pathogens using bovine antibodies. Clin Exp Immunol. 1999;116(2):193–205. sequencing data. Genome Res. 2010;20(9):1297–303. 13. DePristo MA, Banks E, Poplin R, Garimella KV, Maguire JR, Hartl C, Philippakis 36. Lilius EM, Marnila P. The role of colostral antibodies in prevention of AA, del Angel G, Rivas MA, Hanna M, et al. A framework for variation microbial infections. Curr Opin Infect Dis. 2001;14(3):295–300. discovery and genotyping using next-generation DNA sequencing data. Nat 37. Kuroiwa Y, Kasinathan P, Choi YJ, Naeem R, Tomizuka K, Sullivan EJ, Knott JG, Genet. 2011;43(5):491–8. Duteau A, Goldsby RA, Osborne BA, et al. Cloned transchromosomic calves 14. Van der Auwera GA, Carneiro MO, Hartl C, Poplin R, Del Angel G, Levy- producing human immunoglobulin. Nat Biotechnol. 2002;20(9):889–94. Moonshine A, Jordan T, Shakir K, Roazen D, Thibault J et al: From FastQ data 38. Roelants GE, Fumoux F, Pinder M, Queval R, Bassinga A, Authie E. to high confidence variant calls: the genome analysis toolkit best practices Identification and selection of cattle naturally resistant to African pipeline. Curr Protoc Bioinformatics 2013, 43:11.10.11–11.10.33. trypanosomiasis. Acta Trop. 1987;44(1):55–66. 15. Gnerre S, MacCallum I, Przybylski D, Ribeiro FJ, Burton JN, Walker BJ, Sharpe 39. Lopez-Dee Z, Pidcock K, Gutierrez LS. Thrombospondin-1: multiple paths to T, Hall G, Shea TP, Sykes S, et al. High-quality draft assemblies of mammalian inflammation. Mediat Inflamm. 2011;2011:296069. genomes from massively parallel sequence data. Proc Natl Acad Sci. 2011; 40. Wight TN, Raugi GJ, Mumby SM, Bornstein P. Light microscopic 108(4):1513–8. immunolocation of thrombospondin in human tissues. J Histochem 16. Ribeiro FJ, Przybylski D, Yin S, Sharpe T, Gnerre S, Abouelleil A, Berlin AM, Cytochem. 1985;33(4):295–302. Montmayeur A, Shea TP, Walker BJ, et al. Finished bacterial genomes from 41. Grimbert P, Bouguermouh S, Baba N, Nakajima T, Allakhverdi Z, Braun D, shotgun sequence data. Genome Res. 2012;22(11):2270–7. Saito H, Rubio M, Delespesse G, Sarfati M. Thrombospondin/CD47 17. Peng Y, Leung HCM, Yiu SM, Chin FYL. IDBA – a practical iterative de Bruijn interaction: a pathway to generate regulatory T cells from human CD4+ graph De novo assembler. In: Berlin BB, editor. Research in computational CD25- T cells in response to inflammation. J Immunol. 2006;177(6):3534–41. molecular biology: 14th annual international conference, RECOMB 2010, 42. Doyen V, Rubio M, Braun D, Nakajima T, Abe J, Saito H, Delespesse G, Sarfati Lisbon, Portugal, April 25–28, 2010 proceedings. Heidelberg: Springer Berlin M. Thrombospondin 1 is an Autocrine negative regulator of human Heidelberg; 2010. p. 426–40. dendritic cell activation. J Exp Med. 2003;198(8):1277–83. 18. Peng Y, Leung HC, Yiu SM, Chin FY. IDBA-UD: a de novo assembler for 43. Contreras-Ruiz L, Regenfuss B, Mir FA, Kearns J, Masli S. Conjunctival single-cell and metagenomic sequencing data with highly uneven depth. inflammation in thrombospondin-1 deficient mouse model of Sjogren's Bioinformatics. 2012;28(11):1420–8. syndrome. PLoS One. 2013;8(9):e75937. 19. Finn RD, Coggill P, Eberhardt RY, EddySR,MistryJ,MitchellAL, Potter 44. Ezzie ME, Piper MG, Montague C, Newland CA, Opalek JM, Baran C, Ali N, SC, Punta M, Qureshi M, Sangrador-Vegas A, et al. The Pfam protein Brigstock D, Lawler J, Marsh CB. Thrombospondin-1-deficient mice are not families database: towards a more sustainable future. Nucleic Acids Res. protected from bleomycin-induced pulmonary fibrosis. Am J Respir Cell Mol 2016;44(D1):D279–85. Biol. 2011;44(4):556–61. 20. Raghavachari B, Tasneem A, Przytycka TM, Jothi R. DOMINE: a database of 45. Zhao Y, Xiong Z, Lechner EJ, Klenotic PA, Hamburg BJ, Hulver M, Khare A, protein domain interactions. Nucleic Acids Res. 2008;36(Database issue): Oriss T, Mangalmurti N, Chan Y, et al. Thrombospondin-1 triggers D656–61. macrophage IL-10 production and promotes resolution of experimental 21. Yellaboina S, Tasneem A, Zaykin DV, Raghavachari B, Jothi R. DOMINE: a lung injury. Mucosal Immunol. 2014;7(2):440–8. comprehensive collection of known and predicted domain-domain 46. Punekar S, Zak S, Kalter VG, Dobransky L, Punekar I, Lawler JW, Gutierrez LS. interactions. Nucleic Acids Res. 2011;39(Database issue):D730–5. Thrombospondin 1 and its mimetic peptide ABT-510 decrease angiogenesis 22. Finn RD, Attwood TK, Babbitt PC, Bateman A, Bork P, Bridge AJ, Chang H-Y, and inflammation in a murine model of inflammatory bowel disease. Dosztányi Z, El-Gebali S, Fraser M, et al. InterPro in 2017—beyond protein Pathobiology. 2008;75(1):9–21. family and domain annotations. Nucleic Acids Res. 2017;45(D1):D190–9. 47. Phan J, Peterfy M, Reue K. Lipin expression preceding peroxisome proliferator-activated receptor-gamma is critical for adipogenesis in vivo 23. Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, and in vitro. J Biol Chem. 2004;279(28):29558–64. Dolinski K, Dwight SS, Eppig JT, et al. Gene ontology: tool for the unification of biology. The gene ontology consortium. Nat Genet. 2000;25(1):25–9. 48. Phan J, Reue K. Lipin, a lipodystrophy and obesity gene. Cell Metab. 2005; 24. Supek F, Bošnjak M, Škunca N, Šmuc T. REVIGO summarizes and visualizes 1(1):73–83. long lists of gene ontology terms. PLoS One. 2011;6(7):e21800. 49. Rayment I, Holden HM, Whittaker M, Yohn CB, Lorenz M, Holmes KC, 25. Killcoyne S, Carter GW, Smith J, Boyle J. Cytoscape: a community-based Milligan RA. Structure of the actin-myosin complex and its implications for framework for network modeling. Methods Mol Bio. 2009;563:219–39. muscle contraction. Science. 1993;261(5117):58–65. 26. Pestka S, Langer JA, Zoon KC, Samuel CE. Interferons and their actions. 50. Roberts AJ. Functions and mechanics of dynein motor. Proteins. 2013;14(11): Annu Rev Biochem. 1987;56:727–77. 713–26. 51. Bunney TD, van Walraven HS, de Boer AH. 14-3-3 protein is a regulator of 27. Balkwill F. Interferons and other regulatory cytokines. Immunology. 1989; the mitochondrial and chloroplast ATP synthase. Proc Natl Acad Sci. 2001; 66(4):634. 98(7):4249–54. 28. Roberts RM. Interferon-tau, a type 1 interferon involved in maternal recognition of pregnancy. Cytokine Growth Factor Rev. 2007;18(5–6):403–8. 52. Berg D, Holzmann C, Riess O. 14-3-3 proteins in the nervous system. Nat 29. Han CS, Mathialagan N, Klemann SW, Roberts RM. Molecular cloning of Rev Neurosci. 2003;4(9):752–62. ovine and bovine type I interferon receptor subunits from uteri, and 53. Nigg EA, Stearns T. The centrosome cycle: centriole biogenesis, duplication endometrial expression of messenger ribonucleic acid for ovine and inherent asymmetries. Nat Cell Biol. 2011;13(10):1154–60. receptors during the estrous cycle and pregnancy. Endocrinology. 1997; 54. McClung JM, Davis JM, Wilson MA, Goldsmith EC, Carson JA. Estrogen status 138(11):4757–67. and skeletal muscle recovery from disuse atrophy. J Appl Physiol. 2006; 30. Platanias LC. Mechanisms of type-I- and type-II-interferon-mediated 100(6):2012–23. signalling. Nat Rev Immunol. 2005;5(5):375–86. 55. Gotoh T, Joo S-T. Characteristics and health benefit of highly marbled Wagyu and Hanwoo beef. Korean J Food Sci Anim Resour. 2016;36(6):709–18. 31. de Souza JAC, Rossa Junior C, Garlet GP, Nogueira AVB, Cirelli JA. Modulation of host cell signaling pathways as a therapeutic approach in 56. Chesworth JM, Easdon MP. Effect of diet and season on steroid hormones periodontal disease. J Appl Oral Sci. 2012;20(2):128–38. in the ruminant. J Steroid Biochem. 1983;19(1c):715–23. 32. Zhao Y, Kacskovics I, Rabbani H, Hammarstrom L. Physical mapping of the 57. Adams HA, Southey BR, Everts RE, Marjani SL,TianCX, LewinHA, Rodriguez-Zas bovine immunoglobulin heavy chain constant region gene locus. J Biol SL. Transferase activity function and system development process are critical in Chem. 2003;278(37):35024–32. cattle embryo development. Funct Integr Genomics. 2011;11(1):139–50. Caetano-Anolles et al. BMC Genetics (2018) 19:37 Page 13 of 13 58. Mirlesse V, Jacquemard F, Daffos F, Forestier F. Fetal gammaglutamyl transferase activity: clinical implication in fetal medicine. Biol Neonate. 1996; 70(4):193–8. 59. Gibbs DA, McFadyen IR, Crawfurd MD, De Muinck Keizer EE, Headhouse- Benson CM, Wilson TM, Farrant PH. First-trimester diagnosis of Lesch-Nyhan syndrome. Lancet. 1984;2(8413):1180–3. 60. Jeong J, Kwon EG, Im SK, Seo KS, Baik M. Expression of fat deposition and fat removal genes is associated with intramuscular fat content in longissimus dorsi muscle of Korean cattle steers. J Anim Sci. 2012;90(6):2044–53. http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png BMC Genetics Springer Journals

Genome sequencing and protein domain annotations of Korean Hanwoo cattle identify Hanwoo-specific immunity-related and other novel genes

Free
13 pages

Loading next page...
 
/lp/springer_journal/genome-sequencing-and-protein-domain-annotations-of-korean-hanwoo-y0v7RA9eQa
Publisher
Springer Journals
Copyright
Copyright © 2018 by The Author(s).
Subject
Life Sciences; Life Sciences, general; Animal Genetics and Genomics; Microbial Genetics and Genomics; Plant Genetics and Genomics; Genetics and Population Dynamics
eISSN
1471-2156
D.O.I.
10.1186/s12863-018-0623-x
Publisher site
See Article on Publisher Site

Abstract

Background: Identification of genetic mechanisms and idiosyncrasies at the breed-level can provide valuable information for potential use in evolutionary studies, medical applications, and breeding of selective traits. Here, we analyzed genomic data collected from 136 Korean Native cattle, known as Hanwoo, using advanced statistical methods. Results: Results revealed Hanwoo-specific protein domains which were largely characterized by immunoglobulin function. Furthermore, domain interactions of novel Hanwoo-specific genes reveal additional links to immunity. Novel Hanwoo-specific genes linked to muscle and other functions were identified, including protein domains with functions related to energy, fat storage, and muscle function that may provide insight into the mechanisms behind Hanwoo cattle’s uniquely high percentage of intramuscular fat and fat marbling. Conclusion: The identification of Hanwoo-specific genes linked to immunity are potentially useful for future medical research and selective breeding. The significant genomic variations identified here can crucially identify genetic novelties that are arising from useful adaptations. These results will allow future researchers to compare and classify breeds, identify important genetic markers, and develop breeding strategies to further improve significant traits. Keywords: Cattle, Hanwoo, Genome sequencing, Protein domain, Unaligned read assembly, DNA-Seq Background Consequently, one of the main goals of the meat pro- Hanwoo is a Korean native taurine breed of cattle that duction industry worldwide is to increase the incidence has been around since 2000 BC. Although their original of this trait [2]. Given this focus, several studies have in- primary purpose was to serve as farming and transporta- vestigated gene expression patterns with the primary tion cattle, the rapid growth of the Korean economy that goal of determining which genes are responsible for occurred in the 1960’s and its associated food demands Hanwoo-specific high fat concentration [3–7]. led to this breed being used as a main source of meat Here we gathered genomic data from 136 Hanwoo cat- [1]. Since then, the demand for this product in Korea tle that we analyzed using advanced statistical methods. has skyrocketed. This is due to the high percentage of We show that investigation of the genome of this unique fat marbling in Hanwoo meat, a characteristic that is set of cattle individuals with the general goal of identify- unique to the breed. Hanwoo loin muscles have approxi- ing breed-level idiosyncrasies can provide valuable infor- mately 24% intramuscular fat content [2]. The quality mation for potential use in evolutionary studies, medical and price of meat is often determined by fat marbling. applications, and breeding of selective traits. The goal is to enhance our understanding of characteristics of beef * Correspondence: lim.dj@korea.kr cattle breeds with unique adaptations and beneficial Animal Genomics & Bioinformatics Division, National Institute of Animal traits that have not yet been well elucidated. This would Science, RDA, 77 Chuksan-gil, Kwonsun-gu, Suwon 441-706, Republic of Korea make it possible to selectively breed for these traits in Full list of author information is available at the end of the article © The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. Caetano-Anolles et al. BMC Genetics (2018) 19:37 Page 2 of 13 other breeds of cattle worldwide to improve meat quality LEADING:10, TRAILING:10, MINLEN:80), respectively. and revolutionize the field of meat production. Then, high-quality sequence reads were mapped to the Bos taurus reference genome (UMD 3.1) using Bow- Methods tie2.2.6 [10] with default settings in order to extract un- Alignment of unaligned reads for the detection for novel aligned reads. Removal of duplicate reads was performed genes using the Hanwoo whole genome using Picard (ver 1.06) and indexing, sorting, and un- Blood samples for whole genome sequencing were aligned read extraction was performed using Samtools obtained from 136 Korean beef cattle (Hanwoo) individ- v1.3.1 [11]. GATK v3.4.46 [12–14] was used for local uals reared at the Hanwoo Improvement Center of the realignment and recalibration of the alignment (blue National Agricultural Cooperative Federation (Seosan, boxes on the pipeline figure; Fig. 1). A summary of se- Chungnam, Korea). Indexed shotgun paired-end (PE) quencing data is provided in Additional file 1: Table S1. libraries with 500 bp average length inserts were Since we are interested in information originating generated from these samples using the TruSeq Nano from the sample itself and not detected from the refer- DNA Library Prep Kit (Illumina, USA) following the ence sequence, we created an assembled genome at the standard Illumina sample-preparation protocol. Briefly, scaffold level to discover whether unaligned reads actu- 200 ng of gDNA was fragmented using a Covaris M220 ally constitute functional units (genes) on their own focused-ultrasonicator (Woburn, MA, USA) to produce genome. This scaffold was created from one randomly fragments with a median size of ~ 500 bp. The frag- selected sample from our pool of samples. The Broad In- mented DNA was subjected to end repair, A-tailing, stitute’s stand-alone ALLPATHS-LG fragment read error and indexed adapter ligation (~ 125 bp adapter). correction module [15, 16] was used for error correction Adapter-ligated DNA of 550 to 650 bp in length was as a precursor to de novo assembly. De novo assembly amplified using PCR for 8 cycles. The size-selected was performed using an Iterative De Bruijn Assembler libraries were analyzed using the Agilent 2100 Bioana- of Uneven Depth (IDBA_UD: [17, 18], an iterative De lyzer (Agilent Technologies) to determine the size Bruijn graph de novo assembler for short reads distribution and to check for adapter contamination. sequencing data that utilizes paired-end reads to The resulting libraries were sequenced using the assemble highly uneven low-depth regions. This tool Illumina HiSeq 2500 (2x125bp paired-end sequences) is useful for optimizing the length gap problem and and NextSeq500 (2x150bp paired-end sequences) iterating different K-mer length (green boxes on the Next-Gen sequencers. pipeline figure; Fig. 1). The bioinformatics pipeline used in this study is For unaligned read alignments, we extracted reads for described in Figs. 1 and 2. Quality control for per-base each sample that was not aligned to the reference quality of reads and removal of potential adaptor se- genome. Using the extracted unaligned reads (blue boxes quences was performed using fastQC v0.11.4 [8] and on the pipeline figure; Fig. 1) and the assembled Trimmomatic v0.36 [9] software (seed mismatches:2, scaffold-level genome (green boxes on the pipeline palindrome clip threshold:30, simple clip threshold:10, figure; Fig. 1) of each sample, alignment of unaligned Fig. 1 Detailed unaligned read assembly pipeline. Green squares represent the first stage of analysis- assembly of scaffold-level genome; blue squares represent the second stage of analysis- extraction of unaligned reads; yellow squares represent the third and final stage of analysis- gene prediction and functional annotation Caetano-Anolles et al. BMC Genetics (2018) 19:37 Page 3 of 13 areas of the genome; and (3) Mining the uncovered genes and associated domains to identify important gene functions and networks involved in positive traits. A summary of representative reference genome builds via short read assembly is presented in Table 1.We mapped unaligned reads against the reference genome and extracted information to a depth of 10× (meaning Fig. 2 Simplified pipeline of unaligned read assembly that each base was sequenced an average of 10 times). We predicted a total of 614 gene regions using scaffolds reads to the scaffold was carried out using Bowtie2 containing locations higher than depth of 10×. Of the (remapping). The identified remapped sequences 614 genes, 283 genes were covered by unaligned reads throughout the sequence were assumed to represent with at least depth of 10×. Hanwoo-specific sequences. These resulting regions Cross-referencing of protein sequences from the 283 constitute regions that are distinctive from the reference. genes against the Pfam database identified associated We performed depth profiling to diminish the possibility protein domains covering a total of 168 scaffolds. Over- of false positives. We identified scaffolds containing all, 311 Pfam protein domains were identified when locations meeting our depth cutoff of 10× (an arbitrary using data filtered for sequences with an average cutoff selected for result filtering), and used the collected mapped base depth coverage of less than 10×. These scaffolds for gene prediction using the gene prediction numbers suggest that there was more than one affili- program Augustus 3.1.0. Out of the resulting 614 pre- ated domain identified for some gene regions. Due to dicted genes, we extracted protein sequences covered by space limitations, Table 2 lists significantly identified unaligned reads with at least depth of 10×. (E- value <1XE-100) Pfam protein family domain ana- The resulting total of 283 protein sequences were lysis results. An extended list of significantly identi- cross-referenced against the Pfam database of protein fied Pfam domains with E-value <1E-40 is presented families (pfam.xfam.org;[19]) using the protein domain in Additional file 2: Table S2. detection program InterProScan-5.15-54.0 in order to identify protein domains affiliated with those areas of Hanwoo-specific genes linked to immunity the genome. In order to assign meaning and infer the A number of domains were largely characterized by im- function of these domains, we searched for these identi- mune system function. Selected immune system-related fied domains within DOMINE (http://domine.utdalla- genes are shown in Table 3. Six of the seven domains s.edu/cgi-bin/Domine), a database of known and shown are associated to the immunoglobulin function, predicted protein domain interactions [20, 21]. Using while the remaining domain is associated with the inter- Interpro [22], we obtained GO (Gene Ontology) Cellular feron group of signaling proteins, which is crucial for Component (CC), Molecular Function (MF), and Bio- the immune system response as well. logical Process (BP) terms for each individual domain The interferon-alpha/beta receptor is a cell surface re- [23]. Next, gene ontology results were summarized and ceptor made up of one chain with two subunits, IFNAR1 visualized with the online tool REVIGO (http://revi- and IFNAR2. The interferon receptors have antiviral, go.irb.hr;[24]) to better interpret our results. Next, using Table 1 Summary of the results of representative reference REVIGO’s Interactive Graph tool [24] and exporting genome build via short read assembly (> = 1 kb) results into the Cytoscape software package [25], we Base pairs Percent (%) created a graph-based visualization of the identified Number of scaffolds 295,265 100 terms for each GO category. Using the above described methodologies and annota- Residue counts A 701,475,984 29.24 tions we were able to align and map genome sequences C 498,799,879 20.79 as well as predict genes that may be related to G 498,441,379 20.78 Hanwoo-specific characteristics. T 692,999,844 28.89 N 7,063,142 0.29 Results and discussion Total 2,398,780,228 100 Research objectives and genome build summary Our main research objectives included: (1) Assembling Sequence lengths Minimum 1000 and mapping unaligned reads in order to identify and Maximum 136,625 predict genes in Hanwoo cattle; (2) Cross-referencing re- Average 8124.16 sults against a comprehensive protein domain database N50 13,528 in order to identify protein domains affiliated with those Caetano-Anolles et al. BMC Genetics (2018) 19:37 Page 4 of 13 Table 2 Significantly identified (E- value <1XE-100) Pfam protein family domain analysis results Gene name Length Source Accession Description Start Stop E-value scaffold_2197.g59.t1 581 Pfam PF00063 Myosin head (motor domain) 30 575 5.60E-207 scaffold_1285.g30.t1 417 Pfam PF15718 Domain of unknown function (DUF4673) 116 412 4.50E-154 scaffold_6851.g129.t1 391 Pfam PF03028 Dynein heavy chain and region D6 of dynein motor 2 390 1.50E-120 scaffold_13817.g209.t1 758 Pfam PF01403 Sema domain 59 467 2.30E-117 scaffold_29068.g344.t1 348 Pfam PF16021 Programmed cell death protein 7 33 344 3.00E-114 scaffold_15941.g224.t1 887 Pfam PF04849 HAP1 N-terminal conserved region 1 249 2.60E-108 scaffold_5769.g113.t1 246 Pfam PF00244 14–3-3 protein 5 238 3.60E-107 scaffold_1936.g56.t1 564 Pfam PF08235 LNS2 (Lipin/Ned1/Smp2) 300 525 1.30E-104 antiproliferative, and immunomodulatory functions, as that an understanding of the evolution and expression of well as being highly involved in pregnancy [26, 27]. mammalian immune system genes has important impli- Interferon-τ, a type I interferon, has been shown to pre- cations for human health. Bovine antibodies have been vent a return to ovarian cyclicity after conception to of particular interest, as they exhibit prophylactic and ensure the continuation of the pregnancy in ruminant therapeutic properties in response to several human and ungulate species; this interferon appears to be the main animal infectious diseases [33–36]. Additionally, re- factor responsible for prevention of degradation of the searchers have recently developed transgenic calves that corpus luteum [28, 29]. produce human immunoglobulin, speaking to the in- In addition to these reproductive roles, this receptor is credible importance of cattle as model organisms for the responsible for binding type 1 interferons interferon–α study of human immunity and disease [37]. Secondly, and –β and activating the JAK-STAT signaling pathway, understanding the molecular and genetic basis of im- which is associated with DNA-transcription and the munity in cattle breeds can not only serve to further our expression of genes related to immunity, proliferation, understanding of the breeds, but also to provide genetic and differentiation, among others [30]. The JAK-STAT information which can be used for selective breeding in pathway has primary functions related to immunity. In order to improve performance and survival of livestock. fact, drug therapies that aim to turn down the immune Immunity in cattle varies vastly by breed. For example, response of the body and modulate host responses to African cattle are known for their incredible resistance disease and infection target this pathway [31]. The to tick and gastrointestinal parasite infestations, traits expression of the interferon group of signaling proteins that have developed in response to thousands of years of in our Hanwoo cattle samples suggests that Hanwoo evolution in the harsh environments of Africa. A par- may have breed-specific immune system functions that ticularly amazing adaptation is the resistance of several are not yet well understood. African breeds to trypanosomiasis, also known as sleeping Our analysis also identified associated protein domains sickness [38]. Identification of genes responsible for which are largely characterized by the immunoglobulin immunity and introduction of identified immunity-related function. These results are particularly salient given the genes in cattle breeds that are productive but highly significance of these kinds of results for medical research susceptible to disease may improve their resistance, sur- and selective breeding. The bovine immune system has vival, and productivity. Understanding genetic features been a topic of interest to researchers for quite some controlling these mechanisms will allow researchers to de- time now, mainly due to two reasons [32]. The first is velop appropriate breeding strategies. Table 3 Selected immune system-related genes and affiliated protein domains Gene name Length Source Accession Description Start Stop E-value scaffold_2520.g67.t1 1075 Pfam PF13895 Immunoglobulin domain 425 491 5.40E-09 scaffold_19370.g263.t1 508 Pfam PF07679 Immunoglobulin I-set domain 381 454 8.30E-07 scaffold_8624.g151.t1 512 Pfam PF13895 Immunoglobulin domain 10 73 3.90E-08 scaffold_14147.g214.t1 159 Pfam PF07679 Immunoglobulin I-set domain 27 112 3.40E-24 scaffold_13817.g209.t1 758 Pfam PF00047 Immunoglobulin domain 550 628 5.10E-09 scaffold_5779.g114.t1 142 Pfam PF07679 Immunoglobulin I-set domain 35 70 6.20E-07 scaffold_46987.g437.t1 460 Pfam PF09294 Interferon-alpha/beta receptor, fibronectin type III 44 142 1.20E-17 Caetano-Anolles et al. BMC Genetics (2018) 19:37 Page 5 of 13 More generally, research in immunoglobulin genetics provide insight into the mechanisms behind Hanwoo is particularly salient for several reasons. Although re- cattle’s uniquely high percentage of intramuscular fat search into the genetic aspects of and expression of and fat marbling. For example, LNS2 (Lipin/Ned1/ genes related to immunoglobulin has been widely con- Smp2) domain, which includes Lipin, was significantly ducted in humans and mice, research in this field is lack- identified (E-value = 1.30E-104) in our data (Table 2). ing when it comes to livestock breeds, particularly cattle. Lipin, encoded by the Lpin1 gene, is a powerful gene Information is still needed to complete previous infor- which largely controls how the body produces, stores, mation, including the number of available gene segments and uses fat. Mice deficient in Lipin do not develop ei- and gene families. This kind of information can be used ther diet-induced or genetic obesity [47]. Additionally, in the future to study and create synthetic recombinant enhanced Lipin expression has been shown to promote species-specific antibodies, which could be used to treat adiposity in mice [48]. and prevent infectious diseases. Additionally, the Myosin head (motor domain) protein domain, which is associated with muscle function, was Domain interactions of Hanwoo-specific genes reveal significantly identified (E-value = 5.60E-207, Table 2). additional links to immunity Myosin is a chief component of myofibril filaments, Additionally, more general consideration of significantly which are responsible for muscle contraction. Myosin identified protein family domains from the Pfam database also actively participates in the conversion of ATP chem- provided information needed to further understanding the ical energy to mechanical energy through its interaction breed-specific molecular mechanisms of Hanwoo cattle. with Actin [49]. Additionally, the Dynein heavy chain Table 2 lists highly significantly identified (E-value and region D6 of the dynein motor domain and 14– <1XE-100) Pfam domains. In order to assign meaning and 3-3 protein domain were significantly identified infer the function of these domains, which include several (E-values = 1.50E-120,3.60E-107 respectively), both of not well understood but highly significant protein domains, which are also largely responsible for ATP energy we searched for these identified domains within DOMINE conversion [50–52]. These results suggest that these (http://domine.utdallas.edu/cgi-bin/Domine), adatabaseof proteins domains are those which are primarily re- known and predicted protein domain interactions [20, 21]. sponsible for providing energy to the muscle and pos- Among these, several interesting results reveal the genetic sibly causing the breed-specific high percentage of intricacies of the Hanwoo genome and its functions. intramuscular fat that is observed in Hanwoo cattle. Several of the most significantly identified protein Several of the other identified domains, such as the domains appear to be closely linked with immune sys- HAP1 N-terminal conserved region domain, were tem function, further supporting our previous findings. found to lack interactions with any other domains For example, the significantly identified Sema domain and their specific roles in cattle have not been well (E-value = 2.30E-117) appears to be primarily associated established. As we learn more about these proteins with immune system function. The Sema domain not and their functions in the future, we may be able to only forms interactions with the Immunoglobulin do- better interpret these results. main, but also interacts with the Thrombospondin type 1 (TSP-1) domain, which has been shown to control im- Interpretation of gene ontology terms associated with mune regulation. Thrombospondin, an extremely large the entire set of Pfam domains multi-domain glycoprotein, is crucial to certain mecha- As previously discussed, we were able to identify 311 nisms related to angiogenesis, cell proliferation, and im- Pfam domains mapping to 168 scaffolds not shared with mune responses [39] such as the chemotactic response common cattle. We then filtered that list and kept only to tissue damage and the facilitation of phagocytosis of the highest hits. Within that short list, we revealed high damaged cells [40–42]. Mice deficient in TSP-1 are more enrichment for muscle and immunology genes. However, susceptible to inflammation and injury, either as a side this approach provides a very limited look at our results. effect of drugs or as a result of gene activation [43–46]. Thus, we aimed to further explore Hanwoo-specific do- Given the strong role of this protein domain in immun- mains by analyzing the enrichment of functional cat- ity, our identification of this pathway here once again egories associated with each individual domain of the confirms that there are unique functions of immunity at entire list. Using Interpro [22], we obtained GO (Gene play operating specifically in the Hanwoo genome. Ontology) Cellular Component (CC), Molecular Func- tion (MF), and Biological Process (BP) terms for each in- Hanwoo-specific genes linked to muscle and other dividual domain [23]. Next, gene ontology results were functions summarized and visualized with the online tool REVIGO Significantly identified protein domains with functions (http://revigo.irb.hr;[24] to better interpret our results. related to energy, fat storage, and muscle function may Tables 4, 5, and 6 summarize the BP, CC, and MF GO Caetano-Anolles et al. BMC Genetics (2018) 19:37 Page 6 of 13 terms, respectively. REVIGO calculates “frequency” and between the nodes of our graph (GO terms) represent the “uniqueness” values, with frequency representing the top 3% strongest pairwise similarities between terms [24]. proportion of the specified GO term within the entire The BP GO term visualization (Fig. 3) can be charac- Bos taurus species-specific Uniprot protein annotation terized by a large number of un-connected solo terms database, and uniqueness determining within the input- and shows a large diversity of biological processes being ted list whether a term is an outlier when compared affected, meaning that a large rewiring of functionality is semantically to the list as a whole [24]. embedded in the new genes acquired by Hanwoo cattle. Next, using REVIGO’s Interactive Graph tool [24] and Note that the most significant term is the most specific, exporting results into the Cytoscape software package ‘centriole replication’, which is also connected to the [25], we created a graph-based visualization of the iden- general term ‘microtubule-based movement’; dynein tified terms for each GO category. Figures 3, 4, and 5 (significantly identified from our data) moves along mi- display visualizations of BP, CC, and MF GO terms, crotubules, so this term may reflect the biological pro- respectively. The radius of the bubbles represents the cesses responsible for dynein’s role in ATP energy generality of the specified term; a small bubble implies conversion. This is quite unique and unexpected, since it higher specificity. The p-value of each GO term is repre- signals an important role of cell division [53]. The sec- sented by the color shading of each bubble, with darker ond group of more significant terms are less specific but colors representing higher significance. The edges all related to transport, particularly ‘anion transport’, Table 4 Summary of enriched Gene Ontology (GO) biological process (BP) terms among total identified Pfam protein family domains a b term_ID description Frequency log10 p-value Uniqueness GO:0008152 metabolic process 62.92% −4.2076 0.974 GO:0007154 cell communication 28.75% −5.9208 0.866 GO:0006139 nucleobase-containing compound metabolic process 28.16% −31.0655 0.776 GO:0007165 signal transduction 26.76% −18.9208 0.794 GO:0006810 transport 19.48% −32.6383 0.765 GO:0006355 regulation of transcription, DNA-templated 14.27% −8.9586 0.608 GO:0006464 cellular protein modification process 14.26% −11.7212 0.62 GO:0007186 G-protein coupled receptor signaling pathway 8.87% −25.4318 0.803 GO:0006508 proteolyis 7.74% −32.6778 0.741 GO:0006811 ion transport 7.05% −12.2076 0.744 GO:0055114 oxidation-reduction process 6.85% −11.699 0.831 GO:0055085 transmembrane transport 6.54% −32.6383 0.714 GO:0016192 vesicle-mediated transport 4.60% −30.7447 0.785 GO:0006886 intracelular protein transport 3.09% −30.7447 0.792 GO:0016567 protein ubiquitination 2.40% −14.3468 0.673 GO:0006820 anion transport 2.07% −61.2147 0.769 GO:0006457 protein folding 1.00% −19.9208 0.736 GO:0007018 microtubule-based movement 1.00% −45.6021 0.843 GO:0035023 regulation of Rho protein signal transduction 0.92% −11.1675 0.832 GO:0016573 histone acetylation 0.60% −51.3372 0.635 GO:0043401 steroid hormone mediated signaling pathway 0.46% −28.6383 0.839 GO:0045454 cell redox homeostasis 0.40% −27.0269 0.865 GO:0000413 protein peptidyl-prolyl isomerization 0.24% −19.9208 0.709 GO:0006400 tRNA modification 0.20% −34.7447 0.692 GO:0018149 peptide cross-linking 0.17% −13.4318 0.731 GO:0007099 centriole replication 0.08% − 153.347 0.832 GO:0009396 folic acid-containing compound biosynthetic process 0.05% −11.699 0.786 Frequency represents the proportion of the specified GO term within the entire Bos Taurus species-specific Uniprot protein annotation database. Higher frequencies represent more general and common terms, while terms with a lower frequency are rare and specific Uniqueness represents whether a term is an outlier when compared semantically to the list as a whole Caetano-Anolles et al. BMC Genetics (2018) 19:37 Page 7 of 13 Table 5 Summary of enriched Gene Ontology (GO) cellular component (CC) terms among total identified Pfam protein family domains a b term_ID description Frequency log10 p-value Uniqueness GO:0005622 intracellular 63.18% −4.2076 0.875 GO:0016020 membrane 47.23% −5.7959 0.872 GO:0016021 integral component of membrane 29.77% −10.8239 0.832 GO:0005634 nucleus 27.70% −10.6576 0.688 GO:0005739 mitochondrion 9.22% −12.5686 0.636 GO:0005740 mitochondrial envelope 3.09% −32.3565 0.576 GO:0031012 extracellular matrix 2.00% −15 0.77 GO:0016459 myosin complex 0.38% − 206.252 0.589 GO:0030286 dynein complex 0.20% −45.6021 0.594 Frequency represents the proportion of the specified GO term within the entire Bos Taurus species-specific Uniprot protein annotation database. Higher frequencies represent more general and common terms, while terms with a lower frequency are rare and specific Uniqueness represents whether a term is an outlier when compared semantically to the list as a whole Fig. 3 Visualization of significantly identified Gene Ontology (GO) Biological Process (BP) terms. The radius of the bubbles represents the generality of the specified term (a small bubble implies higher specificity). The p-value of each GO term is represented by the color shading of each bubble (darker colors representing higher significance). The edges between the nodes of our graph (GO terms) represent the top 3% strongest pairwise similarities between terms Caetano-Anolles et al. BMC Genetics (2018) 19:37 Page 8 of 13 Table 6 Summary of enriched Gene Ontology (GO) molecular function (MF) terms among total identified Pfam protein family domains a b term_ID description Frequency log10 p-value Uniqueness GO:0003700 sequence-specific DNA binding transcription factor activity 5.30% −10.6576 0.958 GO:0003712 transcription cofactor activity 1.78% −39.8861 0.957 GO:0003824 catalytic activity 37.22% −13.4559 0.972 GO:0004871 signal transducer activity 11.94% −25.4318 0.927 GO:0004930 G-protein coupled receptor activity 7.87% −27.6198 0.925 GO:0005089 Rho guanyl-nucleotide exchange factor activity 0.48% −11.699 0.956 GO:0005216 ion channel activity 2.49% −13.9208 0.896 GO:0015075 ion transmembrane transporter activity 5.28% −5.7959 0.895 GO:0016773 phosphotransferase activity, alcohol group as acceptor 4.60% −58.7447 0.771 GO:0004672 protein kinase activity 3.87% − 14.1308 0.772 GO:0043015 gamma-tubulin binding 0.10% −25.8539 0.878 GO:0008017 microtubule binding 1.06% −16.4815 0.848 GO:0019001 guanyl nucleotide binding 2.48% −25.4318 0.846 GO:0005509 calcium ion binding 3.94% −19.0655 0.863 GO:0005515 protein binding 26.71% −7.3665 0.92 GO:0004488 methylenetetrahydrofolate dehydrogenase (NADP+) activity 0.02% −13.4559 0.878 GO:0003755 peptidyl-prolyl cis-trans isomerase activity 0.26% −19.9208 0.876 GO:0003777 microtubule motor activity 0.51% −45.6021 0.814 GO:0005544 calcium-dependent phospholipid binding 0.17% −19.0655 0.844 GO:0042802 identical protein binding 4.77% −25.8539 0.878 GO:0031683 G-protein beta/gamma-subunit complex binding 0.16% −25.4318 0.883 GO:0043565 sequence-specific DNA binding 4.31% −10.6576 0.855 GO:0016787 hydrolase activity 15.05% −7.8861 0.841 GO:0008484 sulfuric ester hydrolase activity 0.11% −31.7447 0.832 GO:0004181 metallocarboxypeptidase activity 0.15% −32.6778 0.809 GO:0004222 metalloendopeptidase activity 0.79% −15 0.79 GO:0019901 protein kinase binding 1.80% −7.2007 0.885 GO:0008479 queuine tRNA-ribosyltransferase activity 0.02% −34.7447 0.849 GO:0003676 nucleic acid binding 21.33% −8.6021 0.849 GO:0005102 receptor binding 6.56% −62.1871 0.875 GO:0004402 histone acetyltransferase activity 0.24% −51.3372 0.812 GO:0003677 DNA binding 10.28% −9.5376 0.843 GO:0008408 3′-5′ exonuclease activity 0.18% −31.0655 0.827 GO:0016746 transferase activity, transferring acyl groups 1.42% −30.9208 0.805 GO:0003723 RNA binding 7.68% −5.9208 0.847 GO:0046872 metal ion binding 20.96% −4.5376 0.845 GO:0004550 nucleoside diphosphate kinase activity 0.14% −55.585 0.816 GO:0008236 serine-type peptidase activity 1.32% −7.1739 0.793 GO:0008270 zinc ion binding 6.73% −4.2076 0.856 GO:0004129 cytochrome-c oxidase activity 0.43% −12.5686 0.809 GO:0003924 GTPase activity 1.03% −25.4318 0.806 GO:0005524 ATP binding 8.83% −14.1308 0.751 GO:0000166 nucleotide binding 14.40% −7.3565 0.823 GO:0003810 protein-glutamine gamma-glutamyltransferase activity 0.07% −14.7696 0.823 Caetano-Anolles et al. BMC Genetics (2018) 19:37 Page 9 of 13 Table 6 Summary of enriched Gene Ontology (GO) molecular function (MF) terms among total identified Pfam protein family domains (Continued) a b term_ID description Frequency log10 p-value Uniqueness GO:0005543 phospholipid binding 1.33% −16.6021 0.829 GO:0035091 phosphatidylinositol binding 0.84% −17 0.829 Frequency represents the proportion of the specified GO term within the entire Bos Taurus species-specific Uniprot protein annotation database. Higher frequencies represent more general and common terms, while terms with a lower frequency are rare and specific Uniqueness represents whether a term is an outlier when compared semantically to the list as a whole which may be associated with ATP energetics. Another chemical energy to mechanical energy and serve crucial uniqueness is the steroid hormone mediated signaling functions for muscle function. The connectivity of these pathway. Sex steroid hormones play a critical role in the nodes within our network visualization signifies that regulation of muscle, muscle strength, and growth and these two components work together and are potentially maintenance of muscle mass [54]. While identification significant in Hanwoo-specific characteristics, such as of this GO term most likely can be attributed to the their high percentage of intramuscular fat. The rest of aforementioned relationship between steroid hor- the terms are generic, independent CC terms that mones and muscle development, as a result of the include nucleus and membrane. breed-specific unique high-fat muscle development, it The MF GO term visualization (Fig. 5) can be charac- may also be due to the practices under which Han- terized by high connectivity, with the most significant woo are reared in order to enhance the natural fat values grouped together. Microtubule motor activity, marbling in their meat, such as feeding time and diet. another microtubule function related term, was also For example, cattle are fed a high-concentration grain identified at the molecular function level, once again diet as opposed to grass-feeding [55]. Diet has been suggesting ATP energetics at play. A unique feature of shown to have an effect on steroid hormones [56], this visualization, compared to the BP and CC visualiza- which may also in part explain the identification of tions, is the presence of 4 unconnected graphs as this GO term here. opposed to many unconnected terms or a single con- The CC GO term visualization (Fig. 4) can be charac- nected group. The first group features solely terms terized by a single connected group consisting of four related to binding. This group contains the following terms: dynein complex, myosin complex, mitochondrial terms: Sequence-specific DNA binding, DNA binding, envelope, and mitochondrion. As previously mentioned, RNA binding, Nucleic acid binding, ATP binding, the Myosin Head and Dynein heavy chain protein do- Phospholipid binding, Calcium-dependent phospholipid mains were found significantly identified in our results- binding, Guanyl nucleotide binding, Metal ion binding, both of which participate in the conversion of ATP Zinc ion binding, and Calcium ion binding. The second Fig. 4 Visualization of significantly identified Gene Ontology (GO) Cellular Component (CC) terms. The radius of the bubbles represents the generality of the specified term (a small bubble implies higher specificity). The p-value of each GO term is represented by the color shading of each bubble (darker colors representing higher significance). The edges between the nodes of our graph (GO terms) represent the top 3% strongest pairwise similarities between terms Caetano-Anolles et al. BMC Genetics (2018) 19:37 Page 10 of 13 Fig. 5 Visualization of significantly identified Gene Ontology (GO) Molecular Function (MF) terms. The radius of the bubbles represents the generality of the specified term (a small bubble implies higher specificity). The p-value of each GO term is represented by the color shading of each bubble (darker colors representing higher significance). The edges between the nodes of our graph (GO terms) represent the top 3% strongest pairwise similarities between terms group consists of three connected terms: Ion Channel ac- activity, alcohol group as acceptor. Transferases are tivity, Methylenetetrahydrofolate dehydrogenase (NADP+) enzymes which are responsible for catalyzation of the activity, and Cytochrome-c oxidase activity. The third transfer of certain functional groups from one molecule to group consists of six connected terms: Sulfuric ester another. They are essential for countless biochemical hydrolase activity, 3′-5′ exonuclease activity, Microtubule processes throughout the body. In cattle specifically, it has motor activity, GTPase activity, Serine-type peptidase been shown that the activity of transferases is critical for activity, and Metallocarboxypeptidase activity. embryo development [57]. The expression of genes with The fourth and final group consists of 5 terms related to transferase activity function varies between abnormal and the activity of transferases: Nucleoside diphosphate kinase normal pregnancies [58, 59]. Therefore, the expression of activity, Transferase activity, transferring acyl groups, these transferase GO terms may be due to their role in Protein-glutamine gamma-glutamyltransferase activity, healthy pregnancy and development. However, interest- Histone acetyltransferase activity, and Phosphotransferase ingly, results of previous studies have demonstrated a Caetano-Anolles et al. BMC Genetics (2018) 19:37 Page 11 of 13 correlation between certain transferase activity genes, such Availability of data and materials All information supporting the results of this manuscript are included within the as GPAT1 and ATGL, and intramuscular fat content in article and additional files. The sequences of genes predicted through assembly Korean Cattle [60]. These previously identified results, of unaligned reads have been uploaded as a Additional files 1, 2, 3 and 4. when taken along with the comparatively high expression Authors’ contributions and connectivity of GO terms related to transferase KCA conceived the project and designed scientific objectives. DL performed activity, suggests that there may be unique mechanisms of Hanwoo sample collection and data generation. BHC conducted transferase activity in Hanwoo cattle which influences experimental design of the Hanwoo population. WK, SS, and KK performed genome annotation and data analysis. KCA performed comparative analysis their development and may perhaps be a factor impacting using bioinformatics tools. KCA interpreted data and wrote the manuscript. their species-specific high percentage of intramuscular fat. HK and DL organized and supervised the project. All authors read and approved the final manuscript. Conclusions Ethics approval and consent to participate The information unearthed from the comparison of No ethics statement was required for the collection of DNA samples. DNA was extracted either from artificial insemination bull semen straws or from breeds and identification of genetic variation in this blood samples obtained from the Hanwoo Improvement Center of the study will be invaluable for future research on the mo- National Agricultural Cooperative Federation (HICNACF) with the permission lecular determinants that have been bred in Hanwoo of the owners. The protocol was approved by the Committee on the Ethics of Animal Experiments of the National Institute of Animal Science (Permit cattle. Results revealed Hanwoo-specific protein domains Number: NIAS2015–774). which were largely characterized by immunoglobulin function. Furthermore, domain interactions of Competing interests The authors declare that they have no competing interests. Hanwoo-specific genes reveal additional links to immun- ity. Hanwoo-specific genes linked to muscle and other Publisher’sNote functions were identified, including protein domains Springer Nature remains neutral with regard to jurisdictional claims in with functions related to energy, fat storage, and muscle published maps and institutional affiliations. function that may provide insight into the mechanisms Author details behind Hanwoo cattle’s uniquely high percentage of Interdisciplinary Program in Bioinformatics, Seoul National University, intramuscular fat and fat marbling. Analyzing the whole 2 Kwan-ak St. 599, Kwan-ak Gu, Seoul 151-741, Republic of Korea. Department Hanwoo genome and reporting significant genomic vari- of Agricultural Biotechnology and Research Institute for Agriculture and Life Sciences, Seoul National University, Seoul 151-921, Republic of Korea. ations is crucial to identifying genetic novelties that are CHO&KIM genomics, Main Bldg. #514, SNU Research Park, Seoul National arising from useful adaptations. Similarly, such analysis University Mt.4-2, NakSeoungDae, Gwanakgu, Seoul 151-919, Republic of will allow future researchers to compare and classify Korea. Animal Genomics & Bioinformatics Division, National Institute of Animal Science, RDA, 77 Chuksan-gil, Kwonsun-gu, Suwon 441-706, Republic breeds, identify important genetic markers, and develop of Korea. breeding strategies to further improve traits of economic value and biological significance. Received: 15 November 2017 Accepted: 14 May 2018 Additional files References 1. Lee S-H, Park B-H, Sharma A, Dang C-G, Lee S-S, Choi T-J, Choy Y-H, Kim H- C, Jeon K-J, Kim S-D, et al. Hanwoo cattle: origin, domestication, breeding Additional file 1: Table S1. Summary of sequencing data (DOCX 28 kb) strategies and genomic selection. J Anim Sci Technol. 2014;56(1):2. Additional file 2: Table S2. Significantly identified (E- value <1XE-40) 2. Jo C, Cho SH, Chang J, Nam KC. Keys to production and processing of Hanwoo Pfam protein family domain analysis results. (DOCX 17 kb) beef: a perspective of tradition and science. Anim Front. 2012;2(4):32–8. 3. Kim HJ, Sharma A, Lee SH, Lee DH, Lim DJ, Cho YM, Yang BS, Lee SH. Additional file 3: FASTA sequences for scaffolds which have locations Genetic association of PLAG1, SCD, CYP7B1 and FASN SNPs and their effects with depth > 10×. (XLSX 9907 kb) on carcass weight, intramuscular fat and fatty acid composition in Hanwoo Additional file 4: Protein sequences which have locations with depth > steers (Korean cattle). Anim Genet. 2016;48. https://doi.org/10.1111/age. 10×. (XLSX 101 kb) 4. Hwang YH, Joo ST. Fatty acid profiles of ten muscles from high and low marbled (quality grade 1++ and 2) Hanwoo steers. Korean J Food Sci Anim Abbreviations Resour. 2016;36(5):679–88. BP: Biological process; CC: Cellular component; GO: Gene ontology; 5. Sudrajad P, Sharma A, Dang CG, Kim JJ, Kim KS, Lee JH, Kim S, Lee SH. MF: Molecular function Validation of single nucleotide polymorphisms associated with carcass traits in a commercial Hanwoo population. Asian Australas J Anim Sci. 2016; Acknowledgements 29(11):1541–6. We acknowledge the support from different institutions and their personnel 6. Li XZ, Park BK, Hong BC, Ahn JS, Shin JS. Effect of soy lecithin on total providing help for the sampling of cattle (Hanwoo Improvement Center of cholesterol content, fatty acid composition and carcass characteristics in the the National Agricultural Cooperative Federation, HICNACF) and cattle Longissimus dorsi of Hanwoo steers (Korean native cattle). Anim Sci J. 2016; keepers for their assistance and permission to sample their herds. 7. Cho SH, Kang G, Seong P, Kang S, Sun C, Jang S, Cheong JH, Park B, Hwang I. Meat quality traits as a function of cow maturity. Anim Sci J. 2016; Funding 8. Andrews S: FastQC: a quality control tool for high throughput sequence This work was supported by Agenda (PJ01251902) of the National Institute data. Reference Source 2010. of Animal Science, Rural Development Administration (RDA), Republic of 9. Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina Korea. sequence data. Bioinformatics 2014:30(15):2114-20. Caetano-Anolles et al. BMC Genetics (2018) 19:37 Page 12 of 13 10. Langmead B, Salzberg SL. Fast gapped-read alignment with bowtie 2. Nat 33. Casswall TH, Nilsson HO, Bjorck L, Sjostedt S, Xu L, Nord CK, Boren T, Methods. 2012;9(4):357–9. Wadstrom T, Hammarstrom L. Bovine anti-helicobacter pylori antibodies for 11. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis oral immunotherapy. Scand J Gastroenterol. 2002;37(12):1380–5. G, Durbin R. Genome project data processing S: the sequence alignment/ 34. Hammarstrom L, Gardulf A, Hammarstrom V, Janson A, Lindberg K, Smith CI. map format and SAMtools. Bioinformatics. 2009;25(16):2078–9. Systemic and topical immunoglobulin treatment in immunocompromised patients. Immunol Rev. 1994;139:43–70. 12. McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, 35. Weiner C, Pan Q, Hurtig M, Boren T, Bostwick E, Hammarstrom L. Passive Garimella K, Altshuler D, Gabriel S, Daly M, et al. The genome analysis toolkit: a MapReduce framework for analyzing next-generation DNA immunity against human pathogens using bovine antibodies. Clin Exp Immunol. 1999;116(2):193–205. sequencing data. Genome Res. 2010;20(9):1297–303. 13. DePristo MA, Banks E, Poplin R, Garimella KV, Maguire JR, Hartl C, Philippakis 36. Lilius EM, Marnila P. The role of colostral antibodies in prevention of AA, del Angel G, Rivas MA, Hanna M, et al. A framework for variation microbial infections. Curr Opin Infect Dis. 2001;14(3):295–300. discovery and genotyping using next-generation DNA sequencing data. Nat 37. Kuroiwa Y, Kasinathan P, Choi YJ, Naeem R, Tomizuka K, Sullivan EJ, Knott JG, Genet. 2011;43(5):491–8. Duteau A, Goldsby RA, Osborne BA, et al. Cloned transchromosomic calves 14. Van der Auwera GA, Carneiro MO, Hartl C, Poplin R, Del Angel G, Levy- producing human immunoglobulin. Nat Biotechnol. 2002;20(9):889–94. Moonshine A, Jordan T, Shakir K, Roazen D, Thibault J et al: From FastQ data 38. Roelants GE, Fumoux F, Pinder M, Queval R, Bassinga A, Authie E. to high confidence variant calls: the genome analysis toolkit best practices Identification and selection of cattle naturally resistant to African pipeline. Curr Protoc Bioinformatics 2013, 43:11.10.11–11.10.33. trypanosomiasis. Acta Trop. 1987;44(1):55–66. 15. Gnerre S, MacCallum I, Przybylski D, Ribeiro FJ, Burton JN, Walker BJ, Sharpe 39. Lopez-Dee Z, Pidcock K, Gutierrez LS. Thrombospondin-1: multiple paths to T, Hall G, Shea TP, Sykes S, et al. High-quality draft assemblies of mammalian inflammation. Mediat Inflamm. 2011;2011:296069. genomes from massively parallel sequence data. Proc Natl Acad Sci. 2011; 40. Wight TN, Raugi GJ, Mumby SM, Bornstein P. Light microscopic 108(4):1513–8. immunolocation of thrombospondin in human tissues. J Histochem 16. Ribeiro FJ, Przybylski D, Yin S, Sharpe T, Gnerre S, Abouelleil A, Berlin AM, Cytochem. 1985;33(4):295–302. Montmayeur A, Shea TP, Walker BJ, et al. Finished bacterial genomes from 41. Grimbert P, Bouguermouh S, Baba N, Nakajima T, Allakhverdi Z, Braun D, shotgun sequence data. Genome Res. 2012;22(11):2270–7. Saito H, Rubio M, Delespesse G, Sarfati M. Thrombospondin/CD47 17. Peng Y, Leung HCM, Yiu SM, Chin FYL. IDBA – a practical iterative de Bruijn interaction: a pathway to generate regulatory T cells from human CD4+ graph De novo assembler. In: Berlin BB, editor. Research in computational CD25- T cells in response to inflammation. J Immunol. 2006;177(6):3534–41. molecular biology: 14th annual international conference, RECOMB 2010, 42. Doyen V, Rubio M, Braun D, Nakajima T, Abe J, Saito H, Delespesse G, Sarfati Lisbon, Portugal, April 25–28, 2010 proceedings. Heidelberg: Springer Berlin M. Thrombospondin 1 is an Autocrine negative regulator of human Heidelberg; 2010. p. 426–40. dendritic cell activation. J Exp Med. 2003;198(8):1277–83. 18. Peng Y, Leung HC, Yiu SM, Chin FY. IDBA-UD: a de novo assembler for 43. Contreras-Ruiz L, Regenfuss B, Mir FA, Kearns J, Masli S. Conjunctival single-cell and metagenomic sequencing data with highly uneven depth. inflammation in thrombospondin-1 deficient mouse model of Sjogren's Bioinformatics. 2012;28(11):1420–8. syndrome. PLoS One. 2013;8(9):e75937. 19. Finn RD, Coggill P, Eberhardt RY, EddySR,MistryJ,MitchellAL, Potter 44. Ezzie ME, Piper MG, Montague C, Newland CA, Opalek JM, Baran C, Ali N, SC, Punta M, Qureshi M, Sangrador-Vegas A, et al. The Pfam protein Brigstock D, Lawler J, Marsh CB. Thrombospondin-1-deficient mice are not families database: towards a more sustainable future. Nucleic Acids Res. protected from bleomycin-induced pulmonary fibrosis. Am J Respir Cell Mol 2016;44(D1):D279–85. Biol. 2011;44(4):556–61. 20. Raghavachari B, Tasneem A, Przytycka TM, Jothi R. DOMINE: a database of 45. Zhao Y, Xiong Z, Lechner EJ, Klenotic PA, Hamburg BJ, Hulver M, Khare A, protein domain interactions. Nucleic Acids Res. 2008;36(Database issue): Oriss T, Mangalmurti N, Chan Y, et al. Thrombospondin-1 triggers D656–61. macrophage IL-10 production and promotes resolution of experimental 21. Yellaboina S, Tasneem A, Zaykin DV, Raghavachari B, Jothi R. DOMINE: a lung injury. Mucosal Immunol. 2014;7(2):440–8. comprehensive collection of known and predicted domain-domain 46. Punekar S, Zak S, Kalter VG, Dobransky L, Punekar I, Lawler JW, Gutierrez LS. interactions. Nucleic Acids Res. 2011;39(Database issue):D730–5. Thrombospondin 1 and its mimetic peptide ABT-510 decrease angiogenesis 22. Finn RD, Attwood TK, Babbitt PC, Bateman A, Bork P, Bridge AJ, Chang H-Y, and inflammation in a murine model of inflammatory bowel disease. Dosztányi Z, El-Gebali S, Fraser M, et al. InterPro in 2017—beyond protein Pathobiology. 2008;75(1):9–21. family and domain annotations. Nucleic Acids Res. 2017;45(D1):D190–9. 47. Phan J, Peterfy M, Reue K. Lipin expression preceding peroxisome proliferator-activated receptor-gamma is critical for adipogenesis in vivo 23. Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, and in vitro. J Biol Chem. 2004;279(28):29558–64. Dolinski K, Dwight SS, Eppig JT, et al. Gene ontology: tool for the unification of biology. The gene ontology consortium. Nat Genet. 2000;25(1):25–9. 48. Phan J, Reue K. Lipin, a lipodystrophy and obesity gene. Cell Metab. 2005; 24. Supek F, Bošnjak M, Škunca N, Šmuc T. REVIGO summarizes and visualizes 1(1):73–83. long lists of gene ontology terms. PLoS One. 2011;6(7):e21800. 49. Rayment I, Holden HM, Whittaker M, Yohn CB, Lorenz M, Holmes KC, 25. Killcoyne S, Carter GW, Smith J, Boyle J. Cytoscape: a community-based Milligan RA. Structure of the actin-myosin complex and its implications for framework for network modeling. Methods Mol Bio. 2009;563:219–39. muscle contraction. Science. 1993;261(5117):58–65. 26. Pestka S, Langer JA, Zoon KC, Samuel CE. Interferons and their actions. 50. Roberts AJ. Functions and mechanics of dynein motor. Proteins. 2013;14(11): Annu Rev Biochem. 1987;56:727–77. 713–26. 51. Bunney TD, van Walraven HS, de Boer AH. 14-3-3 protein is a regulator of 27. Balkwill F. Interferons and other regulatory cytokines. Immunology. 1989; the mitochondrial and chloroplast ATP synthase. Proc Natl Acad Sci. 2001; 66(4):634. 98(7):4249–54. 28. Roberts RM. Interferon-tau, a type 1 interferon involved in maternal recognition of pregnancy. Cytokine Growth Factor Rev. 2007;18(5–6):403–8. 52. Berg D, Holzmann C, Riess O. 14-3-3 proteins in the nervous system. Nat 29. Han CS, Mathialagan N, Klemann SW, Roberts RM. Molecular cloning of Rev Neurosci. 2003;4(9):752–62. ovine and bovine type I interferon receptor subunits from uteri, and 53. Nigg EA, Stearns T. The centrosome cycle: centriole biogenesis, duplication endometrial expression of messenger ribonucleic acid for ovine and inherent asymmetries. Nat Cell Biol. 2011;13(10):1154–60. receptors during the estrous cycle and pregnancy. Endocrinology. 1997; 54. McClung JM, Davis JM, Wilson MA, Goldsmith EC, Carson JA. Estrogen status 138(11):4757–67. and skeletal muscle recovery from disuse atrophy. J Appl Physiol. 2006; 30. Platanias LC. Mechanisms of type-I- and type-II-interferon-mediated 100(6):2012–23. signalling. Nat Rev Immunol. 2005;5(5):375–86. 55. Gotoh T, Joo S-T. Characteristics and health benefit of highly marbled Wagyu and Hanwoo beef. Korean J Food Sci Anim Resour. 2016;36(6):709–18. 31. de Souza JAC, Rossa Junior C, Garlet GP, Nogueira AVB, Cirelli JA. Modulation of host cell signaling pathways as a therapeutic approach in 56. Chesworth JM, Easdon MP. Effect of diet and season on steroid hormones periodontal disease. J Appl Oral Sci. 2012;20(2):128–38. in the ruminant. J Steroid Biochem. 1983;19(1c):715–23. 32. Zhao Y, Kacskovics I, Rabbani H, Hammarstrom L. Physical mapping of the 57. Adams HA, Southey BR, Everts RE, Marjani SL,TianCX, LewinHA, Rodriguez-Zas bovine immunoglobulin heavy chain constant region gene locus. J Biol SL. Transferase activity function and system development process are critical in Chem. 2003;278(37):35024–32. cattle embryo development. Funct Integr Genomics. 2011;11(1):139–50. Caetano-Anolles et al. BMC Genetics (2018) 19:37 Page 13 of 13 58. Mirlesse V, Jacquemard F, Daffos F, Forestier F. Fetal gammaglutamyl transferase activity: clinical implication in fetal medicine. Biol Neonate. 1996; 70(4):193–8. 59. Gibbs DA, McFadyen IR, Crawfurd MD, De Muinck Keizer EE, Headhouse- Benson CM, Wilson TM, Farrant PH. First-trimester diagnosis of Lesch-Nyhan syndrome. Lancet. 1984;2(8413):1180–3. 60. Jeong J, Kwon EG, Im SK, Seo KS, Baik M. Expression of fat deposition and fat removal genes is associated with intramuscular fat content in longissimus dorsi muscle of Korean cattle steers. J Anim Sci. 2012;90(6):2044–53.

Journal

BMC GeneticsSpringer Journals

Published: May 29, 2018

References

You’re reading a free preview. Subscribe to read the entire article.


DeepDyve is your
personal research library

It’s your single place to instantly
discover and read the research
that matters to you.

Enjoy affordable access to
over 18 million articles from more than
15,000 peer-reviewed journals.

All for just $49/month

Explore the DeepDyve Library

Search

Query the DeepDyve database, plus search all of PubMed and Google Scholar seamlessly

Organize

Save any article or search result from DeepDyve, PubMed, and Google Scholar... all in one place.

Access

Get unlimited, online access to over 18 million full-text articles from more than 15,000 scientific journals.

Your journals are on DeepDyve

Read from thousands of the leading scholarly journals from SpringerNature, Elsevier, Wiley-Blackwell, Oxford University Press and more.

All the latest content is available, no embargo periods.

See the journals in your area

DeepDyve

Freelancer

DeepDyve

Pro

Price

FREE

$49/month
$360/year

Save searches from
Google Scholar,
PubMed

Create lists to
organize your research

Export lists, citations

Read DeepDyve articles

Abstract access only

Unlimited access to over
18 million full-text articles

Print

20 pages / month

PDF Discount

20% off